Blocks In Ruby

YukihiroMatsumoto on Blocks and Closures:

http://www.artima.com/intv/closures.html

A discussion of LexicalClosures and Blocks at RubyGarden's wiki:

http://wiki.rubygarden.org/Ruby/page/show/ClosuresAndBlocks

A small note before we get all excited about domain languages: In practice, few Ruby programs come even close to defining a new language. But they do often take much advantage of the available flexibility.

I disagree. Ruby can make domain-specific languages. They are limited by Ruby's syntax but often appear quite convincing. Because that syntax is relatively flexible, it is a common idiom to overload a block of code to do more than one thing depending on the context it is interpreted in. For example, a database table specification might dump out a SQL construction statement when loaded in one program and act as a front-end and container class when loaded in another. Ruby meta-programming is different, that's all.

Yes, unfortunately RubyLanguage is not quite flexible enough to really allow the programmer to take full advantage of its blocks and continuations.

So, if it's not for language extension, what is it used for, typically?

Simple control structures such as:

Event handling:

 button.on_click do |event|
   ...callback code...
 end

Resource management:

 File.open(filename) do |file|
   ...read from file...
 end
 # the file has been closed

Iteration:

 collection.each do |element|
   ...process element of collection...
 end

 xml_doc.lookup(xpath_expression) do |element|
   ...process element matching xpath_expression...
 end

HigherOrderFunctions:

 numbers = [1,2,3,4]
 squares = numbers.map {|n| n*n }

Concurrency:

 fork do ...this block runs in another process... end

 Thread.new do ...this block runs in another thread... end

Synchronization:

 mutex = Mutex.new
 condition = ConditionVariable.new

 mutex.synchronize do
   print "in critical section"
   condition.wait(mutex)   # releases the mutex while waiting
   print "condition signalled"
 end

Finalisers:

 # built in a separate method so the proc does not capture the object itself
 def create_finalizer(resource)
   proc do |id|
     # this block is called after the object owning the resource
     # has been garbage collected.
     free_resource( resource )
   end
 end

 ObjectSpace.define_finalizer( object, create_finalizer(object.resource) )

Simple, yet cool multi level caching mechanism:

 def cache(cache_id)
   # pseudocode
   if cached
     return cached
   else
     result = yield
     store_in_cache(result)
     return result
   end
 end

 # usage
 result = cache('main') {
   # get all code
   # for many sections
   data = cache('section' + section_id.to_s) {
     some_item = cache('item' + item_id.to_s) {
       # get item
     }
   }
 }
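
The pseudocode above can be made concrete with a plain hash standing in for the cache. This is only a minimal sketch: the STORE constant is an assumption made for illustration, and a real cache would more likely live in a file or an external store.

```ruby
# Hypothetical in-memory store; not part of the original pseudocode.
STORE = {}

def cache(cache_id)
  # On a hit, return the stored value without running the block.
  return STORE[cache_id] if STORE.key?(cache_id)
  # On a miss, run the block and remember its result under cache_id.
  STORE[cache_id] = yield
end

calls = 0
first  = cache('item42') { calls += 1; 'expensive result' }
second = cache('item42') { calls += 1; 'never computed' }
```

The second call never runs its block, so the expensive work happens once. Nesting works the same way, because each inner cache call is just part of the outer block's body.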

Etc. etc.

Those are excellent examples of what you can do with FirstClass LexicalClosures.

OK, but I think they _are_ language extensions...

Yes, but Ruby is limited to quite simple language extensions because methods cannot take more than one block in a way that matches the syntax of the built-in ControlStructures. Using the same syntax for built-in and user-defined control structures is what makes LispLanguage and SmalltalkLanguage so expressive.

A major disappointment with LispLanguage for me was always that lists, hashes, and arrays are not treated in a uniform way, and you need special-purpose code for each.

This is quite literally nonsense. Arrays and lists are both sequences in CommonLisp, which can both be addressed in the same way. A hashtable is a completely different kind of beast from a sequence, so it makes no sense to treat it in the same way as a sequence. Of course, one could wrap the underlying data structure to produce a sequence.

Ruby and Python both unify these data structures in a much more useful way. I am still recovering from the fact that Ruby gives you at least three(!) different notations for sequence slicing. :-)

Actually sequence slicing is not something special in Ruby (as it is in, say, Python). [] and []= are just operators, and are used in many different places (e.g. calling a continuation). Not that this relates to the topic, anyway ;)

I've never used SmalltalkLanguage, but I have used Lisp quite a bit, and written and read code that does interesting things like create new control structures using macros, and frankly the Lisp is much more time-consuming to understand. Sure, once I understand it, it seems natural and simple, but it's the time and difficulty of that understanding that I think many programmers cannot get past. Ruby may suffer from a similar problem, but maybe less so than Lisp and Smalltalk since its syntax is, on the surface, a little less alien. Anyways, I got the impression that Ruby was meant to be a scripting language, and so I wonder if the effort involved in creating and testing new control structures is worth it? With Ruby, some people seem to be claiming that you first sit down and ask yourself a question like "which control structures do I need for solving this problem?", and then go and write and test them, and then write your program on top of that. It could be argued that the time a Ruby programmer spends creating just the right data structure is more productively (but, alas, less entertainingly) spent on better documentation or more UnitTests.

I had some fun writing my own iterators. I extended the String class to have an each_word iterator, and wrote a little JeffK filter. It's on one of those Ruby wikis somewhere. -- NickBensema
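
For readers who have not written one, an iterator like that might look roughly as follows. This is a guess at the shape of such an extension, not the code referred to above; splitting on whitespace is an assumption.

```ruby
class String
  # Hypothetical reconstruction: yields each whitespace-separated word in turn.
  def each_word
    split.each { |word| yield word }
    self
  end
end

words = []
"blocks are just methods".each_word { |w| words << w.upcase }
```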

I've never come across anyone in the Ruby community claiming that you should first create "control structures" and then your program on top of that. Regarding effort, an experienced OO programmer will find that writing their Ruby scripts in a pretty OO-style isn't much more trouble than writing them in the procedural perl-style. -- AndersBengtsson

You don't sit down and decide which ControlStructures you need; they come about through refactoring. Just as you refactor duplicated code in methods by creating a method in a shared base class (or module in Ruby), so you refactor duplicated code that uses existing control structures into new ControlStructures. The two refactorings are really the same, because new ControlStructures are just methods that take a block parameter.

Also, it is not difficult to create new ControlStructures. It really is no harder than writing a method. If you need to iterate over all children in a composite structure, for example, it is much, much easier in Ruby to implement that as a method that takes a block (a new control structure), than it is to write a new iterator class as one would have to do in JavaLanguage or PythonLanguage, or collect all the children into a list and return that.
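
As a rough illustration of the composite case, here is a sketch in which the Node class and the each_descendant name are made up for the example. The "new control structure" is nothing more than one recursive method that hands each node to the caller's block.

```ruby
# Node is a hypothetical composite; its name and fields are illustrative only.
class Node
  attr_reader :name, :children

  def initialize(name, children = [])
    @name = name
    @children = children
  end

  # Passes every descendant to the block, depth first.
  def each_descendant(&block)
    children.each do |child|
      block.call(child)
      child.each_descendant(&block)
    end
  end
end

tree = Node.new('root', [
  Node.new('a', [Node.new('a1'), Node.new('a2')]),
  Node.new('b')
])

names = []
tree.each_descendant { |node| names << node.name }
```

No iterator class and no intermediate list are needed; the traversal state lives on the call stack.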

A block (in RubyLanguage) is a form of CoRoutine. A Ruby block can be turned into a function object by calling a built in function. Because Ruby is dynamically typed, any object that answers the "call" method is a function. So really, a Ruby block and a function object have the same semantic effect, but illustrate a syntactic DesignBurp.

A RubyLanguage block is not a coroutine, but merely an anonymous function, almost identical to the SmalltalkLanguage version. If a block is never referenced as an object, then Ruby doesn't bother to reify a full object for it ... but that's an implementation detail that most programmers can ignore.

Unfortunately it is an implementation detail that one cannot ignore, because blocks are not constructed as objects, are not passed to methods in the same way as other parameters and are not invoked by message passing in the same way as other objects.

Where on earth did you get the idea that blocks are not objects? They exist as objects in the internals when the distinction is important. Blocks are special cases of procs; the internals handle them differently, but in the abstract it makes no difference. You just use the built-in methods "lambda" or "proc" to get the object reference of a block (or use a parameter with the & qualifier), and these blocks are honest-to-god closures.

The "block_given?/yield" notation can be regarded as syntactic sugar for an ugly def prototype that declares an optional block in a function. Under the covers, Ruby does some things to speed up inline blocks when you never request an object reference for them, but this is an implementation detail.
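
To make the comparison concrete, here is a small sketch of the two spellings side by side. The method names are invented for the example; the point is only that both styles see the same block.

```ruby
# Implicit style: the block never appears in the parameter list.
def twice_implicit
  return :no_block unless block_given?
  yield
  yield
end

# Explicit style: &block receives the same block as a Proc, or nil if absent.
def twice_explicit(&block)
  return :no_block if block.nil?
  block.call
  block.call
end

count = 0
twice_implicit { count += 1 }
twice_explicit { count += 1 }
```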

What I don't get is why the creator of the language let so many bad decisions build up? No way to make properties --> allow methods to be called without parentheses, thus you can say y = x.attr where attr is actually an accessor method, but it LOOKS like an attribute, which looks good. But then methods (even global methods (functions)) can't be first class, because there is no difference between method and method() --> add blocks and procs to "fix" problem.

Heck, ScalaLanguage even allows both object.method or object.method(), but it doesn't have this problem, as you can also write val x = ((a:Int)=>a+1); print x(1) without any trouble. Probably, adding this to Ruby would require some sort of parser + type-analysis, though. Hmm...

That means that we will always have to use object.call(x), but methods are always just written method x, which seems inconsistent. Wouldn't it be easier to just add a facility to make properties, rather than make ALL methods second class? What happened to "Everything is an object"?

Observe this code and its behavior in IRB:

 def magic( &b )
   if block_given? 
     puts "Block given!" 
   end
   return b
 end

 irb(main):011:0> magic {|x| x} 
 Block given!
 => #<Proc:0x401f265c@(irb):11> # Returned our block's object reference
 irb(main):012:0> magic # No block given, so we return nil
 => nil
 irb(main):015:0> my_proc = lambda { |x| x }
 => #<Proc:0x401b7a94@(irb):15>
 irb(main):016:0> magic my_proc
 ArgumentError: wrong number of arguments (1 for 0)
 # This is significant. The only thing people need to be aware of is that the "&b" notation and block_given? look in a special block argument "slot", which you have to be careful to fill.
 irb(main):017:0> magic &my_proc # & makes this work as expected because we put it in the right "spot"
 Block given!
 => #<Proc:0x401b7a94@(irb):15>

There is at least one place where I have found the ampersand to be very useful: a method that takes an optional block and has multiple optional parameters, and in turn passes some of them, together with the optional block to an inner helper method. Firstly, even though the block is collected and passed using the ampersand notation, the receiving helper method continues to use the same _block_given?_ test. Secondly, the actual block is not a positional parameter, which is important owing to the other multiple optional parameters. -- SrinivasJonnalagadda
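
A sketch of that pattern, with invented method names and parameters, might look like this. The outer method collects the optional block with & and forwards it in the block slot, so the optional positional parameters are left alone.

```ruby
# All names here are hypothetical; they only illustrate the forwarding pattern.
def render_rows(rows, separator = ', ', prefix = '', &formatter)
  # Forward the optional block in the block slot, not as a positional argument.
  rows.map { |row| prefix + format_row(row, separator, &formatter) }
end

def format_row(row, separator)
  # The helper still uses block_given? and yield, even though the caller
  # passed the block along with &.
  cells = block_given? ? row.map { |cell| yield cell } : row.map(&:to_s)
  cells.join(separator)
end

plain  = render_rows([[1, 2], [3, 4]])
custom = render_rows([[1, 2], [3, 4]], '|', '> ') { |cell| (cell * 10).to_s }
```

When no block is given, formatter is nil, and passing &nil to the helper is the same as passing no block at all.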

In BlocksInPython, someone said you could implement try/catch with blocks if it didn't exist. I would be very interested if some Ruby guru would show TryCatchWithRubyBlocks.

It's quite easy to create a construct with the following syntax in Ruby:

 my_try {
   ...some code...
 }.my_catch(MyExceptionClass) {|e|
   ...exception handling code...
 }

The exception gets passed into the second block, if it matches the class (classes are objects in Ruby). Of course, to implement this, some stack-unwinding primitive has to exist in the language already... -- VladimirSlepnev

One [stack-unwinding primitive] does: continuations.

 require 'continuation'  # callcc needs this on Ruby 1.9 and later

 class TryBlock
   $escape = []

   def initialize(proc)
     @exception = callcc {|cont|
       $escape.push cont
       @result = proc.call
       $escape.pop        # normal completion: discard our escape continuation
       cont.call nil
     }
   end

   def except(ex_type)
     if @exception and @exception.is_a? ex_type
       yield @exception
       @exception = nil
     end
     return self
   end

   def finally
     yield
     try_end
   end

   def try_end
     do_raise @exception if @exception
     return @result
   end
 end

 def my_try(&block)
   TryBlock.new block
 end

 def do_raise(exception)
   if $escape.empty?
     exit(1)            # nothing left to unwind to
   else
     $escape.pop.call exception
   end
 end

 my_try {
   ...
 }.except(ExceptionType) {|x|
   <handle>
 }.finally {
   <...>
 }

Horrendously slow, doesn't work in threads, uses a global variable (shudder), and is completely untested.

But it should work.

Oh, some other examples of how blocks are useful.

Unit testing:

 assert_raises(SomeExceptionClass) do
   ...the code that should raise the exception...
 end

Custom sorting (closures can take 2 parameters):

 my_array.sort {|x,y| y <=> x }  # sorts an array in reverse order

Made easier in Ruby 1.8 by sort_by, which silently applies a SchwartzianTransform:

 my_array.sort_by { |x| x.someproperty }
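
To show what sort_by saves you from writing, here is the transform spelled out by hand next to the built-in. The sample data is made up, and the manual version is only a sketch of the idea (decorate, sort, undecorate), not a description of how Ruby implements sort_by internally.

```ruby
words = %w[banana fig pear apple]

# By hand: pair each element with its key, sort on the key, drop the key.
manual = words.map  { |w| [w.length, w] }
              .sort { |a, b| a.first <=> b.first }
              .map  { |pair| pair.last }

# The built-in computes each key once and does the same bookkeeping for you.
builtin = words.sort_by { |w| w.length }
```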

Database queries:

 dbh.select_all("select name, phone from phonebook") do |name, phone|
   ...this code will be executed for each row selected...
 end

String processing beyond regexps:

 "hello".gsub(/./) {|s| s.ord.to_s + ' '}  # converts each char to its code, from Pickaxe book (which used s[0]; that returns a one-character string since Ruby 1.9)

The Python equivalent is terse enough:

 ' '.join([str(ord(x)) for x in 'hello'])

Match the terseness of this stuff in Python. -- VladimirSlepnev

Since when is terseness considered to be a feature? I found the block syntax difficult to read. I fully admit that it is because I haven't lived with it for any length of time. Explain to me, in simple terms, how blocks should be introduced to a novice programmer. I don't happen to be a novice, but I am often in the role of tutoring novices and have found Python to be much easier to teach than either Ruby or Perl (or, for that matter, Java or C).

I for one would assert that the distinction between BlocksInRuby and AnonymousFunctions (also in Ruby) is somewhat unfortunate, because blocks are just a special kind of AnonymousFunction / LexicalClosure. They're too much special syntax for the common case that you want to pass just one closure to a function, as the last parameter. Just for this special case, a syntax is invented that lets you write

  f(x,y,..) {|a,b,..| code... }

instead of

  f(x,y,.., proc{|a,b,..| code... })

(And a special keyword "yield" is invented that you have to use instead of the usual proc.call method to call that "last parameter" closure from inside the function.)

Why not just generalize this so you can use "{|a,b,..| code... }" instead of "proc{|a,b,..| code... }" to create any kind of AnonymousFunction? That way, one could probably retain the current syntax completely, but generalize it to cases where you want to pass more than one closure to a function, and/or pass closures as something other than the last parameter. In such cases, you currently have to pass all but the last parameter as a proc{...}, and then call them using .call (instead of "yield"). -- OlafKlischat
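
For concreteness, this is roughly what the two-closure case looks like today. The if_else helper is invented for the example: the first closure has to be wrapped in proc and invoked with .call, while only the trailing one gets the block syntax and yield.

```ruby
# Hypothetical helper: every closure except the last must be an explicit object.
def if_else(condition, then_branch)
  if condition
    then_branch.call   # an ordinary argument, invoked with .call
  else
    yield              # the trailing block, invoked with yield
  end
end

yes = if_else(1 < 2, proc { 'then' }) { 'else' }
no  = if_else(1 > 2, proc { 'then' }) { 'else' }
```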

Actually the special case of f(parameters, block) is one of the most important special cases of all since it pops up over and over again. It makes sense to write a special syntactic sugar just for it. Every control structure is essentially of the form

function someparameters someblock

This is true for the following control structures: for, while, loop, if, etc.

The genius of Matz was to figure this out and basically refactor it into a syntax. The reason so many of the Ruby examples are so beautiful and easy to understand is that they are very close to being like normal control structures. Thus Ruby makes it easy to create what appear to be new control structures. On the other hand, the more general case of passing several blocks to a function is a little more complicated. But so what? This case occurs far more rarely, so what is the use of optimizing for it?

CategoryRuby CategoryFunctionalProgramming CategoryClosure