Unit 1, Lesson 21

Enumerable Internals

Video transcript & code

Hi, this is Pat Shaughnessy.

One of the most useful features of Ruby is the way you can iterate over a series of values and process each one with a block. Let’s take a simple example in IRB.

Here I’m using the Enumerable#all? method to check whether all of the array’s elements are even. Since they’re not all even, I get a result of false. This looks simple enough, but how does this actually work? How do the values get from the array into the block?
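
The IRB session itself isn't reproduced in this transcript. A minimal stand-in might look like the following; the array values are an assumption, not taken from the video.

```ruby
# Hypothetical stand-in for the IRB session; the exact array used in the
# video is not in the transcript.
numbers = [1, 2, 3, 4]

# all? returns true only if the block returns a truthy value for every element.
result = numbers.all? { |n| n.even? }

puts result  # prints false, because 1 and 3 are odd
```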

The best way to get a deeper understanding of how Ruby itself works is to look at Ruby’s own C source code. This is the enum.c file, which defines the Enumerable module. To find your way around, search for the Init_Enumerable function at the bottom of the file.

Here you’ll see a series of calls to rb_define_method, one for each method of Enumerable. If we search for all?, we’ll see that it’s implemented by the enum_all C function. Each built-in Ruby method has a corresponding C function.

Now if I search for enum_all using Vim, I’ll find the actual implementation of Enumerable#all? farther up, in the middle of enum.c. It’s easy to find where Ruby implements built-in methods - the hard part is understanding what the C code actually means. Here you can see there are three lines of code; the important one is the second one, the line that contains rb_block_call. Let’s read through this line carefully.

rb_block_call means: take the first parameter, the receiver, and call a given method on it, named by the second parameter. In this case, Ruby calls each on obj, the array from my example. The third and fourth parameters, zeroes in this case, would specify the arguments to pass to that method; each takes none. The last two values, ENUMFUNC and memo, specify a block and a block parameter to use for the call to each. This is actually a block built into Ruby for all? to use, not the block I wrote that checks whether numbers are even.
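
To make that parameter list a little more concrete, here is a rough Ruby-level analogy. The block_call method and its parameter names are made up for illustration and are not part of Ruby; the real rb_block_call is a C function that takes an argument count and argument list separately.

```ruby
# Hypothetical Ruby stand-in for the C function rb_block_call: call
# method_name on receiver with args, passing block_fn as the block.
# memo plays the role of the extra value handed to the internal block.
def block_call(receiver, method_name, args, block_fn, memo)
  receiver.__send__(method_name, *args) { |element| block_fn.call(element, memo) }
end

# A stand-in for an internal block: it records each element it receives.
collect = ->(element, memo) { memo << element }

seen = []
block_call([1, 2, 3], :each, [], collect, seen)
p seen  # prints [1, 2, 3]
```

The point of the analogy is only that the receiver, the method name, the method arguments, and the block all arrive as separate values.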

So you can see that, under the hood, all? works by calling each on the array. If you search for id_each here in enum.c, you’ll see all the different Enumerable methods work the same way, by calling each on the receiver or target object. You can think of the Enumerable module as a collection of fancy ways to call each.
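
One way to see this from Ruby, without reading any C, is to write a class whose only iteration method is each and count how often it gets called. CountingList below is a made-up class for illustration, not something from the video.

```ruby
# Hypothetical class used only for illustration.
class CountingList
  include Enumerable

  attr_reader :each_calls

  def initialize(*items)
    @items = items
    @each_calls = 0
  end

  # The only iteration method defined here. Enumerable builds on it.
  def each
    @each_calls += 1
    @items.each { |item| yield item }
    self
  end
end

list = CountingList.new(2, 4, 5)
list.all?(&:even?)      # false
list.map { |n| n * 2 }  # [4, 8, 10]
list.include?(4)        # true
puts list.each_calls    # prints 3: one call to each per Enumerable method
```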

Sometimes the best way to get a sense of how Ruby works is to see it in action. Let’s go back to the console and re-run my simple example in GDB, a C debugger. Here you can see my example code is in all.rb, and if I re-run it I get the same result. But now let’s run GDB.

The first thing I’ll do is set a breakpoint using the b command on the enum_all function we were just reading. And then I’ll run my program using the r command. You can see right away Ruby hits my breakpoint.

Now let’s set another breakpoint - this time on a C function called all_iter_i. As we’ll see later, this C function implements the internal block that Enumerable#all? passes to rb_block_call.

If we let Ruby continue, we’ll hit the new breakpoint. Now I’ll ask GDB for a C backtrace - the equivalent of calling puts caller in a Ruby program. Since this is kind of hard to read here, let’s paste it into Vim so the lines don’t wrap.
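
For comparison, here is what the Ruby-level version of that backtrace might look like. NumberList is a made-up class for illustration; its each is written in Ruby so the frame is easy to spot. The exact text of each frame varies between Ruby versions, so no output is shown.

```ruby
# Hypothetical class for illustration; it is not from the video.
class NumberList
  include Enumerable

  def initialize(*numbers)
    @numbers = numbers
  end

  def each
    @numbers.each { |n| yield n }
  end
end

frames = nil
NumberList.new(1, 2, 3).all? do |n|
  frames ||= caller  # capture the Ruby-level call stack on the first iteration
  n.even?
end

# The captured stack should include frames for each and for all?.
puts frames
```

Unlike the GDB backtrace, this only shows Ruby method names, not the C functions behind them.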

Now the backtrace is a bit easier to read. You can see the call to enum_all in the middle, and at the top we see Ruby calls all_iter_i.

It’s still a little bit hard to read this, so let’s paste it into a slide where I have a diagram that shows the different types of C functions Ruby is using. At the bottom you can see Ruby calls enum_all while running my simple example program. Then moving up, you can see enum_all calls rb_block_call, which we saw earlier, which in turn calls a series of internal YARV functions. YARV is Ruby’s virtual machine: the real guts of Ruby’s implementation.

Moving up to the center of the slide, you can see Ruby calls rb_ary_each, which implements Array#each. So far so good, we can see Enumerable#all? calls each on my object, the example array.

Moving even farther up, we can see Array#each calls rb_yield; in other words, Array#each yields to a block.

And at the top of the stack, we can see the block that Array#each yielded to is an internal block inside of Enumerable#all? implemented by the all_iter_i C function. We’ll look at that in just a moment. If I let the program continue, this internal block would, in turn, yield to the Ruby block I wrote that checks whether the numbers were even. Now let’s return to enum.c and take a look at all_iter_i.

We can find the definition of all_iter_i by searching for ENUMFUNC above where we saw it earlier. Here you can see a lot of complex C macros, which I don’t have time to explain today.

But the line I highlighted in the center of the screen defines all_iter_i. You can see it calls enum_yield, yielding to my block about the even numbers, and then passes my block’s return value to the C function at the bottom of the screen. This uses RTEST to see whether my block returned a true or false value. If it was false, Ruby calls rb_iter_break, breaking out of the loop started by Array#each.
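
The C source isn't reproduced in this transcript, but the logic just described can be sketched in plain Ruby. The sketch_all? method below is a hypothetical rendering for illustration only; the real implementation lives in enum.c and is built from C macros.

```ruby
# Hypothetical Ruby rendering of the logic described above.
def sketch_all?(collection)
  memo = true                   # the result, assumed true until proven otherwise
  collection.each do |element|  # like rb_block_call calling each on the receiver
    unless yield(element)       # like enum_yield followed by the RTEST check
      memo = false
      break                     # like rb_iter_break: stop the each loop early
    end
  end
  memo
end

visited = []
result = sketch_all?([2, 4, 5, 6]) do |n|
  visited << n
  n.even?
end

p result   # prints false
p visited  # prints [2, 4, 5]: iteration stopped at the first odd number
```

Notice that 6 is never visited, which is the effect of breaking out of the loop as soon as the block returns a false value.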

Anyway, that’s all the time I have today, but I hope you had fun looking at how Enumerable#all? works under the hood. The next time you have a question about why Ruby works the way that it does, don’t be afraid to dive into Ruby’s C source code to find out!
