Itself

Video transcript & code

If you are lucky enough to be using Ruby 2.2, there are several new features available to you. Among these is a new method on objects called "itself".

At first glance, itself may seem like the least useful method ever added to Ruby. If you send the #itself message to a Ruby object, it returns the same object. That's it. No really, that's all it does.

The method is defined on Kernel, so it is available on pretty much any object other than BasicObject.

123.itself                      # => 123
"foo".itself                    # => "foo"
o = Object.new                  # => #<Object:0x007f4fe7a5ba98>
o.itself                        # => #<Object:0x007f4fe7a5ba98>

BasicObject.new.itself          # => NoMethodError: undefined method `itself' for #<BasicObject:0x007f4fe7a5b6b0>

# ~> NoMethodError
# ~> undefined method `itself' for #<BasicObject:0x007f4fe7a5b6b0>
# ~>
# ~> xmptmp-in254798vt.rb:6:in `<main>'
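
To check that #itself hands back the very same object rather than an equal copy, the identity can be tested with equal?. This is only a small sketch; the variable names are made up for illustration.

```ruby
original = String.new("foo")   # explicitly mutable string
returned = original.itself

returned.equal?(original)      # => true: the very same object, not a copy

# Because both names refer to one object, a mutation through one
# is visible through the other.
returned << "bar"
original                       # => "foobar"

# Kernel is mixed into Object but not into BasicObject.
Object.include?(Kernel)               # => true
BasicObject.include?(Kernel)          # => false
BasicObject.method_defined?(:itself)  # => false
```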

If you think this functionality sounds a bit less than earthshaking, you could easily be forgiven. However, this method isn't as useless as it might first appear.

Here's a convenience method called #top. The job of #top is to take a collection of objects, sort it, and return a list of the top 10 objects. For flexibility, it takes several keyword arguments. The first, on, controls which attribute of the objects the sort is based on. The second, by, controls the predicate used to order the sort; it defaults to the less-than operator. Finally, the as keyword determines which aspect of each object is returned in the final list.

def top(things, on:, by: :<, as: nil)
  things.sort{|x, y|
    x_key = x.public_send(on)
    y_key = y.public_send(on)
    if x_key.public_send(by, y_key)
      -1
    elsif y_key.public_send(by, x_key)
      1
    else
      0
    end
  }.last(10).map{|o| as ? o.public_send(as) : o}
end

This will all be more clear if we try it out. Let's get a list of files. Then let's use top to find the top 10 files in terms of file size.

We can flip this around and find the smallest files by specifying the greater-than relationship instead of less-than.

Or, we can sort by a different file attribute entirely, such as the last time the file was changed.

So far, the top method has been returning arrays of Pathname objects. We can have it return strings instead, by specifying as: :to_s to set the attribute which should represent the objects in the output.

require "./top"
require "pathname"

files = Pathname.glob("/home/avdi/Dropbox/rubytapas/**/*.mp4")

top files, on: :size
# => [#<Pathname:/home/avdi/Dropbox/rubytapas/120-outside-in/screencast-20130630-2000.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-14_16.28.41.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/116-extract-command-object/screencast-20130612-1706.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/278-lazy/media/screen-capture-2015-01-14_15.25.00.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/113-p/screencast-20130604-1113.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-17_04.49.32.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/279-audited-predicate/media/footage/screen-capture-2015-01-17_04.49.32.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-08_15.46.15.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/276-fattr/media/screen-capture-2015-01-08_15.46.15.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/095-gem-love-6/screencast-20130409-1751.mp4>]

top files, on: :size, by: :>
# => [#<Pathname:/home/avdi/Dropbox/rubytapas/196-string-templates/screencast-20140319-1604.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/189-assisted-refactoring/slides.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/203-hash-table/slides.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/205-comparable/slides.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/218-spaceship-revisited/screencast-20140604-1811.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/235-load/screencast-20140807-1701.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/233-flip-flop/screencast-20140801-2156_5951.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/RubyTapas Slides.potx.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/235-load/screencast-vlc.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/127-parallel-fib/screencast-20130716-1658.mp4>]

top files, on: :ctime
# => [#<Pathname:/home/avdi/Dropbox/rubytapas/278-lazy/media/screen-capture-2015-01-14_15.25.00.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-14_16.27.55.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-14_16.28.41.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/278-lazy/Presentation1.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/278-lazy/278-lazy.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-16_18.38.06.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/279-audited-predicate/media/footage/screen-capture-2015-01-16_18.38.06.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-17_04.49.32.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/279-audited-predicate/media/footage/screen-capture-2015-01-17_04.49.32.mp4>,
#     #<Pathname:/home/avdi/Dropbox/rubytapas/279-audited-predicate/279-audited-predicate.mp4>]

top files, on: :size, as: :to_s
# => ["/home/avdi/Dropbox/rubytapas/120-outside-in/screencast-20130630-2000.mp4",
#     "/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-14_16.28.41.mp4",
#     "/home/avdi/Dropbox/rubytapas/116-extract-command-object/screencast-20130612-1706.mp4",
#     "/home/avdi/Dropbox/rubytapas/278-lazy/media/screen-capture-2015-01-14_15.25.00.mp4",
#     "/home/avdi/Dropbox/rubytapas/113-p/screencast-20130604-1113.mp4",
#     "/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-17_04.49.32.mp4",
#     "/home/avdi/Dropbox/rubytapas/279-audited-predicate/media/footage/screen-capture-2015-01-17_04.49.32.mp4",
#     "/home/avdi/Dropbox/rubytapas/footage/screen-capture-2015-01-08_15.46.15.mp4",
#     "/home/avdi/Dropbox/rubytapas/276-fattr/media/screen-capture-2015-01-08_15.46.15.mp4",
#     "/home/avdi/Dropbox/rubytapas/095-gem-love-6/screencast-20130409-1751.mp4"]
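
The calls above depend on files that exist only on one machine. As a rough, self-contained way to try the same keywords, the sketch below repeats the first version of top and runs it against a hypothetical Clip struct (Clip and CLIPS are stand-ins invented for this example, not part of the episode).

```ruby
# First version of top from this episode, repeated so the example runs on its own.
def top(things, on:, by: :<, as: nil)
  things.sort { |x, y|
    x_key = x.public_send(on)
    y_key = y.public_send(on)
    if x_key.public_send(by, y_key)
      -1
    elsif y_key.public_send(by, x_key)
      1
    else
      0
    end
  }.last(10).map { |o| as ? o.public_send(as) : o }
end

# Hypothetical stand-in for Pathname: anything responding to #size works.
Clip = Struct.new(:name, :size)

# Twelve clips with sizes 100, 200, ... 1200.
CLIPS = (1..12).map { |n| Clip.new("clip-#{n}.mp4", n * 100) }

# Ascending sort, then the last ten: the two smallest clips are dropped.
top(CLIPS, on: :size, as: :name).first          # => "clip-3.mp4"
top(CLIPS, on: :size, as: :name).last           # => "clip-12.mp4"

# Descending sort, then the last ten: the two largest clips are dropped.
top(CLIPS, on: :size, by: :>, as: :name).first  # => "clip-10.mp4"
top(CLIPS, on: :size, by: :>, as: :name).last   # => "clip-1.mp4"
```

Since the method sorts and then takes the last ten items, the "top" end of the list is whatever the by predicate places last.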

Now let's try using top on a different kind of collection. We'll read in the system dictionary, and find the top 10 longest words by sorting on the size attribute.

require "./top"

words = IO.readlines("/usr/share/dict/words")

top words, on: :size
# => ["uncharacteristically\n",
#     "counterintelligence's\n",
#     "electroencephalograms\n",
#     "electroencephalograph\n",
#     "Andrianampoinimerina's\n",
#     "counterrevolutionaries\n",
#     "counterrevolutionary's\n",
#     "electroencephalogram's\n",
#     "electroencephalographs\n",
#     "electroencephalograph's\n"]

Next we decide we want to see if we can use this same method to find the top 10 words when sorted lexicographically. This means that instead of being applied to some attribute of the strings, the less-than comparison needs to be applied to the strings themselves.

This presents us with a bit of a conundrum. If we look back at the source of top, we can see that it always expects an on argument. And this argument should be a message which will be sent to each element in order to discover the sort key attribute.

This is the "pluggable selector" pattern, which we talked about way back in Episode #19.

Suddenly, our top method doesn't feel so flexible. But fortunately, we have a solution. For the on keyword, we can pass in the :itself message. When sent to the items in the collection, this message will result in getting the same item back, and our lexicographic sort will proceed successfully.

require "./top"

words = IO.readlines("/usr/share/dict/words")

top words, on: :itself
# => ["élan's\n",
#     "émigré\n",
#     "émigré's\n",
#     "émigrés\n",
#     "épée\n",
#     "épée's\n",
#     "épées\n",
#     "étude\n",
#     "étude's\n",
#     "études\n"]

As we look at this, we realize that this would make for a pretty good default value for the on argument. So we go ahead and update the top method.

def top(things, on: :itself, by: :<, as: nil)
  things.sort{|x, y|
    x_key = x.public_send(on)
    y_key = y.public_send(on)
    if x_key.public_send(by, y_key)
      -1
    elsif y_key.public_send(by, x_key)
      1
    else
      0
    end
  }.last(10).map{|o| as ? o.public_send(as) : o}
end

Now we can find the top 10 words lexicographically without passing any keyword arguments at all.

require "./top2"

words = IO.readlines("/usr/share/dict/words")

top words
# => ["élan's\n",
#     "émigré\n",
#     "émigré's\n",
#     "émigrés\n",
#     "épée\n",
#     "épée's\n",
#     "épées\n",
#     "étude\n",
#     "étude's\n",
#     "études\n"]

Having made this change, we next realize that we can use the same technique to simplify another part of the top code. Presently, the as parameter defaults to nil. The code then has to test the parameter to see if something other than nil has been supplied by the caller. If so, it is used as a transform on the returned items. Otherwise, the original items are used.

as ? o.public_send(as) : o

By making :itself the default for the as parameter, we can get rid of this conditional, and just apply the as value as a transformer in every case. If no special value is given, the default value of :itself will result in the original item being returned.

def top(things, on: :itself, by: :<, as: :itself)
  things.sort{|x, y|
    x_key = x.public_send(on)
    y_key = y.public_send(on)
    if x_key.public_send(by, y_key)
      -1
    elsif y_key.public_send(by, x_key)
      1
    else
      0
    end
  }.last(10).map(&as)
end

If we run it again, our lexicographic sort still works. By using the #itself message, we have preserved the original semantics, but with simpler code.

require "./top3"

words = IO.readlines("/usr/share/dict/words")

top words
# => ["élan's\n",
#     "émigré\n",
#     "émigré's\n",
#     "émigrés\n",
#     "épée\n",
#     "épée's\n",
#     "épées\n",
#     "étude\n",
#     "étude's\n",
#     "études\n"]
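
For readers without a system dictionary at that path, here is a small self-contained check of the final version. FRUITS is hypothetical sample data chosen so that no two words share a length, which keeps the sort free of ties.

```ruby
# Final version of top from this episode, repeated so the example runs on its own.
def top(things, on: :itself, by: :<, as: :itself)
  things.sort { |x, y|
    x_key = x.public_send(on)
    y_key = y.public_send(on)
    if x_key.public_send(by, y_key)
      -1
    elsif y_key.public_send(by, x_key)
      1
    else
      0
    end
  }.last(10).map(&as)
end

# Hypothetical sample data with distinct lengths.
FRUITS = %w[pear apple fig banana coconut]

top(FRUITS)                          # => ["apple", "banana", "coconut", "fig", "pear"]
top(FRUITS, by: :>)                  # => ["pear", "fig", "coconut", "banana", "apple"]
top(FRUITS, on: :size, as: :upcase)  # => ["FIG", "PEAR", "APPLE", "BANANA", "COCONUT"]

# With the defaults, the results are the original objects, not copies.
top(FRUITS).all? { |word| FRUITS.any? { |orig| orig.equal?(word) } }  # => true
```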

If you have any familiarity with functional programming, you may recognize the pattern we've used today. In functional languages, #itself is usually known as the identity function: a function that returns its argument unaltered. It's a common feature in generic code, where it is often used as an argument to functions which take other functions as transformers for data.
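
As a loose illustration of that functional style in Ruby, the sketch below passes an identity function as the default transformer. IDENTITY and transform_all are hypothetical names invented for this example; they do not come from the episode or from any library.

```ruby
# The identity function written as a lambda: it returns its argument unaltered.
IDENTITY = ->(x) { x }

# Hypothetical helper: applies a callable transformer to every item.
# With no transformer given, the identity function leaves items untouched.
def transform_all(items, with: IDENTITY)
  items.map { |item| with.call(item) }
end

transform_all([1, 2, 3])                          # => [1, 2, 3]
transform_all([1, 2, 3], with: ->(n) { n * 2 })   # => [2, 4, 6]

# :itself.to_proc produces an equivalent identity function from the method.
transform_all(%w[a b], with: :itself.to_proc)     # => ["a", "b"]
```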

Of course, there is nothing special about the implementation of #itself. If it were missing from the language, we could easily supply it ourselves. Apart from the fact that this is probably less efficient than the native version, since it is implemented in Ruby instead of in raw C code, it is identical.

class Object
  def itself
    self
  end
end
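
If code has to run on both older and newer Rubies, one possible variation (an assumption on my part, not something shown in the episode) is to guard the definition so that the native method is left alone wherever it already exists.

```ruby
class Object
  # Define the fallback only when the running Ruby lacks #itself,
  # so the built-in version is never overridden.
  unless method_defined?(:itself)
    def itself
      self
    end
  end
end

42.itself  # => 42
```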

But the inclusion of the #itself method into core Ruby means that if we are targeting Ruby versions 2.2 or newer, we can write code with the assumption that #itself is available. And this means we can use it to simplify methods anywhere that selectors are pluggable.
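
Many core methods already accept a pluggable selector in the form of a block. As one small sketch of the same idea outside of top, Enumerable#group_by can group items by the items themselves, which gives a quick way to count occurrences. LETTERS and COUNTS are sample names made up for this example.

```ruby
LETTERS = %w[a b a c b a]

# Group each letter under itself, then replace each group with its size.
COUNTS = LETTERS.group_by(&:itself).map { |letter, group| [letter, group.size] }.to_h

COUNTS  # => {"a"=>3, "b"=>2, "c"=>1}
```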

Happy hacking!
