
Evil Monkeys


In the last episode I mentioned that the Unitwise gem has an optional core-extension library that provides a shorthand syntax for instantiating quantity objects. We can type something like 23.foot and get a Unitwise::Measurement object back.

require "unitwise/ext"

23.foot                         # => #<Unitwise::Measurement value=23 unit=foot>

I also said in that episode that I'm not a fan of extensions like this, and I'm glad that Unitwise makes it optional. Today I'd like to expand on that opinion a bit.

First off, a quick refresher. Ruby has what are known as "open classes". Any class can be re-opened at any time, and methods changed or new methods added. For instance, let's add a new method to Ruby's Numeric class, which is inherited by all core numeric types.

class Numeric
  def cow
    puts `cowsay #{self} is my favorite number!`
  end
end

42.cow
# >>  ___________________________
# >> < 42 is my favorite number! >
# >>  ---------------------------
# >>         \   ^__^
# >>          \  (oo)\_______
# >>             (__)\       )\/\
# >>                 ||----w |
# >>                 ||     ||

This technique of adding methods to existing classes, or changing the methods they already have, is colloquially known as "monkey patching". It is a tremendously powerful technique. It enables us to mold the language to our needs. It also gives us a tool of last resort when some library is broken and we need to fix it faster than we can get our hands on a new version.

But as usual, with great power comes great danger. Nowhere is that more true than when monkey-patching core or third-party classes in Ruby.

First off, the way monkey-patched methods are implemented is often confusing to newcomers. Let's say we're new to a project, and we run across some code that uses the Unitwise shorthand syntax.

require "unitwise/ext"

# ...

23.foot                         # => #<Unitwise::Measurement value=23 unit=foot>

"Hmm," we think. "I wonder where that method comes from?" Let's grab a method object for it, which we can then query for its source location.

require "unitwise/ext"
23.method(:foot)           # => 
# ~> -:2:in `method': undefined method `foot' for class `Fixnum' (NameError)
# ~>    from -:2:in `<main>'

Wait, undefined method? That's weird. Let's ask the object if it responds to :foot.

require "unitwise/ext"
23.respond_to?(:foot)           # => false

OK, this is getting weirder and weirder. Is it even in the method list?

require "unitwise/ext"
23.methods.grep(:foot)          # => []

So the method doesn't exist… except when we send the message, and then it does.

Now, some of this can be chalked up to how the Unitwise dynamic method lookup is currently implemented, and could be fixed up to be slightly less surprising. But this is a common problem with core extensions of this nature, where the intention is to add a whole range of dynamically-discovered methods in order to provide a sort of "little language" for special values. It takes a fair amount of metaprogramming effort to come up with dynamic methods which behave in every way like built-in methods.
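To make that "metaprogramming effort" a little more concrete, here is a minimal sketch of a dynamic method that cooperates with introspection. The Length class and its UNITS list are hypothetical stand-ins invented for this illustration; this is not how Unitwise is implemented. The key piece is the respond_to_missing? hook, which Ruby consults for both respond_to? and method.

```ruby
# Hypothetical stand-in for a dynamically extended class.
# This is NOT the Unitwise implementation.
class Length
  UNITS = %i[foot meter].freeze

  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Answer any message that names a known unit; defer everything else.
  def method_missing(name, *args, &block)
    return [value, name] if UNITS.include?(name)
    super
  end

  # Without this hook, respond_to? and method would both deny
  # that the dynamic methods exist.
  def respond_to_missing?(name, include_private = false)
    UNITS.include?(name) || super
  end
end

len = Length.new(23)
len.foot                      # => [23, :foot]
len.respond_to?(:foot)        # => true
len.method(:foot).call        # => [23, :foot]
len.methods.include?(:foot)   # => false
```

Even with the hook in place, the dynamic method still doesn't show up in the methods list, so the illusion remains incomplete.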

Of course, one of the alternatives to the current situation is that we see dozens or perhaps hundreds of extra methods listed, most of which we will never care about, when we send the methods message. I'm not sure that this is preferable.

But when it comes to monkey patching, the discoverability of added methods is actually the least of our problems. A much more serious danger is method collisions.

Consider this scenario. We're running a shipping business, and we use Unitwise to represent the sizes and weights of batches.

require "unitwise/ext"

weight = 150.pound
weight # => #<Unitwise::Measurement value=150 unit=pound>

We also use a separate Money library for managing accounts.

class Money
  def initialize(amount, currency)
    @amount   = amount
    @currency = currency
  end

  # ...
end

One day the author of the money library gets clever and decides to add some shorthand methods to Numeric to facilitate creating new Money objects. This includes a #pound method for representing amounts in Pounds Sterling.

require "unitwise/ext"

class Money
  def initialize(amount, currency)
    @amount   = amount
    @currency = currency
  end

  # ...
end

class Numeric
  def pound
    Money.new(self, :pounds_sterling)
  end
end

weight = 150.pound
weight # => #<Money:0x00000001aca4f8 @amount=150, @currency=:pounds_sterling>

We update to the new version of the library and bang, just like that, we have a bug. Depending on how similarly the Money objects and Unitwise objects behave, the bug may be more or less easy to find.

Now imagine a situation where depending on how the system's tests are run, they may load the libraries in a different order than they are loaded in production. So sometimes the bug shows up, and sometimes it doesn't, depending on which library "wins".

Now take a moment to breathe slowly and let your blood pressure get back to normal.

This example illustrates an important point about monkey patching. I've sometimes seen it said that "monkey patching" only refers to overriding existing methods, not to adding new methods. But this is a distinction without a difference. As this example demonstrates, every method addition is a method override waiting to happen.
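Here is a small sketch of the load-order problem in isolation. Parcel is a hypothetical stand-in for a shared class like Numeric, and the two lambdas play the role of two libraries being required. Whichever patch is applied last silently wins.

```ruby
# Parcel is a hypothetical stand-in for a class that two libraries patch.
class Parcel; end

# Each lambda plays the role of requiring one library.
unit_patch  = -> { Parcel.class_eval { def pound; :weight;   end } }
money_patch = -> { Parcel.class_eval { def pound; :currency; end } }

# One load order: the units library first, then the money library.
unit_patch.call
money_patch.call
first_order = Parcel.new.pound    # => :currency

# The opposite load order: the later definition replaces the earlier one.
money_patch.call
unit_patch.call
second_order = Parcel.new.pound   # => :weight
```

No error is raised in either case. The only visible difference is the kind of object we get back.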

Looking at an individual case, it may seem unlikely that a given method will ever clash with another library's patch. I can only tell you what my experience has taught me. Over the decade or so I've worked in Ruby, I have seen monkey-patching method collisions cause problems over, and over, and over again in many different projects.

One early example that comes to mind is this one: back in the Ruby 1.8 days, there was no built-in String#chars method. The ActiveSupport library added this method, which returns a list of the individual characters in the string.

"foo".chars                     # => ["f", "o", "o"]

Then Ruby 1.8.7 came along. The Ruby core team decided that #chars was a good idea. Not only did they add it to Ruby 1.9, but they backported it to 1.8.7. Unfortunately, their version was subtly different from the ActiveSupport patched version. It returned an Enumerator instead of an Array. In an application I worked on this incompatibility caused some rather difficult-to-diagnose bugs deep in the bowels of Ruby on Rails when I upgraded to Ruby 1.9.
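We can't easily reproduce the 1.8.7 behavior on a current Ruby, where String#chars returns an Array (as it has since Ruby 2.0). As a rough approximation, the sketch below uses String#each_char as a stand-in for the Enumerator-returning version, to show why code written against an Array can trip over an Enumerator.

```ruby
text = "foo"

as_array      = text.chars       # an Array on Ruby 2.0 and later
as_enumerator = text.each_char   # an Enumerator, standing in for 1.8.7's #chars

# Code written against the Array version can index into the result...
as_array[0]                      # => "f"

# ...but an Enumerator has no #[] method, even though it yields
# exactly the same characters.
as_enumerator.to_a == as_array   # => true
as_enumerator.respond_to?(:[])   # => false
```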

This was before Ruby made any attempt to comply with semantic versioning. However, it's important to note that from Ruby's perspective, this should not have been a breaking API change. It was the addition of a feature, rather than a change to an existing feature. If Rails had not depended on a core extension, and had instead supplied its "chars" method in some non-intrusive way, this would never have caused a problem.

So what can we do to avoid these kinds of extension clashes? I've seen code that checks to see if a method already exists before patching, although as we've already seen, asking if a method exists is unreliable in the presence of other patches. I've seen some elaborate strategies employed that involve verifying that the class to be patched has not deviated in any way from the original before applying a patch. All of these approaches seem to me to be overcomplicated and easy to mess up.
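As a rough sketch of why the "check first" strategy is fragile, consider a hypothetical Crate class that some earlier patch has already extended using method_missing. Both the class and the patches are invented for illustration. The guard sees no #pound method, so it goes ahead and patches anyway.

```ruby
# Crate is a hypothetical stand-in for a class we do not own.
class Crate
  # An earlier "patch" that answers #pound dynamically, without
  # telling Ruby's introspection methods about it.
  def method_missing(name, *args)
    name == :pound ? :weight : super
  end
end

# Value returned by the dynamic version, before the guarded patch.
before = Crate.new.pound              # => :weight

# The defensive check: only patch if the method isn't already defined.
guard_passed = !Crate.method_defined?(:pound)

if guard_passed
  class Crate
    def pound
      :currency
    end
  end
end

after = Crate.new.pound               # => :currency
```

The guard did its job as written, and we still clobbered someone else's extension.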

The other solution is much simpler: don't extend classes we don't own. This can mean a slight reduction in the fluency of code, but to a lesser degree than we might imagine.

Personally, I don't think that typing out the Unitwise conversion function is all that onerous, and it's wonderfully unambiguous. Take one look at this code and you know exactly what library the resulting object comes from.

require "unitwise"

Unitwise(23, "foot")
# => #<Unitwise::Measurement value=23 unit=foot>

But if we want to make this a little shorter to type, we can always define some local aliases in our own code. For instance, we might alias Unitwise to just U.

require "unitwise"

module Units
  alias U Unitwise
end

include Units

U(23, "foot")
# => #<Unitwise::Measurement value=23 unit=foot>

Or, if we mainly work with just a few units, we could define some specialized conversion functions for those units.

require "unitwise"

module Units
  def Feet(value) Unitwise(value, "foot") end
end

include Units

Feet(23)
# => #<Unitwise::Measurement value=23 unit=foot>

If I find myself wanting to perform a particular operation on a core type in many different places in a codebase, I'll often just define that operation in a helpers module and include that module everywhere I need to. For instance, if I find myself wanting to reduce a string to a simplified "slug" in many different contexts, I might put a #slug method in a StringHelpers module.

module StringHelpers
  def slug(str)
    str.to_s.downcase.tr_s("^a-z0-9", " ").strip.tr(" ", "-")
  end
end

include StringHelpers

slug("All mimsy were the borogroves") # => "all-mimsy-were-the-borogroves"

Another approach is to wrap objects in delegator objects which add some extra methods on top. We can quickly convert the StringHelpers module to such a wrapper with the help of Ruby's delegate library.

require "delegate"

class StringHelpers < DelegateClass(String)
  def to_slug
    downcase.tr_s("^a-z0-9", " ").strip.tr(" ", "-")
  end
end

line = "All mimsy were the borogroves"
line = StringHelpers.new(line)
line.to_slug                          # => "all-mimsy-were-the-borogroves"

I don't use this pattern that often, but it can occasionally be useful.

A more exotic approach is to locally extend individual instances with our patches, rather than changing the global class.

module StringHelpers
  def to_slug
    downcase.tr_s("^a-z0-9", " ").strip.tr(" ", "-")
  end
end

line = "All mimsy were the borogroves"
line.extend(StringHelpers)
line.to_slug                          # => "all-mimsy-were-the-borogroves"

What all of these approaches have in common is the locality of the changes they make. To quote Einstein, there's no "spooky action at a distance" going on here.
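We can check that locality for ourselves. In this sketch, using the same StringHelpers module as the instance-extension example, only the one extended string gains the new method. Other strings, and the String class itself, are left alone.

```ruby
module StringHelpers
  def to_slug
    downcase.tr_s("^a-z0-9", " ").strip.tr(" ", "-")
  end
end

# dup so the example also works when string literals are frozen.
patched = "All mimsy".dup
patched.extend(StringHelpers)

untouched = "Were the borogoves"

patched.to_slug                    # => "all-mimsy"
untouched.respond_to?(:to_slug)    # => false
String.method_defined?(:to_slug)   # => false
```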

I don't want to end on a completely pessimistic note about extending core classes. One of the lessons Ruby learned from Lisp is the idea that you should be able to build your language up to match its domain. Open classes are one of the ways Ruby lives up to this ideal.

Rails is actually a great example of this. It is effectively a collection of domain-specific languages for building web applications. As such, it extends core classes extensively in order to make them more amenable to the needs of web application developers.

As we saw earlier, this can sometimes lead to problems. But when it works, it works really well. And when it breaks, Rails has a small army of active maintainers ready to do the heavy lifting to find and fix the problem.

Likewise, within our own applications we should feel empowered to extend the language towards our domain if it results in significant gains in clarity and expressiveness. But when we do, we need to keep one very important point in mind. Every extension we make to core or third-party classes burdens us with a new responsibility: the responsibility to explain and maintain that extension for every other user of the codebase.

Happy hacking!
