
Excessive Decoupling

Video transcript & code

Here's some code you might find familiar.

Back in Episode #443, after experimenting with a few different designs for extensibility, we settled on this one for plugging in new kinds of Duration.

We have a list of known implementations…

…and for each implementation, we send it the try_parse message, looking for the one—if any—that returns a non-nil value.

class Duration < WholeValue
  # ...
  def self.[](raw_value)
    return new(raw_value) if self < Duration
    case raw_value
    when Duration
      raw_value
    when String
      implementations.detect{|c|
        value = c.try_parse(raw_value) and break value
      } || ExceptionalValue.new(raw_value, reason: "Unrecognized format")
    else
      fail TypeError, "Can't make a Duration from #{raw_value.inspect}"
    end
  end
  # ...
end

In order to add new kinds of duration, we just have to register their classes with the Duration superclass.

Duration.register(Days)
Duration.register(Weeks)
Duration.register(Months)

As you might recall from that episode, this design was the culmination of a series of experiments, each one increasing the level of decoupling over the one before. In the approach we wound up with, the Duration class has no hardcoded knowledge of concrete implementation classes.

And the only thing it knows about the classes in its implementation list is that they respond to try_parse.
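The actual definitions live in the models file, which we don't see in this episode. But as a reminder of what that protocol looks like in practice, a try_parse class method on something like Days presumably reads along these lines (the exact regex and constructor call here are my guesses, not the real implementation):

class Days < Duration
  # Sketch only: map strings like "5 days" to a Days value, or return nil
  # if the input doesn't match this format.
  def self.try_parse(raw_value)
    if (match = /\A(\d+)\s+days?\z/i.match(raw_value))
      new(match[1].to_i)
    end
  end
end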

Have we taken this code to the limits of decoupling, or could we go further?

Well, as a general rule, we can always add new layers of decoupling to software. The only question is: is it worth it?

For instance, having taken things this far, here's a next step that I would typically consider. We've said that the implementations list must contain things that respond to the message try_parse. But try_parse has the expected semantics of a very simple function: given a single argument, it returns either nil or a Duration object.
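Concretely, assuming Days is implemented along the lines of the sketch above, the contract looks like this:

Days.try_parse("5 days")     # => Days[5]
Days.try_parse("next week")  # => nil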

This is effectively a mapping function. And when we map inputs to values in Ruby, a lot of times we do it with anonymous blocks of code.

For instance, if we want to map a list of integers to a list of their square roots, we send the map message with a block that performs the intended transformation.

[4, 5, 16, 19].map{|n| Math.sqrt(n) }
# => [2.0, 2.23606797749979, 4.0, 4.358898943540674]

In a sense, the try_parse message is just another transformation. Looking at it from this perspective, we could envision a design where we register a new duration type by simply supplying a block that will map from inputs to either a duration or nil.

class Fortnights < Duration
end

Duration.register{|raw_value|
  if (match = /\A(\d+)\s+fortnights\z/i.match(raw_value))
    Fortnights.new(match[1].to_i)
  end
}

Let's go ahead and make the changes to make this code work.

Instead of an implementations list, we'll have a parser_list.

The register method will take a block, using the ampersand operator to automatically convert the block to a Proc object.

It will add this Proc to the list of parsers.

Now we need to change the subscript construction method to use this parser list instead of the old implementation list.

Inside the detect block, we change from sending try_parse to an implementation, to sending call to the parser.

Remember, we now expect the parser to be a Proc or some similarly call-able object.

require "./models"
class Duration
  def self.parser_list
    (@parser_list ||= [])
  end

  def self.register(&parser)
    parser_list << parser
  end

  def self.[](raw_value)
    return new(raw_value) if self < Duration
    case raw_value
    when Duration
      raw_value
    when String
      parser_list.detect{|p|
        value = p.call(raw_value) and break value
      } || ExceptionalValue.new(raw_value, reason: "Unrecognized format")
    else
      fail TypeError, "Can't make a Duration from #{raw_value.inspect}"
    end
  end
  # ...
end

Let's make sure this works.

Duration["5 fortnights"]
# => Fortnights[5]

Yep, it looks like our registration of a Fortnights conversion proc worked just fine!

We're now sending the very generic call message, instead of the domain-specific try_parse message.

But there's also a less obvious way in which we've further decoupled our code. We've now separated the process of mapping input into a Fortnights object…

…from the actual definition of the Fortnights class.

The Fortnights class is no longer required to know about possible user input formats.

What about the other Duration subtypes we already have defined?

They already have try_parse methods defined.

Do we have to move them all out into anonymous blocks?

Thankfully, we don't. We can register the try_parse class methods as if they were blocks. We do this by getting references to the method objects using the method message, and then using the ampersand operator to pass those objects in place of a block.

Duration.register(&Days.method(:try_parse))
Duration.register(&Weeks.method(:try_parse))
Duration.register(&Months.method(:try_parse))
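If that trick is unfamiliar: Days.method(:try_parse) hands us a Method object, which is itself call-able, and the ampersand converts it to a Proc on its way into register. As a quick sanity check (the result shown is what we'd expect, given the Days implementation in the models file):

parse_days = Days.method(:try_parse)
parse_days.call("5 days")  # => Days[5]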

We can now instantiate our traditional set of duration types.

require "./models2"
require "./registration2"

Duration["5 days"]              # => Days[5]
Duration["10 weeks"]            # => Weeks[10]

So, we've further decoupled our design, but we've been able to re-use our old try_parse methods. Is there any downside to this new version?

Let's take a look at the implementation list from our original Duration class.

require "./models"
require "./registration"

Duration.implementations
# => [Days, Weeks, Months]

We see a list of the implementation classes.

Now let's examine the contents of the new parser_list.

require "./models2"
require "./registration2"

Duration.parser_list
# => [#<Proc:0x0055fd14e99f50@/home/avdi/Dropbox/rubytapas-shared/working-episodes/445-excessive-decoupling/registration2.rb:4>,
#     #<Proc:0x0055fd14e99eb0 (lambda)>,
#     #<Proc:0x0055fd14e99aa0 (lambda)>,
#     #<Proc:0x0055fd14e999d8 (lambda)>]

This time, we see a list of Proc objects. Only one of them indicates what file it is defined in. None of them give any indication of what they do. It turns out that anonymous blocks of code are pretty, well… anonymous.

This is our first indication that there may be some drawbacks to the level of decoupling we've now introduced.

Now imagine that we weren't the ones to write the Duration class. But we've been tasked with adding a new kind of duration. If we examine the code that uses the parser_list, we see that it expects the parsers to be call-able.

This gives us zero clue about what the expected semantics are, and no real hint as to how to find out. We're dependent on whatever documentation the original writers left us.

Now imagine we're trying to add a new duration type, but we're working with the old version of Duration. We naively create a new class, and register it.

Then we try to parse some test input text.

require "./models"

class Parsecs
end

Duration.register(Parsecs)

kessel_run = Duration["11 parsecs"] # ~> NoMethodError: undefined method `try_parse' for Parsecs:Class
# =>

# ~> NoMethodError
# ~> undefined method `try_parse' for Parsecs:Class

Immediately we see the code is expecting a try_parse class method.

How should we write such a method? Well, it's very easy for us to search through some of the existing duration types for the try_parse method and see how they define it. Then we can use what we learn to inform how we write our own implementation.
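For example, starting over in a fresh file and modeling Parsecs on that pattern, we might end up with something like the following sketch. The regex, the constructor call, and the inheritance from Duration are all my guesses, patterned on the Fortnights parser from earlier rather than taken from the episode's source:

class Parsecs < Duration
  # Sketch only: map strings like "11 parsecs" to a Parsecs value, or
  # return nil if the input doesn't match.
  def self.try_parse(raw_value)
    if (match = /\A(\d+)\s+parsecs?\z/i.match(raw_value))
      new(match[1].to_i)
    end
  end
end

Duration.register(Parsecs)

kessel_run = Duration["11 parsecs"]  # => Parsecs[11] (expected)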

What we see from these last couple of examples is that the more-coupled version is also a lot more accessible. It provides affordances for later programmers which are missing from the extreme-decoupling version.

The key to this accessibility is the prominence of names. The implementations list returned a list of class names. The NoMethodError from a missing try_parse method gave us an obvious clue, in the form of a message name, about what code we should be looking at for examples.

Remember, the computer doesn't care if anything has a name. As far as it is concerned, everything could just be anonymous functions calling anonymous functions forever. We connect things with names instead of with anonymous function references in order that we humans can make some sense of the way our programs are connected together.

There are cases where the generic nature of the call-able object concept buys us so much desirable flexibility that it's worth the reduction in traceability. We can see this sometimes with callbacks or "hooks", as well as with operations on data structures. We explored some of the virtues of the call-able concept back in Episode #35.

But in this case, I don't think the extra decoupling of block-based conversion registration nets us enough advantages to make up for the reduced accessibility. Sometimes, we need to resist the siren song of ever-increasing decoupling, and find a happy midpoint. I think this is one of those times. Happy hacking!
