Pluggable Conversion
Video transcript & code
Back in Episode #438, we created a "subscript constructor" method
…that was able to generate different concrete subclasses of an abstract Duration
class.
If it sees a string describing a period of months, it returns a Months
object.
If it sees a period of weeks, it returns a Weeks
object.
For days, a days object.
And for anything unrecognized, the ExceptionalValue
class we defined in Episode #431.
require "./models"
Duration["6 months"] # => Months[6]
Duration["12 weeks"] # => Weeks[12]
Duration["21 days"] # => Days[21]
Duration["2 seasons"]
# => #<ExceptionalValue:0x00562236617278
# @raw="2 seasons",
# @reason="Unrecognized format">
The implementation of this method is straightforward. It's all done in one big case statement in the Duration
class.
def self.[](raw_value)
return new(raw_value) if self < Duration
case raw_value
when Duration
raw_value
when /\A(\d+)\s+months\z/i
Months[$1.to_i]
when /\A(\d+)\s+weeks\z/i
Weeks[$1.to_i]
when /\A(\d+)\s+days\z/i
Days[$1.to_i]
when String
ExceptionalValue.new(raw_value, reason: "Unrecognized format")
else
fail TypeError, "Can't make a Duration from #{raw_value.inspect}"
end
end
Right now, the possible subclasses that might be instantiated are hardcoded. There may be Months
, Weeks
, or Days
, and that's it. If we decide to add a new Duration
type, we'll also have to modify this case statement in the base class.
This is actually a bit of a code smell. As a general guideline, subclasses should know about their superclasses, but superclasses shouldn't have to know anything about their children.
But a smell doesn't always indicate a problem, and in this case the abstract superclass and the concrete subclasses all live in the same file together. So long as we don't anticipate frequently adding or removing duration types, this isn't a big deal.
But for the sake of example, let's say we do plan on regularly adding new Duration
types. Let's use this abstract class as a stand-in for the all-too-common scenario where we are making frequent additions to a family of classes.
In that case, it starts to become more of a problem that this parent class has to be modified every time a new child is added or removed. We need to find a way to make the subclasses pluggable, while still enabling this parent-class factory method to locate the right concrete subclass for the given input.
One way we could do this is using naming conventions and dynamic constant lookup.
To do this, we get rid of all the subclass-specific cases.
Then we add some new code to the generic String
case. We add a condition to see if the given string matches a regular expression consisting of some decimal digits, followed by whitespace, followed by a word.
And, if that's true, we also check to see if we're able to find a class name matching the capitalized version of that word.
If so, we delegate to that class's constructor.
Otherwise, we move on to the original bad-input handling code that already lived in this case.
With this code in place, we can define a new Fortnights
duration type, and leave it completely empty.
When we attempt to convert a string specifying a number of fortnights into a Duration
, we get the correct object type back.
require "./models"
class Duration
def self.[](raw_value)
return new(raw_value) if self < Duration
case raw_value
when Duration
raw_value
when String
if raw_value =~ /\A(\d+)\s+(\w+)\z/i &&
(klass = Object.const_get($2.capitalize))
klass.new($1.to_i)
else
ExceptionalValue.new(raw_value, reason: "Unrecognized format")
end
else
fail TypeError, "Can't make a Duration from #{raw_value.inspect}"
end
end
end
class Fortnights < Duration
end
Duration["2 fortnights"]
# => Fortnights[2]
This naming-convention-based approach is one you'll see fairly often in Ruby libraries and frameworks. And in many cases, it's an appropriate strategy.
In this case though, we're trying to construct objects based on user input. Our use of a hardcoded regular expression in the superclass means that subclasses have zero leeway to customize what strings they should match. And by translating user input directly into constant lookup, this approach also potentially introduces serious security concerns.
So let's set this version aside, and explore some other possibilities.
If we're not going to automagically look up classes based on their names, we need some way to register new Duration
subclasses instead.
To make this possible, we'll first add an implementations
method to the class, which will return the @implementations
instance variable, first initializing it to an empty array if needed.
Then we'll add a register
method, which simply adds the given class to the list of implementations.
Next we update the string-matching case in our subscript factory method. We iterate over our list of implementations
.
For each class, we use a pattern
method defined on the class to get a regular expression we can use to match against input strings.
If this pattern matches the given string, we return a new instance of the class, using the first capture group from the regex match as the magnitude argument.
In order to make this approach work with our Fortnights
subclass, we first have to add the pattern
class method.
Then we have to register the class with Duration
.
Let's check to make sure all this worked.
require "./models"
class Duration
def self.[](raw_value)
return new(raw_value) if self < Duration
case raw_value
when Duration
raw_value
when String
implementations.each do |c|
if c.pattern =~ raw_value
return c.new($1.to_i)
end
end
ExceptionalValue.new(raw_value, reason: "Unrecognized format")
else
fail TypeError, "Can't make a Duration from #{raw_value.inspect}"
end
end
def self.implementations
@implementations ||= []
end
def self.register(klass)
implementations << klass
end
end
class Fortnights < Duration
def self.pattern
/\A(\d+)\s+fortnights\z/i
end
end
Duration.register(Fortnights)
Duration["2 fortnights"]
# => Fortnights[2]
Looks like it worked!
Unfortunately, there are a number of problems with this approach.
Yes, it's true that we are now delegating the pattern to be used for input matching to the concrete subclasses.
But instead of making those subclasses an active participant in the input matching process, the Duration
superclass is simply grabbing their pattern
attribute and doing the matching itself.
To make matters even worse, it then makes the bold assumption that the duration magnitude part of the input will be matched by the first capture group in the regex.
This means that implementors of Duration
subclasses have to know, not only to add a pattern
method, but that the returned regex must include a first capture group that grabs the magnitude.
This is a nasty bit of implicit coupling.
In fact, it might seem unrealistically nasty. But this state of affairs is representative of situations that often occur as the result of extracting code from one method into separate, pluggable methods. Where once this was a relatively benign dependency, one line of code depending on the line before it, there are now multiple classes bound together by unstated assumptions.
That doesn't mean that we performed the extraction incorrectly. But the question now becomes, how do we minimize the coupling between implementation classes, and the parent class they register themselves with?
That's the question we'll attempt to answer in the next episode. Until then, happy hacking!
Responses