Break The Circle
Video transcript & code
In episode #380, we looked at a fragment of a class I had been working on. The class had a problem: a circular dependency that I had inadvertently introduced. As a consequence, under certain circumstances, it was getting into an infinite recursion and blowing the stack.
Here's the code.
The specific problem is this: in the orgmode_lisp_dir
method, so long as the user explicitly sets a special environment variable, QUARTO_ORGMODE_VENDOR_LISP_DIR
, the code works fine.
But when this variable is unset, the code falls back to deriving the lisp directory value from a vendor_dir
method elsewhere in the class.
Unfortunately, vendor_dir
has been changed to delegate back to orgmode_lisp_dir
. So when the environment variable hasn't been set up, this code calls itself back and forth until it runs out of stack frames.
#<<deps>>
module Quarto
class Orgmode < Plugin
# ---- 74 lines snipped ----
def version
"8.0.7"
end
# --- 8 lines snipped ----
def orgmode_lisp_dir
ENV.fetch("QUARTO_ORGMODE_VENDOR_LISP_DIR") {
"#{vendor_dir}/lisp"
}
end
# ---- 15 lines snipped ----
def vendor_dir
Pathname(orgmode_lisp_dir).parent.to_s
end
# ---- 70 lines snipped
end
end
My first thought was simply to revert the change that introduced this bug.
Here's the reverted version.
This code depends on an external, global vendor_dir
in order to derive a plugin-specific vendor directory. There's no circular reference back to #orgmode_lisp_dir
.
#<<original>>
module Quarto
class Orgmode < Plugin
# ...
def vendor_dir
(Pathname(main.vendor_dir) + "org-#{version}").to_s
end
# ...
end
end
#<<all_tests>
Unfortunately, this version doesn't capture the semantics I want.
Let's talk about those semantics for a minute. The core of the problem we're dealing with is that there may be one of two different sources of truth for this code to derive paths from. If the user has done no special configuration, then the prime source of truth is the globally configured vendor directory. Both the plugin-specific vendor_dir
and the orgmode_lisp_dir
should be derived from this value.
But, if the user specifically configures a particular QUARTO_ORGMODE_VENDOR_LISP_DIR
, then that value becomes the primary source of truth. Now, instead of the lisp directory being derived from the vendor directory, the reverse is true: the vendor directory should be extrapolated from the lisp directory.
We can talk about whether this is an ideal system design another time. The important thing right now is that this kind of behavior, where the source of truth changes depending on what the user has configured, is a requirement that comes up from time to time in application construction. So it's worth thinking about how to solve this problem cleanly.
In order to express the true intended semantics of the class, we need to introduce a conditional to the vendor_dir
method. We need to check if the user has configured QUARTO_ORGMODE_VENDOR_LISP_DIR
. If so, we need to treat it as the source of truth. Otherwise, we need to build our result on the global vendor directory.
#<<original>>
module Quarto
class Orgmode < Plugin
# ...
def vendor_dir
if ENV.key?("QUARTO_ORGMODE_VENDOR_LISP_DIR")
Pathname(ENV["QUARTO_ORGMODE_VENDOR_LISP_DIR"]).parent.to_s
else
(Pathname(main.vendor_dir) + "org-#{version}").to_s
end
end
# ...
end
end
#<<all_tests>>
Let's look at the two methods we've been talking about. What do they now share in common?
It might not be immediately obvious, but both of these methods now share a common structure. They are both switching their behavior on the presence or absence of the same environment variable.
Let's make this sameness more obvious, by converting orgmode_lisp_dir
from using the #fetch
method to using an explicit if/else
conditional.
#<<original>>
module Quarto
class Orgmode < Plugin
# ...
def orgmode_lisp_dir
if ENV.key?("QUARTO_ORGMODE_VENDOR_LISP_DIR")
ENV["QUARTO_ORGMODE_VENDOR_LISP_DIR"]
else
"#{vendor_dir}/lisp"
end
end
# ...
def vendor_dir
if ENV.key?("QUARTO_ORGMODE_VENDOR_LISP_DIR")
Pathname(ENV["QUARTO_ORGMODE_VENDOR_LISP_DIR"]).parent.to_s
else
(Pathname(main.vendor_dir) + "org-#{version}").to_s
end
end
# ...
end
end
#<<all_tests>>
This version has identical semantics, but it's now a lot easier to see the similarities in structure between our two methods. This is a phenomena we've talked about in prior episodes: sometimes we need to "de-factor" code in order to better reveal code smells that we can then refactor away.
What's the code smell in this case? It's one we've seen before. We might say it is the object-oriented code smell to rule them all: repeating the same conditional code over and over again.
At this point we are on familiar ground. What do we do when the same conditional keeps being repeated? We come up with two different objects to represent the two branches of the conditional, and then ensure that the switch is only performed in one place.
But wait a second… coming up with two new objects, just for this pair of methods? Doesn't that seem a bit… heavyweight?
Don't worry. I didn't say we needed new classes. Just two different objects, to hold two different sets of answers. For this type of "data-holder", I think a simple Hash will do nicely.
We write a new method, called dirs
.
Inside, we write our switch on the presence or absence of the environment variable.
If the variable is present, we assign it to a convenience variable.
Then we construct and return a hash, with values for the plugin vendor directory, and for the lisp directory, both derived from the user-configuration variable.
If it is absent, we derive the plugin specific vendor directory from the global vendor directory.
Then we construct and return a different hash. This one has the same two keys, but derives the values from the main vendor directory value instead.
#<<original>>
module Quarto
class Orgmode < Plugin
# ...
def dirs
if ENV.key?("QUARTO_ORGMODE_VENDOR_LISP_DIR")
lisp_dir = ENV["QUARTO_ORGMODE_VENDOR_LISP_DIR"]
{
vendor: Pathname(lisp_dir).parent.to_s,
lisp: lisp_dir
}
else
vendor_dir = (Pathname(main.vendor_dir) + "org-#{version}").to_s
{
vendor: vendor_dir,
lisp: "#{Pathname(main.vendor_dir)}/lisp"
}
end
end
# ...
end
end
#<<all_tests>>
With this done, we rewrite the orgmode_lisp_dir
method to call dirs
and return the :lisp
key.
And we rewrite the vendor_dir
to return the :vendor
key.
Remember how back in episode #380 we talked about the guideline of making sure that methods higher up in a class have dependencies on the methods lower down, and not vice-versa? Well, that's what we've now accomplished.
Instead of pointing at each other, both the orgmode_lisp_dir
and the vendor_dir
methods now depend downwards, to the dirs
method.
#<<original>>
module Quarto
class Orgmode < Plugin
# ...
def orgmode_lisp_dir
dirs[:lisp]
end
# ...
def vendor_dir
dirs[:vendor]
end
def dirs
if ENV.key?("QUARTO_ORGMODE_VENDOR_LISP_DIR")
lisp_dir = ENV["QUARTO_ORGMODE_VENDOR_LISP_DIR"]
{
vendor: Pathname(lisp_dir).parent.to_s,
lisp: lisp_dir
}
else
vendor_dir = (Pathname(main.vendor_dir) + "org-#{version}").to_s
{
vendor: vendor_dir,
lisp: "#{Pathname(main.vendor_dir)}/lisp"
}
end
end
# ...
end
end
#<<all_tests>>
The circle of dependencies has now been broken. And the truth about our application's intended behavior is now more apparent in the code. The truth, that is, that setting one environment variable changes how the location of more than one directory will be determined.
I think that's plenty for today. Happy hacking!
Responses