Ghost Load
Back in episode 164, I showed you how I used the Mapper pattern to separate data scraped from the DPD website from my Episode domain model. I have a ContentPostGateway which is only concerned with getting data from DPD; I have an Episode class which is only concerned with representing one episode of this show; and I have an EpisodeMapper which is only worried about mapping the one to the other.
But things never stay this simple. When I fetch a list of episodes, in the background the data comes from a gateway method called #content_post_list which gets a summary of all of the episodes. There's a lot of data which is missing from this request, though. For instance, I don't have each episode's full description. Right now, the episodes that are returned by the mapper are incomplete.
module DPD
  module ContentPostGateway
    # ...
    def content_post_list
      # fetch summary of all posts...
    end
    # ...
  end
end
require "./episode"
require "./episode_mapper"
mapper = RubyTapas::EpisodeMapper.new
ep = mapper.all.last
ep
# => #<RubyTapas::Episode:0x00000001a51198
# @id=456,
# @name="Mapper",
# @number=164>
ep.description # => nil
There's another method on the gateway, #find_content_post_by_id, which returns a complete set of data for a given episode. But the price is an extra HTTP request for each episode retrieved. When building a complete list of episodes, I can't afford to use this method to get complete episode data for every single one.
module DPD
  module ContentPostGateway
    # ...
    def find_content_post_by_id(id)
      # fetch complete info for post...
    end
    # ...
  end
end
Which means I'm sometimes going to be working with incomplete Episode objects. Ideally, what I'd like to happen is for episodes to start out incomplete, but magically "fill themselves in" if I ask for an episode attribute which hasn't yet been loaded. But remember, these Episode objects have no knowledge of the EpisodeMapper, let alone of the ContentPostGateway!
The solution I turn to is a family of patterns collectively known as "Lazy Load". Martin Fowler lists four different types of lazy loading in his book Patterns of Enterprise Application Architecture. I'm going to present just one of those patterns today: the "ghost object" style of lazy loading. Unlike other lazy loading patterns which use some kind of proxy object, "ghost" objects are real domain model objects in a partial state.
To my Episode model, I add a load_state attribute and a data_source attribute. I make load_state default to a state named :ghost.
I pick one of the attributes which may need to be lazily loaded: the description attribute. I override the getter method with one that does two things: first, it calls a method called #load. Then it returns the value of the @description instance variable.
Next I define the #load method. It returns early if the object is in a :loaded state already. Otherwise, it sends the #load message to the data_source, with self as the argument.
module RubyTapas
  class Episode
    attr_accessor :video,
                  :id,
                  :number,
                  :name,
                  :description,
                  :synopsis,
                  :video_url,
                  :publish_time,
                  :load_state,
                  :data_source

    def initialize(attributes={})
      attributes.each do |key, value|
        public_send("#{key}=", value)
      end
      @load_state = :ghost
    end

    def to_s
      inspect
    end

    def ==(other)
      other.is_a?(Episode) && other.id == id
    end

    def published?(time_now=Time.now)
      publish_time <= time_now
    end

    def load
      return if load_state == :loaded
      data_source.load(self)
    end

    def description
      load
      @description
    end
  end
end
Those are all the changes I make to the model class for now. Next up, I need to update the mapper.
In order to keep the focus on lazy loading, the mapper you see here is an ultra-simplified fake which just returns fixed values.
module RubyTapas
  class EpisodeMapper
    def all
      [
        Episode.new(
          id: 123,
          name: "YAML::Store",
          number: 163),
        Episode.new(
          id: 456,
          name: "Mapper",
          number: 164),
      ]
    end
  end
end
The first thing I do is change the method which returns a summary list of all episodes. In it, I set the data_source attribute on each returned Episode. In effect, this is the mapper's way of saying "if you ever need more data, here's my number".
Next I add the #load method. First, it extracts the id attribute from the Episode. If this were a real implementation, it would then use the ContentPostGateway to fetch data using that ID and then transform it into domain terms. For this example I'm going to pretend I've already taken care of that, and instead move straight on to filling in the model.
Before I do that, though, I set the load_state attribute to :loading. This isn't strictly necessary right now. But it's part of the pattern, and it may become important down the road. That's because once we start loading networks of associated objects, we may need to flag objects that are already being loaded in order to avoid infinite recursion.
Next I load up the episode object with detailed field data. Again, I'm just using hardcoded values here to keep this demonstration simple.
When I'm done fully loading the episode, I set the load_state to :loaded, and return.
Notice here that it is the mapper which is responsible for telling an object when it is fully loaded - it is not the object's responsibility.
module RubyTapas
  class EpisodeMapper
    def all
      [
        Episode.new(
          id: 123,
          name: "YAML::Store",
          number: 163,
          data_source: self),
        Episode.new(
          id: 456,
          name: "Mapper",
          number: 164,
          data_source: self)
      ]
    end

    def load(episode)
      id = episode.id
      # ...retrieve episode data based on ID...
      episode.load_state = :loading
      episode.description = "Today we explore a pattern for bridging "\
                            "the gap between different domain models."
      episode.synopsis = "Bridging the gap between domain models"
      episode.publish_time = Time.new(2013, 12, 30, 9, 11)
      episode.load_state = :loaded
    end
  end
end
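To make the purpose of the :loading state more concrete, here is a minimal sketch of how it could guard against re-entrant loads. It is not code from the episode: GuardedEpisode and ChattyMapper are hypothetical names, and the guard is deliberately stricter than the one shown earlier, returning unless the object is still a :ghost. With the `return if load_state == :loaded` guard, a mapper that reads a lazy attribute in the middle of loading would re-enter #load and recurse without end.

```ruby
# Hypothetical sketch: GuardedEpisode and ChattyMapper are illustrative
# names, not classes from the episode.
class GuardedEpisode
  attr_accessor :id, :load_state, :data_source
  attr_writer :description

  def initialize(id:, data_source:)
    @id          = id
    @data_source = data_source
    @load_state  = :ghost
  end

  # Only a ghost triggers a load. An object that is :loading or :loaded
  # returns immediately, so a re-entrant call cannot recurse.
  def load
    return unless load_state == :ghost
    data_source.load(self)
  end

  def description
    load
    @description
  end
end

class ChattyMapper
  attr_reader :load_calls

  def initialize
    @load_calls = 0
  end

  def load(episode)
    @load_calls += 1
    episode.load_state = :loading
    # Reading a lazy attribute mid-load re-enters GuardedEpisode#load.
    # Because the state is now :loading, that call is a no-op.
    previous = episode.description
    episode.description = previous || "Filled in by the mapper"
    episode.load_state = :loaded
  end
end

mapper = ChattyMapper.new
ep = GuardedEpisode.new(id: 456, data_source: mapper)
ep.description    # => "Filled in by the mapper"
ep.load_state     # => :loaded
mapper.load_calls # => 1
```

The mapper is still the only party that changes load_state; the model just consults it before asking for data.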
Now, when I grab an episode object from the list that the mapper returns, I can see that it is in the :ghost state. Then I can access the description attribute, and get text back. When I check the load state, the object is now :loaded. In the background, fetching the description attribute triggered the object to go from a ghost to a fully-loaded model.
require "./ghost_episode"
require "./episode_mapper"
mapper = RubyTapas::EpisodeMapper.new
ep = mapper.all.last
ep.load_state # => :ghost
ep.description
# => "Today we explore a pattern for bridging the gap between different domain models."
ep.load_state # => :loaded
So far, I've only implemented ghost-loading for the description attribute. But it would be a simple matter to extend that to other fields, like the synopsis and the publish_time.
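As a rough illustration of that extension, the sketch below hand-writes the same load-then-return getter for synopsis and publish_time. LazyEpisode and StubMapper are hypothetical stand-ins for the Episode and EpisodeMapper classes, trimmed so the example runs on its own; the field values are the hardcoded ones used above.

```ruby
# Hypothetical sketch: LazyEpisode and StubMapper are illustrative names.
class LazyEpisode
  # Attributes present in the summary list get plain accessors.
  attr_accessor :id, :name, :load_state, :data_source
  # Lazily loaded attributes get a writer only; their readers are below.
  attr_writer :description, :synopsis, :publish_time

  def initialize(attributes = {})
    attributes.each { |key, value| public_send("#{key}=", value) }
    @load_state = :ghost
  end

  def load
    return if load_state == :loaded
    data_source.load(self)
  end

  # Each lazy reader triggers a load first, then returns its ivar.
  def description
    load
    @description
  end

  def synopsis
    load
    @synopsis
  end

  def publish_time
    load
    @publish_time
  end
end

class StubMapper
  def load(episode)
    episode.load_state   = :loading
    episode.description  = "Today we explore a pattern for bridging " \
                           "the gap between different domain models."
    episode.synopsis     = "Bridging the gap between domain models"
    episode.publish_time = Time.new(2013, 12, 30, 9, 11)
    episode.load_state   = :loaded
  end
end

ep = LazyEpisode.new(id: 456, name: "Mapper", data_source: StubMapper.new)
ep.name       # => "Mapper"
ep.load_state # => :ghost
ep.synopsis   # => "Bridging the gap between domain models"
ep.load_state # => :loaded
```

Reading name leaves the object a ghost because that attribute arrived with the summary; only the lazy readers trigger a load.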
When I started out, my Episode objects were totally isolated from the messy world of loading and transforming data from external sources. With this new design, I've compromised a little bit. But what I like about this pattern is that it's a compromise with very clear limits. It's not a slippery slope.
Episode objects are now aware that they might start out without all of their data. But they still have no idea how to retrieve or process that data. All they know is that there is some object out there to which they can say: "load me up, please!". I could add a dozen more lazy-loaded attributes to the Episode class, and it still wouldn't need to know anything more than this about how to fill those attributes in. The responsibility of mapping from the ContentPostGateway to Episode data is still firmly in the EpisodeMapper's court.
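One way to see how narrow that contract is: any object that responds to #load can serve as the data source. The sketch below is an assumption-laden illustration rather than code from the episode. MiniEpisode is a cut-down stand-in for Episode, and CannedSource is a hypothetical substitute for the mapper that never touches a gateway, which could be handy in a test.

```ruby
# Hypothetical sketch: MiniEpisode and CannedSource are illustrative names.
class MiniEpisode
  attr_accessor :id, :load_state, :data_source
  attr_writer :description

  def initialize(id:, data_source:)
    @id          = id
    @data_source = data_source
    @load_state  = :ghost
  end

  # The model's entire knowledge of loading: send #load to the data source.
  def load
    return if load_state == :loaded
    data_source.load(self)
  end

  def description
    load
    @description
  end
end

# Stands in for the mapper; it fills in canned attribute values and
# marks the object as loaded, with no gateway or network involved.
class CannedSource
  def initialize(attributes)
    @attributes = attributes
  end

  def load(episode)
    @attributes.each { |key, value| episode.public_send("#{key}=", value) }
    episode.load_state = :loaded
  end
end

source = CannedSource.new(description: "Canned text")
ep = MiniEpisode.new(id: 123, data_source: source)
ep.description # => "Canned text"
ep.load_state  # => :loaded
```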
As it stands now, this code only triggers lazy loading from a single attribute of a single model class. In a future episode I'll talk about generalizing this approach to many attributes and arbitrary model classes. But this is enough for now. Happy hacking!