Virtual Proxy

Video transcript & code

A few episodes ago we discussed using "ghost objects" for lazy loading. Today I want to talk about another form of lazy loading: virtual proxy objects. Once again, this comes from Martin Fowler's book Patterns of Enterprise Application Architecture.

Let's say we have a class representing menu items for a cafe. Each menu item has a recipe, so we also have a class to represent recipes.

MenuItem = Struct.new(:name, :price, :recipe)
Recipe   = Struct.new(:ingredients, :directions)

We've separated these model objects from the code needed to load and store them. That responsibility falls to MenuItemMapper and RecipeMapper classes. We'll look at these mappers in more detail momentarily.

class MenuItemMapper
  DATA = {
    1 => { name: "Phatty Dank", price: "$8.75", recipe_id: 1 }
  }

  def recipe_mapper
    RecipeMapper.new
  end

  def find(id)
    data = DATA[id]
    MenuItem.new(
      data[:name],
      data[:price],
      recipe_mapper.find(data[:recipe_id]))
  end
end

class RecipeMapper
  DATA = {
    1 => {
      ingredients: [
        "Tortilla",
        "Black Bean Hummus",
        "Apple Cabbage Slaw",
        "Greens",
        "Tomato",
        "Cucumber",
        "Onion",
        "Cheese"
      ],
      directions: "Assemble into wrap. Add Siracha to taste."
    }
  }

  def find(id)
    data = DATA[id]
    Recipe.new(data[:ingredients], data[:directions])
  end
end

To look up a menu item, we create a MenuItemMapper and then send it the #find message with an ID.

item = MenuItemMapper.new.find(1)

A menu item has a name and a price.

item.name                       # => "Phatty Dank"
item.price                      # => "$8.75"

From time to time, a customer will ask what exactly is in a menu item. To answer them, we find the recipe and look at its list of ingredients.

item.recipe.ingredients
# => ["Tortilla",
#     "Black Bean Hummus",
#     "Apple Cabbage Slaw",
#     "Greens",
#     "Tomato",
#     "Cucumber",
#     "Onion",
#     "Cheese"]

By the way, this is an actual sandwich at my local coffee shop. And it's delicious, especially with some Siracha sauce.

In order to get a menu item's ingredients, the recipe has to be looked up in a separate recipes table. Let's take a look at how this is currently accomplished.

In the MenuItemMapper, the #find method first looks up menu item data for the given ID. For the sake of keeping this example simple, we're simulating the lookup with a simple in-memory Hash. In the real world this would be querying a database, or perhaps even a remote API. It creates a new MenuItem object, and fills in the name and price fields with data from the table.

For the recipe field, it needs to talk to another mapper object. It uses a recipe_mapper to find the associated recipe using a :recipe_id column.

def find(id)
  data = DATA[id]
  MenuItem.new(
    data[:name],
    data[:price],
    recipe_mapper.find(data[:recipe_id]))
end

Not every customer asks about ingredients. But as we can see, we're filling in the recipe field every time we fetch a menu item. This is all fine and good for our little in-memory example here. But in the real world this means a second database query or HTTP request whether we need it or not. This could get inefficient fast, especially when we're listing our entire menu.
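To make that cost concrete, here is a rough sketch that counts lookups while listing a menu. The CountingRecipeMapper class, the Item struct, and the extra rows are made up for illustration; they stand in for a real recipes table and are not part of the episode's code.

```ruby
# Hypothetical stand-in for RecipeMapper that counts how often it is asked
# for a recipe. Each call represents one database query or HTTP request.
class CountingRecipeMapper
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def find(id)
    @lookups += 1
    "recipe-#{id}"
  end
end

Item = Struct.new(:name, :recipe)

recipe_mapper = CountingRecipeMapper.new
rows = [
  { name: "Phatty Dank",    recipe_id: 1 },
  { name: "Grilled Cheese", recipe_id: 2 },
  { name: "House Salad",    recipe_id: 3 }
]

# Eager loading: every item fetches its recipe as it is built.
menu  = rows.map { |row| Item.new(row[:name], recipe_mapper.find(row[:recipe_id])) }
names = menu.map(&:name)

names                  # => ["Phatty Dank", "Grilled Cheese", "House Salad"]
recipe_mapper.lookups  # => 3
```

We only wanted the names, but we paid for one recipe lookup per item.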

What we need is a way to lazily load the associated recipe only when it is needed. If you watched episode 180, you know that we already explored one technique for lazy loading. But in that case, we were filling in extra fields of a single model. In this case, we need to lazily fetch an associated model.

To accomplish this we're going to use something called a "virtual proxy". We create a new class, called VirtualProxy, as a subclass of BasicObject. The constructor for this class takes a block, which is converted into a proc named loader. It saves this loader into an instance variable. It also sets another instance variable called @object to nil.

We then define method_missing. Ruby will call method_missing if this object receives any message it doesn't understand. Since we inherited from BasicObject, pretty much any message we send to the object will result in this method_missing catch-all being invoked.

Inside the method, we first check to see if the @object instance variable has a value. If not, we load it by invoking the @loader proc.

Then we take the method name, arguments, and block (if any) that were originally sent, and forward them to the now-loaded @object instead.

class VirtualProxy < BasicObject
  def initialize(&loader)
    @loader = loader
    @object = nil
  end

  def method_missing(name, *args, &block)
    @object ||= @loader.call
    @object.public_send(name, *args, &block)
  end
end

Let's play around with this a little bit to get a feel for how it works. We create a MenuItem. Then we create a VirtualProxy. In the block we pass to new, we first output a message. Then we return the sandwich menu item we just created.

When we send the virtual proxy the #name message, we get the name of the sandwich. We can also see the output signaling that the loader block was called. When we send it the #price message, we get the sandwich price.

Not only that, when we send it the class message, it tells us it is a MenuItem! This is because BasicObject is so basic that it doesn't even define the method named class. So that message is forwarded along to the proxied object just like every other message. This is why we used BasicObject as a base for our VirtualProxy; since BasicObject has so few methods defined, there are fewer opportunities for conflict between methods defined on the proxy object and methods defined on the "target" of the proxy.
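If you want to see for yourself how bare BasicObject is, a quick check like the following should do it. The variable names are just for this sketch.

```ruby
basic  = BasicObject.instance_methods
object = Object.instance_methods

no_class    = !basic.include?(:class)     # => true
no_inspect  = !basic.include?(:inspect)   # => true
has_equals  = basic.include?(:==)         # => true
much_leaner = basic.size < object.size    # => true
```

Note that the handful of methods BasicObject does define, such as #== and #!, are answered by the proxy itself rather than being forwarded to the target.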

We can also see that no matter how many messages we send, the "Loading!" message only appears once. That's because once the virtual proxy loads the target object it saves a reference to it in the @object instance variable.

require "./models"
require "./virtual_proxy"

sandwich = MenuItem.new("Grilled Cheese", "$5.75")
vp = VirtualProxy.new {
  puts "Loading!"
  sandwich 
}
vp.name                         # => "Grilled Cheese"
vp.price                        # => "$5.75"
vp.class                        # => MenuItem
# >> Loading!
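One wrinkle worth knowing about: because the proxy caches with ||=, a loader that returns nil or false is never treated as "loaded" and will run again on every message. The sketch below repeats the VirtualProxy definition from above so it can run on its own; the calls counter and the nil-returning loader are assumptions made up to show the effect.

```ruby
class VirtualProxy < BasicObject
  def initialize(&loader)
    @loader = loader
    @object = nil
  end

  def method_missing(name, *args, &block)
    @object ||= @loader.call
    @object.public_send(name, *args, &block)
  end
end

calls = 0
missing = VirtualProxy.new {
  calls += 1
  nil                # pretend the lookup found no matching row
}

missing.to_a         # => []
missing.to_a         # => []
calls                # => 2
```

If that matters for your data, one option is to track a separate "loaded" flag instead of relying on the truthiness of @object.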

This forwarding behavior is kind of magical. It's also a little on the surprising side. Because objects that appear to be one type of object but are really another can be confusing, I typically like to add some extra method definitions to ease debugging.

First, we can add an #inspect method that makes it clear that this is a proxy object, not the target object itself. Second, we can add an accessor method to get at the target object directly. The double underscores in the name avoid potential conflicts with methods on the target. Finally, we can move the object loading code into its own method so that we can force the target to be loaded if needed.

class VirtualProxy < BasicObject
  def initialize(&loader)
    @loader = loader
    @object = nil
  end

  def method_missing(name, *args, &block)
    __load__
    @object.public_send(name, *args, &block)
  end

  def inspect
    "VirtualProxy(#{@object ? @object.inspect : ''})"
  end

  def __object__
    @object
  end

  def __load__
    @object ||= @loader.call
  end
end

Let's try some of these out. Before loading, we can inspect the virtual proxy and see that it is empty. Then we can explicitly load it. After the load, we can inspect it and see that it now has a target object loaded.

require "./models"
require "./virtual_proxy2"

sandwich = MenuItem.new("Grilled Cheese", "$5.75")
vp = VirtualProxy.new { 
  sandwich 
}

vp.inspect                      # => "VirtualProxy()"
vp.__object__                   # => nil
vp.__load__
vp.inspect                      # => "VirtualProxy(#<struct MenuItem name=\"Grilled Cheese\", price=\"$5.75\", recipe=nil>)"
vp.__object__                   # => #<struct MenuItem name="Grilled Cheese", price="$5.75", recipe=nil>

Now that we have a VirtualProxy class, let's apply it to our lazy loading problem.

We go back to the code that loads up MenuItem objects. Instead of eagerly finding a Recipe and sticking it in the item's recipe slot, we use a VirtualProxy to stand in for the recipe. We pass the VirtualProxy a block which will load the real Recipe if and when it is needed.

class MenuItemMapper
  DATA = {
    1 => { name: "Phatty Dank", price: "$8.75", recipe_id: 1 }
  }

  def recipe_mapper
    RecipeMapper.new
  end

  def find(id)
    data = DATA[id]
    MenuItem.new(
      data[:name],
      data[:price],
      VirtualProxy.new { recipe_mapper.find(data[:recipe_id]) })
  end
end

Now when we look at a newly-loaded item's recipe, we can see that it is an unloaded VirtualProxy. But when we ask for the recipe's ingredients, the real recipe is loaded up behind the scenes and we see the expected list.

require "./models"
require "./virtual_proxy2"
require "./menu_item_mapper2"
require "./recipe_mapper"

item = MenuItemMapper.new.find(1)
item.recipe.inspect             # => "VirtualProxy()"
item.recipe.ingredients
# => ["Tortilla",
#     "Black Bean Hummus",
#     "Apple Cabbage Slaw",
#     "Greens",
#     "Tomato",
#     "Cucumber",
#     "Onion",
#     "Cheese"]
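To check that the proxy really does defer the work, here is a rough sketch that counts recipe lookups while listing a menu. The CountingRecipeMapper class, the Item struct, and the extra rows are hypothetical stand-ins invented for this example, and the VirtualProxy is the simple version from earlier, repeated so the sketch runs on its own.

```ruby
class VirtualProxy < BasicObject
  def initialize(&loader)
    @loader = loader
    @object = nil
  end

  def method_missing(name, *args, &block)
    @object ||= @loader.call
    @object.public_send(name, *args, &block)
  end
end

# Hypothetical mapper that counts lookups; each call stands for one query.
class CountingRecipeMapper
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def find(id)
    @lookups += 1
    "recipe-#{id}"
  end
end

Item = Struct.new(:name, :recipe)

recipe_mapper = CountingRecipeMapper.new
rows = [
  { name: "Phatty Dank",    recipe_id: 1 },
  { name: "Grilled Cheese", recipe_id: 2 },
  { name: "House Salad",    recipe_id: 3 }
]

# Lazy loading: each item gets a proxy instead of a recipe.
menu = rows.map { |row|
  Item.new(row[:name], VirtualProxy.new { recipe_mapper.find(row[:recipe_id]) })
}

menu.map(&:name)
after_listing = recipe_mapper.lookups   # => 0

first_recipe = menu.first.recipe.upcase # => "RECIPE-1"
after_asking = recipe_mapper.lookups    # => 1
```

Listing the whole menu costs no recipe lookups at all, and asking about one item costs exactly one.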

There is a variation on this pattern which is worth considering. Forwarding methods with method_missing is not the most performant technique in the world. And as we noted before, dealing with proxy objects can have some surprising effects. For instance, we might have some code which uses case matching to decide what to do with a recipe object. However, the virtual proxy does not case-match as a Recipe even though it claims to be one.

require "./models"
require "./virtual_proxy2"
require "./menu_item_mapper2"
require "./recipe_mapper"

item = MenuItemMapper.new.find(1)
item.recipe.ingredients
# => ["Tortilla",
#     "Black Bean Hummus",
#     "Apple Cabbage Slaw",
#     "Greens",
#     "Tomato",
#     "Cucumber",
#     "Onion",
#     "Cheese"]
item.recipe.class               # => Recipe
Recipe === item.recipe          # => false
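Here is a small sketch of how that mismatch might bite in practice. The describe lambda is a made-up example of code that branches on type, and the VirtualProxy is the simple version from earlier, repeated so this runs on its own. The reason for the difference is that #is_a? is an ordinary message sent to the proxy, so it gets forwarded, whereas Recipe === proxy is a message sent to the Recipe class, which checks the proxy's real class without ever consulting the proxy.

```ruby
Recipe = Struct.new(:ingredients, :directions)

class VirtualProxy < BasicObject
  def initialize(&loader)
    @loader = loader
    @object = nil
  end

  def method_missing(name, *args, &block)
    @object ||= @loader.call
    @object.public_send(name, *args, &block)
  end
end

recipe = VirtualProxy.new { Recipe.new(["Tortilla"], "Assemble into wrap.") }

# Hypothetical code that uses case matching to decide what it was given.
describe = ->(obj) {
  case obj
  when Recipe then "a recipe"
  else "something else"
  end
}

forwarded  = recipe.is_a?(Recipe)   # => true
case_match = Recipe === recipe      # => false
described  = describe.call(recipe)  # => "something else"
```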

In the mapper, we can rearrange things to initialize the item in two steps. First we instantiate it with just the innate MenuItem attributes. Then we set the recipe attribute. We still supply a VirtualProxy, but this time the block passed to the proxy doesn't just load and return the Recipe association. It also re-sets the item's recipe attribute to be the newly loaded target object.

What this means is that a message will only be forwarded to a given target object once. After the first time, the VirtualProxy will have been replaced with the actual object, and there is no more need to forward.

class MenuItemMapper
  DATA = {
    1 => { name: "Phatty Dank", price: "$8.75", recipe_id: 1 }
  }

  def recipe_mapper
    RecipeMapper.new
  end

  def find(id)
    data = DATA[id]
    item = MenuItem.new(
      data[:name],
      data[:price])
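    item.recipe = VirtualProxy.new {
      # On first use, swap the proxy out for the real Recipe. The assignment
      # also returns the Recipe, which becomes the proxy's target.
      item.recipe = recipe_mapper.find(data[:recipe_id])
    }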
    item
  end
end

Let's take a look at this in action. We ask an item for its recipe. At first, we see a virtual proxy. Then we send a message, causing the recipe to be loaded. When we next inspect the recipe, we can see that it is not wrapped in a VirtualProxy; it is a plain old Recipe object. When we try to case-match on this object, we can see that it really is a true Recipe. The VirtualProxy served its purpose, delaying the load of the recipe until it was actually needed. But once it carried out that role, it was discarded and replaced with the real target object.

require "./models"
require "./virtual_proxy2"
require "./menu_item_mapper3"
require "./recipe_mapper"

item = MenuItemMapper.new.find(1)
item.recipe.inspect             # => "VirtualProxy()"
item.recipe.ingredients
# => ["Tortilla",
#     "Black Bean Hummus",
#     "Apple Cabbage Slaw",
#     "Greens",
#     "Tomato",
#     "Cucumber",
#     "Onion",
#     "Cheese"]
item.recipe.inspect             # => "#<struct Recipe ingredients=[\"Tortilla\", \"Black Bean Hummus\", \"Apple Cabbage Slaw\", \"Greens\", \"Tomato\", \"Cucumber\", \"Onion\", \"Cheese\"], directions=\"Assemble into wrap. Add Siracha to taste.\">"
Recipe === item.recipe          # => true
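One thing to keep in mind with this variation: only the item's recipe attribute gets swapped. Any reference to the proxy that was grabbed before the load is still a proxy and keeps forwarding. The sketch below builds the item inline rather than through the mapper, and the held variable and the shortened ingredient list are made up for illustration; the VirtualProxy is the simple version from earlier.

```ruby
MenuItem = Struct.new(:name, :price, :recipe)
Recipe   = Struct.new(:ingredients, :directions)

class VirtualProxy < BasicObject
  def initialize(&loader)
    @loader = loader
    @object = nil
  end

  def method_missing(name, *args, &block)
    @object ||= @loader.call
    @object.public_send(name, *args, &block)
  end
end

item = MenuItem.new("Phatty Dank", "$8.75")
item.recipe = VirtualProxy.new {
  item.recipe = Recipe.new(["Tortilla", "Cheese"], "Assemble into wrap.")
}

held = item.recipe                  # grab the proxy before anything loads it
ingredients = held.ingredients      # => ["Tortilla", "Cheese"]

item_is_real = Recipe === item.recipe   # => true
held_is_real = Recipe === held          # => false
```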

So now we have a way of injecting lazily-loaded associations into our models, without the models themselves being aware of this trickery. We've preserved the separation of mapper from domain model, while still making allowances for pragmatic concerns about excessive load time and memory use.

And that's all for today. Happy hacking!
