Deep Dup
Copying objects can be surprisingly prone to error in Ruby. With the language only providing shallow copying semantics out of the box, copied objects can often share more state than you intended them to. In this episode, you’ll learn a few techniques for reliably making deep copies of complex object trees.
Video transcript & code
In a recent episode we found a way to easily freeze a whole tree of objects, using the ice_nine gem. This helped us ensure that our master shopping list could only be used as a template, and could not be accidentally modified in place.
We discovered a problem however, when we tried to create a duplicate object and then add an entry to the items array.
Because Ruby's dup method only performs a shallow copy, the items property still referred to the original, frozen master list, and the append failed.
require "ice_nine"
ShoppingList = Struct.new(:name, :items)
MASTER_LIST = ShoppingList.new("Master List", [
"Bread",
"Milk",
"Beer"])
IceNine.deep_freeze(MASTER_LIST)
today_list = MASTER_LIST.dup
today_list.items << "Fireworks" # ~> RuntimeError: can't modify frozen Array
# ~> RuntimeError
# ~> can't modify frozen Array
# ~>
# ~> xmptmp-in909276Q.rb:12:in `<main>'
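To see what "shallow" means here, it can help to check object identity directly. The following is a minimal sketch using only core Ruby, with nothing frozen, so that the sharing shows up as a leaked change rather than as an error. The variable names are just for illustration.

```ruby
ShoppingList = Struct.new(:name, :items)

master = ShoppingList.new("Master List", ["Bread", "Milk"])
copy   = master.dup

# dup creates a new struct, but its members still point at the same objects.
copy.equal?(master)              # false: the struct itself is a new object
copy.items.equal?(master.items)  # true: the nested array is shared

# Because the array is shared, a change made through the copy
# also shows up in the original.
copy.items << "Fireworks"
master.items # => ["Bread", "Milk", "Fireworks"]
```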
In that episode, we addressed this by adding an initialize_copy method to our shopping list class. As we saw in Episode #486, Ruby invokes this method anytime an object is duplicated or cloned.
Inside the method, we explicitly made duplicates of each of the shopping list attributes.
With this change in place, we were then able to modify the nested items array of the shopping list copy.
require "ice_nine"
ShoppingList = Struct.new(:name, :items) do
def initialize_copy(original)
self.name = original.name.dup
self.items = original.items.dup
end
end
MASTER_LIST = ShoppingList.new("Master List", [
"Bread",
"Milk",
"Beer"])
IceNine.deep_freeze(MASTER_LIST)
today_list = MASTER_LIST.dup
today_list.name = "July 4th Shopping"
today_list.items << "Fireworks"
today_list
# => #<struct ShoppingList
# name="July 4th Shopping",
# items=["Bread", "Milk", "Beer", "Fireworks"]>
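One thing worth noticing is that items.dup copies the array but not the strings inside it, so this initialize_copy only goes one level down. If we also wanted to modify the individual entries in place, we would have to duplicate those too. Here is a rough sketch of that idea. It freezes things by hand as a stand-in for IceNine.deep_freeze so that it runs without the gem, and it assumes every item responds to dup in a useful way.

```ruby
ShoppingList = Struct.new(:name, :items) do
  def initialize_copy(original)
    self.name  = original.name.dup
    # Duplicate the array and each entry in it, one level deeper than items.dup.
    self.items = original.items.map(&:dup)
  end
end

master = ShoppingList.new("Master List", ["Bread", "Milk", "Beer"])
# Stand-in for IceNine.deep_freeze: freeze the strings, the array, and the struct.
master.name.freeze
master.items.each(&:freeze)
master.items.freeze
master.freeze

today = master.dup
today.items << "Fireworks"     # the copied array is not frozen
today.items.first << " (rye)"  # and neither are the copied strings
today.items   # => ["Bread (rye)", "Milk", "Beer", "Fireworks"]
master.items  # => ["Bread", "Milk", "Beer"]
```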
This seems like a lot of effort to go to just to create a deep copy of an object. Surely there's an easier way.
Well, if you're working in a Rails application and ActiveSupport is available, you can try using the deep_dup method that it adds to objects.
However, in my experience this method is not always reliable. In fact, it doesn't work for this example. It seems that it fails to deeply duplicate the attributes of a Ruby Struct.
require "ice_nine"
require "active_support/core_ext/object"
ShoppingList = Struct.new(:name, :items)
MASTER_LIST = ShoppingList.new("Master List", [
"Bread",
"Milk",
"Beer"])
IceNine.deep_freeze(MASTER_LIST)
today_list = MASTER_LIST.deep_dup
today_list.name = "July 4 Shopping"
today_list.items << "Fireworks" # ~> RuntimeError: can't modify frozen Array
MASTER_LIST.items
# =>
# ~> RuntimeError
# ~> can't modify frozen Array
# ~>
# ~> xmptmp-in9092igX.rb:15:in `<main>'
There is another trick for creating deep copies of objects that doesn't require any extra libraries at all.
We can use Ruby's Marshal module to dump a serialized version of the master list. Normally, we'd write the resulting data to disk or to a database field. But in this case, we immediately reconstitute the serialized data using Marshal.load.
The net result of this dump-and-load procedure is a deeply copied object graph. We can see this when we execute the code, and compare the items in the copied list to the items in the master list.
require "ice_nine"
ShoppingList = Struct.new(:name, :items)
MASTER_LIST = ShoppingList.new("Master List", [
"Bread",
"Milk",
"Beer"])
IceNine.deep_freeze(MASTER_LIST)
today_list = Marshal.load(Marshal.dump(MASTER_LIST))
today_list.name = "July 4 Shopping"
today_list.items << "Fireworks"
today_list.items
# => ["Bread", "Milk", "Beer", "Fireworks"]
MASTER_LIST.items
# => ["Bread", "Milk", "Beer"]
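If we find ourselves doing this round trip in more than one place, we might wrap it in a small helper. The sketch below uses a made-up method name, deep_copy, and checks object identity to confirm that nothing is shared between the original and the copy.

```ruby
ShoppingList = Struct.new(:name, :items)

# Hypothetical helper name; it simply wraps the dump-and-load round trip.
def deep_copy(object)
  Marshal.load(Marshal.dump(object))
end

master = ShoppingList.new("Master List", ["Bread", "Milk", "Beer"])
copy   = deep_copy(master)

copy == master                               # true: same values
copy.items.equal?(master.items)              # false: a different array
copy.items.first.equal?(master.items.first)  # false: even the strings are new objects
```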
For many simple kinds of object, this technique works well, although you may see performance issues with it at scale, since every copy means serializing and then deserializing the whole object graph.
But there are some objects that Marshal is unable to dump and load.
As a somewhat contrived example, imagine that we were to add a lambda to the master shopping list.
When we execute the code this time, we get an error. That's because Ruby doesn't know how to serialize a Proc object, which is what the lambda syntax creates.
require "ice_nine"
ShoppingList = Struct.new(:name, :items)
MASTER_LIST = ShoppingList.new("Master List", [
"Bread",
"Milk",
"Beer",
->{["Pie", "Cake", "Cookies"].sample(1)}])
IceNine.deep_freeze(MASTER_LIST)
today_list = Marshal.load(Marshal.dump(MASTER_LIST)) # ~> TypeError: no _dump_data is defined for class Proc
today_list.name = "July 4 Shopping"
today_list.items << "Fireworks"
MASTER_LIST.items
# =>
# ~> TypeError
# ~> no _dump_data is defined for class Proc
# ~>
# ~> xmptmp-in27080eMR.rb:12:in `dump'
# ~> xmptmp-in27080eMR.rb:12:in `<main>'
For a more comprehensive deep duplication solution, we can turn to the duplicate gem, by Adam Luszi.
We use the duplicate method on the Duplicate module, passing in the object to be copied.
This method happily copies our shopping list, without choking on the Proc object. The resulting output shows that both master and duplicate objects include the same proc, but only the duplicate object includes our added item.
require "ice_nine"
require "duplicate"
ShoppingList = Struct.new(:name, :items)
MASTER_LIST = ShoppingList.new("Master List", [
"Bread",
"Milk",
"Beer",
->{["Pie", "Cake", "Cookies"].sample(1)}])
IceNine.deep_freeze(MASTER_LIST)
today_list = Duplicate.duplicate(MASTER_LIST)
today_list.name = "July 4 Shopping"
today_list.items << "Fireworks"
today_list.items
# => ["Bread",
# "Milk",
# "Beer",
# #<Proc:0x00000002590d38@xmptmp-in9092XRr.rb:10 (lambda)>,
# "Fireworks"]
MASTER_LIST.items
# => ["Bread",
# "Milk",
# "Beer",
# #<Proc:0x00000002592458@xmptmp-in9092XRr.rb:10 (lambda)>]
The duplicate gem includes smart handling for non-copyable objects in Ruby, and as a result it performs deep copies of pretty much any object we throw at it.
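To get a feel for what "handling non-copyable objects" might involve, here is a deliberately naive, hand-rolled sketch. It is not the duplicate gem's code, and the method name naive_deep_copy is made up for this illustration. It copies Strings, Arrays, Hashes, and Structs recursively, and passes everything else through untouched, so Procs end up shared between the original and the copy.

```ruby
ShoppingList = Struct.new(:name, :items)

# Hypothetical helper, not the duplicate gem's implementation.
def naive_deep_copy(object)
  case object
  when String then object.dup
  when Array  then object.map { |element| naive_deep_copy(element) }
  when Hash
    object.each_with_object({}) do |(key, value), copy|
      copy[naive_deep_copy(key)] = naive_deep_copy(value)
    end
  when Struct
    # Assumes a plain positional Struct (no keyword_init).
    object.class.new(*object.to_a.map { |value| naive_deep_copy(value) })
  else
    object # Procs, numbers, symbols, and so on are shared, not copied
  end
end

treat  = -> { ["Pie", "Cake", "Cookies"].sample }
master = ShoppingList.new("Master List", ["Bread", "Milk", treat])
copy   = naive_deep_copy(master)

copy.items << "Fireworks"
copy.items.size              # 4
master.items.size            # 3
copy.items[2].equal?(treat)  # true: the lambda is shared rather than copied
```

This sketch ignores plenty of cases that a real library has to think about, such as cyclic references, instance variables on arbitrary objects, and singleton methods. That's a good reason to reach for a tested gem rather than rolling our own.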
Now, I'll be honest: I only found out about this gem as a result of the research for this episode, so it's not one that I've been using for a long time. However, it's the best off-the-shelf solution for deep copying in Ruby that I've been able to locate.
Happy hacking!