ActiveRecord Optimizations
For anyone who has been working with Ruby and Rails for any length of time, today’s guest chef needs no introduction. Aaron Patterson is a longtime member of both the Ruby and Rails core teams, and a prolific contributor to both projects.
One of the areas that Aaron has specialized in over the years is finding ways to make Ruby on Rails go faster and use fewer resources. In today’s episode, you’re going to get a rare peek behind the scenes as Aaron uses his debugging and profiling skills to diagnose and fix some ActiveRecord code that uses an excessive amount of memory. In the process, you’ll get a window into how Aaron instruments code to understand its memory usage, and how he explores unfamiliar code to discover how it works. Enjoy!
Video transcript & code
Reducing Active Record Allocations
Why hello! I didn't see you come in. Please have a seat while I work on this thing. My name is Aaron Patterson a.k.a. tenderlove, and today I'm looking at performance improvements to Active Record. I love working on performance improvements and I'd like to work on one today with you!
First thing we need to do is write a benchmark. We're going to be investigating the number of objects required to read an attribute.
require 'active_record'
require 'securerandom'
ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"
ActiveRecord::Migration.verbose = false
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
t.string "name"
t.string "email"
t.timestamps null: false
end
end
class User < ActiveRecord::Base; end
20.times do
User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end
def count_allocations
x = GC.stat(:total_allocated_objects)
yield
GC.stat(:total_allocated_objects) - x
end
user = User.first
p count_allocations { user.id }
p count_allocations { user.id }
p count_allocations { user.id }
First, we'll connect to a SQLite database. If we specify the database as ":memory:", SQLite will create an in-memory database. I like using an in-memory database because the benchmark will always start with a clean slate and we don't have to factor IO time into our benchmark.
Next we disable migration output, create a users table, and define a User class. Then we insert 20 records into the database with some random data. In this case the number of records we're inserting doesn't matter so much, but we're going to use more than one record.
We have a small measurement function here called count_allocations. This function asks the garbage collector for a statistic called total_allocated_objects. total_allocated_objects is a number that increments every time an object is allocated. The count_allocations function simply saves the current number of allocations, yields to the block, then takes another measurement and returns the difference. This simple benchmark doesn't require any external gems, so it's a nice place to start.
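As a quick sanity check (not part of the original script), you can point the helper at a block whose allocations you already know; the counts in the comments are what I'd expect on a recent MRI, so treat them as approximate.
p count_allocations { Object.new }                # typically 1: one new object
p count_allocations { [Object.new, "a string"] }  # typically 3: array, object, and string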
Finally, we'll get the first user record and count the number of allocations the first 3 times we call id on the user object.
Let's run the script and see what the output is:
[aaron@TC activerecord (allocations-cast)]$ be ruby allocations.rb
6
0
0
[aaron@TC activerecord (allocations-cast)]$
The first call to .id allocated 6 objects, but the next two allocated nothing. From this output we can guess that the implementation of the id method does some sort of computation on the first call, but caches that computation such that subsequent calls are cheaper.
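A minimal sketch of that kind of per-object caching, just to illustrate the pattern (this is not the actual Active Record implementation):
class CachedRecord
  def initialize(raw_value)
    @raw_value = raw_value
  end

  def id
    # Do the work once and keep the result; later calls are a plain
    # instance-variable read and allocate nothing.
    @id ||= Integer(@raw_value)
  end
end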
Let's modify the benchmark a little to see if the pattern holds. We'll get the first 5 records, then count the number of allocations made when calling the id method.
require 'active_record'
require 'securerandom'
ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"
ActiveRecord::Migration.verbose = false
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
t.string "name"
t.string "email"
t.timestamps null: false
end
end
class User < ActiveRecord::Base; end
20.times do
User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end
def count_allocations
x = GC.stat(:total_allocated_objects)
yield
GC.stat(:total_allocated_objects) - x
end
User.limit(5).each_with_index do |user, i|
puts "TEST: #{i}"
p count_allocations { user.id }
p count_allocations { user.id }
p count_allocations { user.id }
end
Let's run this script and see if there is a pattern.
[aaron@TC activerecord (allocations-cast)]$ be ruby allocations.rb
TEST: 0
6
0
0
TEST: 1
2
0
0
TEST: 2
2
0
0
TEST: 3
2
0
0
TEST: 4
2
0
0
[aaron@TC activerecord (allocations-cast)]$
The first thing we see here is that the first call to id in each iteration allocates at least 2 objects, but subsequent calls all allocate 0 objects. So we can probably safely ignore the second call to any attribute method in our benchmark.
The second thing we can see here is that the very first iteration allocates more objects than subsequent iterations. Again this probably means that the first call to id is setting up some sort of cache that other calls to id don't need to do.
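Extending the earlier sketch, a class-level cache that only gets filled on the very first read in the whole process would explain why TEST: 0 costs more than the later iterations (again, purely illustrative):
class CachedRecord
  def self.primary_key
    # Filled in exactly once per class; only the very first id call pays for it.
    @primary_key ||= "id".dup
  end

  def initialize(attributes)
    @attributes = attributes
  end

  def id
    # Filled in once per instance; the first call on each record allocates,
    # later calls on the same record do not.
    @id ||= @attributes[self.class.primary_key]
  end
end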
Let's modify the benchmark to find where these objects are allocated, and see if we can reduce them.
require 'active_record'
require 'securerandom'
require 'allocation_tracer'
ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"
ActiveRecord::Migration.verbose = false
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
t.string "name"
t.string "email"
t.timestamps null: false
end
end
class User < ActiveRecord::Base; end

OBJECT_LIMIT = 20

OBJECT_LIMIT.times do
  User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end

ObjectSpace::AllocationTracer.setup(%i{path line type})

users = User.limit(OBJECT_LIMIT).to_a

res = ObjectSpace::AllocationTracer.trace {
  users.each do |u|
    u.id
  end
}

res.sort_by { |_, v| v.first }.each do |k, v|
  next unless v.first >= OBJECT_LIMIT
  puts "#{k[0]}:#{k[1]}, #{k[2]} #{v.inspect}"
end
The first change we'll make is to pull in the allocation_tracer gem. This gem will tell us where we're allocating objects, what type they are, and how many allocations were made.
Next we refactor the number of objects we'll allocate into a constant. This way we can change the number of records in our benchmark in case there is any noise we want to eliminate.
Next we configure Allocation Tracer to keep track of the filename (or path), the line number, and the type of object that has been allocated. Next we run the benchmark while Allocation Tracer is instrumenting the code.
The Allocation Tracer result object is a hash. The keys of the hash are arrays with three elements that correspond to the configuration options we passed to Allocation Tracer. In this case the key is an array with three elements, the file, line, and type. The value is an array that contains statistics about the allocation including number of allocations, size, age, and other data. Here, we only care about reducing the number of allocations, and the total number of allocations for that type is the first element of the value array.
First we sort the result by the number of allocations, then iterate over the results and print them out. We skip any allocations that are less than OBJECT_LIMIT because we know from our previous tests that the allocations we want to reduce have a one-to-one relationship with the number of Active Record objects we looked up. Any allocations less than the number of Active Record objects we looked up are probably not interesting. To test this, we can increase the OBJECT_LIMIT and see how the output changes, but we'll skip that for today.
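To make that concrete, here is the shape of the data the loop iterates over. The paths and counts below are made up for illustration; the only field we rely on is the first element of each value, the allocation count.
# Keys are [path, line, type]; values are per-site statistics, starting with the count.
res = {
  ["lib/example_a.rb", 9, :T_OBJECT] => [20, 0, 0, 0, 0, 0],
  ["lib/example_b.rb", 3, :T_STRING] => [3,  0, 0, 0, 0, 0]
}

res.sort_by { |_, stats| stats.first }.each do |(path, line, type), stats|
  next unless stats.first >= 20   # OBJECT_LIMIT in the real script
  puts "#{path}:#{line}, #{type} #{stats.inspect}"
end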
Let's run the benchmark and check the output.
[aaron@TC activerecord (allocations-cast)]$ be ruby -I~/git/allocation_tracer/lib allocations.rb
/Users/aaron/git/rails/activemodel/lib/active_model/attribute.rb:9, T_OBJECT [20, 0, 0, 0, 0, 0]
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set/builder.rb:113, T_STRING [20, 0, 0, 0, 0, 0]
[aaron@TC activerecord (allocations-cast)]$
From this output, we can see that we're allocating a T_OBJECT and a T_STRING for every first call of id on an Active Record object. T_OBJECT and T_STRING are object types that Ruby uses internally, but they roughly correspond to "a Ruby object" and "a String object" respectively. The id method is supposed to return an integer, so the string allocation is interesting.
Let's take a look at the string allocation location:
def assign_default_value(name)
type = additional_types.fetch(name, types[name])
value_present = true
value = values.fetch(name) { value_present = false }
if value_present
delegate_hash[name] = Attribute.from_database(name, value, type) # Here
elsif types.key?(name)
attr = default_attributes[name]
if attr
delegate_hash[name] = attr.dup
else
delegate_hash[name] = Attribute.uninitialized(name, type)
end
end
end
Complete Source Code (see line #107):
# frozen_string_literal: true
require "active_model/attribute"
module ActiveModel
class AttributeSet # :nodoc:
class Builder # :nodoc:
attr_reader :types, :default_attributes
def initialize(types, default_attributes = {})
@types = types
@default_attributes = default_attributes
end
def build_from_database(values = {}, additional_types = {})
attributes = LazyAttributeHash.new(types, values, additional_types, default_attributes)
AttributeSet.new(attributes)
end
end
end
class LazyAttributeHash # :nodoc:
delegate :transform_values, :each_key, :each_value, :fetch, :except, to: :materialize
def initialize(types, values, additional_types, default_attributes, delegate_hash = {})
@types = types
@values = values
@additional_types = additional_types
@materialized = false
@delegate_hash = delegate_hash
@default_attributes = default_attributes
end
def key?(key)
delegate_hash.key?(key) || values.key?(key) || types.key?(key)
end
def [](key)
delegate_hash[key] || assign_default_value(key)
end
def []=(key, value)
if frozen?
raise RuntimeError, "Can't modify frozen hash"
end
delegate_hash[key] = value
end
def deep_dup
dup.tap do |copy|
copy.instance_variable_set(:@delegate_hash, delegate_hash.transform_values(&:dup))
end
end
def initialize_dup(_)
@delegate_hash = Hash[delegate_hash]
super
end
def select
keys = types.keys | values.keys | delegate_hash.keys
keys.each_with_object({}) do |key, hash|
attribute = self[key]
if yield(key, attribute)
hash[key] = attribute
end
end
end
def ==(other)
if other.is_a?(LazyAttributeHash)
materialize == other.materialize
else
materialize == other
end
end
def marshal_dump
[@types, @values, @additional_types, @default_attributes, @delegate_hash]
end
def marshal_load(values)
if values.is_a?(Hash)
empty_hash = {}.freeze
initialize(empty_hash, empty_hash, empty_hash, empty_hash, values)
@materialized = true
else
initialize(*values)
end
end
protected
def materialize
unless @materialized
values.each_key { |key| self[key] }
types.each_key { |key| self[key] }
unless frozen?
@materialized = true
end
end
delegate_hash
end
private
attr_reader :types, :values, :additional_types, :delegate_hash, :default_attributes
def assign_default_value(name)
type = additional_types.fetch(name, types[name])
value_present = true
value = values.fetch(name) { value_present = false }
if value_present
delegate_hash[name] = Attribute.from_database(name, value, type)
elsif types.key?(name)
attr = default_attributes[name]
if attr
delegate_hash[name] = attr.dup
else
delegate_hash[name] = Attribute.uninitialized(name, type)
end
end
end
end
end
(The complete source code can also be found at GitHub.)
This line seems to allocate a Ruby string. There are two method calls on this line, so it's not totally clear where the allocation is happening. This benchmark is pretty fast to run, so let's just break the line up and re-run.
diff --git a/activemodel/lib/active_model/attribute_set/builder.rb b/activemodel/lib/active_model/attribute_set/builder.rb
index 2b1c2206ec..2fe33ccc15 100644
--- a/activemodel/lib/active_model/attribute_set/builder.rb
+++ b/activemodel/lib/active_model/attribute_set/builder.rb
@@ -110,7 +110,8 @@ def assign_default_value(name)
value = values.fetch(name) { value_present = false }
if value_present
- delegate_hash[name] = Attribute.from_database(name, value, type)
+ delegate_hash[name] =
+ Attribute.from_database(name, value, type)
elsif types.key?(name)
attr = default_attributes[name]
if attr
We can see the allocation is still reported on the same line number, which is now just the hash assignment line.
[aaron@TC activerecord (allocations-cast)]$ be ruby -I~/git/allocation_tracer/lib allocations.rb
/Users/aaron/git/rails/activemodel/lib/active_model/attribute.rb:9, T_OBJECT [20, 0, 0, 0, 0, 0]
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set/builder.rb:113, T_STRING [20, 0, 0, 0, 0, 0]
[aaron@TC activerecord (allocations-cast)]$
So what is going on? We know that a string is being allocated on this line, but why? The reason is that when a string key is set on a hash, the hash will duplicate the string.
We can observe this behavior in irb:
>> hash = {}
=> {}
>> string = "a string"
=> "a string"
>> hash[string] = true
=> true
>> hash.keys.first.object_id
=> 70105153309120
>> string.object_id
=> 70105153309080
However, this isn't the case with frozen strings. If a string is frozen, the hash will not duplicate the key.
>> hash = {}
=> {}
>> string = "a string".freeze
=> "a string"
>> hash[string] = true
=> true
>> hash.keys.first.object_id
=> 70268592761000
>> string.object_id
=> 70268592761000
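We can also watch this happen with the same count_allocations helper from our benchmark. The counts in the comments are what I'd expect on a recent MRI, so treat them as approximate:
def count_allocations
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

hash   = {}
plain  = "plain key"
frozen = "frozen key".freeze

p count_allocations { hash[plain]  = true } # expect 1: Ruby dups the mutable key
p count_allocations { hash[frozen] = true } # expect 0: the frozen key is stored as-is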
To solve this problem we need to track down where the string was allocated and see if we can change it to be a frozen string.
This method is being called with a non-frozen string, so let's find out who is calling the method. First we'll modify the benchmark to be less noisy.
require 'active_record'
require 'securerandom'
require 'objspace'
ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"
ActiveRecord::Migration.verbose = false
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
t.string "name"
t.string "email"
t.timestamps null: false
end
end
class User < ActiveRecord::Base; end
OBJECT_LIMIT = 20
OBJECT_LIMIT.times do
User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end
users = User.limit(OBJECT_LIMIT).to_a
users.each do |u|
u.id
end
Next we'll modify the Rails source to output the caller.
diff --git a/activemodel/lib/active_model/attribute_set/builder.rb b/activemodel/lib/active_model/attribute_set/builder.rb
index 2b1c2206ec..51f820d96b 100644
--- a/activemodel/lib/active_model/attribute_set/builder.rb
+++ b/activemodel/lib/active_model/attribute_set/builder.rb
@@ -110,7 +110,11 @@ def assign_default_value(name)
value = values.fetch(name) { value_present = false }
if value_present
- delegate_hash[name] = Attribute.from_database(name, value, type)
+ puts "#" * 90
+ puts caller
+ puts "#" * 90
+ delegate_hash[name] =
+ Attribute.from_database(name, value, type)
elsif types.key?(name)
attr = default_attributes[name]
if attr
Now when we run the benchmark, we can see that the method is being called from the id method in primary_key.rb:
##########################################################################################
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set/builder.rb:39:in `[]'
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set.rb:16:in `[]'
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set.rb:48:in `fetch_value'
/Users/aaron/git/rails/activerecord/lib/active_record/attribute_methods/read.rb:77:in `_read_attribute'
/Users/aaron/git/rails/activerecord/lib/active_record/attribute_methods/primary_key.rb:21:in `id'
allocations.rb:27:in `block in <main>'
allocations.rb:26:in `each'
allocations.rb:26:in `<main>'
##########################################################################################
primary_key.rb gets the string value from the primary_key class method. If we jump to that method, we'll see that it calls reset_primary_key. We're on a base class, so this method then calls get_primary_key.
get_primary_key will ask the schema cache for the keys on a table.
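Pieced together, the chain looks roughly like this. It's heavily simplified from memory (the real methods have more branches, as the diff below shows), but it's enough to see where the string comes from:
# Simplified sketch of the lookup chain, not the actual Rails source.
def primary_key
  @primary_key ||= reset_primary_key                      # cached on the class after the first call
end

def reset_primary_key
  get_primary_key(base_class.name)
end

def get_primary_key(base_name)
  pk = connection.schema_cache.primary_keys(table_name)   # the "id" string comes from the adapter
  suppress_composite_primary_key(pk)
end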
If we print out the pk variable, we see it's the "id" string and that this string is not frozen.
diff --git a/activerecord/lib/active_record/attribute_methods/primary_key.rb b/activerecord/lib/active_record/attribute_methods/primary_key.rb
index 9b267bb7c0..4bff18f4b5 100644
--- a/activerecord/lib/active_record/attribute_methods/primary_key.rb
+++ b/activerecord/lib/active_record/attribute_methods/primary_key.rb
@@ -98,6 +98,7 @@ def get_primary_key(base_name) #:nodoc:
else
if ActiveRecord::Base != self && table_exists?
pk = connection.schema_cache.primary_keys(table_name)
+ p pk => pk.frozen?
suppress_composite_primary_key(pk)
else
"id"
Looking at the primary_keys definition in the SQLite3 adapter, we see that it's just querying the database for the primary keys on a table. This is the source of our string, and the place where we should add a .freeze call.
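The shape of the change is roughly this. It's a hypothetical sketch, not the actual patch, and fetch_primary_key_from_table stands in for however the adapter really reads the schema:
# Sketch: freeze the primary key name the SQLite3 adapter returns so that
# later Hash inserts don't need to dup the string.
def primary_keys(table_name)
  pk = fetch_primary_key_from_table(table_name) # hypothetical helper for the schema query
  pk && pk.freeze
end
Freezing works here because callers only ever read the column name, and once the string is frozen, the Hash can store it as a key without duplicating it.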
After adding the freeze call, let's rerun the benchmark.
require 'active_record'
require 'securerandom'
require 'allocation_tracer'
ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"
ActiveRecord::Migration.verbose = false
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
t.string "name"
t.string "email"
t.timestamps null: false
end
end
class User < ActiveRecord::Base; end

OBJECT_LIMIT = 20

OBJECT_LIMIT.times do
  User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end

ObjectSpace::AllocationTracer.setup(%i{path line type})

users = User.limit(OBJECT_LIMIT).to_a

res = ObjectSpace::AllocationTracer.trace {
  users.each do |u|
    u.id
  end
}

res.sort_by { |_, v| v.first }.each do |k, v|
  next unless v.first >= OBJECT_LIMIT
  puts "#{k[0]}:#{k[1]}, #{k[2]} #{v.inspect}"
end
Adding the freeze in the right location...
[aaron@TC-10 activerecord (allocations-cast)]$ be ruby -I lib:~/git/allocation_tracer/lib allocations.rb
/Users/aaron/git/rails/activemodel/lib/active_model/attribute.rb:9, T_OBJECT [20, 0, 0, 0, 0, 0]
[aaron@TC-10 activerecord (allocations-cast)]$
...eliminated all the string object allocations for this method.
require 'active_record'
require 'securerandom'
ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"
ActiveRecord::Migration.verbose = false
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
t.string "name"
t.string "email"
t.timestamps null: false
end
end
class User < ActiveRecord::Base; end
20.times do
User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end
def count_allocations
x = GC.stat(:total_allocated_objects)
yield
GC.stat(:total_allocated_objects) - x
end
User.limit(5).each_with_index do |user, i|
puts "TEST: #{i}"
p count_allocations { user.id }
p count_allocations { user.id }
p count_allocations { user.id }
end
Re-running the pattern-finding benchmark shows that the first call to the id method now only allocates one object rather than two. Great!
[aaron@TC-10 activerecord (allocations-cast)]$ be ruby -I lib:~/git/allocation_tracer/lib allocations.rb
TEST: 0
5
0
0
TEST: 1
1
0
0
TEST: 2
1
0
0
TEST: 3
1
0
0
TEST: 4
1
0
0
[aaron@TC-10 activerecord (allocations-cast)]$
The End.