In Progress
Unit 1, Lesson 21
In Progress

ActiveRecord Optimizations

For anyone who has been working with Ruby and Rails for any length of time, today’s guest chef needs no introduction. Aaron Patterson is a longtime member of both the Ruby and Rails core teams, and a prolific contributor to both projects.

One of the areas that Aaron has specialized in over the years is finding ways to make Ruby on Rails go faster and use fewer resources. In today’s episode, you’re going to get a rare peek behind the scenes, as Aaron uses his debugging and profiling skills to get to diagnose and fix some ActiveRecord code that uses an excessive amount of memory. In the process, you’ll get a window into how Aaron instruments code to understand its memory usage, and how he explores unfamiliar code to discover how it works. Enjoy!

Video transcript & code

Reducing Active Record Allocations


Why hello! I didn't see you come in. Please have a seat while I work on this thing. My name is Aaron Patterson a.k.a. tenderlove, and today I'm looking at performance improvements to Active Record. I love working on performance improvements and I'd like to work on one today with you!

First thing we need to do is write a benchmark. We're going to be investigating the number of objects require to read an attribute.


require 'active_record'
require 'securerandom'

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.verbose = false

ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string "name"
    t.string "email"

    t.timestamps null: false
  end
end

class User < ActiveRecord::Base; end

20.times do
  User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end

def count_allocations
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

user = User.first
p count_allocations { user.id }
p count_allocations { user.id }
p count_allocations { user.id }

First, we'll connect to a SQLite database. If we specify the database as ":memory:", SQLite will create an in-memory database. I like using an in-memory database because the benchmark will always start with a clean slate and we don't have to factor in IO time to our benchmark.

Next we disable migration output, create a users table, and a user class. Next we insert 20 records in the database with some random data. In this case the number of records we're inserting doesn't matter so much, but we're going to use more than one record.

We have a small measurement function here called count_allocations. This function asks the garbage collector for a statistic called total_allocated_objects. total_allocated_objects is a number that increments every time an object is allocated. The count_allocations function simply saves the current number of allocations, yields to the block, then takes another measurement and returns the difference. This simple benchmark doesn't require any external gems, so its a nice place to start.

Finally, we'll get the first user record and count the number of allocations the first 3 times we call id on the user object.

Lets run the script and see what the output is:


[aaron@TC activerecord (allocations-cast)]$ be ruby allocations.rb 
6
0
0
[aaron@TC activerecord (allocations-cast)]$

The first call to .id allocated 6 objects, but the next two allocated nothing. From this output we can guess that the implementation of the id method does some sort of computation on the first call, but caches that computation such that subsequent calls are cheaper.

Lets modify the benchmark a little to see if the pattern holds. We'll modify the benchmark to get the first 5 records, then count the number of allocations made when calling the id method.


require 'active_record'
require 'securerandom'

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.verbose = false

ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string "name"
    t.string "email"

    t.timestamps null: false
  end
end

class User < ActiveRecord::Base; end

20.times do
  User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end

def count_allocations
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

User.limit(5).each_with_index do |user, i|
  puts "TEST: #{i}"
  p count_allocations { user.id }
  p count_allocations { user.id }
  p count_allocations { user.id }
end

Lets run this script and see if there is a pattern.


[aaron@TC activerecord (allocations-cast)]$ be ruby allocations.rb 
TEST: 0
6
0
0
TEST: 1
2
0
0
TEST: 2
2
0
0
TEST: 3
2
0
0
TEST: 4
2
0
0
[aaron@TC activerecord (allocations-cast)]$

The first thing we see here is that the first call to id in each iteration allocates at least 2 objects, but subsequent calls all allocate 0 objects. So we can probably safely ignore the second call to any attribute method in our benchmark.

The second thing we can see here is that the very first iteration allocates more objects than subsequent iterations. Again this probably means that the first call to id is setting up some sort of cache that other calls to id don't need to do.

Lets modify the benchmark to find where these objects are allocated, and see if we can reduce them.


require 'active_record'
require 'securerandom'
require 'allocation_tracer'

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.verbose = false

ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string "name"
    t.string "email"

    t.timestamps null: false
  end
end

class User < ActiveRecord::Base; end OBJECT_LIMIT = 20 OBJECT_LIMIT.times do User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid) end ObjectSpace::AllocationTracer.setup(%i{path line type}) users = User.limit(OBJECT_LIMIT).to_a res = ObjectSpace::AllocationTracer.trace{ users.each do |u| u.id end } res.sort_by { |_, v| v.first }.each do |k,v| next unless v.first >= OBJECT_LIMIT
  puts "#{k[0]}:#{k[1]}, #{k[2]} #{v.inspect}"
end

The first change we'll make is to pull in the allocation_tracer gem. This gem will tell us where we're allocating objects, what type they are, and how many allocations were made.

Next we refactor the number of objects we'll allocate in to a constant. This way we can change the number of records in our bench mark in case there is any noise we want to eliminate.

Next we configure Allocation Tracer to keep track of the filename (or path), the line number, and the type of object that has been allocated. Next we run the benchmark while Allocation Tracer is instrumenting the code.

The Allocation Tracer result object is a hash. The keys of the hash are arrays with three elements that correspond to the configuration options we passed to Allocation Tracer. In this case the key is an array with three elements, the file, line, and type. The value is an array that contains statistics about the allocation including number of allocations, size, age, and other data. Here, we only care about reducing the number of allocations, and the total number of allocations for that type is the first element of the value array.

First we sort the result by the number of allocations, then iterate over the results and print them out. We skip any allocations that are less than OBJECT_LIMIT because we know from our previous tests that the allocations we want to reduce have a one to one relationship with the number of Active Record objects we looked up. Any allocations less than the number of Active Record objects we looked up are probably not interesting. To test this, we can increase the OBJECT_LIMIT and see how the output changes, but we'll skip that for today.

Lets run the benchmark and check the output.


[aaron@TC activerecord (allocations-cast)]$ be ruby -I~/git/allocation_tracer/lib allocations.rb 
/Users/aaron/git/rails/activemodel/lib/active_model/attribute.rb:9, T_OBJECT [20, 0, 0, 0, 0, 0]
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set/builder.rb:113, T_STRING [20, 0, 0, 0, 0, 0]
[aaron@TC activerecord (allocations-cast)]$

From this output, we can see that we're allocating a T_OBJECT and a T_STRING for every first call of id on an Active Record object. T_OBJECT and T_STRING are object types that Ruby uses internally, but they roughly correspond to "a Ruby object" and "a String object" respectively. The id method is supposed to return an integer, so the string allocation is interesting.

Lets take a look at the string allocation location:


      def assign_default_value(name)
        type = additional_types.fetch(name, types[name])
        value_present = true
        value = values.fetch(name) { value_present = false }

        if value_present
          delegate_hash[name] = Attribute.from_database(name, value, type) # Here
        elsif types.key?(name)
          attr = default_attributes[name]
          if attr
            delegate_hash[name] = attr.dup
          else
            delegate_hash[name] = Attribute.uninitialized(name, type)
          end
        end
      end

[su_spoiler title="Complete Source Code."] (see line #107)


# frozen_string_literal: true

require "active_model/attribute"

module ActiveModel
  class AttributeSet # :nodoc:
    class Builder # :nodoc:
      attr_reader :types, :default_attributes

      def initialize(types, default_attributes = {})
        @types = types
        @default_attributes = default_attributes
      end

      def build_from_database(values = {}, additional_types = {})
        attributes = LazyAttributeHash.new(types, values, additional_types, default_attributes)
        AttributeSet.new(attributes)
      end
    end
  end

  class LazyAttributeHash # :nodoc:
    delegate :transform_values, :each_key, :each_value, :fetch, :except, to: :materialize

    def initialize(types, values, additional_types, default_attributes, delegate_hash = {})
      @types = types
      @values = values
      @additional_types = additional_types
      @materialized = false
      @delegate_hash = delegate_hash
      @default_attributes = default_attributes
    end

    def key?(key)
      delegate_hash.key?(key) || values.key?(key) || types.key?(key)
    end

    def [](key)
      delegate_hash[key] || assign_default_value(key)
    end

    def []=(key, value)
      if frozen?
        raise RuntimeError, "Can't modify frozen hash"
      end
      delegate_hash[key] = value
    end

    def deep_dup
      dup.tap do |copy|
        copy.instance_variable_set(:@delegate_hash, delegate_hash.transform_values(&:dup))
      end
    end

    def initialize_dup(_)
      @delegate_hash = Hash[delegate_hash]
      super
    end

    def select
      keys = types.keys | values.keys | delegate_hash.keys
      keys.each_with_object({}) do |key, hash|
        attribute = self[key]
        if yield(key, attribute)
          hash[key] = attribute
        end
      end
    end

    def ==(other)
      if other.is_a?(LazyAttributeHash)
        materialize == other.materialize
      else
        materialize == other
      end
    end

    def marshal_dump
      [@types, @values, @additional_types, @default_attributes, @delegate_hash]
    end

    def marshal_load(values)
      if values.is_a?(Hash)
        empty_hash = {}.freeze
        initialize(empty_hash, empty_hash, empty_hash, empty_hash, values)
        @materialized = true
      else
        initialize(*values)
      end
    end

    protected
      def materialize
        unless @materialized
          values.each_key { |key| self[key] }
          types.each_key { |key| self[key] }
          unless frozen?
            @materialized = true
          end
        end
        delegate_hash
      end

    private
      attr_reader :types, :values, :additional_types, :delegate_hash, :default_attributes

      def assign_default_value(name)
        type = additional_types.fetch(name, types[name])
        value_present = true
        value = values.fetch(name) { value_present = false }

        if value_present
          delegate_hash[name] = Attribute.from_database(name, value, type)
        elsif types.key?(name)
          attr = default_attributes[name]
          if attr
            delegate_hash[name] = attr.dup
          else
            delegate_hash[name] = Attribute.uninitialized(name, type)
          end
        end
      end
  end
end

[/su_spoiler]

(The complete source code can also be found at GitHub.)

This line seems to allocate a Ruby string. There are two method calls on this line, so it's not totally clear where the allocation is happening. This benchmark is pretty fast to run, so lets just break the line up and re-run.


diff --git a/activemodel/lib/active_model/attribute_set/builder.rb b/activemodel/lib/active_model/attribute_set/builder.rb
index 2b1c2206ec..2fe33ccc15 100644
--- a/activemodel/lib/active_model/attribute_set/builder.rb
+++ b/activemodel/lib/active_model/attribute_set/builder.rb
@@ -110,7 +110,8 @@ def assign_default_value(name)
         value = values.fetch(name) { value_present = false }
 
         if value_present
-          delegate_hash[name] = Attribute.from_database(name, value, type)
+          delegate_hash[name] =
+            Attribute.from_database(name, value, type)
         elsif types.key?(name)
           attr = default_attributes[name]
           if attr

We can see the allocation is happening on the same line, which is the hash set line.


[aaron@TC activerecord (allocations-cast)]$ be ruby -I~/git/allocation_tracer/lib allocations.rb 
/Users/aaron/git/rails/activemodel/lib/active_model/attribute.rb:9, T_OBJECT [20, 0, 0, 0, 0, 0]
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set/builder.rb:113, T_STRING [20, 0, 0, 0, 0, 0]
[aaron@TC activerecord (allocations-cast)]$

So what is going on? We know that a string is being allocated on this line, but why? The reason is that when a string key is set on a hash, the hash will duplicate the string.

We can observe this behavior in irb:


>> hash = {}
=> {}
>> string = "a string"
=> "a string"
>> hash[string] = true
=> true
>> hash.keys.first.object_id
=> 70105153309120
>> string.object_id
=> 70105153309080

However, this isn't the case with frozen strings. If a string is frozen, the hash will not duplicate the key.


>> hash = {}
=> {}
>> string = "a string".freeze
=> "a string"
>> hash[string] = true
=> true
>> hash.keys.first.object_id
=> 70268592761000
>> string.object_id
=> 70268592761000

To solve this problem we need to track down where the string was allocated and see if we can change it to be a frozen string.

This method is being called with a non-frozen string, so lets find who is calling the method. First we'll modify the benchmark to be less noisy.


require 'active_record'
require 'securerandom'
require 'objspace'

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.verbose = false

ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string "name"
    t.string "email"

    t.timestamps null: false
  end
end

class User < ActiveRecord::Base; end

OBJECT_LIMIT = 20
OBJECT_LIMIT.times do
  User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end

users = User.limit(OBJECT_LIMIT).to_a

users.each do |u|
  u.id
end

Next we'll modify the Rails source to output the caller.


diff --git a/activemodel/lib/active_model/attribute_set/builder.rb b/activemodel/lib/active_model/attribute_set/builder.rb
index 2b1c2206ec..51f820d96b 100644
--- a/activemodel/lib/active_model/attribute_set/builder.rb
+++ b/activemodel/lib/active_model/attribute_set/builder.rb
@@ -110,7 +110,11 @@ def assign_default_value(name)
         value = values.fetch(name) { value_present = false }
 
         if value_present
-          delegate_hash[name] = Attribute.from_database(name, value, type)
+          puts "#" * 90
+          puts caller
+          puts "#" * 90
+          delegate_hash[name] =
+            Attribute.from_database(name, value, type)
         elsif types.key?(name)
           attr = default_attributes[name]
           if attr

Now when we run the benchmark, we can see that the method is being called from the id method in primary_key.rb


##########################################################################################
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set/builder.rb:39:in `[]'
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set.rb:16:in `[]'
/Users/aaron/git/rails/activemodel/lib/active_model/attribute_set.rb:48:in `fetch_value'
/Users/aaron/git/rails/activerecord/lib/active_record/attribute_methods/read.rb:77:in `_read_attribute'
/Users/aaron/git/rails/activerecord/lib/active_record/attribute_methods/primary_key.rb:21:in `id'
allocations.rb:27:in `block in '
allocations.rb:26:in `each'
allocations.rb:26:in `'
##########################################################################################

primary_key.rb gets the string value from the primary_key class method. If we jump to that method,

we'll see that it calls reset_primary_key. We're on a base class, so this method then calls get_primary_key.

get_primary_key will ask the schema cache for the keys on a table.

If we print out the pk variable, we see it's the "id" string and that this string is not frozen.


diff --git a/activerecord/lib/active_record/attribute_methods/primary_key.rb b/activerecord/lib/active_record/attribute_methods/pr
imary_key.rb
index 9b267bb7c0..4bff18f4b5 100644
--- a/activerecord/lib/active_record/attribute_methods/primary_key.rb
+++ b/activerecord/lib/active_record/attribute_methods/primary_key.rb
@@ -98,6 +98,7 @@ def get_primary_key(base_name) #:nodoc:
             else
               if ActiveRecord::Base != self && table_exists?
                 pk = connection.schema_cache.primary_keys(table_name)
+                p pk => pk.frozen?
                 suppress_composite_primary_key(pk)
               else
                 "id"

Looking at the primary_keys definition in the SQLite3 adapter, we see that it's just querying the database for the primary keys on a table. This is the source of our string, and the place where we should add a .freeze call.

After adding the freeze call, lets rerun the benchmark.


require 'active_record'
require 'securerandom'
require 'allocation_tracer'

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.verbose = false

ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string "name"
    t.string "email"

    t.timestamps null: false
  end
end

class User < ActiveRecord::Base; end OBJECT_LIMIT = 20 OBJECT_LIMIT.times do User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid) end ObjectSpace::AllocationTracer.setup(%i{path line type}) users = User.limit(OBJECT_LIMIT).to_a res = ObjectSpace::AllocationTracer.trace{ users.each do |u| u.id end } res.sort_by { |_, v| v.first }.each do |k,v| next unless v.first >= OBJECT_LIMIT
  puts "#{k[0]}:#{k[1]}, #{k[2]} #{v.inspect}"
end

Adding the freeze in the right location...


[aaron@TC-10 activerecord (allocations-cast)]$ be ruby -I lib:~/git/allocation_tracer/lib allocations.rb
/Users/aaron/git/rails/activemodel/lib/active_model/attribute.rb:9, T_OBJECT [20, 0, 0, 0, 0, 0]
[aaron@TC-10 activerecord (allocations-cast)]$

...eliminated all the string object allocations for this method.


require 'active_record'
require 'securerandom'

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.verbose = false

ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string "name"
    t.string "email"

    t.timestamps null: false
  end
end

class User < ActiveRecord::Base; end

20.times do
  User.create!(name: SecureRandom.uuid, email: SecureRandom.uuid)
end

def count_allocations
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

User.limit(5).each_with_index do |user, i|
  puts "TEST: #{i}"
  p count_allocations { user.id }
  p count_allocations { user.id }
  p count_allocations { user.id }
end

Re-running the pattern finding benchmark shows that the first call to the id method now only allocates one object rather than two. Great!


[aaron@TC-10 activerecord (allocations-cast)]$ be ruby -I lib:~/git/allocation_tracer/lib allocations.rb
TEST: 0
5
0
0
TEST: 1
1
0
0
TEST: 2
1
0
0
TEST: 3
1
0
0
TEST: 4
1
0
0
[aaron@TC-10 activerecord (allocations-cast)]$ 

The End.

 

Responses