Episode #525 – Memory Profiling – Nate Berkopec

Objects may be cheap in Ruby, but they aren’t free. In this guest episode by Ruby and Rails optimization expert Nate Berkopec, you’ll learn how to diagnose memory usage problems using Ruby’s built-in ObjectSpace API.

Video transcript & code

In Ruby, we like to say that objects are cheap. And the Ruby virtual machine works hard to allow us to allocate thousands of objects without giving it a second thought.

But even with all of its optimizations, Ruby objects still consume system resources. And in some scenarios, that resource usage can become a bottleneck for application performance.

That's why it's important for us to know, in a pinch, how to diagnose Ruby memory consumption issues. In this episode, guest chef Nate Berkopec is going to show how to use the built-in ObjectSpace API to ask a live Ruby program about the objects it has allocated, and the resources they are using.

Nate is a freelance Ruby and Rails performance consultant. He blogs at speedshop.co, and he maintains the Puma application server. He's also a motorcyclist, amateur radio operator, and search and rescue team member in Taos, New Mexico.

Here he is to show you how to diagnose your Ruby object memory usage problems. Enjoy!


Ruby is a language with automatic memory management. That means when we type something as simple as this:


a = "string"

The memory lifecycle of that object isn't something we have to worry about. The Ruby runtime takes care of allocating space for us, and will use that memory space for other purposes when we're done with the object.


a = "string"
a = nil

We can even modify objects and the runtime will take care of increasing or decreasing the memory space allocated to the object:


a = "string"
a = a * 100

However, this ease of use can sometimes be a double-edged sword when the abstraction starts to leak. For example, many large-scale Ruby web applications suffer from large amounts of memory usage, sometimes gigabytes per process. And when Rubyists without experience in other languages encounter this problem, they don't really know what to do about it, because up until now they haven't even had to think about memory at all in their Ruby career.

Debugging memory problems can be made easier if we have a deeper understanding of how Ruby manages memory internally, and what things we do in our code create excessive memory usage.

One fun way to increase our understanding of Ruby memory is to use a memory profiler. A simple memory profiler should tell us how many objects are created in a block of code. Let's build one now.

Ruby includes a magical module called ObjectSpace. It has a neat little method called count_objects:


ObjectSpace.count_objects
=> {:TOTAL=>53802, :FREE=>31, :T_OBJECT=>3373, :T_CLASS=>888, :T_MODULE=>30, :T_FLOAT=>4, :T_STRING=>36497, :T_REGEXP=>164, :T_ARRAY=>9399, :T_HASH=>789, :T_STRUCT=>2, :T_BIGNUM=>2, :T_FILE=>7, :T_DATA=>1443, :T_MATCH=>85, :T_COMPLEX=>1, :T_NODE=>1050, :T_ICLASS=>37}

The keys in this hash describe the current memory state of this Ruby process.

Total describes the number of slots available to place objects into. This is the maximum number of live objects we can have without triggering garbage collection.


ObjectSpace.count_objects[:TOTAL]
=> 53802

Free is the number of empty slots currently available for us to put an object into. If this number reaches zero, garbage collection is triggered and Ruby either frees up some slots and puts the objects there, or it will increase the size of the object list.


ObjectSpace.count_objects[:FREE]
=> 31
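
As a rough illustration of how TOTAL and FREE relate, here is a minimal sketch that derives the number of occupied slots from the two. The helper name slot_usage and the shape of its return value are assumptions made for this example, not part of ObjectSpace.

```ruby
# Hypothetical helper: summarizes slot usage from ObjectSpace.count_objects.
def slot_usage
  counts = ObjectSpace.count_objects
  total  = counts[:TOTAL]
  free   = counts[:FREE]
  used   = total - free
  {
    total: total,
    free: free,
    used: used,
    percent_used: (100.0 * used / total).round(1)
  }
end

usage = slot_usage
puts "#{usage[:used]} of #{usage[:total]} slots in use (#{usage[:percent_used]}%)"
```

The numbers will differ from run to run and between Ruby versions, so treat them as a snapshot rather than a stable measurement.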

Then there's a bunch of keys for different types of objects. Many are familiar, such as Class, Module, Float, and String, but some are probably unfamiliar to you because they're internal datatypes used by the runtime, such as T_DATA, T_NODE, and T_ICLASS. For example, T_NODE counts the nodes of the abstract syntax tree of your program.


ObjectSpace.count_objects[:T_NODE]
=> 1050

Just pay attention to the Ruby "primitive types" here; they're the important ones.


ObjectSpace.count_objects[:T_STRING]
=> 36497

You can probably already see some applications for the ObjectSpace module - logging object counts to an external service, for example, or to your development console or logs. You can do this without any performance overhead worries - these statistics are already kept whether or not you are using them.
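
Logging object counts could look something like the sketch below. The helper name log_object_counts, the chosen keys, and the log line format are all assumptions for illustration; a real application would pick whatever its logging or metrics service expects.

```ruby
require "logger"

# Hypothetical helper: writes a one-line snapshot of selected object counts.
# The default keys and the "object_counts" prefix are illustrative choices.
def log_object_counts(logger, keys = %i[T_STRING T_ARRAY T_HASH T_OBJECT])
  counts = ObjectSpace.count_objects
  line = keys.map { |key| "#{key}=#{counts.fetch(key, 0)}" }.join(" ")
  logger.info("object_counts #{line}")
  line
end

log_object_counts(Logger.new($stdout))
```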

Try running ObjectSpace.count_objects a few times and you'll even see the numbers change.


ObjectSpace.count_objects[:T_STRING]
=> 18824
ObjectSpace.count_objects[:T_STRING]
=> 18870
ObjectSpace.count_objects[:T_STRING]
=> 18595
ObjectSpace.count_objects[:T_STRING]
=> 18748

Let's say our profiler should be able to tell us the number of objects created inside of a block, so we want an interface something like this:


result = MyMemoryProfiler.profile do
  # some code we want to profile
end

puts result

First things first. To measure the objects created in a block accurately, we'll have to disable garbage collection for the duration of the block. If we don't, our profiler will return significantly different results if garbage collection just happens to occur while the block is executing.

Did you know that turning Ruby’s garbage collector on and off is really simple?


GC.disable

That disables garbage collection. This turns it back on:


GC.enable

And this will trigger garbage collection to occur, regardless of memory conditions:


GC.start
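
One detail that can help when wrapping these calls: GC.disable returns true if garbage collection was already disabled, and GC.enable returns true if it was disabled before the call. The sketch below uses that to restore whatever state it found. The helper name without_gc is an assumption for this example.

```ruby
# Hypothetical helper: runs a block with GC disabled, then restores the
# previous state instead of unconditionally re-enabling GC.
def without_gc
  was_disabled = GC.disable # true if GC was already off before this call
  yield
ensure
  GC.enable unless was_disabled
end

without_gc do
  10_000.times { "temporary string" }
end
```

This way a caller that had already turned the collector off does not get it switched back on behind its back.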

Let's combine ObjectSpace.count_objects with GC.disable and GC.enable to test our own theories about how memory works on a micro scale. For example, how many strings will this block allocate?


100.times do
  'hello' + ' ' + 'world'
end

Time to check your guess. Let's write our memory profiler, so we can measure how many objects are allocated in this code.

First, some structure for our profiler:


class MyMemoryProfiler
  def self.profile
    yield
  end
end

We've got a class with a class method that will take a block. Okay, now it's time to start profiling.

First, we disable the GC:


class MyMemoryProfiler
  def self.profile
    GC.disable
    yield
  end
end

Don't forget to turn it back on again after you're done, too! We'll probably want to put this into an ensure block, just in case our profiled block raises an exception.


class MyMemoryProfiler
  def self.profile
    GC.disable
    yield
  ensure
    GC.enable
  end
end

Now, we'll take a measurement of the "before" state of our object counts. We'll execute the block, then take an "after" measurement.


class MyMemoryProfiler
  def self.profile
    GC.disable
    before = ObjectSpace.count_objects
    yield
    after = ObjectSpace.count_objects
  ensure
    GC.enable
  end
end

Recall that each key in the ObjectSpace hash is a type of object. So, we'll iterate through that hash and subtract the number of objects we had before the block from the number we had after the block executed.


class MyMemoryProfiler
  def self.profile
    GC.disable
    before = ObjectSpace.count_objects
    yield
    after = ObjectSpace.count_objects
    after.each { |k,v| after[k] = v - before[k] }
  ensure
    GC.enable
  end
end

Now, ObjectSpace.count_objects will create a hash object, so we have to subtract that from our result. This is called the probe effect: measuring or profiling a system changes the behavior of that system. We'll also have one less free slot available, due to the new hash object.


class MyMemoryProfiler
  def self.profile
    GC.disable
    before = ObjectSpace.count_objects
    yield
    after = ObjectSpace.count_objects
    after.each { |k,v| after[k] = v - before[k] }
    after[:T_HASH] -= 1
    after[:FREE] += 1
  ensure
    GC.enable
  end
end

We'll also just remove any object types which didn't change during our analysis.


class MyMemoryProfiler
  def self.profile
    GC.disable
    before = ObjectSpace.count_objects
    yield
    after = ObjectSpace.count_objects
    after.each { |k,v| after[k] = v - before[k] }
    after[:T_HASH] -= 1
    after[:FREE] += 1
    after.reject { |k,v| v == 0 }
  ensure
    GC.enable
  end
end

Great. That should do it. Let's check the result of my earlier problem:


MyMemoryProfiler.profile do
  100.times { 'hello' + ' ' + 'world' }
end
=> {
        :FREE => -500,
    :T_STRING => 500
}

Did you get the right answer? If not, here's a hint: Ruby combines the strings one at a time - "hello" + " " becomes "hello ", and so on.


# 100 times
"hello"
" "
"world"
"hello "
"hello world"

Using this memory profiler, we can "micro-benchmark" different idioms to see which ones use more memory than others. You can learn a lot about how Ruby memory works this way.
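
As one possible micro-benchmark, the sketch below compares string concatenation with string interpolation. The helper strings_allocated is a hypothetical, cut-down variant of the profiler built above that reports only the T_STRING difference. Exact counts depend on the Ruby version and on settings such as frozen string literals, so the sketch prints the results rather than assuming them.

```ruby
# Hypothetical helper, modeled on the profiler in this episode: returns how
# many T_STRING slots a block fills while GC is disabled.
def strings_allocated
  was_disabled = GC.disable
  before = ObjectSpace.count_objects[:T_STRING]
  yield
  ObjectSpace.count_objects[:T_STRING] - before
ensure
  GC.enable unless was_disabled
end

plus = strings_allocated { 100.times { 'hello' + ' ' + 'world' } }

interpolated = strings_allocated do
  greeting = 'hello'
  subject = 'world'
  100.times { "#{greeting} #{subject}" }
end

puts "concatenation: #{plus} strings"
puts "interpolation: #{interpolated} strings"
```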

ObjectSpace has a lot of other goodies too. Check out this parlour trick. We can count the number of live objects:


ObjectSpace.each_object.count #=> 42552

We can also count how many objects are currently live in each class.


ObjectSpace.each_object(Numeric).count #=> 7
ObjectSpace.each_object(Complex).count #=> 1

Oh, and that's right - you can iterate through every live object.


ObjectSpace.each_object(Complex) { |c| puts c } #=> 0+1i

There are a lot of applications for ObjectSpace.each_object. minitest originally used it to discover test classes. You can use it in development to count and inspect objects created in your application - for example, open up a console mid-request and start counting and iterating through ActiveRecord objects.


ObjectSpace.each_object(ActiveRecord::Base) { |r| puts r }
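
The same pattern can be tried without Rails. In this sketch, Widget is a hypothetical class standing in for a model; the instances are kept in an array so they are still live when we go looking for them.

```ruby
# Hypothetical class standing in for a model such as an ActiveRecord record.
class Widget
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Keep references so the instances stay live while we look for them.
widgets = %w[alpha beta gamma].map { |name| Widget.new(name) }

live = ObjectSpace.each_object(Widget).map(&:name).sort
puts "#{live.size} live widgets: #{live.join(', ')}"
```

Objects that are no longer referenced may still show up until the garbage collector has reclaimed them, so counts taken this way are an upper bound on what the program is actually holding on to.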

Here’s a way to print all active objects by class, giving you an idea of what modules are creating and retaining the most objects:


result = ObjectSpace.each_object.
  map(&:class).
  each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }.
  sort_by { |k,v| v }
result.last(3) #=> [[Class, 500], [Array, 2434], [String, 11804]]
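
A variation on the same idea, as a sketch: counting in a block avoids building an intermediate array of classes, and passing Object to each_object skips instances that descend only from BasicObject, which may not respond to #class. The helper name live_objects_by_class is an assumption for this example.

```ruby
# Hypothetical helper: counts live objects per class, largest first.
def live_objects_by_class(limit = 3)
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |object| counts[object.class] += 1 }
  counts.sort_by { |_klass, count| -count }.first(limit)
end

live_objects_by_class.each do |klass, count|
  puts "#{klass}: #{count}"
end
```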

require "objspace"


require "objspace"

extends the ObjectSpace module with several awesome methods. It's a kind of "debugging" module, really - the documentation comes with this stern warning:

Generally, you *SHOULD NOT* use this library if you do not know about the MRI implementation. Mainly, this library is for (memory) profiler developers and MRI developers who need to know about MRI memory usage.

Well, thanks to this short screencast, you're a memory profiler developer now. But, there's a good reason for this warning. require "objspace" will slow any production application to a crawl, thanks to all of the tracing it adds, so this is strictly for development use only. With that caveat, the ObjectSpace extension has a lot of superpowers. You can read about all of them in the docs, but I'm going to show you the ones that are really useful for any Ruby developer that's trying to understand how their app uses memory.

Here's an interesting one - count_objects_size:


ObjectSpace.count_objects_size
{
     :T_OBJECT => 198560,
      :T_CLASS => 614784,
     :T_MODULE => 66712,
      :T_FLOAT => 160,
     :T_STRING => 1578522,
     :T_REGEXP => 122875,
      :T_ARRAY => 630976,
       :T_HASH => 165672,
     :T_STRUCT => 160
     ...

ObjectSpace.count_objects_size shows you, in bytes, how much memory each type of object is using. I'll note that it's not especially accurate, but it's a good starting point for figuring out what kinds of objects are taking up memory in your app. This adds a bit more information on top of the count by object type, because you could have a problem where, say, just a dozen strings are using tens of megabytes of memory. A simple count won't help you there.
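
To make that output easier to scan, one option is to sort it and show only the largest types. This is a minimal sketch; the helper name largest_types and the choice to report KiB are assumptions, and the :TOTAL entry is dropped so it does not crowd out the individual types.

```ruby
require "objspace"

# Hypothetical helper: the largest object types by reported size, in KiB.
def largest_types(limit = 3)
  sizes = ObjectSpace.count_objects_size
  sizes.reject { |type, _bytes| type == :TOTAL }
       .sort_by { |_type, bytes| -bytes }
       .first(limit)
       .map { |type, bytes| [type, (bytes / 1024.0).round(1)] }
end

largest_types.each { |type, kib| puts "#{type}: #{kib} KiB" }
```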

We can also check the size of individual objects (in bytes) with memsize_of:


ObjectSpace.memsize_of("The quick brown fox")
40
ObjectSpace.memsize_of([])
40
ObjectSpace.memsize_of(Array.new(10_000) { :a })
80040

Note how this demonstrates a bit of how Ruby uses memory that you may not have been aware of - objects are pretty much always at least 40 bytes.
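
To see where a string stops fitting in that base size, you could try something like the sketch below. The reported sizes vary by Ruby version and platform, so it prints them rather than assuming specific values.

```ruby
require "objspace"

# Print the reported size of strings of increasing length.
[0, 10, 100, 1_000, 10_000].each do |length|
  string = "x" * length
  puts "String of #{length} chars: #{ObjectSpace.memsize_of(string)} bytes"
end
```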

There's also memsize_of_all to get the total memory size of a certain class of objects - this is slightly more useful than ObjectSpace.count_objects_size because it actually uses the classes in your application, rather than the internal data types of MRI:


ObjectSpace.memsize_of_all(String)
600682
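
For a side-by-side view of a few classes, a sketch like this may help. The helper name memory_by_class is an assumption for this example; called without an argument, memsize_of_all reports the total for all live objects.

```ruby
require "objspace"

# Hypothetical helper: reported bytes for each of the given classes.
def memory_by_class(*classes)
  classes.to_h { |klass| [klass, ObjectSpace.memsize_of_all(klass)] }
end

memory_by_class(String, Array, Hash).each do |klass, bytes|
  puts "#{klass}: #{bytes} bytes"
end
puts "All objects: #{ObjectSpace.memsize_of_all} bytes"
```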

Well, that's a whirlwind tour of the ObjectSpace module. It's a great starting point for learning more about Ruby's memory internals, so dig in and give it a try!


Editor's Note, from the RubyTapas Masala Chef: The astute Ruby programmer will carefully consult the Ruby 2.3.0 documentation for ObjectSpace.count_objects_size, ObjectSpace.memsize_of, and ObjectSpace.memsize_of_all.
