In Progress
Unit 1, Lesson 1
In Progress

Benchmark-IPS

Video transcript & code

In episode #322 we met the Ruby standard Benchmark class, and learned how to use it to check the performance of difference kinds of code. As I said then, the standard benchmark library is the most important one to know, because it's always available to you.

But there are other benchmarking tools in the Ruby universe. Today we're going to meet one of them.

Here's some code that I wrote the other day in order to test some different Ruby dynamic method dispatch schemes against each other.

Event = Struct.new(:name, :source, :args)

class TestBed
  attr_reader :event_log

  def initialize
    @event_log = []
  end
end

class HardcodeTestbed < TestBed
  def call(event)
    case event.name
    when :foo
      handle_foo(event)
    when :bar
      handle_bar(event)
    when :baz
      handle_baz(event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class SendTestbed < TestBed
  def call(event)
    handler_name = "handle_#{event.name}"
    __send__(handler_name, event) if respond_to?(handler_name)
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class SendTableTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods[$1.to_sym] = method_name.to_sym
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= {}
  end

  def call(event)
    if (handler_method = self.class.handler_methods[event.name])
      __send__(handler_method, event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class BindTableTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods[$1.to_sym] = instance_method(method_name)
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= {}
  end

  def call(event)
    if (handler_method = self.class.handler_methods[event.name])
      handler_method.bind(self).call(event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class CodeGenTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods << $1
      regenerate_dispatch_method
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= []
  end

  def self.regenerate_dispatch_method
    dispatch_table = handler_methods.map { |event_name|
      "when :#{event_name} then handle_#{event_name}(event)"
    }.join("\n")
    class_eval %{
      def call(event)
        case event.name
        #{dispatch_table}
        end
      end
    }
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class IfCodeGenTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods << $1
      regenerate_dispatch_method
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= []
  end

  def self.regenerate_dispatch_method
    dispatch_table = handler_methods.map { |event_name|
      "event.name.equal?(:#{event_name}) then handle_#{event_name}(event)"
    }.join("\nelsif ")
    class_eval %{
      def call(event)
        if #{dispatch_table}
        end
      end
    }
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

def do_test(klass)
  testbed = klass.new
  testbed.call(e1 = Event[:foo])
  testbed.call(e2 = Event[:bar])
  testbed.call(e3 = Event[:baz])
  testbed.call(Event[:buz])
  unless testbed.event_log == [e1, e2, e3]
    fail "#{klass}: #{testbed.event_log.inspect}"
  end
end

classes = [
    HardcodeTestbed,
    SendTestbed,
    SendTableTestbed,
    BindTableTestbed,
    CodeGenTestbed,
    IfCodeGenTestbed,
]
width   = classes.map(&:name).map(&:size).max

require "benchmark"
n = 1000_000
Benchmark.bmbm(16) do |x|
  classes.each do |klass|
    x.report(klass.name) do
      n.times do
        do_test(klass)
      end
    end
  end
end

Let's run this, just to remind ourselves what Benchmark output looks like. I'll fast forward a bit so we don't have to wait too long.

$ ruby bench.rb
Rehearsal ----------------------------------------------------
HardcodeTestbed    1.870000   0.000000   1.870000 (  1.868746)
SendTestbed        4.640000   0.000000   4.640000 (  4.634629)
SendTableTestbed   2.660000   0.000000   2.660000 (  2.664021)
BindTableTestbed   4.000000   0.000000   4.000000 (  3.997878)
CodeGenTestbed     1.810000   0.000000   1.810000 (  1.816788)
IfCodeGenTestbed   2.090000   0.000000   2.090000 (  2.097692)
------------------------------------------ total: 17.070000sec

                       user     system      total        real
HardcodeTestbed    1.790000   0.000000   1.790000 (  1.784674)
SendTestbed        4.820000   0.000000   4.820000 (  4.825172)
SendTableTestbed   2.560000   0.000000   2.560000 (  2.557152)
BindTableTestbed   4.060000   0.010000   4.070000 (  4.061611)
CodeGenTestbed     1.960000   0.000000   1.960000 (  1.966116)
IfCodeGenTestbed   2.120000   0.000000   2.120000 (  2.114812)

While Benchmark is quite useful, there are some areas it could stand some improvement. Some of the nitpicks we might point out include:

  1. In order to get meaningful aggregate data, we have to manually come up with a good number of repetitions, through trial and error. Then we need to arrange for the code to be repeated that many times.
  2. As we noticed in that other episode, if we want the results to be correctly aligned, we have to tell the #bmbm method how long the longest title will be.
  3. When looking at the results, there's no way to tell whether the runtime was mostly consistent across repetitions, or whether certain repetitions contributed to a disproportionate amount of the overall runtime.
  4. The results are shown in the same order that we ran the tests. Benchmark doesn't give us any help sorting them by runtime.

The benchmark-ips Gem by Evan Phoenix is an extension to Benchmark that addresses all of these complaints. And converting our code to use it couldn't be simpler.

First, after installing the gem, we require benchmark/ips instead of plain benchmark.

Then we replace the #bmbm method with #ips. We remove the label width argument. We won't be needing it anymore.

Then, we get rid of the repetition count and the code that uses it. We won't be needing them anymore either.

Event = Struct.new(:name, :source, :args)

class TestBed
  attr_reader :event_log

  def initialize
    @event_log = []
  end
end

class HardcodeTestbed < TestBed
  def call(event)
    case event.name
    when :foo
      handle_foo(event)
    when :bar
      handle_bar(event)
    when :baz
      handle_baz(event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class SendTestbed < TestBed
  def call(event)
    handler_name = "handle_#{event.name}"
    __send__(handler_name, event) if respond_to?(handler_name)
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class SendTableTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods[$1.to_sym] = method_name.to_sym
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= {}
  end

  def call(event)
    if (handler_method = self.class.handler_methods[event.name])
      __send__(handler_method, event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class BindTableTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods[$1.to_sym] = instance_method(method_name)
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= {}
  end

  def call(event)
    if (handler_method = self.class.handler_methods[event.name])
      handler_method.bind(self).call(event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class CodeGenTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods << $1
      regenerate_dispatch_method
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= []
  end

  def self.regenerate_dispatch_method
    dispatch_table = handler_methods.map { |event_name|
      "when :#{event_name} then handle_#{event_name}(event)"
    }.join("\n")
    class_eval %{
      def call(event)
        case event.name
        #{dispatch_table}
        end
      end
    }
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class IfCodeGenTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods << $1
      regenerate_dispatch_method
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= []
  end

  def self.regenerate_dispatch_method
    dispatch_table = handler_methods.map { |event_name|
      "event.name.equal?(:#{event_name}) then handle_#{event_name}(event)"
    }.join("\nelsif ")
    class_eval %{
      def call(event)
        if #{dispatch_table}
        end
      end
    }
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

def do_test(klass)
  testbed = klass.new
  testbed.call(e1 = Event[:foo])
  testbed.call(e2 = Event[:bar])
  testbed.call(e3 = Event[:baz])
  testbed.call(Event[:buz])
  unless testbed.event_log == [e1, e2, e3]
    fail "#{klass}: #{testbed.event_log.inspect}"
  end
end

classes = [
  HardcodeTestbed,
  SendTestbed,
  SendTableTestbed,
  BindTableTestbed,
  CodeGenTestbed,
  IfCodeGenTestbed,
]
width   = classes.map(&:name).map(&:size).max

require "benchmark/ips"
Benchmark.ips do |x|
  classes.each do |klass|
    x.report(klass.name) do
      do_test(klass)
    end
  end
end

And that's it. Let's run this code.

The first thing we see is a rehearsal round. The IPS in Benchmark-IPS stands for "iterations per second". At this point, the gem is doing some dry runs to calculate how many repetitions will be needed to get an accurate iterations per second count.

Then it starts benchmarking for real.

$ ruby bench2.rb
Calculating -------------------------------------
     HardcodeTestbed    36.851k i/100ms
         SendTestbed    19.509k i/100ms
    SendTableTestbed    29.388k i/100ms
    BindTableTestbed    20.717k i/100ms
      CodeGenTestbed    39.082k i/100ms
    IfCodeGenTestbed    35.313k i/100ms
-------------------------------------------------
     HardcodeTestbed    599.624k (± 1.9%) i/s -      3.022M
         SendTestbed    263.969k (± 4.2%) i/s -      1.327M
    SendTableTestbed    409.964k (± 2.5%) i/s -      2.057M
    BindTableTestbed    242.405k (± 7.4%) i/s -      1.222M
      CodeGenTestbed    575.384k (± 5.3%) i/s -      2.892M
    IfCodeGenTestbed    493.219k (± 5.1%) i/s -      2.472M

First off, we can see that the results are nicely aligned without any assistance from us. Each line has a few different columns. The first column is the most important: it's the number of iterations per second the code being tested was able to complete. The higher the number, the faster the code.

Next up is the percent of standard deviation that the times exhibited. This tells us how variable each iteration was compared to others. These numbers are fairly low, meaning that the repetitions were pretty consistent, and don't have outliers skewing the results.

At the end of the line, we see the total number of iterations Benchmark-IPS actually ran in order to get its iterations-per-second average.

We've already seen a number of ways that this gem improves on the baseline Benchmark library. But Benchmark-IPS has one more trick up its sleeve.

We can send another message call on the reporter object, the #compare! message.

Event = Struct.new(:name, :source, :args)

class TestBed
  attr_reader :event_log

  def initialize
    @event_log = []
  end
end

class HardcodeTestbed < TestBed
  def call(event)
    case event.name
    when :foo
      handle_foo(event)
    when :bar
      handle_bar(event)
    when :baz
      handle_baz(event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class SendTestbed < TestBed
  def call(event)
    handler_name = "handle_#{event.name}"
    __send__(handler_name, event) if respond_to?(handler_name)
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class SendTableTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods[$1.to_sym] = method_name.to_sym
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= {}
  end

  def call(event)
    if (handler_method = self.class.handler_methods[event.name])
      __send__(handler_method, event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class BindTableTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods[$1.to_sym] = instance_method(method_name)
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= {}
  end

  def call(event)
    if (handler_method = self.class.handler_methods[event.name])
      handler_method.bind(self).call(event)
    end
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class CodeGenTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods << $1
      regenerate_dispatch_method
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= []
  end

  def self.regenerate_dispatch_method
    dispatch_table = handler_methods.map { |event_name|
      "when :#{event_name} then handle_#{event_name}(event)"
    }.join("\n")
    class_eval %{
      def call(event)
        case event.name
        #{dispatch_table}
        end
      end
    }
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

class IfCodeGenTestbed < TestBed
  def self.method_added(method_name)
    if method_name.to_s =~ /\Ahandle_(.+)\z/
      handler_methods << $1
      regenerate_dispatch_method
    end
    super
  end

  def self.handler_methods
    @handler_methods ||= []
  end

  def self.regenerate_dispatch_method
    dispatch_table = handler_methods.map { |event_name|
      "event.name.equal?(:#{event_name}) then handle_#{event_name}(event)"
    }.join("\nelsif ")
    class_eval %{
      def call(event)
        if #{dispatch_table}
        end
      end
    }
  end

  def handle_foo(event)
    event_log << event
  end

  def handle_bar(event)
    event_log << event
  end

  def handle_baz(event)
    event_log << event
  end
end

def do_test(klass)
  testbed = klass.new
  testbed.call(e1 = Event[:foo])
  testbed.call(e2 = Event[:bar])
  testbed.call(e3 = Event[:baz])
  testbed.call(Event[:buz])
  unless testbed.event_log == [e1, e2, e3]
    fail "#{klass}: #{testbed.event_log.inspect}"
  end
end

classes = [
  HardcodeTestbed,
  SendTestbed,
  SendTableTestbed,
  BindTableTestbed,
  CodeGenTestbed,
  IfCodeGenTestbed,
]
width   = classes.map(&:name).map(&:size).max

require "benchmark/ips"
Benchmark.ips do |x|
  classes.each do |klass|
    x.report(klass.name) do
      do_test(klass)
    end
  end
  x.compare!
end

When we run the benchmark again, we see a new section in the output.

$ ruby bench3.rb
Calculating -------------------------------------
     HardcodeTestbed    38.468k i/100ms
         SendTestbed    20.666k i/100ms
    SendTableTestbed    29.470k i/100ms
    BindTableTestbed    22.667k i/100ms
      CodeGenTestbed    39.616k i/100ms
    IfCodeGenTestbed    38.571k i/100ms
-------------------------------------------------
     HardcodeTestbed    576.640k (± 1.3%) i/s -      2.885M
         SendTestbed    268.347k (± 1.3%) i/s -      1.343M
    SendTableTestbed    412.413k (± 1.0%) i/s -      2.063M
    BindTableTestbed    224.491k (±26.8%) i/s -      1.020M
      CodeGenTestbed    459.260k (±15.1%) i/s -      2.258M
    IfCodeGenTestbed    501.894k (± 9.4%) i/s -      2.507M

Comparison:
     HardcodeTestbed:   576640.3 i/s
    IfCodeGenTestbed:   501894.0 i/s - 1.15x slower
      CodeGenTestbed:   459260.2 i/s - 1.26x slower
    SendTableTestbed:   412412.6 i/s - 1.40x slower
         SendTestbed:   268347.3 i/s - 2.15x slower
    BindTableTestbed:   224491.1 i/s - 2.57x slower

The comparison section helpfully sorts the results from fastest to slowest. And for each one, it tells us exactly what percentage slower than the first it is. We no longer have to do any math ourselves in order to see exactly how big a speed-up a given optimization buys us.

Now, there is one thing missing in these results, that we had in the baseline Benchmark results. We can no longer see a breakdown of system, user, and wallclock time. So Benchmark-IPS is not a total replacement for the standard benchmark methods.

But for many benchmarking purposes, it's exactly what we need, and it does all the busywork for us. When comparing approaches for their performance, Benchmark-IPS is a great tool to keep in our back pockets.

And that's it for today. Happy hacking!

Responses