Unit 1, Lesson 21

Performance Testing – Scaling with Inputs

Anytime our code makes a decision or produces modified output based on inputs, that’s an algorithm. When we feed large quantities of data to our algorithms, we start having to worry about their performance scaling characteristics. Does processing twice as much data take twice as much time? Or more? Or less?

In today’s episode, guest chef Piotr Murach returns to explain how algorithm performance can be analyzed and characterized using “Big-O” notation. Then he goes on to demonstrate how we can programmatically verify our algorithmic performance assumptions. Enjoy!

Video transcript & code

Performance Testing - Scaling with Inputs

Computational Complexity

In the previous episode, we looked at locking code performance characteristics in a test suite and then safely refactoring it to be much faster. This time around, I would like us to shift our attention to the concept of computational complexity. In other words, a way of figuring out how our code scales with inputs of increasing size, answering the question: does our code consume significantly more resources to perform its computation under higher workloads or not?

Performance surprises

This is often where performance surprises lurk. The code works fine for small to medium-sized inputs. However, with large inputs a performance problem suddenly becomes apparent: the code takes a long time to finish, if it finishes at all.

Big-O

To be able to discuss whether our code scales with increasing inputs, we need to quickly dive into Big-O notation. The idea behind Big-O is simple: it is a formal notation that allows us to talk about messy math in a pragmatic way, programmer to programmer. Traditionally, computer scientists have used this notation to talk about the scalability of algorithms in terms of their computational complexity. For example, linear search is O(n) and binary search is O(log n) in the size of the array.
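
To make that difference concrete, here is a minimal Ruby sketch of both searches (not from the episode): linear search may have to look at every element, while binary search discards half of the remaining portion of a sorted array at each step.


def linear_search(array, target)
  array.each_with_index do |element, index|
    # Worst case inspects all n elements: O(n)
    return index if element == target
  end
  nil
end

def binary_search(sorted, target)
  low, high = 0, sorted.length - 1
  while low <= high
    mid = (low + high) / 2
    return mid if sorted[mid] == target
    # Discard half of the remaining range on every iteration: O(log n)
    if sorted[mid] < target
      low = mid + 1
    else
      high = mid - 1
    end
  end
  nil
end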

X & Y axis with data points

Another way to think about Big-O notation is to say that the data points plotted on a chart between the X and Y axes cluster together to form a particular curve. This curve indicates a trend that all future points will follow: a prediction of what values to expect for large inputs. To make it more concrete, let's look at life expectancy in the USA from 1900, in 10-year periods. We can see that the expected age rises quickly at the start of the century and tapers off towards the end.

X & Y axis with trend line

We can draw a line that best approximates all the data points and in doing so, it gives us a way of thinking about the future trend of life expectancy. We can make an educated guess about how long people will live in 2049.

Logarithmic O(log n)

We call this trend logarithmic and use O(log n) notation to describe it.

X & Y axis with data points

Let's take a look at another example. This time, we plot data points based on the number of mobile subscribers in the USA in millions from 1988 to 1997.

X & Y axis with trend line

We observe nearly the opposite behaviour compared with life expectancy, namely a trend line that starts slowly in the initial years but rises steeply in later years.

Quadratic O(n ^ 2)

In general, this is what we call a power-law curve; in this particular instance it is a quadratic trend, described as O(n^2).

There are many other trend lines we could talk about, depending on how quickly the points rise or fall. As a guiding principle, we can say that anything that rises steeply and thus demonstrates power-law behaviour spells trouble for our code.
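
A common source of this kind of quadratic behaviour is a nested loop over the same collection. As a minimal sketch (not from the episode), finding all pairs of duplicate values by comparing every element with every other element performs roughly n * n comparisons, so doubling the input quadruples the work:


def duplicate_pairs(items)
  pairs = []
  items.each_with_index do |a, i|
    items.each_with_index do |b, j|
      # The inner loop runs n times for every pass of the outer loop: O(n^2)
      pairs << [i, j] if i < j && a == b
    end
  end
  pairs
end

A hash-based lookup would bring the same task down to a linear trend, which is exactly the kind of difference a computational complexity assertion can catch.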

benchmark-trend

How can we estimate Big-O complexity in practice? I created the benchmark-trend gem, which automatically finds the best trend line that fits across all the inputs and, in doing so, empirically estimates the Big-O complexity. The benchmark-trend gem powers all the computational complexity assertions present in the rspec-benchmark gem.
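
The gem can also be used on its own, outside of RSpec. The sketch below is based on the benchmark-trend README and the exact signatures may differ: range builds the workload sizes, and infer_trend times a block for each size, yielding the size and its index, and returns the name of the best-fitting trend.


require 'benchmark-trend'

# Workload sizes growing by the default multiplier.
sizes = Benchmark::Trend.range(8, 100_000)

# Build the inputs up front so that array construction is not measured.
number_arrays = sizes.map { |n| Array.new(n) { rand(n) } }

# Time the block once per size, fit the candidate trend lines and
# report the best fit, e.g. :linear; trends holds the fit data.
trend, trends = Benchmark::Trend.infer_trend(sizes) do |n, i|
  number_arrays[i].max
end

puts trend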

We have already seen some of the performance assertions provided by the rspec-benchmark gem. However, it has more assertions up its sleeve that allow us to test the computational complexity of our code. We can invoke our code in a block expectation and assert, for example, a logarithmic trend with the perform_logarithmic matcher.


expect { ... }.to perform_constant
expect { ... }.to perform_logarithmic
expect { ... }.to perform_linear
expect { ... }.to perform_power
expect { ... }.to perform_exponential

Algorithm with 4 points

In general, the algorithm for writing performance tests that check how a piece of code scales can be summarized in a few steps:

  1. Choose a method to profile.
  2. Choose workloads for the method.
  3. Describe workloads with features, for example the number of bytes in an input file or the number of nodes in a binary tree.
  4. Predict the performance in terms of Big-O notation.

Following the algorithm, we choose to put Ruby's Enumerable#max method under the microscope to see how it scales. To do so, we need to create progressively larger workloads that the max method will understand.

We open up our test file and add a description for the computational complexity of the max method.

Next, according to our algorithm, we need to find workloads that will be used to create different array sizes. To do this we can use the bench_range helper method and give it a start value of 8 and an end value of 100_000. This method will then generate an exponential sequence of numbers where the intermediate values are powers of the range multiplier, which defaults to 8.

Now, we need to convert the range numbers into features that the max method understands. In our case, these are arrays of random numbers limited by the array size. It is important to create the number_arrays prior to invoking the algorithm so as not to impact our measurements.

We use a block expectation to estimate the computational complexity.

Then we use the perform_logarithmic matcher to assert that the max method scales in logarithmic time with increasing array sizes. Usually the expectation subject doesn't change depending on the input. This time, though, we need to invoke the max method for different array sizes. How do we achieve this?

To invoke the expectation block for different number_array sizes, we need to use the in_range matcher and pass it the array_sizes created during the workload setup. These array sizes will be yielded to the expectation block along with their index.

We can use the index to select the appropriate array from the number_arrays and send the max message to it.


RSpec.describe "#max" do
  it "scales ??? for arbitrarily large arrays with unsorted elements"
    array_sizes = bench_range(8, 100_000) # => [8, 64, 512, 4096, 8192, 32768, 100000]
    number_arrays = array_sizes.map { |n| Array.new(n) { rand(n) } }

    expect { |n, i|
      number_arrays[i].max
    }.to perform_logarithmic.in_range(array_sizes)
  end
end

And finally run our test.


#max scales ??? for arbitrarily large arrays with unsorted elements
Failure/Error:
  expect { |n, i|
    number_arrays[i].max
  }.to perform_logarithmic.in_range(array_sizes)

  expected block to perform logarithmic, but performed linear

Ohh! So the max method doesn't scale logarithmically but grows linearly with array sizes.


RSpec.describe "#max" do
  it "scales linearly for arbitrarily large arrays with unsorted elements"
    array_sizes = bench_range(8, 100_000) # => [8, 64, 512, 4096, 8192]
    number_arrays = array_sizes.map { |n| Array.new(n) { rand(n) } }

    expect { |n, i|
      number_arrays[i].max
    }.to perform_linear.in_range(array_sizes)
  end
end

Now that we know this, we can update our test to assert linear performance and run it again.


RSpec.describe "#max" do
  it "scales linearly for arbitrarily large arrays with unsorted elements"
    array_sizes = bench_range(8, 100_000) # => [8, 64, 512, 4096, 8192]
    number_arrays = array_sizes.map { |n| Array.new(n) { rand(n) } }

    expect { |n, i|
      number_arrays[i].max
    }.to perform_linear.in_range(array_sizes).sample(10).times
  end
end

To ensure that our results are stable, we sample each measurement 10 times to smooth out the trend line.

Nice! By writing a test asserting computational complexity of the max method, we added a new dimension to our performance testing techniques. Most programmers do not perform analysis of the computational complexity of their code. However, by knowing how our code scales with growing inputs, we can prevent nasty performance surprises when our system starts hitting some serious workloads.

Happy safe hacking!
