Testing Threads

Video transcript & code

You might remember the thread-safe queue class we've been building since episode 137. I've been wanting to do some refactoring on this class. But the first step in refactoring is having green tests, and in order to have green tests I first need to have some tests at all.

So today I'd like to begin adding some tests around the queue class. This means that we're going to have to tackle the problem of testing using multiple threads.

We'll start with a simple test that a thread attempting to pop an item off of the queue will sleep until an item is available. At that point it should wake up and take the item.

We instantiate a queue object. Now we need to wait on the queue for an item to be available. We can't do that right here in the main test code, because we'd just cause our test to halt forever. Instead, we start up a separate consumer thread which pops the queue once and then exits.

At this point we have spun off a new thread, but by the nature of threads we have no idea how far it has proceeded as we move forward in the main thread. The trouble with testing threads is that they throw a big mud-ball of indeterminacy into our nice clean, predictable tests.

Before we move forward in this particular test, we want to ensure that the consumer thread has reached the point where it is waiting for a new item to appear in the queue. We might be tempted to just stick a big fat sleep in the code at this point. Surely after, say, a tenth of a second the other thread will be initialized and will have proceeded as far as possible, right?

specify "waiting for an item" do
  q = Queue.new
  consumer = Thread.new do
    q.pop
  end
  sleep 0.1
end

There are at least two problems with this approach. First, if we keep throwing big sleeps in our tests we'll quickly slow our test suite to a crawl. And second, this technique simply isn't reliable. Sooner or later we'll run the tests while the system is under high load and find that the tests intermittently fail because the other thread isn't being scheduled quickly enough. As a general rule, we don't like slow tests, and we really don't like unreliable tests. As tempting as it might be, this approach is a non-starter.

A strategy I often employ when testing threads is to find ways to force my threads to rendezvous at known points in their lifetimes. I think of these as "checkpoints". Since waiting on the queue should be the first and only reason this thread has to wait, we'll wait for the thread to go into a sleep state before moving on. We still have to use a sleep to wait for this, but we can use a much smaller duration because we'll be repeating it until the consumer thread is asleep. We use a sleep time of 1 millisecond, which, as we discovered in episode 151, is probably about the shortest period the VM can accurately sleep.

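We perform these millisecond sleeps until the thread's status is "sleep". Ruby threads can exist in one of several states, including running, sleeping, aborting, and dead. For a live thread, #status returns a string such as "run", "sleep", or "aborting"; for a dead thread it returns false (terminated normally) or nil (terminated with an exception). See the Thread class documentation for all the possible values #status can return.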
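As a rough illustration of those states, the sketch below reads a thread's status while it is blocked on an empty queue and again after it has finished. The helper name observe_statuses is made up for this example, and it assumes Ruby's built-in Queue rather than the queue class built in earlier episodes.

```ruby
# Hypothetical helper: reports a thread's status while it is blocked
# and again after it has finished. Assumes Ruby's built-in Queue.
def observe_statuses
  q = Queue.new
  t = Thread.new { q.pop }

  # Poll until the thread is blocked inside q.pop.
  sleep 0.001 until t.status == "sleep"
  blocked = t.status    # "sleep"

  q.push :item
  t.join
  finished = t.status   # false, because the thread terminated normally

  [blocked, finished]
end

p observe_statuses
```

Note that a finished thread reports false rather than a string, which matters for any loop that waits on a specific status.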

Once we've determined that the consumer thread is waiting on the queue, we push an item onto the queue. Recall that the consumer thread simply pops one item and then ends. Since the pop is the last statement in the thread, the thread's "return value", so to speak, should be the popped item. So what we want to do next is wait for the thread to finish and then check its value.

As it turns out, we can do both of these with a single method. By sending the thread the #value message, we implicitly wait for it to terminate and then get access to its ending value. At this point all we need to do is compare the value to the item we pushed into the queue. If they match, this test passes.
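Here is a minimal sketch of that behavior in isolation. The method name last_expression_value and the :done symbol are just placeholders for this example.

```ruby
# Thread#value joins the thread, then returns the value of the last
# expression evaluated in the thread's block.
def last_expression_value
  worker = Thread.new do
    sleep 0.01   # simulate a little work
    :done        # the last expression becomes the thread's value
  end
  worker.value   # blocks until the thread terminates
end

p last_expression_value
```

Because #value waits for termination, there is no need for a separate #join before reading the result.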

specify "waiting for an item" do
  q = Queue.new
  consumer = Thread.new do
    q.pop
  end
  sleep 0.001 until consumer.status == "sleep"
  q.push "hello"
  expect(consumer.value).to eq("hello")
end

Let's run this test. It passes.

That's not as reassuring as it might seem. This is tricky stuff, so we want very much to ensure that our test is in fact testing what we think it is, and not succeeding by accident. To check our work, we falsify the test by temporarily commenting out the line that causes queue clients to sleep until an item is available.

def wait_for_condition(
    cv, condition_predicate, timeout=:never, timeout_policy=->{nil})
  deadline = timeout == :never ? :never : Time.now + timeout
  @lock.synchronize do
    loop do
      cv_timeout = timeout == :never ? nil : deadline - Time.now
      if !condition_predicate.call && cv_timeout.to_f >= 0
        # cv.wait(@lock, cv_timeout)
      end
      if condition_predicate.call
        return yield
      elsif deadline == :never || deadline > Time.now
        next
      else
        return timeout_policy.call
      end
    end
  end
end

We run the test again. This time, we don't see a failure. Instead, the tests simply hang indefinitely.

This is because the consumer thread is now proceeding straight from running to a dead state, without ever sleeping. Our line of code that waits for the consumer to sleep will wait forever.
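A small sketch of that failure mode, using a thread that never blocks. The statuses_seen helper is hypothetical; it polls a bounded number of times so that the example itself can't hang.

```ruby
# Hypothetical helper: polls a thread's status a bounded number of times
# and returns the distinct values it saw.
def statuses_seen(thread, attempts: 50)
  seen = []
  attempts.times do
    seen << thread.status
    sleep 0.001
  end
  seen.uniq
end

# This thread never waits on anything; it just returns.
t = Thread.new { :no_waiting }
t.join

# The thread is already dead, so no poll ever observes "sleep".
p statuses_seen(t)
```

An unbounded loop waiting for "sleep" on a thread like this one would never see its condition come true.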

This is partial confirmation that we're testing the right thing, but it's not a very user-friendly way for a test to fail. Let's write a test helper method to improve the testing experience in cases like this. We'll call it #wait_for. In it, we use the timeout standard library to start a 1-second timeout. We've talked before about how timeout can be unsafe. In this case, we don't care, because its only job is to bomb out of a failing test. We don't have any need to keep the system under test in a consistent state under those circumstances.

Inside the timeout block, we put our repeated 1-millisecond sleep. We use the block passed to this method as the predicate for when to stop waiting.

Now we can go back to our test and replace the sleep line with a call to our helper method. The end result reads quite nicely as "wait for the consumer status to be sleep".

specify "waiting for an item" do
  q = Queue.new
  consumer = Thread.new do
    q.pop
  end
  wait_for { consumer.status == "sleep" }
  q.push "hello"
  expect(consumer.value).to eq("hello")
end

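require "timeout"

# Polls the given block every millisecond until it returns a truthy
# value. Raises Timeout::Error if that takes longer than one second.
def wait_for
  Timeout.timeout 1 do
    sleep 0.001 until yield
  end
end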

We run the tests again. This time, after a pause, we see a timeout exception reported, with a stack trace pointing back to the source. This is a much better test failure.
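To see both outcomes outside of a test run, here is a sketch of the same helper with one change: a timeout: keyword argument. That keyword is an addition for this example only, so the failing case finishes quickly; the helper in the episode uses a fixed 1-second timeout.

```ruby
require "timeout"

# Variant of the helper with a hypothetical `timeout:` keyword, added
# only so the failing case below doesn't take a full second.
def wait_for(timeout: 1)
  Timeout.timeout(timeout) do
    sleep 0.001 until yield
  end
end

# A predicate that eventually becomes true returns normally.
flag = false
setter = Thread.new { sleep 0.01; flag = true }
wait_for { flag }
setter.join

# A predicate that never becomes true raises Timeout::Error
# instead of hanging.
begin
  wait_for(timeout: 0.05) { false }
rescue Timeout::Error => e
  puts "timed out: #{e.class}"
end
```

In a test we would leave the exception unrescued, so the test framework reports it along with the stack trace.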

When we re-add the commented-out line in the code under test, we can see that the test once again passes. We're pretty confident now that this is a meaningful test, and we can move on to more advanced tests.

In a future episode we'll look at more techniques for testing threads. But this is enough for now. Happy hacking!
