Unit 1, Lesson 1

Mistake-Proof API

A useful question to ask ourselves anytime we design an interface is “How can we not only validate correct usage, but also make it impossible to even express incorrect usage?” In this episode you will learn how to eliminate the possibility of a whole class of mistakes by modifying the API.

Video transcript & code

Let’s say we’re implementing a sliding-window algorithm for rate-limiting message traffic.


class SlidingWindow
  def initialize(window_size: 2, limit: 3)
    @buckets      = Array.new(window_size, 0)
    @limit        = limit
    @current_tick = 0
  end 

  def on_message(tick)
    increment = tick - @current_tick
    @current_tick = tick
    increment.times do
      @buckets.shift
      @buckets << 0
    end
    if @buckets.reduce(:+) >= @limit
      return {allowed: false}
    else
      @buckets[-1] += 1
      return {allowed: true}
    end
  end
end

We can instantiate a SlidingWindow object with the size of the sliding window in seconds, and the maximum number of messages to allow during any given window of time.


policy = SlidingWindow.new(window_size: 2, limit: 3)

Now we call on_message with the current "tick". The meaning of this call is: in the first second of operation, we are attempting to pass along a message.

Now we run this. The result says that this message is allowed through!


policy.on_message(1)  # => {:allowed=>true}

Now let's attempt to send a few more messages. We continue to pass the number 1, indicating that all of these message events occur during the first tick.


policy.on_message(1)  # => {:allowed=>true}
policy.on_message(1)  # => {:allowed=>true}

At the fourth attempt to pass a message during a single tick, the limit is hit, and the limiter says: no, no more messages.


policy.on_message(1)  # => {:allowed=>false}

We advance to a new tick and attempt to send a message.

But this one is blocked as well.


policy.on_message(2)  # => {:allowed=>false}

Those three original messages were only a single tick ago, and our sliding window has a two-tick-long history.

So we advance to the third tick and try again.

Now this succeeds.


policy.on_message(3)  # => {:allowed=>true}

Those original messages are now outside the two-tick sliding window, and the limiter is once again accepting messages.
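To see why the limiter behaves this way, it can help to watch the buckets themselves. Here is a minimal sketch that replays the calls so far and records the bucket contents after each one. The attr_reader is an addition made only for this illustration; it is not part of the class shown above.

```ruby
class SlidingWindow
  attr_reader :buckets # added for this sketch only, so we can peek inside

  def initialize(window_size: 2, limit: 3)
    @buckets      = Array.new(window_size, 0)
    @limit        = limit
    @current_tick = 0
  end

  def on_message(tick)
    increment = tick - @current_tick
    @current_tick = tick
    # Rotate one bucket out (and a fresh one in) per elapsed tick.
    increment.times do
      @buckets.shift
      @buckets << 0
    end
    if @buckets.reduce(:+) >= @limit
      return {allowed: false}
    else
      @buckets[-1] += 1
      return {allowed: true}
    end
  end
end

policy = SlidingWindow.new(window_size: 2, limit: 3)

# Replay the same ticks as in the walkthrough and capture the state each time.
trace = [1, 1, 1, 1, 2, 3].map do |tick|
  result = policy.on_message(tick)
  [tick, result[:allowed], policy.buckets.dup]
end

trace.each do |tick, allowed, buckets|
  puts "tick #{tick}: allowed=#{allowed} buckets=#{buckets.inspect}"
end
# tick 1: allowed=true buckets=[0, 1]
# tick 1: allowed=true buckets=[0, 2]
# tick 1: allowed=true buckets=[0, 3]
# tick 1: allowed=false buckets=[0, 3]
# tick 2: allowed=false buckets=[3, 0]
# tick 3: allowed=true buckets=[0, 1]
```

The rightmost bucket counts messages in the current tick, and each new tick pushes the older counts one slot to the left until they fall off the end.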

Let's go ahead and simulate the passage of more time and messages. We briefly hit our limit in tick four, but then are back to normal in ticks five and six...


policy.on_message(3)  # => {:allowed=>true}
policy.on_message(4)  # => {:allowed=>true}
policy.on_message(4)  # => {:allowed=>false}
policy.on_message(5)  # => {:allowed=>true}
policy.on_message(6)  # => {:allowed=>true}

OK, you get the idea.

Up until now, we have been advancing our simulated clock one tick forwards at a time. But...

What if we went back in time instead?


policy.on_message(5)  # => {:allowed=>true}

This doesn't raise an exception. Is this... correct behavior? Is it something the designers of the algorithm anticipated? What exactly does it mean for a message one tick back in time to be permitted? And... what will the implications be for future messages once we start stepping forwards in time again?

There are no clear and obvious answers to these questions. It's not even clear if there could be a sensible, predictable behavior here, let alone if the code implements it as it stands now.
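We can at least observe one consequence of the code exactly as written. A negative increment means the times loop runs zero times, but @current_tick is still rewound. The following sketch (the particular call sequence is our own, chosen for illustration) shows a client bouncing between tick 0 and tick 1, and getting a fourth message through in tick 1 as a result.

```ruby
class SlidingWindow
  def initialize(window_size: 2, limit: 3)
    @buckets      = Array.new(window_size, 0)
    @limit        = limit
    @current_tick = 0
  end

  def on_message(tick)
    increment = tick - @current_tick
    @current_tick = tick
    # With a negative increment this loop body never runs,
    # but @current_tick has already been moved backwards.
    increment.times do
      @buckets.shift
      @buckets << 0
    end
    if @buckets.reduce(:+) >= @limit
      return {allowed: false}
    else
      @buckets[-1] += 1
      return {allowed: true}
    end
  end
end

policy = SlidingWindow.new(window_size: 2, limit: 3)

# Use up the whole allowance during tick 1.
first_three = 3.times.map { policy.on_message(1)[:allowed] } # => [true, true, true]
fourth      = policy.on_message(1)[:allowed]                 # => false

# Rewind to tick 0, then "advance" to tick 1 again, twice over.
# Each forward step rotates the buckets, even though no real time has passed.
policy.on_message(0)
policy.on_message(1)
policy.on_message(0)
after_rewind = policy.on_message(1)[:allowed]                # => true
```

Whether that counts as a bug depends on what a backwards tick is supposed to mean, which is exactly the question we can't answer.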

So what can we do about this?

Well, one thing we could do is to add some input validation to the on_message method, asserting that the tick argument must only move forwards in time.


  def on_message(tick)
    fail RangeError, "Time travel is forbidden" unless tick >= @current_tick # ~> RangeError: Time travel is forbidden
    # ...
  end

Now our experiment in time travel raises an exception.


policy.on_message(5)  # => 

# ~> RangeError
# ~> Time travel is forbidden
# ~>
# ~> 02_sliding_window_validated_input.rb:9:in `on_message'
# ~> 02_sliding_window_validated_input.rb:39:in `<main>'

This prevents nonsensical usage. But it doesn't prevent clients from attempting an invalid call, which they might do accidentally because of a flaw in their own code.
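To make that concrete, here is a sketch of the validated class in full, assuming the guard clause simply sits in front of the original method body. The client code underneath is hypothetical. Notice that the mistake only surfaces when the bad call actually executes, and the caller has to decide what to do about it.

```ruby
class SlidingWindow
  def initialize(window_size: 2, limit: 3)
    @buckets      = Array.new(window_size, 0)
    @limit        = limit
    @current_tick = 0
  end

  def on_message(tick)
    # Guard: reject any tick earlier than the last one we saw.
    fail RangeError, "Time travel is forbidden" unless tick >= @current_tick
    increment = tick - @current_tick
    @current_tick = tick
    increment.times do
      @buckets.shift
      @buckets << 0
    end
    if @buckets.reduce(:+) >= @limit
      return {allowed: false}
    else
      @buckets[-1] += 1
      return {allowed: true}
    end
  end
end

policy = SlidingWindow.new(window_size: 2, limit: 3)
policy.on_message(2) # => {:allowed=>true}

# Hypothetical client code: the invalid call is still expressible,
# so the client now needs a plan for the exception.
outcome =
  begin
    policy.on_message(1)
  rescue RangeError => e
    "rejected: #{e.message}"
  end

outcome # => "rejected: Time travel is forbidden"
```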

Is there any other possibility here? What if we could structure this API in such a way that it was impossible to use it incorrectly?

Here's an alternative implementation of the SlidingWindow protocol. Instead of a single method, there are now two: on_tick and on_message.


class SlidingWindow
  def initialize(window_size: 2, limit: 3)
    @buckets = Array.new(window_size, 0)
    @limit = limit
  end

  def on_tick
    @buckets.shift
    @buckets << 0
  end

  def on_message
    if @buckets.reduce(:+) >= @limit
      return { allowed: false }
    else
      @buckets[-1] += 1
      return { allowed: true }
    end
  end
end

Notice that on_message no longer takes an argument.

Let's wave our magic video wand and transform our original sequence of calls to use this new API.


policy.on_tick
policy.on_message  # => {:allowed=>true}
policy.on_message  # => {:allowed=>true}
policy.on_message  # => {:allowed=>true}
policy.on_message  # => {:allowed=>false}
policy.on_tick
policy.on_message  # => {:allowed=>false}
policy.on_tick
policy.on_message  # => {:allowed=>true}
policy.on_message  # => {:allowed=>true}
policy.on_tick
policy.on_message  # => {:allowed=>true}
policy.on_message  # => {:allowed=>false}
policy.on_tick
policy.on_message  # => {:allowed=>true}
policy.on_tick
policy.on_message  # => {:allowed=>true}

Now instead of passing a tick number to each invocation of on_message, we explicitly advance to the next clock tick by calling on_tick.

So... what happens when we try to travel back in time using this new API?

...well, we can't. There's no way to even try that experiment, because the API only exposes two possibilities: advance the clock, or attempt a message.
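If we want to convince ourselves, we can ask Ruby what the class exposes. This is a small sketch using standard reflection methods. The variable names are our own.

```ruby
class SlidingWindow
  def initialize(window_size: 2, limit: 3)
    @buckets = Array.new(window_size, 0)
    @limit = limit
  end

  def on_tick
    @buckets.shift
    @buckets << 0
  end

  def on_message
    if @buckets.reduce(:+) >= @limit
      return { allowed: false }
    else
      @buckets[-1] += 1
      return { allowed: true }
    end
  end
end

# The public methods defined directly on the class...
surface = SlidingWindow.public_instance_methods(false).sort
# => [:on_message, :on_tick]

# ...and how many arguments each one accepts.
arities = surface.map { |name| SlidingWindow.instance_method(name).arity }
# => [0, 0]

# Trying to smuggle in a tick number anyway is rejected by Ruby itself,
# before any of our rate-limiting logic runs.
policy = SlidingWindow.new
error =
  begin
    policy.on_message(5)
    nil
  rescue ArgumentError => e
    e
  end

error.class # => ArgumentError
```

Neither method has a parameter that could carry a point in time, so there is nothing for a confused client to get wrong.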

By modifying the API, we've eliminated the possibility of a whole class of mistakes. This is a useful question to ask ourselves anytime we design an interface: how can we not only validate correct usage, but also make it impossible to even express incorrect usage?

As we've seen today, when dealing with stateful time-aware algorithms like this one, one way to do this is to stop passing state as data, and replace such methods with a purely event-oriented protocol. Happy hacking!
