Unit 1, Lesson 21

Async with Multiplexing

In the last episode in this series, we took an initial crack at the problem of automatically coordinating asynchronous tasks. We used a looping approach where we checked in with all of our jobs and workers every time around the loop. Unfortunately, this strategy has some significant efficiency drawbacks. Today, we’ll apply a technique known as “multiplexing” to eliminate some unnecessary overhead.

Video transcript & code

This is the third episode in the Asynchronous Series.


Our objective was to keep three campers busy preparing the classic American campfire snacks known as "s'mores".

What we came up with was a simple polling loop.

Each time through, so long as there are any s'more orders left unfinished, the loop polls each of the workers to see if they are ready for work.

Then it checks to see what kind of worker is ready, and whether there is a job for it to do next.

When we run this, it finishes preparing three s'mores in 230 simulated seconds.


# >> 230: Everyone has a s'more!

There are some significant downsides to polling solutions like this one.

One major issue is that they are not efficient.

The polling loop repeatedly has to check each s'more and each prepper for its current status. In this toy simulation that doesn't matter. But our s'more example is standing in for I/O-polling solutions in the real world, and in the real world, constant polling of numerous I/O handles is inefficient.

Another efficiency problem has to do with sleep time.

Naively written polling code can become a "busy-wait" loop which hogs the CPU, even when there's no new scheduling for it to do. In order to avoid this, polling solutions typically sleep a fixed amount of time in between polls. We've simulated that with our work() call.

But if a resource becomes ready right after the loop goes to sleep, it won't be noticed until the loop wakes back up. The time between the resource coming available and the next poll is wasted.
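To put a number on that wasted time, here's a minimal sketch (not part of the episode's code, using a made-up `ready_at` time) of a fixed-interval polling loop noticing a resource that became ready just after a poll:

```ruby
# Hypothetical illustration: a resource becomes ready at tick 21, but the
# loop only polls every 20 ticks, so it isn't noticed until tick 40.
poll_interval = 20
ready_at      = 21

t = 0
t += poll_interval until t >= ready_at  # each step models one sleep-then-poll cycle

wasted = t - ready_at
puts "ready at #{ready_at}, noticed at #{t}, #{wasted} ticks wasted"
# >> ready at 21, noticed at 40, 19 ticks wasted
```

In the worst case, the wasted time approaches the full poll interval, which is why longer sleeps between polls trade CPU usage for responsiveness.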


until smores.all?{|smore| smore.state == :ready }
  [wrangler, toaster, assembler].select(&:ready_for_work?).each do |prepper|
    case prepper
    when MarshmallowWrangler
      if smore = smores.detect{|smore| smore.state == :not_started}
        log "Finding a marshmallow for #{smore.camper}"
        prepper.start(smore)
      end
    when Toaster
      if smore = smores.detect{|smore| smore.state == :has_marshmallow}
        log "Toasting a marshmallow for #{smore.camper}"
        prepper.start(smore)
      end
    when Assembler
      if smore = smores.detect{|smore| smore.state == :toasted}
        log "Assembling a smore for #{smore.camper}"
        prepper.start(smore)
      end
    else
      fail "Should never get here"
    end
  end
  work(1)
end

We can simulate this by increasing our sleep time between polls:


  work(40)

When we run this, we see that the total time is now longer, because preppers are finishing and sitting idle while they wait for the next poll interval to come around.


# >> 320: Everyone has a s'more!

So between polling all of our resources, and inserting sleeps, we have a couple of sources of inefficiency. And they both stem from the fact that we have to keep checking if a worker is ready for new work.

What if instead of polling, we could arrange to wait until a resource is ready? Let's see what that would look like.

We'll add a new global helper method named wait_for_ready. It'll take an array of s'more prepper objects as its parameter.

Then it will find out which of the currently busy workers is going to be done the soonest.

It does this by taking the list of preppers,

filtering out any that are currently idle,

finding at what tick-time the remaining preppers are going to be ready,

taking the soonest of those times

And defaulting to now if it didn't find any.

It then calculates how long from now it will need to wait for the next prepper to be ready.

Then it waits that long.

And finally returns all the workers which are now ready for new work.


def wait_for_ready(preppers)
  soonest_ready_at = preppers
    .reject(&:ready_for_work?)
    .map(&:ready_at)
    .min || $tick
  wait_ticks = soonest_ready_at - $tick
  log "Sleeping #{wait_ticks} ticks while work is done"
  work(wait_ticks)
  preppers.select(&:ready_for_work?)
end

For this helper to work, it is necessary for us to expose the ready_at attribute of preppers.


class Prepper
  # ...
  attr_reader   :ready_at
  # ...

Now all we need to do is make use of our new wait_for_ready helper.

At the beginning of our polling loop,

instead of checking to see which preppers are ready, we call wait_for_ready with a list of preppers, and capture its results.

Remember, this will pause as long as necessary until a worker is ready for new work.


until smores.all?{|smore| smore.state == :ready }
  ready_preppers = wait_for_ready([wrangler, toaster, assembler])
  ready_preppers.each do |prepper|
  # ...

Which means we no longer need the call to work at the bottom of the loop.


    # ...
    else
      fail "Should never get here"
    end
  end

Let's run this and see what happens.


# >>   0: Gathering around the campfire
# >>   0: Sleeping 0 ticks while work is done
# >>   0: Finding a marshmallow for Kashti
# >>   0: Sleeping 20 ticks while work is done
# >>  20: A marshmallow is ready for Kashti
# >>  20: Finding a marshmallow for Ebba
# >>  20: Toasting a marshmallow for Kashti
# >>  20: Sleeping 20 ticks while work is done
# >>  40: A marshmallow is ready for Ebba
# >>  40: Finding a marshmallow for Ylva
# >>  40: Sleeping 20 ticks while work is done
# >>  60: A marshmallow is ready for Ylva
# >>  60: Sleeping 20 ticks while work is done
# >>  80: A marshmallow is toasty warm for Kashti
# >>  80: Toasting a marshmallow for Ebba
# >>  80: Assembling a smore for Kashti
# >>  80: Sleeping 30 ticks while work is done
# >> 110: Your s'more is ready, Kashti!
# >> 110: Sleeping 30 ticks while work is done
# >> 140: A marshmallow is toasty warm for Ebba
# >> 140: Toasting a marshmallow for Ylva
# >> 140: Assembling a smore for Ebba
# >> 140: Sleeping 30 ticks while work is done
# >> 170: Your s'more is ready, Ebba!
# >> 170: Sleeping 30 ticks while work is done
# >> 200: A marshmallow is toasty warm for Ylva
# >> 200: Assembling a smore for Ylva
# >> 200: Sleeping 30 ticks while work is done
# >> 230: Your s'more is ready, Ylva!
# >> 230: Everyone has a s'more!
# >> 230: Shortest prep: Kashti at 110 seconds
# >> 230: Longest prep: Ylva at 190 seconds

Success! It completes in the same amount of time as our original example, but now without any of the inefficiencies of polling. This version sleeps exactly as long as necessary while the workers are doing their work asynchronously. And when one or more workers are ready, the loop services only those workers. It doesn't try to poll every prepper every time.

The solution we've just built uses a technique known as multiplexing. Our wait_for_ready method is a multiplexer method. We give it a list of references to resources, and it returns when one or more of them is ready for more interaction.

Multiplexing/Demultiplexing

How does this relate to real-world asynchronous programming? Well, in essence our wait_for_ready method functions very similarly to the classic UNIX select() system call. It's also analogous to the Windows <a href="https://msdn.microsoft.com/en-us/vstudio/ms687025(v=vs.100).aspx">WaitForMultipleObjects()</a> system call, the Linux <a href="http://man7.org/linux/man-pages/man7/epoll.7.html">epoll</a> facility, and the BSD <a href="https://man.openbsd.org/kqueue.2">kqueue()</a> system call.

All of these system APIs exist for the purpose of multiplexing I/O or other asynchronous operations. They each take a list of handles to I/O or other operations in progress. They then pause the current thread of execution until one or more of those handles is ready to be read from or written to, or has an error condition.
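Ruby exposes this same facility as IO.select, a wrapper over select(). As a quick illustration (not from the episode's code), we can block until the read end of a pipe has data available:

```ruby
# Illustration of Ruby's IO.select, a multiplexing call like select(2).
reader, writer = IO.pipe

writer.write("marshmallow ready")  # make the read end readable

# IO.select takes arrays of IOs to watch for reading, writing, and errors,
# plus an optional timeout. It blocks until at least one IO is ready (or
# the timeout expires, in which case it returns nil).
readable, _writable, _errored = IO.select([reader], nil, nil, 1)

puts readable.include?(reader)  # => true; the pipe has data waiting
puts reader.read_nonblock(64)   # => "marshmallow ready"
```

Just like our wait_for_ready helper, IO.select sleeps until something is actionable and then tells us exactly which handles to service.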

Why does this matter? Well, an enormous amount of software is built on top of I/O multiplexing APIs. The web server you downloaded this episode from probably uses them. As do various Ruby web application servers, the EventMachine gem, and event-driven runtimes such as Node.js. In fact, most software that accomplishes more than one thing at once has multiplexing somewhere in its foundations. And now, hopefully, you have a better understanding of what multiplexing means. Happy hacking!

In a coming episode, Avdi continues the "asynchronous" series with the Reactor pattern.
