Async with Polling
In Episode #559, we began exploring the problem of full utilization. Given multiple worker objects which can each operate asynchronously, but which also need to collaborate together to complete a task, how can we coordinate their work so that none of them sit idle any more than necessary?
Video transcript & code
Our example task is to assemble "s'mores" out of toasted marshmallows, graham crackers, and chocolate bars. In order to parallelize the work of creating s'mores, we've created three independent job descriptions:
- The Marshmallow Wrangler, responsible for extracting one marshmallow from the sticky mass inside the bag.
- The Toaster, who takes a marshmallow and toasts it over the campfire.
- And the Assembler, who puts together the actual s'more sandwich and makes sure the chocolate reaches optimal meltiness.
class MarshmallowWrangler
  def start(smore)
    fail "I'm busy!" unless @ready_at.to_i <= $tick
    @ready_at = $tick + 20
    smore.started_at = $tick
  end

  def finish(smore)
    log("A marshmallow is ready for #{smore.camper}")
  end
end
class Toaster
  def start(smore)
    fail "I'm busy!" unless @ready_at.to_i <= $tick
    @ready_at = $tick + 60
  end

  def finish(smore)
    log("A marshmallow is toasty warm for #{smore}")
  end
end
class Assembler
  def start(smore)
    fail "I'm busy!" unless @ready_at.to_i <= $tick
    @ready_at = $tick + 30
  end

  def finish(smore)
    smore.finished_at = $tick
    log("Your s'more is ready, #{smore}!")
  end
end
In the previous episode we tried to find a way to make sure that these workers were doing their jobs in parallel. In order to do so, we split each worker's work between a start method and a finish method.
Then we scheduled each worker's labor by interleaving task starts, waits for work to be done, and task finish calls. We manually worked out an order of calls that would maximize the utilization of each worker without asking any of them to do more than one thing at a time.
wrangler.start(smores[0])
work(20)
wrangler.finish(smores[0])
wrangler.start(smores[1])
toaster.start(smores[0])
work(20)
wrangler.finish(smores[1])
wrangler.start(smores[2])
work(20)
wrangler.finish(smores[2])
work(20) # 80
toaster.finish(smores[0])
toaster.start(smores[1])
assembler.start(smores[0])
work(30) # 110
assembler.finish(smores[0])
work(30) # 140
toaster.finish(smores[1])
toaster.start(smores[2])
assembler.start(smores[1])
work(30) # 170
assembler.finish(smores[1])
work(30) # 200
toaster.finish(smores[2])
assembler.start(smores[2])
work(30) # 230
assembler.finish(smores[2])
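One note: the schedule above leans on two helpers, work and log, that aren't shown in this episode. Here is a minimal sketch of what they might look like; the names come from the code above, but the implementations are my own reconstruction from the timestamped log output.

```ruby
# Hypothetical reconstructions of the episode's timing helpers.
$tick = 0

# Advance the simulated clock one tick at a time, so that anything
# observing $tick (such as a trace_var handler, later on) sees
# every intermediate tick rather than one big jump.
def work(duration)
  duration.times { $tick += 1 }
end

# Prefix each message with the current tick, matching the
# "0: Gathering around the campfire" style of the sample output.
def log(message)
  puts "#{$tick}: #{message}"
end
```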
This manual coordination was an exercise in clarity. We now have a good visual sense of the kind of interleaving necessary to efficiently coordinate asynchronous interdependent jobs. But there's no way we can do this for real-world code. It's time to find a way to automatically interleave asynchronous work for full utilization.
Today, we're going to take our first crack of several at this problem of automatic asynchronous work coordination.
The approach we're going to attempt is a classic one. We're going to take a polling approach, in which we continually check to see if our s'mores are done, and, if not, if there is someone free who could be working on them.
In order to make this happen, we're first going to need to make some changes to our Smore and prepper classes.

To the Smore class, we add a new state attribute. We initialize it to the state :not_started.
class Smore
  attr_accessor :started_at, :finished_at, :state
  attr_reader :camper

  def initialize(camper)
    @camper = camper
    @started_at = nil
    @finished_at = nil
    @state = :not_started
  end

  def wait
    finished_at - started_at
  end

  alias to_s camper
end
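As a quick illustration of how the Smore value object behaves on its own (this usage sketch is mine, not from the episode):

```ruby
# The Smore value object, exercised in isolation.
class Smore
  attr_accessor :started_at, :finished_at, :state
  attr_reader :camper

  def initialize(camper)
    @camper = camper
    @started_at = nil
    @finished_at = nil
    @state = :not_started
  end

  # Total time the camper waited, in ticks.
  def wait
    finished_at - started_at
  end

  alias to_s camper
end

smore = Smore.new("Kashti")
smore.state       # => :not_started
smore.started_at = 0
smore.finished_at = 110
smore.wait        # => 110
smore.to_s        # => "Kashti" -- string interpolation picks up the camper name
```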
Next up, we introduce a new base class for the various s'more prepper classes, because we are going to be adding a significant amount of shared functionality to them.
Each prepper will have a field for the current s'more order they are working on.
Next we'll add an initializer. You might recall that before, we tracked whether the prepper classes were free or busy with a @ready_at variable. We'll initialize this variable to zero to begin, indicating that the prepper starts out ready to do some work.
Next we're going to introduce some code to do a better job of having our preppers simulate truly asynchronous processes. Instead of telling them when to finish up, we want them to know when they are done, and automatically finish their work. In order to do this, we need to make the preppers aware of the current value of the global $tick variable we are using to simulate time...and when that variable updates.
How will we do this? Well, we're going to use some deep Ruby magic called trace_var. We tell Ruby to put a trace on the $tick variable. And we provide a lambda which will be automatically called every time the variable is updated.
Why are we using this obscure trace_var method? One reason is that it's the easiest way to have all the prepper objects keep track of the current tick without adding a significant amount of new code. The other reason is because, well... we can. I've been looking for an excuse to use trace_var on this show for years, and I finally found one! For the record though: this is really a feature intended for debugging, and not one you should use in a production app!
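trace_var is a real (if obscure) Kernel method, and it's easy to try in isolation. Here's a minimal standalone demonstration, separate from the s'mores code:

```ruby
# trace_var fires a callback on every assignment to a global variable.
$counter = 0
observed = []

trace_var(:$counter) { |value| observed << value }

$counter = 1
$counter = 2

# untrace_var removes the hook; later assignments go unobserved.
untrace_var(:$counter)
$counter = 3

observed  # => [1, 2]
```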
Inside the lambda, we check to see if we've been working on a s'more order but we are now ready for new work. If so, we invoke our finish method and then set current_smore back to nil.
Now that our prepper can automatically keep track of the current tick, we add a generic start method that will work for all of our preppers. All it does is set the current s'more order.
Finally, we add the predicate method we referenced back in the initializer, which checks if the prepper is ready for new work based on whether the current tick has reached or passed the @ready_at value.
class Prepper
  attr_accessor :current_smore

  def initialize
    @ready_at = 0
    trace_var :$tick, ->(value) do
      if current_smore && ready_for_work?
        finish
        self.current_smore = nil
      end
    end
  end

  def start(smore)
    @current_smore = smore
  end

  def ready_for_work?
    $tick >= @ready_at
  end
end
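To see the automatic finishing in action, we can drive the base class with a toy subclass. This little harness is my own illustration; it assumes a work helper that simply increments $tick one step at a time, and a made-up Recorder prepper that takes 5 ticks.

```ruby
$tick = 0

# Assumed helper: advance the simulated clock one tick at a time.
def work(duration)
  duration.times { $tick += 1 }
end

class Prepper
  attr_accessor :current_smore

  def initialize
    @ready_at = 0
    # Every update to $tick gives the prepper a chance to notice
    # that its current job's time has elapsed and finish it.
    trace_var :$tick, ->(_value) do
      if current_smore && ready_for_work?
        finish
        self.current_smore = nil
      end
    end
  end

  def start(smore)
    @current_smore = smore
  end

  def ready_for_work?
    $tick >= @ready_at
  end
end

# A hypothetical prepper that takes 5 ticks and records what it finished.
class Recorder < Prepper
  attr_reader :finished_order

  def start(smore)
    super
    @ready_at = $tick + 5
  end

  def finish
    @finished_order = current_smore
  end
end

recorder = Recorder.new
recorder.start("Kashti's order")
work(4)
recorder.current_smore  # => "Kashti's order" -- still busy at tick 4
work(1)
recorder.finished_order # => "Kashti's order" -- finish ran by itself at tick 5
recorder.current_smore  # => nil
```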
Now that we have our base class, it's time to make all the prepper classes children of it. We'll start with the Toaster role.

We update its start method to also invoke the super start method we defined moments ago.

The finish method no longer needs a smore argument, since now we are keeping track of the current order in private state. We also add code that updates the s'more state to :toasted while finishing up.
class Toaster < Prepper
  def start(smore)
    super
    @ready_at = $tick + 60
  end

  def finish
    current_smore.state = :toasted
    log("A marshmallow is toasty warm for #{current_smore}")
  end
end
We go through and make the same structural changes to the MarshmallowWrangler class, with its own duration, state, and log message.
class MarshmallowWrangler < Prepper
  def start(smore)
    super
    @ready_at = $tick + 20
    smore.started_at = $tick
  end

  def finish
    current_smore.state = :has_marshmallow
    log("A marshmallow is ready for #{current_smore}")
  end
end
And we make similar changes to the Assembler.
class Assembler < Prepper
  def start(smore)
    super
    @ready_at = $tick + 30
  end

  def finish
    current_smore.state = :ready
    current_smore.finished_at = $tick
    log("Your s'more is ready, #{current_smore}!")
  end
end
Now for the fun part!
It's time to set up the polling loop. We start with an until loop whose condition is that all the s'mores are in a ready state.
Inside, we go through the s'more preppers and select any that are currently free for new work.
Then we start a case on the current free prepper object.
If it's a MarshmallowWrangler, and if there is a s'more order that hasn't been started...
We do some logging and tell it to get started un-sticking a marshmallow.
If the free prepper is the Toaster and there are any marshmallows ready for toasting...
We get it started.
Last but not least, if we're looking at the assembler and if there are any s'more orders ready for final assembly...
We have it go ahead and start putting together the s'more.
Just for the sake of sanity checking, we add an else clause which should never be invoked.
Finally, at the end of this inner loop, we advance the current time by one tick, before starting the whole process again.
until smores.all? { |smore| smore.state == :ready }
  [wrangler, toaster, assembler].select(&:ready_for_work?).each do |prepper|
    case prepper
    when MarshmallowWrangler
      if smore = smores.detect { |smore| smore.state == :not_started }
        log "Finding a marshmallow for #{smore.camper}"
        prepper.start(smore)
      end
    when Toaster
      if smore = smores.detect { |smore| smore.state == :has_marshmallow }
        log "Toasting a marshmallow for #{smore.camper}"
        prepper.start(smore)
      end
    when Assembler
      if smore = smores.detect { |smore| smore.state == :toasted }
        log "Assembling a smore for #{smore.camper}"
        prepper.start(smore)
      end
    else
      fail "Should never get here"
    end
  end
  work(1)
end
OK! We now have a polling loop which can, at least in theory, automatically coordinate all the work of our parallel s'more-making assembly line.
Let's give it a try!
# >> 0: Gathering around the campfire
# >> 0: Finding a marshmallow for Kashti
# >> 20: A marshmallow is ready for Kashti
# >> 20: Finding a marshmallow for Ebba
# >> 20: Toasting a marshmallow for Kashti
# >> 40: A marshmallow is ready for Ebba
# >> 40: Finding a marshmallow for Ylva
# >> 60: A marshmallow is ready for Ylva
# >> 80: A marshmallow is toasty warm for Kashti
# >> 80: Toasting a marshmallow for Ebba
# >> 80: Assembling a smore for Kashti
# >> 110: Your s'more is ready, Kashti!
# >> 140: A marshmallow is toasty warm for Ebba
# >> 140: Toasting a marshmallow for Ylva
# >> 140: Assembling a smore for Ebba
# >> 170: Your s'more is ready, Ebba!
# >> 200: A marshmallow is toasty warm for Ylva
# >> 200: Assembling a smore for Ylva
# >> 230: Your s'more is ready, Ylva!
# >> 230: Everyone has a s'more!
# >> 230: Shortest prep: Kashti at 110 seconds
# >> 230: Longest prep: Ylva at 190 seconds
And it looks like it works!
And if we look down at the end, we can see that it managed to finish all the orders in 230 ticks, just as fast as our manually scheduled version. So there we have it: we've successfully created a solution which efficiently and automatically divvies out work to our s'more preppers.
What we've constructed here is a controlled simulation of a real and fairly common pattern in asynchronous programming. Many real-world applications rely on external processes, remote services, and other types of asynchronous, time-consuming I/O. One way to ensure everything keeps moving is to build the program around a big loop, which polls each resource to see if it is ready, and takes appropriate action if it is.
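To make that concrete, here's a small real-world-flavored sketch of the same pattern (my own illustration, not from the episode): polling a set of background threads until every one of them has completed.

```ruby
# Kick off several "asynchronous workers" as background threads.
jobs = 3.times.map do |i|
  Thread.new do
    sleep(0.01 * (i + 1))  # simulate slow, asynchronous work
    "result #{i}"
  end
end

results = []

# The big polling loop: check each resource, act on any that are
# ready, then wait a beat before checking again.
until jobs.empty?
  done, jobs = jobs.partition { |thread| !thread.alive? }
  done.each { |thread| results << thread.value }
  sleep 0.005  # polling interval: a tradeoff between latency and wasted CPU
end

results.sort  # => ["result 0", "result 1", "result 2"]
```

Just like the s'mores loop, the sleep at the bottom is doing the job of work(1): it keeps the loop from spinning flat-out while nothing has changed.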
That said, there are also some significant downsides to the polling-loop approach. And we'll talk about them in an upcoming episode. Until then,
Happy hacking!