Workflow

At the center of a lot of application programming is workflow. Unlike algorithms or object modeling, workflow is all about lists of tasks that have to be performed one after another in order to achieve some business goal. Usually, the execution of later tasks depends on the outcome of earlier tasks in the list.

For instance, consider this highly simplified workflow. First, log in. Then, make a purchase. Next, collect any special offers related to that purchase. Finally, get a receipt for the purchase.

Unfortunately, any of the steps in this workflow can fail, as we can see if we run the code. Currently it doesn't get past trying to make a purchase before terminating with an exception.

$n = 0

def login
  puts "Logging in"
  $n += 1
  return :session123
end

def make_purchase(session)
  puts "Making purchase"
  fail "The API was rude to me" if $n < 2
  $n += 1
  :purchase_record
end

def get_special_offers(purchase_record)
  puts "Getting special offers"
  fail "Special offers server is down."
end

def get_receipt(purchase_record)
  puts "Getting receipt"
  fail "I forgot what I was doing" if $n < 3
  fail "I left it in my other pants" if $n < 4
  $n += 1
  :receipt
end

session = login
purchase_rec = make_purchase(session)
offers = get_special_offers(purchase_rec)
receipt = get_receipt(purchase_rec)

# >> Logging in
# >> Making purchase

# ~> RuntimeError
# ~> The API was rude to me
# ~>
# ~> xmptmp-in25515jcB.rb:11:in `make_purchase'
# ~> xmptmp-in25515jcB.rb:30:in `<main>'

The errors that these particular steps usually encounter tend to be transient errors. That is, they are errors that only happen some of the time, due to network connectivity issues, server outages, or other temporary problems.

There are many ways we could change this code to make it more robust. Possible approaches include building a state machine to represent our workflow, or creating a general-purpose monadic abstraction for chaining together unreliable actions.

These are legitimate strategies with some good arguments in their favor. But I thought it would be interesting to try and see how we might tackle this with as little change or added ceremony as possible, using just some carefully chosen Ruby features.

In order to make the workflow more robust, we add a new helper method called attempt. It accepts a keyword argument, times, determining how many times in total a task will be attempted before giving up, and a block, which is the action to be attempted.

We surround each step in the workflow with a call to attempt, specifying different numbers of attempts based on our past experience with these actions and our tolerance for waiting before giving up.

When we run the new code, it gets further. But unfortunately, it still doesn't quite succeed. The method for getting special offers is still failing.

The thing is, getting special offers is strictly optional. This is reflected in the fact that we've only permitted it one attempt. We don't want to waste a lot of time trying to retrieve special offers.

$n = 0

def login
  puts "Logging in"
  $n += 1
  return :session123
end

def make_purchase(session)
  $n += 1
  puts "Making purchase"
  fail "The API was rude to me" if $n < 3
  :purchase_record
end

def get_special_offers(purchase_record)
  $n += 1
  puts "Getting special offers"
  fail "Special offers server is down."
end

def get_receipt(purchase_record)
  $n += 1
  puts "Getting receipt"
  fail "I forgot what I was doing" if $n < 4
  fail "I left it in my other pants" if $n < 5
  :receipt
end

def attempt(times: 1)
  yield
rescue => e
  times -= 1
  retry if times > 0
  raise(e)
end

session      = attempt(times: 1)  {login}
purchase_rec = attempt(times: 3)  {make_purchase(session)}
offers       = attempt(times: 1)  {get_special_offers(purchase_rec)}
receipt      = attempt(times: 10) {get_receipt(purchase_rec)}

# >> Logging in
# >> Making purchase
# >> Making purchase
# >> Getting special offers

# ~> RuntimeError
# ~> Special offers server is down.
# ~>
# ~> xmptmp-in25515NTF.rb:19:in `get_special_offers'
# ~> xmptmp-in25515NTF.rb:40:in `block in <main>'
# ~> xmptmp-in25515NTF.rb:31:in `attempt'
# ~> xmptmp-in25515NTF.rb:40:in `<main>'
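
To see the retry behavior in isolation, here is a minimal sketch that reuses the attempt method from the listing above with two made-up blocks (the counters calls and tries are illustrative names, not part of the workflow). It suggests that times counts total attempts, and that retry re-runs the method body without resetting the decremented times variable.

```ruby
# The attempt helper, copied from the listing above.
def attempt(times: 1)
  yield
rescue => e
  times -= 1
  retry if times > 0   # re-run the method body; times keeps its new value
  raise(e)
end

# A block that fails twice and then succeeds (hypothetical example).
calls = 0
result = attempt(times: 3) {
  calls += 1
  fail "transient failure" if calls < 3
  :done
}
# calls is now 3 and result is :done

# A block that always fails: after two attempts the error escapes.
tries = 0
message = nil
begin
  attempt(times: 2) {
    tries += 1
    fail "permanent failure"
  }
rescue => e
  message = e.message
end
# tries is now 2 and message is "permanent failure"
```

With times: 1, the block runs exactly once and any error is raised immediately, which is why the special offers step above gets no second chance.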

What we need is a way to alter the error policy for an individual step in the workflow. In order to do this, we add a new keyword argument on_error to the attempt method. We give it a default which is a lambda that simply re-raises the passed error.

Then we replace the raise in the rescue stanza with a call to the on_error handler, using Ruby's shorthand lambda calling syntax.

Back in our workflow, we change the special offers step. We give it a custom error handler, which simply ignores the passed error and does nothing. When we run the code again, it gets all the way to the end. We have successfully made the special offer step optional.

$n = 0

def login
  puts "Logging in"
  $n += 1
  return :session123
end

def make_purchase(session)
  $n += 1
  puts "Making purchase"
  fail "The API was rude to me" if $n < 3
  :purchase_record
end

def get_special_offers(purchase_record)
  $n += 1
  puts "Getting special offers"
  fail "Special offers server is down."
end

def get_receipt(purchase_record)
  $n += 1
  puts "Getting receipt"
  fail "I forgot what I was doing" if $n < 4
  fail "I left it in my other pants" if $n < 5
  :receipt
end

def attempt(times: 1, on_error: ->(e){raise e})
  yield
rescue => e
  times -= 1
  retry if times > 0
  on_error.(e)
end

session = attempt(times: 1)  {login}
purchase_rec = attempt(times: 3)  {make_purchase(session)}
offers = attempt(times: 1, on_error: ->(_){}) {
  get_special_offers(purchase_rec)
}
receipt = attempt(times: 10) {get_receipt(purchase_rec)}

# >> Logging in
# >> Making purchase
# >> Making purchase
# >> Getting special offers
# >> Getting receipt
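
As a small aside on the calling syntax used in attempt, the sketch below shows three equivalent ways to call a lambda, plus what the no-op handler returns. The lambda names shout and ignore are made up for illustration.

```ruby
# Hypothetical handlers, used only to illustrate the calling syntax.
shout  = ->(e) { e.message.upcase }
ignore = ->(_) {}   # same shape as the no-op handler in the workflow

error = RuntimeError.new("oops")

via_call     = shout.call(error)  # explicit method call
via_dot      = shout.(error)      # shorthand used in attempt
via_brackets = shout[error]       # another equivalent form

# An empty lambda body evaluates to nil, so a step that fails with
# the no-op handler makes attempt return nil.
ignored = ignore.(error)
```

Because the handler's return value becomes the return value of attempt, the offers variable in the listing above ends up as nil when the special offers step fails.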

Now let's throw a wrench in the works. We don't actually want these steps to raise exceptions when they fail. What we really want to do is collect both the exception and some extra contextual information for diagnostic use.

In order to do this, we add a new variable error_details, which starts out nil. We then make a lambda named capture_error, which will accept an exception as an argument and set the error_details variable to a hash of failure data. In the hash we include the actual exception, the time at which it was captured, and the hostname of the current node. For this last item, we also need to require the socket library.

We then go through each of our steps, except for the optional special offers one, changing the error handler to be our custom lambda.

We add a line to our output, examining the state of the error_details variable. Then we alter one of the workflow steps to force it to always fail.

When we run the code, we can see that rather than raising an exception as it did before, it now saves information into the error_details variable.

$n = 0

def login
  puts "Logging in"
  $n += 1
  return :session123
end

def make_purchase(session)
  $n += 1
  puts "Making purchase"
  fail "The API was rude to me" if $n < 100
  :purchase_record
end

def get_special_offers(purchase_record)
  $n += 1
  puts "Getting special offers"
  fail "Special offers server is down."
end

def get_receipt(purchase_record)
  $n += 1
  puts "Getting receipt"
  fail "I forgot what I was doing" if $n < 4
  fail "I left it in my other pants" if $n < 5
  :receipt
end

def attempt(times: 1, on_error: ->(e){raise e})
  yield
rescue => e
  times -= 1
  retry if times > 0
  on_error.(e)
end

require "socket"
error_details = nil
capture_error = ->(e){
  error_details = {
    error: e,
    time:  Time.now,
    host:  Socket.gethostname
  }
}
session = attempt(times: 1, on_error: capture_error)  {login}
purchase_rec = attempt(times: 3, on_error: capture_error) {
  make_purchase(session)
}
offers = attempt(times: 1, on_error: ->(_){}) {
  get_special_offers(purchase_rec)
}
receipt = attempt(times: 10, on_error: capture_error) {
  get_receipt(purchase_rec)
}

error_details
# => {:error=>#<RuntimeError: The API was rude to me>,
#     :time=>2014-10-17 00:35:15 -0400,
#     :host=>"hazel"}

# >> Logging in
# >> Making purchase
# >> Making purchase
# >> Making purchase
# >> Getting special offers
# >> Getting receipt
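
The capture_error lambda relies on Ruby closures: a lambda can assign to a local variable that already exists in the surrounding scope. The following is a stripped-down sketch of that behavior; the set_inner lambda and the inner_only variable are hypothetical names added only for contrast.

```ruby
error_details = nil   # must be defined before the lambda is created

capture_error = ->(e) {
  # Assigns the outer local variable, because it already exists.
  error_details = { error: e }
}

returned = capture_error.(RuntimeError.new("boom"))
captured_message = error_details[:error].message

# Contrast: a variable first assigned inside a lambda stays local to it.
set_inner = -> { inner_only = 42 }
set_inner.()
inner_visible = defined?(inner_only) ? true : false
```

Note that the lambda's last expression is the assignment, so it returns the hash itself. That hash is truthy, and it is what attempt hands back as the step's result when the handler runs.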

Unfortunately, we can also see an unintended consequence in the output. Because we are no longer terminating execution early with an exception, the steps go right on executing even after the purchase step has failed.

In order to ensure execution stops as soon as a required step fails, we make a few more changes. First, we modify our error handler lambdas. We make the capture_error lambda return false, and we make the no-op lambda for the special offers step return true.

Then we connect the steps together in a chain using Ruby's and operator. You might recall this idiom from episode #125.

We've now made the execution of each step depend on the previous step returning a truthy value. We happen to know that all of our steps return something truthy when they succeed. And we've ensured that the capture_error lambda returns false, which should end the chain of execution when an error is encountered.

To test this, we run the code again. Sure enough, we can see that the workflow now only gets as far as the failing step, and then stops.

$n = 0

def login
  puts "Logging in"
  $n += 1
  return :session123
end

def make_purchase(session)
  $n += 1
  puts "Making purchase"
  fail "The API was rude to me" if $n < 100
  :purchase_record
end

def get_special_offers(purchase_record)
  $n += 1
  puts "Getting special offers"
  fail "Special offers server is down."
end

def get_receipt(purchase_record)
  $n += 1
  puts "Getting receipt"
  fail "I forgot what I was doing" if $n < 4
  fail "I left it in my other pants" if $n < 5
  :receipt
end

def attempt(times: 1, on_error: ->(e){raise e})
  yield
rescue => e
  times -= 1
  retry if times > 0
  on_error.(e)
end

require "socket"
error_details = nil
capture_error = ->(e){
  error_details = {
    error: e,
    time:  Time.now,
    host:  Socket.gethostname
  }
  false
}
session = attempt(times: 1, on_error: capture_error)  {login} and
purchase_rec = attempt(times: 3, on_error: capture_error) {
  make_purchase(session)
} and
offers = attempt(times: 1, on_error: ->(_){true}) {
  get_special_offers(purchase_rec)
} and
receipt = attempt(times: 10, on_error: capture_error) {
  get_receipt(purchase_rec)
}

error_details
# => {:error=>#<RuntimeError: The API was rude to me>,
#     :time=>2014-10-17 00:34:58 -0400,
#     :host=>"hazel"}

# >> Logging in
# >> Making purchase
# >> Making purchase
# >> Making purchase
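
The chaining idiom depends on two properties of the and operator: it short-circuits, and it binds more loosely than assignment. Here is a minimal sketch using a made-up step lambda in place of the real workflow methods.

```ruby
# Hypothetical stand-in for a workflow step: records its name,
# then returns the given result.
ran  = []
step = ->(name, result) { ran << name; result }

one   = step.(:login, :session) and
two   = step.(:purchase, false) and
three = step.(:receipt, :receipt)
# Parsed as (one = ...) and (two = ...) and (three = ...).
# The chain stops after the falsy second step, so three stays nil.

# With && the grouping is different: the right-hand assignment is
# evaluated first and its value is assigned to x as well.
x = :session && y = :purchase
```

After this runs, ran contains only :login and :purchase, and x holds :purchase rather than :session, which is why the lower-precedence and is the better fit for this kind of chain.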

There is a lot more we could try and tackle here. For instance, what about specifying timeouts for slow operations? And what do we do about tasks that return a falsy value even when they succeed?
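
As one possible direction for the timeout question, here is a rough sketch that wraps the block in Timeout.timeout from Ruby's standard library. The method name attempt_with_timeout and the time_limit keyword are assumptions made for illustration, not part of the code above.

```ruby
require "timeout"

# Hypothetical variant of attempt. time_limit is in seconds;
# nil means no time limit is applied.
def attempt_with_timeout(times: 1, time_limit: nil, on_error: ->(e){raise e})
  Timeout.timeout(time_limit) { yield }
rescue => e   # Timeout::Error is a StandardError, so it is caught here
  times -= 1
  retry if times > 0
  on_error.(e)
end

# A slow block that never finishes within the limit.
tries = 0
outcome = attempt_with_timeout(times: 2, time_limit: 0.05,
                               on_error: ->(e){ e.class }) {
  tries += 1
  sleep 1
}

# A fast block is unaffected.
quick = attempt_with_timeout(time_limit: 1) { :quick }
```

A timed-out attempt is treated like any other failure here, so it counts against times and ends up in the on_error handler. Timeout.timeout interrupts the block at an arbitrary point, so this approach is only reasonable for steps that are safe to abandon midway.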

There are clearly refinements we could make. But what we have already done works well, and I like that we've managed to accomplish it without radically changing the basic structure of this code. I think this is enough for today. Happy hacking!
