
Humane Exceptions with Hiro Asari

Runtime exceptions in production systems are frustrating on the best of days, but things get extra complicated in the context of a distributed systems architecture. Today, guest chef Hiro Asari joins us with some advice about how to make exception reports more useful and actionable.

Who is Hiro? Well, he hails from Japan but has spent most of his working life in the USA. You might know him for his work on Travis CI, which he helped build from a small open source project to the most widely used continuous integration service on the planet.


Introduction

Exceptions are a fact of a programmer's life. No matter how well we write our program, exceptions will happen, and we will have to deal with the consequences.


NoMethodError: undefined method `empty?' for false:FalseClass
  from lib/travis/build/env/config.rb:19:in `reject'
  from lib/travis/build/env/config.rb:19:in `env_vars'
  from lib/travis/build/env/config.rb:12:in `vars'
  from lib/travis/build/appliances/setup_filter.rb:109:in `each'
  from lib/travis/build/appliances/setup_filter.rb:109:in `flat_map'
  from lib/travis/build/appliances/setup_filter.rb:109:in `secrets'
  from lib/travis/build/appliances/setup_filter.rb:66:in `apply?'
  from lib/travis/build/appliances.rb:48:in `apply'
  from lib/travis/build/script.rb:197:in `setup_filter'
  from lib/travis/build/stages/builtin.rb:12:in `block in run'
  from lib/travis/shell/builder.rb:162:in `with_options'
  from lib/travis/build/stages/base.rb:13:in `with_stage'
  from lib/travis/build/stages/builtin.rb:8:in `run'
  from lib/travis/build/stages.rb:123:in `run_stage'
  from lib/travis/build/stages.rb:113:in `define_stage'
  from lib/travis/build/stages.rb:82:in `block in run'
  from lib/travis/build/stages.rb:81:in `each'
  from lib/travis/build/stages.rb:81:in `run'
  from lib/travis/build/script.rb:143:in `run'
  from lib/travis/build/script.rb:84:in `sexp'
  from lib/travis/build/script.rb:74:in `compile'
  from lib/travis/api/build/app.rb:82:in `block in '
  from sinatra/base.rb:1611:in `call'
  from sinatra/base.rb:1611:in `block in compile!'
  from sinatra/base.rb:975:in `block (3 levels) in route!'
  from sinatra/base.rb:994:in `route_eval'
  from sinatra/base.rb:975:in `block (2 levels) in route!'
  from sinatra/base.rb:1015:in `block in process_route'
  from sinatra/base.rb:1013:in `catch'
  from sinatra/base.rb:1013:in `process_route'
  from sinatra/base.rb:973:in `block in route!'
  from sinatra/base.rb:972:in `each'
  from sinatra/base.rb:972:in `route!'
  from sinatra/base.rb:1085:in `block in dispatch!'
  from sinatra/base.rb:1067:in `block in invoke'
  from sinatra/base.rb:1067:in `catch'
  from sinatra/base.rb:1067:in `invoke'
  from sinatra/base.rb:1082:in `dispatch!'
  from sinatra/base.rb:907:in `block in call!'
  from sinatra/base.rb:1067:in `block in invoke'
  from sinatra/base.rb:1067:in `catch'
  from sinatra/base.rb:1067:in `invoke'
  from sinatra/base.rb:907:in `call!'
  from sinatra/base.rb:895:in `call'
  from rack/deflater.rb:35:in `call'
  from sinatra/base.rb:954:in `forward'
  from sinatra/base.rb:1028:in `route_missing'
  from sinatra/base.rb:989:in `route!'
  from sinatra/base.rb:985:in `route!'
  from sinatra/base.rb:1085:in `block in dispatch!'
  from sinatra/base.rb:1067:in `block in invoke'
  from sinatra/base.rb:1067:in `catch'
  from sinatra/base.rb:1067:in `invoke'
  from sinatra/base.rb:1082:in `dispatch!'
  from sinatra/base.rb:907:in `block in call!'
  from sinatra/base.rb:1067:in `block in invoke'
  from sinatra/base.rb:1067:in `catch'
  from sinatra/base.rb:1067:in `invoke'
  from sinatra/base.rb:907:in `call!'
  from sinatra/base.rb:895:in `call'
  from rack/config.rb:17:in `call'
  from rack/protection/xss_header.rb:18:in `call'
  from rack/protection/path_traversal.rb:16:in `call'
  from rack/protection/json_csrf.rb:18:in `call'
  from rack/protection/base.rb:49:in `call'
  from rack/protection/base.rb:49:in `call'
  from rack/protection/frame_options.rb:31:in `call'
  from rack/nulllogger.rb:9:in `call'
  from rack/head.rb:13:in `call'
  from sinatra/base.rb:182:in `call'
  from sinatra/base.rb:2013:in `call'
  from sinatra/base.rb:954:in `forward'
  from sinatra/base.rb:1028:in `route_missing'
  from sinatra/base.rb:989:in `route!'
  from sinatra/base.rb:985:in `route!'
  from sinatra/base.rb:1085:in `block in dispatch!'
  from sinatra/base.rb:1067:in `block in invoke'
  from sinatra/base.rb:1067:in `catch'
  from sinatra/base.rb:1067:in `invoke'
  from sinatra/base.rb:1082:in `dispatch!'
  from sinatra/base.rb:907:in `block in call!'
  from sinatra/base.rb:1067:in `block in invoke'
  from sinatra/base.rb:1067:in `catch'
  from sinatra/base.rb:1067:in `invoke'
  from sinatra/base.rb:907:in `call!'
  from sinatra/base.rb:895:in `call'
  from raven/integrations/rack.rb:51:in `call'
  from rack/protection/xss_header.rb:18:in `call'
  from rack/protection/path_traversal.rb:16:in `call'
  from rack/protection/json_csrf.rb:18:in `call'
  from rack/protection/base.rb:49:in `call'
  from rack/protection/base.rb:49:in `call'
  from rack/protection/frame_options.rb:31:in `call'
  from rack/nulllogger.rb:9:in `call'
  from rack/head.rb:13:in `call'
  from sinatra/base.rb:182:in `call'
  from sinatra/base.rb:2013:in `call'
  from rack/ssl.rb:27:in `call'
  from rack/protection/xss_header.rb:18:in `call'
  from rack/protection/path_traversal.rb:16:in `call'
  from rack/protection/json_csrf.rb:18:in `call'
  from rack/protection/base.rb:49:in `call'
  from rack/protection/base.rb:49:in `call'
  from rack/protection/frame_options.rb:31:in `call'
  from rack/nulllogger.rb:9:in `call'
  from rack/head.rb:13:in `call'
  from sinatra/base.rb:182:in `call'
  from sinatra/base.rb:2013:in `call'
  from sinatra/base.rb:1487:in `block in call'
  from sinatra/base.rb:1787:in `synchronize'
  from sinatra/base.rb:1487:in `call'
  from puma/configuration.rb:224:in `call'
  from puma/server.rb:600:in `handle_request'
  from puma/server.rb:435:in `process_client'
  from puma/server.rb:299:in `block in run'
  from puma/thread_pool.rb:120:in `block in spawn_thread'

They can be benign, or they can be serious. Often, though, we have more control over what we do with exceptions than we realize, and the choices we make can have a big impact on the usefulness of our software.

Exceptions in a distributed system

Software can fail for many reasons, and in a distributed system, relaying an exception to end users in a useful manner can be a challenge. If we do it right, however, the error report can be instructive and can reduce the support load by anticipating common user (and system) errors.

Imagine a component FrontEnd in a distributed system making a call to another component BackEnd. And imagine further that FrontEnd is responsible for displaying any error message to the end user.

How should it do the job?

Approach 1: Dumping back trace

A naïve and short-sighted approach would be to relay the Ruby back trace produced by BackEnd to the user. We may tell ourselves, "This is what Ruby is telling us!" and move on. While this is technically true, it is not very helpful.

The exception that BackEnd produces can be so far removed conceptually from the user (perhaps because the user does not have access to BackEnd's source code) that it is simply noise.

Approach 2: Swallow everything

Since BackEnd's exceptions are noise, it may be tempting to have FrontEnd relay a simplified error message.


An error occurred. Please try again.

This merely spares the user the pain of looking at the back trace. It leaves both the user and our support staff none the wiser about how to fix the problem.

A better approach

A better approach is to study and anticipate exceptions.
With the help of exception trackers (such as Sentry and Airbrake), let us first find out what kinds of exceptions our system is seeing.

For example, let us suppose we see the following exception in our tracker:


TypeError: no implicit conversion of Symbol into String
  from …/deploy/script.rb:39:in `delete'
  from …/deploy/script.rb:39:in `initialize'
  from …/deploy.rb:28:in `new'
  from …/deploy.rb:28:in `block in providers'
  from …/deploy.rb:28:in `map'

and that the relevant code in script.rb looks like this (per the back trace, the call to config.delete is on line 39):


def initialize(script, sh, data, config)
  @script = script
  @sh = sh
  @data = data
  @config = config
  @silent = false
  @allow_failure = config.delete(:allow_failure)
end

Here, we see that the exception is raised from the #delete method on the config object.

It would be best if we knew what config looked like at the time the exception was raised. If we do not, we should log enough information in the exception tracker to let us deduce this critical detail.
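One way to record that context is to wrap the risky call and attach a description of config to the report before re-raising. The following is only a sketch: InMemoryTracker and with_config_context are hypothetical names invented for illustration, and a real tracker client has its own API for attaching extra data.

```ruby
# Hypothetical stand-in for an exception tracker client. It only collects
# reports in memory; real trackers have their own APIs.
class InMemoryTracker
  attr_reader :reports

  def initialize
    @reports = []
  end

  def capture(exception, context = {})
    @reports << { error: exception.class.name,
                  message: exception.message,
                  context: context }
  end
end

# Hypothetical helper: yield config to the block and, if anything goes
# wrong, report the exception along with what config looked like.
def with_config_context(tracker, config)
  yield config
rescue StandardError => e
  tracker.capture(e, config_class: config.class.name,
                     config_preview: config.inspect[0, 200])
  raise # reporting must not hide the original exception
end

tracker = InMemoryTracker.new
begin
  with_config_context(tracker, "allow_failure: true") do |config|
    config.delete(:allow_failure)
  end
rescue TypeError
  # Already reported; nothing more to do in this sketch.
end

p tracker.reports.first[:context]
```

If config can contain secrets, the preview would need to be redacted before it leaves the process.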

The error message indicates that config.delete is trying to convert our argument :allow_failure into a String. This tells us that config is not a Hash (whose #delete method accepts any object as an argument), but rather a String.
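The difference between the two #delete methods can be checked directly. This is a minimal sketch; the sample values below are made up for illustration and stand in for whatever the user actually supplied.

```ruby
# Hypothetical sample values; the real config comes from user input.
hash_config   = { allow_failure: true, provider: "example" }
string_config = "allow_failure: true"

# Hash#delete accepts any object as a key and returns the removed value.
hash_result = hash_config.delete(:allow_failure)

# String#delete expects String arguments, so a Symbol raises TypeError.
string_error =
  begin
    string_config.delete(:allow_failure)
    nil
  rescue TypeError => e
    e
  end

puts hash_result.inspect
puts string_error.message
```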

Based on this insight, our #initialize can be improved by:

  • rescuing the observed TypeError
  • matching the observed error message
  • raising an exception with a more meaningful error message, indicating how users can fix their input so that the exception is no longer raised.


def initialize(script, sh, data, config)
  @script = script
  @sh = sh
  @data = data
  @config = config
  @silent = false
  @allow_failure = config.delete(:allow_failure)
rescue TypeError => e
  # String#delete raises this when config is a String rather than a Hash.
  if e.message =~ /no implicit conversion of Symbol into /
    raise "The configuration data should be a hash (dictionary)."
  end
  # Any other TypeError is unexpected; re-raise it rather than swallow it.
  raise
end
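To see what a caller such as FrontEnd would end up showing, here is a sketch built around a trimmed-down class. DeployScript and describe are hypothetical names used only for this illustration. The sketch assumes the rescue clause passes along any TypeError it does not recognize, so that unanticipated errors are not silently lost.

```ruby
# Hypothetical, trimmed-down class for illustration only.
class DeployScript
  attr_reader :allow_failure

  def initialize(config)
    @allow_failure = config.delete(:allow_failure)
  rescue TypeError => e
    if e.message =~ /no implicit conversion of Symbol into /
      raise "The configuration data should be a hash (dictionary)."
    end
    raise # anything we did not anticipate is passed along unchanged
  end
end

# Hypothetical helper: what a caller would display for a given config.
def describe(config)
  script = DeployScript.new(config)
  "ok, allow_failure=#{script.allow_failure.inspect}"
rescue StandardError => e
  "error: #{e.message}"
end

puts describe({ allow_failure: true })
puts describe("allow_failure: true")
```

The first call succeeds with a Hash; the second reports the friendlier message instead of the original TypeError.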

Custom exception class

We may further refine this idea by raising a custom exception class, which can convey richer information about our application's exceptions.

For example, we can guide users to relevant documentation by overriding #message:


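class OurException < StandardError
  attr_accessor :docs_url

  def initialize(msg = '')
    # Hand the message to StandardError so that it is stored properly.
    super(msg)
    @docs_url = "https://docs.example.com/"
  end

  # Append a pointer to the documentation to the original message.
  def message
    "#{super}\nPlease consult #{docs_url}"
  end
end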
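One possible way to build on this idea is to give each anticipated problem its own subclass, pointing at its own documentation page. The sketch below is self-contained and uses hypothetical names (DocumentedError, ConfigFormatError) and a made-up documentation path; none of them come from a real code base.

```ruby
# Hypothetical base class, written so that this sketch runs on its own.
class DocumentedError < StandardError
  attr_reader :docs_url

  def initialize(msg = "An error occurred.", docs_url: "https://docs.example.com/")
    super(msg)
    @docs_url = docs_url
  end

  # Append a pointer to the documentation to the original message.
  def message
    "#{super}\nPlease consult #{docs_url}"
  end
end

# One subclass per anticipated problem. The URL path is an assumption.
class ConfigFormatError < DocumentedError
  def initialize(msg = "The configuration data should be a hash (dictionary).")
    super(msg, docs_url: "https://docs.example.com/configuration")
  end
end

begin
  raise ConfigFormatError
rescue DocumentedError => e
  puts e.message
end
```

Because every subclass shares one parent, FrontEnd can rescue the parent class and display #message without knowing which specific problem occurred.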

Unexpected exceptions will happen

Despite our best efforts, users can hand us data we may not expect, and exceptions will be raised. But even in these cases, we should not burden users with the task of deciphering the Ruby back trace. After all, they may not be Ruby developers, and the language we are using is an implementation detail with which they should not be concerned.

Exception trackers often assign a unique ID when the exception is sent to their data store.

Our last resort, then, is to instruct users to relay this ID, so that we can investigate the exception further.


There was an error from which we could not recover.
Unfortunately, we do not know much about this error.
Please review https://docs.example.com, or contact us at support@example.com with the error ID: 0cceaaec3a3f464bae6c166b97da71a7
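A last-resort handler along these lines might look like the following sketch. IdAssigningTracker, user_facing_message, and run_with_last_resort are hypothetical names, and the ID is generated locally only to imitate what a tracker would return when it stores the exception.

```ruby
require "securerandom"

# Hypothetical stand-in for an exception tracker. A real tracker would
# store the exception and return its own ID; here we only generate one.
class IdAssigningTracker
  def capture(_exception)
    SecureRandom.hex(16) # 32 hexadecimal characters
  end
end

# Build the message shown to the user for an error we did not anticipate.
def user_facing_message(error_id)
  <<~MESSAGE
    There was an error from which we could not recover.
    Unfortunately, we do not know much about this error.
    Please review https://docs.example.com, or contact us at support@example.com with the error ID: #{error_id}
  MESSAGE
end

# Run the block. On an unexpected exception, report it to the tracker
# and return the user-facing message instead of a back trace.
def run_with_last_resort(tracker)
  yield
rescue StandardError => e
  user_facing_message(tracker.capture(e))
end

puts run_with_last_resort(IdAssigningTracker.new) { nil.empty? }
```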

Chances are, this piece of information will help us understand another common exception, which we can then handle by introducing a new custom exception class.

Conclusion

Error messages are one place where we would rather avoid meeting our users. It is not a very pleasant place.

However, by keeping the end goal in mind, it can be a place where we can show our users how to get out of the sticky situation rather than leaving them in a haze.

Happy hacking!


Photo credit: https://commons.wikimedia.org/wiki/File:TrainwreckatMontparnasse1895_2.jpg (public domain)

