Stack Overflow
Video transcript & code
I meant to cover this in the previous episode, but then I realized that I needed to introduce the TracePoint API first. Now that we're familiar with TracePoint, let's talk about stack overflows.
Ordinarily, Ruby adds a frame to the system stack for every method called or block entered. If you're not familiar with memory stacks and heaps, you can think of the stack as a trail of breadcrumbs the VM leaves behind. Every time it executes a method call, it needs to know what state and instruction to go back to when that method returns.
We can always see the full current stack by calling the caller method with an argument of 0. Here's some code where one method calls another, which calls another, which prints out the current stack trace.
def hello
greet
end
def greet
salute
end
def salute
puts "Salutations from:"
puts caller(0)
end
hello
# >> Salutations from:
# >> xmptmp-in27856D8J.rb:11:in `salute'
# >> xmptmp-in27856D8J.rb:6:in `greet'
# >> xmptmp-in27856D8J.rb:2:in `hello'
# >> xmptmp-in27856D8J.rb:14:in `<main>'
So every time a method is called, new breadcrumbs are dropped on the system stack. And eventually, if methods keep calling each other without returning, the stack runs out of space. The easiest way to see a stack overflow is to write a method which simply calls itself.
def foo
foo # ~> SystemStackError: stack level too deep
end
foo
# ~> SystemStackError
# ~> stack level too deep
# ~>
# ~> xmptmp-in278560aZ.rb:2
This code will never return, but it will continually add stack frames. Until eventually, the stack "overflows", which Ruby indicates with a SystemStackError.
It's quite obvious why this method causes a stack overflow. But it's often not nearly so obvious.
Here's one of my favorite sources of non-obvious stack overflows. Here we define method_missing
, in order handle messages for which there is no defined method. This code ought to just print out the name of the message that was sent.
But we've made a mistake: there's an extra 's' at the end of putss
. As a result, this is now an unknown message send. Which will trigger method_missing
, which will call the broken putss
, which will trigger method_missing
… and so on, until the stack overflows.
def method_missing(method_name, *args)
putss method_name
end
hello
# !> xmptmp-in27856nXH.rb:2: stack level too deep (SystemStackError)
# !> xmptmp-in27856nXH.rb:2: stack level too deep (SystemStackError)
Other times, stack overflows occur because a long sequence of method calls eventually wraps around and calls itself.
def foo
bar
end
def bar
baz
end
def baz
buz
end
def buz
faz
end
def faz
foo
end
foo
# ~> SystemStackError
# ~> stack level too deep
# ~>
# ~> xmptmp-in27856O2Z.rb:14
In this case the error points, somewhat arbitrarily, to line 14. But line 14 doesn't really spell out the full circular sequence of calls that lead to the stack overflow.
This is in Ruby 1.9. Thankfully, Ruby 2.2 improves matters considerably, by capturing the entire stack trace. Although when we see the error dumped to standard out, it elides most of the trace. If the circular sequence of calls is longer than the information shown here, we still may have trouble figuring out what went wrong.
$ ruby circular.rb circular.rb:22:in `buz': stack level too deep (SystemStackError) from circular.rb:18:in `baz' from circular.rb:14:in `bar' from circular.rb:10:in `foo' from circular.rb:26:in `faz' from circular.rb:22:in `buz' from circular.rb:18:in `baz' from circular.rb:14:in `bar' from circular.rb:10:in `foo' ... 10067 levels... from circular.rb:18:in `baz' from circular.rb:14:in `bar' from circular.rb:10:in `foo' from circular.rb:29:in `<main>'
One thing we can do is manually surround the problematic code in a block that explicitly rescues SystemStackError
-s. Then we can interrogate the error for its backtrace
attribute, which in Ruby 2 contains the full stack trace at the point Ruby ran out of stack.
def foo
bar
end
def bar
baz
end
def baz
buz
end
def buz
faz
end
def faz
foo
end
error = begin
foo
rescue SystemStackError => e
e
end
error.backtrace.size # => 8736
error.backtrace[87] # => "xmptmp-in27856pgJ.rb:10:in `baz'"
error.backtrace[1024] # => "xmptmp-in27856pgJ.rb:2:in `foo'"
The trouble with allowing a stack overflow to play out to the bitter end before debugging it, is that often this isn't an instantaneous process. If the code being infinitely repeated is processor-intensive and/or allocates a lot of objects along the way, executing an infinitely recursive method call may bring our system to a slow, grinding halt before eventually failing with a SystemStackError.
To make it easier to debug stack overflows, here's a snippet of code I learned from Myron Marston. It uses the TracePoint
API to hook itself into every method call, and then immediately enable tracing. The hook itself checks the current stack size against a max with every call, and raises a special exception if the max is exceeded.
Because traces can be enabled for the whole program, we don't have to enclose the code causing problems in a special block. We can just insert this tracing code somewhere in system startup, and then wait for the stack size max to be exceeded. Ruby then takes the raised exception and automatically dumps its full stack trace.
By the way, for the sake of example I'm using a pretty small max here. In large applications the stack may normally exceed 100 frames, so you may need to increase this. The original version of this code that I saw from Myron Marston used a max stack size of 500.
def foo
bar
end
def bar
baz
end
def baz
buz
end
def buz
faz
end
def faz
foo
end
max_stack_frames = 100
TooManyStackFrames = Class.new(StandardError)
TracePoint.new(:call) do |tp|
if caller.size >= max_stack_frames
raise TooManyStackFrames, "Stack has exceeded #{max_stack_frames} frames"
end
end.enable
foo
# ~> TooManyStackFrames
# ~> Stack has exceeded 100 frames
# ~>
# ~> xmptmp-in27856E2c.rb:26:in `block in <main>'
# ~> xmptmp-in27856E2c.rb:13:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:18:in `faz'
# ~> xmptmp-in27856E2c.rb:14:in `buz'
# ~> xmptmp-in27856E2c.rb:10:in `baz'
# ~> xmptmp-in27856E2c.rb:6:in `bar'
# ~> xmptmp-in27856E2c.rb:2:in `foo'
# ~> xmptmp-in27856E2c.rb:30:in `<main>'
This is a great example of how we can put the TracePoint
API to very practical use, to build little tools that help us with our Ruby development. If you've used TracePoint to build any little tools of your own, I'd love to hear about it.
Happy hacking!
Responses