July 24, 2008

Backtraces from Live Ruby Processes

We had some out-of-control mongrel processes at work recently… They wouldn’t respond to anything except a “kill -9”, were taking 100% CPU time, and would spin for hours before responding to any new requests.

Unfortunately we had no idea what could be causing it, either. Loading up gdb and printing out a backtrace only gives a bunch of “rb_call” type entries: frames from the C-level interpreter code rather than anything at the Ruby level. The obvious solution is to go ask Google how to get a Ruby-level backtrace.

Presumed answers:

http://weblog.jamisbuck.org/2006/9/22/inspecting-a-live-ruby-process and http://eigenclass.org/hiki.rb?ruby+live+process+introspection

However, they must have been working with a different version of Ruby, because those techniques don’t work on my local Mac or the Linux servers; I just get errors about accessing invalid memory. So after a few frustrating hours I downloaded the source code for Ruby 1.8.6, and found rb_backtrace(). It’s a function that takes no arguments, and prints a backtrace to stdout rather than trying to return an array (which would be semi-difficult to interpret in gdb).

So …

$ gdb -p <process_id>

(gdb) call (int)rb_backtrace()

This forces Ruby to print a backtrace to the process’s stdout, not to the gdb session. In our case, that’s a daemontools log file, so I could finally track down what was making the mongrel spin forever!
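
If you need to do this on several stuck processes, the two steps above can be wrapped into one non-interactive command. This is only a sketch: ruby_backtrace is a made-up helper name, and it assumes your gdb supports the -batch and -ex options and is allowed to attach to the target process. As before, the backtrace shows up on the Ruby process’s stdout, not in your terminal.

```shell
#!/bin/sh
# Sketch: ask a running Ruby 1.8 process to print its own backtrace.
# ruby_backtrace is a hypothetical helper name, not part of gdb or Ruby.
ruby_backtrace() {
    pid="$1"

    # Refuse anything that is not a plain numeric process id.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: ruby_backtrace <process_id>" >&2
            return 2
            ;;
    esac

    # -p      attach to the running process
    # -batch  run the given commands, then exit instead of prompting
    # -ex     execute a single gdb command
    gdb -p "$pid" -batch -ex 'call (int)rb_backtrace()'
}

# Example (look for the output in the process's stdout/log file):
#   ruby_backtrace 12345
```

Because gdb exits as soon as the call returns, the process is only paused briefly, which matters if it is still supposed to be serving requests.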

There’s probably still a better way to do this within gdb, so you don’t have to go find the stdout, or risk debug output ending up somewhere it shouldn’t be. But it’s a starting point.