Ruby 1.8.5, REXML, and You
In my previous post, I mentioned that I was trying to track down why our mongrels (running Rails) at work were spinning out of control for no apparent reason. Once I was finally able to generate a backtrace from those mongrels, it led right to the issue.
The relevant portion of the backtrace looked like this:
from /usr/lib/ruby/1.8/rexml/encoding.rb:59:in `check_encoding'
from /usr/lib/ruby/1.8/rexml/source.rb:40:in `initialize'
…
from /usr/lib/ruby/1.8/rexml/document.rb:45:in `initialize'
…
from ./current/config/../vendor/rails/actionpack/lib/action_controller/cgi_ext/cgi_methods.rb:53:in `parse_formatted_request_parameters'
So apparently REXML was the culprit, and it was happening while parsing the parameters passed to our Rails app, before any of our (non-framework) code was even called. From my earlier investigation, I knew that some of the spinning mongrels corresponded to requests with 200k or more of POST data, so this made some sense. But other requests with that much data went through just fine.
In any case, tracking down that line revealed this code, as part of what REXML uses to detect the encoding of the XML data:
str =~ /^\s*<?xml\s*version=(['"]).*?\2\s*encoding=(["'])(.*?)\2/um
return $1.upcase if $1
And, well, that’s just wrong: the unescaped question mark, the two “\2” backreferences (the first should be “\1”), the “return $1” instead of “return $3”, and so on. Our servers are currently running Ruby 1.8.5, so I took a look at the corresponding code in the version of REXML that comes with Ruby 1.8.6, and found that it’s fixed:
str =~ /^\s*<\?xml\s+version\s*=\s*(['"]).*?\1\s+encoding\s*=\s*(["'])(.*?)\2/um
return $3.upcase if $3
So it just so happens that certain sequences of data we were receiving in the Rails app played unkindly with the broken regular expression, sending the matcher exponentially out of control on the backreferences.
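As a small illustration of two of those points, here is a sketch built on the corrected 1.8.6 pattern quoted above. The constant and the `declared_encoding` helper are names made up for this example (REXML's own method is `check_encoding`), and the sample strings are invented.

```ruby
# Corrected pattern, copied from the REXML that ships with Ruby 1.8.6 (quoted above).
FIXED_DECL = /^\s*<\?xml\s+version\s*=\s*(['"]).*?\1\s+encoding\s*=\s*(["'])(.*?)\2/um

# Hypothetical helper for this example only, not part of REXML.
# Returns the declared encoding in upper case, or nil if none is declared.
def declared_encoding(str)
  m = FIXED_DECL.match(str)
  m && m[3].upcase   # group 3 holds the encoding name, hence "$3" in the fix
end

puts declared_encoding(%q{<?xml version="1.0" encoding="iso-8859-1"?><doc/>})  # ISO-8859-1
p    declared_encoding(%q{<?xml version="1.0"?><doc/>})                         # nil

# The unescaped "?" in the old pattern makes the "<" optional instead of
# matching a literal question mark, so the prefix matches text that is
# not an XML declaration at all.
p(/<?xml/  =~ "xml")   # 0   (matches without any "<")
p(/<\?xml/ =~ "xml")   # nil (requires a literal "<?")
```

Group 1 and group 2 only capture the quote characters, which is why returning `$1` could never have produced an encoding name.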
Changing the old REXML code to the new code brought encoding-detection time on some of the test data down from 2 *hours* to 52 microseconds. So, we fixed that on our servers, and the problem is gone.
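If you want to try the patched pattern against your own data, a rough timing sketch might look like the following. The payload here is a made-up stand-in of roughly the size mentioned earlier, not the data that triggered the hang, so whatever it prints says nothing about the numbers above.

```ruby
require 'benchmark'

# Corrected pattern from the REXML bundled with Ruby 1.8.6 (quoted earlier).
PATCHED = /^\s*<\?xml\s+version\s*=\s*(['"]).*?\1\s+encoding\s*=\s*(["'])(.*?)\2/um

# Hypothetical ~200k POST body; substitute a captured request body here.
body = "data=" + ("x" * 200_000)

result = nil
elapsed = Benchmark.realtime { result = (PATCHED =~ body) }

puts "match: #{result.inspect}"
puts "elapsed: %.1f microseconds" % (elapsed * 1_000_000)
```

A single run like this is noisy; repeating it a few times on a real captured request body gives a better picture.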
Another solution: Upgrade to Ruby 1.8.6 or better. :)