The Story Behind Ruby 1.9.3 Getting 36% Faster Loading Times
Xavier Shay is an Australian Rubyist who shares an issue with most of us: slow loading Rails 3 apps on Ruby 1.9.2! Unlike most of us, he put together a solution for ruby-head (what I'm calling 1.9.3 but isn't technically*) that, in my own tests, slashed 37% off the boot time of my Rails 3.0 app. He shared his work just a week ago. Awesome! But some other developments have occurred since..
* Just because things are in ruby-head doesn't mean they'll definitely make it into Ruby 1.9.3. Pragmatically, though, ruby-head seems to have attracted the 'Ruby 1.9.3' moniker and it makes for a better headline. Just don't get too upset if, for whatever reason, it gets yanked and delayed till Ruby 2.0... ;-)
Tip: If you're still on 1.8, check out The Ruby 1.9 Walkthrough, a mega screencast aimed at Ruby 1.8.7 developers who want to learn all about what's new, what's gone, and what's different in Ruby 1.9.2 and 1.9.3.
Slow Rails 3.0 Loading Times == A Big Problem
Ruby 1.9.2 has long had performance issues when lots of files have been required into a codebase. Back in January, Colin Law posted to the Rails Core mailing list about the problem:
There has been an ongoing thread on the RoR talk list about startup time using Rails 3 with Ruby 1.9.2. Do not be confused by the subject of the thread, which mentions 1.9.1, it has moved on to 1.9.2. The gist is that on Ubuntu (and possibly Macs) the startup time when using 1.9.2 can be very much greater then 1.8.7, even with a minimal app. This applies to running tests, migrations, server startup and so on.
Can anyone here throw any light on this?
Colin Law
Rails 3's once-chief prolific superstar, Yehuda Katz, had the best response at the time:
There are things that the C require code does in 1.9 that slow things down. One such example is re-checking $LOAD_PATH to make sure it is all expanded on every require. This is something that should be addressed by ruby-core. I'll open a ticket on redmine if there isn't one already.
Yehuda Katz
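To make Yehuda's point a little more concrete, here's a rough Ruby sketch of the difference between expanding every load path entry on each call and only re-expanding when the list has changed. The class names NaiveLoadPath and CachedLoadPath are made up for this illustration, and the real work happens in MRI's C code, so treat this as an analogy rather than a picture of what load.c does:

```ruby
# Hypothetical illustration only; MRI's real implementation is in C.

# Expands every entry on every call, as described in the quote above.
class NaiveLoadPath
  attr_reader :expansions

  def initialize(paths)
    @paths = paths
    @expansions = 0 # counts calls to File.expand_path
  end

  def expanded
    @paths.map do |path|
      @expansions += 1
      File.expand_path(path)
    end
  end
end

# Re-expands only when the list differs from the last snapshot.
class CachedLoadPath < NaiveLoadPath
  def expanded
    return @cached if @snapshot == @paths
    @snapshot = @paths.dup
    @cached = super
  end
end

paths = ["lib", "app/models"]
naive = NaiveLoadPath.new(paths)
cached = CachedLoadPath.new(paths)
3.times { naive.expanded; cached.expanded }
puts naive.expansions   # 6: two paths expanded on each of three calls
puts cached.expansions  # 2: expanded once, then served from the cache
```

The cached version still pays for an array comparison on every call, so it is cheaper rather than free.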
It turns out there was already a ticket from October 2010 on Redmine but little had happened in the interim. The compromise, then, was to find ways to make testing fast on Rails 3 with Spork and similar workarounds.
Xavier To The Rescue!
Xavier, like a champ, spent a significant amount of time digging into the problem. He discovered that ruby-head's then-present way of dealing with loading files was woefully inefficient. He noted that require was working like so (in a very simplified and Ruby-fied example - because MRI Ruby is really written in C!):
  def require(file)
    $loaded.each do |x|
      return false if x == file
    end
    load(file)
    $loaded.push(file)
  end
Xavier then set to work making ruby-head use a hash for more efficient lookups, a system he explained like so (again, this is just incredibly simplified pseudocode):
  def require(file)
    return false if $loaded[file]
    load(file)
    $loaded[file] = true
  end
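If you want to see why the array approach hurts as the file count grows, here's a small runnable sketch of the two strategies. ArrayRegistry, HashRegistry and require_feature are names invented for this example, nothing is actually loaded from disk, and it has no connection to the real load.c; it just counts how many comparisons the linear scan performs:

```ruby
# Hypothetical stand-ins for the two pseudocode versions above.
# No files are loaded; we only track which names have been "required".

class ArrayRegistry
  attr_reader :comparisons

  def initialize
    @loaded = []
    @comparisons = 0
  end

  # Scans the whole list before accepting a new file.
  def require_feature(file)
    @loaded.each do |x|
      @comparisons += 1
      return false if x == file
    end
    @loaded.push(file)
    true
  end
end

class HashRegistry
  def initialize
    @loaded = {}
  end

  # A single hash lookup, however many files are already loaded.
  def require_feature(file)
    return false if @loaded[file]
    @loaded[file] = true
  end
end

files = (1..1000).map { |i| "lib/file_#{i}.rb" }
array_registry = ArrayRegistry.new
files.each { |f| array_registry.require_feature(f) }

# 1000 distinct files cost 0 + 1 + ... + 999 comparisons.
puts array_registry.comparisons # 499500
```

The scan cost grows with the square of the number of files, which fits the pattern of the problem showing up most in apps that require a lot of files.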
The result was a much faster boot process for apps with lots of requires going on (like Rails 3 apps) and he released a patch which lots of people loved. I ran my own tests on a Rails 3 app with ~3000 lines of code and about 20 dependent libraries and got a speed up in load times of 37%.
Experimental Folks Only: There's a patch aimed at Ruby 1.9.2-p180 bringing Xavier's ideas back to the implementation you already know and love. Experiment at your own risk!
4 Days Later, A Core Team Patch
Impressed by the improvement, I was going to write a post about Xavier's work and how you could get to using it right away but then, out of the blue, came something straight from the Ruby core team in Japan: a 26-line patch to load.c by Masaya Tarui.
Masaya's patch took a totally different approach to Xavier's. Whereas Xavier's patch weighed in at over 1200 lines and essentially re-architected Ruby's feature loading process, Masaya's patch smashes a much-needed optimization into the existing code which significantly reduces the number of loop cycles necessary to check whether files have already been loaded or not.
The end result? I ran my tests again and got a 36% drop in Rails 3.0 app load time on the same app. So almost as fast as Xavier's patch but from a shorter yet scrappier solution.
I ran this briefly by Xavier on Twitter and he believes that this quick fix won't ultimately fix the problem for "really large apps", and some of his extra benchmarks shared in comments on this ticket seem to indicate as much. So I'm leaving this story a little in the air at the end here. Xavier did a fine bit of rearchitecting but Masaya swooped in with a short "quick fix" that, perhaps, has a lower chance of causing regressions.
Rails 3.0 App Bootup Times
When I was writing this post focused on Xavier's work rather than the new load.c patch, I was going to lead off with benchmarks of the new process. It turns out, though, the story became more interesting, so I've relegated these (new) benchmarks to the end of the post!
I'm no statistician and I grimace at poorly crafted benchmarks along with the rest of you. Trust, though, that all of these results came from a defined process so are more likely to be equally skewed, if at all ;-) They're based on the mean userspace times of the 2nd and 3rd runs of a time ./script/rails runner "puts 37337" using the specific Ruby version on the specified Rails app (an "empty" Rails 3.0.6 app and a 3000 line, 20 gem "bigger" Rails app).
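If you'd like to repeat something like this process yourself, here's a minimal sketch of one way to script it. The helper names child_user_time and mean_user_time are invented for this example, and it reads user CPU time from Process.times instead of parsing the output of time, so the numbers may not match the shell's exactly:

```ruby
# Hypothetical helpers; an approximation of the process described above.

# Runs a command and returns the user CPU time its process tree consumed,
# measured as the change in this process's accumulated child user time.
def child_user_time(command)
  before = Process.times.cutime
  system(command, out: File::NULL) # discard the command's stdout
  Process.times.cutime - before
end

# Runs the command `runs` times and averages the samples at the given
# zero-based indexes. The default keeps the 2nd and 3rd runs, treating
# the first as a warm-up.
def mean_user_time(command, runs: 3, keep: [1, 2])
  samples = Array.new(runs) { child_user_time(command) }
  kept = samples.values_at(*keep)
  kept.sum / kept.size
end

# Example (needs to be run from the root of a Rails 3 app):
# puts mean_user_time('./script/rails runner "puts 37337"')
```

Dropping the first run is a cheap way to avoid measuring a cold disk cache, though two samples is still a very small set.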
So, here's ruby-head (with the load.c patch) against Ruby 1.8.7 and Ruby 1.9.2-p180:
ruby-head isn't quite back to Ruby 1.8.7 speeds in terms of requiring lots of files, but it's a significant improvement over 1.9.2 (a 35% improvement on the empty app and 36% on the "bigger" one).
End result? You should be getting faster load times in ruby-head and, certainly, when Ruby 1.9.3 drops, whether or not Xavier's work makes it in. In any case, congratulations are due to Xavier for pushing the issue (coincidence the load.c fix came in 4 days after his big reveal?) and for ultimately making Ruby 1.9 a faster place.
June 5, 2011 at 5:50 am
On a synthetic benchmark requiring 2500 files my patch still blitzes ruby-head by about 5s, but on a rails app it only just edges ahead (1.08 vs 1.35 for a new app, 10.49 vs 10.88 for a larger one which is 18.37s on 1.9.2-p180). Given the differences are far less dramatic now, I anticipate my patch probably won't be incorporated into a point release due to risk of regressions. Still, going forward I believe a hash/set data structure is the correct approach.
June 5, 2011 at 8:21 am
Am I missing something or are these 2 patches not necessarily mutually exclusive (maybe with some edits?).
June 5, 2011 at 9:15 am
Combine the two patches for ROFLSPEED!!
June 5, 2011 at 1:10 pm
This might be handy for others... a 1.9.2 patch of the core teams version: https://gist.github.com/1008945
June 6, 2011 at 3:32 am
Is this patch into ruby-1.9.2-head too? Or just ruby-head?
June 6, 2011 at 5:02 am
Xavier: Here's a patch which might help you understanding r31875. https://gist.github.com/1009750
Thanks for your contribution about loading time of 1.9.3. I think you saved CRuby committers from lots of claims about loading time, as well as Rails users :)
As you see in load.c (and might see in the diff I've posted,) there're lots of necessary loops and checks in load.c. We included expanded path in $LOADED_FEATURES from 1.9 to avoid double loading issue like 'require "foo"; require "FOO"; require "./foo"', so it gets a little slower from 1.8. It should be faster as 1.8 eventually of course, I want to see your efforts merged in the future.
To get it to be merged early, I hope you try to understand what's load.c is doing, and posting a patch for 1 problem step by step...
By the way, just letting you know this, $LOADED_FEATURES of JRuby is not a stock Array but an Array-like-Hash at master branch now. It gets really faster for artificial example, but there's no measurable difference for starting up big rails app such as 'slow-rails' by joevandyk
June 6, 2011 at 4:13 pm
The C-code fix is a great quick-patch, but I also agree with Xavier that it should be a set or hash. Really, any time you're using a store simply for detection of duplicates, you shouldn't be using an Array. The simplicity of Xavier's fix is great :-).
June 8, 2011 at 5:08 am
I imagine a bigger speed boost might be had if everyone started using #require_relative where possible.
June 10, 2011 at 6:28 am
Very good, a move in the right direction, but 37% compared to the scale of the problem is nothing. Depending on the view point, it is either a transition from acceptable to even better acceptable or from nonoperational to still nonoperational.
June 17, 2011 at 4:49 pm
There is another patch which gives ~40% load time improvement:
http://www.lunarlogicpolska.com/blog/2011/06/14/tracing-processes-for-fun-and-profit.html
If all of them could be combined into one (unless they conflict) then we could end up with quite huge speed improvement.
June 23, 2011 at 3:22 am
puts 37337? Don't you mean 31337? :-/
August 2, 2011 at 4:08 am
Good catch Xavier. I think the speed improvement got by your fix greatly depends on the right hash function too. If the hashing is not good, we might even end up not getting any performance improvement. Any thoughts anyone?
August 29, 2011 at 3:34 am
My load times with a fairly simple rails 3.1rc6 app. I ran this three times for each version and the variations were very minimal. Here's the results for one run.
$ ruby -v
ruby 1.9.3dev (2011-07-31 revision 32789) [x86_64-linux]
$ time ./script/rails runner "puts 37337"
37337
real 0m5.111s
user 0m4.690s
sys 0m0.390s
$ rvm use 1.9.2@rails31
Using .rvm/gems/ruby-1.9.2-p290 with gemset rails31
$ ruby -v
ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-linux]
$ time ./script/rails runner "puts 37337"
37337
real 0m8.587s
user 0m8.050s
sys 0m0.490s
I'm using
Ubuntu Natty
RVM
Rails 3.1rc6
Intel(R) Xeon(R) CPU E5504 @ 2.00GHz
16GB of RAM.
Thanks for all the work. This is great stuff!