Faster Ruby: A Ruby+OMR Retrospective

Years ago I worked on a project called Ruby+OMR. The goal of that project was to integrate Eclipse OMR, a toolkit for building fast language runtimes, into Ruby to make it faster. I worked on the JIT compiler, but there was also work to replace Ruby's garbage collector with the OMR one.

After the project had trickled to a stop, I wrote a retrospective blog post about it, but never published it. Then I moved on from IBM and started working at Mozilla on SpiderMonkey, their JavaScript engine.

Working at Mozilla, I've learned an enormous amount about how dynamic languages can be made fast, and about which kinds of changes matter most for performance.

Now feels like a reasonable time to update and expand that retrospective. Tomorrow I'll follow up with a second post about how I'd go about making Ruby fast these days, from the perspective of someone who hasn't been involved in the community for five years.

Retrospective

It has been five years since I stopped working on Ruby+OMR, which is far enough in the past that I should refresh people’s memories.

Eclipse OMR is a project that came out of IBM. The project contains a series of building blocks for building fast managed language runtimes: Garbage collection technology, JIT compiler technology, and much more.

The origin of the project was the J9 Java Virtual Machine (later open sourced as OpenJ9). The compiler technology, called Testarossa, was already a multi-language compiler, used in production IBM compilers for Java, COBOL, C/C++, PL/X and more.

The hypothesis behind OMR was this: If we already had a compiler that could be used for multiple languages, could we also extend that to other technologies in J9? Could we convert the JIT compiler, GC and other parts, turning them into a library that could be consumed by other projects, allowing them to take advantage of all the advanced technology that already existed there?

Of course, this wasn't a project IBM embarked on for 100% altruistic reasons: runtimes built on top of OMR would, by their very nature, come with good support for IBM's hardware platforms, IBM Z and POWER, which was a good thing considering that there had been challenges getting another popular language runtime onto those platforms.

To demonstrate the possibilities of this project, we attempted to connect OMR to two existing language runtimes: CPython and MRI Ruby. I honestly don't remember the story of what happened with CPython+OMR; I know it ran into more challenges than Ruby+OMR did.

My Ruby+OMR Story

By the time I joined the Ruby+OMR Project, the initial implementation was well underway, and we were already compiling basic methods.

I definitely remember working to help the project get out the door… but honestly, I have relatively little recollection of the concrete things I did in those days. Certainly I recall doing lots of work to improve performance, run benchmarks, and make it crash less.

I do know that we decided to make sure we landed with a Big Bang. So we submitted a talk to RubyKaigi 2015, the premier conference for Ruby developers in Japan, and one frequented by many of the Ruby core team.

I would give a talk on the JIT technology, and Robert Young and Craig Lehman gave a talk on the GC integration. Just days before the talks, we open sourced our Ruby work (squashing the commit history in the process, a decision which, as I try to write this retrospective, I understand and yet wish we hadn't needed to make).

I spent ages building my RubyKaigi talk. It felt so important that we put our best foot forward. I iterated on my slides many times, practiced, edited, and practiced some more.

The thing I remember most from that talk was the moment when I looked down into the audience, and saw Matz, the creator of Ruby, sitting in the front row, his head down and eyes closed. I thought I had managed to put him to sleep. Somewhere in the video of that talk you can spot it happening: Suddenly I start to stumble over my slides, and my voice jumps a half-register, before I manage to recover.

That RubyKaigi was also interesting: it was the one where Matz announced his Ruby 3x3 goal, that Ruby 3 would be 3x faster than Ruby 2.0. It seemed like our JIT compiler could be a key part of this!

We continued working on Ruby, and I returned to RubyKaigi ten months later, in September of 2016. This time I gave a talk about trying to nail down how, specifically, we would measure Ruby 3x3. To date, it is probably the favourite talk I have ever given: a relatively focused rant on the challenges of measuring computer performance and the various ways you can mislead yourself.

It was at this RubyKaigi that we had some conversations with the Ruby core team about trying to integrate OMR into Ruby core. Overall, they weren't particularly receptive. There were a number of concerns. In June of 2017, those concerns became part of a talk Matz gave in Singapore, where he called them the 'hidden rules' of Ruby 3x3:

  • Memory Requirements: He put it this way: Ruby's memory requirements are driven by Heroku's smallest dyno, which had 512 MB of RAM at the time.

  • Dependency: Ruby is long lived, almost 25 years old, and so there was a definite fear of depending on another project. He put it this way: if Ruby were to add another dependency, that dependency ought to be as stable as Ruby itself.

  • Maintainability: Overall maintainability matters: unmaintainable code stops evolution, so the Ruby core team must be able to maintain whatever JIT is proposed.

By this point, the OMR team had already scaled effort on Ruby+OMR down to effectively zero, but if we hadn't, this talk would have been the death knell for Ruby+OMR, purely on the last two points. While we had a road to improved memory usage, we were by definition a new project, and a complex one at that. We were never going to become the default JIT compiler for Ruby.

The rest of the talk focused on a project by a Toronto-based Red Hat developer named Vladimir Makarov, called MJIT. MJIT added a JIT compiler to Ruby by translating the bytecode of a Ruby method into a small C file, invoking GCC or Clang to compile that C file into a shared object, and then loading the newly compiled shared object to back the Ruby method.

Editorializing, MJIT was a fascinating approach. It's not quite a bytecode-level JIT: rather than lowering bytecode to its own intermediate representation, it hands the underlying compiler (GCC) C code that executes the same way the bytecode would, along with a precompiled header containing all the required definitions. Since GCC is looking at C code, it is free to do all sorts of interesting optimization at the C level that a bytecode-level JIT like Testarossa would never see. This turns out to be a really interesting workaround for a problem Evan Phoenix pointed out in his 2015 RubyKaigi keynote, which he called the Horizon Problem. In short, a JIT compiler can only optimize what it can see, but in a bytecode JIT for Ruby, like Ruby+OMR, huge swathes (possibly even the majority) of the important semantics are hidden away as calls to C routines, which act as barriers to optimization. MJIT's optimizations are instead limited by the contents of the precompiled header, which ultimately defines most of its optimization horizon.
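To make that concrete, here is a rough, self-contained sketch, in C, of the kind of file an MJIT-style JIT might emit for a trivial method like def add(a, b); a + b; end. To be clear, this is not MJIT's actual output: the type and helper names are stand-ins, stubbed out so the example compiles and runs on its own. In the real system the helper definitions come from the precompiled header, which is what lets GCC inline and optimize across them.

    /* Hypothetical sketch of MJIT-style generated C -- not MJIT's real output.
     * The type and helper below are stubs standing in for CRuby internals so
     * that this file compiles and runs on its own. */
    #include <stdio.h>
    #include <stdint.h>

    typedef uintptr_t VALUE;                  /* stand-in for CRuby's VALUE */

    /* Stand-in for the interpreter helper behind an instruction like opt_plus.
     * In real MJIT its definition is visible via the precompiled header, so GCC
     * can inline it and optimize across the call. */
    static VALUE vm_opt_plus_stub(VALUE a, VALUE b) {
        return a + b;                         /* the real helper handles Integer, Float, String, ... */
    }

    /* Roughly what the generated file contains: one C function per compiled Ruby
     * method, whose body replays the method's bytecode as calls to VM helpers.
     * The compiled shared object is then loaded to back the Ruby method. */
    VALUE mjit_style_add(VALUE a, VALUE b) {
        VALUE result = vm_opt_plus_stub(a, b);    /* opt_plus */
        return result;                            /* leave    */
    }

    int main(void) {
        /* Pretend we dlopen()ed the shared object and looked up the function. */
        VALUE (*compiled)(VALUE, VALUE) = mjit_style_add;
        printf("%lu\n", (unsigned long)compiled(2, 3));    /* prints 5 */
        return 0;
    }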

Furthermore, MJIT solved many of the maintenance problems that concerned the Ruby core community in a relatively nice way: because the JIT produces C code, its output can be read and debugged as C, something the Ruby core developers are obviously proficient at.

I haven't paid a lot of attention to the Ruby community since 2017, but MJIT did get integrated into Ruby and, at least according to the git history, it appears to still be maintained.

I was very excited to see Maxime Chevalier-Boisvert announce YJIT, as I loved her idea of basic block versioning, and I'm glad to see that project growing inside of Ruby. One thing that project has done excellently is involve core developers early and get into the official Ruby tree early.

What did Ruby+OMR accomplish?

Despite Ruby+OMR’s failure to form the basis of Ruby’s JIT technology, or replace Ruby’s GC technology, the project did serve a number of purposes:

  • Ruby was an important test bed for a lot of OMR. It served as a proving ground for ideas over and over again, and helped the team firm up how consumption of the OMR libraries ought to work. Ruby made OMR better by forcing us to think about and work on our consumption story.
  • We managed to influence the Ruby community in a number of ways:
    • We showed that GC technology improvements were possible, and that they could bring performance improvement.
    • We helped influence some of the Ruby community's thinking on benchmarking, with my RubyKaigi talk having been called out explicitly in the development of a Rails benchmark that was used to track Ruby performance for a few years afterward.

What should we have done differently in Ruby+OMR?

I learned a huge number of lessons from working on Ruby+OMR.

  • At the time we did the work on Ruby+OMR, the integration story between OMR and a host language was pretty weak. It required coordination between two repos, and a fairly gross layer of 'glue' code to make the two systems talk to each other.

    A new interface called JitBuilder was later developed that might have helped, but by the time it arrived on the scene we were already knee-deep in our integration into Ruby.

  • We should have made it dramatically easier, much earlier, for people to try out Ruby+OMR. The Ruby community uses version managers like RVM and rbenv to match Ruby versions to their apps, and so we would have been very well served by pushing hard to get Ruby+OMR accepted into those tools early.

  • Another barrier to having people try out Ruby+OMR with the JIT enabled was our lack of asynchronous compilation. Without it, we couldn't be run, or really even tested, on latency-sensitive workloads like a Rails server application.

    I left tackling this one far too late, and never actually succeeded in getting it up and running. For future systems, I suspect it would be prudent to tackle async compilation very early, to ensure the design can cope with it robustly (a rough sketch of what I mean by asynchronous compilation follows this list).
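For what it's worth, here is a minimal, generic sketch of the shape asynchronous compilation takes, written in plain C with pthreads; it is not OMR or CRuby code, and every name in it is made up. The point is only structural: the thread running Ruby code enqueues hot methods and keeps interpreting, while a background compiler thread produces code and publishes it when ready.

    /* Generic sketch of asynchronous JIT compilation -- not OMR or CRuby code.
     * The application thread never blocks on the compiler: it queues a request,
     * keeps interpreting, and switches over once compiled code is published. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>
    #include <unistd.h>

    typedef struct {
        const char *name;
        _Atomic(void *) compiled_entry;    /* NULL until the compiler publishes code */
    } Method;

    #define QUEUE_MAX 16
    static Method *queue[QUEUE_MAX];
    static int q_head, q_tail;
    static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t q_ready = PTHREAD_COND_INITIALIZER;

    /* Called by the application (mutator) thread when a method gets hot. */
    static void enqueue_for_compilation(Method *m) {
        pthread_mutex_lock(&q_lock);
        queue[q_tail++ % QUEUE_MAX] = m;
        pthread_cond_signal(&q_ready);
        pthread_mutex_unlock(&q_lock);
    }

    /* The background compiler: drains the queue and publishes compiled entries. */
    static void *compiler_thread(void *arg) {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&q_lock);
            while (q_head == q_tail)
                pthread_cond_wait(&q_ready, &q_lock);
            Method *m = queue[q_head++ % QUEUE_MAX];
            pthread_mutex_unlock(&q_lock);

            usleep(10 * 1000);                            /* stand-in for real compilation work */
            atomic_store(&m->compiled_entry, (void *)1);  /* stand-in for a real code pointer */
            printf("compiled %s in the background\n", m->name);
        }
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        pthread_create(&tid, NULL, compiler_thread, NULL);

        Method hot = { "Array#each", NULL };
        enqueue_for_compilation(&hot);         /* the application thread does not wait here */

        while (atomic_load(&hot.compiled_entry) == NULL)
            ;                                  /* keep interpreting bytecode as usual */
        printf("switching %s to compiled code\n", hot.name);
        return 0;
    }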

One question people have asked about Ruby+OMR is how difficult it was to keep up with Ruby's evolution. Overall, my recollection is that it wasn't too challenging, because we chose an initial compiler design that limited the challenge: Ruby+OMR produced IL from Ruby bytecode (which didn't change much release to release), and many of the Ruby bytecodes were implemented in the JIT purely by emitting calls directly into the appropriate RubyVM routines, roughly as sketched below. This meant the OMR JIT compiler kept up with relative ease, because we weren't doing much of anything fancy that would have posed a challenge. Longer term, the integration challenges would have grown, but we had hoped to end up in-tree at some point and have an easier maintenance story.
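As an illustration of what I mean, and again as a hypothetical sketch rather than actual Ruby+OMR code (the opcode and helper names are invented), the translation was essentially a walk over the bytecode in which only the cheap, stable operations got special handling, and everything else became a call back into the VM's existing C routines:

    /* Hypothetical sketch of a "mostly call back into the VM" translation
     * strategy -- not actual Ruby+OMR code; opcode and helper names are invented.
     * Instead of emitting real compiler IL, this just prints what would be emitted. */
    #include <stdio.h>

    typedef enum { OP_GETLOCAL, OP_OPT_PLUS, OP_SEND, OP_LEAVE } Opcode;

    /* Stand-in for "emit IL that calls this VM helper with the current frame". */
    static void emit_call_to_vm_helper(const char *helper) {
        printf("  call %s(frame)\n", helper);
    }

    static void translate(const Opcode *bytecode, int length) {
        for (int i = 0; i < length; i++) {
            switch (bytecode[i]) {
            case OP_GETLOCAL: printf("  load local slot\n");         break;  /* cheap op: inline it         */
            case OP_OPT_PLUS: emit_call_to_vm_helper("vm_opt_plus"); break;  /* semantics stay in the VM    */
            case OP_SEND:     emit_call_to_vm_helper("vm_send");     break;  /* ditto for full method calls */
            case OP_LEAVE:    printf("  return top of stack\n");     break;
            }
        }
    }

    int main(void) {
        /* Roughly the bytecode for: def add(a, b); a + b; end */
        Opcode add_method[] = { OP_GETLOCAL, OP_GETLOCAL, OP_OPT_PLUS, OP_LEAVE };
        translate(add_method, 4);
        return 0;
    }

The flip side of that simplicity, as noted earlier, is that those helper calls are exactly the kind of optimization horizon Evan Phoenix described.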

Conclusion

I greatly enjoyed working on Ruby+OMR, and I believed for the majority of my time working on it that we were serious contenders to become the default JIT for Ruby. The Ruby community is a fascinating group of individuals, and I really enjoyed getting to know some people there.

Ultimately, the failure of the Ruby+OMR project came down, in my opinion, to our lack of maturity. We simply hadn't nailed down a cohesive story to tell projects that was compelling rather than scary. It's too bad, as there are still pieces of the Testarossa compiler technology that I miss, almost five years after I stopped working with it.

Edit History

  • Section on MJIT updated August 8, 2022, 10:45am, to clarify a bit what I found to be special about MJIT, after an illuminating conversation with Chris Seaton on Twitter.