Exploring a language runtime with bpftrace

So I have been having quite a bit of fun learning about eBPF. It’s been on my todo list for two or three years, but I’ve finally made the time to start figuring it out, and I've already found some neat stuff and had to do some fun hacking.

Let’s set the stage.

eBPF is an instrumentation and tracing system built into Linux (and Windows these days!). The general thrust of it is that you provide the kernel with some bytecode, which it verifies, then executes in kernel context.

You’re able to collect all sorts of information via this bytecode, which can then be dumped out. There are multiple ways to get BPF bytecode into the kernel, but I’m going to talk about bpftrace, which provides an awk-like language for writing BPF programs.

Now, there’s a fair amount of reasonably comprehensive documentation for bpftrace, so I’ll give only the tiniest intro and then get into some of the problems I’ve been curious about and the solutions I’ve built around BPF to answer them.

Your basic bpftrace program looks like this:

some:probe  /* Probe */
{
  /* Action block */
  @start[tid] = nsecs
}

some:other:probe
/@start[tid]/ /* filter: only fire when we have a nonzero start timestamp for this tid */
{
  $duration = nsecs - @start[tid]
  print($duration)
  delete(@start[tid]) // remove the entry so we can measure the next duration
}

A couple of things worth noting in this sketch:

  • Variables that start with @ are maps: global associative arrays. Variables that start with $ are locals, scoped to a single action block.
  • /something/ is a filter (a predicate) applied to a probe; the action only runs when it’s true.
  • There are many builtin functions (print, delete) and builtin variables like nsecs, which gives a nanosecond timestamp, and tid, which gives the current thread id.
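To make the sketch concrete, here’s a runnable version of the same start/stop pattern that times read() syscalls using the kernel’s syscall tracepoints (my own illustration, not from SpiderMonkey; assumes a kernel with the syscalls tracepoints enabled):

```bpftrace
tracepoint:syscalls:sys_enter_read
{
  @start[tid] = nsecs
}

tracepoint:syscalls:sys_exit_read
/@start[tid]/
{
  // accumulate a histogram of read() latency, in microseconds
  @usecs = hist((nsecs - @start[tid]) / 1000)
  delete(@start[tid])
}
```

bpftrace prints any maps left over (here, @usecs) when the program exits, so there’s no explicit print needed in this version.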

Rooted profiling

SpiderMonkey is a precisely rooted JS engine, so we have a Rooted type that informs the GC about C++ object pointers. One thing we've noticed is that occasionally the Rooted constructor shows up in profiles; yet it's also fast enough that it doesn't show up often. We also have a tool, RootedTuple, which allows us to reduce rooting cost in a single stack frame at the cost of slightly uglier code.

So I’ve been curious for a while about “where are all these roots coming from?”, i.e., which source locations most frequently construct Rooted values. That link points to a bug which hypothesizes we could do this with std::source_location in C++20, but given we don’t yet have C++20 in SpiderMonkey I sort of stopped there. Until I started learning bpftrace, at which point I realized I probably could make this work.

My initial thinking looked roughly like this:

uprobe:<path to firefox>:Rooted::Rooted
{
    $filename = ???
    $lineno = ???
    @call_counts[$filename, $lineno]++
}

uprobe here is a “user probe”, which attaches to a function symbol in a user-space program (as opposed to a kprobe, which probes the kernel).
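Before attaching anything, it’s worth knowing that bpftrace can list the probes available in a binary, which is a quick way to check that the symbol you want actually exists (the path here is illustrative, not from the post):

```bpftrace
# list uprobe-able symbols in a binary, filtered by a glob
sudo bpftrace -l 'uprobe:/path/to/js:*Rooted*'
```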

Of course, I couldn’t figure out how to get a filename or line number. Eventually I figured out that there’s a builtin function which behaves pretty much how I want: ustack(n), which returns the top n user-space stack frames and can be used as a map key.

For reasons I totally forget, I decided to instead target the registerWithRootLists function that Rooted constructors all actually call into. My first almost-working probe looked like this:

uprobe:/home/matthew/unified-git/obj-debug-shell-x86_64-pc-linux-gnu/dist/bin/js:*registerWithRootLists*
{
   @call[ustack(perf,3)] = count()
}


uprobe:/home/matthew/unified-git/obj-debug-shell-x86_64-pc-linux-gnu/dist/bin/js:cpp:JS_ShutDown
{
printf("calling exit from ebpf\n");
   exit()
}

after I marked registerWithRootLists as MOZ_NEVER_INLINE.

This worked, but gave me an unsymbolicated result:

<a whole bunch of low counts> 
...
@call[
        55912bcaef60 0x55912bcaef60 ([unknown])
        55912bc97acd 0x55912bc97acd ([unknown])
        55912c1e7598 0x55912c1e7598 ([unknown])
]: 837
@call[
        55912bd662e0 0x55912bd662e0 ([unknown])
        55912bd476ac 0x55912bd476ac ([unknown])
        55912bf9b415 0x55912bf9b415 ([unknown])
]: 1401
@call[
        55912bd662e0 0x55912bd662e0 ([unknown])
        55912bde2918 0x55912bde2918 ([unknown])
        55912c053dbd 0x55912c053dbd ([unknown])
]: 1538

Eventually, thanks to an incredibly helpful pointer from Zixian Cai on Mastodon, I discovered that 1) you need to explicitly print the map, and 2) you need to do it while the process is still alive. Along with discovering that I could generalize my probe file by using the positional parameter $1, my final probe file was

uprobe:$1:*registerWithRootLists*
{
   @call[ustack(perf,3)] = count()
}


uprobe:$1:cpp:JS_ShutDown
{
   print(@call)
}

END {
    // Clear the map so bpftrace doesn't print it again on exit.
    clear(@call)
}

which was run like this: sudo bpftrace rootTrace.bt programUnderTrace -c 'programUnderTrace …'. To make sure the process stayed alive long enough for the print, I made it sleep for 2 seconds after the call to JS_ShutDown.

I then ran the profiler against every JetStream3 subtest and produced a bunch of reports. From those I was able to open a family of bug reports highlighting, for example, places where we could remove 75,000,000 calls to registerWithRootLists by switching to RootedTuple.

This was a cool success, and I was high on it. So next I started writing an allocation profiler.

However, that’s a blog post for later, as it involves some fun explorations in writing the tool you need, and this post has gotten long enough.