Some notes on CMake variables and scopes

I've been doing a lot of work on CMake for Eclipse OMR for the last little while.

CMake is a really ambitious project that accomplishes so much with such simplicity it's like magic... so long as you stay on the well trodden road. Once you start wandering into the woods, because your project has peculiar needs or requirements, things can get hairy pretty quickly.

There's a pretty steep ramp from "This is amazing, a trivial CMakeLists.txt builds my project" to "How do I do this slightly odd thing?"

We'll see how much I end up talking about CMake, but I'll start with a quick discussion of variables and scopes in CMake.

Variables and scopes in CMake

First, a quick note of caution: Variables exist in an entirely separate universe from properties, and so what I say about variables may well not apply to properties, which I am much less well versed in.

Variables are set in CMake using set:

set(SOME_VARIABLE <value>)

The key to understanding variables in CMake in my mind is to understand where these variables get set.

Variables are set in a particular scope. I am aware of two places where new scopes are created:

  1. When add_subdirectory is used to add a new directory to the CMake source tree and
  2. When invoking a function

Each scope when created maintains a link to its parent scope, and so you can think of all the scopes in a project as a tree.

Here's the trick to understanding scopes in CMake: Unlike other languages, where name lookup would walk up the tree of scopes, each new scope is a copy by value of the parent scope at that point. This means add_subdirectory and function inherit the scope from the point where they're called, but modification will not be reflected in the parent scope.

This actually can be put to use to simplify your CMakeLists.txt. A surprising amount of CMake configuration is still done only through what seem to be 'global' variables -- despite the existence of more modern forms. i.e despite the existence of target_compile_options, if you need to add compiler options only to a C++ compile, you'll still have to use CMAKE_CXX_FLAGS.

If you don't realize, as i didn't, that scopes are copied-by-value, you may freak out at contaminating the build flags of other parts of a project. The trick is realizing that the scope copying limits the impact of the changes you make to these variables

Parent Scope

Given that a scope has a reference to the scope it was copied from, it maybe isn't surprising that there's a way in CMake to affect the parent scope:

set(FOO <foo value> PARENT_SCOPE)

Sets FOO in the parent scope... but not the current scope! So if you're going to want to read FOO back again, and see the updated value, you'll want to write to FOO without PARENT_SCOPE as well.

Cache Variables

Cache variables are special ones that persist across runs of CMake. They get written to a special file called CMakeCache.txt.

There's a little bit different about cache variables. They're typed, as they interact with CMake's configuration GUI system), as well they tend to override normal variables (which makes a bit of sense). Mostly though, on the subject, I'll defer to the documentation!

Scope Tidbits:

There's a couple other random notes related to scoping I'd like to share.

  1. It appears that not all scopes are created equal. In particular, it appears that targets will always use target-affecting variables from the contained directory scope, not function scopes.

    function(add_library_with_option)
        set(CMAKE_CXX_FLAGS "-added-option)
        add_library(foo T.cpp) 
     endfunction(add_library_with_option)

    It's been my experience that the above doesn't work as expected, because the add_library call doesn't seem to see the modification of the CXX flags.

  2. Pro Tip: If anything has gone wrong in your CMakeLists.txt, try looking in the cache! It's just a text file, but can be crazy helpful to figure out what's going on.

Paper I Love: "The Challenges of Staying Together While Moving Fast"

(Prefix inspired by Papers We Love)

I recently had the opportunity to meet Julia Rubin when she was meeting at IBM. While we met for only a few minutes, we had a great (albeit short) conversation, and I started looking into her publications. With only one read, (out of a small pile!) I've already found a paper I want to share with everyone: 

"The Challenges of Staying Together While Moving Fast: An Exploratory Study" - Julia Rubin and Martin Rinard (PDF)

This paper speaks to me: It really validates many of my workplace anxieties, and assures me that these feelings are quite universal across organizations. The industry really hasn't nailed building large software products, and there's a lot of work that could be done to make things better. 

The paper includes a section titled "Future Opportunities". I hope academia listens, as there are great projects in there with potential for impact on developers lives. 

Lambda Surprise

Another day of being surprised by C++

typedef int (*functiontype)();
int x = 10;

functiontype a,b;
a = []() -> int { return 10; };   // OK
b = [&x]() -> int {return x; }; // Type error

My intuition had said the latter should work; after all, the only thing that changed was the addition of a capture expression.

However, this changes the type of the lambda so that it's no longer coercable to a regular function (and in fact, others I've talked to suggest that the surprising thing is that the assignment to a even works.)

sigh.

There is a work around:

#include <functional> 
typedef std::function<int()> functiontype;

It's funny: Before working with C++ lambdas, I had been thinking they would provide huge amounts of power when working with (and possibly providing!) C/C++ API interfaces. Alas, they are special beasts.

Debugging a Clang plugin

Writing this down here so that maybe next time I google it, I will find my own page :)

The magical incantation desired is

$ gdb --args clang++ .... 
Reading symbols from clang++...(no debugging symbols found)...done.
(gdb) set follow-fork-mode child

Doing this means that gdb will not get stuck just debugging the clang driver!

Going to be speaking at the Ruby devroom 2016!

I will be speaking this year at the FOSDEM Ruby devroom about the challenges the Ruby+OMR JIT compiler faces, and how they can be surmounted with your help! The abstract is below, or on the FOSDEM website. 

Highly Surmountable Challenges in Ruby+OMR JIT Compilation

The Ruby+OMR JIT compiler adds a JIT to CRuby. However, it has challenges to surmount before it will provide broad improvement to Ruby applications that aren’t micro-benchmarks. This talk will cover some of those challenges, along with some brainstorming about potential ways to tackle them.

The Ruby+OMR JIT compiler is one way to add JIT compilation to the CRuby interpreter. However, it has a number of challenges to surmount before it will provide broad improvement to Ruby applications that aren’t micro-benchmarks. This talk will cover some of those challenges, along with some brainstorming about potential ways to tackle them.

The challenges range from small to large. You can get a sneak peek by looking through the issue tracker for the Ruby+OMR preview.  

Boobytrapped Classes

Today I learned: You can construct Boobytrapped class hierarchies in C++.

Here's an example (Godbolt link)

#include <iostream> 
struct Exploder { 
 // Causes explosions! 
}; 

struct Unexploder { 
  void roulette() {} 
};

template<class T>
struct BoobyTrap : public T { 
  /* May or may not explode. 
  */
  void unsafe_call () { exploder(); }
  void safe_call() {} 

  private: 

  void exploder() { T::roulette(); } 
}; 

int main(int argc, char** argv) { 
    BoobyTrap<Unexploder> s; 
    s.safe_call();
    s.unsafe_call(); // Click! We survived! 

    BoobyTrap<Exploder> unsafe;
    unsafe.safe_call(); 

    // Uncomment to have an explosion occur. 
    // Imagine this with conditional compilation?
    // unsafe.unsafe_call(); 
    return 0;
}

The wacky thing here is that you can totally use the safe_call member function of the BoobyTrapped class independent of parent class -- because unsafe_call is only expanded and substituted if you call it!

This feels awkward, because it divides the interface of BoobyTrap into callable and uncallable pieces. I cant decide if I think this is a good idea or bad idea.

Pro:

  • You can knit together classes, and so long as the interfaces match enough so that the interfaces work, you're OK.

Con:

  • Feels like Fragile Base class ++

Thanks to Leonardo for pointing this out!

Commits vs. Pull Requests

When working on an open source project, you face questions of how to make your work consumable. I've been watching the process in my work on Eclipse OMR, and I've come up with my personal mental model: 

  • Commits delimit atomic changes to the source code. Each commit should build, and contains one state transition for the code. Each commit message provides an opportunity to explain your reasoning for making the change. 
  • Pull Requests group together a set of related commits; For example, a feature, or a group of cleanup commits. These commits can be tested and discussed as a whole, and therefore merged as a whole.

Of course, this model is heavily driven by our Pull Request workflow at work.  

Getting Ready

Now that I'm working much more in the open on Eclipse OMR, I am getting ready to start blogging more about tech, and things from my day job.

I suspect lots of small posts, tips and tricks, etc. We'll see how it goes. A separate section to keep things a bit isolated from my other blog.