Playing around with semgrep

Semgrep seems to be a pretty cool-looking tool that lets you do semantic grep. It appears to be designed to help find vulnerable usage patterns in source code, but the mere ability to pair some level of semantic understanding with a grep-like interface is quite appealing.

I'm looking at ES2015+ features right now, and was curious whether any of the benchmarks we've got checked into mozilla-central use rest parameters. With the help of the people at r2c on their community Slack (see the link at the top right of semgrep.dev), we came up with the following semgrep pattern:

function $FUNC(..., ...$MORE) { ... }

This matches any function declaration which takes a rest parameter.
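
To make that concrete, here is a small JavaScript sketch of the kind of code the pattern is aimed at. The function names are made up for illustration, and the comments about what should or shouldn't match reflect my reading of the pattern (the leading ... standing for zero or more ordinary parameters, ...$MORE for the rest parameter), not verified semgrep output.

```javascript
// Hypothetical examples; none of these names come from mozilla-central.

// Rest parameter only: expected to match.
function sum(...nums) {
  return nums.reduce((total, n) => total + n, 0);
}

// Ordinary parameters followed by a rest parameter: expected to match.
function tagged(label, ...values) {
  return label + ":" + values.join(",");
}

// No rest parameter at all: expected not to match.
function pair(a, b) {
  return [a, b];
}

console.log(sum(1, 2, 3));       // 6
console.log(tagged("xs", 1, 2)); // xs:1,2
console.log(pair(1, 2).length);  // 2
```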

Running it across our performance tests gives exactly what I was hoping for:

$ semgrep -e 'function $FUNC(..., ...$MORE) { ... }' -l javascript third_party/webkit/PerformanceTests/
third_party/webkit/PerformanceTests/ARES-6/Air/reg.js
91:    function newReg(...args)
92:    {
93:        let result = new Reg(...args);
94:        Reg.regs.push(result);
95:        return result;
96:    }

third_party/webkit/PerformanceTests/ARES-6/Air/util.js
32:function addIndexed(list, cons, ...args)
33:{
34:    let result = new cons(list.length, ...args);
35:    list.push(result);
36:    return result;
37:}

third_party/webkit/PerformanceTests/ARES-6/Babylon/air-blob.js
91:    function newReg(...args)
92:    {
93:        let result = new Reg(...args);
94:        Reg.regs.push(result);
95:        return result;
96:    }

third_party/webkit/PerformanceTests/ARES-6/Basic/benchmark.js
35:        function expect(program, expected, ...inputs)
36:        {
37:            let result = simulate(prepare(program, inputs));
38:            if (result != expected)
39:                throw new Error("Program " + JSON.stringify(program) + " with inputs " + JSON.stringify(inputs) + " produced " + JSON.stringify(result) + " but we expected " + JSON.stringify(expected));
40:        }

third_party/webkit/PerformanceTests/ARES-6/glue.js
37:function reportResult(...args) {
38:    driver.reportResult(...args);
39:}

As is, though, this doesn't cover all the cases I'm interested in. For example, what if someone defines an arrow function that takes a rest parameter?

(...rest) => { return rest[0]; }

Or, worse, an arrow function whose body is a single expression, where the braces are optional:

(...rest) => rest[0]
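
As far as JavaScript is concerned, both spellings define the same function, which is why a search for rest parameters would want to catch each of them. A quick sketch, with hypothetical variable names:

```javascript
// Block-bodied arrow function with a rest parameter.
const firstBlock = (...rest) => { return rest[0]; };

// Expression-bodied arrow function: same behaviour, no braces.
const firstExpr = (...rest) => rest[0];

console.log(firstBlock("a", "b")); // a
console.log(firstExpr("a", "b"));  // a

// With no arguments, rest is an empty array, so rest[0] is undefined.
console.log(firstExpr());          // undefined
```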

For more complicated cases, semgrep supports boolean combinations of patterns (e.g. pattern1 || pattern2). I wasn't able to get this working because of some bugs in how it parses arrow functions, but nevertheless, this is a promising and neat tool!

They've got a live editor for it set up at semgrep.dev/write to dork around with.