Using recursion combinators in JavaScript

30 Jun, 2008 · 11 min read

Contents

In the previous post we explored “array extras” and how they can help us to write concise yet performant and clean code. In this post we take a look at generalizing recursive algorithms with recursion combinators — high-level functions that encapsulate all boilerplate code needed to set up the recursion. These functions were added to dojox.lang.functional and will be officially released with Dojo 1.2.

In general the recursion is a form of iterative problem solving in the same category as loops. There are two major natural sources of recursive algorithms: recursive data structures (lists, trees, and so on), and recursive definitions (factorial, Fibonacci numbers, the GCD algorithm, etc.). The recursion plays a prominent role in the functional programming (FP), and one of the best articles on this topic is “Recursion Theory and Joy” by Manfred von Thun , the creator of Joy (a purely functional programming language with Forth-like syntax). Manfred’s article explains intricacies of recursion including the venerable Y combinator , recursion combinators in general, and introduces a practical set of recursion combinators, which will guide us in this post.

FP programmers spent a lot of time advancing the theory of programming, and it shows. Studying functional languages gives you better understanding of computing, and provides elegant solutions to many difficult questions. Yet there is a problem of applicability: some solutions are not applicable nor practical for multi-paradigm languages like JavaScript. For example, while it is simple to replicate monads , it doesn’t make much sense: monads are mostly used to express sequential computations, introduce side-effects like input/output, and implement state. We can do all these things directly without artificial constructs. The same goes for the Y combinator, which solves the problem of self-reference to yet undefined or incomplete function — there is the direct way to do it in JavaScript (hint: arguments.callee ).

The simplest kind of recursion is a linear recursion (linrec). This is a pattern when a function calls itself once, or terminates when some condition is met. Basically it means that in order to do any linear recursion function we need four non-recursive functions:

a stop condition,
a “then” function, which is called when condition is met before stopping the recursion, produces a final value,
a “before” function, which is called before the recursive call to produce new arguments,
an “after” function, which is called after the recursive call to process the returned value.

Let’s code it up using lambdas (read more on them in Functional fun in JavaScript with Dojo ):

var df = dojox.lang.functional;

var linrec = function(cond, then, before, after){
    var cond   = df.lambda(cond),
        then   = df.lambda(then),
        before = df.lambda(before),
        after  = df.lambda(after);
    return function(){
        if(cond.apply(this, arguments)){
            return then.apply(this, arguments);
        }
        var args = before.apply(this, arguments);
        var ret  = arguments.callee.apply(this, args);
        return after.call(this, ret, arguments);
    };
};

The code is simple, yet flexible. Let’s go over all 4 parameter functions:

cond() takes all parameters passed to the linrec() and returns a Boolean value. If it is true, we stop recursions and call then(), otherwise we proceed to before().
then() takes all parameters passed to the linrec() and returns a value, which in turn will be the returned value of the linrec().
before() sets up the recursion arguments. It takes all parameters passed to linrec() and returns an array of new parameters to call linrec() recursively.
When the recursive call is finished we process its return with after(). It takes two parameters: the returned value, and the array of all arguments passed to our linrec(). It returns a new value, which will be the returned value of linrec().

Let’s see how we can code well-known algorithms using linrec():

// factorial
var fact0 = function(n){
    return n <= 1 ? 1 : n * arguments.callee.call(this, n - 1);
};

var fact1 = linrec("<= 1", "1", "[n - 1]", "m * n[0]");

// find the greatest common divisor using the Euclidean algorithm
var gcd0 = function(a, b){
    return b == 0 ? a : arguments.callee.call(this, b, a % b);
};

var gcd1 = linrec("a, b -> b == 0", "a", "a, b -> [b, a % b]", "x");

As you can see using linrec() is pretty straight-forward. But gsd0() demonstrates a very important case: a tail recursion. In terms of linrec() it means that we don’t process the result of the recursive call, but return it directly ⇒ we don’t need after() anymore. Let’s code it up:

var tailrec = function(cond, then, before){
    var cond   = df.lambda(cond),
        then   = df.lambda(then),
        before = df.lambda(before);
    return function(){
        if(cond.apply(this, arguments)){
            return then.apply(this, arguments);
        }
        var args = before.apply(this, arguments);
        return arguments.callee.apply(this, args);
    };
};

Let’s see our previous examples coded with tailrec():

// factorial: the tail recursive version with an accumulator
var fact2Aux = function(n, acc){
    return n <= 1 ? acc : arguments.callee.call(this, n - 1, n * acc);
};
var fact2 = function(n){ return fact2Aux(n, 1); }

var fact3Aux = tailrec("<= 1", "a, b -> b", "[n - 1, n * acc]");
var fact3 = function(n){ return fact3Aux(n, 1); }

var gcd2 = tailrec("a, b -> b", "a", "a, b -> [b, a % b]");

What does it buy us? The recursive part is encapsulated in linrec() or tailrec(), and we saved some bytes on the boilerplate. That’s it? Wait, there is more.

The recursive solutions sound like fun, but in the real life we have to take into account harsh realities:

Recursions can be expensive: stack frames are allocated, internal variables are allocated, we don’t reuse unneeded variables of the previous stack frame, and so on.
Usually there is a restriction on how many times we can recurse.

How severe are restrictions on number of recursions? We can write a super-simple program to find out:

rec = function(n){
    if(n <= 1){ return 0; }
    return rec(n - 1) + 1;  // to eliminate possible tail recursion optimization
};

Trying different n we can find a limit. I tried this code on different browsers. Results are below:

Browser	Windows	Linux	Macintosh
Firefox 2	1,000	1,000	—
Firefox 3	3,000	3,000	3,000
IE 6	2,542	n/a	n/a
IE 7	2,556	n/a	n/a
IE 8.0.6001.17184 Beta	2,385	n/a	n/a
Opera 9.5	9,997	9,999	—
Safari 3.1.1	499	n/a	499
Midori 0.0.18/Webkit 1.0.1	—	149,794	—

Practically all browsers limit the depth of recursion at ~3,000. Safari is the weak spot with ~500. Opera looks good with ~10,000 but it aborted the JavaScript interpreter thread that ran the test, while other browsers threw an exception — the “nuclear” response.

Let’s convert recursive algorithms into loops. Now we can easily do that, because by abstracting the recursive algorithms we set us up for improving their implementation without changing their interface. These are “loop” versions:

var linrec = function(cond, then, before, after){
    var cond   = df.lambda(cond),
        then   = df.lambda(then),
        before = df.lambda(before),
        after  = df.lambda(after);
    return function(){
        var args = arguments, top, ret;
        // 1st part
        for(; !cond.apply(this, args); args = before.apply(this, args)){
            top = {prev: top, args: args};
        }
        ret = then.apply(this, args);
        //2nd part
        for(; top; top = top.prev){
            ret = after.call(this, ret, top.args);
        }
        return ret;
    };
};

var tailrec = function(cond, then, before){
    var cond   = df.lambda(cond),
        then   = df.lambda(then),
        before = df.lambda(before);
    return function(){
        var args = arguments;
        for(; !cond.apply(this, args); args = before.apply(this, args));
        return then.apply(this, args);
    };
};

By eliminating the recursion we used an explicit stack (a list, really) in linrec to hold some necessary variables. But tailrec was converted without any linear structures. Now you can see why the tail recursion is the preferred form of recursion for many — it doesn’t require any intermediate storage for iterations, and can be optimized easily.

Is it all we can do to make it faster? No. As you’ve noticed we worked with lambdas, which allow to represent functions in a compact textual notation. We can easily inline them saving on calling external functions for small operations. Of course the inlining works only for text snippets, regular functions will be called as usual.

But before taking a look at numbers let me introduce two more recursion combinators — binrec, and multirec, which are here to deal with binary and generic tree-like recursion respectively:

var binrec = function(cond, then, before, after){
    var cond   = df.lambda(cond),
        then   = df.lambda(then),
        before = df.lambda(before),
        after  = df.lambda(after);
    return function(){
        if(cond.apply(this, arguments)){
            return then.apply(this, arguments);
        }
        var args = before.apply(this, arguments);
        var ret1 = arguments.callee.apply(this, args[0]);
        var ret2 = arguments.callee.apply(this, args[1]);
        return after.call(this, ret1, ret2, arguments);
    };
};

var multirec = function(cond, then, before, after){
    var cond   = df.lambda(cond),
        then   = df.lambda(then),
        before = df.lambda(before),
        after  = df.lambda(after);
    return function(){
        if(cond.apply(this, arguments)){
            return then.apply(this, arguments);
        }
        var args = before.apply(this, arguments),
            ret  = new Array(args.length);
        for(var i = 0; i < args.length; ++i){
            ret[i] = arguments.callee.apply(this, args[i]);
        }
        return after.call(this, ret, arguments);
    };
};

Pay attention to different signatures of before() and after() functions.

Let’s implement the Fibonacci algorithm with binrec:

var fib0 = function(n){
    return n <= 1 ? 1 :
        arguments.callee.call(this, n - 1) +
            arguments.callee.call(this, n - 2);
};

var fib1 = binrec("<= 1", "1", "[[n - 1], [n - 2]]", "+");

Yes, it is that simple.

All 4 recursion combinators implemented with loops and possible lambda inlining are the part of dojox.lang.functional package:

dojo.require("dojox.lang.functional.linrec");
dojo.require("dojox.lang.functional.tailrec");
dojo.require("dojox.lang.functional.binrec");
dojo.require("dojox.lang.functional.multirec");

And now it is time for obligatory numbers. I wrote a simple program , which measures performance of different versions of factorial functions, and Fibonacci functions. Here are the results for the factorial:

OS	Browser	raw rec	raw loop	linrec rec	linrec loop	linrec	tailrec rec	tailrec loop	tailrec	multirec rec	multirec loop	multirec
Windows	IE 6	703	78	2,266	1,875	782	2,157	1,422	531	3,312	3,219	2,063
	IE 7	719	94	2,328	2,047	765	2,531	1,828	562	3,390	3,390	1,985
	IE 8 Beta	516	47	1,344	1,078	594	1,297	734	406	2,140	2,125	1,640
	FF 1.5	625	31	1,281	1,156	657	1,469	766	532	2,172	2,094	1,578
	FF 2	984	47	2,860	2,406	1,984	2,469	1,281	1,359	4,968	5,406	4,859
	FF 3	613	8	247	172	131	694	359	343	576	417	903
	Opera 9.5	94	0	188	156	94	203	109	63	328	328	250
	Safari 3.1.1	62	0	281	94	78	188	63	31	329	234	109
Linux	FF 2	819	42	2,015	1,658	1,033	2,155	1,175	790	3,362	3,379	2,245
	FF 3	462	26	916	1,138	325	1,440	397	250	2,178	1,054	745
	Opera 9.5	133	24	445	382	182	411	263	123	673	763	528
	Midori 0.0.18	154	11	343	194	99	347	116	76	469	329	249

And here are the results for the Fibonacci:

OS	Browser	raw rec	raw tail	raw loop	binrec rec	binrec loop	binrec	tailrec rec	tailrec loop	tailrec	multirec rec	multirec loop	multirec
Windows	IE 6	1,125	453	47	5,953	5,109	2,328	1,187	766	312	6,609	7,094	3,719
	IE 7	1,219	407	47	6,000	4,938	2,485	1,219	750	313	6,781	6,938	3,875
	IE 8 Beta	797	281	15	3,609	2,985	1,953	750	406	235	4,344	5,844	3,093
	FF 1.5	797	282	31	3,594	3,156	2,203	750	438	343	4,422	4,032	3,266
	FF 2	1,656	672	15	8,860	8,234	7,000	1,610	672	672	11,985	11,328	8,390
	FF 3	449	93	11	1,240	845	1,257	228	88	48	1,332	653	1109
	Opera 9.5	140	32	0	500	484	250	93	62	32	547	671	515
	Safari 3.1.1	156	63	0	547	313	203	172	47	47	656	343	250
Linux	FF 2	1,343	346	36	5,858	4,942	3,188	1,064	776	464	6,787	6,317	4,459
	FF 3	718	914	12	3,067	2,134	1,756	573	228	152	3,484	3,307	2,025
	Opera 9.5	312	87	11	1,173	1,041	544	260	167	79	1,364	1,392	993
	Midori 0.0.18	268	116	8	831	457	288	219	103	65	1,003	574	394

The legend:

raw rec is the manually-written recursive version of an algorithm,
raw tail is the tail-recursive version of an algorithm,
raw loop is the loop-based version of an algorithm,
XXXrec rec is the naïve recursive version of linrec, tailrec, binrec, or multirec given above,
XXXrec loop is the loop-based version of respective recursion combinators,
XXXrec is the version from dojox.lang.factorial, which implements an optional inlining,
Midori is the Webkit-based web browser for Linux,
multirec() doesn’t match the factorial nor Fibonacci algorithms, I used it just to see how (badly) it stacks up against normal methods,
all numbers were taken on the same machine under Linux, and Windows (the latter was running under VMware),
all numbers were taken on the 3rd run,
all numbers are in milliseconds.

You are invited to check out the code of the test program . I didn’t massage the numbers and you can spot certain artifacts, when the results look a little strange. Numbers change from run to run, and if we were unlucky and caught some massive garbage collection, the numbers would be a little bit inflated. Unfortunately most browsers don’t allow JavaScript to run more than several seconds in a row, and show an alert about “unresponsive script” making the result invalid. But the numbers are real, just disregard small differences due to inaccuracy if timers, and look for major trends.

The results are in.

As you can see while special versions of these simple functions written with the knowledge of their respective domains are extremely fast, our optimized generic versions can match and in some cases exceed the naïve versions without investing a lot of time in writing them. Obviously, if we are to test more calculation-heavy recursive functions (both factorial and Fibonacci use very light-weight snippets), the difference between the generic version and the original version will be much smaller, if any.

The conclusion: it pays off to abstract algorithms, because you can improve them independently of the applications they are used in.

Using recursion combinators in JavaScript

Popular tags