Functional fun in JavaScript with Dojo

Everybody knows that JavaScript is a multi-paradigm language, and it can be used to program functionally. Practically all functional idioms can be used directly: higher-order functions, recursion, closures, and so on. The recent resurgence of Functional Programming (FP) brings functional methodologies in the mainstream. FP fundamentals gave us a lot of powerful idioms: iterative functions, which can replace loops, list processing in general, function manipulations, and many other things, which helps us to keep our code small yet concise, more powerful, and more fun. Let’s take a look at how Dojo helps to leverage the functional paradigm in the Core, and in the extended DojoX package (dojox.lang.functional).

One of the low hanging fruits is a functional decomposition aimed at reducing the complexity of algorithm implementations. For imperative and OOP programs in most cases it means getting rid of explicit loops. From my practice loops are a common source of errors because they involve a lot of low-level state-related details: an index variable, how it is initialized and advanced, checked for bounds, and so on. And don’t forget that index variables are scope-global, like all other variables in JavaScript. In many cases I don’t want to be concerned with such mundane low-level details. I don’t even care about an index variable, if all I want is to go over items in my container. In some cases I don’t even care about an order I go over the items. Another bad thing about loops is their poor maintainability: such code is hard to refactor, especially if it means moving nested loops. In this case we need to pay special attention to duplicated index variables (which were utilitarian to begin with!), mass find/replace is our friend, and so on — essentially we do a full-blown code graft instead of a simple move.

In order to use FP idioms we need tools for that. This problem was long recognized by the community, and JavaScript was augmented on several occasions:

JS 1.6 (in Firefox 1.5) introduced so-called Array extras: special Array methods, which help to simulate lists with arrays: indexOf(), lastIndexOf(), every(), some(), filter(), map(), forEach(). The last five methods are especially important because they help to eliminate the most common direct loops.
JS 1.7 (in Firefox 2) introduced Array comprehensions borrowed from Python. The new syntax allows to generate arrays using a compact yet clear notation reducing the possibility of errors. And of course iterators and generators will helps us with cleaner loops too. Another goody is the block scope with “let”.
JS 1.8 (in Firefox 3) brought us more Array extras: reduce() and reduceRight(). They give us a native support for all-important folds. Another notable additions are expression closures (simplified one-line functions), and generator expressions.
JS 2 (ES4 PDF) takes us even farther: for each statement, tail calls, and the whole raft of language improvements. Presumably JS 2 will come with the next generation of JavaScript virtual machines helping to reduce penalties for using new abstractions.

But… In most cases we cannot mandate what browser to use to our users. A lot of people still use Firefox 1.5, Opera doesn’t implement these new features, Safari implements some of them, and the majority of users come with Internet Explorer, which implements only a subset of ES3. We have no choice but to implement this functionality by ourselves.

Dojo made an important step in this direction by implementing JS 1.6 Array extras in Dojo Core. If you include dojo.js you already have them, no extra scripts are necessary. Let’s take a look at our favorite five functions: filter(), map(), forEach(), every(), and some().

The filter() function creates a new array with elements flagged by the callback:

JS 1.6: newArray = array.filter(callback[, thisObject])
Dojo:   newArray = dojo.filter(array, callback[, thisObject])

The prototype is self-descriptive. thisObject is an optional argument, which provides a context for the callback function. The callback function is called with following arguments:

callback.call(thisObject, array[i], i, array)

where i is the venerable index variable of numeric type. Of course we are free to ignore unneeded arguments:

dojo.filter([1, 2, 3], function(x){ return x % 2; })

The fragment above filters the array for odd numbers. It returns [1, 3].

The map() function creates a new array by applying the callback function to all elements of the original array:

JS 1.6: newArray = array.map(callback[, thisObject])
Dojo:   newArray = dojo.map(array, callback[, thisObject])

All arguments and the signature of the callback function is the same as for the filter() function. The return value of the callback will be copied to the new array verbatim without attempts to interpret it.

dojo.map([1, 2, 3], function(x){ return x % 2; })

The fragment above will return the array of 0s and 1s depending on the odd-ness or even-ness of corresponding elements of the array. It’ll return [1, 0, 1].

The forEach() function executes the callback function for every element of the array:

JS 1.6: array.forEach(callback[, thisObject])
Dojo:   dojo.forEach(array, callback[, thisObject])

All arguments and the signature of the callback function is the same as for the filter() function. It doesn’t return any value and all callback’s returns will be ignored.

dojo.forEach(array, console.log)

The fragment above will print in the Firebug console three values for each element: its value, its index, and the original array.

The every() function returns true if the callback function returned true for all elements:

JS 1.6: flag = array.every(callback[, thisObject])
Dojo:   flag = dojo.every(array, callback[, thisObject])

All arguments and the signature of the callback function is the same as for the filter() function. According to the reference implementation every() may inspect the subset of the array. As soon as it has the first negative hit, it will stop iterations.

dojo.every([1, 2, 3], function(x){ return x % 2; })

The fragment above will return false, because the array has one even element (2).

The some() function returns true if the callback function returned true at least for one element:

JS 1.6: flag = array.some(callback[, thisObject])
Dojo:   flag = dojo.some(array, callback[, thisObject])

All arguments and the signature of the callback function is the same as for the filter() function. According to the reference implementation some() may inspect the subset of the array. As soon as it has the first positive hit, it will stop iterations.

dojo.some([1, 2, 3], function(x){ return x % 2; })

The fragment above will return true, because the first element of the array (1) satisfies the criteria.

Update (3/11/2008): Both some() and every() can be used to implement a loop over array with an early escape (break from the loop), so practical programmers can use it for this side effect alone. For example, if you use some(), you should return false from the callback function when you want to continue the loop, and true when you want out. every() requires opposite values.

As you can see Dojo implements these five functions practically verbatim. The practical difference is the original five operates on sparse arrays, while the Dojo implements operations over dense arrays. It means that the reference implementation will detect and skip non-existent elements, while Dojo will call the callback with the “undefined” value. From my point of view the original behavior is rarely needed, and the Dojo implementation is a little bit faster because it doesn’t check for existence.

What can we do with these functions? We can simplify the code. As an example let’s play with a toy snippet, which calculates a percentage corresponding to a positive value in an array (e.g., to calculate slices of a pie chart):

var values = [1, 1, 2, 4], sum, i;
for(sum = 0, i = 0; i < values.length; sum += values[i++]);
var percents = new Array(values.length);
for(i = 0; i < values.length; percents[i] = values[i] / sum, ++i);

It does what we need, but a lot of things bother me. The code looks hackish because I wanted to keep it reasonably compact. If I move it around I have to be careful that the variable “i” is not defined in the new scope, and it is likely to be defined, especially inside other loops. Remember that majority of web browsers do not implement JavaScript 1.7 and have no block-level scopes — all variables are function-global. It rules out using block-level “let” declarations in most cases.

The last two lines are clearly beg for the dojo.map() goodness:

var percents = dojo.map(values, function(val){ return val / sum; });

What about the first two lines? dojo.forEach() with side-effects? It works:

var sum = 0;
dojo.forEach(values, function(val){ sum += val; });

Nah. Still too hackish. We need more primitives. Namely the fold/reduce group. It is provided by JS 1.8, but not by the Dojo Core. Let’s take a look at two extra functions.

The reduce() function applies the callback function to two values (its previous result or an initial value, and the current item of the array) left-to-right:

JS 1.8: result = array.reduce(callback[, initialValue])

The initialValue argument is optional. It provides the initial value for a left argument of the callback function. If it is skipped the first element of the array is used as the initial value. The callback function has the following signature:

newValue = callback(previousValue, array[i], i, array)

The previousValue is the initialValue on the first iteration, all other iterations are replaced with the newValue of the previous iteration. The final value is returned as the result.

[1, 2, 3].reduce(function(a, b){ return a + b; }, 0)

The above fragment sums up all elements starting with 0 and returns 6. We can achieve the same result without specifying the initial value:

[1, 2, 3].reduce(function(a, b){ return a + b; })

The reduceRight() function is very similar to reduce() but operates in the right-to-left fashion:

JS 1.8: result = array.reduceRight(callback[, initialValue])

All arguments have the same semantics as in reduce(). The callback function signature is the same with the previous value as the very first argument.

With these functions available we can rewrite our example like this:

var sum = values.reduce(function(a, b){ return a + b; });
var percents = values.map(function(val){ return val / sum; });

You can see that our snippet became more portable because we don’t have explicit index variables.

But Dojo doesn’t implement reduce() nor reduceRight(). And you can see that there is a slight difference between the reduce group, and the original iterative group of functions—there is no way to specify the context for the callback function in reduce() and reduceRight(). Of course we can augment it by using dojo.hitch() or similar techniques. Or we can use the dojox.lang.functional package.

First of all let’s look at our modified example and find out what’s wrong with it. Function definitions are way too unwieldy. 31 characters just to add two values? 34 characters to divide two values? Well, we can remove some spaces and shorten val and sum variable names to 1 character each, but it doesn’t buy us a lot while making code less readable. We have to do something more radical.

Update (2/25/2008): Several people (including Alex Russell, Shane O’Sullivan, and Wolfram Kriesing) pointed out that I failed to mention a shortcut provided by Dojo. Let me correct my mistake: you can use a string as a callback for all array extras implemented in Dojo Core. It will be interpreted as a body of function with 3 standard parameters named item, index, and array respectively. So we can get rid of the function prefix, and write something like that: dojo.map([1, 2, 3], "return item % 2"). It gives more flexibility in some cases, and remember that you get this functionality for free simply by including dojo.js in your page. But the next solution is even more radical. Read on.

The answer is simple: we can simulate lambdas. I spent some time building them, but eventually stumbled on the great work of Oliver Steele (http://osteele.com/sources/javascript/functional/) released under the MIT license. I immediately adopted his code with minor modifications.

What does lambda do for you? It gives you a super simple way to express functions with strings using a light-weight translator.

Let’s include the package and define a short-cut for the namespace first, which will be reused in coming examples:

dojo.require("dojox.lang.functional");
var df = dojox.lang.functional;

I assume that you have some cursory knowledge of the Dojo package system. Describing the dojox.lang.functional I describe the current trunk version (just do “svn co http://svn.dojotoolkit.org/dojo/view/anon/all/trunk dojo” to check it out, details are here). Now we can define a function to add two values:

var add = df.lambda("+");

That’s it! Division? Here you go:

var div = df.lambda("/");

Do you want to increment any value by 5? Easy:

var add5 = df.lambda("+5");

Do you want to divide it by 2?

var div2 = df.lambda("/2");

Divide 2 by your value?

var twoDiv = df.lambda("2/");

Compute a minimum of two values?

var min2 = df.lambda("Math.min(a, b)");

Why do you want to dress Math.min() like that? Math.min() takes as many arguments as specified. With lambda we restricted the arity of this function to two arguments, the rest will be ignored.

The rules of the lambda encoding are very simple:

If the string starts with an operator, this is the place for an implicit argument.
If the string ends with an operator, this is the place for an implicit argument.
Otherwise the string is inspected for free standing identifiers, which start with a lower-case character. They will be arguments in the left-to-right order. (Common reserved identifiers are ignored, e.g., “this”, “typeof”, and similar are skipped. You can see the whole list of them at the bottom of this article.)
Otherwise the string is inspected for a full-blown lambda function declaration like this: a, b -> Math.min(a, b). This is the simple way to specify the order of arguments, if it doesn’t match the order deduced by the previous rule.

As you can see these rules make writing functions compact and simple. The downside is obvious too: every time we will parse and create a function dynamically. Don’t panic! In most cases the overhead is acceptable, just don’t do stupid things like re-creating the same function in the loop—pre-create it and it will be as fast as the regular JavaScript function.

Let’s re-write our example once more:

var sum = values.reduce(df.lambda("+"));
var percents = values.map(function(val){ return val / sum; });

Hmm. We cannot pull the sum variable directly because it is not in our scope. We’ll deal with it later. What else is wrong? The call to df.lambda(). We can safely dispatch it too:

var sum = df.reduce(values, "+");

Huh?

The dojox.lang.functional defines all familiar functions with following improvements:

If the callback function is a string, it is interpreted as a lambda expression.
If the callback function is an array, it is interpreted as a function composition — all elements are interpreted as functions and applied right-to-left sequentially.
Just like the Dojo Core iterative functions they operate on dense arrays.
If the input array is a string, it is converted into an array of characters. (Both the Dojo Core and JS 1.6 iterative functions do the same.)
If the input "array" is actually an object, it is assumed to be a simple iterator object (more on that later).

And it defines the full fold/reduce family of functions and much more.

Prototypes of the five iterative functions are the same as in the Dojo Core (of course, you should prefix them with “df” instead of “dojo”). Both reduce() and reduceRight() are defined in the dojox.lang.functional.fold module, which should be included explicitly. Their prototypes:

df.reduce(array, callback[, initialValue])
df.reduceRight(array, callback[, initialValue])

You can see it is pretty consistent with both JS 1.8 definitions and the Dojo Core definitions. The callback function prototype is the same as in the standard.

The dojox.lang.functional.fold module defines four more classic fold methods:

df.foldl(array, callback, initialValue[, thisObject])
df.foldl1(array, callback[, thisObject])
df.foldr(array, callback, initialValue[, thisObject])
df.foldr1(array, callback[, thisObject])

foldl() is essentially the reduce() function with the initial value, but you can specify an optional context. foldl1() is reduce() without the initial value, and an optional context. foldr() and foldr1() are defined similarly, but just like reduceRight() they operate right-to-left. (In reality reduce() is defined in terms of foldl() and foldl1(), and reduceRight() is defined in terms of foldr() and foldr1().)

Okay, let’s rewrite the example again:

var percents = df.map(values, "/ this", df.reduce(values, "+"));

One line that can be moved around portably. The sum variable was eliminated, the value is passed directly as the context to our divide lambda. No index variables and associated junk.

What if we have to call this line over and over and profiling shows performance problems? We can instantiate our lambdas once and reuse them:

var add = df.lambda("+"), divThis = df.lambda("/ this");
...
var percents = df.map(values, divThis, df.foldl1(values, add));

Of course we can pass real functions as well.

But how fast are the Dojo implementations? How do they stack up against the native implementations? Take a look at this page, which measures the performance of different iterative functions. Below are tables with times in milliseconds. All (but one) taken on Windows XP running different browsers. The so-called “raw” version represents a straight-forward loop implementation of an algorithm, the “dojo” version calls an algorithm from the Dojo Core, the “df” version invokes an algorithm from dojox.lang.functional package, and the “array” calls the native builtin algorithm, if available. Empty cells indicate that the algorithm is not implemented in this configuration. All algorithms are called 200 times on an array of 2000 samples. You can glimpse all relevant details from the source code.

filter:

map:

forEach:

reduce:

reduceRight:

All numbers were taken on the third run. Don’t pay too much attention to close numbers—they vary from run to run. Look for big difference and overall trends. For example, all builtin algorithms in all versions of Firefox are slower than the “df” version (exception: forEach is on par), but Safari’s implementation is much better and predictably beats all hand-made versions. Because all runs were done on the same hardware, you can see the relative JavaScript performance of different browsers too. You can see that new generation of browsers are much faster.

Now let’s go back to dojox.lang.functional.

I mentioned a simple iterator before, which can be used instead of an array. What is it? It is literally a simple object, which should implement two methods:

flag = iterator.hasNext()
value = iterator.next()

You can recognize the forward iterator interface from Java, and some other languages. If hasNext() returns true, it is safe to call next() to get the element. As soon as hasNext() returns false, we stop iterations. By using this interface it is possible to write adapters, which will implement sophisticated iterations. Because of this reason only left-to-right iterative functions support simple iterators.

What other goodies are there?

Module dojox.lang,functional.reversed provides right-to-left implementations for filter(), map(), forEach(), every(), and some() which can be useful for side-effects:

dojo.require("dojox.lang.functional.reversed");

newArray = df.filterRev(array, callback[, thisObject])
newArray = df.mapRev(array, callback[, thisObject])
df.forEachRev(array, callback[, thisObject])
flag = dojo.everyRev(array, callback[, thisObject])
flag = dojo.someRev(array, callback[, thisObject])

As you can see their signatures are compatible with regular versions. As a side-effect filterRev() and mapRev() produce a reversed array of processed elements:

df.filterRev([1, 2, 3], "% 2")

The above fragment returns [3, 1].

Module dojox.lang.functional.scan provides scan versions of fold functions:

df.scanl(array, callback, initialValue[, thisObject])
df.scanl1(array, callback[, thisObject])
df.scanr(array, callback, initialValue[, thisObject])
df.scanr1(array, callback[, thisObject])

They have similar signatures but return an array with all intermediate values:

df.scanl1([1, 2, 3], "*")

The above fragment will return [1, 2, 6].

df.scanl([1, 2, 3], "+", 0)

The above fragment will return [0, 1, 3, 6]. In the case of addition it looks like a running total.

Module dojox.lang.functional.curry defines currying helpers:

df.curry(fun[, arity])

The return value is a special helper, which “curries” the function. The arity (number of arguments) can be deduced from the function, or supplied explicitly. Of course you can specify a lambda string instead of a function — it will be converted automatically. Examples:

var fun = df.lambda("a, b, c, x -> a * x * x + b * x + c");
var cfun = df.curry(fun);
var r1 = fun(1, 2, 3, 4);
var r2 = cfun(1)(2)(3)(4);
var r3 = cfun(1, 2)(3)(4);
var r4 = cfun(1)(2, 3)(4);
var r5 = cfun(1)(2)(3, 4);
var r6 = cfun(1, 2, 3)(4);
var r7 = cfun(1)(2, 3, 4);
var r8 = cfun(1, 2, 3, 4);

As you probably guessed already in the above fragment all results (from r1 to r8) are the same.

df.partial(fun[, arg1, arg2, …])

This function was based on Oliver Steele’s partial(). It uses a predefined object df.arg to mark, which arguments are unbound. The rest is going to be bound. Example:

var div2 = df.partial("/", df.arg, 2);

The above fragment is a fancy way to define “/2”. The difference with df.curry() is that it allows to bind arguments in any order, while df.curry() bounds them left-to-right.

df.mixer(fun, array)

Thus function allows to rearrange arguments however you want. The 0-based order is defined by an array. Example:

var divRev = df.mixer("/", [1, 0]);

The above fragment defines a function with swapped arguments: the first argument will be the second supplied argument, and visa versa. You can even pick smaller number of arguments than supplied.

df.flip(fun)

This function reverses the order of supplied arguments and applies it to the function argument immediately.

var divRev = df.flip("/");

The above fragment reverses the order of arguments for the division before applying.

Module dojox.lang.functional.zip defines classic list/array combiners:

df.zip(array1, array2, ...)

This function creates an array of arrays combining elements of input arrays. Basically it “assembles” elements with the same index together. For example:

df.zip([a0, a1, a2], [b0, b1, b2], [c0, c1, c2])

will produce [[a0, b0, c0], [a1, b1, c1], [a2, b2, c2]].

df.unzip(array)

This function perform the opposite transformation disassembling the input array into an array of original arrays. For example:

df.unzip([[a0, b0, c0], [a1, b1, c1], [a2, b2, c2]])

will produce [[a0, a1, a2], [b0, b1, b2], [c0, c1, c2]]. You can see the symmetry here. Actually both functions are the same function with different argument adapters.

Module dojox.lang.functional.object defines important object helpers:

df.forIn(object, callback[, thisObject])

This function applies the callback to all elements of the object/dictionary skipping all elements present in the empty object. The latter is the major reason for this function. Otherwise on Internet Explorer you will hit all builtin and presumably invisible elements. The callback function is called with following arguments:

callback.call(thisObject, object[key], key, object)

It is pretty much the same as for arrays.

df.keys(object)

This function returns an array of all object’s keys skipping keys present in the empty object:

df.keys({a: 1, b: 2, c: 3})

will return [“a”, “b”, “c”].

df.values(object)

This function will return an array of all object’s values skipping keys present in the empty object:

df.values({a: 1, b: 2, c: 3})

will return [1, 2, 3].

There are two modules helping to deal with generation of sequences.

Module dojox.lang.functional.sequence defines following helpers:

df.repeat(n, callback, initialValue[, thisObject])

This function calls the callback with an optional context on the initial value producing a sequence of n result:

df.repeat(4, "2*", 1)

will produce [1, 2, 4, 8].

The callback signature looks like that:

newValue = callback.call(thisObject, previousValue)

The very first previousValue is the initialValue.

df.until(predicate, callback, initialValue[, thisObject])

This function is similar to df.repeat(), but instead of calling the callback some fixed number of times, it checks the result with the predicate function. The iterations stop as soon as the predicate returns true.

The callback’s signature is the same as in df.repeat(). The predicate’s signature looks like that:

flag = predicate.call(thisObject, currentValue)

Example:

df.until("> 20", "2 *", 1)

will return [1, 2, 4, 8, 16].

Module dojox.lang.functional.listcomp defines simple array comprehensions with a syntax similar to JS 1.7.

df.listcomp(string)

This function compiles and executes an array comprehension specified by a string:

df.listcomp("i for(var i = 0; i < 10; ++i) if(i % 2)")

will return a list of odd digits: [1, 3, 5, 7, 9].

In some cases you want to pre-compile an array comprehension, but execute it later. There is a function for that:

df.compileListcomp(string)

It returns a function, which can be called without parameters whenever you need to use it.

Sometimes you need a function definition without compiling it. This is the function, which does it:

df.buildListcomp(string)

I already mentioned lambdas. They are defined in the module dojox.lang.functional.lambda.

df.lamda(string)

This function was already described. You don’t need to use it directly with dojox.lang.functional functions, but you can use it for your own code. Just like array comprehensions have a method to build a function definition, there is such function for lambdas:

df.buildLambda(string)

This function returns a textual definition of a lambda function.

The full list of ignored identifiers (they are not recognized as automatic arguments): this, true, false, null, undefined, typeof, instanceof, in delete, new, void, arguments, decodeURI, decodeURIComponent, encodeURI, encodeURIComponent, escape, eval, isFinite, isNaN, parseFloat, parseInt, unescape, dojo, dijit, dojox, window, document.

There is a super-module dojox.lang.functional, which includes three commonly used modules:

dojo.require("dojox.lang.functional.lambda");
dojo.require("dojox.lang.functional.array");
dojo.require("dojox.lang.functional.object");

If you want extra modules, include them with dojo.require().

Now go and have fun with functions!

Hello, reader!

My name is Eugene Lazutkin.

Functional fun in JavaScript with Dojo

Popular tags