Or, to be pedantic, "A Fixed-point Operator for m4 macros". Which bears a strong resemblance to the applicative-order Y combinator, since m4 macro expansion is eager - argument collection happens before macro expansion.
This document will probably make more sense if you've already read Higher-Order Programming in m4, which introduces some of the macro-construction techniques we rely on here.
The derivation below owes a lot to rayfd's Javascript version, because I find the derivation given in The Little Schemer to be extremely unintuitive.
A recursive function
I've chosen here to implement a function whose arguments are strings. This means that quoting needs to be done right, to make sure that the arguments are never expanded. (Were we to write a more traditional derivation using the factorial function, with numerical arguments, we could get away with very sloppy quoting since we could ignore the possibility of argument expansion.
So: consider a function to calculate the length of a string recursively:
-
the termination condition is that an empty string has length zero.
-
any other string has length (one plus the length of ths substring starting at the second character), which we calculate by calling the function recursively on the substring.
define(`length',`dnl ifelse($1, `', 0, `dnl eval(1+length(substr($1,1)))`'dnl ')`'dnl ')`'dnl dnl To verify that this works: length(`abcdefg') dnl -> 7 dnl To make really really sure it does what we want: define(`abcdefg', `')dnl length(`abcdefg') dnl -> 7
Our aim will be to find an operator which will generate this recursive function for us, without having to have the name length available for the continuation function.
The first step is to abstract out the name of the continuation macro, making it the first argument to length.
define(`length',dnl `ifelse(`$2', `', 0, `eval'(1+`$1'(``$1'', substr($2,1))))')`'dnl dnl length(`length', `abc')
This is rather clumsy, and requires repetition of the macro name. It also mixes up the evaluation of the function and the implementation of the recursion. We can separate out the evaluation and recursion by encapsulating the evaluation as an anonymous function.
define(`createLength',`dnl pushdef(`LL',dnl `ifelse'(dnl changequote([,])`$[]1'changequote,`',0,dnl ``eval''(dnl 1+``$1'(``$1'')'changequote([,])(`substr'(``$[]1'',1))changequote`'dnl )`'dnl )`popdef(`LL')'`'dnl )'dnl `LL')`'dnl
We create an anonymous function, LL, which takes one argument, the string to measure, and which calls out to a continuation. This anonymous function is created by the createLength macro, which uses its argument to generate the continuation for the anonymous function - which is therefore Curried over the continuation, as it were.
The continuation function itself is generated by createLength. So createLength(`createLength') generates an anonymous function, and passes itself to that anonymous function to provide the continuation:
createLength(`createLength')(`abcd') -> 4
That let us isolate recursion from evaluation, and we could bind a name to createLength(`createLength') to provide a transparently recursive function.
But, it would be nice if we could pull length evaluation and recursion completely apart, and make the recursion part separately reusable. We can do this like so, with this pair of macros:
define(`leng',`dnl pushdef(`LLf',dnl `ifelse'(changequote([,])`$[]1'changequote,`',0,dnl ``eval''(1+``$1''(`substr'(changequote([,])`$[]1'changequote,1))))dnl `popdef(`LLf')')'dnl `LLf')dnl dnl define(`recur',dnl `pushdef(`LL',dnl `leng'(``$1'(``$1'')')(changequote([,])`$[]1'changequote)dnl `popdef(`LL')')'dnl `LL')dnl
The first macro returns the anonymous length-calculating function, curried over its continuation. The second macro, given the name of the macro for generating the evaluating function, calls it, with itself as the continuation.
We can use this in the same way as `createLength' above:
recur(`recur')(`abcd') -> 4
This is still not quite right; we needed to hardcode the call to leng in the recur macro, and we're still doing the odd specify-the-name-twice thing.
We solve both of these by introducing another layer of Currying - writing a macro to generate recur(recur'), taking the name of the evaluate-function-generating macro (`leng) as its argument, and returning the double-call to recur, like so:
define(`recurWrapper', `dnl define(`recur',dnl `pushdef(`LL',dnl `$1''changequote([,])(['changequote([,])`changequote`]`$[]1'(``$[]1'')dnl ['changequote([,])'changequote`])changequote`dnl (changequote([,])`$[]1'changequote)dnl `popdef(`LL')')'dnl `LL')dnl recur(`recur')')`'dnl
recurWrapper simply expands to an call to an appropriately-defined recur function, so it can be used like:
recurWrapper(`leng')(`abcd') ->4
and of course we can bind the name length to such a call to recurWrapper, for reusability.
This is almost what we came looking for, with one important exception - the call to recurWrapper leaves recur defined - we haven't popped the definition. As it stands, we're not able to. As we discussed in the previous essay, our "anonymous functions" can only be executed once, but we need to execute recur multiple times. And we can't simply pop the definition after the execution of recur, since that is the last action of recurWrapper, which must have nothing between it and the argument to come.
So we must allow for one final anonymous function to clear up after recur, and ensure that a call to recurWrapper leaves the macro namespaces as clean as it found it - we write another function to wrap the final call to recur, and pop its definition:
define(`recurWrapper', `dnl define(`recur',dnl `pushdef(`LL',dnl `$1''changequote([,])(['changequote([,])`changequote`]`$[]1'(``$[]1'')dnl ['changequote([,])'changequote`])changequote`dnl (changequote([,])`$[]1'changequote)dnl `popdef(`LL')')'dnl `LL')dnl pushdef(`LL',`dnl recur(`recur')'changequote([,])(`$[]1')changequote`dnl popdef(`recur')`'popdef(`LL')')`'dnl '`LL')`'dnl
This is now a fixed point combinator for the m4 macro language. We can provide a evaluative-function generating macro - like leng above - which evaluates one step of a recursive function, and then calls to an unspecified continuation function. Given such a macro, passing its name as an argument to the latter recurWrapper' expands to a full evaluation of the recursive function. To make this clearer, we can rename the macro `m4Y (the Y combinator in m4), and rename its "anonymous functions" for consistency.
define(`m4Y', `dnl pushdef(`m4Y_recur',dnl `pushdef(`m4Y_LL',dnl `$1''changequote([,])(['changequote([,])`changequote`]`$[]1'(``$[]1'')dnl ['changequote([,])'changequote`])changequote`dnl (changequote([,])`$[]1'changequote)dnl `popdef(`m4Y_LL')')'dnl `m4Y_LL')dnl pushdef(`m4Y_LL',`dnl m4Y_recur(`m4Y_recur')'changequote([,])(`$[]1')changequote`dnl popdef(`m4Y_recur')`'popdef(`m4Y_LL')')`'dnl '`m4Y_LL')`'dnl
Ta-da!
There is one small caveat to the above which is why we renamed the "anonymous functions" - they are not anonymous for the duration of any calls to the evaluation of the recursion, so that must not use the strings m4Y_recur or m4Y_LL inside the evaluation function.
That aside, we can now do the following:
define(`length', `dnl m4Y(`leng')(`$1')')`'dnl
"define length to be the function resulting from the application of m4Y to the leng macro."
length(`abc') -> 3 length(`abcd') -> 4
and we recover the correct answer, having used m4Y in a fashion transparent to the caller.