October 30, 2019

# Introduction

Recursion is central to functional programming, as a clearer alternative to loops as other control structures typical of imperative languages. Functional programming encourages programmers to study recursion in greater depths. I first encountered the Y combinator in the mind-bending penultimate chapter of the wonderful The Little Schemer, which explores recursion in great depth. In an effort to unbend my own mind on the subject, I decided to derive it for myself, so I could see how it worked, and gain an extra tool in dealing with recursion and closures.

Do you now know why Y works? Read this chapter just one more time and you will.
(The Little Schemer)

The Y combinator was discovered by Haskell Curry in the 1940s. It allows recursion to be captured without functions needing to reference themselves by name. It provides some insight into the nature of recursion in the lambda calculus (where nothing has a name), and also demonstrates the power of closures.

We will conduct our explorations mostly in scheme, because it’s expressive, concise and elegant, but we also give examples in Haskell and JavaScript at the end. The Haskell version is ludicrously simple and clear (relying on lazy evaluation), while the JavaScript version mirrors the Scheme version.

## Further reading

The Little Schemer (amazon uk/us) gives a great introduction to recursion, including a section on the Y combinator. The presentation here is a little different, since I wanted a more direct understanding of how the Y combinator worked.

# Recursive functions as fixed points of (higher-order) functions

The Y combinator allows the programmer to pass in a function which isn’t explicitly recursive (doesn’t reference itself by name), but describes a step in a recursive process with a continuation, and provides back a new function which recursively applies that step using itself as the continuation.

Let’s start by making the term “step in a recursive process with a continuation” more concrete, and clarify how the Y combinator acts on these steps.

To give us something specific to think about, let’s examine the factorial function. The classic recursive definition of factorial is expressed in Scheme as follows

``````(define (factorial n)
(if (= n 0)
1
(* n (factorial (- n 1)))))
``````

This definition references itself. We can view it as an equation in terms of `factorial`, however. In fact, we can define

``````(define (factorialize f)
(lambda (n)
(if (= n 0)
1
(* n (f (- n 1))))))
``````

and observe that `(factorialize factorial)` (`factorialize` applied to the `factorial` function) is itself `factorial`. The formal way to say this is that `factorial` is a fixed-point of `factorialize`.

Its important to understand that `factorialize` operates on all functions from numbers to numbers. In terms of function, we can look at `factorialize` as doing a single step in the factorial function, and then, instead of recursing, handing off the remainder of the work to a continuation which is passed in as a parameter. This is what is meant by a “step in a recursive process with a continuation”. `factorialize` itself is not recursive - it hands the recursion over to some continuation which is passed in.

The Y combinator turns these recursive steps into full-blown recursive functions. Applied to `factorialize` it finds a fixed point (you can prove by induction that such a fixed point is necessarily the `factorial` function). This means we can define the factorial function by

``````(define factorial
(Y factorialize))
``````

where here, `Y` is the Y combinator. How does it do this? In effect it passes `factorialize` in as the continuation to `factorialize`, so that the same recursive step is applied over-and-over, until we reach the base case.

# Deriving the Y combinator

## Capturing our own value

The fixed point perspective is a useful starting point, because we want the Y combinator applied to `factorialize` to be an expression `expr` which satisfies

``````expr = (factorialize expr)
``````

One possible starting point is to ask, how can an expression capture it’s own value? That is, can we write an expression, which, inside itself, has a handle on its own value.

The trick to doing this is to observe that applying the anonymous function `(lambda (f) (f f))` to a function allows a function to receive itself as an argument. If we feed this function another function, `(lambda (recur) ...)`, and try to evaluate it

``````((lambda (f) (f f))
(lambda (recur)
...))
``````

Then inside the inner lambda, `recur` will be bound to the `(lambda (recur) ...)`. But then `(recur recur)` is just the inner lambda `(lambda (recur) ...)` applied to itself, which is the value of the expression we’re trying to evaluate (that might take a few reads!).

In other words, if we try to evaluate the following

``````((lambda (f) (f f))
(lambda (recur)
(recur recur)))
``````

by applying the outer function, we get back to where started. If you try to evaluate this, it will just loop forever! In Haskell we could write this as

``````let x = x in x
``````

Since we are looking for a function `exp` with value `(factorialize exp)`, and we know `(recur recur)` has value `exp` in the above, we can try inserting a call to factorialize:

``````((lambda (f) (f f))
(lambda (recur)
(factorialize (recur recur))))
``````

By the same reasoning, if this has value `v`, then by applying the outer `lambda`, we see it also has value `(factorialize v)`. Great! We’ve found a fixed point for `factorialize`, and hence this must be the `factorial` function. In fact, if we parameterize over `factorialize` then we have the (formal) Y combinator!

## Making it run

What happens when we try and evaluate this? Firing up a Scheme interpreter and plugging it in

``````((lambda (f) (f f))
(lambda (recur)
(factorialize (recur recur))))

;Aborting!: maximum recursion depth exceeded
``````

Hmm. The issue here is that when trying to evaluate this procedure, `(recur recur)` has to be fully evaluated before a call to `factorialize` is made (scheme evaluates its arguments before calling functions). This means for our expression `exp` to be evaluated, `exp` (`(recur recur)`) must first be evaluated - this leads to an infinite loop!

To fix this, we want to delay the evaluation of `(recur recur)` until it is needed (in other words evaluate it lazily). We can do this with the aid of a lambda:

``````((lambda (f) (f f))
(lambda (recur)
(factorialize (lambda (x) ((recur recur) x)))))
``````

Let’s try it:

``````(((lambda (f) (f f))
(lambda (recur)
(factorialize (lambda (x) ((recur recur) x))))) 0)

;Value: 1
``````
``````(((lambda (f) (f f))
(lambda (recur)
(factorialize (lambda (x) ((recur recur) x))))) 5)

;Value: 120
``````

Looking good! Notice that none of this has anything to do with `factorialize`. We can parameterise and abstract:

``````(define (Y F)
((lambda (f) (f f))
(lambda (recur)
(F (lambda (x) ((recur recur) x))))))
``````

Hello Y combinator!

# Examples of the Y combinator in action

We started with the factorial function, so that ought to work as expected:

``````(define factorial
(Y
(lambda (recur)
(lambda (n)
(if (= n 0)
1
(* n (recur (- n 1))))))))

(factorial 5)

;Value: 120
``````

Another easy example is defining the length of a list:

``````(define length
(Y
(lambda (recur)
(lambda (l)
(if (null? l)
0
(+ 1 (recur (cdr l))))))))

(length '(1 2 3))

;Value: 3

``````

Multiple calls to recur also work just fine:

``````(define fibonacci
(Y
(lambda (recur)
(lambda (n)
(cond
((= n 0) 0)
((= n 1) 1)
(else (+ (recur (- n 1))
(recur (- n 2)))))))))

(map fibonacci (list 0 1 2 3 4 5 6 7 8))

;Value: (0 1 1 2 3 5 8 13 21)
``````

We can also use the Y combinator’s definition to write recursive lambdas inline. To give a contrived example using length of lists:

``````(map
((lambda (F)
((lambda (f) (f f))
(lambda (recur)
(F (lambda (x) ((recur recur) x))))))
(lambda (recur)
(lambda (l)
(if (null? l)
0
(+ 1 (recur (cdr l)))))))

'((1) (1 2) (1 2 3)))

;Value: (1 2 3)
``````

# Intuitions about Y as a limit

Let’s try and get a different intuition for how Y works. Let’s lean on our `factorial` and `factorialize` example some more.

One intuitive way to get `factorial` out of `factorialize`, is to pass `factorialize` something like `factorialize` as it’s argument. In fact, each of the following gets closer and closer to `factorial`

``````(factorialize factorialize)                                ; Evaluates correctly for 0
(factorialize (factorialize factorialize))                 ; Evaluates correctly for 0, 1
(factorialize (factorialize (factorialize factorialize)))  ; Evaluates for correctly 0, 1, 2
...
``````

One can think of `factorial` as being something like

``````(factorialize (factorialize (factorialize ...)))
``````

Notice that to evaluate `factorial` on a given number, we only need finitely many of these calls to `factorialize`.

In fact, if we expand, for example, `((Y factorialize 3))` we get

``````(factorialize
(factorialize
(factorialize
((factorialize _) 0))))
``````

where the `_` represents a `lambda` which is never evaluated.

# In some other languages

The Y combinator is not restricted to Lisps. Let’s give examples of the Y combinator in Haskell and JavaScript.

## The Y combinator in Haskell

Lazy evaluation means that in Haskell we don’t need to work so hard, and we can just write down what i means to be a fixed point

``````y :: (a -> a) -> a
y f = let g = f g in g

factorial :: Integer -> Integer
factorial = y factorialize
where
factorialize _ 0 = 1
factorialize recur n = n * recur (n - 1)
``````

This to me seems something close to magic, even though in many ways its simpler to think about than the Scheme version. `y` is more often named `fix` in the Haskell community, presumably because it is a definition by equation of a fixed-point. Haskell really is quite beautiful.

## The Y combinator in JavaScript

JavaScript is not so beautiful. Lambdas and functions might be the only sensible parts of JavaScript, but that’s an extraordinarily powerful part all the same. Another grace is that you can even try this snippet in the developer console of your browser.

``````const y = (f) => ((g) => g(g))(
(recur) => f((x) => recur(recur)(x))
);

const factorialize = (recur) => (n) => n == 0 ? 1 : n * recur (n - 1);
const factorial = y(factorialize);
``````

# Conclusions

Lambdas really are way more powerful than you would at first think! I found understanding how to derive the Y combinator gives me a new way to think. There really are good reasons why functional programming reveres the lambda so much - they are the Jedi weapon par excellence.

I can imagine many folks would balk at the code for the y combinator - it’s hard to see what it does at a glance. It’s nonetheless wonderfully abstract, and can be understood by its properties. Nonetheless, explicit recursion is probably clearer.