w.

# A Short Note on The Y Combinator

Cross-posted at https://invenia.github.io/blog/2018/08/20/ycombinator/.

## Introduction

This post is a short note on the notorious Y combinator. No, not that company, but the computer sciency objects that looks like this:

Don’t worry if that looks complicated; we’ll get down to some examples and the nitty gritty details in just a second. But first, what even is this Y combinator thing? Simply put, the Y combinator is a higher-order function $Y$ that can be used to define recursive functions in languages that don’t support recursion. Cool!

For readers unfamiliar with the above notation, the right-hand side of Equation \eqref{eq:Y-combinator} is a lambda term, which is a valid expression in lambda calculus:

1. $x$, a variable, is a lambda term;
2. if $t$ is a lambda term, then the anonymous function $\lambda\, x : t$ is a lambda term;
3. if $s$ and $t$ are lambda terms, then $s\, t$ is a lambda term, which should be interpreted as $s$ applied with argument $t$; and
4. nothing else is a lambda term.

For example, if we apply $\lambda\, x : y\,x$ to $z$, we find

Although the notation in Equation \eqref{eq:example} suggests multiplication, note that everything is function application, because really that’s all there is in lambda calculus.

Consider the factorial function $\code{fact}$:

In words, if $n$ is zero, return $1$; otherwise, multiply $n$ with $\code{fact}(n-1)$. Equation \eqref{eq:fact-recursive} would be a valid expression if lambda calculus would allow us to use $\code{fact}$ in the definition of $\code{fact}$. Unfortunately, it doesn’t. Tricky. Let’s replace the inner $\code{fact}$ by a variable $f$:

Now, crucially, the Y combinator $Y$ is precisely designed to construct $\code{fact}$ from $\code{fact}'$:

To see this, let’s denote $\code{fact2}=Y\,\code{fact}'$ and verify that $\code{fact2}$ indeed equals $\code{fact}$:

\begin{align} \code{fact2} &= Y\, \code{fact}’ \\ &= (\lambda\, f : (\lambda\, x : f\,(x\, x))\, (\lambda\, x : f\,(x\, x)))\, \code{fact}’ \\ &= (\lambda\, x : \code{fact}’\,(x\, x) )\, (\lambda\, x : \code{fact}’\,(x\, x)) \label{eq:step-1} \\ &= \code{fact}’\, ((\lambda\, x : \code{fact}’\, (x\, x))\,(\lambda\, x : \code{fact}’\, (x\, x))) \label{eq:step-2} \\ &= \code{fact}’\, (Y\, \code{fact}’) \\ &= \code{fact}’\, \code{fact2}, \end{align}

which is exactly what we’re looking for, because the first argument to $\code{fact}'$ should be the actual factorial function, $\code{fact2}$ in this case. Neat!

We hence see that $Y$ can indeed be used to define recursive functions in languages that don’t support recursion. Where does this magic come from, you say? Sit tight, because that’s up next!

## Deriving the Y Combinator

This section introduces a simple trick that can be used to derive Equation \eqref{eq:Y-combinator}. We also show how this trick can be used to derive analogues of the Y combinator that implement mutual recursion in languages that don’t even support simple recursion.

Again, let’s start out by considering a recursive function:

where $g$ is some lambda term that depends on $f$ and $x$. As we discussed before, such a definition is not allowed. However, pulling out $f$,

we do find that $f$ is a fixed point of $h$: $f$ is invariant under applications of $h$. Now—and this is the trick—suppose that $f$ is the result of a function $\hat{f}$ applied to itself: $f=\hat{f}\,\hat{f}$. Then Equation \eqref{eq:fixed-point} becomes

from which we, by inspection, infer that

Therefore,

Pulling out $h$,

where suddenly a wild Y combinator has appeared.

The above derivation shows that $Y$ is a fixed-point combinator. Passed some function $h$, $Y\,h$ gives a fixed point of $h$: $f = Y\,h$ satisfies $f = h\,f$.

Pushing it even further, consider two functions that depend on each other:

\begin{align} f &= \lambda\,x:k_f[x, f, g], & g &= \lambda\,x:k_g[x, f, g] \end{align}

where $k_f$ and $k_g$ are lambda terms that depend on $x$, $f$, and $g$. This is foul play, as we know. We proceed as before and pull out $f$ and $g$:

\begin{align} f = \underbrace{ (\lambda\,f’:\lambda\,g’:\lambda\,x:k_f[x, f’, g’]) }_{h_f} \,\, f\, g = h_f\, f\, g \end{align}

\begin{align}
g = \underbrace{ (\lambda\,f’:\lambda\,g’:\lambda\,x:k_g[x, f’, g’]) }_{h_g} \,\, f\, g = h_g\, f\, g. \end{align}

Now—here’s that trick again—let $f = \hat{f}\,\hat{f}\,\hat{g}$ and $g = \hat{g}\,\hat{f}\,\hat{g}$.1 Then

\begin{align} \hat{f}\,\hat{f}\,\hat{g} &= h_f\,(\hat{f}\,\hat{f}\,\hat{g})\,(\hat{g}\,\hat{f}\,\hat{g}) = (\lambda\,x:\lambda\,y:h_f\,(x\,x\,y)\,(y\,x\,y))\,\,\hat{f}\,\hat{g},\\ \hat{g}\,\hat{f}\,\hat{g} &= h_g\,(\hat{f}\,\hat{f}\,\hat{g})\,(\hat{g}\,\hat{f}\,\hat{g}) = (\lambda\,x:\lambda\,y:h_g\,(x\,x\,y)\,(y\,x\,y))\,\,\hat{f}\,\hat{g}, \end{align}

which suggests that

\begin{align} \hat{f} &= \lambda\,x:\lambda\,y:h_f\,(x\,x\,y)\,(y\,x\,y), \\ \hat{g} &= \lambda\,x:\lambda\,y:h_g\,(x\,x\,y)\,(y\,x\,y). \end{align}

Therefore

\begin{align} f &= \hat{f}\,\hat{f}\,\hat{g} \\ &= (\lambda\,x:\lambda\,y:h_f\,(x\,x\,y)\,(y\,x\,y))\, (\lambda\,x:\lambda\,y:h_f\,(x\,x\,y)\,(y\,x\,y))\, (\lambda\,x:\lambda\,y:h_g\,(x\,x\,y)\,(y\,x\,y)) \\ &= Y_f\, h_f\, h_g \end{align}

where

Similarly,

Dang, laborious, but that worked. And thus we have derived two analogues $Y_f$ and $Y_g$ of the Y combinator that implement mutual recursion in languages that don’t even support simple recursion.

## Implementing the Y Combinator in Python

Well, that’s cool and all, but let’s see whether this Y combinator thing actually works. Consider the following nearly 1-to-1 translation of $Y$ and $\code{fact}'$ to Python:

Y = lambda f: (lambda x: f(x(x)))(lambda x: f(x(x)))
fact = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)


If we try to run this, we run into some weird recursion:

>>> Y(fact)(4)
RecursionError: maximum recursion depth exceeded


Eh? What’s going? Let’s, for closer inspection, once more write down $Y$:

After $f$ is passed to $Y$, $(\lambda\, x : f\,(x\, x))$ is passed to $(\lambda\, x : f\,(x\, x))$; which then evaluates $x\, x$, which passes $(\lambda\, x : f\,(x\, x))$ to $(\lambda\, x : f\,(x\, x))$; which then again evaluates $x\, x$, which again passes $(\lambda\, x : f\,(x\, x))$ to $(\lambda\, x : f\,(x\, x))$; ad infinitum. Written down differently, evaluation of $Y\, f\, x$ yields

which goes on indefinitely. Consequently, $Y\, f$ will not evaluate in finite time, and this is the cause of the RecursionError. But we can fix this, and quite simply so: only allow the recursion—the $x\,x$ bit—to happen when it’s passed an argument; in other words, replace

Subsituting Equation \eqref{eq:strict-evaluation} in Equation \eqref{eq:Y-combinator}, we find

Translating to Python,

Y = lambda f: (lambda x: f(lambda y: x(x)(y)))(lambda x: f(lambda y: x(x)(y)))


And then we try again:

>>> Y(fact)(4)
24

>>> Y(fact)(3)
6

>>> Y(fact)(2)
2

>>> Y(fact)(1)
1


Sweet success!

## Summary

To recapitulate, the Y combinator is a higher-order function that can be used to define recursion—and even mutual recursion—in languages that don’t support recursion. One way of deriving $Y$ is to assume that the recursive function under consideration $f$ is the result of some other function $\hat{f}$ applied to itself: $f = \hat{f}\,\hat{f}$; after some simple manipulation, the result can then be determined by inspection. Although $Y$ can indeed be used to define recursive functions, it cannot be applied literally in a contemporary programming language; recursion errors might then occur. Fortunately, this can be fixed simply by letting the recursion in $Y$ happen when needed—that is, lazily.

1. Do you see why this is the appropriate generalisation of letting $f=\hat{f}\,\hat{f}$

Published on 16 August 2018.