<p class="pretitle">wesselb.github.io: Thoughts on machine learning and other topics</p> <h1>A Short Note on The Y Combinator</h1> <p class="pretitle">Posted 2018-08-16 at <a href="https://wesselb.github.io/2018/08/16/y-combinator">https://wesselb.github.io/2018/08/16/y-combinator</a>.</p> <p class="pretitle">Cross-posted at <a href="https://invenia.github.io/blog/2018/08/20/ycombinator/">https://invenia.github.io/blog/2018/08/20/ycombinator/</a>.</p> <h2 id="introduction">Introduction</h2> <p>This post is a short note on the notorious <em>Y combinator</em>. No, not <a href="https://ycombinator.com">that company</a>, but the computer-sciency object that looks like this:</p> <script type="math/tex; mode=display">\label{eq:Y-combinator} Y = \lambda\, f : (\lambda\, x : f\,(x\, x))\, (\lambda\, x : f\,(x\, x)).</script> <p>Don’t worry if that looks complicated; we’ll get down to some examples and the nitty-gritty details in just a second. But first, <em>what</em> even is this Y combinator thing? Simply put, the Y combinator is a higher-order function <script type="math/tex">Y</script> that can be used to define recursive functions in languages that don’t support recursion. 
Cool!</p> <p>For readers unfamiliar with the above notation, the right-hand side of Equation \eqref{eq:Y-combinator} is a <em>lambda term</em>, a valid expression in <a href="https://en.wikipedia.org/wiki/Lambda_calculus"><em>lambda calculus</em></a>. Lambda terms are built up by the following rules:</p> <ol> <li><script type="math/tex">x</script>, a variable, is a lambda term;</li> <li>if <script type="math/tex">t</script> is a lambda term, then the anonymous function <script type="math/tex">\lambda\, x : t</script> is a lambda term;</li> <li>if <script type="math/tex">s</script> and <script type="math/tex">t</script> are lambda terms, then <script type="math/tex">s\, t</script> is a lambda term, which should be interpreted as <script type="math/tex">s</script> applied to the argument <script type="math/tex">t</script>; and</li> <li>nothing else is a lambda term.</li> </ol> <p>For example, if we apply <script type="math/tex">\lambda\, x : y\,x</script> to <script type="math/tex">z</script>, we find</p> <script type="math/tex; mode=display">\label{eq:example} (\lambda\, x : y\,x)\, z = y\,z.</script> <p>Although the notation in Equation \eqref{eq:example} suggests multiplication, note that everything is function application, because really that’s all there is in lambda calculus.</p> <p>Consider the factorial function <script type="math/tex">\code{fact}</script>:</p> <script type="math/tex; mode=display">\label{eq:fact-recursive} \code{fact} = \lambda\, n : (\code{if}\, (\code{iszero}\, n) \, 1 \, (\code{multiply}\, n\, (\code{fact}\, (\code{subtract}\, n\, 1)))).</script> <p>In words, if <script type="math/tex">n</script> is zero, return <script type="math/tex">1</script>; otherwise, multiply <script type="math/tex">n</script> by <script type="math/tex">\code{fact}(n-1)</script>. Equation \eqref{eq:fact-recursive} would be a valid expression if lambda calculus allowed us to use <script type="math/tex">\code{fact}</script> in the definition of <script type="math/tex">\code{fact}</script>. 
Unfortunately, it doesn’t. Tricky. Let’s replace the inner <script type="math/tex">\code{fact}</script> by a variable <script type="math/tex">f</script>:</p> <script type="math/tex; mode=display">\code{fact}' = \lambda\, f: \lambda\, n : (\code{if}\, (\code{iszero}\, n) \, 1 \, (\code{multiply}\, n\, (f\, (\code{subtract}\, n\, 1)))).</script> <p>Now, crucially, the Y combinator <script type="math/tex">Y</script> is precisely designed to construct <script type="math/tex">\code{fact}</script> from <script type="math/tex">\code{fact}'</script>:</p> <script type="math/tex; mode=display">Y\, \code{fact}' = \code{fact}.</script> <p>To see this, let’s denote <script type="math/tex">\code{fact2}=Y\,\code{fact}'</script> and verify that <script type="math/tex">\code{fact2}</script> indeed equals <script type="math/tex">\code{fact}</script>:</p> <p>\begin{align} \code{fact2} &amp;= Y\, \code{fact}' \\ &amp;= (\lambda\, f : (\lambda\, x : f\,(x\, x))\, (\lambda\, x : f\,(x\, x)))\, \code{fact}' \\ &amp;= (\lambda\, x : \code{fact}'\,(x\, x) )\, (\lambda\, x : \code{fact}'\,(x\, x)) \label{eq:step-1} \\ &amp;= \code{fact}'\, ((\lambda\, x : \code{fact}'\, (x\, x))\,(\lambda\, x : \code{fact}'\, (x\, x))) \label{eq:step-2} \\ &amp;= \code{fact}'\, (Y\, \code{fact}') \\ &amp;= \code{fact}'\, \code{fact2}, \end{align}</p> <p>which is <em>exactly</em> what we’re looking for, because the first argument to <script type="math/tex">\code{fact}'</script> should be the actual factorial function, <script type="math/tex">\code{fact2}</script> in this case. Neat!</p> <p>We hence see that <script type="math/tex">Y</script> can indeed be used to define recursive functions in languages that don’t support recursion. Where does this magic come from, you say? Sit tight, because that’s up next!</p> <h2 id="deriving-the-y-combinator">Deriving the Y Combinator</h2> <p>This section introduces a simple trick that can be used to derive Equation \eqref{eq:Y-combinator}. 
We also show how this trick can be used to derive analogues of the Y combinator that implement <em>mutual recursion</em> in languages that don’t even support simple recursion.</p> <p>Again, let’s start out by considering a recursive function:</p> <script type="math/tex; mode=display">f = \lambda\, x:g[f, x]</script> <p>where <script type="math/tex">g</script> is some lambda term that depends on <script type="math/tex">f</script> and <script type="math/tex">x</script>. As we discussed before, such a definition is not allowed. However, pulling out <script type="math/tex">f</script>,</p> <script type="math/tex; mode=display">\label{eq:fixed-point} f = \underbrace{(\lambda \, f' :\lambda\, x:g[f', x])}_{h}\,\, f = h\, f.</script> <p>we do find that <script type="math/tex">f</script> is a <em>fixed point</em> of <script type="math/tex">h</script>: <script type="math/tex">f</script> is invariant under applications of <script type="math/tex">h</script>. Now—and this is the trick—suppose that <script type="math/tex">f</script> is the result of a function <script type="math/tex">\hat{f}</script> applied to itself: <script type="math/tex">f=\hat{f}\,\hat{f}</script>. Then Equation \eqref{eq:fixed-point} becomes</p> <script type="math/tex; mode=display">\color{red}{\hat{f}} \,\hat{f} = h\,(\hat{f}\, \hat{f}) = (\color{red}{\lambda\,x:h(x\,x)})\,\,\hat{f},</script> <p>from which we, by inspection, infer that</p> <script type="math/tex; mode=display">\hat{f} = \lambda\,x:h(x\,x).</script> <p>Therefore,</p> <script type="math/tex; mode=display">f = \hat{f}\hat{f} = (\lambda\,x:h(x\,x))\,(\lambda\,x:h(x\,x)).</script> <p>Pulling out <script type="math/tex">h</script>,</p> <script type="math/tex; mode=display">f = (\lambda\, h': (\lambda\,x:h'\,(x\,x))\,(\lambda\,x:h'\,(x\,x)))\, h = Y\, h,</script> <p>where suddenly a wild Y combinator has appeared.</p> <p>The above derivation shows that <script type="math/tex">Y</script> is a <em>fixed-point combinator</em>. 
Given some function <script type="math/tex">h</script>, <script type="math/tex">Y\,h</script> gives a fixed point of <script type="math/tex">h</script>: <script type="math/tex">f = Y\,h</script> satisfies <script type="math/tex">f = h\,f</script>.</p> <p>Pushing it even further, consider two functions that depend on each other:</p> <p>\begin{align} f &amp;= \lambda\,x:k_f[x, f, g], &amp; g &amp;= \lambda\,x:k_g[x, f, g] \end{align}</p> <p>where <script type="math/tex">k_f</script> and <script type="math/tex">k_g</script> are lambda terms that depend on <script type="math/tex">x</script>, <script type="math/tex">f</script>, and <script type="math/tex">g</script>. This is foul play, as we know. We proceed as before and pull out <script type="math/tex">f</script> and <script type="math/tex">g</script>:</p> <p>\begin{align} f = \underbrace{ (\lambda\,f':\lambda\,g':\lambda\,x:k_f[x, f', g']) }_{h_f} \,\, f\, g = h_f\, f\, g, \end{align}</p> <p>\begin{align} g = \underbrace{ (\lambda\,f':\lambda\,g':\lambda\,x:k_g[x, f', g']) }_{h_g} \,\, f\, g = h_g\, f\, g. \end{align}</p> <p>Now (here’s that trick again) let <script type="math/tex">f = \hat{f}\,\hat{f}\,\hat{g}</script> and <script type="math/tex">g = \hat{g}\,\hat{f}\,\hat{g}</script>.<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> Then</p> <p>\begin{align} \hat{f}\,\hat{f}\,\hat{g} &amp;= h_f\,(\hat{f}\,\hat{f}\,\hat{g})\,(\hat{g}\,\hat{f}\,\hat{g}) = (\lambda\,x:\lambda\,y:h_f\,(x\,x\,y)\,(y\,x\,y))\,\,\hat{f}\,\hat{g},\\ \hat{g}\,\hat{f}\,\hat{g} &amp;= h_g\,(\hat{f}\,\hat{f}\,\hat{g})\,(\hat{g}\,\hat{f}\,\hat{g}) = (\lambda\,x:\lambda\,y:h_g\,(x\,x\,y)\,(y\,x\,y))\,\,\hat{f}\,\hat{g}, \end{align}</p> <p>which suggests that</p> <p>\begin{align} \hat{f} &amp;= \lambda\,x:\lambda\,y:h_f\,(x\,x\,y)\,(y\,x\,y), \\ \hat{g} &amp;= \lambda\,x:\lambda\,y:h_g\,(x\,x\,y)\,(y\,x\,y). 
\end{align}</p> <p>Therefore</p> <p>\begin{align} f &amp;= \hat{f}\,\hat{f}\,\hat{g} \\ &amp;= (\lambda\,x:\lambda\,y:h_f\,(x\,x\,y)\,(y\,x\,y))\, (\lambda\,x:\lambda\,y:h_f\,(x\,x\,y)\,(y\,x\,y))\, (\lambda\,x:\lambda\,y:h_g\,(x\,x\,y)\,(y\,x\,y)) \\ &amp;= Y_f\, h_f\, h_g \end{align}</p> <p>where</p> <script type="math/tex; mode=display">Y_f = (\lambda\, h_f': \lambda\, h_g': (\lambda\,x:\lambda\,y:h_f'\,(x\,x\,y)\,(y\,x\,y))\, (\lambda\,x:\lambda\,y:h_f'\,(x\,x\,y)\,(y\,x\,y))\, (\lambda\,x:\lambda\,y:h_g'\,(x\,x\,y)\,(y\,x\,y))).</script> <p>Similarly,</p> <script type="math/tex; mode=display">g = Y_g\, h_f\, h_g.</script> <p><em>Dang</em>, laborious, but that worked. And thus we have derived two analogues <script type="math/tex">Y_f</script> and <script type="math/tex">Y_g</script> of the Y combinator that implement mutual recursion in languages that don’t even support simple recursion.</p> <h2 id="implementing-the-y-combinator-in-python">Implementing the Y Combinator in Python</h2> <p>Well, that’s cool and all, but let’s see whether this Y combinator thing actually works. 
Consider the following nearly 1-to-1 translation of <script type="math/tex">Y</script> and <script type="math/tex">\code{fact}'</script> to Python:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Y</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">f</span><span class="p">:</span> <span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">f</span><span class="p">(</span><span class="n">x</span><span class="p">(</span><span class="n">x</span><span class="p">)))(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">f</span><span class="p">(</span><span class="n">x</span><span class="p">(</span><span class="n">x</span><span class="p">)))</span>
<span class="n">fact</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">f</span><span class="p">:</span> <span class="k">lambda</span> <span class="n">n</span><span class="p">:</span> <span class="mi">1</span> <span class="k">if</span> <span class="n">n</span> <span class="o">==</span> <span class="mi">0</span> <span class="k">else</span> <span class="n">n</span> <span class="o">*</span> <span class="n">f</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
</code></pre></div></div> <p>If we try to run this, we run into some weird recursion:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">Y</span><span class="p">(</span><span class="n">fact</span><span class="p">)(</span><span class="mi">4</span><span class="p">)</span>
<span class="n">RecursionError</span><span class="p">:</span> <span class="n">maximum</span> <span class="n">recursion</span> <span class="n">depth</span> <span class="n">exceeded</span>
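# Sketch added for illustration, not part of the original post.
# A likely reading of the error: Python evaluates arguments eagerly, so
# x(x) is expanded before f is ever called. The names `trace` and `f_spy`
# are hypothetical helpers, used only to make this visible.
trace = []

def f_spy(g):
    # Only runs once its argument x(x) has been fully evaluated,
    # which never happens here.
    trace.append("f called")
    return g

try:
    (lambda x: f_spy(x(x)))(lambda x: f_spy(x(x)))
except RecursionError:
    trace.append("RecursionError")

# trace now holds only "RecursionError": f_spy was never reached.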
</code></pre></div></div> <p>Eh? What’s going on? Let’s write down <script type="math/tex">Y</script> once more for closer inspection:</p> <script type="math/tex; mode=display">Y = \lambda\, f: (\lambda\, x : f\,(x\, x))\, (\lambda\, x : f\,(x\, x)).</script> <p>Python evaluates a function’s argument before calling the function. So, after <script type="math/tex">f</script> is passed to <script type="math/tex">Y</script>, <script type="math/tex">(\lambda\, x : f\,(x\, x))</script> is passed to <script type="math/tex">(\lambda\, x : f\,(x\, x))</script>, which, before it can call <script type="math/tex">f</script>, first evaluates <script type="math/tex">x\, x</script>, which passes <script type="math/tex">(\lambda\, x : f\,(x\, x))</script> to <script type="math/tex">(\lambda\, x : f\,(x\, x))</script>; which then again evaluates <script type="math/tex">x\, x</script>, which again passes <script type="math/tex">(\lambda\, x : f\,(x\, x))</script> to <script type="math/tex">(\lambda\, x : f\,(x\, x))</script>; <em>ad infinitum</em>. Written down differently, evaluation of <script type="math/tex">Y\, f\, x</script> yields</p> <script type="math/tex; mode=display">Y\, f\, x = (Y\, f)\, x = (f\, (Y\, f))\, x = (f\, (f\, (Y\, f)))\, x = (f\, (f\, (f\, (Y\, f))))\, x = \ldots,</script> <p>which goes on indefinitely. Consequently, <script type="math/tex">Y\, f</script> will not evaluate in finite time, and this is the cause of the <code class="highlighter-rouge">RecursionError</code>. 
But we can fix this, and quite simply so: only allow the recursion (the <script type="math/tex">x\,x</script> bit) to happen when it’s passed an argument; in other words, replace</p> <script type="math/tex; mode=display">\label{eq:strict-evaluation} x\,x \to \lambda\,y:x\,x\,y.</script> <p>Applied to an argument, both sides behave identically, but the right-hand side is a function whose body is only evaluated once it is called. Substituting Equation \eqref{eq:strict-evaluation} in Equation \eqref{eq:Y-combinator}, we find</p> <script type="math/tex; mode=display">\label{eq:strict-Y-combinator} Y = \lambda\, f : (\lambda\, x : f(\lambda\, y: x\, x\,y))\, (\lambda\, x : f(\lambda\, y:x\, x\, y)).</script> <p>Translating to Python,</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Y</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">f</span><span class="p">:</span> <span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">f</span><span class="p">(</span><span class="k">lambda</span> <span class="n">y</span><span class="p">:</span> <span class="n">x</span><span class="p">(</span><span class="n">x</span><span class="p">)(</span><span class="n">y</span><span class="p">)))(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">f</span><span class="p">(</span><span class="k">lambda</span> <span class="n">y</span><span class="p">:</span> <span class="n">x</span><span class="p">(</span><span class="n">x</span><span class="p">)(</span><span class="n">y</span><span class="p">)))</span>
</code></pre></div></div> <p>And then we try again:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">Y</span><span class="p">(</span><span class="n">fact</span><span class="p">)(</span><span class="mi">4</span><span class="p">)</span>
<span class="mi">24</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">Y</span><span class="p">(</span><span class="n">fact</span><span class="p">)(</span><span class="mi">3</span><span class="p">)</span>
<span class="mi">6</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">Y</span><span class="p">(</span><span class="n">fact</span><span class="p">)(</span><span class="mi">2</span><span class="p">)</span>
<span class="mi">2</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">Y</span><span class="p">(</span><span class="n">fact</span><span class="p">)(</span><span class="mi">1</span><span class="p">)</span>
<span class="mi">1</span>
</code></pre></div></div> <p>Sweet success!</p> <h2 id="summary">Summary</h2> <p>To recapitulate, the Y combinator is a higher-order function that can be used to define recursion, and even mutual recursion, in languages that don’t support recursion. One way of deriving <script type="math/tex">Y</script> is to assume that the recursive function under consideration <script type="math/tex">f</script> is the result of some other function <script type="math/tex">\hat{f}</script> applied to itself: <script type="math/tex">f = \hat{f}\,\hat{f}</script>; after some simple manipulation, the result can then be determined by inspection. Although <script type="math/tex">Y</script> can indeed be used to define recursive functions, it cannot be applied literally in a programming language that evaluates arguments eagerly, such as Python; the self-application then unfolds without end and a recursion error occurs. Fortunately, this can be fixed simply by letting the recursion in <script type="math/tex">Y</script> happen only when needed, that is, <em>lazily</em>.</p> <div class="footnotes"> <ol> <li id="fn:1"> <p>Do you see why this is the appropriate generalisation of letting <script type="math/tex">f=\hat{f}\,\hat{f}</script>? <a href="#fnref:1" class="reversefootnote">&#8617;</a></p> </li> </ol> </div> <h1>Hello, World</h1> <p class="pretitle">Posted 2018-06-19 at <a href="https://wesselb.github.io/2018/06/19/hello-world">https://wesselb.github.io/2018/06/19/hello-world</a>.</p> <p>Hello, world! 
Another blog has come into existence. Woo! Find out more about me <a href="/portfolio">here</a> and <a href="/about">here</a>.</p> <p>Posts to follow soon. I promise.</p>