Skip to main content

Lambda Calculus

The section on computability mentioned the Church-Turing Thesis that Turing machines fully capture our notion of what may be computed.1 Part of the original justification for this belief is that Alan Turing showed in 1936 that his machine model was equivalent in power to another model of computation being developed at around the same time by Alonzo Church. This model is now known as the lambda calculus, and it forms the core of the theory of functional programming.

As already mentioned in the introductory section on functional programming, the symbol lambda (λ\lambda) is used to indicate an anonymous function defined by an expression. For example, λx(x+1)\lambda x(x + 1) is the function that takes any argument xx and returns that argument plus one; in ReasonML, this would be written x => { x + 1 }. The fundamental operation in λ\lambda-calculus is known as beta (β\beta) reduction. Given an expression that contains a λ\lambda function applied to an argument, β\beta reduction allows the application to be replaced by the body of the function with the argument substituted for the variable.2

For example, (λx(x+1))(41)(\lambda x(x+1))(41) may be reduced to the expression 41+141+1. To avoid an explosion of parentheses, we will often write a lambda function with a dot after the parameter name, meaning that the body of the function extends as far as possible to the right within the surrounding expression. We will also often omit the parentheses around the argument in a function application, writing f  xf\;x instead of f(x)f(x). When a term is applied to multiple arguments we will associate the application to the left, so that M  N  PM\;N\;P is the same as ((M  N)  P)((M\;N)\;P). With these conventions, the previous example becomes (λx.  x+1)  41(\lambda x.\;x+1)\;41.

Formally, the lambda calculus deals with terms of just three forms:

  • variables (xx, yy, ),
  • applications (M  NM\; N, where MM and NN are arbitrary lambda terms), and
  • abstractions (λx.  M\lambda x.\;M, where xx is a variable and MM is a lambda term). Although the example above used additional symbols such as ++ and 4141, these are not necessary. The remainder of this section will show how to encode common data types and operations with lambda terms. The development is based on the Church encoding as presented in the Wikipedia article on the lambda calculus; other encodings are possible.

Church Numerals and Arithmetic

We may encode the natural numbers as follows:

0λf.λx.  x1λf.λx.  f(x)2λf.λx.  f(f(x))nλf.λx.  f(n)(x)\begin{aligned} 0&\equiv\lambda f.\lambda x.\; x\\ 1&\equiv\lambda f.\lambda x.\; f(x)\\ 2&\equiv\lambda f.\lambda x.\; f(f(x))\\ &\ldots\\ n&\equiv\lambda f.\lambda x.\; f^{(n)}(x) \end{aligned}

That is, the encoding of nn is a function that takes a function ff and a value xx and iteratively applies ff to xx a total of nn times. The successor function, λy.  y+1\lambda y.\;y+1, is then given by SUCCλy.λf.λx.  f(yfx)\textrm{SUCC}\equiv\lambda y.\lambda f.\lambda x.\;f(y f x). Using this, we may define addition by noticing that adding yy to zz is the same as taking the successor of zz, yy times:

PLUSλy.λz.  y  SUCC  z\begin{aligned} \textrm{PLUS}&\equiv\lambda y.\lambda z.\;y\;\textrm{SUCC}\;z \end{aligned}

Similarly, since multiplication is repeated addition, we may define

MULTλy.λz.  y  (PLUS  z)  0\begin{aligned} \textrm{MULT}&\equiv\lambda y.\lambda z.\;y\;(\textrm{PLUS}\;z)\;0 \end{aligned}

For example,

MULT  6  7=(λy.λz.  y  (PLUS  z)  0)  6  7=(λz.  6  (PLUS  z)  0)  7=6  (PLUS  7)  0=(PLUS  7)(6)(0)=(PLUS  7)(5)(PLUS  7  0)=(PLUS  7)(5)(7)=(PLUS  7)(4)(14)==PLUS  7  35=42.\begin{aligned} \textrm{MULT}\;6\;7&=(\lambda y.\lambda z.\;y\;(\textrm{PLUS}\;z)\;0)\;6\;7\\ &=(\lambda z.\;6\;(\textrm{PLUS}\;z)\;0)\;7\\ &=6\;(\textrm{PLUS}\;7)\;0\\ &=(\textrm{PLUS}\;7)^{(6)}(0)\\ &=(\textrm{PLUS}\;7)^{(5)}(\textrm{PLUS}\;7\;0)\\ &=(\textrm{PLUS}\;7)^{(5)}(7)\\ &=(\textrm{PLUS}\;7)^{(4)}(14)\\ &=\cdots\\ &=\textrm{PLUS}\;7\;35\\ &=42. \end{aligned}

If we had a function PRED that computed the "predecessor" (λx.  x1\lambda x.\;x - 1, except the predecessor of zero is zero because we are limiting our arithmetic to the natural numbers for the moment), then we could similarly define subtraction as the iteration of PRED:

SUBλy.λz.  z  PRED  y\begin{aligned} \textrm{SUB}&\equiv\lambda y.\lambda z.\;z\;\textrm{PRED}\;y \end{aligned}

Note that SUB  m  n\textrm{SUB}\;m\;n is zero whenever mnm\leq n; this is often called the "modified minus" or monus operation. The function PRED itself will be defined below.

Booleans and Conditionals

Fundamentally, the role of a Boolean value is to make a distinction: true or false, yes or no, do this or else do that. The encoding of a Boolean will therefore take two arguments, aa and bb, and select one of them:3

TRUEλa.λb.  aFALSEλa.λb.  b\begin{aligned} \textrm{TRUE}&\equiv\lambda a.\lambda b.\;a\\ \textrm{FALSE}&\equiv\lambda a.\lambda b.\;b \end{aligned}

Given these encodings, we may then define Boolean operations and the conditional as follows:

ANDλp.λq.  p  q  pORλp.λq.  p  p  qNOTλp.  p  FALSE  TRUEIFλp.  p\begin{aligned} \textrm{AND}&\equiv\lambda p.\lambda q.\;p\;q\;p\\ \textrm{OR}&\equiv\lambda p.\lambda q.\;p\;p\;q\\ \textrm{NOT}&\equiv\lambda p.\;p\;\textrm{FALSE}\;\textrm{TRUE}\\ \textrm{IF}&\equiv\lambda p.\;p \end{aligned}

You should convince yourself that these operations are correct, by evaluating each possible combination: AND  FALSE  FALSE\textrm{AND}\;\textrm{FALSE}\;\textrm{FALSE}, . Note that IF is trivial, since IF  p  a  b\textrm{IF}\;p\;a\;b is the same as just applying p  a  bp\;a\;b; essentially, the meaning of a Boolean is the conditional, just as the meaning of a natural number is the ability to iterate a particular number of times.

The most basic operation on natural numbers that returns a Boolean is the predicate ISZERO, which takes a number and returns TRUE if it is zero and FALSE otherwise. This is easy to implement as the nn-fold iteration of the constant FALSE function, starting from the initial value TRUE:

ISZEROλn.  n  (λx.  FALSE)  TRUE\begin{aligned} \textrm{ISZERO}&\equiv\lambda n.\;n\;(\lambda x.\;\textrm{FALSE})\;\textrm{TRUE} \end{aligned}

If nn is zero, then we just get TRUE. If nn is non-zero, then the constant FALSE function will be applied at least once, making the result FALSE.

Building on this, we get less-than-or-equal-to and equality predicates by using subtraction:

LEQλm.λn.  ISZERO  (SUB  m  n)EQλm.λn.  AND  (LEQ  m  n)  (LEQ  n  m)\begin{aligned} \textrm{LEQ}&\equiv\lambda m.\lambda n.\;\textrm{ISZERO}\;(\textrm{SUB}\;m\;n)\\ \textrm{EQ}&\equiv\lambda m.\lambda n.\;\textrm{AND}\;(\textrm{LEQ}\;m\;n)\;(\textrm{LEQ}\;n\;m) \end{aligned}

This encoding of Booleans is actually found in some pure object-oriented languages, such as Smalltalk. The idea is that, if everything is an object, then the natural way to do anything is to call a method on an object. The TRUE and FALSE objects have methods that ask them to choose one of two things; the method in TRUE is implemented to return the first thing, while the implementation in FALSE returns the second. Here is what this might look like implemented in Java:

import java.util.function.Supplier;
public interface Boolean {
public <T> T ifThenElse(Supplier<T> ifTrue, Supplier<T> ifFalse);
public void ifThenElse(Runnable ifTrue, Runnable ifFalse);
}
public class True implements Boolean {
public <T> T ifThenElse(Supplier<T> ifTrue, Supplier<T> ifFalse) {
return ifTrue.get();
}
public void ifThenElse(Runnable ifTrue, Runnable ifFalse) {
ifTrue.run();
}
}
public class False implements Boolean {
public <T> T ifThenElse(Supplier<T> ifTrue, Supplier<T> ifFalse) {
return ifFalse.get();
}
public void ifThenElse(Runnable ifTrue, Runnable ifFalse) {
ifFalse.run();
}
}
public class Demo {
public static void main(String[] args) {
Boolean b1 = new True();
Boolean b2 = new False();
int result = b1.ifThenElse(() -> 42, () -> 37 / 0);
b2.ifThenElse(() -> System.out.println("Boo!"), () -> System.out.println("The answer is " + result));
}
}

The Supplier<T> and Runnable types are used because we want to pass in thunks to the ifThenElse method. A thunk is an anonymous function that takes no arguments; it is used to delay the execution of a block of code until we are ready to get() its value or run() it. This is needed in Java (and in almost all common programming languages except Haskell and Scala) because otherwise ifThenElse would evaluate both of its arguments before performing the method call; in the example, without using thunks, the call to b1.ifThenElse would result in a division by zero exception, and the call to b2.ifThenElse would print Boo! as well as the correct answer.4

Pairs and Lists

Inspired by the object-oriented view of the Boolean encoding, one way to represent a pair of values is to construct a function holding both values and expecting a "selector" that can retrieve either one of them on demand. Since the selector is just a function that chooses which of the two elements to retrieve, it can be a Boolean. This leads to the following encoding:

PAIRλa.λb.λs.  s  a  bFIRSTλp.  p  TRUESECONDλp.  p  FALSE\begin{aligned} \textrm{PAIR}&\equiv\lambda a.\lambda b.\lambda s.\;s\;a\;b\\ \textrm{FIRST}&\equiv\lambda p.\;p\;\textrm{TRUE}\\ \textrm{SECOND}&\equiv\lambda p.\;p\;\textrm{FALSE} \end{aligned}

You should check that if pPAIR  1  2p\equiv\textrm{PAIR}\;1\;2, then FIRST  p\textrm{FIRST}\;p reduces to 1, while SECOND  p\textrm{SECOND}\;p reduces to 2.

Using pairs, we may build up lists with the following two helpers:

NILλs.  TRUEISNILλp.  p  (λa.λb.  FALSE)\begin{aligned} \textrm{NIL}&\equiv\lambda s.\;\textrm{TRUE}\\ \textrm{ISNIL}&\equiv\lambda p.\;p\;(\lambda a.\lambda b.\;\textrm{FALSE}) \end{aligned}

The list [1,2,3][1, 2, 3] is encoded as PAIR  1  (PAIR  2  (PAIR  3  NIL))\textrm{PAIR}\;1\;(\textrm{PAIR}\;2\;(\textrm{PAIR}\;3\;\textrm{NIL})). Given a list \ell, ISNIL  \textrm{ISNIL}\;\ell returns TRUE if the list is empty and FALSE otherwise. If \ell is non-empty, then FIRST  \textrm{FIRST}\;\ell extracts its head and SECOND  \textrm{SECOND}\;\ell its tail.

We may also use pairs to construct the promised function PRED.5 Consider the function Fλp.  PAIR  (SECOND  p)  (SUCC  (SECOND  p))F\equiv\lambda p.\;\textrm{PAIR}\;(\textrm{SECOND}\;p)\;(\textrm{SUCC}\;(\textrm{SECOND}\;p)). Then F  (PAIR  m  n)=PAIR(n,n+1)F\;(\textrm{PAIR}\;m\;n)=\textrm{PAIR}(n, n+1). If you iterate FF a non-zero number nn times and apply it to PAIR  0  0\textrm{PAIR}\;0\;0, then the result will be PAIR  (n1)  n\textrm{PAIR}\;(n-1)\;n (prove this by induction on nn). Therefore, we may define PRED as:

PREDλn.  FIRST  (n  F  (PAIR  0  0))\begin{aligned} \textrm{PRED}&\equiv\lambda n.\;\textrm{FIRST}\;(n\;F\;(\textrm{PAIR}\;0\;0)) \end{aligned}

This has the disadvantage that computing the predecessor of nn takes O(n)O(n) time, but no one said the Church encodings were efficient.

Recursion

Consider the recursive definition of the factorial function in ReasonML:

The body of the function needs to have access to the definition of fact itself; this is the point of the rec qualifier with let. If we could provide the definition of fact as another argument to the function, then we could express factorial with the following "template":

This template describes how we could define fact if we already had the fact function: fact = fact_template(fact). To break out of the circularity of this definition, what we really need is a way to solve the equation f = fact_template(f) for f; such a solution is known as a fixed point of fact_template.6 A remarkable fact about the lambda calculus is that there is a term that will find such a fixed point:

FIXλf.  (λx.  f  (x  x))  (λx.  f  (x  x))\begin{aligned} \textrm{FIX}&\equiv\lambda f.\;(\lambda x.\;f\;(x\;x))\;(\lambda x.\;f\;(x\;x)) \end{aligned}

Observe what happens when we reduce FIX  g\textrm{FIX}\;g:

FIX  g=(λx.  g  (x  x))  (λx.  g  (x  x))=g((λx.  g  (x  x))  (λx.  g  (x  x)))\begin{aligned} \textrm{FIX}\;g&=(\lambda x.\;g\;(x\;x))\;(\lambda x.\;g\;(x\;x))\\ &=g((\lambda x.\;g\;(x\;x))\;(\lambda x.\;g\;(x\;x))) \end{aligned}

That is, FIX  g=g(FIX  g)\textrm{FIX}\;g=g(\textrm{FIX}\;g), so f=FIX  gf=\textrm{FIX}\;g is a solution to the equation f=g(f)f=g(f)!

Using FIX, we may define the factorial function (and by this point you should be convinced that we may define any pure function that we could have written in ReasonML) as follows:

FACTFIX  (λf.λn.  IF  (ISZERO  n)  1  (MULT  n  (f  (PRED  n))))\begin{aligned} \textrm{FACT}&\equiv\textrm{FIX}\;(\lambda f.\lambda n.\;\textrm{IF}\;(\textrm{ISZERO}\;n)\;1\;(\textrm{MULT}\;n\;(f\;(\textrm{PRED}\;n)))) \end{aligned}

  1. The thesis was explicitly named in 1952 by Church's student Stephen Kleene, also known for introducing the star operator of regular expressions.
  2. There is also a notion of alpha (α\alpha) equivalence, which allows us to uniformly rename the bound parameter throughout a function, so that λx(x+1)\lambda x(x + 1) is the same function as λy(y+1)\lambda y(y + 1). This is important for precise theoretical work, but we will be able to ignore the issue.
  3. Interestingly, the encoding for FALSE is the same (up to α\alpha-equivalence) as the encoding of zeroapparently C made the right choice in equating zero and false!
  4. Side effects!
  5. This version of the predecessor function was discovered by Stephen Kleene, reportedly while visiting the dentist.
  6. Another way to view this is that each application of fact_template takes an imperfect approximation to factorial and produces a better approximation. Finding the fixed point is just taking the limit of this series of approximations.