Mathematical Foundations for Programming Languages

Lambda Calculus

The Lambda Calculus

The \(λ\)-calculus is a mathematical language of lambda terms bound by a set of transformation rules. The \(λ\)-calculus notation was introduced in the 1930s by Alonzo Church.

Just like programming languages, the \(λ\)-calculus has rules for what constitutes valid syntax:

Variables: A variable (such as \(x\)) is a valid term in the \(λ\)-calculus.
Abstractions: If \(t\) is a term and \(x\) is a variable, then the term \((λ x.t)\) is a lambda abstraction.
Applications: If \(t\) and \(s\) are terms, then \((ts)\) is the application of \(t\) to \(s\).
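
The three syntax rules map naturally onto an algebraic data type. The following is a minimal sketch in Haskell; the names Term, Var, Lam, App, pretty, and example are assumptions made for illustration and do not come from the slides.

```haskell
-- Hypothetical representation of lambda terms, one constructor per rule.
data Term
  = Var String        -- a variable such as x
  | Lam String Term   -- an abstraction (λ x.t)
  | App Term Term     -- an application (t s)
  deriving (Eq, Show)

-- Render a term in fully parenthesised notation.
pretty :: Term -> String
pretty (Var x)   = x
pretty (Lam x t) = "(λ" ++ x ++ "." ++ pretty t ++ ")"
pretty (App t s) = "(" ++ pretty t ++ " " ++ pretty s ++ ")"

-- The identity function applied to y: ((λx.x) y)
example :: Term
example = App (Lam "x" (Var "x")) (Var "y")
```

Loading this in GHCi and evaluating `putStrLn (pretty example)` should show the term in the notation used on this slide.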

Anonymous Functions

Similar to how (\ x -> t) defines an anonymous function in Haskell, lambda abstractions define anonymous functions in the \(λ\)-calculus.

A lambda abstraction which takes an \(x\) and returns a \(t\) is written as follows:

\[(λ x.t)\]

Example

Suppose in mathematics we define a function \(f(x) = x + 2\). This could be written as \((λ x.x + 2)\) in the \(λ\)-calculus [1]. Of course, this function is anonymous and not bound to the name \(f\).

[1] Of course, we haven’t said that either \(+\) or \(2\) is valid in the lambda calculus yet. We will get to that…
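
For comparison, here is roughly how the same anonymous function looks in Haskell. This is only a sketch: Haskell provides + and 2 as built-ins, and the names applied and f are chosen for illustration.

```haskell
-- The anonymous function (λ x.x + 2), applied directly to 3.
applied :: Int
applied = (\x -> x + 2) 3

-- Binding the same lambda to a name recovers the mathematical f(x) = x + 2.
f :: Int -> Int
f = \x -> x + 2
```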

Functions are First Class

In the \(λ\)-calculus, functions are not only first class, they are the only class of objects. In other words, all data in the \(λ\)-calculus are represented as functions.

Currying

Functions in the \(λ\)-calculus may only take one argument, so currying is typically used to write functions with multiple arguments. For example, the function \(f(x, y) = x + y\) might be written anonymously as:

\[(λ x.(λ y.x + y))\]

Further, function application is left-associative, so \((fxy)\) means \(((fx)y)\).
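
Haskell functions are curried in the same way, so both points above can be tried directly. A small sketch follows; the names addCurried, sumA, sumB, and addThree are illustrative assumptions.

```haskell
-- (λ x.(λ y.x + y)) written as nested single-argument lambdas.
addCurried :: Int -> (Int -> Int)
addCurried = \x -> (\y -> x + y)

-- Application is left-associative: addCurried 3 4 means (addCurried 3) 4.
sumA, sumB :: Int
sumA = addCurried 3 4
sumB = (addCurried 3) 4

-- Supplying only the first argument yields a new one-argument function.
addThree :: Int -> Int
addThree = addCurried 3
```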

Free and Bound Variables

The \(λ\) operator (which creates lambda abstractions) binds its variable wherever that variable occurs in the body of the abstraction.

  • Variables which are bound in an expression are called bound variables
  • Variables which are not bound in an expression are called free variables
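
One way to make these two definitions concrete is to compute them over a term representation. The sketch below assumes a hypothetical Term type (not part of the slides) and uses the term \(((λ x.xy)x)\), in which \(x\) occurs both bound and free.

```haskell
import Data.List (union)

-- Hypothetical term representation, repeated here so the sketch stands alone.
data Term = Var String | Lam String Term | App Term Term
  deriving (Eq, Show)

-- Variables with at least one occurrence not captured by an enclosing λ.
freeVars :: Term -> [String]
freeVars (Var x)   = [x]
freeVars (Lam x t) = filter (/= x) (freeVars t)
freeVars (App t s) = freeVars t `union` freeVars s

-- Variables with at least one occurrence captured by an enclosing λ.
boundVars :: Term -> [String]
boundVars (Var _)   = []
boundVars (Lam x t) = [x | x `elem` freeVars t] `union` boundVars t
boundVars (App t s) = boundVars t `union` boundVars s

-- ((λx.x y) x): the inner x is bound, while y and the outer x are free.
sample :: Term
sample = App (Lam "x" (App (Var "x") (Var "y"))) (Var "x")
```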

Example

With your learning group, identify the free and bound variables in this expression:

\[(λ x.(λ y.zy)(zx))\]

Transformations

\(\alpha\)-conversion: Allows bound variables to be renamed to non-colliding names. For example, \((λ x.x)\) is \(\alpha\)-equivalent to \((λ y.y)\).
\(\beta\)-reduction: Allows functions to be applied. For example, \(((λ x.x^2) 8)\) \(\beta\)-reduces to \(8^2\), which evaluates to \(64\).
\(\eta\)-conversion: Allows functions with the same external properties to be substituted. For example, \((λ x.(fx))\) is \(\eta\)-equivalent to \(f\) if \(x\) does not appear free in \(f\).
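
The three transformations can be sketched as functions over a term representation. Everything below is an illustrative assumption (the Term type and the names fresh, subst, alpha, beta, eta), and each function only acts at the root of a term; it is not a full evaluator.

```haskell
import Data.List (union)

-- Hypothetical term representation, repeated so the sketch stands alone.
data Term = Var String | Lam String Term | App Term Term
  deriving (Eq, Show)

freeVars :: Term -> [String]
freeVars (Var x)   = [x]
freeVars (Lam x t) = filter (/= x) (freeVars t)
freeVars (App t s) = freeVars t `union` freeVars s

-- Append primes to x until the name is not in the avoid list.
fresh :: String -> [String] -> String
fresh x avoid = until (`notElem` avoid) (++ "'") x

-- subst x s t: replace free occurrences of x in t with s, renaming
-- binders in t when they would capture a free variable of s.
subst :: String -> Term -> Term -> Term
subst x s (Var y)
  | y == x    = s
  | otherwise = Var y
subst x s (App t u) = App (subst x s t) (subst x s u)
subst x s (Lam y t)
  | y == x              = Lam y t   -- x is shadowed, nothing to replace
  | y `elem` freeVars s =           -- rename y first to avoid capture
      let y' = fresh y (freeVars s `union` freeVars t `union` [x])
      in Lam y' (subst x s (subst y (Var y') t))
  | otherwise           = Lam y (subst x s t)

-- α-conversion: rename the binder of (λx.t) to y if y is not free in t.
alpha :: String -> Term -> Maybe Term
alpha y (Lam x t)
  | y == x                 = Just (Lam x t)
  | y `notElem` freeVars t = Just (Lam y (subst x (Var y) t))
alpha _ _ = Nothing

-- β-reduction: ((λx.t) s) becomes t with s substituted for x.
beta :: Term -> Maybe Term
beta (App (Lam x t) s) = Just (subst x s t)
beta _                 = Nothing

-- η-conversion: (λx.(f x)) becomes f when x is not free in f.
eta :: Term -> Maybe Term
eta (Lam x (App f (Var y)))
  | x == y && x `notElem` freeVars f = Just f
eta _ = Nothing
```

The renaming step inside subst is itself an \(\alpha\)-conversion: without it, reducing \(((λ x.(λ y.x))y)\) would wrongly turn the free \(y\) into a bound one.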

Examples

With your learning group, identify the transformation used in each of the following expressions, or state they are not equivalent. Turn in your answers on a sheet of paper with all of your names at the end of class for learning group participation credit for today.

  1. \((λ x.(λ x.x)) \to (λ y.(λ y.y))\)
  2. \((λ x.(λ x.x)) \to (λ y.(λ x.x))\)
  3. \((λ x.(λ x.x)) \to (λ y.(λ x.y))\)
  4. \((λ x.(λ y.x)) \to (λ y.(λ y.y))\)
  5. \(((λ x.x)(λ y.y)) \to (λ y.y)\)
  6. \((λ x.((λ y.y)x)) \to (λ y.y)\)

Church Numerals

Since all data in the \(λ\)-calculus must be a function, we use a clever convention of functions (called Church numerals) to define numbers:

0: \(λf.λx.x\)
1: \(λf.λx.fx\)
2: \(λf.λx.f(fx)\)
3: \(λf.λx.f(f(fx))\)

… and so on. In fact, the successor to any number \(n\) can be written as:

\[λf.λx.f (n f x)\]

Notice this

Defining numbers as functions in this way allows us to apply a Church numeral \(n\) to a function to get a new function that applies the original function \(n\) times.
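
Church numerals can be tried out in Haskell almost verbatim. The following is a sketch; the type alias Church and the helper names suc and toInt are assumptions introduced here, not part of the \(λ\)-calculus.

```haskell
-- A Church numeral takes a function f and a start value x,
-- and applies f to x a fixed number of times.
type Church a = (a -> a) -> a -> a

zero, one, two, three :: Church a
zero  = \_ x -> x
one   = \f x -> f x
two   = \f x -> f (f x)
three = \f x -> f (f (f x))

-- Successor: λf.λx.f (n f x)
suc :: Church a -> Church a
suc n = \f x -> f (n f x)

-- Convert to Int by counting applications of (+ 1) starting from 0.
toInt :: Church Int -> Int
toInt n = n (+ 1) 0
```

Because a numeral is just "apply f this many times", `three (++ "!") "hi"` appends three exclamation marks to the string.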

Shorthand Notations

While it’s not a defined part of the \(λ\)-calculus, we define common shorthands for some features:

  • \(0, 1, 2, \ldots\) are shorthand for their corresponding Church numerals
  • \(\text{SUCC} = λn.λf.λx.f (n f x)\)

Note

The notation “\(=\)” above is not a part of the \(λ\)-calculus. I’m using it for saying “is shorthand for”.

Addition and Multiplication

Adding \(m\) to \(n\) can be thought of as taking the successor to \(n\), \(m\) times. Using our shorthand \(\text{SUCC}\), this can be written as:

\[\text{ADD} = λm.λn.(m \,\text{SUCC}\, n)\]

Similarly, multiplying \(m\) by \(n\) can be thought of as repeating \(\text{ADD}\, n\), \(m\) times, and then applying the result to \(0\). This can be written as:

\[\text{MULT} = λm.λn.(m (\text{ADD}\, n) 0)\]
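
These definitions can be checked in Haskell, with one caveat: applying a numeral to SUCC needs a numeral that works at every type, so this sketch wraps numerals in a newtype and enables the RankNTypes extension. The wrapper Church and the helpers fromInt and toInt are assumptions made for illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Wrapping the numeral lets it be applied to functions over numerals.
newtype Church = Church { runChurch :: forall a. (a -> a) -> a -> a }

zero :: Church
zero = Church (\_ x -> x)

-- SUCC = λn.λf.λx.f (n f x)
suc :: Church -> Church
suc n = Church (\f x -> f (runChurch n f x))

-- ADD = λm.λn.(m SUCC n)
add :: Church -> Church -> Church
add m n = runChurch m suc n

-- MULT = λm.λn.(m (ADD n) 0)
mult :: Church -> Church -> Church
mult m n = runChurch m (add n) zero

-- Build a numeral from a non-negative Int by repeated succession.
fromInt :: Int -> Church
fromInt k = iterate suc zero !! k

-- Read a numeral back by counting applications of (+ 1).
toInt :: Church -> Int
toInt n = runChurch n (+ 1) 0
```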

Boolean Logic

We use the following convention for true and false:

\[\begin{split}\begin{split} \text{TRUE} &= λx.λy.x \\ \text{FALSE} &= λx.λy.y \qquad\text{(Church numeral zero)} \end{split}\end{split}\]

From here, we can define some common boolean operators:

\[\begin{split}\begin{split} \text{AND} &= λp.λq.p q p \\ \text{OR} &= λp.λq.p p q \\ \text{NOT} &= λp.p\ \text{FALSE}\ \text{TRUE} \\ \text{IF} &= λp.λa.λb.p a b \\ & \text{ (returns $a$ if the predicate is TRUE, $b$ otherwise)} \end{split}\end{split}\]
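
A sketch of these operators in Haskell follows. AND and OR pass a boolean to itself, which plain Haskell types reject, so this sketch assumes a newtype wrapper with RankNTypes; the names CBool, cAnd, cOr, cNot, cIf, and toBool are illustrative.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church boolean chooses between two alternatives of any type.
newtype CBool = CBool { runBool :: forall a. a -> a -> a }

true, false :: CBool
true  = CBool (\x _ -> x)
false = CBool (\_ y -> y)

-- AND = λp.λq.p q p
cAnd :: CBool -> CBool -> CBool
cAnd p q = runBool p q p

-- OR = λp.λq.p p q
cOr :: CBool -> CBool -> CBool
cOr p q = runBool p p q

-- NOT = λp.p FALSE TRUE
cNot :: CBool -> CBool
cNot p = runBool p false true

-- IF = λp.λa.λb.p a b
cIf :: CBool -> a -> a -> a
cIf p a b = runBool p a b

-- Convert to a native Bool for inspection.
toBool :: CBool -> Bool
toBool p = runBool p True False
```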

Cons Cells

By convention, we will represent a cons cell as a function that applies its argument to the CAR and CDR of the cons cell. This leads to the shorthand:

\[\begin{split}\begin{split} \text{CONS} &= λx.λy.λf.f x y \\ \text{CAR} &= λc.c\ \text{TRUE} \\ \text{CDR} &= λc.c\ \text{FALSE} \\ \text{NIL} &= λx.\text{TRUE} \\ \end{split}\end{split}\]

Using this, we can define lists:

\[(\text{CONS } 1\ (\text{CONS } 2\ (\text{CONS } 3\ \text{NIL})))\]
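
The cons cell encoding also works in Haskell. In the sketch below, ordinary Int values stand in for Church numerals to keep the output readable, TRUE and FALSE are inlined as selector lambdas, and the names lst, firstElem, secondElem, and thirdElem are assumptions for illustration.

```haskell
-- CONS = λx.λy.λf.f x y
cons :: a -> b -> (a -> b -> c) -> c
cons x y = \f -> f x y

-- CAR = λc.c TRUE, with TRUE = λx.λy.x written inline
car :: ((a -> b -> a) -> r) -> r
car c = c (\x _ -> x)

-- CDR = λc.c FALSE, with FALSE = λx.λy.y written inline
cdr :: ((a -> b -> b) -> r) -> r
cdr c = c (\_ y -> y)

-- NIL = λx.TRUE
nil :: p -> a -> b -> a
nil = \_ x _ -> x

-- (CONS 1 (CONS 2 (CONS 3 NIL))), using native Ints as elements
lst = cons (1 :: Int) (cons (2 :: Int) (cons (3 :: Int) nil))

firstElem, secondElem, thirdElem :: Int
firstElem  = car lst
secondElem = car (cdr lst)
thirdElem  = car (cdr (cdr lst))
```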

Lambda Calculus: Where from Here?

  • Subtraction is hard, but doable. Check out the Wikipedia page on Church Numerals for more info.
  • For recursion, we need to reference ourselves in a lambda abstraction. This is done using a Y-combinator.
  • From there, we can use the \(λ\)-calculus to compute the solution to any problem that a Turing machine can.
  • More on all of this in CSCI-561 (Theory of Computation).
  • Many functional programming languages (e.g., Haskell, Lisp) can be viewed as practical implementations of the \(λ\)-calculus, extended with built-in data and syntactic conveniences.
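
The recursion point above can be illustrated in Haskell. The Y combinator itself does not type-check in plain Haskell, so this sketch substitutes the library function fix from Data.Function, which satisfies fix f = f (fix f) and plays the same role. The names factorial and self are illustrative.

```haskell
import Data.Function (fix)

-- The lambda never refers to itself by name; it receives "itself"
-- as the extra argument self, and fix ties the knot.
factorial :: Integer -> Integer
factorial = fix (\self n -> if n <= 1 then 1 else n * self (n - 1))
```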

Monads (not a quiz or exam topic)

What is a Monad?

Monads are a class of functions that compose other functions together in a certain way. A type with a monadic structure defines what it means to chain operations. A monadic type consists of a type constructor and two operations:

Return: Takes a plain value and uses the constructor to place the value in a monadic container, creating a monadic value.
Bind: Does the reverse: takes a monadic value, extracts the plain value inside it, and passes that value to the next function, which returns a new monadic value.

Remember that silly function in Haskell (>>=) that chained IO statements together?
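
To see return and bind outside of IO, here is a sketch using Haskell's built-in Maybe monad. The helper safeDiv and the function calc are hypothetical names introduced for this example.

```haskell
-- Hypothetical helper: integer division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- return wraps 100 in a Maybe; each >>= unwraps the previous result
-- and feeds it to the next function, stopping at the first Nothing.
calc :: Int -> Int -> Maybe Int
calc a b = return 100 >>= \x -> safeDiv x a >>= \y -> safeDiv y b
```

Here `calc 5 2` divides 100 by 5 and then by 2, while a zero in either position makes the whole chain produce Nothing.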

Monads You Know

In Haskell, a list comprehension:

[x * 2 | x <- [1..10], odd x]
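
The comprehension above can be rewritten with the list monad's bind and return, which is one way to see that lists are monads. A sketch, with illustrative names viaComprehension and viaBind:

```haskell
-- The comprehension as written on the slide.
viaComprehension :: [Int]
viaComprehension = [x * 2 | x <- [1 .. 10], odd x]

-- The same computation using >>= and return from the list monad:
-- each x contributes a one-element list if odd, or nothing otherwise.
viaBind :: [Int]
viaBind = [1 .. 10] >>= \x -> if odd x then return (x * 2) else []
```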

In Haskell, the do block used for IO:

main = do
    putStrLn "What is your name?"
    name <- getLine
    putStrLn $ "Nice to meet you " ++ name
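
The do block is syntactic sugar for chained binds. The sketch below spells that out; the names greeting and greet are assumptions, with the pure helper split out only so the message can be checked without performing IO.

```haskell
-- Pure helper (hypothetical) that builds the reply.
greeting :: String -> String
greeting name = "Nice to meet you " ++ name

-- The do block rewritten by hand: >> sequences two actions and
-- discards the first result, >>= passes the result along.
greet :: IO ()
greet =
  putStrLn "What is your name?" >>
    (getLine >>= \name ->
      putStrLn (greeting name))
```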

What Good are Monads?

  • Monads are essentially a hidden data structure that passes state around for you.
  • Many common imperative PL concepts can be defined in terms of a monadic structure, such as random number generators, input/output, variable assignment, …
  • Monads can be created in any language that supports anonymous functions and closures.
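
To illustrate the "passes state around for you" idea, here is a hand-rolled state monad built from nothing but functions and closures. It is a sketch with hypothetical names (State, runState, tick, threeTicks); real programs would more likely use a library implementation.

```haskell
-- A stateful computation is a function from a state to a result
-- paired with the updated state.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s ->
    let (a, s') = g s
    in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  State g >>= f = State $ \s ->
    let (a, s') = g s
    in runState (f a) s'

-- Return the current counter value and increment the hidden state.
tick :: State Int Int
tick = State $ \n -> (n, n + 1)

-- The counter is never mentioned here; bind threads it through.
threeTicks :: State Int [Int]
threeTicks = do
  a <- tick
  b <- tick
  c <- tick
  return [a, b, c]
```

Running `runState threeTicks 10` starts the counter at 10 and returns both the collected values and the final counter.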

Relevant xkcd

From https://xkcd.com/1957/:

CVE-2018-????: Haskell isn’t side-effect free after all, the effects are all just concentrated on this one computer in Missouri that nobody has checked in on in a while.