I've found myself with a lot of free time while under preventive isolation to avoid spreading COVID-19 in my country. The sad part is that most of this free time comes from the fact that I cannot do the things I really liked, like commuting to work by bike every day, or taking long walks in the park with my wife and our dog.

However, there is a happy part! Now I have a lot of time to study things that I didn't have the time to study before. One of them is actually reading Pierce's Types and Programming Languages book from end to end. I've been using the book as a reference for a few years, but I haven't had the time and motivation to actually read each chapter and do the exercises.

Nonetheless, I have an urge to actually field-test all the constructs proposed in the book; I suppose that's my inner physicist telling me that theory is useless without experimentation. So I will be implementing some of the languages and type systems proposed in the book using Rust.

Will you ever implement one of those type systems exactly as presented, in any language? Probably not. But some of the ideas you see in TAPL will give you a deeper understanding of how everything works.

We will start with the untyped lambda calculus, because it gives us plenty of chances to discuss things that are relevant when building a real language. However, this first post will just be a simple but functional implementation of it.

What's the lambda calculus?

The lambda calculus was one of the first models of computation ever formulated; it was proposed by Alonzo Church during the early 1930s. This calculus is pretty popular in the language design and type systems communities for two reasons. The first one is that it is the most powerful language you'll ever use: every construct in any language can be encoded in the lambda calculus; in other words, it is Turing-complete. The second reason is that it is a particularly simple formal language, which means it is much closer to a regular programming language than, say, a Turing machine.

The funny part is that Church invented the lambda calculus as a logic system, not as the foundation of modern computer science, quoting him: "There may, indeed, be other applications of the system than its use as a logic".

The Syntax

First we need to know how to write something in the lambda calculus. This is known as the syntax of the language. The lambda calculus has only three kinds of terms:

  • There are variables, which are usually denoted by lowercase letters: a, b, c, ...
  • There are lambda-abstractions, which are akin to anonymous functions in programming languages; they have a single argument (which is a variable) and a body (which can be any term). They are represented as λ<variable>. <term>.
  • There are applications between terms, which are the equivalent of function application, except that here they can happen between any pair of terms. They are represented as <term> <term>.

Most of the time you don't find the grammar of a language written like this. You will find a definition like this one instead:

    t :=                <terms>
         x              <variables>
       | λx. t          <abstractions>
       | t t            <applications>

Which contains more or less the same information as the first definition.

And that's it, there are no more constructs in this language: no conditionals, no assignments, no arithmetic, nothing. Every construct you've seen in any language can be successfully emulated with just variables, abstractions and applications.
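
For example, booleans and a conditional can be encoded using nothing but abstractions; this is the classic Church encoding, shown here only to illustrate the claim (we won't need it in the rest of the post):

true  = λt. λf. t
false = λt. λf. f
if    = λc. λa. λb. ((c a) b)

Applying if to true and two branches evaluates to the first branch, and applying it to false evaluates to the second one.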

Implementing the syntax

Representing terms in Rust based on the grammar is pretty straightforward. Given that terms can be variables, abstractions or applications, it is sensible to use an enum to represent them:

enum Term {
    Var,
    Abs,
    App,
}

Then we need to add the relevant fields to each variant:

  • Variables can (hopefully) be represented as ASCII characters, so a u8 will be good enough.
  • Abstractions require a variable and another term, so we should use a u8 and a Box<Term>.
  • Applications require two terms, so we will use two Box<Term>.

So our Term enumeration will look like this:

enum Term {
    Var(u8),
    Abs(u8, Box<Term>),
    App(Box<Term>, Box<Term>)
}

The Box type, which is an owned pointer to a value in the heap, is required because a type cannot contain itself in its own definition unless the recursive field is behind some kind of pointer; otherwise Rust cannot compute the size of the type.
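
For illustration, this is what Term would look like without the boxes; Rust rejects it because the type would have an infinite size:

enum Term {
    Var(u8),
    Abs(u8, Term),   // error: recursive field without indirection
    App(Term, Term), // error: recursive fields without indirection
}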

Additionally we will implement the std::fmt::Display trait to have a clearer representation of terms when printing them:

use std::fmt;

impl fmt::Display for Term {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Term::Var(var) => write!(f, "{}", *var as char),
            Term::Abs(var, term) => write!(f, "(λ{}. {})", *var as char, term),
            Term::App(t1, t2) => write!(f, "({} {})", t1, t2),
        }
    }
}

We've cast the u8 representation of variables to char so we get letters instead of numbers when displaying them. Also, we added brackets around abstractions and applications to avoid ambiguities.

Now let's test this, to write a term like ((λx. (y x)) z) we have to write the following:

fn main() {
    let term = Term::App(
        Box::new(Term::Abs(
            b'x',
            Box::new(Term::App(
                Box::new(Term::Var(b'y')),
                Box::new(Term::Var(b'x')),
            )),
        )),
        Box::new(Term::Var(b'z')),
    );
    println!("{}", term);
}

It is immediately clear that our representation of terms is not the easiest to write right now. All those boxes make the process too cumbersome... But it works.
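
If the boxes become too annoying, one possible workaround (just a sketch on top of the Term type above, not something the rest of the post depends on) is a set of small constructor helpers that do the boxing for us:

// Purely illustrative helpers to make writing terms less verbose.
fn var(v: u8) -> Term {
    Term::Var(v)
}

fn abs(arg: u8, body: Term) -> Term {
    Term::Abs(arg, Box::new(body))
}

fn app(t1: Term, t2: Term) -> Term {
    Term::App(Box::new(t1), Box::new(t2))
}

With them, the term above can be written as app(abs(b'x', app(var(b'y'), var(b'x'))), var(b'z')).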

The Semantics

Right now our implementation of the lambda calculus is just a bunch of symbols, and without any meaning or interpretation that won't change. We need to have a semantics for our language to define how terms are evaluated and actually do computations with them.

The most important thing about the lambda calculus is that abstractions represent functions; they have an argument and a body, after all. Under that premise, the term λx. x represents the identity function, i.e. the function that returns its own argument without changing it. So, if abstractions are functions, an application like (t1 t2) represents function application whenever t1 is an abstraction.

For example, ((λx. x) y) should somehow evaluate to just y, even if y itself is another abstraction. The reader familiar with functional programming will recognize this as having functions as first-class citizens, although in a language with only three kinds of citizens this is a necessity, not a feature.

With that interpretation of terms, we can define the first evaluation rule for our language. This rule is known as beta-reduction and states the following: "To evaluate an application between an abstraction and any term, it is enough to substitute every occurrence of the abstraction's argument inside the body of the abstraction by the term". This rule is commonly represented as:

((λx. b) t) -> ([x -> t] b)

Where the external arrow is read as "evaluates to" and [x -> t] b means replacing every occurrence of x inside b by t. Just to be sure this makes sense, let's do an example:

((λx. (x y)) (λz. z)) -> ([x -> (λz. z)] (x y))
                      -> ((λz. z) y)
                      -> ([z -> y] z)
                      -> y

So yeah, this beta-reduction rule is just a fancy way of doing function application.

Implementing beta-reduction

To implement this beta-reduction rule over our terms, we need to define the substitution operation first. So we need to go back and figure out what [a -> b] t actually means.

A formal definition of substitution

The first important fact is that, for doing beta-reduction, we only replace variables by terms. So in [a -> b] t, a is always a variable. Now we need to know how this operation works for every possible term t.

However, this is more or less easy to do because there are only three forms of terms. Thus, to understand how substitution behaves over any term, we just need to understand how it behaves over variables, abstractions and applications.

Variables are the easiest ones: doing [a -> b] x, where x is a variable, only performs the substitution if a is equal to x:

[a -> b] a = b
[a -> b] x = x     if x is not equal to a.

Now let's do abstractions. At first glance those are easy too: replacing a variable inside a function is just the same as replacing it in its body:

[a -> b] (λx. t) = λx . ([a -> b] t)

However, there is one edge case. What happens if we do [a -> b] (λa. a)? Should that be (λa. b) or (λa. a)?

We might agree that (λa. a) is the same as (λx. x); in the end, both are the identity function. So the letter we use for an abstraction's argument is somehow irrelevant, but its use must be consistent. This notion also has a fancy name: it is known as alpha-equivalence.

One possible solution would be to first translate [a -> b] (λa. a) into [a -> b] (λx. x), using alpha-equivalence as an excuse. The disadvantage here is that we somehow need to find a variable name that is not being used inside the body of the abstraction, and things start to get messy.

The solution we will take is a little bit disappointing. In the end, this is the programmer's fault! Why are they writing ambiguous programs like that? Instead of cleaning up the mess caused by using the same variable in different places, the substitution won't affect the abstraction if the variable being substituted is equal to the abstraction's argument (thank you to u/gallais for noticing I was ignoring this case):

[a -> b] (λa. t) = λa . t
[a -> b] (λx. t) = λx . ([a -> b] t)    if x is not equal to a

Nevertheless, there is another edge case here:

[a -> b](λb. a) = (λb. b)

This is wrong because we just said that variable names are irrelevant, so that substitution should be equivalent to this one:

[a -> b](λx. a) = (λx. b)

And it is not: (λb. b) is the identity, while (λx. b) is a function that ignores its argument and returns b.

To solve this we also need to track which variables aren't being used as arguments inside the b in [a -> b], and if one of those variables happens to be equal to the argument of the abstraction where we are applying the substitution, we shouldn't do the substitution at all.

To have a nicer way of saying "variables that aren't being used as arguments", we will define the free variables of a term:

  • If the term is a variable x, then x is its only free variable.
  • If the term is an abstraction λx. t, its free variables are the free variables of t, excluding x.
  • If the term is an application (t1 t2), its free variables are the ones that are free in t1 or in t2.

So finally we have a correct substitution rule for abstractions:

[a -> b] (λa. t) = λa . t
[a -> b] (λx. t) = λx . ([a -> b] t)    if x is not equal to a and x is not free in b.
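
For example, under these rules:

[a -> b] (λx. (a x)) = (λx. (b x))
[a -> b] (λb. a)     = (λb. a)      since b is free in b

The second substitution is simply not performed.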

Sadly this restricts a lot what we can evaluate, but we will work out a more elegant solution another day.

To have a complete definition of substitution, we need to define it over applications, but this is fairly straightforward: we just substitute each term in the application:

[a -> b] (t1 t2) = (([a -> b] t1) ([a -> b] t2))

Implementing substitution

So we have a lot to do here. First we need to define a function that finds out whether a variable is free in a term or not, using the definition of free variables we just gave:

impl Term {
    /// Decide if `var` is free in `self`.
    fn is_free(&self, var: u8) -> bool {
        match self {
            Term::Var(var2) => var == *var2,
            Term::Abs(arg, body) => (var != *arg) && body.is_free(var),
            Term::App(t1, t2) => t1.is_free(var) || t2.is_free(var),
        }
    }
}
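
As a quick sanity check (assuming the Term definition above), y should be free in (λx. (x y)) while x should not, since it is bound by the abstraction:

fn main() {
    // Build (λx. (x y)).
    let term = Term::Abs(
        b'x',
        Box::new(Term::App(
            Box::new(Term::Var(b'x')),
            Box::new(Term::Var(b'y')),
        )),
    );
    assert!(term.is_free(b'y'));
    assert!(!term.is_free(b'x'));
}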

Then we need to define the substitution operation itself. It requires the term used as a substitute to be cloneable in order to actually do the substitution, so we'll also add #[derive(Clone)] to Term.

impl Term {
    ...
    /// Replace `var` by `subs` inside `self`.
    fn replace(&mut self, var: u8, subs: &Term) {
        match self {
            Term::Var(var2) => {
                if var == *var2 {
                    *self = subs.clone();
                }
            }
            Term::Abs(arg, body) => {
                if var != *arg && !subs.is_free(*arg) {
                    body.replace(var, subs);
                }
            }
            Term::App(t1, t2) => {
                t1.replace(var, subs);
                t2.replace(var, subs);
            }
        }
    }
}
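
As a small example of the method in action (again assuming the definitions above), doing [x -> z] (x y) should leave us with (z y):

fn main() {
    // Build (x y) and replace x by z in it.
    let mut term = Term::App(
        Box::new(Term::Var(b'x')),
        Box::new(Term::Var(b'y')),
    );
    term.replace(b'x', &Term::Var(b'z'));
    println!("{}", term); // prints (z y)
}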

Great! Now we are ready to implement the beta-reduction rule:

impl Term {
    ...
    /// Reduce `self` if possible.
    fn reduce(&mut self) {
        match self {
            // beta-reduction
            Term::App(t1, t2) => match &mut **t1 {
                Term::Abs(var, body) => {
                    body.replace(*var, t2);
                    *self = *body.clone();
                }
                _ => (),
            },
            _ => (),
        }
    }
}

Now let's test it and see what happens if we reduce ((λx. (y x)) z):

fn main() {
    let mut term = Term::App(
        Box::new(Term::Abs(
            b'x',
            Box::new(Term::App(
                Box::new(Term::Var(b'y')),
                Box::new(Term::Var(b'x')),
            )),
        )),
        Box::new(Term::Var(b'z')),
    );

    println!("Original term: {}", term);
    term.reduce();
    println!("After reduction: {}", term);
}

It evaluates to (y z), which is correct! However, there is a small mistake in our reduction logic: right now, the program ((λx. (λy. x y)) (y z)) reduces to (λy. x y), which is clearly wrong. This happens because when we try to replace x by (y z), the argument y of (λy. x y) is free in (y z), so the substitution is skipped, but reduce still replaces the whole application by the unchanged body.

We can fix this problem by changing the replace method to return a boolean that is false when the substitution did not take place for this reason, and then avoid doing the reduction in that case:

impl Term {
    ...
    /// Replace `var` by `subs` inside `self`. Return `false` if the substitution
    /// did not take place due to issues with free variables.
    fn replace(&mut self, var: u8, subs: &Term) -> bool {
        match self {
            Term::Var(var2) => {
                if var == *var2 {
                    *self = subs.clone();
                }
                true
            }
            Term::Abs(arg, body) => {
                if var == *arg {
                    true
                } else if subs.is_free(*arg) {
                    false
                } else {
                    body.replace(var, subs)
                }
            }
            Term::App(t1, t2) => {
                t1.replace(var, subs) && t2.replace(var, subs)
            }
        }
    }

    /// Reduce `self` if possible.
    fn reduce(&mut self) {
        match self {
            // beta-reduction
            Term::App(t1, t2) => match &mut **t1 {
                Term::Abs(var, body) => {
                    if body.replace(*var, t2) {
                        *self = *body.clone();
                    }
                }
                _ => (),
            },
            _ => (),
        }
    }
}
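
With this change, the problematic term ((λx. (λy. (x y))) (y z)) is left untouched instead of being reduced incorrectly; not great, but at least it is not wrong:

fn main() {
    // ((λx. (λy. (x y))) (y z)): substituting x by (y z) would capture y,
    // so replace returns false and reduce leaves the term as it is.
    let mut term = Term::App(
        Box::new(Term::Abs(
            b'x',
            Box::new(Term::Abs(
                b'y',
                Box::new(Term::App(
                    Box::new(Term::Var(b'x')),
                    Box::new(Term::Var(b'y')),
                )),
            )),
        )),
        Box::new(Term::App(
            Box::new(Term::Var(b'y')),
            Box::new(Term::Var(b'z')),
        )),
    );

    println!("Original term: {}", term);
    term.reduce();
    println!("After reduction: {}", term);
}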

What's next?

There are several improvements we can make to our language:

  • It would be nice to be able to parse expressions like ((λx. (y x)) z) and transform them into a Term instead of writing the term by hand. This is more of a side project because it does not affect the behavior of the language itself.
  • We need some way of representing terms that avoids naming issues. We already fixed part of the problem, but the solution feels like putting a piece of tape on a leaking bucket.
  • We can only do evaluations one step at a time. It is necessary to define some evaluation strategy for our language.

And we will do all of those in exactly that order! See you next time.

PS: you can find the complete code here.