Proving Cut Admissibility in Twelf

Posted on June 29, 2015

Tags: twelf, types

Veering wildly onto the theory side compared to my last post, I’d like to look at some more Twelf code today. Specifically, I’d like to prove a fun theorem called cut admissibility (or elimination) for a particular logic: a simple intuitionistic propositional sequent calculus. I chucked the code for this over here.

Background

If those words didn’t make any sense, here’ an incomplete primer on what we’re doing here. First of all we’re working with a flavor of logic called “sequent calculus”. Sequent calculus describes a class of logics characterized by using studying “sequents”, a sequent is just an expression Γ ⊢ A saying “A is true under the assumption that the set of propositions, Γ, are true”. A sequent calculus defines a couple things

What exactly A is, a calculus defines what propositions it talks about

For us we’re only interested in a few basic connectives, so our calculus can talk about true, false, A and B (A ∧ B), A or B (A ∨ B), and A implies B (A ⇒ B).
Rules for inferring Γ ⊢ A holds. We can use these inference rules to build up proofs of things in our system.

In sequent calculus there are two sorts of inference rules, left and right. A left rule takes a fact that we know and let’s us reason backwards to other things we must know hold. A right rule let’s us take the thing we’re trying to prove and instead prove smaller, simpler things.

More rules will follow in the Twelf code but for a nice example consider the left and right rules for ∧,
```
  Γ, A, B ⊢ C
 ———————————————
  Γ, A ∧ B ⊢ C

 Γ ⊢ A    Γ ⊢ B
 ———————————————
    Γ ⊢ A ∧ B
```
The left rule says if we know that A ∧ B is true, we can take it apart and try to prove our goal with assumptions that A and B are true. The right rule says to prove that A ∧ B is true we need to prove A is true and B is true. A proof in this system is a true of these rules just like you’d expect in a type theory or natural deduction.

We also tacitly assume a bunch of boring rules called structural rules about our sequents hold, so that we can freely duplicate, drop and swap assumptions in Γ. For a less awful introduction to sequent calculus Frank Pfenning has some good notes.

Now we want to prove a particular (meta)theorem about sequent calculus

if Γ ⊢ A
and Γ, A ⊢ B
then Γ ⊢ B

This theorem means a couple different things for example, our system is consistent and our system also admits lemmas. As it turns out, proving this theorem is hard. The basic complication is that we don’t know what form either of the first two proofs.

The Twelf Stuff

We now formalize our sequent calculus in Twelf. First we declare a type and some constants to represent propositions.

    prop  : type.

    =>    : prop -> prop -> prop. %infix right 4 =>.
    true  : prop.
    false : prop.
    /\    : prop -> prop -> prop. %infix none 5 /\.
    \/    : prop -> prop -> prop. %infix none 5 \/.

Notice here that we use infix to let us write A /\ B => C. Having specified these we now define what a proof is in this system. This is structured a little differently than you’d be led to believe from the above. We have an explicit type proof which is inhabited by “proof terms” which serve as a nice shortcut to those trees generated by inference rules. Finally, we don’t explicitly represent Γ, instead we have this thing called hyp which is used to represent a hypothesis in Γ. Left rules manipulate use these hypotheses and introduce new ones. Pay attention to /‌\/l and /‌\/r since you’ve seen the handwritten equivalents.

    proof   : type.
    hyp     : type.

    init    : hyp -> proof.

    =>/r    : (hyp -> proof) -> proof.
    =>/l    : (hyp -> proof) -> proof -> hyp -> proof.

    true/r  : proof.

    false/l : hyp -> proof.

    /\/r    : proof -> proof -> proof.
    /\/l    : (hyp -> hyp -> proof) -> hyp -> proof.

    \//r1   : proof -> proof.
    \//r2   : proof -> proof.
    \//l    : (hyp -> proof) -> (hyp -> proof) -> hyp -> proof.

The right rules are at least a little intuitive, the left rules are peculiar. Essentially we have a weird CPS-y feel going on here, to decompose a hypothesis you hand the hyp to the constant along with a continuation which takes the hypotheses you should get out of the decomposition. For example for ‌\/ we have to right rules (think Left and Right), then one left rule which takes two continuations and one hyp (think either). Finally, that init thing is the only way to actually take a hypothesis and use it as a proof.

We now want to unite these two pieces of syntax with a typing judgment letting us say that a proof proves some particular prop.

    of         : proof -> prop -> type.
    hof        : hyp   -> prop -> type.

    of/init    : of (init H) A
                  <- hof H A.

    of/=>/r    : of (=>/r F) (A => B)
                  <- ({h} hof h A -> of (F h) B).
    of/=>/l    : of (=>/l C Arg F) U
                  <- hof F (A => B)
                  <- of Arg A
                  <- ({h} hof h B -> of (C h) U).

    of/true/r  : of true/r true.

    of/false/l : of (false/l H) A
                  <- hof H false.

    of//\/r    : of (/\/r R L) (A /\ B)
                  <- of L A
                  <- of R B.
    of//\/l    : of (/\/l C H) U
                  <- hof H (A /\ B)
                  <- ({h}{h'} hof h A -> hof h' B -> of (C h h') U).

    of/\//r1   : of (\//r1 L) (A \/ B)
                  <- of L A.
    of/\//r2   : of (\//r2 R) (A \/ B)
                  <- of R B.
    of/\//l    : of (\//l R L H) C
                  <- hof H (A \/ B)
                  <- ({h} hof h A -> of (L h) C)
                  <- ({h} hof h B -> of (R h) C).

In order to handle hypotheses we have this hof judgment which handles typing various hyps. We introduce it just like we introduce hyps in those continuation-y things for left rules. Sorry for dumping so much code on you all at once: it’s just a lot of machinery we need to get working in order to actually start formalizing cut.

I would like to point out a few things about this formulation of sequent calculus though. First off, it’s very Twelf-y, we use the LF context to host the context of our logic using HOAS. We also basically just have void as the type of hypothesis! Look, there’s no way to construct a hypothesis, let alone a typing derivation hof! The idea is that we’ll just wave our hands at Twelf and say "consider our theorem in a context with hyps and hofs with

    %block someHofs : some {A : prop} block {h : hyp}{deriv : hof h A}.

In short, Twelf is nifty.

The Theorem

Now we’re almost in a position to state cut admissibility, we want to say something like

    cut : of Lemma A
            -> ({h} hyp h A -> of (F h) B)
            -> of ??? B

But what should that ??? be? We could just say “screw it it’s something” and leave it as an output of this lemma but experimentally (an hour of teeth gnashing later) it’s absolutely not worth the pain. Instead let’s do something clever.

Let’s first define an untyped version of cut which works just across proofs without mind to typing derivations. We can’t declare this total because it’s just not going to work for ill-typed things, we can give it a mode though (it’s not needed) just as mechanical documentation.

    cut : proof -> (hyp -> proof) -> proof -> type.
    %mode cut +A +B -C.

The goal here is we’re going to state our main theorem as

    of/cut : {A} of P A
              -> ({h} hof h A -> of (F h) B)
              -> cut P F C
              -> of C B
              -> type.
    %mode of/cut +A +B +C -D -E.

Leaving that derivation of cut as an output. This let’s us produce not just a random term but instead a proof that that term makes “sense” somehow along with a proof that it’s well typed.

cut is going to mirror the structure of of/cut so we now need to figure out how we’re going to structure our proof. It turns out a rather nice way to do this is to organize our cuts into 4 categories. The first one are “principle” cuts, they’re the ones where we have a right rule as our lemma and we immediately decompose that lemma in the other term with the corresponding left rule. This is sort of the case that we drive towards everywhere and it’s where the substitution bit happens.

First we have some simple cases

    trivial : cut P' ([x] P) P.
    p/init1 : cut (init H) ([x] P x) (P H).
    p/init2 : cut P ([x] init x) P.

In trivial we don’t use the hypothesis at all so we’re just “strengthening” here. p/init1 and p/init2 deal with the init rule on the left or right side of the cut, if it’s on the left we have a hypothesis of the appropriate type so we just apply the function. If it’s on the left we have a proof of the appropriate type so we just return that. In the more interesting cases we have the principle cuts for some specific connectives.

    p/=>   : cut (=>/r F) ([x] =>/l ([y] C y x) (Arg x) x) Out'
              <- ({y} cut (=>/r F) (C y) (C' y))
              <- cut (=>/r F) Arg Arg'
              <- cut Arg' F Out
              <- cut Out C' Out'.
    p//\   : cut (/\/r R L) ([x] /\/l ([y][z] C y z x) x) Out'
              <- ({x}{y} cut (/\/r R L) (C x y) (C' x y))
              <- ({x} cut R (C' x) (Out x))
              <- cut L Out Out'.
    p/\//1 : cut (\//r1 L) ([x] \//l ([y] C2 y x) ([y] C1 y x) x) Out
              <- ({x} cut (\//r1 L) (C1 x) (C1' x))
              <- cut L C1' Out.
    p/\//2 : cut (\//r2 L) ([x] \//l ([y] C2 y x) ([y] C1 y x) x) Out
              <- ({x} cut (\//r2 L) (C2 x) (C2' x))
              <- cut L C2' Out.

Let’s take a closer look at p/=>, the principle cut for =>. First off, our inputs are =>/r F and ([x] =>/l ([y] C y x) (Arg x) x). The first one is just a “function” that we’re supposed to substitute into the second. Now the second is comprised of a continuation and an argument. Notice that both of these depend on x! In order to handle this the first two lines of the proof

           <- ({y} cut (=>/r F) (C y) (C' y))
           <- cut (=>/r F) Arg Arg'

Are to remove that dependence. We get back a C' and an Arg' which doesn’t use the hyp (x). In order to do this we just recurse and cut the =>/r F out of them. Notice that both the type and the thing we’re substituting are the same size, what decreases here is what we’re substituting into. Now we’re ready to actually do some work. First we need to get a term representing the application of F to Arg'. This is done with cut since it’s just substitution

              <- cut Arg' F Out

But this isn’t enough, we don’t need the result of the application, we need the result of the continuation! So we have to cut the output of the application through the continuation

              <- cut Out C' Out'.

This code is kinda complicated. The typed version of this took me an hour since after 2am I am charitably called an idiot. However this same general pattern holds with all the principle cuts

The subterms of the target could depend on what we’re substituting for, so get rid of that dependence with a recursive call.
Get the “result”, which is just the term corresponding to the hypothesis the continuation is expecting. This is pretty trivial in all cases except => since it’s just lying about in an input.
Get the actual result by cutting 2. through the continuation.

Try to work through the case for /\ now

    p//\   : cut (/\/r R L) ([x] /\/l ([y][z] C y z x) x) Out'
              <- ({x}{y} cut (/\/r R L) (C x y) (C' x y))
              <- ({x} cut R (C' x) (Out x))
              <- cut L Out Out'.

After principle cuts we really just have a number of boring cases whose job it is to recurse. The first of these is called rightist substitution because it comes up if the term on the right (the part using the lemma) has a right rule first. This means we have to hunt in the subterms to go find where we’re actually using the lemma.

    r/=> : cut P ([x] (=>/r ([y] F y x))) (=>/r ([y] F' y))
            <- ({x} cut P (F x) (F' x)).
    r/true : cut P ([x] true/r) true/r.
    r//\ : cut P ([x] /\/r (R x) (L x)) (/\/r R' L')
            <- cut P L L'
            <- cut P R R'.
    r/\/1 : cut P ([x] \//r1 (L x)) (\//r1 L')
             <- cut P L L'.
    r/\/2 : cut P ([x] \//r2 (R x)) (\//r2 R')
             <- cut P R R'.

Nothing here should be surprising keeping in mind that all we’re doing here is recursing. The next set of cuts is called leftist substitution. Here we are actually recursing on the term we’re trying to substitute.

    l/=>    : cut (=>/l ([y] C y) Arg H) ([x] P x) (=>/l ([x] C' x) Arg H)
               <- ({x} cut (C x) P (C' x)).
    l/false : cut (false/l H) P (false/l H).
    l//\    : cut (/\/l ([x][y] C x y) H) P (/\/l ([x][y] C' x y) H)
               <- ({x}{y} cut (C x y) P (C' x y)).
    l/\/    : cut (\//l ([y] R  y) ([y] L  y) H) ([x] P x)
                  (\//l ([y] R' y) ([y] L' y) H)
               <- ({x} cut (L x) P (L' x))
               <- ({x} cut (R x) P (R' x)).

It’s the same game but just a different target, we’re now recursing on the continuation because that’s where we somehow created a proof of A. This means that on l/=> we’re substation left term which has three parts

A continuation, hyp B to proof of C
An argument of type A
A hypothesis of type A -> B

Now we’re only interesting in how we created that proof of C, that’s the only relevant part of this substitution. The output of this case is going to have that left rule, =>/l ??? Arg H so we have where ??? is a replacement of C that we get by cutting C through P “pointwise”. This comes through on the recursive call

               <- ({x} cut (C x) P (C' x)).

For one more case, consider the left rule for \/

    l/\/    : cut (\//l R L H) P

We start by trying to cut a left rule into P so we need to produce a left rule in the output with different continuations, something like

               (\//l R' L' H)

Now what should R' and L' be? In order to produce them we’ll throw up a pi so we can get L x, a proof with the appropriate type to cut again. With that, we can recurse and get back the new continuation we want.

               <- ({x} cut (L x) P (L' x))
               <- ({x} cut (R x) P (R' x)).

There’s one last class of cuts to worry about, think about the cases we’ve covered so far

The left term is a left rule (leftist substitution)
The right term is a right rule (rightist substitution)
The terms match up and we substitute

So what happens if we have a left rule on the right and a right rule on the left, but they don’t “match up”. By this I mean that the left rule in that right term works on a different hypothesis than the one that the function it’s wrapped in provides. In this case we just have to recurse some more

    lr/=> : cut P ([x] =>/l ([y] C y x) (Arg x) H) (=>/l C' Arg' H)
             <- cut P Arg Arg'
             <- ({y} cut P (C y) (C' y)).
    lr//\ : cut P ([x] /\/l ([y][z] C y z x) H) (/\/l C' H)
             <- ({y}{z} cut P (C y z) (C' y z)).
    lr/\/ : cut P ([x] \//l ([y] R y x) ([y] L y x) H) (\//l R' L' H)
             <- ({y} cut P (L y) (L' y))
             <- ({y} cut P (R y) (R' y)).

When we have such an occurrence we just do like we did with right rules.

Okay, now that we’ve handled all of these cases we’re ready to type the damn thing.

    of/cut : {A} of P A
              -> ({h} hof h A -> of (F h) B)
              -> cut P F C
              -> of C B
              -> type.
    %mode of/cut +A +B +C -D -E.

Honestly this is less exciting than you’d think. We’ve really done all the creative work in constructing the cut type family. All that’s left to do is check that this is correct. As an example, here’s a case that exemplifies how we verify all left-right commutative cuts.

    - : of/cut _ P ([x][h] of/=>/l ([y][yh] C y yh x h) (A x h) H)
         (lr/=> CC CA) (of/=>/l C' A' H)
         <- of/cut _ P A CA A'
         <- ({y}{yh} of/cut _ P (C y yh) (CC y) (C' y yh)).

We start by trying to show that

    lr/=> : cut P ([x] =>/l ([y] C y x) (Arg x) H) (=>/l C' Arg' H)
             <- cut P Arg Arg'
             <- ({y} cut P (C y) (C' y)).

Is type correct. To do this we have a derivation P that the left term is well-typed. Notice that I’ve overloaded P here, in the rule lr/=> P was a term and now it’s a typing derivation for that term. Next we have a typing derivation for [x] =>/l ([y] C y x) (Arg x) H. This is a function which takes two arguments. x is a hypothesis, the same as in lr/=>, however now we have h which is a hof derivation that h has a type. There’s only one way to type a usage of the left rule for =>, with of/=>/l so we have that next.

Finally, our output is on the next line in two parts. First we have a derivation for cut showing how to construct the “cut out term” in this case. Next we have a new typing derivation that again uses of/=>/l. Notice that both of these depend on some terms we get from the recursive calls here.

Since we’ve gone through all the cases already and done all the thinking, I’m not going to reproduce it all here. The intuition for how cut works is really best given by the untyped version with the understand that we check that it’s correct with this theorem as we did above.

Wrap Up

To recap here’s what we did

Define sequent calculus’ syntax
Define typing rules across it
Define how an untyped version of cut works across it
Validate the correctness of cut by

Hope that was a little fun, cheers!