Value vs Monomorphism Restriction
I’m taking the undergraduate course on programming languages at CMU. For the record, I still get really excited about the novelty of taking a class (at school!) on programming languages. I’m easily made happy.
We started talking about System F and before long we touched on the value restriction. Specifically, how most people think of the value restriction incorrectly. To understand why this is, let’s first define the value restriction since it’s probably new to you if you don’t use SML.
The Value Restriction
In SML there are value level declarations just like in Haskell. We can write things like
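val x = 1
val y = 2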
and we end up with x bound to 1 and y bound to 2. Note that SML is strict so these bindings are evaluated right as we reach them. Also like in Haskell, SML has polymorphism, so we can write map
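fun map f [] = []
  | map f (x :: xs) = f x :: map f xs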
And it gets the type ('a -> 'b) -> ('a list -> 'b list). Aside from minor syntactic differences, this is pretty much identical to what we’d write in Haskell. The value restriction concerns the intersection of these two things. In SML, the following should not compile under the standard
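val x = map (fn y => y) (* for instance: x would be polymorphic, 'a list -> 'a list, but the right hand side is an application, not a value *)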
This is because SML requires that all polymorphic val bindings be values! In practice all implementations will do something besides this, but we’ll just focus on what the standard says. Now the reason for this value restriction is widely misunderstood. Most people believe that the value restriction exists to keep code like the following from breaking type safety:
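val r = ref NONE
val () = r := SOME 1
val _ = case !r of
    SOME s => s
  | NONE => ""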
This seems to illustrate a pretty big issue for SML! We’re filling in a polymorphic reference with one type and unboxing it with a different one! Clearly this would segfault without the value restriction. However, there’s a catch.
SML is based on System F (did you really think I could get through a blog post without some theory?) which is sometimes called the “polymorphic lambda calculus”. It’s the minimal language with polymorphism and functions. In this language there’s a construct for making polymorphic things: Λ.
In this language we write polymorphism explicitly by saying Λ τ. e, which has the type ∀ τ. T where T is the type of e. So for example we write the identity function as
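val id = Λ τ. fn (x : τ) => x (* id : ∀ τ. τ → τ *)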
Now SML (and vanilla Haskell) has a limited subset of the power of Λ. Specifically, all these lambdas have to appear at the start of a toplevel term, meaning that terms have to be of the form
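Λ τ1. Λ τ2. ... Λ τn. e (* all the type abstractions up front, with no further Λs inside e *)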
This is called “prenex” form and is what makes type inference for SML possible. Now, since we don’t show anyone the hidden Λs, it doesn’t make sense to show them the type applications that come with them, so SML infers and adds those for us too. What’s particularly interesting is that SML is often formalized as having this property: values start with Λ and are implicitly applied to the appropriate types where they’re used. Even more interesting: how do you suppose we should evaluate a Λ? What, for example, should this code do?
val x = Λ τ. raise[τ] Fail (* Meaning raise an exception and say
we have type τ *)
val () = print "I made it here"
val () = x[unit]
It seems clear that Λ should be evaluated just like how we evaluate λ: when we apply it. So I (and the formalization of SML) would expect this to print "I made it here" before throwing that exception. This might not surprise you, given the parallels with code like this
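val x = fn () => raise Fail
val () = print "I made it here"
val () = x () (* only now does the exception get raised *)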
However, what about when those lambdas are implicit? In the actual source language of ML our code snippet would be
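val x = raise Fail (* implicitly Λ τ. raise[τ] Fail *)
val () = print "I made it here"
val () = x (* implicitly x[unit] *)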
Uh oh, this really looks like it ought to throw an exception but it apparently doesn’t! More worryingly, what about when we have something like
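val x = raise Fail
val () = print "I made it here"
(* x is never used at any type *)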
Since x is never specialized, this doesn’t even throw an error! Yikes! Clearly this is a little confusing. It is, however, type safe. Consider our original motivation for the value restriction. With explicit type application it becomes
val r = Λ τ. ref[τ] NONE
val () = r[int] := SOME 1
val _ = case !(r[string]) of
SOME s => s
| NONE => ""
Since the body of this Λ is run every time we do something with r, we’re just creating a whole bunch of new references in this code! There’s no type safety failure since !(r[string]) returns a fresh ref cell, completely different from the one we modified on the line above! This code always runs the NONE case. In fact, if this did do the wrong thing it would just be a canary in the coal mine, a symptom of a system that evaluates under (big) lambda binders.
So the value restriction is really not at all about type safety; it’s about comprehensibility, mostly because it’s really strange for a polymorphic expression to be evaluated at each place it’s used rather than at the place it’s defined. Most documentation seems to be wrong about this; everyone here seems to agree that that’s unfortunate, but such is life.
The Monomorphism Restriction
Now let’s talk about the monomorphism restriction. This is better understood but still worth recapping. In Haskell we have type classes. They let us overload functions to behave differently on different types. Everyone’s favorite example is the type class for numbers, which lets us write
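-- the classic example; one way to write it
factorial :: (Eq a, Num a) => a -> a
factorial 0 = 1
factorial n = n * factorial (n - 1)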
And this works for all numbers, not just Int or something. Under the hood, this works by passing a record of functions like *, fromInteger, and - to make the code work. That => is really just a sort of function arrow that happens to only take particular “implicit” records as an argument.
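To make that concrete, here’s a rough sketch of the elaboration. The record type and its field names are made up, and real dictionaries differ in the details, but this is morally what’s going on:

data NumDict a = NumDict
  { eq      :: a -> a -> Bool
  , mul     :: a -> a -> a
  , sub     :: a -> a -> a
  , fromInt :: Integer -> a
  }

-- factorial :: (Eq a, Num a) => a -> a elaborates to roughly
factorialD :: NumDict a -> a -> a
factorialD d n =
  if eq d n (fromInt d 0)
    then fromInt d 1
    else mul d n (factorialD d (sub d n (fromInt d 1)))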
Now what do you suppose the most polymorphic type of this code is?
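fact = factorial 10 -- an illustrative no-argument binding; the particular definition is mine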
It could potentially work on all numbers so it gets the type
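fact :: (Eq a, Num a) => a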
However this is really like a function! This means that fact :: Integer and fact :: Int evaluate that computation twice. In fact, each time we use fact at some type we supply a new record and end up evaluating it again. This is very costly and also very surprising to most folks. After all, why should something that looks like a normal number evaluate every time we use it! The monomorphism restriction is essentially
- If you have a binding
- Whose type is (C1, C2 ...) => t
- And has no arguments to the left of the =
- Don’t generalize it
This is intended to keep us from the surprise of evaluating a seemingly fully reduced term twice.
Sound familiar? Just like with the value restriction, the whole point of the monomorphism restriction is to prevent a hidden function, either a type abstraction or a typeclass constraint, from causing us to silently and dangerously duplicate work. Neither of them is essential to type safety, but without them some really simple looking pieces of code become exponentially slower.
Wrap Up
That about covers things. It turns out that both of these restrictions are just patches to cover some surprising areas of the semantics but both are easily understood when you look at the elaborated version. I deliberately went a bit faster through the monomorphism restriction since quite a lot of ink has already been spilled on the subject and unlike the value restriction, most of it is correct :)
As one final note, the way that Haskell handles the monomorphism restriction is precisely how OCaml handles the value restriction: weak polymorphism. Both of them mark the type variables they refuse to generalize as weak type variables. Whenever we first instantiate them to something, we go back and retroactively modify the definition to pretend we had used this type all along. In this way we only evaluate things once, but can still handle a lot of simple cases where we bind something and then use it at a single type.
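For instance, in a Haskell module with the monomorphism restriction on (a sketch; the binding names are mine):

n = product [1 .. 10] -- not generalized: no arguments and a constrained type

m :: Int
m = n + 1             -- this use retroactively pins n :: Int for the whole module,
                      -- so the product is computed only once, at Int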
The more you know.