Value vs Monomorphism Restriction
I’m taking the undergraduate course on programming languages at CMU. For the record, I still get really excited about the novelty of taking a class (at school!) on programming languages. I’m easily made happy.
We started talking about System F and before long we touched on the value restriction. Specifically, how most people think of the value restriction incorrectly. To understand why this is, let’s first define the value restriction since it’s probably new to you if you don’t use SML.
The Value Restriction
In SML there are value level declarations just like in Haskell. We can write things like
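val x = 1
val y = 2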
and we end up with x bound to 1 and y bound to 2. Note that SML is strict so these bindings are evaluated right as we reach them. Also like in Haskell, SML has polymorphism, so we can write map
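fun map f [] = []
  | map f (x :: xs) = f x :: map f xs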
And it gets the type ('a -> 'b) -> ('a list -> 'b list). Aside from minor syntactic differences, this is pretty much identical to what we’d write in Haskell. The value restriction concerns the intersection of these two things. In SML, the following should not compile under the standard
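val x = map (fn y => y) (* for instance: x would be polymorphic, 'a list -> 'a list, but the right hand side is an application, not a value *)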
This is because SML requires that all polymorphic val bindings be values! In practice all implementations will do something besides this, but we’ll just focus on what the standard says. Now the reason for this value restriction is widely misunderstood. Most people believe that the value restriction exists to keep code like the following from breaking type safety:
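val r = ref NONE
val () = r := SOME 1
val _ = case !r of
    SOME s => s
  | NONE => ""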
This seems to illustrate a pretty big issue for SML! We’re filling in a polymorphic reference with one type and unboxing it with a different one! Clearly this would segfault without the value restriction. However, there’s a catch.
SML is based on System F (did you really think I could get through a blog post without some theory?) which is sometimes called the “polymorphic lambda calculus”. It’s the minimal language with polymorphism and functions. In this language there’s a construct for making polymorphic things: Λ.
In this language we write polymorphism explicitly by saying Λ τ. e, which has the type ∀ τ. T where T is the type of e. So for example we write the identity function as
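val id = Λ τ. fn (x : τ) => x (* id : ∀ τ. τ → τ *)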
Now SML (and vanilla Haskell) has a limited subset of the power of Λ. Specifically, all these lambdas have to appear at the start of a toplevel term, meaning that terms have to be of the form
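Λ τ1. Λ τ2. ... Λ τn. e (* all the type abstractions up front, with no further Λs inside e *)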
This is called “prenex” form and is what makes type inference for SML possible. Now, since we don’t show anyone the hidden Λs, it doesn’t make sense to show them the type applications that come with them, so SML infers and adds those for us too. What’s particularly interesting is that SML is often formalized as having this property: values start with Λ and are implicitly applied to the appropriate types where they’re used. Even more interesting: how do you suppose we should evaluate a Λ? What, for example, should this code do?
val x = Λ τ. raise[τ] Fail (* Meaning raise an exception and say
we have type τ *)
val () = print "I made it here"
val () = x[unit]
It seems clear that Λ should be evaluated just like how we evaluate λ: when we apply it. So I (and the formalization of SML) would expect this to print "I made it here" before throwing that exception. This might not surprise you, given the parallels with code like this
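val x = fn () => raise Fail
val () = print "I made it here"
val () = x () (* only now does the exception get raised *)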
However, what about when those lambdas are implicit? In the actual source language of ML our code snippet would be
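val x = raise Fail (* implicitly Λ τ. raise[τ] Fail *)
val () = print "I made it here"
val () = x (* implicitly x[unit] *)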
Uh oh, this really looks like it ought to throw an exception but it apparently doesn’t! More worryingly, what about when we have something like
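val x = raise Fail
val () = print "I made it here"
(* x is never used at any type *)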
Since x is never specialized, this doesn’t even throw an error! Yikes! Clearly this is a little confusing. It is, however, type safe. Consider our original motivation for the value restriction. With explicit type application it becomes
val r = Λ τ. ref[τ] NONE
val () = r[int] := SOME 1
val _ = case !(r[string]) of
SOME s => s
| NONE => ""
Since the body of this Λ is run every time we do something with r, we’re just creating a whole bunch of new references in this code! There’s no type safety failure since !(r[string]) returns a fresh ref cell, completely different from the one we modified on the line above! This code always runs the NONE case. In fact, if this did do the wrong thing it would just be a canary in the coal mine, a symptom of a system that evaluates under (big) lambda binders.
So the value restriction is really not at all about type safety; it’s about comprehensibility, mostly because it’s really strange for a polymorphic expression to be evaluated at each place it’s used rather than at the place it’s defined. Most documentation seems to be wrong about this; everyone here seems to agree that that’s unfortunate, but such is life.
The Monomorphism Restriction
Now let’s talk about the monomorphism restriction. This is better understood but still worth recapping. In Haskell we have type classes. They let us overload functions to behave differently on different types. Everyone’s favorite example is the type class for numbers, which lets us write
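-- the classic example; one way to write it
factorial :: (Eq a, Num a) => a -> a
factorial 0 = 1
factorial n = n * factorial (n - 1)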
And this works for all numbers, not just Int or something. Under the hood, this works by passing a record of functions like *, fromInteger, and - to make the code work. That => is really just a sort of function arrow that happens to only take particular “implicit” records as an argument.
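To make that concrete, here’s a rough sketch of the elaboration. The record type and its field names are made up, and real dictionaries differ in the details, but this is morally what’s going on:

data NumDict a = NumDict
  { eq      :: a -> a -> Bool
  , mul     :: a -> a -> a
  , sub     :: a -> a -> a
  , fromInt :: Integer -> a
  }

-- factorial :: (Eq a, Num a) => a -> a elaborates to roughly
factorialD :: NumDict a -> a -> a
factorialD d n =
  if eq d n (fromInt d 0)
    then fromInt d 1
    else mul d n (factorialD d (sub d n (fromInt d 1)))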
Now what do you suppose the most polymorphic type of this code is?
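fact = factorial 10 -- an illustrative no-argument binding; the particular definition is mine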
It could potentially work on all numbers so it gets the type
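fact :: (Eq a, Num a) => a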
However this is really like a function! This means that fact :: Integer and fact :: Int evaluate that computation twice. In fact, each time we use fact at some type we supply a new record and end up evaluating it again. This is very costly and also very surprising to most folks. After all, why should something that looks like a normal number evaluate every time we use it! The monomorphism restriction is essentially
- If you have a binding
- Whose type is (C1, C2 ...) => t
- And has no arguments to the left of the =
- Don’t generalize it
This is intended to keep us from the surprise of evaluating a seemingly fully reduced term twice.
Sound familiar? Just like with the value restriction, the whole point of the monomorphism restriction is to prevent a hidden function, either a type abstraction or a typeclass constraint, from causing us to silently and dangerously duplicate work. Neither of them is essential to type safety, but without them some really simple looking pieces of code become exponentially slower.
Wrap Up
That about covers things. It turns out that both of these restrictions are just patches to cover some surprising areas of the semantics but both are easily understood when you look at the elaborated version. I deliberately went a bit faster through the monomorphism restriction since quite a lot of ink has already been spilled on the subject and unlike the value restriction, most of it is correct :)
As one final note, the way that Haskell handles the monomorphism restriction is precisely how OCaml handles the value restriction: weak polymorphism. Both of them mark the type variables they refuse to generalize as weak type variables. Whenever we first instantiate them to something, we go back and retroactively modify the definition to pretend we had used this type all along. In this way we only evaluate things once, but can still handle a lot of simple cases where we bind something and then use it at a single type.
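For instance, in a Haskell module with the monomorphism restriction on (a sketch; the binding names are mine):

n = product [1 .. 10] -- not generalized: no arguments and a constrained type

m :: Int
m = n + 1             -- this use retroactively pins n :: Int for the whole module,
                      -- so the product is computed only once, at Int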
The more you know.