Runtime Tagging
In this post I’d just like to walk through some fun code, nothing particularly theory-y. The code I’d like to go through is a simple little module in ML that lets you easily construct “dynamic” types. This isn’t through the usual “really big sum of products” approach but instead is completely open and can be extended for every new defined type (at runtime).
The Basic Idea
The basic idea behind this trick hinges on how exceptions work in SML. Well, really it’s not about exceptions so much as what exceptions work with. In ML we can declare new exceptions like this
exception Foo of tyarg
and this gives us a new exception constructor Foo
and we can raise and handle it like you would expect
(raise (Foo 1)) handle Foo x => x
But what’s particularly interesting is that Foo
actually has a type. Really it’s just a constructor for a special type exn
. This means we can do things like pass around exception constructors, apply them, etc, etc.
exn
is what we might call an extensible data type, we can extend it arbitrarily. We could imagine allowing users to define their own such types but in SML we’ve just go the one. The reason we even have this one is because it’s a great choice if you can only allow one type to be raised and handled.
What we’re going to do is use the fact that we can generate new extensions to exn
at run time to create an exn
based structure providing a way to implement “tags”. Once we have these tags we’ll be able to implement a pair of functions
val tag : 'a tag -> 'a -> dynamic
val untag : 'a tag -> dynamic -> 'a option
So tags let us “forget” the type of some expression and treat it as some dynamic blob to be recovered at some time in the future. Concretely, we’d like to implement this signature
signature TAG =
sig
type dynamic
type 'a tag
val new : unit -> 'a tag
val tag : 'a tag -> 'a -> dynamic
val untag : 'a tag -> dynamic -> 'a option
end
The Implementation
So let’s start implementing the thing. First we need to decide what the type dynamic
should be. I propose that it should be exn
. The reason being that we can always extend exn
in various ways so if we implement things with dynamic = exn
we’ll have the ability to make dynamic
“grow a new branch” to accommodate whatever we’re working with.
structure Tag :> TAG =
struct
type dynamic = exn
end
Ok, so what should tag
be? Well it’s going to be type indexed obviously so that we can even talk about the signatures of (un)tag
, but more importantly its purpose should be to tell us how to package something up into an exn
so we can get it back out. The downside of this whole extensible data type thing is that if we forget about the constructor we used to make an exn
it’s just lost forever! A tag
will make sure that once we make a constructor to use with dynamic
we won’t find ourselves with a dynamic
and no way to inspect it.
The best way I can think of for doing this is to just back the (un)tag
operations straight into the implementation of the type.
structure Tag :> TAG =
struct
type dynamic = exn
type 'a tag = {into : 'a -> exn, out : exn -> 'a option}
end
Now this makes it look like tags could perform arbitrary operations in the process of tagging and untagging, but really we’re going to implement it so it’s all very simple and efficient.
In particular, we’re now in a position to define our three core operators
structure Tag :> TAG =
struct
type dynamic = exn
type 'a tag = {into : 'a -> exn, out : exn -> 'a option}
fun new () : 'a tag =
let
exception Fresh of 'a
in
{ into = Fresh
, out = fn e =>
case e of Fresh a => SOME a | _ => NONE
}
end
fun tag {into, out} = into
fun untag {into, out} = out
end
Now tag and untag are pretty simple because we basically implemented them up in new
so let’s look carefully at that. We start by first minting a new constructor for exn
. We know that this will not clash with any other exception in existence, no one else can raise it or handle it unless we explicitly give them this constructor. Now while we have access to it, we bundle the constructor into the tag
record we’re making.
into
is quite easy to implement because it’s just constructor application. out
is also straightforward, all we do is pattern match to see if the given exn
is correct. All we do in the actual matching bit is see if we’ve been given something made with our Fresh
constructor and return the included a
if we did. The handling everything else is important, otherwise this would explode horribly every time we failed to untag something.
And there’s a nice way of implementing the same sort of run time typing you get in dynamic languages in SML. One nice advantage of this over the usual
datatype dynamic = INT of int | STRING of string | ...
approach is we can always extend our dynamic
with user defined types. So we can do something like
datatype foo = Foo of int
val fooTag = Tag.new () : foo tag
val d = Tag.tag fooTag (Foo 2)
val SOME (Foo 2) = Tag.untag fooTag d
Wrap Up
There you go, this is just a very short post on a very short piece of code that let’s us do something fun. Some nice things you can do now
- Use the inherently recursive nature of
dynamic
to write an infinite loop without direct recursion - Do the same thing, but without using
exn
and using the generative effect of allocating a reference instead - etc