Runtime Tagging
In this post I’d just like to walk through some fun code, nothing particularly theory-y. The code I’d like to go through is a simple little module in ML that lets you easily construct “dynamic” types. This isn’t through the usual “really big sum of products” approach but instead is completely open and can be extended for every new defined type (at runtime).
The Basic Idea
The basic idea behind this trick hinges on how exceptions work in SML. Well, really it’s not about exceptions so much as what exceptions work with. In ML we can declare new exceptions like this
    exception Foo of tyarg
and this gives us a new exception constructor Foo and we can raise and handle it like you would expect
    (raise (Foo 1)) handle Foo x => x
But what’s particularly interesting is that Foo actually has a type. Really it’s just a constructor for a special type exn. This means we can do things like pass around exception constructors, apply them, etc, etc.
exn is what we might call an extensible data type, we can extend it arbitrarily. We could imagine allowing users to define their own such types but in SML we’ve just go the one. The reason we even have this one is because it’s a great choice if you can only allow one type to be raised and handled.
What we’re going to do is use the fact that we can generate new extensions to exn at run time to create an exn based structure providing a way to implement “tags”. Once we have these tags we’ll be able to implement a pair of functions
    val tag   : 'a tag -> 'a -> dynamic
    val untag : 'a tag -> dynamic -> 'a option
So tags let us “forget” the type of some expression and treat it as some dynamic blob to be recovered at some time in the future. Concretely, we’d like to implement this signature
    signature TAG =
    sig
      type dynamic
      type 'a tag
      val new    : unit -> 'a tag
      val tag    : 'a tag -> 'a -> dynamic
      val untag  : 'a tag -> dynamic -> 'a option
    end
The Implementation
So let’s start implementing the thing. First we need to decide what the type dynamic should be. I propose that it should be exn. The reason being that we can always extend exn in various ways so if we implement things with dynamic = exn we’ll have the ability to make dynamic “grow a new branch” to accommodate whatever we’re working with.
    structure Tag :> TAG =
    struct
      type dynamic = exn
    end
Ok, so what should tag be? Well it’s going to be type indexed obviously so that we can even talk about the signatures of (un)tag, but more importantly its purpose should be to tell us how to package something up into an exn so we can get it back out. The downside of this whole extensible data type thing is that if we forget about the constructor we used to make an exn it’s just lost forever! A tag will make sure that once we make a constructor to use with dynamic we won’t find ourselves with a dynamic and no way to inspect it.
The best way I can think of for doing this is to just back the (un)tag operations straight into the implementation of the type.
    structure Tag :> TAG =
    struct
      type dynamic = exn
      type 'a tag  = {into : 'a -> exn, out : exn -> 'a option}
    end
Now this makes it look like tags could perform arbitrary operations in the process of tagging and untagging, but really we’re going to implement it so it’s all very simple and efficient.
In particular, we’re now in a position to define our three core operators
    structure Tag :> TAG =
    struct
      type dynamic = exn
      type 'a tag  = {into : 'a -> exn, out : exn -> 'a option}
      fun new () : 'a tag =
        let
          exception Fresh of 'a
        in
          { into = Fresh
          , out = fn e =>
              case e of Fresh a => SOME a | _ => NONE
          }
        end
      fun tag {into, out} = into
      fun untag {into, out} = out
    end
Now tag and untag are pretty simple because we basically implemented them up in new so let’s look carefully at that. We start by first minting a new constructor for exn. We know that this will not clash with any other exception in existence, no one else can raise it or handle it unless we explicitly give them this constructor. Now while we have access to it, we bundle the constructor into the tag record we’re making.
into is quite easy to implement because it’s just constructor application. out is also straightforward, all we do is pattern match to see if the given exn is correct. All we do in the actual matching bit is see if we’ve been given something made with our Fresh constructor and return the included a if we did. The handling everything else is important, otherwise this would explode horribly every time we failed to untag something.
And there’s a nice way of implementing the same sort of run time typing you get in dynamic languages in SML. One nice advantage of this over the usual
    datatype dynamic = INT of int | STRING of string | ...
approach is we can always extend our dynamic with user defined types. So we can do something like
    datatype foo = Foo of int
    val fooTag = Tag.new () : foo tag
    val d = Tag.tag fooTag (Foo 2)
    val SOME (Foo 2) = Tag.untag fooTag d
Wrap Up
There you go, this is just a very short post on a very short piece of code that let’s us do something fun. Some nice things you can do now
- Use the inherently recursive nature of 
dynamicto write an infinite loop without direct recursion - Do the same thing, but without using 
exnand using the generative effect of allocating a reference instead - etc