Runtime Tagging

Posted on October 12, 2015

Tags: sml

In this post I’d just like to walk through some fun code, nothing particularly theory-y. The code I’d like to go through is a simple little module in ML that lets you easily construct “dynamic” types. This isn’t through the usual “really big sum of products” approach but instead is completely open and can be extended for every new defined type (at runtime).

The Basic Idea

The basic idea behind this trick hinges on how exceptions work in SML. Well, really it’s not about exceptions so much as what exceptions work with. In ML we can declare new exceptions like this

    exception Foo of tyarg

and this gives us a new exception constructor Foo and we can raise and handle it like you would expect

    (raise (Foo 1)) handle Foo x => x

But what’s particularly interesting is that Foo actually has a type. Really it’s just a constructor for a special type exn. This means we can do things like pass around exception constructors, apply them, etc, etc.

exn is what we might call an extensible data type, we can extend it arbitrarily. We could imagine allowing users to define their own such types but in SML we’ve just go the one. The reason we even have this one is because it’s a great choice if you can only allow one type to be raised and handled.

What we’re going to do is use the fact that we can generate new extensions to exn at run time to create an exn based structure providing a way to implement “tags”. Once we have these tags we’ll be able to implement a pair of functions

    val tag   : 'a tag -> 'a -> dynamic
    val untag : 'a tag -> dynamic -> 'a option

So tags let us “forget” the type of some expression and treat it as some dynamic blob to be recovered at some time in the future. Concretely, we’d like to implement this signature

    signature TAG =
    sig
      type dynamic
      type 'a tag

      val new    : unit -> 'a tag
      val tag    : 'a tag -> 'a -> dynamic
      val untag  : 'a tag -> dynamic -> 'a option
    end

The Implementation

So let’s start implementing the thing. First we need to decide what the type dynamic should be. I propose that it should be exn. The reason being that we can always extend exn in various ways so if we implement things with dynamic = exn we’ll have the ability to make dynamic “grow a new branch” to accommodate whatever we’re working with.

    structure Tag :> TAG =
    struct
      type dynamic = exn
    end

Ok, so what should tag be? Well it’s going to be type indexed obviously so that we can even talk about the signatures of (un)tag, but more importantly its purpose should be to tell us how to package something up into an exn so we can get it back out. The downside of this whole extensible data type thing is that if we forget about the constructor we used to make an exn it’s just lost forever! A tag will make sure that once we make a constructor to use with dynamic we won’t find ourselves with a dynamic and no way to inspect it.

The best way I can think of for doing this is to just back the (un)tag operations straight into the implementation of the type.

    structure Tag :> TAG =
    struct
      type dynamic = exn
      type 'a tag  = {into : 'a -> exn, out : exn -> 'a option}
    end

Now this makes it look like tags could perform arbitrary operations in the process of tagging and untagging, but really we’re going to implement it so it’s all very simple and efficient.

In particular, we’re now in a position to define our three core operators

    structure Tag :> TAG =
    struct
      type dynamic = exn
      type 'a tag  = {into : 'a -> exn, out : exn -> 'a option}

      fun new () : 'a tag =
        let
          exception Fresh of 'a
        in
          { into = Fresh
          , out = fn e =>
              case e of Fresh a => SOME a | _ => NONE
          }
        end

      fun tag {into, out} = into
      fun untag {into, out} = out
    end

Now tag and untag are pretty simple because we basically implemented them up in new so let’s look carefully at that. We start by first minting a new constructor for exn. We know that this will not clash with any other exception in existence, no one else can raise it or handle it unless we explicitly give them this constructor. Now while we have access to it, we bundle the constructor into the tag record we’re making.

into is quite easy to implement because it’s just constructor application. out is also straightforward, all we do is pattern match to see if the given exn is correct. All we do in the actual matching bit is see if we’ve been given something made with our Fresh constructor and return the included a if we did. The handling everything else is important, otherwise this would explode horribly every time we failed to untag something.

And there’s a nice way of implementing the same sort of run time typing you get in dynamic languages in SML. One nice advantage of this over the usual

    datatype dynamic = INT of int | STRING of string | ...

approach is we can always extend our dynamic with user defined types. So we can do something like

    datatype foo = Foo of int
    val fooTag = Tag.new () : foo tag
    val d = Tag.tag fooTag (Foo 2)
    val SOME (Foo 2) = Tag.untag fooTag d

Wrap Up

There you go, this is just a very short post on a very short piece of code that let’s us do something fun. Some nice things you can do now

Use the inherently recursive nature of dynamic to write an infinite loop without direct recursion
Do the same thing, but without using exn and using the generative effect of allocating a reference instead
etc