The Universe of Discourse
           
Thu, 18 Sep 2008

data Mu f = In (f (Mu f))
Last week I wrote about one of two mindboggling pieces of code that appears in the paper Functional Programming with Overloading and Higher-Order Polymorphism, by Mark P. Jones. Today I'll write about the other one. It looks like this:

        data Mu f = In (f (Mu f))                       -- (???)
I bet a bunch of people reading this on Planet Haskell are nodding and saying "Oh, that!"

When I first saw this I couldn't figure out what it was saying at all. It was totally opaque. I still have trouble recognizing in Haskell what tokens are types, what tokens are type constructors, and what tokens are value constructors. Code like (???) is unusually confusing in this regard.

Normally, one sees something like this instead:

        data Maybe f = Nothing | Just f
Here f is a type variable; that is, a variable that ranges over types. Maybe is a type constructor, which is like a function that you can apply to a type to get another type. The most familiar example of a type constructor is List:

        data List e = Nil | Cons e (List e)
Given any type f, you can apply the type constructor List to f to get a new type List f. For example, you can apply List to Int to get the type List Int. (The Haskell built-in list type constructor goes by the funny name of [], but works the same way. The type [Int] is a synonym for ([] Int).)

Actually, type names are type constructors also; they're argumentless type constructors. So we have type constructors like Int, which take no arguments, and type constructors like List, which take one argument. Haskell also has type constructors that take more than one argument. For example, Haskell has a standard type constructor called Either for making union types:

        data Either a b = Left a | Right b;
Then the type Either Int String contains values like Left 37 and Right "Cotton Mather".

To keep track of how many arguments a type constructor has, one can consider the, ahem, type, of the type constructor. But to avoid the obvious looming terminological confusion, the experts use the word "kind" to refer to the type of a type constructor. The kind of List is * → *, which means that it takes a type and gives you back a type. The kind of Either is * → * → *, which means that it takes two types and gives you back a type. Well, actually, it is curried, just like regular functions are, so that Either Int is itself a type constructor of kind * → * which takes a type a and returns a type which could be either an Int or an a. The nullary type constructor Int has kind *.

Continuing the "Maybe" example above, f is a type, or a constructor of kind *, if you prefer. Just is a value constructor, of type fMaybe f. It takes a value of type f and produces a value of type Maybe f.

Now here is a crucial point. In declarations of type constructors, such as these:

        data Either a b = ...
        data List e = ...
        data Maybe f = ...
the type variables a, b, e, and f actually range over type constructors, not over types. Haskell can infer the kinds of the type constructors Either, List, and Maybe, and also the kinds of the type variables, from the definitions on the right of the = signs. In this case, it concludes that all four variables must have kind *, and so really do represent types, and not higher-order type constructors. So you can't ask for Either Int List because List is known to have kind * → *, and Haskell needs a type constructor of kind * to serve as an argument to Either.

But with a different definition, Haskell might infer that a type variable has a higher-order kind. Here is a contrived example, which might be good for something, perhaps. I'm not sure:

        data TyCon f = ValCon (f Int)
This defines a type constructor TyCon with kind (* → *) → *, which can be applied to any type constuctor f that has kind * → *, to yield a type. What new type? The new type TyCon f is isomorphic to the type f of Int. For example, TyCon List is basically the same as List Int. The value Just 37 has type Maybe Int, and the value ValCon (Just 37) has type TyCon Maybe.

Similarly, the value [1, 2, 3] has type [Int], which, you remember, is a synonym for [] Int. And the value ValCon [1, 2, 3] has type TyCon [].

Now that the jargon is laid out, let's look at (???) again:

        data Mu f = In (f (Mu f))                       -- (???)
When I was first trying to get my head around this, I had trouble seeing what the values were going to be. It looks at first like it has no bottom. The token f here, like in the TyCon example, is a variable that ranges over type constructors with kind * → *, so could be List or Maybe or [], something that takes a type and yields a new type. Mu itself has kind (* → *) → *, taking something like f and yielding a type. But what's an actual value? You need to apply the value constructor In to a value of type f (Mu f), and it's not immediately clear where to get such a thing.

I asked on #haskell, and Cale Gibbard explained it very clearly. To do anything useful you first have to fix f. Let's take f = Maybe. In that particular case, (???) becomes:

        data Mu Maybe = In (Maybe (Mu Maybe))
So the In value constructor will take a value of type Maybe (Mu Maybe) and return a value of type Mu Maybe. Where do we get a value of type Maybe (Mu Maybe)? Oh, no problem: the value Nothing is polymorphic, and has type Maybe a for all a, so in particular it has type Maybe (Mu Maybe). Whatever Maybe (Mu Maybe) is, it is a Maybe-type, so it has a Nothing value. So we do have something to get started with.

Since Nothing is a Maybe (Mu Maybe) value, we can apply the In constructor to it, yielding the value In Nothing, which has type Mu Maybe. Then applying Just, of type a → Maybe a, to In Nothing, of type Mu Maybe, produces Just (In Nothing), of type Maybe (Mu Maybe) again. We can repeat the process as much as we want and produce as many values of type Mu Maybe as we want; they look like these:

        In Nothing
        In (Just (In Nothing))
        In (Just (In (Just (In Nothing))))
        In (Just (In (Just (In (Just (In Nothing))))))
        ...
And that's it, that's the type Mu Maybe, the set of those values. It will look a little simpler if we omit the In markers, which don't really add much value. We can just agree to omit them, or we can get rid of them in the code by defining some semantic sugar:

        nothing = In Nothing
        just = In . Just
Then the values of Mu Maybe look like this:
        nothing
        just nothing
        just (just nothing)
        just (just (just nothing))
        ...
It becomes evident that what the Mu operator does is to close the type under repeated application. This is analogous to the way the fixpoint combinator works on values. Consider the usual definition of the fixpoint combinator:

        Y f = f (Y f)
Here f is a function of type aa. Y f is a fixed point of f. That is, it is a value x of type a such that f x = x. (Put x = Y f in the definition to see this.)

The fixed point of a function f can be computed by considering the limit of the following sequence of values:


f(⊥)
f(f(⊥))
f(f(f(⊥)))
...

This actually finds the least fixed point of f, for a certain definition of "least". For many functions f, like xx + 1, this finds the uninteresting fixed point ⊥, but for many f, like x → λ n. if n = 0 then 1 else n * x(n - 1), it's something better.

Mu is analogous to Y. Instead of operating on a function f from values to values, and producing a single fixed-point value, it operates on a type constructor f from types to types, and produces a fixed-point type. The resulting type T is the least fixed point of the type constructor f, the smallest set of values such that f T = T.

Consider the example of f = Maybe again. We want to find a type T such that T = Maybe T. Consider the following sequence:

{ ⊥ }
Maybe { ⊥ }
Maybe(Maybe { ⊥ })
Maybe(Maybe(Maybe { ⊥ }))
...

The first item is the set that contains nothing but the bottom value, which we might call t0. But t0 is not a fixed point of Maybe, because Maybe { ⊥ } also contains Nothing. So Maybe { ⊥ } is a different type from t0, which we can call t1 = { Nothing, ⊥ }.

The type t1 is not a fixed point of Maybe either, because Maybe t1 evidently contains both Nothing and Just Nothing. Repeating this process, we find that the limit of the sequence is the type Mu Maybe = { ⊥, Nothing, Just Nothing, Just (Just Nothing), Just (Just (Just Nothing)), ... }. This type is fixed under Maybe.

It might be worth pointing out that this is not the only such fixed point, but is is the least fixed point. One can easily find larger types that are fixed under Maybe. For example, postulate a special value Q which has the property that Q = Just Q. Then Mu Maybe ∪ { Q } is also a fixed point of Maybe. But it's easy to see (and to show, by induction) that any such fixed point must be a superset of Mu Maybe. Further consideration of this point might take me off to co-induction, paraconsistent logic, Peter Aczel's nonstandard set theory, and I'd never get back again. So let's leave this for now.

So that's what Mu really is: a fixed-point operator for type constructors. And having realized this, one can go back and look at the definition and see that oh, that's precisely what the definition says, how obvious:

              Y f =     f  (Y f)             -- ordinary fixed-point operator
        data Mu f = In (f (Mu f))            -- (???)
Given f, a function from values to values, Y(f) calculates a value x such that x = f(x). Given f, a function from types to types, Mu(f) calculates a type T such that f(T) = T. That's why the definitions are identical. (Except for that annoying In constructor, which really oughtn't to be there.)

You can use this technique to construct various recursive datatypes. For example, Mu Maybe turns out to be equivalent to the following definition of the natural numbers:

        data Number = Zero | Succ Number;
Notice the structural similarity with the definition of Maybe:

        data Maybe a = Nothing | Just a;
One can similarly define lists:

        data Mu f = In (f (Mu f)) 
        data ListX a b = Nil | Cons a b deriving Show
        type List a = Mu (ListX a)

        -- syntactic sugar
        nil :: List a
        nil = In Nil
        cons :: a → List a → List a
        cons x y = In (Cons x y)

        -- for example
        ls = cons 3 (cons 4 (cons 5 nil))          -- :: List Integer
        lt = (cons 'p' (cons 'y' (cons 'x' nil)))  -- :: List Char
Or you could similarly do trees, or whatever. Why one might want to do this is a totally separate article, which I am not going to write today.

Here's the point of today's article: I find it amazing that Haskell's type system is powerful enough to allow one to defined a fixed-point operator for functions over types.

We've come a long way since FORTRAN, that's for sure.

A couple of final, tangential notes: Google search for "Mu f = In (f (Mu f))" turns up relatively few hits, but each hit is extremely interesting. If you're trying to preload your laptop with good stuff to read on a plane ride, downloading these papers might be a good move.

The Peter Aczel thing seems to be less well-known that it should be. It is a version of set theory that allows coinductive definitions of sets instead of inductive definitions. In particular, it allows one to have a set S = { S }, which standard set theory forbids. If you are interested in co-induction you should take a look at this. You can find a clear explanation of it in Barwise and Etchemendy's book The Liar (which I have read) and possibly also in Aczel's book Non Well-Founded Sets (which I haven't read).


[Other articles in category /prog] permanent link