The Universe of Discourse


Tue, 26 Apr 2022

What was wrong with SML?

[ I hope this article won't be too controversial. My sense is that SML is moribund at this point and serious ML projects that still exist are carried on in OCaml. But I do observe that there was a new SML/NJ version released only six months ago, so perhaps I am mistaken. ]

I recently wrote:

It was apparent that SML had some major problems. When I encountered Haskell around 1998 it seemed that Haskell at least had a story for how these problems might be fixed.

A reader wrote to ask:

I was curious what the major problems you saw with SML were.

I actually have notes about this that I made while I was writing the first article, and was luckily able to restrain myself from writing up at the time, because it would have been a huge digression. But I think the criticism is technically interesting and may provide some interesting historical context about what things looked like in 1995.

I had three main items in mind. Every language has problems, but these three seemed to me be the deep ones where a drastically different direction was needed.

Notation for types and expressions in this article will be a mishmash of SML, Haskell, and pseudocode. I can only hope that the examples will all be simple enough that the meaning is clear.

Mutation

Reference type soundness

It seems easy to write down the rules for type inference in the presence of references. This turns out not to be the case.

The naïve idea was: for each type α there is a corresponding type ref α, the type of memory cells containing a value of type α. You can create a cell with an initialized value by using the ref function: If v has type α, then ref v has type ref α and its value is a cell that has been initialized to contain the value v. (SML actually calls the type α ref, but the meaning is the same.)

The reverse of this is the operator ! which takes a reference of type ref α and returns the referenced value of type α.

And finally, if m is a reference, then you can overwrite the value stored in its its memory cell by saying with m := v. For example:

    m = ref 4          -- m is a cell containing 4
    m := 1 + !m        -- overwrite contents with 1+4
    print (2 * !m)     -- prints 10

The type rules seem very straightforward:

    ref   :: α → ref α
    (!)   :: ref α → α
    (:=)  :: ref α × α → unit

(Translated into Haskellese, that last one would look more like (ref α, α) → () or perhaps ref α → α → () because Haskell loves currying.)

This all seems clear, but it is not sound. The prototypical example is:

     m = ref (fn x ⇒ x)

Here m is a reference to the identity function. The identity function has type α → α, so variable m has type ref(α → α).

     m := not

Now we assign the Boolean negation operator to m. not has type bool → bool, so the types can be unified: m has type ref(α → α). The type elaborator sees := here and says okay, the first argument has type ref(α → α), the second has type bool → bool, I can unify that, I get α = bool, everything is fine.

Then we do

     print ((!m) 23)

and again the type checker is happy. It says:

  • m has type ref(α → α)
  • !m has type α → α
  • 23 has type int

amd that unifies, with α = int, so the result will have type int. Then the runtime blithely invokes the boolean not function on the argument 23. OOOOOPS.

SML's reference type variables

A little before the time I got into SML, this problem had been discovered and a patch put in place to prevent it. Basically, some type variables were ordinary variables, other types (distinguished by having names that began with an underscore) were special “reference type variables”. The ref function didn't have type α → ref α, it had type _α → ref _α. The type elaboration algorithm was stricter when specializing reference types than when specializing ordinary types. It was complicated, clearly a hack, and I no longer remember the details.

At the time I got out of SML, this hack been replaced with a more complicated hack, in which the variables still had annotations to say how they related to references, but instead of a flag the annotation was now a number. I never understood it. For details, see this section of the SML '97 documentation, which begins “The interaction between polymorphism and side-effects has always been a troublesome problem for ML.”

After this article was published, Akiva Leffert reminded me that SML later settled on a third fix to this problem, the “value restriction”, which you can read about in the document linked previously. (I thought I remembered there being three different systems, but then decided that I was being silly, and I must have been remembering wrong. I wasn't.)

Haskell's primary solution to this is to burn it all to the ground. Mutation doesn't cause any type problems because there isn't any.

If you want something like ref which will break purity, you encapsulate it inside the State monad or something like it, or else you throw up your hands and do it in the IO monad, depending on what you're trying to accomplish.

Scala has a very different solution to this problem, called covariant and contravariant traits.

Impure features more generally

More generally I found it hard to program in SML because I didn't understand the evaluation model. Consider a very simple example:

     map print [1..1000]

Does it print the values in forward or reverse order? One could implement it either way. Or perhaps it prints them in random order, or concurrently. Issues of normal-order versus applicative-order evaluation become important. SML has exceptions, and I often found myself surprised by the timing of exceptions. It has mutation, and I often found that mutations didn't occur in the order I expected.

Haskell's solution to this again is monads. In general it promises nothing at all about execution order, and if you want to force something to happen in a particular sequence, you use the monadic bind operator >>=. Peyton-Jones’ paper “Tackling the Awkward Squad” discusses the monadic approach to impure features.

Combining computations that require different effects (say, state and IO and exceptions) is very badly handled by Haskell. The standard answer is to use a stacked monadic type like IO ExceptionT a (State b) with monad transformers. This requires explicit liftings of computations into the appropriate monad. It's confusing and nonorthogonal. Monad composition is non-commutative so that IO (Error a) is subtly different from Error (IO a), and you may find you have the one when you need the other, and you need to rewrite a large chunks of your program when you realize that you stacked your monads in the wrong order.

My favorite solution to this so far is algebraic effect systems. Pretnar's 2015 paper “An Introduction to Algebraic Effects and Handlers” is excellent. I see that Alexis King is working on an algebraic effect system for Haskell but I haven't tried it and don't know how well it works.

Overloading and ad-hoc polymorphism

Arithmetic types

Every language has to solve the problem of 3 + 0.5. The left argument is an integer, the right argument is something else, let's call it a float. This issue is baked into the hardware, which has two representations for numbers and two sets of machine instructions for adding them.

Dynamically-typed languages have an easy answer: at run time, discover that the left argument is an integer, convert it to a float, add the numbers as floats, and yield a float result. Languages such as C do something similar but at compile time.

Hindley-Milner type languages like ML have a deeper problem: What is the type of the addition function? Tough question.

I understand that OCaml punts on this. There are two addition functions with different names. One, +, has type int × int → int. The other, +., has type float × float → float. The expression 3 + 0.5 is ill-typed because its right-hand argument is not an integer. You should have written something like int_to_float 3 +. 0.5.

SML didn't do things this way. It was a little less inconvenient and a little less conceptually simple. The + function claimed to have type α × α → α, but this was actually a lie. At compile time it would be resolved to either int × int → int or to float × float → float. The problem expression above was still illegal. You needed to write int_to_float 3 + 0.5, but at least there was only one symbol for addition and you were still writing + with no adornments. The explicit calls to int_to_float and similar conversions still cluttered up the code, sometimes severely

The overloading of + was a special case in the compiler. Nothing like it was available to the programmer. If you wanted to create your own numeric type, say a complex number, you could not overload + to operate on it. You would have to use |+| or some other identifier. And you couldn't define anything like this:

    def dot_product (a, b) (c, d) = a*c + b*d  -- won't work

because SML wouldn't know which multiplication and addition to use; you'd have to put in an explicit type annotation and have two versions of dot_product:

    def dot_product_int   (a : int,   b) (c, d) = a*c + b*d
    def dot_product_float (a : float, b) (c, d) = a*c + b*d

Notice that the right-hand sides are identical. That's how you can tell that the language is doing something stupid.

That only gets you so far. If you might want to compute the dot product of an int vector and a float vector, you would need four functions:

    def dot_product_ii (a : int,   b) (c, d) = a*c + b*d
    def dot_product_ff (a : float, b) (c, d) = a*c + b*d
    def dot_product_if (a,         b) (c, d) = (int_to_float a) * c + (int_to_float b)*d
    def dot_product_fi (a,         b) (c, d) = a * (int_to_float c) + b * (int_to_float d)

Oh, you wanted your vectors to maybe have components of different types? I guess you need to manually define 16 functions then…

Equality types

A similar problem comes up in connection with equality. You can write 3 = 4 and 3.0 = 4.0 but not 3 = 4.0; you need to say int_to_float 3 = 4.0. At least the type of = is clearer here; it really is α × α → bool because you can compare not only numbers but also strings, booleans, lists, and so forth. Anything, really, as indicated by the free variable α.

Ha ha, I lied, you can't actually compare functions. (If you could, you could solve the halting problem.) So the α in the type of = is not completely free; it mustn't be replaced by a function type. (It is also questionable whether it should work for real numbers, and I think SML changed its mind about this at one point.)

Here, OCaml's +. trick was unworkable. You cannot have a different identifier for equality comparisons at every different type. SML's solution was a further epicycle on the type system. Some type variables were designated “equality type variables”. The type of = was not α × α → bool but ''α × ''α → bool where ''α means that the α can be instantiated only for an “equality type” that admits equality comparisons. Integers were an equality type, but functions (and, in some versions, reals) were not.

Again, this mechanism was not available to the programmer. If your type was a structure, it would be an equality type if and only if all its members were equality types. Otherwise you would have to write your own synthetic equality function and name it === or something. If !!t!! is an equality type, then so too is “list of !!t!!”, but this sort of inheritance, beautifully handled in general by Haskell's type subclass feature, was available in SML only as a couple of hardwired special cases.

Type classes

Haskell dealt with all these issues reasonably well with type classes, proposed in Wadler and Blott's 1988 paper “How to make ad-hoc polymorphism less ad hoc”. In Haskell, the addition function now has type Num a ⇒ a → a → a and the equality function has type Eq a ⇒ a → a → Bool. Anyone can define their own instance of Num and define an addition function for it. You need an explicit conversion if you want to add it to an integer:

                    some_int + myNumericValue       -- No
    toMyNumericType some_int + myNumericValue       -- Yes

but at least it can be done. And you can define a type class and overload toMyNumericType so that one identifier serves for every type you can convert to your type. Also, a special hack takes care of lexical constants with no explicit conversion:

    23 + myNumericValue   -- Yes
                          -- (actually uses overloaded   fromInteger 23   instead)

As far as I know Haskell still doesn't have a complete solution to the problem of how to make numeric types interoperate smoothly. Maybe nobody does. Most dynamic languages with ad-hoc polymorphism will treat a + b differently from b + a, and can give you spooky action at a distance problems. If type B isn't overloaded, b + a will invoke the overloaded addition for type A, but then if someone defines an overloaded addition operator for B, in a different module, the meaning of every b + a in the program changes completely because it now calls the overloaded addition for B instead of the one for A.

In Structure and Interpretation of Computer Programs, Abelson and Sussman describe an arithmetic system in which the arithmetic types form an explicit lattice. Every type comes with a “promotion” function to promote it to a type higher up in the lattice. When values of different types are added, each value is promoted, perhaps repeatedly, until the two values are the same type, which is the lattice join of the two original types. I've never used anything like this and don't know how well it works in practice, but it seems like a plausible approach, one which works the way we usually think about numbers, and understands that it can add a float to a Gaussian integer by construing both of them as complex numbers.

[ Addendum 20220430: Phil Eaton informs me that my sense of SML's moribundity is exaggerated: “Standard ML variations are in fact quite alive and the number of variations is growing at the moment”, they said, and provided a link to their summary of the state of Standard ML in 2020, which ends with a recommendation of SML's “small but definitely, surprisingly, not dead community.” Thanks, M. Eaton! ]


[Other articles in category /prog/haskell] permanent link