Tue, 26 Apr 2022
[ I hope this article won't be too controversial. My sense is that SML is moribund at this point and serious ML projects that still exist are carried on in OCaml. But I do observe that there was a new SML/NJ version released only six months ago, so perhaps I am mistaken. ]
A reader wrote to ask:
I actually have notes about this that I made while I was writing the first article, and was luckily able to restrain myself from writing up at the time, because it would have been a huge digression. But I think the criticism is technically interesting and may provide some interesting historical context about what things looked like in 1995.
I had three main items in mind. Every language has problems, but these three seemed to me be the deep ones where a drastically different direction was needed.
Notation for types and expressions in this article will be a mishmash of SML, Haskell, and pseudocode. I can only hope that the examples will all be simple enough that the meaning is clear.
Reference type soundness
It seems easy to write down the rules for type inference in the presence of references. This turns out not to be the case.
The naïve idea was: for each type
The reverse of this is the operator
The type rules seem very straightforward:
(Translated into Haskellese, that last one would look more like
This all seems clear, but it is not sound. The prototypical example is:
Now we assign the Boolean negation operator to
Then we do
and again the type checker is happy. It says:
amd that unifies, with
SML's reference type variables
A little before the time I got into SML, this problem had been discovered and a patch
put in place to prevent it. Basically, some type variables were
ordinary variables, other types (distinguished by having names that began
with an underscore) were special “reference type variables”. The
At the time I got out of SML, this hack been replaced with a more complicated hack, in which the variables still had annotations to say how they related to references, but instead of a flag the annotation was now a number. I never understood it. For details, see this section of the SML '97 documentation, which begins “The interaction between polymorphism and side-effects has always been a troublesome problem for ML.”
After this article was published, Akiva Leffert reminded me that SML later settled on a third fix to this problem, the “value restriction”, which you can read about in the document linked previously. (I thought I remembered there being three different systems, but then decided that I was being silly, and I must have been remembering wrong. I wasn't.)
Haskell's primary solution to this is to burn it all to the ground. Mutation doesn't cause any type problems because there isn't any.
If you want something like
Scala has a very different solution to this problem, called covariant and contravariant traits.
Impure features more generally
More generally I found it hard to program in SML because I didn't understand the evaluation model. Consider a very simple example:
Does it print the values in forward or reverse order? One could implement it either way. Or perhaps it prints them in random order, or concurrently. Issues of normal-order versus applicative-order evaluation become important. SML has exceptions, and I often found myself surprised by the timing of exceptions. It has mutation, and I often found that mutations didn't occur in the order I expected.
Haskell's solution to this again is monads. In general it promises
nothing at all about execution order, and if you want to force
something to happen in a particular sequence, you use the monadic bind
Combining computations that require different effects (say, state
and IO and exceptions) is very badly handled by Haskell. The
standard answer is to use a stacked monadic type like
My favorite solution to this so far is algebraic effect systems. Pretnar's 2015 paper “An Introduction to Algebraic Effects and Handlers” is excellent. I see that Alexis King is working on an algebraic effect system for Haskell but I haven't tried it and don't know how well it works.
Overloading and ad-hoc polymorphism
Every language has to solve the problem of
Dynamically-typed languages have an easy answer: at run time, discover that the left argument is an integer, convert it to a float, add the numbers as floats, and yield a float result. Languages such as C do something similar but at compile time.
Hindley-Milner type languages like ML have a deeper problem: What is the type of the addition function? Tough question.
I understand that OCaml punts on this. There are two addition
functions with different names. One,
SML didn't do things this way. It was a little less inconvenient and a
little less conceptually simple. The
The overloading of
because SML wouldn't know which multiplication and addition to use;
you'd have to put in an explicit type annotation and have two versions
Notice that the right-hand sides are identical. That's how you can tell that the language is doing something stupid.
That only gets you so far. If you might want to compute the dot product of an int vector and a float vector, you would need four functions:
Oh, you wanted your vectors to maybe have components of different types? I guess you need to manually define 16 functions then…
A similar problem comes up in connection with equality. You can write
Ha ha, I lied, you can't actually compare functions. (If you could, you could
solve the halting problem.) So the α in the type of
mechanism was not available to the programmer. If your type was a
structure, it would be an equality type if and only if all its members
were equality types. Otherwise you would have to write your own
synthetic equality function and name it
Haskell dealt with all these issues reasonably well with type classes,
proposed in Wadler and Blott's 1988 paper
“How to make ad-hoc polymorphism less ad hoc”.
In Haskell, the addition function now has type
but at least it can be done. And you can define a type class and overload
As far as I know Haskell still doesn't have a complete solution to the
problem of how to make numeric types interoperate smoothly. Maybe
nobody does. Most dynamic languages with ad-hoc polymorphism will
In Structure and Interpretation of Computer Programs, Abelson and Sussman describe an arithmetic system in which the arithmetic types form an explicit lattice. Every type comes with a “promotion” function to promote it to a type higher up in the lattice. When values of different types are added, each value is promoted, perhaps repeatedly, until the two values are the same type, which is the lattice join of the two original types. I've never used anything like this and don't know how well it works in practice, but it seems like a plausible approach, one which works the way we usually think about numbers, and understands that it can add a float to a Gaussian integer by construing both of them as complex numbers.
[ Addendum 20220430: Phil Eaton informs me that my sense of SML's moribundity is exaggerated: “Standard ML variations are in fact quite alive and the number of variations is growing at the moment”, they said, and provided a link to their summary of the state of Standard ML in 2020, which ends with a recommendation of SML's “small but definitely, surprisingly, not dead community.” Thanks, M. Eaton! ]