The Universe of Discourse


Mon, 22 Oct 2018

Applicative WTF?

While I was writing up last week's long article about Traversable, I wrote this stuff about Applicative also. It's part of the story but I wasn't sure how to work it into the other narrative, so I took it out and left a remark that “maybe I'll publish a writeup of that later”. This is a disorganized collection of loosely-related paragraphs on that topic.

It concerns my attempts to create various class instance definitions for the following type:

    data Tree a = Con a | Add (Tree a) (Tree a)
        deriving (Eq, Show)

which notionally represents a type of very simple expression tree over values of type a.


I need some function for making Trees that isn't too simple or too complicated, and I went with:

    h n | n < 2 = Con n
    h n = if even n then Add (h (n `div` 2)) (h (n `div` 2))
                    else Add (Con 1) (h (n - 1))

which builds trees like these:

    2 = 1 + 1
    3 = 1 + (1 + 1)
    4 = (1 + 1) + (1 + 1)
    5 = 1 + ((1 + 1) + (1 + 1))
    6 = (1 + (1 + 1)) + (1 + (1 + 1))
    7 = 1 + (1 + (1 + 1)) + (1 + (1 + 1))
    8 = ((1 + 1) + (1 + 1)) + ((1 + 1) + (1 + 1))

Now I wanted to traverse h [1,2,3] but I couldn't do that because I didn't have an Applicative instance for Tree. I had been putting off dealing with this, but since Traversable doesn't really make sense without Applicative I thought the day of reckoning would come. Here it was. Now is when I learn how to fix all my broken monads.

To define an Applicative instance for Tree I needed to define pure, which is obvious (it's just Con) and <*> which would apply a tree of functions to a tree of inputs to get a tree of results. What the hell does that mean?

Well, I can kinda make sense of it. If I apply one function to a tree of inputs, that's straightforward, it's just fmap, and I get a tree of results. Suppose I have a tree of functions, and I replace the function at each leaf with the tree of its function's results. Then I have a tree of trees. But a tree that has trees at its leaves is just a tree. So I could write some tree-flattening function that builds the tree of trees, then flattens out the type. In fact this is just join that I already know from Monad world. The corresponding operation for lists takes a list of lists and flattens them into a single list.) Flattening a tree is quite easy to do:

    join (Con ta) = ta
    join (Add ttx tty) = Add (join ttx) (join tty)

and since this is enough to define a Monad instance for Tree I suppose it is enough to get an Applicative instance also, since every Monad is an Applicative. Haskell makes this a pain. It should be able to infer the Applicative from this, and I wasn't clever enough to do it myself. And there ought to be some formulaic way to get <*> from >>= and join and fmap, the way you can get join from >>=:

    join = (>>= id)

but I couldn't find out what it was. This gets back to my original complaint: Haskell now wants every Monad instance to be an instance of Applicative, but if I give it the fmap and the join and the return it ought to be able to figure out the Applicative instance itself instead of refusing to compile my program. Okay, fine, whatever. Haskell's gonna Hask.

(I later realized that building <*> when you have a Monad instance is easy once you know the recipe; it's just:

    fs <*> xs = do
      f <- fs
      x <- xs
      return (f x)

So again, why can't GHC infer <*> from my Monad instance, maybe with a nonfatal warning?

    Warning: No Applicative instance provided for Tree; deriving one from Monad

This is not a rhetorical question.)

(Side note: it seems like there ought to be a nice short abbreviation of the (<*>) function above, the way one can write join = (>>= id). I sought one but did not find any. One can eliminate the do notation to obtain the expression:

    fs <*> xs = fs >>= \f -> xs >>= \x -> return (f x)

but that is not any help unless we can simplify the expression with the usual tricks, such as combinatory logic and η-conversion. I was not able to do this, and the automatic pointfree converter produced (. ((. (return .)) . (>>=))) . (>>=) ARGH MY EYES.)

Anyway I did eventually figure out my <*> function for trees by breaking the left side into cases. When the tree of functions is Con f it's a single function and we can just use fmap to map it over the input tree:

    (Con f) <*> tv = fmap f tv

And when it's bigger than that we can break it up recursively:

    (Add lt rt) <*> tv = Add (lt <*> tv) (rt <*> tv)

Once this is written it seemed a little embarrassing that it took me so long to figure out what it meant but this kind of thing always seems easier from the far side of the fence. It's hard to understand until you understand it.

Actually that wasn't quite the <*> I wanted. Say we have a tree of functions and a tree of arguments.

three-node
tree diagram of the expression below
Add (Con (* 10))
    (Con (* 100))
5-node
tree diagram of the expression below
Add (Add (Con 3) (Con 4)) (Con 5)

I can map the whole tree of functions over each single leaf on the right, like this:

tree diagram
of the expression below, showing how each of the leaves of the second
tree has been replaced by a complete copy of the first tree.
The complete tree has five 'Add' nodes and six leaves with values 30,
300, 40, 400, 50, 500.
Add (Add (Add (Con 30) (Con 300))
         (Add (Con 40) (Con 400)))
    (Add (Con 50) (Con 500))

or I can map each function over the whole tree on the right, like this:

tree diagram
of the expression below, showing how each of the leaves of the first
tree has been replaced by a complete copy of the second tree.
As before, the complete tree has five 'Add' nodes and six leaves with
the same values, but this time the structure is different and the
leaves are grouped by length instead of by leading digit.
Add
  (Add (Add (Con 30)  (Con 40))  (Con 50))
  (Add (Add (Con 300) (Con 400)) (Con 500))

The code I showed earlier does the second of those. You can see it from the fmap f tv expression, which takes a single function and maps it over a whole tree of values. I had actually wanted the other one, but there isn't anything quite like fmap for that. I was busy trying to understand Applicative and I was afraid if I got distracted trying to invent a reverse fmap I might lose the thread. This happens to me a lot with Haskell. I did eventually go back and figure it out. The reverse fmap is

    pamf fs v = fmap ($ v) fs      -- good

or

    pamf = flip (fmap . flip id)   -- yuck

Now there's a simple answer to this which occurs to me now that I didn't think of before, but I'm going to proceed with how I planned to do it before, with pamf. The <*> that I didn't want looked like this:

    (Con f) <*> tv = fmap f tv
    (Add lt rt) <*> tv = Add (lt <*> tv) (rt <*> tv)

I need to do the main recursion on the values argument instead of on the functions argument:

    tf <*> (Con v)    = pamf tf v
       where pamf fs v = fmap ($ v) fs
    tf <*> (Add lv rv) = Add (tf <*> lv) (tf <*> rv)           

(This is an interesting example: usually the base case is trivial and the recursive clause is harder to write, but this time it's the base case that's not perfectly straightforward.)

Anyway, this worked, but there was an easier solution at hand. The difference between the first version and the second is exactly the same as the difference between

        fs <*> xs = do
          f <- fs
          x <- xs
          return (f x)

and

        fs <*> xs = do
          x <- xs
          f <- fs
          return (f x)

Digging deeper into why this worked this way was interesting, but it's bed time, so I'm going to cut the scroll here.

[ Addendum 20221021: More about the two versions of <*> and a third version that doesn't work. ]


[Other articles in category /prog/haskell] permanent link