Archive:
In this section: Subtopics:
Comments disabled |
Tue, 01 Oct 2019
How do I keep type constructors from overrunning my Haskell program?
Here's a little function I wrote over the weekend as part of a suite for investigating Yahtzee:
I don't claim this code is any good; I was just hacking around exploring the problem space. But it does do what I wanted. The
which means that we have two rolls remaining in the round, and the most recent roll of the five dice showed 6, 4, 4, 3, and 1, respectively. It also takes a choice of which dice to keep: The list
means to keep the 4's and reroll the 6, the 3, and the 1.
The
This function was not hard to write and it did work adequately. But I wasn't satisfied. What if I have some unrelated integer list
and I pass it to a function that is expecting a
The declared type of
But now I need to rewrite
This still compiles and it still produces the results I want. And it
has the type checking I want. I can no longer pass a raw integer
list, or any other isomorphic type, to I could rename
And I can do something similar on the output side also:
This is not unreasonably longer or more cluttered than the original
code. It does forgo type checking inside of Is this considered The Thing To Do? And if so, where could I have learned this, so that I wouldn't have had to invent it? (Or, if not, where could I have learned whatever is The Thing To Do?) I find most Haskell instruction on the Internet to be either too elementary
or too advanced
with very little practical advice about how to write, you know, an actual program. Where can I find some? [Other articles in category /prog/haskell] permanent link Fri, 09 Nov 2018
Why I never finish my Haskell programs (part 3 of ∞)
I'm doing more work on matrix functions. A matrix represents a
relation, and I am representing a matrix as a $$ \require{enclose} \begin{pmatrix}1&2\\3&4\end{pmatrix}\enclose{circle}{\oplus} \begin{pmatrix}10&20\\30&40\end{pmatrix}= \begin{pmatrix} 11 & 21 & 12 & 22 \\ 31 & 41 & 32 & 42 \\ 13 & 23 & 14 & 24 \\ 33 & 43 & 34 & 44 \end{pmatrix} $$ No problem, this is what
gives But my matrices are nested lists, so I need to apply the
(The Now
does indeed produce the result I want, except that the type markers are still in there: instead of
I get
No problem, I'll just use
And now matrix addition is finished:
This works perfectly. But the
into this:
with the idea that I will now have The first sign that something is going wrong is that
where I was hoping for something more like this:
which is not reasonable to expect: how can Haskell be expected to
figure out I wanted two diferent functors in there when there is only one
Telling GHC explicitly what type I want for I get the same answers that Haskell got, but I can't see where the difference was coming from. So now, instead of defining matrix operations, I am looking into the
type unification algorithm and trying to figure out why And that is yet another reason why I never finish my Haskell programs. (“What do you mean, λ-abstraction didn't work?”) [Other articles in category /prog/haskell] permanent link Thu, 08 Nov 2018
Haskell type checker complaint 184 of 698
I want to build an adjacency matrix for the vertices of a cube; this
is a matrix that has
This compiles and GHC infers the type
Fine. Now I want to build the adjacency matrix, which is completely straightforward:
Ha ha, no it isn't; in Haskell nothing is straightforward. This
produces 106 lines of type whining, followed by a failed compilation.
Apparently this is because because To fix this I have to say explicitly what I mean by
Here's another way I could accomplish this:
Or how about this?
I think there must be something really wrong with the language design here. I don't know exactly what it is, but I think someone must have made the wrong tradeoff at some point. [Other articles in category /prog/haskell] permanent link Fri, 26 Oct 2018
A snide addendum about implicit typeclass instances
In an earlier article I demanded:
“This” being that instead of raising a type error, Haskell quietly accepts this nonsense:
but it clutches its pearls and faints in horror when confronted with this expression:
Nobody did explain this. But I imagined
someone earnestly explaining: “Okay, but in the first case, the
Yeah, yeah, I know that. Hey, you know what else is a functor? The
identity functor. If I understand this is a terrible idea. To be clear, what I want is for it to collapse on the divan for both expressions. Pearl-clutching is Haskell's finest feature and greatest strength, and it should do it whenever possible. [Other articles in category /prog/haskell] permanent link Tue, 23 Oct 2018
Getting Applicatives from Monads and “>>=” from “join”
I conplained recently about GHC not being able to infer an
Applicative instance from a type that already has a Monad
instance, and there is a related complaint that the Monad instance
must define But both of these problems can be worked around. If I have a Monad instance, it seems to work just fine if I say:
Where this code is completely canned, the same for every Monad. And if I know
I suppose these might faul foul of whatever problem is being described in the documents I linked above. But I'll either find out, or I won't, and either way is a good outcome. [ Addendum: Vaibhav Sagar points out that my definition of [Other articles in category /prog/haskell] permanent link Mon, 22 Oct 2018While I was writing up last week's long article about Traversable, I wrote this stuff about Applicative also. It's part of the story but I wasn't sure how to work it into the other narrative, so I took it out and left a remark that “maybe I'll publish a writeup of that later”. This is a disorganized collection of loosely-related paragraphs on that topic. It concerns my attempts to create various class instance definitions for the following type:
which notionally represents a type of very simple expression tree over values of type a. I need some function for making
which builds trees like these:
Now I wanted to To define an Applicative instance for Well, I can kinda make sense of it. If I apply one function to a
tree of inputs, that's straightforward, it's just
and since this is enough to define a Monad instance for
but I couldn't find out what it was. This gets back to my original
complaint: Haskell now wants every Monad instance to be an instance
of Applicative, but if I give it the (I later realized that building
So again, why can't GHC infer
This is not a rhetorical question.) (Side note: it seems like there ought to be a nice short abbreviation
of the
but that is not any help unless we can simplify the expression with
the usual tricks, such as combinatory logic and η-conversion. I was
not able to do this, and the automatic pointfree
converter produced
Anyway I did eventually figure out my
And when it's bigger than that we can break it up recursively:
Once this is written it seemed a little embarrassing that it took me so long to figure out what it meant but this kind of thing always seems easier from the far side of the fence. It's hard to understand until you understand it. Actually that wasn't quite the
I can map the whole tree of functions over each single leaf on the right, like this:
or I can map each function over the whole tree on the right, like this:
The code I showed earlier does the second of those. You can see it from
the
or
Now there's a simple answer to this which occurs to me now that I
didn't think of before, but I'm going to proceed with how I planned to
do it before, with
I need to do the main recursion on the values argument instead of on the functions argument:
(This is an interesting example: usually the base case is trivial and the recursive clause is harder to write, but this time it's the base case that's not perfectly straightforward.) Anyway, this worked, but there was an easier solution at hand. The difference between the first version and the second is exactly the same as the difference between
and
Digging deeper into why this worked this way was interesting, but it's bed time, so I'm going to cut the scroll here. [Other articles in category /prog/haskell] permanent link Sat, 20 Oct 2018
I struggle to understand Traversable
Haskell evolved a lot since the last time I seriously wrote any
Haskell code, so much so that all my old programs broke. My Monad
instances don't compile any more because I'm no longer allowed to
have a monad which isn't also an instance of Applicative. Last time I used
Haskell, Applicative wasn't even a thing. I had read the McBride and
Paterson paper that introduced applicative functors, but that was
years ago, and I didn't remember any of the details. (In fact, while
writing this article, I realized that the paper I read was a preprint,
and I probably read it before it was published, in 2008.) So to
resuscitate my old code I had to implement a bunch of Anyway I got that more or less under control (maybe I'll publish a
writeup of that later) and moved on to Traversable which, I hadn't realized
before, was also introduced in that same paper. (In the
prepublication version, Traversable had been given the unmemorable name
The traversable functor itself here is The first thing to try here is to make it less abstract. I was thinking about Traversable this time because I thought I might want it for a certain type of tree structure I was working with. So I defined an even simpler tree structure:
Defining a bunch of other cases wouldn't add anything to my understanding, and it would make it take longer to try stuff, so I really want to use the simplest possible example here. And this is it: one base case, one recursive case. Then I tried to make this type it into a Traversable instance. First we need it to be a Functor, which is totally straightforward:
Then we need it to be a Foldable, which means it needs to provide a
version of
but these days the list functor in the third place has been generalized:
The idea is that
and
The canonical examples for lists are:
(add up the elements, starting with zero) and
(ignore the elements, adding 1 to the total each time, starting with
zero). Also Anyway for
The I didn't write this off the top of my head, I got it by following the types, like this:
It turns out it is easier and more straightforward to write
and here I was stumped. What is this supposed to actually do?
For our
Okay, a function I scratched my head and read a bunch of different explanations and none of them helped. All the descriptions I found were in either prose or mathematics and I still couldn't figure out what it was for. Finally I just wrote a bunch of examples and at last the light came on. I'm going to show you the examples and maybe the light will come on for you too. We need two Traversable functors to use as examples. We don't have a Traversable
implementation for
Okay, I think I could have guessed that just from the types. And
going the other way is not very interesting because the output, being
a
If the !!x!! is even then the result is just half of !!x!!, and otherwise the division by 2 “fails” and the result is nothing. Now:
It took me a few examples to figure out what was going on here: When
all the list elements are even, the result is That pretty much exhausts what can be done with lists and maybes. Now
I have two choices about where to go next: I could try making both
functors
In the
which not only type checks but looks like it could even be correct.
So now I have a motto for what Which, now that I have said it myself, I realize it is exactly what
everyone else was trying to tell me all along: normal function
application takes an Okay, I can listen all day to an explanation of what an electric drill does, but until I hold it in my hand and drill some holes I don't really understand. Encouraged, I tried the hard clause:
and this time I had a roadmap to follow:
The
Clearly
let's try that:
This looks plausible. It compiles, so it must be doing something.
Partial victory! But what is it doing? We can run it and see, which
was the whole point of an exercise: work up a Traversable instance for Here are some example trees:
(I also tried First we'll try
but So try:
which yields:
It keeps the existing structure, and applies But where does that spoilage behavior come from exactly? It comes
from the overloaded behavior of
Once we get a I think that's one way to think of Now let's try the next-simplest Applicative, which is
Now
This is where the light finally went on for me. Instead of thinking
of lists as lists, I should be thinking of them as choices. A list
like The Traversing Now I finally understand how the
I asked “how the hell do I turn a tree of lists into a single list
of Okay! And indeed
and ending
That was traversing a list function over a What other functors do I know? One easy one is the functor that takes
type
Huh, I don't know what I was expecting but I think that wouldn't have
been it. But I figured out what was going on: the built-in Applicative
instance for the
But if we wanted it to multiply instead we could use the potato label,
which is called
There are three leaves, so we multiply three sevens and get 343. Or we could do the same sort of thing on a
Here instead of multiplying together a bunch of sevens we multiply together the leaf values themselves. The McBride and Paterson paper spends a couple of pages talking about
traversals over monoids, and when I saw the example above it started
to make more sense to me. And their
There's another useful way to traverse a list function. Instead of taking each choice at each leaf we make a single choice ahead of time about whether we'll take the first, second, or third menu item, and then we take that item every time:
There's a built-in instance for
Okay, I think I got it. Now I just have to drill some more holes. [Other articles in category /prog/haskell] permanent link Mon, 15 Oct 2018
'The' reader monad does not exist
Reading over my recent article complaining about the environment functor I realized there's yet another terminology problem that makes the discussion unnecessarily confusing. “The” environment functor isn't unique. There is a family of environment functors, one for each possible environment type e. If g is the environment functor at type e, a value of type g t is a function e → t. But e could be anything and if g and h are environment functors at two different types e and e’ they are of course different functors. This is even obvious from the definition:
The functor isn't We should speak of
I should have said:
And instead of:
I should have said:
or
although I'm not sure I like the way the prepositions are proliferating there. The same issue affects ⸢the⸣ reader monad, ⸢the⸣ state monad, and many others. I'm beginning to find remarkable how much basic terminology Haskell is missing or gets wrong. Mathematicians have a very keen appreciation of the importance of specific and precise terminology, and you'd think this would have filtered into the Haskell world. People are forever complaining that Haskell uses unfamiliar terms like “functor”, and the community's response is (properly, I think) that these terms are pre-existing and there is no point to inventing a new term that will be just as unfamiliar, or, worse, lure people into thinking that the know what it means when they don't. You don't want to call a functor a “container”, says the argument, because many functors (environment functors for example) are nothing at all like containers. I think this is wise. But having planted their flag on that hill, the Haskell folks don't
then use their own terminology correctly. I complained years
ago that the term
“monad” was used interchangeably for four subtly different concepts,
and here we actually have a fifth. I pointed out that in the case of
[Other articles in category /prog/haskell] permanent link Thu, 11 Oct 2018
The more I think about “parcel” the happier I am with it. It strongly
suggests container types, of course, so that a
I coined “parcel” thinking that one would want different terminology
for values of type [Other articles in category /prog/haskell] permanent link
I hate the environment functor
Here we have the well-known
It takes a single function and a (collection of input values / decorated input value / something something input value) and produces a (collection of output values / decorated output value / something something output value). Yow, that's not going to work. Is there any good terminology for a
value of type Starting over then. Here we have the well-known
It takes a single function, and an Here is a sort of reversed version of
It takes a parcel of functions, and a single input and produces a
parcel of outputs, by applying each function in the parcel
independently to the single
So far so good. Now I ask you to predict the type of
Certainly it should start out with
because the
The
and lifts it to a new function that operates in the
Here it has taken
and lifted it to a new function that operates in the
This is complicated but straightforward. Okay, that was
and when I saw this I said “What. Where did Then I paused and for a while and said “… I bet it's that goddamn environment thing again.” Yep, that's what it was. It's the environment functor, always turning up where I don't want it and least expect it, like that one guy we all went to college with. The environment functor, by the way, is yet another one of those things that Haskell ought to have a standard name for, but doesn't. The phrase “the reader monad” is fairly common, but here I only want the functor part of the monad. And people variously say “reader monad”, “environment monad”, and “evaluation monad” to mean the same thing. In this article, it will be the environment functor. Here's what happened. Here are
The first argument to When operating in the environment functor,
or shorter and more mysteriously
which follows by η-reduction, something Haskell enthusiasts never seem to get enough of. In In the application
so it can be understood as a parcel in the environment functor, where
the environment We wanted
and since Haskell has decided that
To apply this to
Where did The funny thing about the type of
and indeed, by some theorem or other, because the types are identical,
the functions themselves must be identical also! (There are some side
conditions, all of which hold here.) The two functions
Or, cleaning up some superfluous parentheses and inserting some new ones:
And putting !!c = p\to q!!:
Honestly, I would have preferred a type error: “Hey, dummy,
I mean, seriously, suppose you wrote [ Addendum 20181111: Apparently, everyone else hates it too. ] [Other articles in category /prog/haskell] permanent link Sat, 08 Sep 2018
Why I never finish my Haskell programs (part 2 of ∞)
Here's something else that often goes wrong when I am writing a Haskell program. It's related to the problem in the previous article but not the same. Let's say I'm building a module for managing polynomials. Say
Now clearly this is going to be a functor, so I define the Functor instance, which is totally straightforward:
Then I ask myself if it is also going to be an Applicative.
Certainly the
But what about
The first argument there is a polynomial whose coefficients are functions. This is not something we normally deal with. That ought to be the end of the matter. But instead I pursue it just a little farther. Suppose we did have such an object. What would it mean to apply a functional polynomial and an ordinary polynomial? Do we apply the functions on the left to the coefficients on the right and then collect like terms? Say for example $$\begin{align} \left((\sqrt\bullet) \cdot x + \left(\frac1\bullet\right) \cdot 1 \right) ⊛ (9x+4) & = \sqrt9 x^2 + \sqrt4 x + \frac19 x + \frac14 \\ & = 3x^2 + \frac{19}{9} x + \frac 14 \end{align}$$ Well, this is kinda interesting. And it would mean that the
Then the ⊛ can be understood to be just like polynomial
multiplication, except that coefficients are combined with function
composition instead of with multiplication. The operation is
associative, as one would hope and expect, and even though the ⊛
operation is not commutative, it has a two-sided identity element,
which is This is different from the failure mode of the previous article because in that example I was going down a Haskell rabbit hole of more and more unnecessary programming. This time the programming is all trivial. Instead, I've discovered a new kind of mathematical operation and I abandon the programming entirely and go off chasing a mathematical wild goose. [ Addendum 20181109: Another one of these. ] [Other articles in category /prog/haskell] permanent link Mon, 03 Sep 2018
Why I never finish my Haskell programs (part 1 of ∞)
Whenever I try to program in Haskell, the same thing always goes wrong. Here is an example. I am writing a module to operate on polynomials. The polynomial !!x^3 - 3x + 1!! is represented as
[ Addendum 20180904: This is not an error. The !!x^3!! term is last, not first. Much easier that way. Fun fact: two separate people on Reddit both commented that I was a dummy for not doing it the easy way, which is the way I did do it. Fuckin' Reddit, man. ] I want to add two polynomials. To do this I just add the corresponding coefficients, so it's just
Except no, that's wrong, because it stops too soon. When the lists
are different lengths,
and I can write this off the top of my head. But do I? No, this is where things go off the rails. “I ought to be
able to generalize this,” I say. “I can define a function like
as long as there is a suitable Monoid instance for the I could write So do I write Then I open a new file and start writing
And I go father and farther down the rabbit hole and I never come back
to what I was actually working on. Maybe the next step in this
descent into madness is that I start thinking about how to perform
unification of arbitrary algebraic data structures, I abandon Actually when I try to program in Haskell there a lot of things that go wrong and this is only one of them, but it seems like this one might be more amenable to a quick fix than some of the other things. [ Addendum 20180904: A lobste.rs
user
points out that I don't need Monoid, but only Semigroup, since
I don't need [ Addendum 20181109: More articles in this series: [2] [3] ] [Other articles in category /prog/haskell] permanent link Wed, 08 Aug 2018In my original article, I said:
Jeremy Yallop brought up an example that I had definitely seen before. In 2008 Conor McBride and Ross Paterson wrote an influential paper, “Idioms: applicative programming with effects” that introduced the idea of an applicative functor, a sort of intermediate point between functors and monads. It has since made its way into standard Haskell and was deemed sufficiently important to be worth breaking backward compatibility. McBride and Paterson used several notations for operations in an
applicative functor. Their primary notation was !!\iota!! for what is
now known as $$\iota f \circledast is_1 \circledast \ldots \circledast is_n$$ came up so often they wanted a less cluttered notation for it:
On page 5, they suggested an exercise:
They give a hint, intended to lead the reader to the solution, which
involves a function named
and have it mean
The haskell wiki has details, written by Don Stewart when the McBride-Paterson paper was still in preprint. The wiki goes somewhat further, also defining
so that
now does a I have certainly read this paper more than once, and I was groping for this example while I was writing the original article, but I couldn't quite put my finger on it. Thank you, M. Yallop! [ By the way, I am a little bit disappointed that the haskell wiki is not called “Hicki”. ] [Other articles in category /prog/haskell] permanent link In the previous article I described a rather odd abuse of the Haskell type system to use a singleton type as a sort of pseudo-keyword, and asked if anyone had seen this done elsewhere. Joachim Breitner reported having seen this before. Most recently in
LiquidHaskell, which defines a
so that they can end every proof with
This example is from Vazou et al., Functional Pearl: Theorem Proving
for All, p. 3. The authors
explain: “The Or see the examples from the bottom of the LH splash
page, proving the
associative law for I looked in the rest of the LiquidHaskell distribution but did not find any other uses of the singleton-type trick. I would still be interested to see more examples. [ Addendum: Another example. ] [Other articles in category /prog/haskell] permanent link
Is this weird Haskell technique something I made up?
A friend asked me the other day about techniques in Haskell to pretend
to make up keywords. For example, suppose we want something like a
(monadic)
This uses a condition Now suppose for whatever reason we don't like writing it as
Now we can write
and the But then I had a surprising idea. We can define it this way:
Now we write
and if we omit or misspell the For a less trivial (but perhaps sillier) example, consider:
The idea here is that we want to try a computation, and do one thing
if it succeeds and another if it throws an exception. The point is
not the usefulness of this particular and somewhat contrived exception
handling construct, it's the syntactic sugar of the
I was fairly confident I had seen something like this somewhere before, and that it was not original to me. But I've asked several Haskell experts and nobody has said it was familar. I thought perhaps I had seen it somewhere in Brent Yorgey's code, but he vehemently denied it. So my question is, did I make up this technique of using a one-element type as a pretend keyword? [ Addendum: At least one example of this trick appears in LiquidHaskell. I would be interested to hear about other places it has been used. ] [ Addendum: Jeremy Yallop points out that a similar trick was hinted at in McBride and Paterson “Idioms: applicative programming with effects” (2008), with which I am familiar, although their trick is both more useful and more complex. So this might have been what I was thinking of. ] [Other articles in category /prog/haskell] permanent link Fri, 24 Apr 2015
Easy exhaustive search with the list monad
(Haskell people may want to skip this article about Haskell, because the technique is well-known in the Haskell community.) Suppose you would like to perform an exhaustive search. Let's say for concreteness that we would like to solve this cryptarithm puzzle:
This means that we want to map the letters (This is not an especially difficult example; my 10-year-old daughter Katara was able to solve it, with some assistance, in about 30 minutes.) If I were doing this in Perl, I would write up either a recursive descent search or a solution based on a stack or queue of partial solutions which the program would progressively try to expand to a full solution, as per the techniques of chapter 5 of Higher-Order Perl. In Haskell, we can use the list monad to hide all the searching machinery under the surface. First a few utility functions:
Now the solution to the problem is:
Let's look at just the first line of this:
The
where “…” is the rest of the block. To expand this further, we need
to look at the overloading for
where “…” is the rest of the block. So the variable The next line is the same:
for each of the nine possible values for
This is two more nested loops.
At this point the value of
Three more nested loops and another computation.
Yet another nested loop and a final computation.
This is the business end. I find
which is equivalent to:
which means that the values in the list returned by If But if The result is that if we have found a solution at this point, a list
containing it is returned, to be concatenated into the list of all
solutions that is being constructed by the nested After a few seconds, Haskell generates and tests 1.36 million choices for the eight bindings, and produces the unique solution:
That is:
It would be an interesting and pleasant exercise to try to implement
the same underlying machinery in another language. I tried this in
Perl once, and I found that although it worked perfectly well, between
the lack of the [ Addendum: Thanks to Tony Finch for pointing out the η-reduction I missed while writing this at 3 AM. ] [ Addendum: Several people so far have misunderstood the question
about Python in the last paragraph. The question was not to implement
an exhaustive search in Python; I had no doubt that it could be done
in a simple and clean way, as it can in Perl. The question was to
implement the same underlying machinery, including the list monad
and its [ Peter De Wachter has written in with a Python solution that clearly demonstrates that the problems I was worried about will not arise, at least for this task. I hope to post his solution in the next few days. ] [ Addendum 20150803: De Wachter's solution and one in Perl ] [Other articles in category /prog/haskell] permanent link Thu, 26 Aug 2010
Monad terminology problem
The most serious problem here is #4, that people refer to individual values of monadic types as "monads". Even when they don't do this, they are hampered by the lack of a good term for it. As I know no good alternative has been proposed. People often say "monadic value" (I think), which is accurate, but something of a mouthful. One thing I have discovered in my writing life is that the clarity of a confusing document can sometimes be improved merely by replacing a polysyllabic noun phrase with a monosyllable. For example, chapter 3 of Higher-Order Perl discussed the technique of memoizing a function by generating an anonymous replacement for it that maintains a cache and calls the real function on a cache miss. Early drafts were hard to understand, and improved greatly when I replaced the phrase "anonymous replacement function" with "stub". The Perl documentation was significantly improved merely by replacing "associative array" everywhere with "hash" and "funny punctuation character" with "sigil". I think a monosyllabic replacement for "monadic value" would be a similar boon to discussion of monads, not just for beginners but for everyone else too. The drawback, of introducing yet another jargon term, would in this case be outweighed by the benefits. Jargon can obscure, but sometimes it can clarify. The replacement word should be euphonious, clear but not overly specific, and not easily confused with similar jargon words. It would probably be good for it to begin with the letter "m". I suggest: So return takes a value and returns a mote. The >>= function similarly lifts a function on pure values to a function on motes; when the mote is a container one may think of >>= as applying the function to the values in the container. [] is a monad, so lists are motes. The expression on the right-hand side of a var ← expr in a do-block must have mote type; it binds the mote on the right to the name on the left, using the >>= operator. I have been using this term privately for several months, and it has been a small but noticeable success. Writing and debugging monadic programs is easier because I have a simple name for the motes that the program manipulates, which I can use when I mumble to myself: "What is the type error here? Oh, commit should be returning a mote." And then I insert return in the right place. I'm don't want to oversell the importance of this invention. But there is clearly a gap in the current terminology, and I think it is well-filled by "mote". (While this article was in progress I discovered that What a Monad is not uses the nonceword "mobit". I still prefer "mote".)
[Other articles in category /prog/haskell] permanent link Sun, 03 Jan 2010
A short bibliography of probability monads
I did not imagine that my idea was a new one. I arrived at it by thinking about List as a representation of non-deterministic computation. But if you think of it that way, the natural interpretation is that every list element represents an equally likely outcome, and so annotating the list elements with probabilities is the obvious next step. So the existence of the Erwig library was not a big surprise. A little more surprising though, were the references in the Erwig paper. Specifically, the idea dates back to at least 1981; Erwig cites a paper that describes the probability monad in a pure-mathematics context. Nobody responded to my taunting complaint about Haskell's failure to provide support a good monad of sets. It may be that this is because they all agree with me. (For example, the documentation of the Erwig package says "Unfortunately we cannot use a more efficient data structure because the key type must be of class Ord, but the Monad class does not allow constraints for result types.") But a number of years ago I said that the C++ macro processor blows goat dick. I would not have put it so strongly had I not naïvely believed that this was a universally-held opinion. But no, plenty of hapless C++ programmers wrote me indignant messages defending their macro system. So my being right is no guarantee that language partisans will not dispute with me, and the Haskell community's failure to do so in this case reflects well on them, I think.
[Other articles in category /prog/haskell] permanent link Thu, 31 Dec 2009
A monad for probability and provenance
Suppose a monad value represents all the possible outcomes of an event, each with a probability of occurrence. For concreteness, let's suppose all our probability distributions are discrete. Then we might have: data ProbDist p a = ProbDist [(a,p)] deriving (Eq, Show) unpd (ProbDist ps) = psEach a is an outcome, and each p is the probability of that outcome occurring. For example, biased and unbiased coins:
unbiasedCoin = ProbDist [ ("heads", 0.5), ("tails", 0.5) ]; biasedCoin = ProbDist [ ("heads", 0.6), ("tails", 0.4) ]; Or a couple of simple functions for making dice:
import Data.Ratio d sides = ProbDist [(i, 1 % sides) | i <- [1 .. sides]] die = d 6
d n is an n-sided die. The Functor instance is straightforward:
instance Functor (ProbDist p) where fmap f (ProbDist pas) = ProbDist $ map (\(a,p) -> (f a, p)) pasThe Monad instance requires return and >>=. The return function merely takes an event and turns it into a distribution where that event occurs with probability 1. I find join easier to think about than >>=. The join function takes a nested distribution, where each outcome of the outer distribution specifies an inner distribution for the actual events, and collapses it into a regular, overall distribution. For example, suppose you put a biased coin and an unbiased coin in a bag, then pull one out and flip it:
bag :: ProbDist Double (ProbDist Double String) bag = ProbDist [ (biasedCoin, 0.5), (unbiasedCoin, 0.5) ]The join operator collapses this into a single ProbDist Double String:
ProbDist [("heads",0.3), ("tails",0.2), ("heads",0.25), ("tails",0.25)]It would be nice if join could combine the duplicate heads into a single ("heads", 0.55) entry. But that would force an Eq a constraint on the event type, which isn't allowed, because (>>=) must work for all data types, not just for instances of Eq. This is a problem with Haskell, not with the monad itself. It's the same problem that prevents one from making a good set monad in Haskell, even though categorially sets are a perfectly good monad. (The return function constructs singletons, and the join function is simply set union.) Maybe in the next language. Perhaps someone else will find the >>= operator easier to understand than join? I don't know. Anyway, it's simple enough to derive once you understand join; here's the code:
instance (Num p) => Monad (ProbDist p) where return a = ProbDist [(a, 1)] (ProbDist pas) >>= f = ProbDist $ do (a, p) <- pas let (ProbDist pbs) = f a (b, q) <- pbs return (b, p*q)So now we can do some straightforward experiments:
liftM2 (+) (d 6) (d 6) ProbDist [(2,1 % 36),(3,1 % 36),(4,1 % 36),(5,1 % 36),(6,1 % 36),(7,1 % 36),(3,1 % 36),(4,1 % 36),(5,1 % 36),(6,1 % 36),(7,1 % 36),(8,1 % 36),(4,1 % 36),(5,1 % 36),(6,1 % 36),(7,1 % 36),(8,1 % 36),(9,1 % 36),(5,1 % 36),(6,1 % 36),(7,1 % 36),(8,1 % 36),(9,1 % 36),(10,1 % 36),(6,1 % 36),(7,1 % 36),(8,1 % 36),(9,1 % 36),(10,1 % 36),(11,1 % 36),(7,1 % 36),(8,1 % 36),(9,1 % 36),(10,1 % 36),(11,1 % 36),(12,1 % 36)]This is nasty-looking; we really need to merge the multiple listings of the same event. Here is a function to do that:
agglomerate :: (Num p, Eq b) => (a -> b) -> ProbDist p a -> ProbDist p b agglomerate f pd = ProbDist $ foldr insert [] (unpd (fmap f pd)) where insert (k, p) [] = [(k, p)] insert (k, p) ((k', p'):kps) | k == k' = (k, p+p'):kps | otherwise = (k', p'):(insert (k,p) kps) agg :: (Num p, Eq a) => ProbDist p a -> ProbDist p a agg = agglomerate idThen agg $ liftM2 (+) (d 6) (d 6) produces:
ProbDist [(12,1 % 36),(11,1 % 18),(10,1 % 12),(9,1 % 9), (8,5 % 36),(7,1 % 6),(6,5 % 36),(5,1 % 9), (4,1 % 12),(3,1 % 18),(2,1 % 36)]Hey, that's correct. There must be a shorter way to write insert. It really bothers me, because it looks look it should be possible to do it as a fold. But I couldn't make it look any better. You are not limited to calculating probabilities. The monad actually will count things. For example, let us throw three dice and count how many ways there are to throw various numbers of sixes:
eq6 n = if n == 6 then 1 else 0
agg $ liftM3 (\a b c -> eq6 a + eq6 b + eq6 c) die die die
ProbDist [(3,1),(2,15),(1,75),(0,125)]
There is one way to throw three sixes, 15 ways to throw two sixes, 75
ways to throw one six, and 125 ways to throw no sixes. So
ProbDist is a misnomer. It's easy to convert counts to probabilities:
probMap :: (p -> q) -> ProbDist p a -> ProbDist q a probMap f (ProbDist pds) = ProbDist $ (map (\(a,p) -> (a, f p))) pds normalize :: (Fractional p) => ProbDist p a -> ProbDist p a normalize pd@(ProbDist pas) = probMap (/ total) pd where total = sum . (map snd) $ pas normalize $ agg $ probMap toRational $ liftM3 (\a b c -> eq6 a + eq6 b + eq6 c) die die die ProbDist [(3,1 % 216),(2,5 % 72),(1,25 % 72),(0,125 % 216)]I think this is the first time I've gotten to write die die die in a computer program. The do notation is very nice. Here we calculate the distribution where we roll four dice and discard the smallest:
stat = do a <- d 6 b <- d 6 c <- d 6 d <- d 6 return (a+b+c+d - minimum [a,b,c,d]) probMap fromRational $ agg stat ProbDist [(18,1.6203703703703703e-2), (17,4.1666666666666664e-2), (16,7.253086419753087e-2), (15,0.10108024691358025), (14,0.12345679012345678), (13,0.13271604938271606), (12,0.12885802469135801), (11,0.11419753086419752), (10,9.41358024691358e-2), (9,7.021604938271606e-2), (8,4.7839506172839504e-2), (7,2.9320987654320986e-2), (6,1.6203703703703703e-2), (5,7.716049382716049e-3), (4,3.0864197530864196e-3), (3,7.716049382716049e-4)]One thing I was hoping to get didn't work out. I had this idea that I'd be able to calculate the outcome of a game of craps like this:
dice = liftM2 (+) (d 6) (d 6) point n = do roll <- dice case roll of 7 -> return "lose" _ | roll == n = "win" _ | otherwise = point n craps = do roll <- dice case roll of 2 -> return "lose" 3 -> return "lose" 4 -> point 4 5 -> point 5 6 -> point 6 7 -> return "win" 8 -> point 8 9 -> point 9 10 -> point 10 11 -> return "win" 12 -> return "lose"This doesn't work at all; point is an infinite loop because the first value of dice, namely 2, causes a recursive call. I might be able to do something about this, but I'll have to think about it more. It also occurred to me that the use of * in the definition of >>= / join could be generalized. A couple of years back I mentioned a paper of Green, Karvounarakis, and Tannen that discusses "provenance semirings". The idea is that each item in a database is annotated with some "provenance" information about why it is there, and you want to calculate the provenance for items in tables that are computed from table joins. My earlier explanation is here. One special case of provenance information is that the provenances are probabilities that the database information is correct, and then the probabilities are calculated correctly for the joins, by multiplication and addition of probabilities. But in the general case the provenances are opaque symbols, and the multiplication and addition construct regular expressions over these symbols. One could generalize ProbDist similarly, and the ProbDist monad (even more of a misnomer this time) would calculate the provenance automatically. It occurs to me now that there's probably a natural way to view a database table join as a sort of Kleisli composition, but this article has gone on too long already. Happy new year, everyone. [ Addendum 20100103: unsurprisingly, this is not a new idea. Several readers wrote in with references to previous discussion of this monad, and related monads. It turns out that the idea goes back at least to 1981. ]
My thanks to Graham Hunter for his donation.
[Other articles in category /prog/haskell] permanent link Tue, 16 Jun 2009
Haskell logo fail
Ouch.
[Other articles in category /prog/haskell] permanent link Thu, 03 Jan 2008
Note on point-free programming style
grep '^X-Spam-Level' | sort | uniq | wc -land the analogous Haskell code:
length . nub . sort . filter (isPrefixOf "X-Spam-Level")Neither one explicitly mentions its argument, which is why this is "point-free". In "point-free" programming, instead of defining a function in terms of its effect on its arguments, one defines it by composing the component functions themselves, directly, with higher-order operators. For example, instead of:
foo x y = 2 * x + yone has, in point-free style:
foo = (+) . (2 *)where (2 *) is the function that doubles its argument, and (+) is the (curried) addition function. The two definitions of foo are entirely equivalent. As the two examples should make clear, point-free style is sometimes natural, and sometimes not, and the example chosen by M. Lai was carefully selected to bias the argument in favor of point-free style. Often, after writing a function in pointful style, I get the computer to convert it automatically to point-free style, just to see what it looks like. This is usually educational, and sometimes I use the computed point-free definition instead. As I get better at understanding point-free programming style in Haskell, I am more and more likely to write certain functions point-free in the first place. For example, I recently wrote:
soln = int 1 (srt (add one (neg (sqr soln))))and then scratched my head, erased it, and replaced it with the equivalent:
soln = int 1 ((srt . (add one) . neg . sqr) soln)I could have factored out the int 1 too: soln = (int 1 . srt . add one . neg . sqr) solnI could even have removed soln from the right-hand side:
soln = fix (int 1 . srt . add one . neg . sqr)but I am not yet a perfect sage. Sometimes I opt for an intermediate form, one in which some of the arguments are explicit and some are implicit. For example, as an exercise I wrote a function numOccurrences which takes a value and a list and counts the number of times the value occurs in the list. A straightforward and conventional implementation is:
numOccurrences x [] = 0 numOccurrences x (y:ys) = if (x == y) then 1 + rest else rest where rest = numOccurrences x ysbut the partially point-free version I wrote was much better:
numOccurrences x = length . filter (== x)Once you see this, it's easy to go back to a fully pointful version:
numOccurrences x y = length (filter (== x) y)Or you can go the other way, to a point-free version:
numOccurrences = (length .) . filter . (==)which I find confusing. Anyway, the point of this note is not to argue that the point-free style is better or worse than the pointful style. Sometimes I use the one, and sometimes the other. I just want to point out that the argument made by M. Lai is deceptive, because of the choice of examples. As an equally biased counterexample, consider:
bar x = x*x + 2*x + 1which the automatic converter informs me can be written in point-free style as:
bar = (1 +) . ap ((+) . join (*)) (2 *)Perusal of this example will reveal much to the attentive reader, including the definitions of join and ap. But I don't think many people would argue that it is an improvement on the original. (Maybe I'm wrong, and people would argue that it was an improvement. I won't know for sure until I have more experience.) For some sort of balance, here is another example where I think the point-free version is at least as good as the pointful version: a recent comment on Reddit suggested a >>> operator that composes functions just like the . operator, but in the other order, so that: f >>> g = g . for, if you prefer:
(>>>) f g x = g(f(x))The point-free definition of >>> is:
(>>>) = flip (.)where the flip operator takes a function of two arguments and makes a new function that does the same thing, but with the arguments in the opposite order. Whatever your feelings about point-free style, it is undeniable that the point-free definition makes perfectly clear that >>> is nothing but . with its arguments in reverse order.
[Other articles in category /prog/haskell] permanent link |