The Universe of Disco


Fri, 27 Aug 2010

A dummy generator for mock objects
I am not sure how useful this actually is, but I after having used it once it was not yet obvious that it was a bad idea, so I am writing it up here.

Suppose you are debugging some method, say someMethod, which accepts as one of its arguments complicated, annoying objects $annoying that you either can't or don't want to instantiate. This might be because $annoying is very complicated, with many sub-objects to set up, or perhaps you simply don't know how to build $annoying and don't care to find out.

That is okay, because you can get someMethod to run without the full behavior of $annoying. Say for example someMethod calls $annoying->foo_manager->get_foo(...)->get_user_id. You don't understand or care about the details because for debugging someMethod it is enough to suppose that the end result is the user ID 3. You could supply a mock object, or several, that implement the various methods, but that requires some work up front.

Instead, use this canned Dummy class. Instead of instantiating a real $annoying (which is difficult) or using a bespoke mock object, use Dummy->new("annoying"):

        package Dummy;
        use Data::Dumper;
        $Data::Dumper::Terse = 1;
        our $METHOD;

        my @names = qw(bottle corncob euphonium octopus potato slide);
        my $NAME = "aaa";

        sub new {
          my ($class, $name) = @_;
          $name ||= $METHOD || shift(@names) || $NAME++;
          bless { N => $name } => $class;
        }
The call Dummy->new("annoying") will generate an ad-hoc mock object; whenever any method is called on this dummy object, the call will be caught by an AUTOLOAD that will prompt you for the return value you want it to produce:

        sub AUTOLOAD {
          my ($self, @args) = @_;
          my ($p, $m) = $AUTOLOAD =~ /(.*)::(.*)/;
          local $METHOD = $m;
          print STDERR "<< $_[0]{N}\->$m >>\n";
          print STDERR "Arguments: " . Dumper(\@args) . "\n";
          my $v;
          do {
            print STDERR "Value?  ";
            chomp($v = <STDIN>);
          } until eval "$v; 1";
          return(eval $v);
        }

        sub DESTROY { }

        1;
The prompt looks like this:

  << annoying->foo_manager >>
  Arguments: []
  Value? 
If the returned value should be a sub-object, no problem: just put in new Dummy and it will make a new Dummy object named foo_manager, and the next prompt will be:

  << foo_manager->get_foo >>
  Arguments: ...
  ...
  Value? 
Now you can put in new Dummy "(Fred's foo)" or whatever. Eventually it will ask you for a value for (Fred's foo)->id and you can have it return 4.

It's tempting to add caching, so that it won't ask you twice for the results of the same method call. But that would foreclose the option to have the call return different results twice. Better, I think, is for the user to cache the results themselves if they plan to use them again; there is nothing stopping the user from entering a value expression like $::val = ....

This may turn out to be one of those things that is mildly useful, but not useful enough to actually use; we'll see.


[Other articles in category /prog/perl] permanent link

Thu, 26 Aug 2010

Monad terminology problem
I think one problem (of many) that beginners might have with Haskell monads is the confusing terminology. The word "monad" can refer to four related but different things:

  1. The Monad typeclass itself.

  2. When a type constructor T of kind ∗ → ∗ is an instance of Monad we say that T "is a monad". For example, "Tree is a monad"; "((→) a) is a monad". This is the only usage that is strictly corrrect.

  3. Types resulting from the application of monadic type constructors (#2) are sometimes referred to as monads. For example, "[Integer] is a monad".

  4. Individual values of monadic types (#3) are often referred to as monads. For example, the "All About Monads" tutorial says "A list is also a monad".

Usage #1 is not a real problem; it does not occur that often, and is readily distinguished by context, capitalization, type font, and other markers. #2 is actually correct, so there is no problem there. #3 seems to be an uncommon colloquialism.

The most serious problem here is #4, that people refer to individual values of monadic types as "monads". Even when they don't do this, they are hampered by the lack of a good term for it. As I know no good alternative has been proposed. People often say "monadic value" (I think), which is accurate, but something of a mouthful.

One thing I have discovered in my writing life is that the clarity of a confusing document can sometimes be improved merely by replacing a polysyllabic noun phrase with a monosyllable. For example, chapter 3 of Higher-Order Perl discussed the technique of memoizing a function by generating an anonymous replacement for it that maintains a cache and calls the real function on a cache miss. Early drafts were hard to understand, and improved greatly when I replaced the phrase "anonymous replacement function" with "stub". The Perl documentation was significantly improved merely by replacing "associative array" everywhere with "hash" and "funny punctuation character" with "sigil".

I think a monosyllabic replacement for "monadic value" would be a similar boon to discussion of monads, not just for beginners but for everyone else too. The drawback, of introducing yet another jargon term, would in this case be outweighed by the benefits. Jargon can obscure, but sometimes it can clarify.

The replacement word should be euphonious, clear but not overly specific, and not easily confused with similar jargon words. It would probably be good for it to begin with the letter "m". I suggest:

mote

So return takes a value and returns a mote. The >>= function similarly lifts a function on pure values to a function on motes; when the mote is a container one may think of >>= as applying the function to the values in the container. [] is a monad, so lists are motes. The expression on the right-hand side of a var ← expr in a do-block must have mote type; it binds the mote on the right to the name on the left, using the >>= operator.

I have been using this term privately for several months, and it has been a small but noticeable success. Writing and debugging monadic programs is easier because I have a simple name for the motes that the program manipulates, which I can use when I mumble to myself: "What is the type error here? Oh, commit should be returning a mote." And then I insert return in the right place.

I'm don't want to oversell the importance of this invention. But there is clearly a gap in the current terminology, and I think it is well-filled by "mote".

(While this article was in progress I discovered that What a Monad is not uses the nonceword "mobit". I still prefer "mote".)


[Other articles in category /prog/haskell] permanent link