The Universe of Disco


Tue, 04 Aug 2015

The list monad in Perl and Python

A few months ago I wrote an article about using Haskell's list monad to do exhaustive search, with the running example of solving this cryptarithm puzzle:

    S E N D
+   M O R E
-----------
  M O N E Y

(This means that we want to map the letters S, E, N, D, M, O, R, Y to distinct digits 0 through 9 to produce a five-digit and two four-digit numerals which, when added in the indicated way, produce the indicated sum.)

At the end, I said:

It would be an interesting and pleasant exercise to try to implement the same underlying machinery in another language. I tried this in Perl once, and I found that although it worked perfectly well, between the lack of the do-notation's syntactic sugar and Perl's clumsy notation for lambda functions (sub { my ($s) = @_; … } instead of \s -> …) the result was completely unreadable and therefore unusable. However, I suspect it would be even worse in Python because of semantic limitations of that language. I would be interested to hear about this if anyone tries it.

I was specifically worried about Python's peculiar local variable binding. But I did receive the following quite clear solution from Peter De Wachter, who has kindly allowed me to reprint it:

digits = set(range(10))

def to_number(*digits):
    n = 0
    for d in digits:
        n = n * 10 + d
    return n

def let(x, f):
    return f(x)

def unit(x):
     return [x]

def bind(xs, f):
     ys = []
     for x in xs:
         ys += f(x)
     return ys

def guard(b, f):
     return f() if b else []

after which the complete solution looks like:

def solutions():
    return bind(digits - {0}, lambda s:
           bind(digits - {s}, lambda e:
           bind(digits - {s,e}, lambda n:
           bind(digits - {s,e,n}, lambda d:
           let(to_number(s,e,n,d), lambda send:
           bind(digits - {0,s,e,n,d}, lambda m:
           bind(digits - {s,e,n,d,m}, lambda o:
           bind(digits - {s,e,n,d,m,o}, lambda r:
           let(to_number(m,o,r,e), lambda more:
           bind(digits - {s,e,n,d,m,o,r}, lambda y:
           let(to_number(m,o,n,e,y), lambda money:
           guard(send + more == money, lambda:
           unit((send, more, money))))))))))))))

print(solutions())

I think this shows that my fears were unfounded. This code produces the correct answer in about 1.8 seconds on my laptop.

Thus inspired, I tried doing it again in Perl, and it was not as bad as I remembered:

sub bd { my ($ls, $f) = @_;
  [ map @{$f->($_)}, @$ls ]      # Yow
}
sub guard { $_[0] ? [undef] : [] }

I opted to omit unit/return since an idiomatic solution doesn't really need it. We can't name the bind function bind because that is reserved for a built-in function; I named it bd instead. We could use Perl's operator overloading to represent binding with the >> operator, but that would require turning all the lists into objects, and it didn't seem worth doing.

We don't need to_number, because Perl does it implicitly, but we do need a set subtraction function, because Perl has no built-in set operators:

sub remove {
  my ($b, $a) = @_;
  my %h = map { $_ => 1 } @$a;
  delete $h{$_} for @$b;
  return [ keys %h ];
}

After which the solution, although cluttered by Perl's verbose notation for lambda functions, is not too bad:

my $digits = [0..9];
my $solutions =
  bd remove([0],        $digits) => sub { my ($s) = @_;
  bd remove([$s],       $digits) => sub { my ($e) = @_;
  bd remove([$s,$e],    $digits) => sub { my ($n) = @_;
  bd remove([$s,$e,$n], $digits) => sub { my ($d) = @_;
    my $send = "$s$e$n$d";

  bd remove([0,$s,$e,$n,$d],     $digits) => sub { my ($m) = @_;
  bd remove([$s,$e,$n,$d,$m],    $digits) => sub { my ($o) = @_;
  bd remove([$s,$e,$n,$d,$m,$o], $digits) => sub { my ($r) = @_;
    my $more = "$m$o$r$e";

  bd remove([$s,$e,$n,$d,$m,$o,$r], $digits) => sub { my ($y) = @_;
    my $money = "$m$o$n$e$y";
  bd guard($send + $more == $money) => sub { [[$send, $more, $money]] }}}}}}}}};

  for my $s (@$solutions) {
    print "@$s\n";
  }

This runs in about 5.5 seconds on my laptop. I guess, but am not sure, that remove is mainly at fault for this poor performance.

An earlier version of this article claimed, incorrectly, that the Python version had lazy semantics. It does not; it is strict.

[ Addendum: Aaron Crane has done some benchmarking of the Perl version. A better implementation of remove (using an array instead of a hash) does speed up the calculation somewhat, but contrary to my guess, the largest part of the run time is bd itself, apparently becuse Perl function calls are relatively slow.

HN user masklinn tried a translation of the Python code into a version that returns a lazy iterator; I gather the changes were minor. ]


[Other articles in category /prog] permanent link