Tree search in Haskell
In Perl I would often write a generic tree search function:
# Perl
sub search {
my ($is_good, $children_of, $root) = @_;
my @queue = ($root);
return sub {
while (1) {
return unless @queue;
my $node = shift @queue;
push @queue, $children_of->($node);
return $node if $is_good->($node);
}
}
}
For example, see Higher-Order Perl, section 5.3.
To use this, we provide two callback functions. $is_good checks
whether the current item has the properties we were searching for.
$children_of takes an item and returns its children in the tree.
The search function returns an iterator object, which, each time it is
called, returns a single item satisfying the $is_good predicate, or
undef if none remains. For example, this searches the space of all
strings over abc for palindromic strings:
# Perl
my $it = search(sub { $_[0] eq reverse $_[0] },
sub { return map "$_[0]$_" => ("a", "b", "c") },
"");
while (my $pal = $it->()) {
print $pal, "\n";
}
Many variations of this are possible. For example, replacing push
with unshift changes the search from breadth-first to depth-first.
Higher-Order Perl shows how to modify it to do heuristically-guided search.
I wanted to do this in Haskell, and my first try didn’t work at all:
-- Haskell
search1 :: (n -> Bool) -> (n -> [n]) -> n -> [n]
search1 isGood childrenOf root =
s [root]
where
s nodes = do
n <- nodes
filter isGood (s $ childrenOf n)
There are two problems with this. First, the filter is in the wrong
place. It says that the search should proceed downward only from the
good nodes, and stop when it reaches a not-good node.
To see what's wrong with this, consider a search for palindromes. Th
string ab isn't a palindrome, so the search would be cut off
at ab , and never proceed downward to find aba or abccbccba .
It should be up to childrenOf to decide how to
continue the search. If the search should be pruned at a particular
node, childrenOf should return an empty list of children. The
$isGood callback has no role here.
But the larger problem is that in most cases this function will compute
forever without producing any output at all, because the call to s
recurses before it returns even one list element.
Here’s the palindrome example in Haskell:
palindromes = search isPalindrome extend ""
where
isPalindrome s = (s == reverse s)
extend s = map (s ++) ["a", "b", "c"]
This yields a big fat !!\huge \bot!!: it does nothing, until memory is
exhausted, and then it crashes.
My next attempt looked something like this:
search2 :: (n -> Bool) -> (n -> [n]) -> n -> [n]
search2 isGood childrenOf root = filter isGood $ s [root]
where
s nodes = do
n <- nodes
n : (s $ childrenOf n)
The filter has moved outward, into a single final pass over the
generated tree. And s now returns a list that at least has the node
n on the front, before it recurses. If one doesn’t look at the nodes
after n , the program doesn’t make the recursive call.
The palindromes program still isn’t right though. take 20
palindromes produces:
["","a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa",
"aaaaaaaaa", "aaaaaaaaaa","aaaaaaaaaaa","aaaaaaaaaaaa",
"aaaaaaaaaaaaa","aaaaaaaaaaaaaa", "aaaaaaaaaaaaaaa",
"aaaaaaaaaaaaaaaa","aaaaaaaaaaaaaaaaa", "aaaaaaaaaaaaaaaaaa",
"aaaaaaaaaaaaaaaaaaa"]
It’s doing a depth-first search, charging down the leftmost branch to
infinity. That’s because the list returned from s (a:b:rest) starts
with a , then has the descendants of a , before continuing with b
and b 's descendants. So we get all the palindromes beginning with
“a” before any of the ones beginning with "b" , and similarly all
the ones beginning with "aa" before any of the ones beginning with
"ab" , and so on.
I needed to convert the search to breadth-first, which is
memory-expensive but at least visits all the nodes, even when the tree
is infinite:
search3 :: (n -> Bool) -> (n -> [n]) -> n -> [n]
search3 isGood childrenOf root = filter isGood $ s [root]
where
s nodes = nodes ++ (s $ concat (map childrenOf nodes))
This worked. I got a little lucky here, in that I had already had the
idea to make s :: [n] -> [n] rather than the more obvious s :: n ->
[n] . I had done that because I wanted to do the n <- nodes thing,
which is no longer present in this version. But it’s just what we need, because we
want s to return a list that has all the nodes at the current
level (nodes ) before it recurses to compute the nodes farther down.
Now take 20 palindromes produces the answer I wanted:
["","a","b","c","aa","bb","cc","aaa","aba","aca","bab","bbb","bcb",
"cac", "cbc","ccc","aaaa","abba","acca","baab"]
While I was writing this version I vaguely wondered if there was something that combines concat and map , but I didn’t follow up on it until just now. It turns out there is and it’s called concatMap . 😛
search3' :: (n -> Bool) -> (n -> [n]) -> n -> [n]
search3' isGood childrenOf root = filter isGood $ s [root]
where
s nodes = nodes ++ (s $ concatMap childrenOf nodes)
So this worked, and I was going to move on. But then a brainwave hit me: Haskell is a lazy language. I don’t have to generate and filter the tree at the same time. I can generate the entire (infinite) tree and filter it later:
-- breadth-first tree search
bfsTree :: (n -> [n]) -> [n] -> [n]
bfsTree childrenOf nodes =
nodes ++ bfsTree childrenOf (concatMap childrenOf nodes)
search4 isGood childrenOf root =
filter isGood $ bfsTree childrenOf [root]
This is much better because it breaks the generation and filtering
into independent components, and also makes clear that searching is
nothing more than filtering the list of nodes. The interesting part
of this program is the breadth-first tree traversal, and the tree
traversal part now has only two arguments instead of three; the filter
operation afterwards is trivial. Tree search in Haskell is mostly
tree, and hardly any search!
With this refactoring we might well decide to get rid of search entirely:
palindromes4 = filter isPalindrome allStrings
where
isPalindrome s = (s == reverse s)
allStrings = bfsTree (\s -> map (s ++) ["a", "b", "c"]) [""]
And then I remembered something I hadn’t thought about in a long, long
time:
[Lazy evaluation] makes it practical to modularize a program as a
generator that constructs a large number of possible answers, and a
selector that chooses the appropriate one.
That's exactly what I was doing and what I should have been doing all
along. And it ends:
Lazy evaluation is perhaps the
most powerful tool for modularization … the most powerful glue
functional programmers possess.
(”Why Functional Programming Matters”, John Hughes, 1990.)
I felt a little bit silly, because I wrote a book about lazy
functional programming and yet somehow, it’s not the glue I
reach for first when I need glue.
[ Addendum 20221023: somewhere along the way I dropped the idea of
using the list monad for the list construction, instead using explicit
map and concat . But it could be put back. For example:
s nodes = (nodes ++) . s $ do
n <- nodes
childrenOf n
I don't think this is an improvement on just using concatMap . ]
[Other articles in category /prog/haskell]
permanent link
|