The Universe of Disco


Tue, 03 Oct 2006

Really real examples of HOP techniques in action
I recently stopped working for the University of Pennsylvannia's Informations Systems and Computing group, which is the organization that provides computer services to everyone on campus that doesn't provide it for themselves.

I used HOP stuff less than I might have if I hadn't written the HOP book myself. There's always a tradeoff with the use of any advanced techniques: it might provide some technical benefit, like making the source code smaller, but the drawback is that the other people you work with might not be able to maintain it. Since I'm the author of the book, I can be expected to be biased in favor of the techniques. So I tried to compensate the other way, and to use them only when I was absolutely sure it was the best thing to do.

There were two interesting uses of HOP techniques. One was in the username generator for new accounts. The other was in a generic server module I wrote.

Name generation

The name generator is used to offer account names to incoming students and faculty. It is given the user's full name, and optionally some additional information of the same sort. It then generates a bunch of usernames to offer the user. For example, if the user's name is "George Franklin Bauer, Jr.", it might generate usernames like:

        george    bauer     georgef   fgeorge   fbauer    bauerf
        gf        georgeb   fg        fb        bauerg    bf
        georgefb  georgebf  fgeorgeb  fbauerg   bauergf   bauerfg
        ge        ba        gef       gbauer    fge       fba
        bgeorge   baf       gfbauer   gbauerf   fgbauer   fbgeorge
        bgeorgef  bfgeorge  geo       bau       geof      georgeba
        fgeo      fbau      bauerge   bauf      fbauerge  bauergef
        bauerfge  geor      baue      georf     gb        fgeor
        fbaue     bg        bauef     gfb       gbf       fgb
        fbg       bgf       bfg       georg     georgf    gebauer
        fgeorg    bageorge  gefbauer  gebauerf  fgebauer
The code that did this, before I got to it, was extremely long and convoluted. It was also extremely slow. It would generate a zillion names (slowly) and then truncate the list to the required length.

It was convoluted because people kept asking that the generation algorithm be tweaked in various ways. Each tweak was accompanied by someone hacking on the code to get it to do things a little differently.

I threw it all away and replaced it with a lazy generator based on the lazy stream stuff of Chapter 6. The underlying stream library was basically the same as the one in Chapter 6. Atop this, I built some functions that generated streams of names. For example, one requirement was that if the name generator ran out of names like the examples above, it should proceed by generating names that ended with digits. So:

        sub suffix {
          my ($s, $suffix) = @_;
          smap { "$_$suffix" } $s;          
        }       

        # Given (a, b, c), produce a1, b1, c1, a2, b2, c2, a3...
        sub enumerate {
          my $s = shift;
          lazyappend(smap { suffix($s, $_) } iota());
        }

        # Given (a, b, c), produce a, b, c, a1, b1, c1, a2, b2, c2, a3...
        sub and_enumerate {
          my $s = shift;
          append($s, enumerate($s));
        }

        # Throw away names that are already used
        sub available_filter {
          my ($s, $pn) = @_;
          $pn ||= PennNames::Generate::InUse->new;
          sgrep { $pn->available($_) } $s;
        }
The use of the stream approach was strongly indicated here for two reasons. First, the number of names to generate wasn't known in advance. It was convenient for the generation module to pass back a data structure that encapsulated an unlimited number of names, and let the caller mine it for as many names as were necessary.

Second, the frequent changes and tinkerings to the name generation algorithm in the past suggested that an extremely modular approach would be a benefit. In fact, the requirements for the generation algorithm chanced several times as I was writing the code, and the stream approach made it really easy to tinker with the order in which names were generated, by plugging together the prefabricated stream modules.

Generic server

For a different project, I wrote a generic forking server module. The module would manage a listening socket. When a new connection was made to the socket, the module would fork. The parent would go back to listening; the child would execute a callback function, and exit when the callback returned.

The callback was responsible for communicating with the client. It was passed the client socket:

        sub child_callback {
          my $socket = shift;
          # ... read and write the socket ...
          return;   # child process exits
        }
But typically, you don't want to have to manage the socket manually. For example, the protocol might be conversational: read a request from the client, reply to it, and so forth:

        # typical client callback:
        sub child_callback {
          my $socket = shift; 
          while (my $request = <$socket>) {
            # generate response to request
            print $socket $response;
          }
        }
The code to handle the loop and the reading and writing was nontrivial, but was going to be the same for most client functions. So I provided a callback generator. The input to the callback generator is a function that takes requests and returns appropriate responses:

        sub child_behavior {
          my $request = shift;
          if ($request =~ /^LOOKUP (\w+)/) {
            my $input = $1;
            if (my $result = lookup($input)) {
              return "OK $input $result";
            } else {
              return "NOK $input";
            }
          } elsif ($request =~ /^QUIT/) {
            return;
          } elsif ($request =~ /^LIST/) {
            my $N = my @N = all_names();
            return join "\n", "OK $N", @N, ".";
          } else {
            return "HUH?";
          }
        }
This child_behavior function is not suitable as a callback, because the argument to the callback is the socket handle. But the child_behavior function can be turned into a callback:

        $server->run(CALLBACK => make_callback(\&child_behavior));
make_callback() takes a function like child_behavior() and wraps it up in an I/O loop to turn it into a callback function. make_callback() looks something like this:

        sub make_callback {
          my $behavior = shift;
          return sub {
            my $socket = shift;
            while (my $request = <$socket>) {
              chomp $request;
              my $response = $behavior->($request);
              return unless defined $response;
              print $socket $response;
            }
          };
        }
I think this was the right design; it kept the design modular and flexible, but also simple.


[Other articles in category /prog] permanent link