Fri, 20 Jul 2018

Shitpost roundup, 2018-06

Volume was way down in May and June, mainly because of giant work crises that ate all my energy. I will try to get back on track now.



In the past I have boldfaced posts that seemed more likely to be of general interest. None of these seem likely to be of general interest.

Also, I think it is time to stop posting these roundups. By now everyone who wants to know about the shitposts is aware of them and can follow along without prompting. So I expect this will be the last of these posts. Shitposting will continue, but without these summaries.

Tue, 17 Jul 2018

The food I couldn't eat

[ I wrote this in 2007 and it seems I forgot to publish it. Enjoy! ]

I eat pretty much everything. Except ketchup. I can't stand ketchup. When I went to Taiwan a couple of years ago my hosts asked if there were any foods I didn't eat. I said no, except for ketchup.

"Ketchup? You mean that red stuff?"

Right. Yes, it's strange.

When I was thirteen my grandparents took me to Greece, and for some reason I ate hardly anything but souvlaki the whole time. When I got home, I felt like a complete ass. I swore that I would never squander another such opportunity, and that if I ever went abroad again I would eat absolutely everything that was put before me.

This is a good policy not just because it exposes me to a lot of delicious and interesting food, and not just because it prevents me from feeling like a complete ass, but also because I don't have to worry that perhaps my hosts will be insulted or disappointed that I won't eat the food they get for me.

On my second trip to Taiwan, I ate at a hot pot buffet restaurant. They give you a pot of soup, and then you go to the buffet and load up with raw meat and vegetables and things, and cook them at your table in the soup. It's fun. In my soup there were some dark reddish-brown cubes that had approximately the same texture as soft tofu. I didn't know what it was, but I ate it and tried to figure it out.

The next day I took the bus to Lishan (梨山), and through good fortune was invited to eat dinner with a Taiwanese professor of criminology and his family. The soup had those red chunks in it again, and I said "I had these for lunch yesterday! What are they?" I then sucked one down quickly, because sometimes people interpret that kind of question as a criticism, and I didn't want to offend the professor.

Actually it's much easier to ask about food in China than it is in, say, Korea. Koreans are defensive about their cuisine. They get jumpy if you ask what something is, and are likely to answer "It's good. Just eat it!". They are afraid that the next words out of your mouth will be something about how bad it smells. This is because the Japanese, champion sneerers, made about one billion insulting remarks about smelly Korean food while they were occupying the country between 1910 and 1945. So if you are in Korea and you don't like the food, the Koreans will take it very personally.

Chinese people, on the other hand, know that they have the best food in the world, and that everyone loves Chinese food. If you don't like it, they will not get offended. They will just conclude that you are a barbarian or an idiot, and eat it themselves.

Anyway, it turns out that the reddish-brown stuff was congealed duck's blood. Okay. Hey, I had congealed duck blood soup twice in two days! No way am I going home from this trip feeling like an ass.

So the eat-absolutely-everything policy has worked out well for me, and although I haven't liked everything, at least I don't feel like I wasted my time.

The only time I've regretted the policy was on my first trip to Taiwan. I was taken out to dinner and one of the dishes turned out to be pieces of steamed squid. That's not my favorite food, but I can live with it. But the steamed squid was buried under a big, quivering mound of sugared mayonnaise.

I remembered my policy, and took a bite. I'm sure I turned green.

So that's the food that I couldn't eat.

Thu, 12 Jul 2018

Don't do this either

Here is another bit of Perl code:

 sub function {
   my ($self, $cookie) = @_;
   $cookie = ref $cookie && $cookie->can('value') ? $cookie->value : $cookie;

The idea here is that we are expecting $cookie to be either a string, passed directly, or some sort of cookie object with a value method that will produce the desired string. The ref … && … condition distinguishes the two situations.

A relatively minor problem is that if someone passes an object with no value method, $cookie will be set to that object instead of to a string, with mysterious results later on.

But the real problem here is that the function's interface is not simple enough. The function needs the string. It should insist on being passed the string. If the caller has the string, it can pass the string. If the caller has a cookie object, it should extract the string and pass the string. If the caller has some other object that contains the string, it should extract the string and pass the string. It is not the job of this function to know how to extract cookie strings from every possible kind of object.
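Here is a minimal sketch of the stricter interface. The names cookie_length and My::Cookie are made up for illustration and are not from the code under discussion. The function accepts only a plain string, and the caller, which knows what kind of thing it is holding, does the extraction:

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical stand-in for the function above: it accepts only a plain string.
sub cookie_length {
    my ($cookie) = @_;
    croak "cookie must be a plain string"
        if !defined $cookie || ref $cookie;
    return length $cookie;
}

# Hypothetical cookie class, here only so the example is self-contained.
package My::Cookie {
    sub new   { my ($class, $value) = @_; return bless { value => $value }, $class }
    sub value { return $_[0]{value} }
}

my $cookie_obj = My::Cookie->new('abc123');

# The caller extracts the string and passes the string.
print cookie_length($cookie_obj->value), "\n";

# Passing the object itself fails at once instead of causing trouble later.
my $ok = eval { cookie_length($cookie_obj); 1 };
print "rejected: $@" unless $ok;
```

An object with no value method can no longer slip through, because no object is accepted in the first place.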

I have seen code in which this obsequiousness has escalated to absurdity. I recently saw a function whose job was to send an email. It needs an EmailClass object, which encapsulates the message template and some of the headers. Here is how it obtains that object:

    12    my $stash = $args{stash} || {};
    16    my $emailclass_obj = delete $args{emailclass_obj}; # isn't being passed here
    17    my $emailclass = $args{emailclass_name} || $args{emailclass} || $stash->{emailclass} || '';
    18    $emailclass = $emailclass->emailclass_name if $emailclass && ref($emailclass);
    60    $emailclass_obj //= $args{schema}->resultset('EmailClass')->find_by_name($emailclass);

Here the function needs an EmailClass object. The caller can pass one in $args{emailclass_obj}. But maybe the caller doesn't have one, and only knows the name of the emailclass it wants to use. Very well, we will allow it to pass the string and look it up later.

But that string could be passed in any of $args{emailclass_name}, or $args{emailclass}, or $args{stash}{emailclass} at the caller's whim and we have to rummage around hoping to find it.

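Oh, and by the way, that string might not be a string! It might be the actual object, so there are actually seven possibilities:

  1. $args{emailclass_obj} contains the object
  2. $args{emailclass_name} contains the name
  3. $args{emailclass_name} contains the object
  4. $args{emailclass} contains the name
  5. $args{emailclass} contains the object
  6. $args{stash}{emailclass} contains the name
  7. $args{stash}{emailclass} contains the object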

Notice that if $args{emailclass_name} is actually an emailclass object, the name will be extracted from that object on line 18, and then, 42 lines later, the name may be used to perform a database lookup to recover the original object again.

We hope by the end of this rigamarole that $emailclass_obj will contain an EmailClass object, and $emailclass will contain its name. But can you find any combinations of arguments where this turns out not to be true? (There are several.) Does the existing code exercise any of these cases? (I don't know. This function is called in 133 places.)

All this because this function was not prepared to insist firmly that its arguments be passed in a simple and unambiguous format, say like this:

    my $emailclass = $args->{emailclass} 
          || $self->look_up_emailclass($args->{emailclass_name})
          || croak "one of emailclass or emailclass_name is required";

I am not certain why programmers think it is a good idea to have functions communicate their arguments by way of a round of Charades. But here's my current theory: some programmers think it is discreditable for their function to throw an exception. “It doesn't have to die there,” they say to themselves. “It would be more convenient for the caller if we just accepted either form and did what they meant.” This is a good way to think about user interfaces! But a function's calling convention is not a user interface. If a function is called with the wrong arguments, the best thing it can do is to drop dead immediately, pausing only long enough to gasp out a message explaining what is wrong, and incriminating its caller. Humans are deserving of mercy; calling functions are not.
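To make the "drop dead immediately" behavior concrete, here is a small sketch using croak from the core Carp module. My::Mailer and its send_email function are hypothetical, and I am assuming a calling convention with a single required emailclass key. Because the function lives in its own package, croak reports the error at the caller's file and line, which is exactly the incrimination we want:

```perl
use strict;
use warnings;

# Hypothetical mailer that accepts exactly one spelling of its argument.
package My::Mailer {
    use Carp qw(croak);

    sub send_email {
        my (%args) = @_;
        my $emailclass = delete $args{emailclass}
            or croak "send_email: 'emailclass' is required";
        # Anything left over is a mistake by the caller; say so.
        croak "send_email: unknown arguments: " . join(", ", sort keys %args)
            if %args;
        return "sent with $emailclass";
    }
}

print My::Mailer::send_email(emailclass => 'welcome'), "\n";

# A call using the wrong key dies immediately instead of guessing.
my $ok = eval { My::Mailer::send_email(emailclass_name => 'welcome'); 1 };
print "died: $@" unless $ok;
```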

Allowing an argument to be passed in seven different ways may be convenient for the programmer writing the call, who can save a few seconds looking up the correct spelling of emailclass_name, but debugging what happens when elaborate and inconsistent arguments are misinterpreted will eat up the gains many times over. Code is written once, and read many times, so we should be willing to spend more time writing it if it will save trouble reading it again later.

Novice programmers may ask “But what if this is business-critical code? A failure here could be catastrophic!”

Perhaps a failure here could be catastrophic. But if it is a catastrophe to throw an exception, when we know the caller is so confused that it is failing to pass the required arguments, then how much more catastrophic to pretend nothing is wrong and to continue onward when we are surely ignorant of the caller's intentions? And that catastrophe may not be detected until long afterward, or at all.

There is such a thing as being too accommodating.

Sat, 07 Jul 2018

Don't do this

[ This article has undergone major revisions since it was first published yesterday. ]

Here is a line of Perl code:

  if ($self->fidget && blessed $self->fidget eq 'Widget::Fidget') {

This looks to see if $self has anything in its fidget slot, and if so it checks to see if the value there is an instance of the class Widget::Fidget. If both are true, it runs the following block.

That blessed check is bad practice for several reasons.

  1. It duplicates the declaration of the fidget member data:

    has fidget => (
      is  => 'rw',
      isa => 'Widget::Fidget',
      init_arg => undef,
    );

    So the fidget slot can't contain anything other than a Widget::Fidget, because the OOP system is already enforcing that. That means that the blessed … eq test is not doing anything — unless someone comes along later and changes the declared type, in which case the test will then be checking the wrong condition.

  2. Actually, that has already happened! The declaration, as written, allows fidget to be an instance not just of Widget::Fidget but of any class derived from it. But the blessed … eq check prevents this. This reneges on a major promise of OOP, that if a class doesn't have the behavior you need, you can subclass it and modify or extend it, and then use objects from the subclass instead. But if you try that here, the blessed … eq check will foil you.

    So this is a prime example of “… in which case the test will be checking the wrong condition” above. The test does not match the declaration, so it is checking the wrong condition. The blessed … eq check breaks the ability of the class to work with derived classes of Widget::Fidget.

  3. Similarly, the check prevents someone from changing the declared type to something more permissive, such as

    “either Widget::Fidget or Gidget::Fidget”

    “any object that supports wiggle and waggle methods”

    “any object that adheres to the specification of Widget::Interface”

    and then inserting a different object that supports the same interface. But the whole point of object-oriented programming is that as long as an object conforms to the required interface, you shouldn't care about its internal implementation.

  4. In particular, the check above prevents someone from creating a mock Widget::Fidget object and injecting it for testing purposes.

  5. We have traded away many of the modularity and interoperability guarantees that OOP was trying to preserve for us. What did we get in return? What are the purported advantages of the blessed … eq check? I suppose it is intended to detect an anomalous situation in which some completely wrong object is somehow stored into the $self->fidget member. The member declaration will prevent this (that is what it is for), but let's imagine that it has happened anyway. This could be a very serious problem. What will happen next?

    With the check in place, the bug will go unnoticed because the function will simply continue as if it had no fidget. This could cause a much more subtle failure much farther down the road. Someone trying to debug this will be mystified: At best “it's behaving as though it had no fidget, but I know that one was set earlier”, and at worst “why is there two years of inconsistent data in the database?” This could take a very long time to track down. Even worse, it might never be noticed, and the method might quietly do the wrong thing every time it was used.

    Without the extra check, the situation is much better: the function will throw an exception as soon as it tries to call a fidget method on the non-fidget object. The exception will point a big fat finger right at the problem: “hey, on line 2389 you tried to call the rotate method on a Skunk::Stinky object, but that class has no such method”. Someone trying to debug this will immediately ask the right question: “Who put a skunk in there instead of a widget?”
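The points above can be seen in a few lines of plain Perl, with no OOP framework involved. The classes here are toy stand-ins that I made up to match the names in the discussion; Widget::Fidget::Fancy in particular is hypothetical. The sketch shows the exact-class comparison rejecting a subclass instance, and a wrong object failing loudly when there is no check at all:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Toy classes, named after the ones in the discussion above.
package Widget::Fidget {
    sub new    { return bless {}, shift }
    sub rotate { return "rotating" }
}
package Widget::Fidget::Fancy {
    use parent -norequire, 'Widget::Fidget';    # a subclass defined in this file
}
package Skunk::Stinky {
    sub new { return bless {}, shift }
}

my $fancy = Widget::Fidget::Fancy->new;

# The exact-class comparison rejects a perfectly good subclass instance...
print "blessed eq: ", (blessed($fancy) eq 'Widget::Fidget' ? "yes" : "no"), "\n";
# ...even though the object is a Widget::Fidget and behaves like one.
print "isa:        ", ($fancy->isa('Widget::Fidget') ? "yes" : "no"), "\n";
print "rotate:     ", $fancy->rotate, "\n";

# With no check at all, a wrong object fails loudly at the point of use.
my $skunk = Skunk::Stinky->new;
my $ok = eval { $skunk->rotate; 1 };
print "error: $@" unless $ok;
```

The error printed by the last line names both the missing method and the offending class, which is the big fat finger described above.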

It's easy to get this right. Instead of

  if ($self->fidget && blessed $self->fidget eq 'Widget::Fidget') {

one can simply use:

  if ($self->fidget) {

Moral of the story: programmers write too much code.

I am reminded of something chess master Aron Nimzovitch once said, maybe in Chess Praxis, that amateur chess players are always trying to be Doing Something.

Fri, 06 Jul 2018

In which, to my surprise, I find myself belonging to a group

My employer ZipRecruiter had a giant crisis last month, of a scale that I have never seen at this company, and indeed, have never seen at any well-run company before. A great many of us, all the way up to the CTO, made a heroic effort for a month and got it sorted out.

It reminded me a bit of when Toph was three days old and I got a call from the hospital to bring her into the emergency room immediately. She had jaundice, which is not unusual in newborn babies. It is easy to treat, but if untreated it can cause permanent brain damage. So Toph and I went to the hospital, where she underwent the treatment, which was to have very bright lights shined directly on her skin for thirty-six hours. (Strange but true!)

The nurses in the hospital told me they had everything under control, and they would take care of Toph while I went home, but I did not go. I wanted to be sure that Toph was fed immediately and that her diapers were changed timely. The nurses have other people to take care of, and there was no reason to make her wait to eat and sleep when I could be there tending to her. It was not as if I had something else to do that I felt was more important. So I stayed in the room with Toph until it was time for us to go home, feeding her and taking care of her and just being with her.

It could have been a very stressful time, but I don't remember it that way. I remember it as a calm and happy time. Toph was in no real danger. The path forward was clear. I had my job, to help Toph get better, and I was able to do it undistracted. The hospital (Children's Hospital of Philadelphia) was wonderful, and gave me all the support I needed to do my job. When I got there they showed me the closet where the bedding was and the other closet where the snacks were and told me to help myself. They gave me the number to call at mealtimes to order meals to be sent up to my room. They had wi-fi so I could work quietly when Toph was asleep. Everything went smoothly, Toph got better, and we went home.

This was something like that. It wasn't calm; it was alarming and disquieting. But not in an entirely bad way; it was also exciting and engaging. It was hard work, but it was work I enjoyed and that I felt was worth doing. I love working and programming and thinking about things, and doing that extra-intensely for a few weeks was fun. Stressful, but fun.

And I was not alone. So many of the people I work with are so good at their jobs. I had all the support I needed. I could focus on my part of the work and feel confident that the other parts I was ignoring were being handled by competent and reasonable people who were at least as dedicated as I was. The higher-up management was coordinating things from the top, connecting technical and business concerns, and I felt secure that the overall design of the new system would make sense even if I couldn't always understand why. I didn't want to think about business concerns, I wanted someone else to do it for me and tell me what to do, and they did. Other teams working on different components that my components would interface with would deliver what they promised and it would work.

And the other programmers in my group were outstanding. We were scattered all over the globe, but handed off tasks to one another without any mishaps. I would come into work in the morning and the guys in Europe would be getting ready to go to bed and would tell me what they were up to and the other east-coasters and I could help pick up where they left off. The earth turned and the west-coasters appeared and as the end of the day came I would tell them what I had done and they could continue with it.

I am almost pathologically averse to belonging to groups. It makes me uncomfortable and even in groups that I have been associated with for years I feel out of place and like my membership is only provisional and temporary. I always want to go my own way and if everyone around me is going a different way I am suspicious and contrarian. When other people feel group loyalty I wonder what is wrong with them.

The up-side of this is that I am more willing than most people to cross group boundaries. People in a close-knit community often read all the same books and know all the same techniques for solving problems. This means that when a problem comes along that one of them can't solve, none of the rest can solve it either. I am sometimes the person who can find the solution because I have spent time in a different tribe and I know different things. This is a role I enjoy.

Higher-Order Perl exemplifies this. To write Higher-Order Perl I visited functional programming communities and tried to learn techniques that those communities understood that people outside those communities could use. Then I came back to the Perl community with the loot I had gathered.

But it's not all good. I have sometimes been able to make my non-belonging work out well. But it is not a choice; it's the way I am made, and I can't control it. When I am asked to be part of a team, I immediately become wary and wonder what the scam is. I can be loyal to people personally, but I have hardly any group loyalty. Sometimes this can lead to ugly situations.

But in fixing this crisis I felt proud to be part of the team. It is a really good team and I think it says something good about me that I can work well with the rest of them. And I felt proud to be part of this company, which is so functional, so well-run, so full of kind and talented people. Have I ever had this feeling before? If I have it was a long, long time ago.

G.H. Hardy once wrote that when he found himself forced to listen to pompous people, he would console himself by thinking:

Well, I have done one thing you could never have done, and that is to have collaborated with Littlewood and Ramanujan on something like equal terms.

Well, I was at ZipRecruiter during the great crisis of June 2018 and I was able to do my part and to collaborate with those people on equal terms, and that is something to be proud of.

Wed, 04 Jul 2018

Jackson and Gregg on optimization

Today Brendan Gregg's blog has an article Evaluating the Evaluation: Benchmarking Checklist that begins:

A co-worker introduced me to Craig Hanson and Pat Crain's performance mantras, which neatly summarize much of what we do in performance analysis and tuning. They are:

Performance mantras

  1. Don't do it
  2. Do it, but don't do it again
  3. Do it less
  4. Do it later
  5. Do it when they're not looking
  6. Do it concurrently
  7. Do it cheaper

I found this striking because I took it to be an obvious reference to Michael A. Jackson's advice in his brilliant 1975 book Principles of Program Design. Jackson said:

We follow two rules in the matter of optimization:

Rule 1. Don't do it.
Rule 2 (for experts only). Don't do it yet.

The intent of the two passages is completely different. Hanson and Crain are offering advice about what to optimize. “Don't do it” means that to make a program run faster, eliminate some of the things it does. “Do it, but don't do it again” means that to make a program run faster, have it avoid repeating work it has already done, say by caching results. And so on.

Jackson's advice is of a very different nature. It is only indirectly about improving the program's behavior. Instead it is addressing the programmer's behavior: stop trying to optimize all the damn time! It is not about what to optimize but whether, and Jackson says that to a first approximation, the answer is no.

Here are Jackson's rules with more complete context. The quotation is from the preface (page vii) and is discussing the style of the examples in his book:

Above all, optimization is avoided. We follow two rules in the matter of optimization:

Rule 1. Don't do it.
Rule 2 (for experts only). Don't do it yet — that is, not until you have a perfectly clear and unoptimized solution.

Most programmers do too much optimization, and virtually all do it too early. This book tries to act as an antidote. Of course, there are systems which must be highly optimized if they are to be economically useful, and Chapter 12 discusses some relevant techniques. But two points should always be remembered: first, optimization makes a system less reliable and harder to maintain, and therefore more expensive to build and operate; second, because optimization obscures structure it is difficult to improve the efficiency of a system which is already partly optimized.

Here's some code I dealt with this month:

    my $emailclass = $args->{emailclass};
    if (!$emailclass && $args->{emailclass_name} ) {
      # do some caching so if we're called on the same object over and over we don't have to do another find.
      my $last_emailclass = $self->{__LAST_EMAILCLASS__};
      if ( $last_emailclass && $last_emailclass->{name} eq $args->{emailclass_name} ) {
        $emailclass = $last_emailclass->{emailclass};
      } else {
        $emailclass = $self->schema->resultset('EmailClass')
          ->find_by_name($args->{emailclass_name});
        $self->{__LAST_EMAILCLASS__} = {
                                        name => $args->{emailclass_name},
                                        emailclass => $emailclass,
                                       };
      }
    }

Holy cow, this is wrong in so many ways. 8 lines of this mess, for what? To cache a single database lookup (the ->find_by_name call), in a single object, if it happens to be looking for the same name as last time. If caching was actually wanted, it should have been addressed in the ->find_by_name call, which could do the caching more generally, and which has some hope of knowing something about when the cache entries should be expired. Even stipulating that caching was wanted and for some reason should have been put here, why such an elaborate mechanism, all to cache just the last lookup? It could have been:

    $emailclass = $self->emailclass_for_name($args->{emailclass_name});

    sub emailclass_for_name {
      my ($self, $name) = @_;
      $self->{emailclass}{$name} //=
        $self->schema->resultset('EmailClass')->find_by_name($name);
      return $self->{emailclass}{$name};
    }

I was able to do a bit better than this, and replaced the code with:

    $emailclass = $self->schema->resultset('EmailClass')
      ->find_by_name($args->{emailclass_name});

My first thought was that the original caching code had been written by a very inexperienced programmer, someone who with more maturity might learn to do their job with less wasted effort. I was wrong; it had been written by a senior developer, someone who with more maturity might learn to do their job with less wasted effort.

The tragedy did not end there. Two years after the original code was written a more junior programmer duplicated the same unnecessary code elsewhere in the same module, saying:

I figured they must have had a reason to do it that way…

Thus is the iniquity of the fathers visited on the children.

In a nearby piece of code, an object A, on the first call to a certain method, constructed object B and cached it:

    base_path => ...
    schema    => $self->schema,
    retry     => ...,

Then on subsequent calls, it reused B from the cache.

But the cache was shared among many instances of A, not all of which had the same ->schema member. So some of those instances of A would ask B a question and get the answer from the wrong database. A co-worker spent hours and hours in the middle of the night last month tracking this down. Again, the cache was not only broken but completely unnecessary. What was being saved? A single object construction, probably a few hundred bytes and a few hundred microseconds at most. And again, the code was perpetrated by a senior developer who should have known better. My co-worker replaced 13 lines of broken code with four that worked.
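I can't show the real code, but the shape of the bug is easy to reproduce. In this sketch My::A and My::B are hypothetical stand-ins for the two objects, and the schema is just a string naming a database, which is an assumption made to keep the example short. The broken method keeps one B in a slot shared by every A; the working method simply builds a B when it needs one:

```perl
use strict;
use warnings;

# Hypothetical stand-ins for objects A and B from the description above.
package My::B {
    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub schema { return $_[0]{schema} }
}

package My::A {
    my $shared_b;    # one cache slot shared by every My::A instance: the bug

    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub schema { return $_[0]{schema} }

    # Broken: whichever instance calls first decides the schema for everyone.
    sub b_cached {
        my ($self) = @_;
        $shared_b //= My::B->new(schema => $self->schema);
        return $shared_b;
    }

    # Simple and correct: build a fresh My::B when one is needed.
    sub b_fresh {
        my ($self) = @_;
        return My::B->new(schema => $self->schema);
    }
}

my $a1 = My::A->new(schema => 'db_one');
my $a2 = My::A->new(schema => 'db_two');

print $a1->b_cached->schema, "\n";
print $a2->b_cached->schema, "\n";    # still 'db_one': the wrong database
print $a2->b_fresh->schema,  "\n";
```

The uncached version is shorter than the cached one, and it is the only one of the two that is right.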

Brendan Gregg is unusually clever, and an exceptional case. Most programmers are not Brendan Gregg, and should take Jackson's advice and stop trying to be so clever all the time.

