The Universe of Discourse


Wed, 11 Jul 2018

Don't do this either

Here is another bit of Perl code:

 sub function {
   my ($self, $cookie) = @_;
   $cookie = ref $cookie && $cookie->can('value') ? $cookie->value : $cookie;
   ...
 }

The idea here is that we are expecting $cookie to be either a string, passed directly, or some sort of cookie object with a value method that will produce the desired string. The ref … && … condition distinguishes the two situations.

A relatively minor problem is that if someone passes an object with no value method, $cookie will be set to that object instead of to a string, with mysterious results later on.

But the real problem here is that the function's interface is not simple enough. The function needs the string. It should insist on being passed the string. If the caller has the string, it can pass the string. If the caller has a cookie object, it should extract the string and pass the string. If the caller has some other object that contains the string, it should extract the string and pass the string. It is not the job of this function to know how to extract cookie strings from every possible kind of object.

I have seen code in which this obsequiousness has escalated to absurdity. I recently saw a function whose job was to send an email. It needs an EmailClass object, which encapsulates the message template and some of the headers. Here is how it obtains that object:

    12    my $stash = $args{stash} || {};
    …
    16    my $emailclass_obj = delete $args{emailclass_obj}; # isn't being passed here
    17    my $emailclass = $args{emailclass_name} || $args{emailclass} || $stash->{emailclass} || '';
    18    $emailclass = $emailclass->emailclass_name if $emailclass && ref($emailclass);
    …  
    60    $emailclass_obj //= $args{schema}->resultset('EmailClass')->find_by_name($emailclass);

Here the function needs an EmailClass object. The caller can pass one in $args{emailclass_obj}. But maybe the caller doesn't have one, and only knows the name of the emailclass it wants to use. Very well, we will allow it to pass the string and look it up later.

But that string could be passed in any of $args{emailclass_name}, or $args{emailclass}, or $args{stash}{emailclass} at the caller's whim and we have to rummage around hoping to find it.

Oh, and by the way, that string might not be a string! It might be the actual object, so there are actually seven possibilities:

    $args{emailclass}
    $args{emailclass_obj}
    $args{emailclass_name}
    $args{stash}{emailclass}
    $args{emailclass}->emailclass_name
    $args{emailclass_name}->emailclass_name
    $args{stash}{emailclass}->emailclass_name

Notice that if $args{emailclass_name} is actually an emailclass object, the name will be extracted from that object on line 18, and then, 42 lines later, the name may be used to perform a database lookup to recover the original object again.

We hope by the end of this rigamarole that $emailclass_obj will contain an EmailClass object, and $emailclass will contain its name. But can you find any combinations of arguments where this turns out not to be true? (There are several.) Does the existing code exercise any of these cases? (I don't know. This function is called in 133 places.)

All this because this function was not prepared to insist firmly that its arguments be passed in a simple and unambiguous format, say like this:

    my $emailclass = $args->{emailclass} 
          || $self->look_up_emailclass($args->{emailclass_name})
          || croak "one of emailclass or emailclass_name is required";

I am not certain why programmers think it is a good idea to have functions communicate their arguments by way of a round of Charades. But here's my current theory: some programmers think it is discreditable for their function to throw an exception. “It doesn't have to die there,” they say to themselves. “It would be more convenient for the caller if we just accepted either form and did what they meant.” This is a good way to think about user interfaces! But a function's calling convention is not a user interface. If a function is called with the wrong arguments, the best thing it can do is to drop dead immediately, pausing only long enough to gasp out a message explaining what is wrong, and incriminating its caller. Humans are deserving of mercy; calling functions are not.

Allowing an argument to be passed in seven different ways may be convenient for the programmer writing the call, who can save a few seconds looking up the correct spelling of emailclass_name, but debugging what happens when elaborate and inconsistent arguments are misinterpreted will be eat up the gains many times over. Code is written once, and read many times, so we should be willing to spend more time writing it if it will save trouble reading it again later.

Novice programmers may ask “But what if this is business-critical code? A failure here could be catastrophic!”

Perhaps a failure here could be catastrophic. But if it is a catastrophe to throw an exception, when we know the caller is so confused that it is failing to pass the required arguments, then how much more catastrophic to pretend nothing is wrong and to continue onward when we are surely ignorant of the caller's intentions? And that catastrophe may not be detected until long afterward, or at all.

There is such a thing as being too accommodating.


[Other articles in category /prog/perl] permanent link