The Universe of Discourse


Sat, 15 Dec 2012

How I got four errors into a one-line program
At my current job, each task is assigned a ticket number of the form e12345. The git history is extremely convoluted, and it's been observed that it's easier to find things if you include the ticket number at the front of the commit message. I got tired of inserting it manually, and thought I would write a prepare-commit-message hook to insert it automatically.

A prepare-commit-message hook is a program that you stick in the file .git/hooks/prepare-commit-hook. When you run git-commit, git first writes the commit message to a file, then invokes the prepare-commit-message program on file; the program can modify the contents of the message, or abort the commit if it wants to. Then git runs the editor on the message, if it was going to do that, and creates the commit with the edited message.

The hook I wrote was basically a one-liner, and the reason I am posting this note is because I found three significant programming errors in it in the first day of use.

Here's the first cut:

case $2 in
  message)
    perl -i -lpe "s/^(e\d+:\s+)?/$(cs -): /" $1
    ;;
esac
This is a shell script, but the main purpose is to run the perl one-liner. The shell script gets two arguments: $1 is the path to the file that contains the proposed commit message. The $2 argument is a tag which describes the commit's context; it's merge if the commit is a merge commit, for example; it's template if the commit message is supplied from a template via -t on the command line or the commit.template configuration option. The default is the empty string, and message, which I have here, means that the message was supplied with the -m command-line option.

The Perl script edits the commit message file, named in $1, in-place, looking for something like e12345: at the beginning of a line, and replacing it with the output of the cs - command, which is a little program I wrote to print the current ticket number.

(cs is run by the shell, and its output is inserted into the Perl script before perl is run, so that the program that Perl sees is something like s/^(e\d+:\s+)?/e12345: /.) Simple enough.

There is already an error here, although it's a design error, not an implementation error: the Perl one-liner is only invoked when $2 is message. For some reason I decided that I would want it only when I supplied git-commit with the -m message option. This belief lasted exactly until the first time I ran git-commit in default mode it popped up the editor to edit the commit message, and I had to insert the ticket number manually.

So the first change was to let the hook run in the default case as well as the message case:

case $2 in
  ""|message)
    perl -i -lpe "s/^(e\d+:\s+)?/$(cs -): /" $1
    ;;
esac
This was wrong because it inserts the ticket number at the start of each line; I wanted it only at the start of the first line. So that was programming error number 1:

case $2 in
  ""|message)
    perl -i -lpe "$. == 1 && s/^(e\d+:\s+)?/$(cs -): /" $1
    ;;
esac
So far, so good.

Bug #2 appeared the first time I tried a rebase. The cs command infers the ticket number from the name of the current branch. If it fails, it issues a warning and emits the string eXXXXX instead. During a rebase, the head is detached and there is no current branch. So the four commits I rebased all had their formerly-correct ticket numbers replaced with the string eXXXXX.

There are several ways to fix this. The best way would be to make sure that the current ticket number was stashed somewhere that cs could always get it. Instead, I changed the Perl script to recognize when the commit message already began with a ticket number, and to leave it alone if so:

case $2 in
  ""|message)
    perl -i -lpe "\$. == 1 && !/^e\d+:\s+/ && s/^/$(cs -): /" $1
    ;;
esac
It probably would have been a good idea to leave an escape hatch, and have cs emit the value of $ENV{TICKET_NUMBER} if that is set, to allow invocations like TICKET_NUMBER=e71828 git commit -m …, but I didn't do it, yet.

The third bug appeared when I did git commit --fixup for the first time. With --fixup you tell it which commit you are trying to fix up, and it writes the commit message in a special form that tells a subsequent git-rebase --interactive that this new commit should be handled specially. (It should be applied immediately after that other one, and should be marked as a "fixup", which means that it is squashed into the other one and that its log message is discarded in favor of the other one.) If you are fixing up a commit whose message was Frobulate the veeblefetzers, the fixup commit's message is automatically generated as fixup! Frobulate the veeblefetzers. Or it would have been, if you were not using my prepare-commit-message hook, which would rewrite it to e12345: fixup! Frobulate the veeblefetzers. This is not in the right form, so it's not recognized by git-rebase --interactive for special handling.

So the hook became:

case $2 in
  ""|message)
    perl -i -lpe "\$. == 1 && !/^(squash|fixup)! / && !/^e\d+:\s+/ && s/^/$(cs -): /" $1
    ;;
esac
(The exception for squash is similar to the one for fixup. I never use squash, but it seemed foolish not to put it in while I was thinking of it.)

This is starting to look a little gross, but in a program this small I can tolerate a little grossness.

I thought it was remarkable that such a small program broke in so many different ways. Much of that is because it must interact with git, which is very large and complicated, and partly it is that it must interact with git, which is in many places not very well designed. The first bug, where the ticket number was appended to each line instead of just the first, is not git's fault. It was fallout from my initial bad design decision to apply the script only to messages supplied with -m, which are typically one-liners, so that's what I was thinking of when I wrote the Perl script.

But the other two errors would have been avoided had the interface to the hook been more uniform. There seems to be no reason that rebasing (or cherry-picking) and git-commit --fixup contexts couldn't have been communicated to the hook via the same $2 argument that communicates other contexts. Had this been done in a more uniform way, my program would have worked more correctly. But it wasn't done, and it's probably too late to change it now, since such a change risks breaking many existing prepare-commit-message hooks. (“The enemy of software is software.”) A well-written hook will of course have a catchall:

case $2 in
  ""|message)
    perl -i -lpe "\$. == 1 && !/^(squash|fixup)! / && !/^e\d+:\s+/ && s/^/$(cs -): /" $1
    ;;

  merge|template|squash|commit)
    # do nothing
    ;;
  *)      # wat
    echo "prepare-message-hook: unknown context '$2'" 1>&2
    exit 1;
    ;;

esac
But mine doesn't and I bet a lot of others don't either.


[Other articles in category /prog] permanent link