The Universe of Disco


Sat, 16 Dec 2023

My Git pre-commit hook contained a footgun

The other day I made some changes to a program, but when I ran the tests they failed in a very bizarre way I couldn't understand. After a bit of investigation I still didn't understand. I decided to try to narrow down the scope of possible problems by reverting the code to the unmodified state, then introducing changes from one file at a time.

My plan was: commit all the new work, reset the working directory back to the last good commit, and then start pulling in file changes. So I typed in rapid succession:

git add -u
git commit -m 'broken'
git branch wat
git reset --hard good

So the complete broken code was on the new branch wat.

Then I wanted to pull in the first file from wat. But when I examined wat there were no changes.

Wat.

I looked all around the history and couldn't find the changes. The wat branch was there but it was on the current commit, the one with none of the changes I wanted. I checked in the reflog for the commit and didn't see it.

Eventually I looked back in my terminal history and discovered the problem: I had a Git pre-commit hook which git-commit had attempted to run before it made the new commit. It checks for strings I don't usually intend to commit, such as XXX and the like.

This time one of the files had something like that. My pre-commit hook had printed an error message and exited with a failure status, so git-commit aborted without making the commit. But I had typed the commands in quick succession without paying attention to what they were saying, so I went ahead with the git-reset without even seeing the error message. This wiped out the working tree changes that I had wanted to preserve.

Fortunately the git-add had gone through, so the modified files were in the repository anyway, just hard to find. And even more fortunately, last time this happened to me, I wrote up instructions about what to do. This time around recovery was quicker and easier. I knew I only needed to recover stuff from the last add command, so instead of analyzing every loose object in the repository, I did

find .git/objects -mmin 10 --type f

to locate loose objects that had been modified in the last ten minutes. There were only half a dozen or so. I was able to recover the lost changes without too much trouble.

Looking back at that previous article, I see that it said:

it only took about twenty minutes… suppose that it had taken much longer, say forty minutes instead of twenty, to rescue the lost blobs from the repository. Would that extra twenty minutes have been time wasted? No! … The rescue might have cost twenty extra minutes, but if so it was paid back with forty minutes of additional Git expertise…

To that I would like to add, the time spent writing up the blog article was also well-spent, because it meant that seven years later I didn't have to figure everything out again, I just followed my own instructions from last time.

But there's a lesson here I'm still trying to figure out. Suppose I want to prevent this sort of error in the future. The obvious answer is “stop splatting stuff onto the terminal without paying attention, jackass”, but that strategy wasn't sufficient this time around and I couldn't think of any way to make it more likely to work next time around.

You have to play the hand you're dealt. If I can't fix myself, maybe I can fix the software. I would like to make some changes to the pre-commit hook to make it easier to recover from something like this.

My first idea was that the hook could unconditionally save the staged changes somewhere before it started, and then once it was sure that it would complete it could throw away the saved changes. For example, it might use the stash for this.

(Although, strangely, git-stash does not seem to have an easy way to say “stash the current changes, but without removing them from the working tree”. Maybe git-stash save followed by git-stash apply would do what I wanted? I have not yet experimented with it.)

Rather than using the stash, the hook might just commit everything (with commit -n to prevent infinite loops) and then reset the commit immediately, before doing whatever it was planning to do. Then if it was successful, Git would make a second, permanent commit and we could forget about the one made by the hook. But if something went wrong, the hook's commit would still be in the reflog. This doubles the number of commits you make. That doesn't take much time, because Git commit creation is lightning fast. But it would tend to clutter up the reflog.

Thinking on it now, I wonder if a better approach isn't to turn the pre-commit hook into a post-commit hook. Instead of a pre-commit hook that does this:

  1. Check for errors in staged files
    • If there are errors:
      1. Fix the files (if appropriate)
      2. Print a message
      3. Fail
    • Otherwise:
      1. Exit successfully
      2. (git-commit continues and commits the changes)

How about a post-commit hook that does this:

  1. Check for errors in the files that changed in the current head commit
    • If there are errors:
      1. Soft-reset back to the previous commit
      2. Fix the files (if appropriate)
      3. Print a message
      4. Fail
    • Otherwise:
      1. Exit successfully

Now suppose I ignore the failure, and throw away the staged changes. It's okay, the changes were still committed and the commit is still in the reflog. This seems clearly better than my earlier ideas.

I'll consider it further and report back if I actually do anything about this.

Larry Wall once said that too many programmers will have a problem, think of a solution, and implement it, but it works better if you can think of several solutions, then implement the one you think is best.

That's a lesson I think I have learned. Thanks, Larry.

Addendum

I see that Eric Raymond's version of the jargon file, last revised December 2003, omits “footgun”. Surely this word is not that new? I want to see if it was used on Usenet prior to that update, but Google Groups search is useless for this question. Does anyone have suggestions for how to proceed?


[Other articles in category /prog/git] permanent link