The Universe of Discourse

Thu, 21 Jul 2016

Today I invented a pretty good hack.

Suppose I have branch topic checked out. It often happens that I want to

    git push origin topic:mjd/topic


which pushes the topic branch to the origin repository, but on origin it is named mjd/topic instead of topic. This is a good practice when many people share the same repository. I wanted to write a program that would do this automatically.

So the question arose, how should the program figure out the mjd part? Almost any answer would be good here: use some selection of environment variables, the current username, a hard-wired default, and the local part of Git's user.email configuration setting, in some order. Getting user.email is easy (git config get user.email) but it might not be set and then you get nothing. If you make a commit but have no user.email, Git doesn't mind. It invents an address somehow. I decided that I would like my program to to do exactly what Git does when it makes a commit.

But what does Git use for the committer's email address if there is no user.email set? This turns out to be complicated. It consults several environment variables in some order, as I suggested before. (It is documented in git-commit-tree if you are interested.) I did not want to duplicate Git's complicated procedure, because it might change, and because duplicating code is a sin. But there seemed to be no way to get Git to disgorge this value, short of actually making a commit and examining it.

So I wrote this command, which makes a commit and examines it:

    git log -1 --format=%ce $(git-commit-tree HEAD^{tree} < /dev/null)  This is extremely weird, but aside from that it seems to have no concrete drawbacks. It is pure hack, but it is a hack that works flawlessly. What is going on here? First, the $(…) part:

    git-commit-tree HEAD^{tree} < /dev/null


The git-commit-tree command is what git-commit uses to actually create a commit. It takes a tree object, reads a commit message from standard input, writes a new commit object, and prints its SHA1 hash on standard output. Unlike git-commit, it doesn't modify the index (git-commit would use git-write-tree to turn the index into a tree object) and it doesn't change any of the refs (git-commit would update the HEAD ref to point to the new commit.) It just creates the commit.

Here we could use any tree, but the tree of the HEAD commit is convenient, and HEAD^{tree} is its name. We supply an empty commit message from /dev/null.

Then the outer command runs:

    git log -1 --format=%ce $(…)  The $(…) part is replaced by the SHA1 hash of the commit we just created with git-commit-tree. The -1 flag to git-log gets the log information for just this one commit, and the --format=%ce tells git-log to print out just the committer's email address, whatever it is.

This is fast—nearly instantaneous—and cheap. It doesn't change the state of the repository, except to write a new object, which typically takes up 125 bytes. The new commit object is not attached to any refs and so will be garbage collected in due course. You can do it in the middle of a rebase. You can do it in the middle of a merge. You can do it with a dirty index or a dirty working tree. It always works.

(Well, not quite. It will fail if run in an empty repository, because there is no HEAD^{tree} yet. Probably there are some other similarly obscure failure modes.)

I called the shortcut git-push program git-pusho but I dropped the email-address-finder into git-get, which is my storehouse of weird “How do I find out X” tricks.

I wish my best work of the day had been a little bit more significant, but I'll take what I can get.

[ Addendum: Twitter user @shachaf has reminded me that the right way to do this is

    git var GIT_COMMITTER_IDENT


which prints out something like

    Mark Jason Dominus (陶敏修) <mjd@plover.com> 1469102546 -0400


which you can then parse. @shachaf also points out that a Stack Overflow discussion of this very question contains a comment suggesting the same weird hack! ]