The Universe of Disco


Tue, 24 Mar 2020

git log --author=... confused me

Today I was looking for recent commits by co worker Fred Flooney, address fflooney@example.com, so I did

    git log --author=ffloo

but nothing came up. I couldn't remember if --author would do a substring search, so I tried

    git log --author=fflooney
    git log --author=fflooney@example.com

and still nothing came up. “Okay,” I said, “probably I have Fred's address wrong.” Then I did

    git log --format=%ae | grep ffloo

The --format=%ae means to just print out commit author email addresses, instead of the usual information. This command did produce many commits with the author address fflooney@example.com.

I changed this to

    git log --format='%H %ae' | grep ffloo

which also prints out the full hash of the matching commits. The first one was 542ab72c92c2692d223bfca4470cf2c0f2339441.

Then I had a perplexity. When I did

    git log -1 --format='%H %ae' 542ab72c92c2692d223bfca4470cf2c0f2339441

it told me the author email address was fflooney@example.com. But when I did

    git show 542ab72c92c2692d223bfca4470cf2c0f2339441

the address displayed was fredf@example.com.

The answer is, the repository might have a file in its root named .mailmap that says “If you see this name and address, pretend you saw this other name and address instead.” Some of the commits really had been created with the address I was looking for, fflooney. But the .mailmap said that the canonical version of that address was fredf@. Nearly all Git operations use the canonical address. The git-log --author option searches the canonical address, and git-show and git-log, by default, display the canonical address.

But my --format=%ae overrides the default behavior; %ae explicitly requests the actual address. To display the canonical address, I should have used --format=%aE instead.

Also, I learned that --author= does not only a substring search but a regex search. I asked it for --author=d* and was puzzled when it produced commits written by people with no d. This is a beginner mistake: d* matches zero or more instances of d, and every name contains zero or more instances of d. (I had thought that the * would be like a shell glob.)

Also, I learned that --author=d+ matches only authors that contain the literal characters d+. If you want the + to mean “one or more” you need --author=d\+.

Thanks to Cees Hek, Gerald Burns, and Val Kalesnik for helping me get to the bottom of this.

The .mailmap thing is documented in git-check-mailmap.

[ Addendum: I could also have used git-log --no-use-mailmap ..., had I known about this beforehand. ]


[Other articles in category /prog] permanent link