The Universe of Discourse


Sun, 21 Sep 2025

My new git utility `what-changed-twice` needs a new name

As I have explained in the past, my typical workflow is to go along committing stuff that might or might not make sense, then clean it all up at the end, doing multiple passes with git-add and git-rebase to get related changes into the same commit, and then to order the commits in a sensible way. Yesterday I built a new utility that I found helpful. I couldn't think of a name for it, so I called it what-changed-twice, which is not great, but I am bad at naming things and my first attempt was analyze-commits. I welcome suggestions. In this article I will call it Fred.

What is Fred for? I have a couple of uses for it so far.

Often as I work I'll produce a chain of commits that looks like this:

470947ff minor corrections
d630bf32 continue work on `jq` series
c24b8b24 wip
f4695e97 fix link
a8aa1a5c sp
5f1d7a61 WIP
a337696f Where is the quincunx on the quincunx?
39fe1810 new article: The fivefold symmetry of the quince
0a5a8e2e update broken link
196e7491 sp
bdc781f6 new article: fpuzhpx
40c52f47 merge old and new seasons articles and publish
b59441cd finish updating with Star Wars Droids
537a3545 droids and BJ and the Bear
d142598c Add nicely formatted season tables to this old article
19340470 mention numberphile video

It often happens that I will modify a file on Monday, modify it some more on Tuesday, correct a spelling error on Wednesday. I might have made 7 sets of changes to the main file, of which 4 are related, 2 others are related to each other but not to the other 4, and the last one is unrelated to any of the rest. When a file has changed more than once, I need to see what changed and then group the changes into related sets.

The sp commits are spelling corrections; if the error was made in the same unmerged topic branch I will want to squash the correction into the original commit so that the error never appears at all.

Some files changed only once, and I don't need to think about those at this stage. Later I can go back and split up those commits if it seems to make the history clearer.

Fred takes the output of git-log for the commits you are interested in:

$ git log --stat -20 main...topic | /tmp/what-changed-twice

It finds which files were modified in which commits, and it prints a report about any file that was modified in more than one commit:

 calendar/seasons.blog  196 40 d1
  math/centrifuge.blog  193 33
misc/straight-men.blog  53 b5 bd
        prog/jq-2.blog  33 5f d6 

    193  1934047
    196  196e749
     33  33a2304
     40  40c52f4
     53  537a354
     5f  5f1d7a6
     b5  b59441c
     bd  bdc781f
     d1  d142598
     d6  d630bf3

The report is in two parts. The top part lists the path of each file that changed more than once in the log, followed by the (highly abbreviated) commit IDs of the commits in which it changed. For example, calendar/seasons.blog changed in commits 196, 40, and d1. The second part of the report explains that 196 is actually an abbreviation for commit 196e749.
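One rule that would produce abbreviations like these: give each commit the shortest prefix, at least two characters long, that no other commit in the report shares. The Python below is a minimal sketch of that rule. It is an assumption that fits the sample report, not Fred's actual code, and the function name `abbreviate` and the `min_len` parameter are made up for the illustration.

```python
def abbreviate(ids, min_len=2):
    """Map each commit ID to its shortest prefix (at least min_len
    characters) that no other ID in the list starts with.

    Assumed rule, inferred from the sample report only.
    """
    short = {}
    for cid in ids:
        n = min_len
        # Lengthen the prefix while some other ID still shares it.
        while n < len(cid) and any(
            other != cid and other.startswith(cid[:n]) for other in ids
        ):
            n += 1
        short[cid] = cid[:n]
    return short


# The ten commit IDs from the second part of the report above.
ids = ["1934047", "196e749", "33a2304", "40c52f4", "537a354",
       "5f1d7a6", "b59441c", "bdc781f", "d142598", "d630bf3"]

for cid, abbr in abbreviate(ids).items():
    print(f"{abbr:>7}  {cid}")
```

Only the two IDs beginning with 19 need a third character; every other ID is already unique after two.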

Now I can look to see what else changed in those three commits:

$ git show --stat 196e749 40c52f4 d142598

then look at the changes to calendar/seasons.blog in those three commits:

$ git show 196e749 40c52f4 d142598 -- calendar/seasons.blog

and then decide if there are any changes I might like to squash together.

Many other files changed on the branch, but I only have to concern myself with four.

There's bonus information too. If a commit is not mentioned in the report, then it only changed files that didn't change in any other commit. That means that in a rebase, I can move that commit literally anywhere else in the sequence without creating a conflict. Only the commits in the report can cause conflicts if they are reordered.
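The bookkeeping behind that observation is small enough to sketch in Python. This is an illustration, not part of Fred: the function name `free_to_move`, the commit IDs, and the file names are all made up, and the sketch assumes you already have a mapping from each commit to the set of files it touched.

```python
def free_to_move(files_by_commit):
    """Return the commits that share no file with any other commit.

    files_by_commit maps a commit ID to the set of paths it changed.
    """
    # Invert the mapping: which commits touched each path?
    touched_by = {}
    for commit, files in files_by_commit.items():
        for path in files:
            touched_by.setdefault(path, set()).add(commit)

    # A commit is free if every file it changed was changed only by it.
    return [
        commit
        for commit, files in files_by_commit.items()
        if all(len(touched_by[path]) == 1 for path in files)
    ]


# Hypothetical data: two commits share one.blog, the third stands alone.
sample = {
    "aaa1111": {"one.blog"},
    "bbb2222": {"one.blog", "two.blog"},
    "ccc3333": {"three.blog"},
}
print(free_to_move(sample))
```

Note that `bbb2222` is not free even though it is the only commit touching `two.blog`, because it also touches a file that another commit changed.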

I write most things in Python these days, but this one seemed to cry out for Perl. Here's the code.
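For anyone who would rather read Python after all, the core of the idea fits in a few lines. This is a rough sketch, not a translation of the Perl. It assumes the default `git log --stat` layout (a `commit <id>` line, then stat lines of the form ` path | N ++--`), it makes no attempt to handle renames or paths that git has truncated, and the function name and sample data are invented for the example.

```python
import re
from collections import defaultdict

# "commit <hex id>", possibly followed by decorations.
COMMIT_RE = re.compile(r"^commit ([0-9a-f]+)")
# A stat line: one leading space, the path, then "| <count>" or "| Bin".
# Commit message lines are indented four spaces, so they do not match.
STAT_RE = re.compile(r"^ (\S.*?)\s+\|\s+(?:\d+|Bin)\b")


def files_changed_twice(log_lines):
    """Map each path to the commits that changed it, keeping only
    the paths that were changed by more than one commit."""
    commits_for = defaultdict(list)
    current = None
    for line in log_lines:
        m = COMMIT_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = STAT_RE.match(line)
        if m and current is not None:
            commits_for[m.group(1)].append(current)
    return {path: ids for path, ids in commits_for.items() if len(ids) > 1}


# Made-up, trimmed log output (Author and Date lines omitted).
SAMPLE_LOG = """\
commit aaaaaaa1

    first change

 one.blog | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

commit bbbbbbb2

    second change

 one.blog | 5 +++--
 two.blog | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)
"""

for path, ids in sorted(files_changed_twice(SAMPLE_LOG.splitlines()).items()):
    print(path, *ids)
```

On a real branch the same function could be handed `sys.stdin` instead of the sample text, at the end of a `git log --stat` pipeline like the one shown earlier.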

Hmm, maybe I'll call it squash-what.

