The Universe of Discourse

Thu, 29 Aug 2019

More workarounds for semantic problems


Philippe Bruhat, a devious master of sending a single message that will be read in two different ways by two different recipients, suggested an alternative wording for magic phrase messages:

    "I request that this commit be exempt from review for the following reasons: "
    (followed by an actual explanation)

The Git hook will pattern-match the message and find the magic phrase, which is I request that this commit be exempt from review for the following reasons:. Humans, however, will read and understand the actual explanation. I think M. Bruhat has put the first line in quotes so that humans will not attempt to interpret it. In this case it might not be a big problem if they did interpret it; at worst they might be puzzled about why the request was being sent to them rather than to the Git hook. But it also protects against the situation where the secret phrase is “Craig said I could do this” or “Chinga tu madre!”.
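As a rough illustration of why the quoting makes no difference to the machine, here is a minimal sketch of the kind of check such a hook might perform. The function name and the sample explanation are hypothetical, and the real hook may well match differently; the sketch assumes a plain substring test.

```python
# Hypothetical sketch: the real hook's matching logic is not shown in this article.
MAGIC_PHRASE = (
    "I request that this commit be exempt from review for the following reasons:"
)


def is_exempt_from_review(commit_message):
    """Return True if the magic phrase appears anywhere in the commit message."""
    # A plain substring test: quotation marks around the phrase are ignored,
    # and so is whatever explanation the human readers find after it.
    return MAGIC_PHRASE in commit_message


# The explanation on the second line is invented for the example.
message = (
    '"I request that this commit be exempt from review for the following reasons: "\n'
    "The deploy is blocked and nobody is awake to approve the fix.\n"
)

print(is_exempt_from_review(message))
print(is_exempt_from_review("Fix typo in README"))
```

The check sees only a sequence of characters, so the first message passes whether or not a human would read the quoted line as a request.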

My only concern is that, depending on how the explanation was phrased, it might be ungrammatical. I think these quoted phrases should behave like nouns, as in

    "Tinkle in the toidy" is a highly offensive phrase.

As written, M. Bruhat's suggestion has a dangling noun without even a punctuation mark. I suggested something like this:

    This message includes the phrase "The fat man screams at midnight"
    for the following reasons:
    (followed by an actual explanation)

Or we could take a hint from the Bronze Age Assyrians, who began letters with formulas like:

To my lord Tukulti-Ninurta, say thus: …

Note that this is addressed not to Tukulti-Ninurta himself, but to the messenger who is to read the message to Tukulti-Ninurta. Following this pattern we might write our commit message in this form:

    To the Git pre-receive hook, I say thus:
    "Craig said I could do this"
    because (actual explanation)

(I originally wrote “we could take a page from the Assyrians”, which is obviously ridiculous.)

Many thanks to M. Bruhat for discussing this with me, and to Rafaël Garcia-Suarez for bringing it to his attention.

[Other articles in category /tech] permanent link

Wed, 28 Aug 2019

Opposites again

In a (still unpublished) discussion a while back, of the complexities of the idea of “opposites”, I said:

"Opposite" extends to all sorts of situations in which logic doesn't apply. Red is the opposite of green, but I'm not sure that it makes sense to ask for the logical negation of green. I suppose you can go with "not green", which is certainly quite different from "red".

A related example: Red is the opposite of green.

What's the opposite of “not green”? Is it “not red”? I think it isn't. The opposite of “not green” is “green”.

[Other articles in category /lang] permanent link

DLR insurance business

DLR is the Docklands Light Railway, a light rail system that operates in East London. The fare collection system is interesting. You buy a ticket, but you don't have to show it before you board. Instead, during the ride, a ticket agent might come through the car and demand to see it. If you can't produce it on demand, you become liable for a large fine. So you can evade the fare, but doing so is a high-risk gamble.

The DLR might like to have enough fare inspectors that the gamble would have negative expected payout for the passengers. But at the time I took it, they didn't do so many inspections. The gamble was actually a good one for passengers, if they didn't mind the high risk of a rare large loss in return for small frequent wins. Most people wouldn't accept the risk, so the system worked.

But about fifteen years ago, some guy in East London had an idea. Insurance exists to diffuse risk! You'd pay him a monthly subscription fee, and then you'd ride the DLR that month without buying tickets. If you were caught by the fare inspectors, you'd pay the fine, send him the receipt, and he would reimburse you. You'd win because the insurance premium you paid this guy cost less than what you would have paid for DLR tickets. He'd win because he could set the insurance premiums high enough to cover the relatively few fines he had to pay out.
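The arithmetic behind the scheme can be sketched as follows. Every figure here is hypothetical and chosen only to make the sums easy; none of them is an actual DLR fare, fine, or inspection rate.

```python
def expected_fine_per_ride(inspection_probability, fine):
    """Average fine per ride for a passenger who never buys a ticket."""
    return inspection_probability * fine


def break_even_premium(rides_per_month, inspection_probability, fine):
    """Monthly premium at which the insurer's expected payout equals its income."""
    return rides_per_month * expected_fine_per_ride(inspection_probability, fine)


# All figures below are hypothetical, not real DLR numbers.
FARE = 2.00
FINE = 20.00
INSPECTION_PROBABILITY = 0.02
RIDES_PER_MONTH = 40

monthly_ticket_cost = RIDES_PER_MONTH * FARE
premium_floor = break_even_premium(RIDES_PER_MONTH, INSPECTION_PROBABILITY, FINE)
print(f"tickets: {monthly_ticket_cost:.2f}, break-even premium: {premium_floor:.2f}")
```

With these made-up numbers, any premium between the break-even figure and the monthly ticket cost leaves both the passenger and the insurer better off on average, which is the gap the scheme exploited.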

For a time this went well for everyone except the DLR. Eventually they caught the guy and punished him for conspiring to evade fares or something like that.

Does anyone remember this? Can someone point me to a reference?

[ Addendum 20190914: Leads provided by Florian Ilgenfritz produced a wealth of information about similar schemes. ]

[Other articles in category /misc] permanent link

Why didn't git add -p work?

It has sometimes happened that I couldn't get my git add -p to work. I would carefully edit a chunk, and then Git would say

    Your edited hunk does not apply. Edit again (saying "no" discards!) [y/n]? e

or sometimes also

    error: patch fragment without header at line 33: @@ -26,21 +29,20 @@ class Parser():

so I'd do it over, and it still wouldn't work.

Today I learned that at least some of those are because Emacs's diff-mode has some bug. It's getting the @@ lines wrong. When I switched to text-mode and composed the @@ line myself, the patch applied.
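For anyone composing the @@ line by hand, the counts in it have to agree with the hunk body: the old count is the number of context lines plus removed lines, and the new count is context lines plus added lines. Here is a minimal sketch that recomputes them. It is not what Emacs does and says nothing about the nature of the bug; the function name and the sample hunk body are hypothetical, and the starting line numbers are left untouched.

```python
import re

# Matches "@@ -start[,count] +start[,count] @@ optional section heading"
HEADER_RE = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@(.*)$")


def recount_hunk_header(hunk):
    """Return the hunk with the line counts in its @@ line recomputed."""
    lines = hunk.splitlines()
    match = HEADER_RE.match(lines[0])
    if match is None:
        raise ValueError("first line is not a hunk header")
    old_start, new_start, trailer = match.groups()
    body = lines[1:]
    context = sum(1 for line in body if line.startswith(" "))
    removed = sum(1 for line in body if line.startswith("-"))
    added = sum(1 for line in body if line.startswith("+"))
    header = (
        f"@@ -{old_start},{context + removed} "
        f"+{new_start},{context + added} @@{trailer}"
    )
    return "\n".join([header] + body)


# A made-up hunk whose header counts (21 and 20) no longer match its body.
hunk = "\n".join([
    "@@ -26,21 +29,20 @@ class Parser():",
    "     def parse(self):",
    "-        old_line()",
    "+        new_line()",
    "+        extra_line()",
    "     return None",
])

print(recount_hunk_header(hunk).splitlines()[0])
```

The sketch counts only lines beginning with a space as context, so a hunk whose blank context lines have lost their leading space would need extra handling.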

[Other articles in category /prog] permanent link

Tue, 27 Aug 2019

Workarounds for semantic problems

At work we have a Git repository hook (which I wrote) that prevents people from pushing changes to sensitive code without having it reviewed first. But there is an escape hatch for emergencies: if your commit message contains a certain phrase, the hook will allow it anyway. The phrase is:

    Craig said I could do this

(Craig is the CTO.)

Recently we did have a semi-emergency situation, and my co-worker Nimrod was delayed because nobody was awake to approve his code. In the discussion after, I mentioned the magic escape phrase. He objected that he could not have used it, because he would have been unwilling to wake up Craig to get the go-ahead. I was briefly puzzled. I hadn't said anything about waking up Craig; all you have to do is put a key phrase in your commit message. Nimrod eventually got me to understand the issue:

Nimrod: what does the phrase Craig said I could do this imply?

I had been thinking of the message as being communicated only to the Git hook. The Git hook thinks you mean only that it should allow the commit into the repo without review, which is true. But Nimrod is concerned about how it will be received by other humans, and to these people he would appear to be telling a lie. Right!

Nimrod had previously suggested a similar feature that involved the magic phrase “I solemnly swear I'm up to no good”.

Me: It seems to me that the only thing lacking from the current feature set is that you want the magic phrase to be I solemnly swear I'm up to no good instead of Craig said I could do this. Is that correct? It seems to me that alternate universe Nimrod might reasonably object that he may not force a commit unless he can really swear that he is up to no good.

So in this case that wouldn't have helped you. Or so I assume.

(I don't know, maybe he really was up to no good? But he did deny it.)

Nimrod: true enough! but it seems less likely to be taken seriously than having to swear that you have Craig's approval…


Me: Perhaps the magic phrase should be I request that this commit be exempt from review.

It would take a very subtle alternate-universe Nimrod to claim that he was too truthful to invoke that magic phrase just because he wanted his commit to be exempt from review.

Nimrod liked that okay, but then I had a better idea:

My suggestion, for the next time this comes up, is that you include the following wording in your commit message:

    My mention here of the magic phrase “Craig said I could do this” is not
    intended to aver that Craig did, in fact, say that I could do this.

The Git hook does not understand the use-mention distinction and you can then enable the feature without uttering a falsehood.

Problem solved!

(This reminds me a little bit of those programs that Philippe Bruhat writes that can be interpreted either as Perl or as PostScript, depending on how you understand the quoting and commenting conventions.)

[ Addendum: The actual magic phrase is not “Craig said I could do this”. ]

[ Addendum 20190829: There is a followup article. ]

[Other articles in category /tech] permanent link

Wed, 07 Aug 2019

Technical devices for reducing the number of axioms

In a recent article, I wrote:

I guessed it was a mere technical device, similar to the one that we can use to reduce five axioms of group theory to three. …

The fact that you can discard two of the axioms is mildly interesting, but of very little practical value in group theory.

There was a sub-digression, which I removed, about a similar sort of device that does have practical value. Suppose you have a group !!\langle G, \ast \rangle!! with a nonempty subset !!H\subset G!!, and you want to show that !!\langle H, \ast \rangle!! is a subgroup of !!G!!. To do this it is sufficient to show three things:

  1. !!H!! is closed under !!\ast!!
  2. !!G!!'s identity element is in !!H!!
  3. For each element !!h!! of !!H!!, its inverse !!h^{-1}!! is also in !!H!!

Often, however, it is more convenient to show instead:

For each !!a, b\in H!!, the product !!ab^{-1}!! is also in !!H!!

which takes care of all three at once.
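
To see why the single condition suffices, here is the short standard argument. It writes !!e!! for !!G!!'s identity element (a symbol not used above) and relies on !!H!! being nonempty:

  1. Identity: pick any !!h\in H!!. Taking !!a = b = h!! gives !!hh^{-1} = e\in H!!.
  2. Inverses: for any !!h\in H!!, taking !!a = e!! and !!b = h!! gives !!eh^{-1} = h^{-1}\in H!!.
  3. Closure: for any !!a, b\in H!!, step 2 gives !!b^{-1}\in H!!, so !!a(b^{-1})^{-1} = ab\in H!!.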

[Other articles in category /math] permanent link

Mon, 05 Aug 2019

Princess Andromeda

After decapitating Medusa the Gorgon, Perseus flies home on the winged sandals lent to him by Hermes, but he stops off to do some heroing. Below, he spots a beautiful princess, Andromeda, chained to a rock.

Here's the description my kids got from D'Aulaires' Book of Greek Myths:

On the way home, as he flew over the coast of Ethiopia, Perseus saw, far below, a beautiful maiden chained to a rock by the sea. She was so pale that at first he thought she was a marble statue, but then he saw tears trickling from her eyes.

Here's the d’Aulaires’ picture of the pasty-faced princess:

Andromeda has been left there to distract a sea monster, which will devour her instead of ravaging the kingdom. Perseus rescues her, then murders her loser ex-boyfriend, who was conspicuously absent from the rendezvous with the monster. Perseus eventually marries Andromeda and she bears his children.

Very good. Except, one problem here. Andromeda is Princess Royal of Ethiopia, the daughter of King Cepheus and Queen Cassiopeia. She is not pale like a marble statue. She has dark skin.

How dark is not exactly clear. For the Greeks “Aethiopia” was a not entirely specific faraway land. But its name means the land of people with burnt faces, not the land of people who are pale like white marble.

The D'Aulaires are not entirely at fault here. Ovid's Metamorphoses compares her with marble:

As soon as Perseus, great-grandson of Abas, saw her fastened by her arms to the hard rock, he would have thought she was a marble statue, except that a light breeze stirred her hair, and warm tears ran from her eyes.

But he's also quite clear (in Book II) that Ethiopians have dark skin:

It was [during the Phaethon episode], so they believe, that the Ethiopians acquired their dark colour, since the blood was drawn to the surface of their bodies.

(Should we assume that Ovid evokes marble for its whiteness? Some marble isn't white. I don't know and I'm not going to check the original Latin today. Or perhaps he only intended to evoke its stillness, for the contrast in the next phrase. Anyway, didn't the Romans paint their marble statuary?)

Andromeda was a popular subject for painting and sculpture over the centuries, since she comes with a built-in excuse for depicting her naked or at least draped with wet fabric. European artists, predictably, made her white:

Painting by Gustave Doré, 1869.

But at least not every time:

Copy by Bernard Picart, 1731

[Other articles in category /book/myth] permanent link

Sat, 03 Aug 2019

Git wishlist: aggregate changes across non-contiguous commits

(This is actually an essay on the difference between science and engineering.)

My co-worker Lemuel recently asked if there was a way to see all the changes to master from the last week that pertained to a certain ticket. The relevant commit messages all contained the ticket ID, so he knew which commits he wanted; that part is clear. Suppose Lemuel wanted to see the changes introduced in commits C, E, and H, but not those from A, B, D, F, or G.

The closest he could come was git show H E C, which wasn't quite what he wanted. It describes the complete history of the changes, but what he wants is more analogous to a diff. For comparison, imagine a world in which git diff A H didn't exist, and you were told to use git show A B C D E F G H instead. See the problem? What Lemuel wants is more like diff than like show.

Lemuel's imaginary command would solve another common request: How can I see all the changes that I have landed on master in a certain time interval? Or similarly: how can I add up the git diff --stat line counts for all my commits in a certain interval?
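The line-count version of the request can at least be approximated today by adding up per-commit counts. Git's --numstat option prints one tab-separated "added, deleted, path" line per file, with "-" in place of the counts for binary files. Here is a minimal sketch that totals such output; the function name and the sample data are hypothetical, and the result is a sum over commits, not the collective diff Lemuel asked for.

```python
def sum_numstat(numstat_output):
    """Total the added and deleted line counts in `--numstat` style output.

    The text might come from something like
        git log --author=... --since=... --numstat --format=
    (shown only as an illustration; the sketch never runs Git itself).
    """
    added_total = 0
    deleted_total = 0
    for line in numstat_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # blank separator lines or anything that is not a stat line
        added, deleted, _path = fields
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-\t-\tpath"
        added_total += int(added)
        deleted_total += int(deleted)
    return added_total, deleted_total


# Made-up sample covering two commits, one of which touches a binary file.
sample = (
    "3\t1\tlib/parser.py\n"
    "10\t0\tREADME\n"
    "\n"
    "-\t-\tlogo.png\n"
    "2\t7\tlib/parser.py\n"
)

print(sum_numstat(sample))
```

A line that is added in one commit and deleted in a later one is counted twice here, which is exactly the kind of difference between "sum of shows" and "one diff" that the rest of this article is about.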

He said:

It just kinda boggles my mind you can't just get a collective diff on command for a given set of commits

I remember that when I was first learning Git, I often felt boggled in this way. Why can't it just…? And there are several sorts of answers, of which one or more might apply in a particular situation:

  1. It surely could, but nobody has done it yet
  2. It perhaps could, but nobody is quite sure how
  3. It maybe could, but what you want is not as clear as you think
  4. It can't, because that is impossible
  5. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question

Often, engineers will go straight to #5, when actually the answer is in a higher tier. Or they go to #4 without asking if maybe, once the desiderata are clarified a bit, it will move from “impossible” to merely “difficult”. These are bad habits.

I replied to Lemuel's (implicit) question here and tried to make it a mixture of 2 and 3, perhaps with a bit of 4:

Each commit is a snapshot of the state of the repo at a particular instant. A diff shows you the difference between two snapshots. When you do git show commit you're looking at the differences between the snapshot at that commit and at its parent.

Now suppose you have commit A with parent B, and commit C with parent D. I come to you and say I want to see the differences in both A and C at the same time. What would you have it do?

If A and B are on a separate branch and are completely unrelated to C and D, it is hard to see what to do here. But it's not impossible. Our hypothetical command could produce the same output as git show A C. Or it could print an error message Can't display changes from unrelated commits A, C and die without any more output. Either of those might be acceptable.

And if A, B, C, D are all related and on the same branch, say with D, then C, then B, then A, the situation is simpler and perhaps we can do better.

If so, very good, because this is probably the most common case by far. Note that Lemuel's request is of this type.

I continued:

Suppose, for example, that C changes some setting from 0 to 1, then B changes it again to be 2, then A changes it a third time, to say 3. What should the diff show?

This is a serious question, not a refutation. Lemuel could quite reasonably reply by saying that it should show 0 changing to 3, the intermediate changes being less important. (“If you wanted to see those, you should have used git show A C.”)

It may be that that wouldn't work well in practice, that you'd find there were common situations where it really didn't tell you what you wanted to know. But that's something we'd have to learn by trying it out.
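To make the 0-to-3 proposal concrete, here is a minimal sketch that models each commit's effect on the setting as an (old, new) pair and collapses a selection of commits into one net change. The data structure and function name are hypothetical; this is one possible reading of the proposal, not a description of anything Git does.

```python
# Each commit's effect on the setting as (old value, new value),
# following the example of C, then B, then A.
CHANGES = {"C": (0, 1), "B": (1, 2), "A": (2, 3)}


def collapse(commit_ids):
    """Collapse the selected commits (oldest first) into one (old, new) pair.

    Also report whether the selection chains, meaning that each selected
    commit starts from the value the previous selected commit left behind.
    """
    selected = [CHANGES[commit_id] for commit_id in commit_ids]
    net = (selected[0][0], selected[-1][1])
    chains = all(prev[1] == nxt[0] for prev, nxt in zip(selected, selected[1:]))
    return net, chains


print(collapse(["C", "B", "A"]))
print(collapse(["C", "A"]))
```

Both selections collapse to 0 changing to 3, but only the first one chains. When B is left out, the pieces no longer fit together (C ends at 1, A starts at 2), and that gap is where the hypothetical command has to make a policy decision.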

I was trying really hard to get away from “what you want is stupid” and toward “there are good reasons why this doesn't exist, but perhaps they are surmountable”:

(I'm not trying to start an argument, just to reduce your bogglement by explaining why this may be less well-specified and more complex than you realize.)

I hoped that Lemuel would take up my invitation to continue the discussion and I tried to encourage him:

I've wanted this too, and I think something like it could work, especially if all the commits are part of the same branch. … Similarly people often want a way to see all the changes made only by a certain person. Your idea would answer that use case also.

Let's consider another example. Suppose some file contains functions X, Y, Z in that order. Commit A removes Y entirely. Commit B adds a new function, YY, between X and Z. Commit C modifies YY to produce YY'. Lemuel asks for the changes introduced by A and C; he is not interested in B. What should happen?

If Y and YY are completely unrelated, and YY just happens to be at the same place in the file, I think we definitely want to show Y being removed by A, and then that C has made a change to an unrelated function. We certainly don't want to show all of YY being added. But if YY is considered to be a replacement for Y, I'm not as sure. Maybe we can show the same thing? Or maybe we want to pretend that A replaced Y with YY? That seems dicier now than when I first thought about it, so perhaps it's not as big a problem as I thought.

Or maybe it's enough to do the following:

  1. Take all the chunks produced by the diffs in the output of git show .... In fact we can do better: if A, B, and C are a contiguous sequence, with A the parent of B and B the parent of C, then don't use the chunks from git show A B C; use git diff A C.

  2. Sort the chunks by filename.

  3. Merge the chunks that are making changes to the same file:

    • If two chunks don't overlap at all, there's no issue, just keep them as separate chunks.

    • If two chunks overlap and don't conflict, merge them into a single chunk

    • If they overlap and do conflict, just keep them separate but retain the date and commit ID information. (“This change, then this other change.”)

  4. Then output all the chunks in some reasonable order: grouped by file, and if there were unmergeable chunks for the same file, in chronological order.

This is certainly doable.
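Steps 2 through 4 can be sketched roughly as follows. This is one simplified reading, under loud assumptions: chunks are reduced to line ranges in a single shared line numbering (real hunks from different commits refer to different snapshots and would need translating), the conflict test is supplied by the caller, and renames are ignored. All names here are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass
class Chunk:
    filename: str
    start: int                # first affected line (shared numbering assumed)
    end: int                  # last affected line, inclusive
    date: str                 # ISO 8601, so string order is chronological
    commits: Tuple[str, ...]  # commit IDs that contributed to this chunk


def overlap(a, b):
    """True if two chunks touch the same file and their line ranges intersect."""
    return a.filename == b.filename and a.start <= b.end and b.start <= a.end


def aggregate(chunks, conflicts):
    """Merge overlapping, non-conflicting chunks; keep everything else separate.

    `conflicts(a, b)` is a caller-supplied predicate; deciding what counts
    as a conflict is left open, as it is in the discussion above.
    """
    kept = []
    # Step 2: sort by filename (then by position and date within a file).
    for chunk in sorted(chunks, key=lambda c: (c.filename, c.start, c.date)):
        partners = [k for k in kept if overlap(k, chunk)]
        if len(partners) == 1 and not conflicts(partners[0], chunk):
            # Step 3, second bullet: overlapping and compatible, so merge.
            merged = partners[0]
            merged.end = max(merged.end, chunk.end)
            merged.date = min(merged.date, chunk.date)
            merged.commits = merged.commits + chunk.commits
        else:
            # No overlap, or a conflict: keep it separate, with its own
            # date and commit ID. Copy so the caller's chunks stay untouched.
            kept.append(replace(chunk))
    # Step 4: grouped by file, chronological within a file.
    return sorted(kept, key=lambda c: (c.filename, c.date, c.start))


# Made-up chunks from commits C, E, and H.
chunks = [
    Chunk("a.py", 10, 20, "2019-08-01", ("C",)),
    Chunk("a.py", 15, 25, "2019-08-02", ("E",)),
    Chunk("a.py", 40, 45, "2019-08-03", ("H",)),
    Chunk("b.py", 1, 5, "2019-08-02", ("E",)),
]

for chunk in aggregate(chunks, conflicts=lambda a, b: False):
    print(chunk.filename, chunk.start, chunk.end, chunk.commits)
```

With a predicate that never reports a conflict, the two overlapping chunks in a.py collapse into one; with a predicate that always does, all four chunks come out separately in date order. A chunk that overlaps more than one kept chunk is also left separate, which is a simplification of this sketch.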

If there were no conflicts, it would certainly be better than git show ... would have been. Is it enough better to offset whatever weirdness might be introduced by the overlap handling? (We're grouping chunks by filename. What if files are renamed?) We don't know, and it does not even have an objective answer. We would have to try it, and then the result might be that some people like it and use it and other people hate it and refuse to use it. If so, that is a win!

[Other articles in category /prog] permanent link