The Universe of Discourse

Thu, 16 Nov 2017

Another system software error

[ Warning: This article is meandering and does not end anywhere in particular ]

My recent article about system software errors kinda blew up the Reddit / Hacker News space, and even got listed on Voat, which I understand is the Group W Bench where they send you if you aren't moral enough to be in Reddit. Many people on these fora were eager to tell war stories of times that they had found errors in the compiler or other infrastructural software.

This morning I remembered another example that had happened to me. In the middle 1990s, I was just testing some network program on one of the Sun Solaris machines that belonged to the Computational Linguistics program, when the entire machine locked up. I had to go into the machine room and power-cycle it to get it to come back up.

I returned to my desk to pick up where I had left off, and the machine locked up, again just as I ran my program. I rebooted the machine again, and putting two and two together I tried the next run on a different, less heavily-used machine, maybe my desk workstation or something.

The problem turned out to be a bug in that version of Solaris: if you bound a network socket to some address, and then tried to connect it to the same address, everything got stuck. I wrote a five-line demonstration program and we reported the bug to Sun. I don't know if it was fixed.

My boss had an odd immediate response to this, something along the lines that connecting a socket to itself is not a sanctioned use case, so the failure is excusable. Channeling Richard Stallman, I argued that no user-space system call should ever be able to crash the system, no matter what stupid thing it does. He at once agreed.

I felt I was on safe ground, because I had in mind the GNU GCC bug reporting instructions of the time, which contained the following unequivocal statement:

If the compiler gets a fatal signal, for any input whatever, that is a compiler bug. Reliable compilers never crash.

I love this paragraph. So clear, so pithy! And the second sentence! It could have been left off, but it is there to articulate the writer's moral stance. It is a rock-firm committment in a wavering and uncertain world.

Stallman was a major influence on my writing for a long time. I first encountered his work in 1985, when I was browsing in a bookstore and happened to pick up a copy of Dr. Dobb's Journal. That issue contained the very first publication of the GNU Manifesto. I had never heard of Unix before, but I was bowled over by Stallman's vision, and I read the whole thing then and there, standing up.

(It hit the same spot in my heart as Albert Szent-Györgyi's The Crazy Ape, which made a similarly big impression on me at about the same time. I think programmers don't take moral concerns seriously enough, and this is one reason why so many of them find Stallman annoying. But this is what I think makes Stallman so important. Perhaps Dan Bernstein is a similar case.)

I have very vague memories of perhaps finding a bug in gcc, which is perhaps why I was familiar with that particular section of the gcc documentation. But more likely I just read it because I read a lot of stuff. Also Stallman was probably on my “read everything he writes” list.

Why was I trying to connect a socket to itself, anyway? Oh, it was a bug. I meant to connect it somewhere else and used the wrong variable or something. If the operating system crashes when you try, that is a bug. Reliable operating systems never crash.

[ Final note: I looked for my five-line program that connected a socket to itself, but I could not find it. But I found something better instead: an email I sent in April 1993 reporting a program that caused g++ version 2.3.3 to crash with an internal compiler error. And yes, my report does quote the same passage I quoted above. ]

[Other articles in category /prog] permanent link

Wed, 15 Nov 2017

Down with the negation sign!

[ Credit where it is due: This was entirely Darius Bacon's idea. ]

In connection with “Recognizing when two arithmetic expressions are essentially the same”, I had several conversations with people about ways to normalize numeric expressions. In that article I observed that while everyone knows the usual associative law for addition $$ (a + b) + c = a + (b + c)$$ nobody ever seems to mention the corresponding law for subtraction: $$ (a+b)-c = a + (b-c).$$

And while everyone “knows” that subtraction is not associative because $$(a - b) - c ≠ a - (b-c)$$ nobody ever seems to observe that there is an associative law for subtraction: $$\begin{align} (a - b) + c & = a - (b - c) \\ (a -b) -c & = a-(b+c).\end{align}$$

This asymmetry is kind of a nuisance, and suggests that a more symmetric notation might be better. Darius Bacon suggested a simple change that I think is an improvement:

Write the negation of !!a!! as $$a\star$$ so that one has, for all !!a!!, $$a+a\star = a\star + a = 0.$$

The !!\star!! operation obeys the following elegant and simple laws: $$\begin{align} a\star\star & = a \\ (a+b)\star & = a\star + b\star \end{align} $$

Once we adopt !!\star!!, we get a huge payoff: We can eliminate subtraction:

Instead of !!a-b!! we now write !!a+b\star!!.

The negation of !!a+b\star!! is $$(a+b\star)\star = a\star + b{\star\star} = a\star +b.$$

We no longer have the annoying notational asymmetry between !!a-b!! and !!-b + a!! where the plus sign appears from nowhere. Instead, one is !!a+b\star!! and the other is !!b\star+a!!, which is obviously just the usual commutativity of addition.

The !!\star!! is of course nothing but a synonym for multiplication by !!-1!!. But it is a much less clumsy synonym. !!a\star!! means !!a\cdot(-1)!!, but with less inkjunk.

In conventional notation the parentheses in !!a(-b)!! are essential and if you lose them the whole thing is ruined. But because !!\star!! is just a special case of multiplication, it associates with multiplication and division, so we don't have to worry about parentheses in !!(a\star)b = a(b\star) = (ab)\star!!. They are all equal to just !!ab\star!!. and you can drop the parentheses or include them or write the terms in any order, just as you like, just as you would with !!abc!!.

The surprising associativity of subtraction is no longer surprising, because $$(a + b) - c = a + (b - c)$$ is now written as $$(a + b) + c\star = a + (b + c\star)$$ so it's just the usual associative law for addition; it is not even disguised. The same happens for the reverse associative laws for subtraction that nobody mentions; they become variations on $$ \begin{align} (a + b\star) + c\star & = a + (b\star + c\star) \\ & = a + (b+c)\star \end{align} $$ and such like.

The !!\star!! is faster to read and faster to say. Instead of “minus one” or “negative one” or “times negative one”, you just say “star”.

The !!\star!! is just a number, and it behaves like a number. Its role in an expression is the same as any other number's. It is just a special, one-off notation for a single, particularly important number.

Open questions:

  1. Do we now need to replace the !!\pm!! sign? If so, what should we replace it with?

  2. Maybe the idea is sound, but the !!\star!! itself is a bad choice. It is slow to write. It will inevitably be confused with the * that almost every programming language uses to denote multiplication.

  3. The real problem here is that the !!-!! symbol is overloaded. Instead of changing the negation symbol to !!\star!! and eliminating the subtraction symbol, what if we just eliminated subtraction? None of the new notation would be incompatible with the old notation: !!-(a+-b) = b+-a!! still means exactly what it used to. But you are no longer allowed to abbreviate it to !!-(a-b) = b-a!!.

    This would fix the problem of the !!\star!! taking too long to write: we would just use !!-!! in its place. It would also fix the concern of point 2: !!a\pm b!! now means !!a+b!! or !!a+-b!! which is not hard to remember or to understand. Another happy result: notations like !!-1!! and !!-2!! do not change at all.

Curious footnote: While I was writing up the draft of this article, it had a reminder in it: “How did you and Darius come up with this?” I went back to our email to look, and I discovered the answer was:

  1. Darius suggested the idea to me.
  2. I said, “Hey, that's a great idea!”

I wish I could take more credit, but there it is. Hmm, maybe I will take credit for inspiring Darius! That should be worth at least fifty percent, perhaps more.

[ This article had some perinatal problems. It escaped early from the laboratory, in a not-quite-finished state, so I apologize if you are seeing it twice. ]

[Other articles in category /math] permanent link

Sun, 12 Nov 2017

No, it is not a compiler error. It is never a compiler error.

When I used to hang out in the comp.lang.c Usenet group, back when there was a comp.lang.c Usenet group, people would show up fairly often with some program they had written that didn't work, and ask if their compiler had a bug. The compiler did not have a bug. The compiler never had a bug. The bug was always in the programmer's code and usually in their understanding of the language.

When I worked at the University of Pennsylvania, a grad student posted to one of the internal bulletin boards looking for help with a program that didn't work. Another graduate student, a super-annoying know-it-all, said confidently that it was certainly a compiler bug. It was not a compiler bug. It was caused by a misunderstanding of the way arguments to unprototyped functions were automatically promoted.

This is actually a subtle point, obscure and easily misunderstood. Most examples I have seen of people blaming the compiler are much sillier. I used to be on the mailing list for discussing the development of Perl 5, and people would show up from time to time to ask if Perl's if statement was broken. This is a little mind-boggling, that someone could think this. Perl was first released in 1987. (How time flies!) The if statement is not exactly an obscure or little-used feature. If there had been a bug in if it would have been discovered and fixed by 1988. Again, the bug was always in the programmer's code and usually in their understanding of the language.

Here's something I wrote in October 2000, which I think makes the case very clearly, this time concerning a claimed bug in the stat() function, another feature that first appeared in Perl 1.000:

On the one hand, there's a chance that the compiler has a broken stat and is subtracting 6 or something. Maybe that sounds likely to you but it sounds really weird to me. I cannot imagine how such a thing could possibly occur. Why 6? It all seems very unlikely.

Well, in the absence of an alternative hypothesis, we have to take what we can get. But in this case, there is an alternative hypothesis! The alternative hypothesis is that [this person's] program has a bug.

Now, which seems more likely to you?

  • Weird, inexplicable compiler bug that nobody has ever seen before


  • Programmer fucked up

Hmmm. Let me think.

I'll take Door #2, Monty.

Presumably I had to learn this myself at some point. A programmer can waste a lot of time looking for the bug in the compiler instead of looking for the bug in their program. I have a file of (obnoxious) Good Advice for Programmers that I wrote about twenty years ago, and one of these items is:

Looking for a compiler bug is the strategy of LAST resort. LAST resort.

Anyway, I will get to the point. As I mentioned a few months ago, I built a simple phone app that Toph and I can use to find solutions to “twenty-four puzzles”. In these puzzles, you are given four single-digit numbers and you have to combine them arithmetically to total 24. Pennsylvania license plates have four digits, so as we drive around we play the game with the license plate numbers we see. Sometimes we can't solve a puzzle, and then we wonder: is it because there is no solution, or because we just couldn't find one? Then we ask the phone app.

The other day we saw the puzzle «5 4 5 1», which is very easy, but I asked the phone app, to find out if there were any other solutions that we missed. And it announced `No solutions.” Which is wrong. So my program had a bug, as my programs often do.

The app has a pre-populated dictionary containing all possible solutions to all the puzzles that have solutions, which I generated ahead of time and embedded into the app. My first guess was that bug had been in the process that generated this dictionary, and that it had somehow missed the solutions of «5 4 5 1». These would be indexed under the key 1455, which is the same puzzle, because each list of solutions is associated with the four input numbers in ascending order. Happily I still had the original file containing the dictionary data, but when I looked in it under 1455 I saw exactly the two solutions that I expected to see.

So then I looked into the app itself to see where the bug was. Code Studio's underlying language is Javascript, and Code Studio has a nice debugger. I ran the app under the debugger, and stopped in the relevant code, which was:

    var x = [getNumber("a"), getNumber("b"), getNumber("c"), getNumber("d")].sort().join("");

This constructs a hash key (x) that is used to index into the canned dictionary of solutions. The getNumber() calls were retrieving the four numbers from the app's menus, and I verified that the four numbers were «5 4 5 1» as they ought to be. But what I saw next astounded me: x was not being set to 1455 as it should have been. It was set to 4155, which was not in the dictionary. And it was set to 4155 because

the built-in sort() function

was sorting the numbers





For a while I could not believe my eyes. But after another fifteen or thirty minutes of tinkering, I sent off a bug report… no, I did not. I still didn't believe it. I asked the front-end programmers at my company what my mistake had been. Nobody had any suggestions.

Then I sent off a bug report that began:

I think that Array.prototype.sort() returned a wrongly-sorted result when passed a list of four numbers. This seems impossible, but …

I was about 70% expecting to get a reply back explaining what I had misunderstood about the behavior of Javascript's sort().

But to my astonishment, the reply came back only an hour later:

Wow! You're absolutely right. We'll investigate this right away.

In case you're curious, the bug was as follows: The sort() function was using a bubble sort. (This is of course a bad choice, and I think the maintainers plan to replace it.) The bubble sort makes several passes through the input, swapping items that are out of order. It keeps a count of the number of swaps in each pass, and if the number of swaps is zero, the array is already ordered and the sort can stop early and skip the remaining passes. The test for this was:

    if (changes <= 1) break;

but it should have been:

    if (changes == 0) break;


The Code Studio folks handled this very creditably, and did indeed fix it the same day. (The support system ticket is available for your perusal, as is the Github pull request with the fix, in case you are interested.)

I still can't quite believe it. I feel as though I have accidentally spotted the Loch Ness Monster, or Bigfoot, or something like that, a strange and legendary monster that until now I thought most likely didn't exist.

A bug in the sort() function. O day and night, but this is wondrous strange!

[ Addendum 20171113: Thanks to Reddit user spotter for pointing me to a related 2008 blog post of Jeff Atwood's, “The First Rule of Programming: It's Always Your Fault”. ]

[ Addendum 20171113: Yes, yes, I know sort() is in the library, not in the compiler. I am using “compiler error” as a synecdoche for “system software error”. ]

[ Addendum 20171116: I remembered examples of two other fundamental system software errors I have discovered, including one honest-to-goodness compiler bug. ]

[Other articles in category /prog] permanent link

Sat, 11 Nov 2017

Randomized algorithms go fishing

A little while back I thought of a perfect metaphor for explaining what a randomized algorithm is. It's so perfect I'm sure it must have thought of many times before, but it's new to me.

Suppose you have a lake, and you want to know if there are fish in the lake. You dig some worms, pick a spot, bait the hook, and wait. At the end of the day, if you have caught a fish, you have your answer: there are fish in the lake.[1]

But what if you don't catch a fish? Then you still don't know. Perhaps you used the wrong bait, or fished in the wrong spot. Perhaps you did everything right and the fish happened not to be biting that day. Or perhaps you did everything right except there are no fish in the lake.

But you can try again. Pick a different spot, try a different bait, and fish for another day. And if you catch a fish, you know the answer: the lake does contain fish. But if not, you can go fishing again tomorrow.

Suppose you go fishing every day for a month and you catch nothing. You still don't know why. But you have a pretty good idea: most likely, it is because there are no fish to catch. It could be that you have just been very unlucky, but that much bad luck is unlikely.

But perhaps you're not sure enough. You can keep fishing. If, after a year, you have not caught any fish, you can be almost certain that there were no fish in the lake at all. Because a year-long run of bad luck is extremely unlikely. But if you are still not convinced, you can keep on fishing. You will never be 100% certain, but if you keep at it long enough you can become 99.99999% certain with as many nines as you like.

That is a randomized algorithm, for finding out of there are fish in a lake! It might tell you definitively that there are, by producing a fish. Or it might fail, and then you still don't know. But as long as it keeps failing, the chance that there are any fish rapidly becomes very small, exponentially so, and can be made as small as you like.

For not-metaphorical examples, see:

  • The Miller-Rabin primality test: Given an odd number K, is K composite? If it is, the Miller-Rabin test will tell you so 75% of the time. If not, you can go fishing again the next day. After n trials, you are either !!100\%!! certain that K is composite, or !!100\%-\frac1{2^{2n}}!! certain that it is prime.

  • Frievalds’ algorithm: given three square matrices !!A, B, !! and !!C!!, is !!C!! the product !!A×B!!? Actually multiplying !!A×B!! could be slow. But if !!A×B!! is not equal to !!C!!, Frievald's algorithm will quickly tell you that it isn't—half the time. If not, you can go fishing again. After n trials, you are either !!100\%!! certain that !!C!! is not the correct product, or !!100\%-\frac1{2^n}!! certain that it is.

[1] Let us ignore mathematicians’ pettifoggery about lakes that contain exactly one fish. This is just a metaphor. If you are really concerned, you can catch-and-release.

[Other articles in category /CS] permanent link

Tue, 07 Nov 2017

A modern translation of the 1+1=2 lemma

A while back I blogged an explanation of the “!!1+1=2!!” lemma from Whitehead and Russell's Principia Mathematica:

W. Ethan Duckworth of the Department of Mathematics and Statistics at Loyola University translated this into modern notation and has kindly given me permission to publish it here:

I think it is interesting and instructive to compare the two versions. One thing to notice is that there is no perfect translation. As when translating between two natural languages (German and English, say), the meaning cannot be preserved exactly. Whitehead and Russell's language is different from the modern language not only because the notation is different but because the underlying concepts are different. To really get what Principia Mathematica is saying you have to immerse yourself in the Principia Mathematica model of the world.

The best example of this here is the symbol “1”. In the modern translation, this means the number 1. But at this point in Principia Mathematica, the number 1 has not yet been defined, and to use it here would be circular, because proposition ∗54.43 is an important step on the way to defining it. In Principia Mathematica, the symbol “1” represents the class of all sets that contain exactly one element.[1] Following the definition of ∗52.01, in modern notation we would write something like:

$$1 \equiv_{\text{def}} \{x \mid \exists y . x = \{ y \} \}$$

But in many modern universes, that of ZF set theory in particular, there is no such object.[2] The situation in ZF is even worse: the purported definition is meaningless, because the comprehension is unrestricted.

The Principia Mathematica notation for !!|A|!!, the cardinality of set !!A!!, is !!Nc\,‘A!!, but again this is only an approximate translation. The meaning of !!Nc\,‘A!! is something close to

the unique class !!C!! such that !!x\in C!! if and only if there exists a one-to-one relation between !!A!! and !!x!!.

(So for example one might assert that !!Nc\,‘\Lambda = 0!!, and in fact this is precisely what proposition ∗101.1 does assert.) Even this doesn't quite capture the Principia Mathematica meaning, since the modern conception of a relation is that it is a special kind of set, but in Principia Mathematica relations and sets are different sorts of things. (We would also use a one-to-one function, but here there is no additional mismatch between the modern concept and the Principia Mathematica one.)

It is important, when reading old mathematics, to try to understand in modern terms what is being talked about. But it is also dangerous to forget that the ideas themselves are different, not just the language.[3] I extract a lot of value from switching back and forth between different historical views, and comparing them. Some of this value is purely historiological. But some is directly mathematical: looking at the same concepts from a different viewpoint sometimes illuminates aspects I didn't fully appreciate. And the different viewpoint I acquire is one that most other people won't have.

One of my current low-priority projects is reading W. Burnside's important 1897 book Theory of Groups of Finite Order. The value of this, for me, is not so much the group-theoretic content, but in seeing how ideas about groups have evolved. I hope to write more about this topic at some point.

[1] Actually the situation in Principia Mathematica is more complicated. There is a different class 1 defined at each type. But the point still stands.

[2] In ZF, if !!1!! were to exist as defined above, the set !!\{1\}!! would exist also, and we would have !!\{1\} \in 1!! which would contradict the axiom of foundation.

[3] This was a recurring topic of study for Imre Lakatos, most famously in his little book Proofs and Refutations. Also important is his article “Cauchy and the continuum: the significance of nonstandard analysis for the history and philosophy of mathematics.” Math. Intelligencer 1 (1978), #3, p.151–161, which I discussed here earlier, and which you can read in its entireity by paying the excellent people at Elsevier the nominal and reasonable—nay, trivial—sum of only US$39.95.

[Other articles in category /math] permanent link

Thu, 02 Nov 2017

I missed an easy solution to a silly problem

A few years back I wrote a couple of articles about the extremely poor macro plugin I wrote for Blosxom. ([1] [2]). The feature-poorness of the macro system is itself the system's principal feature, since it gives the system simple behavior and simple implementation. Sometimes this poverty means I have to use odd workarounds to get it to do what I want, but they are always simple workarounds and it is never hard to figure out why it didn't do what I wanted.

Yesterday, though, I got stuck. I had defined a macro, ->, which the macro system would replace with →. Fine. But later in the file I had an HTML comment:

    <!-- blah blah -->

and the macro plugin duly transformed this to

    <!-- blah blah -→

creating an unterminated comment.

Normally the way I would deal with this would be to change the name of the macro from -> to something else that did not conflict with HTML syntax, but I was curiously resistant to this. I could not think of anything I wanted to use instead. (After a full night's sleep, it seems to me that => would have been just fine.)

I then tried replacing --> with &#45;&#45;&#62;. I didn't expect this to work, and indeed it didn't. Those &# escapes only work for text parts of an HTML document, and cannot be used for HTML syntax. They remove the special meaning of character sequences, which is the opposite of what I need here.

Eventually, I just deleted the comment and moved on. That worked, although it was obviously suboptimal. I was too tired to think, and I just wanted the problem out of the way. I wish I had been a little less impulsive, because there are at least two other solutions I overlooked:

  • I could have defined an END-HTML-COMMENT macro that expanded to --> and then used it at the end of the comment. The results of macro expansion are never re-expanded, so the resulting --> would not have been subject to further replacement. This is ugly, but would have done the job, and I have used this sort of technique before.

  • Despite its smallness, the macro plugin has had feature creep over the years, and some of these features I put in and then forgot about. (Bad programmer! No cookie!) One of these lost features is the directive #undefall, which instructs the plugin to forget all the macro definitions. I could have solved the problem by sticking this in just before the broken comment.

    (The plugin does not (yet) have a selective #undef. It would be easy to add, but the module has too many features already.)

Yesterday morning I asked my co-workers if there was an alternative HTML comment syntax, or some way to modify the comment ending sequence so that the macro plugin wouldn't spoil it. (I think there isn't, and a short look at the HTML 5.0 standard didn't suggest any workaround.)

One of the co-workers was Tye McQueen. He said that as far as he knew there was no fix to the HTML comments that was like what I had asked for. He asked whether I could define a second macro, -->, which would expand to -->.

I carefully explained why this wouldn't work: when two macro definitions share a prefix, and they both match, the macro system does not make any guarantee about which substitution it will perform. If there are two overlapping macros, say:

    #define pert curt
    #define per  bloods

then the string pertains might turn into curtains (if the first substitution is performed) or into bloodstains (if the second is). But you don't know which it will be, and it might be different each time. I could fix this, but at present I prefer the answer to be “don't define a macro that is a prefix of another macro.”

Then M. McQueen gently pointed out that -> is not a prefix of -->. It is a suffix. My objection does not apply, and his suggestion solves the problem.

When I write an Oops post I try to think about what lesson I can learn from the mistake. This time there isn't too much, but I do have a couple of ideas:

  1. Before giving up on some piece of software I've written, look to see if I foresaw the problem and put in a feature to deal with it. I have a notion that this would work surprisingly often.

    There are two ways to deal with the problem of writing software with too many features, and I have worked on this problem for many years from only one direction: try to write fewer features. But I could also come at it from the opposite direction: try to make better use of the features that I do write.

    It feels odd that this should seem to me like a novel idea, but there it is.

  2. Try to get more sleep.

(I haven't published an oops article in far too long, and it certainly isn't because I haven't been making mistakes. I will try to keep it in mind.)

[Other articles in category /oops] permanent link

Tue, 31 Oct 2017

The Blind Spot and the cut rule

[ The Atom and RSS feeds have done an unusually poor job of preserving the mathematical symbols in this article. It will be much more legible if you read it on my blog. ]

Lately I've been enjoying The Blind Spot by Jean-Yves Girard, a very famous logician. (It is translated from French; the original title is Le Point Aveugle.) This is an unusual book. It is solidly full of deep thought and technical detail about logic, but it is also opinionated, idiosyncratic and polemical. Chapter 2 (“Incompleteness”) begins:

It is out of question to enter into the technical arcana of Gödel’s theorem, this for several reasons:

(i) This result, indeed very easy, can be perceived, like the late paintings of Claude Monet, but from a certain distance. A close look only reveals fastidious details that one perhaps does not want to know.

(ii) There is no need either, since this theorem is a scientific cul-de-sac : in fact it exposes a way without exit. Since it is without exit, nothing to seek there, and it is of no use to be expert in Gödel’s theorem.

(The Blind Spot, p. 29)

He continues a little later:

Rather than insisting on those tedious details which «hide the forest», we shall spend time on objections, from the most ridiculous to the less stupid (none of them being eventually respectable).

As you can see, it is not written in the usual dry mathematical-text style, presenting the material as a perfect and aseptic distillation of absolute truth. Instead, one sees the history of logic, the rise and fall of different theories over time, the interaction and relation of many mathematical and philosophical ideas, and Girard's reflections about it all. It is a transcription of a lecture series, and reads like one, including all of the speaker's incidental remarks and offhand musings, but written down so that each can be weighed and pondered at length. Instead of wondering in the moment what he meant by some intriguing remark, then having to abandon the thought to keep up with the lecture, I can pause and ponder the significance. Girard is really, really smart, and knows way more about logic than I ever will, and his offhand remarks reward this pondering. The book is profound in a way that mathematics books often aren't. I wanted to provide an illustrative quotation, but to briefly excerpt a profound thought is to destroy its profundity, so I will have to refrain.[1])

The book really gets going with its discussion of Gentzen's sequent calculus in chapter 3.

Between around 1890 (when Peano and Frege began to liberate logic from its medieval encrustations) and 1935 when the sequent calculus was invented, logical proofs were mainly in the “Hilbert style”. Typically there were some axioms, and some rules of deduction by which the axioms could be transformed into other formulas. A typical example consists of the axioms $$A\to(B\to A)\\ (A \to (B \to C)) \to ((A \to B) \to (A \to C)) $$ (where !!A, B, C!! are understood to be placeholders that can be replaced by any well-formed formulas) and the deduction rule modus ponens: having proved !!A\to B!! and !!A!!, we can deduce !!B!!.

In contrast, sequent calculus has few axioms and many deduction rules. It deals with sequents which are claims of implication. For example: $$p, q \vdash r, s$$ means that if we can prove all of the formulas on the left of the ⊢ sign, then we can conclude some of the formulas on the right. (Perhaps only one, but at least one.)

A typical deductive rule in sequent calculus is:

$$ \begin{array}{c} Γ ⊢ A, Δ \qquad Γ ⊢ B, Δ \\ \hline Γ ⊢ A ∧ B, Δ \end{array} $$

Here !!Γ!! and !!Δ!! represent any lists of formulas, possibly empty. The premises of the rule are:

  1. !!Γ ⊢ A, Δ!!: If we can prove all of the formulas in !!Γ!!, then we can conclude either the formula !!A!! or some of the formulas in !!Δ!!.

  2. !!Γ ⊢ B, Δ!!: If we can prove all of the formulas in !!Γ!!, then we can conclude either the formula !!B!! or some of the formulas in !!Δ!!.

From these premises, the rule allows us to deduce:

!!Γ ⊢ A ∧ B, Δ!!: If we can prove all of the formulas in !!Γ!!, then we can conclude either the formula !!A \land B!! or some of the formulas in !!Δ!!.

The only axioms of sequent calculus are utterly trivial:

$$ \begin{array}{c} \phantom{A} \\ \hline A ⊢ A \end{array} $$

There are no premises; we get this deduction for free: If can prove !!A!!, we can prove !!A!!. (!!A!! here is a metavariable that can be replaced with any well-formed formula.)

One important point that Girard brings up, which I had never realized despite long familiarity with sequent calculus, is the symmetry between the left and right sides of the turnstile ⊢. As I mentioned, the interpretation of !!Γ ⊢ Δ!! I had been taught was that it means that if every formula in !!Γ!! is provable, then some formula in !!Δ!! is provable. But instead let's focus on just one of the formulas !!A!! on the right-hand side, hiding in the list !!Δ!!. The sequent !!Γ ⊢ Δ, A!! can be understood to mean that to prove !!A!!, it suffices to prove all of the formulas in !!Γ!!, and to disprove all the formulas in !!Δ!!. And now let's focus on just one of the formulas on the left side: !!Γ, A ⊢ Δ!! says that to disprove !!A!!, it suffices to prove all the formulas in !!Γ!! and disprove all the formulas in !!Δ!!.

The all-some correspondence, which had previously caused me to wonder why it was that way and not something else, perhaps the other way around, has turned into a simple relationship about logical negation: the formulas on the left are positive, and the ones on the right are negative.[2]) With this insight, the sequent calculus negation laws become not merely simple but trivial:

$$ \begin{array}{cc} \begin{array}{c} Γ, A ⊢ Δ \\ \hline Γ ⊢ \lnot A, Δ \end{array} & \qquad \begin{array}{c} Γ ⊢ A, Δ \\ \hline Γ, \lnot A ⊢ Δ \end{array} \end{array} $$

For example, in the right-hand deduction: what is sufficient to prove !!A!! is also sufficient to disprove !!¬A!!.

(Compare also the rule I showed above for ∧: It now says that if proving everything in !!Γ!! and disproving everything in !!Δ!! is sufficient for proving !!A!!, and likewise sufficient for proving !!B!!, then it is also sufficient for proving !!A\land B!!.)

But none of that was what I planned to discuss; this article is (intended to be) about sequent calculus's “cut rule”.

I never really appreciated the cut rule before. Most of the deductive rules in the sequent calculus are intuitively plausible and so simple and obvious that it is easy to imagine coming up with them oneself.

But the cut rule is more complicated than the rules I have already shown. I don't think I would have thought of it easily:

$$ \begin{array}{c} Γ ⊢ A, Δ \qquad Λ, A ⊢ Π \\ \hline Γ, Λ ⊢ Δ, Π \end{array} $$

(Here !!A!! is a formula and !!Γ, Δ, Λ, Π!! are lists of formulas, possibly empty lists.)

Girard points out that the cut rule is a generalization of modus ponens: taking !!Γ, Δ, Λ!! to be empty and !!Π = \{B\}!! we obtain:

$$ \begin{array}{c} ⊢ A \qquad A ⊢ B \\ \hline ⊢ B \end{array} $$

The cut rule is also a generalization of the transitivity of implication:

$$ \begin{array}{c} X ⊢ A \qquad A ⊢ Y \\ \hline X ⊢ Y \end{array} $$

Here we took !!Γ = \{X\}, Π = \{Y\}!!, and !!Δ!! and !!Λ!! empty.

This all has given me a much better idea of where the cut rule came from and why we have it.

In sequent calculus, the deduction rules all come in pairs. There is a rule about introducing ∧, which I showed before. It allows us to construct a sequent involving a formula with an ∧, where perhaps we had no ∧ before. (In fact, it is the only way to do this.) There is a corresponding rule (actually two rules) for getting rid of ∧ when we have it and we don't want it:

$$ \begin{array}{cc} \begin{array}{c} Γ ⊢ A\land B, Δ \\ \hline Γ ⊢ A, Δ \end{array} & \qquad \begin{array}{c} Γ ⊢ A\land B, Δ \\ \hline Γ ⊢ B, Δ \end{array} \end{array} $$

Similarly there is a rule (actually two rules) about introducing !!\lor!! and a corresponding rule about eliminating it.

The cut rule seems to lie outside this classification. It is not paired.

But Girard showed me that it is part of a pair. The axiom

$$ \begin{array}{c} \phantom{A} \\ \hline A ⊢ A \end{array} $$

can be seen as an introduction rule for a pair of !!A!!s, one on each side of the turnstile. The cut rule is the corresponding rule for eliminating !!A!! from both sides.

Sequent calculus proofs are much easier to construct than Hilbert-style proofs. Suppose one wants to prove !!B!!. In a Hilbert system the only deduction rule is modus ponens, which requires that we first prove !!A\to B!! and !!A!! for some !!A!!. But what !!A!! should we choose? It could be anything, and we have no idea where to start or how big it could be. (If you enjoy suffering, try to prove the simple theorem !!A\to A!! in the Hilbert system I described at the beginning of the article. (Solution)

In sequent calculus, there is only one way to prove each kind of thing, and the premises in each rule are simply related to the consequent we want. Constructing the proof is mostly a matter of pushing the symbols around by following the rules to their conclusions. (Or, if this is impossible, one can conclude that there is no proof, and why.[3]) Construction of proofs can now be done entirely mechanically!

Except! The cut rule does require one to guess a formula: If one wants to prove !!Γ, Λ ⊢ Δ, Π!!, one must guess what !!A!! should appear in the premises !!Γ, A ⊢ Δ!! and !!Λ ⊢ A, Π!!. And there is no constraint at all on !!A!!; it could be anything, and we have no idea where to start or how big it could be.

The good news is that Gentzen, the inventor of sequent calculus, showed that one can dispense with the cut rule: it is unnecessary:

In Hilbert-style systems, based on common sense, the only rule is (more or less) the Modus Ponens : one reasons by linking together lemmas and consequences. We just say that one can get rid of that : it is like crossing the English Channel with fists and feet bound !

(The Blind Spot, p. 61)

Gentzen's demonstration of this shows how one can take any proof that involves the cut rule, and algorithmically eliminate the cut rule from it to obtain a proof of the same result that does not use cut. Gentzen called this the “Hauptsatz” (“principal theorem”) and rightly so, because it reduces construction of logical proofs to an algorithm and is therefore the ultimate basis for algorithmic proof theory.

The bad news is that the cut-elimination process can super-exponentially increase the size of the proof, so it does not lead to a practical algorithm for deciding provability. Girard analyzed why, and what he discovered amazed me. The only problem is in the contraction rules, which had seemed so trivial and innocuous—uninteresting, even—that I had never given them any thought:

$$ \begin{array}{cc} \begin{array}{c} Γ, A, A ⊢ Δ \\ \hline Γ, A ⊢ Δ \end{array} & \qquad \begin{array}{c} Γ ⊢ A, A, Δ \\ \hline Γ ⊢ A, Δ \end{array} \end{array} $$

And suddenly Girard's invention of linear logic made sense to me. In linear logic, contraction is forbidden; one must use each formula in one and only one deduction.

Previously it had seemed to me that this was a pointless restriction. Now I realized that it was no more of a useless hair shirt than the intuitionistic rejection of the law of the proof by contradiction: not a stubborn refusal to use an obvious tool of reasoning, but a restriction of proofs to produce better reasoning. With the rejection of contraction, cut-elimination no longer explodes proof size, and automated theorem proving becomes practical:

This is why its study is, implicitly, at the very heart of these lectures.

(The Blind Spot, p. 63)

The book is going to get into linear logic later in the next chapter. I have read descriptions of linear logic before, but never understood what it was up to. (It has two logical and operators, and two logical or operators; why?) But I am sure Girard will explain it marvelously.

  1. In place of a profound excerpt, I will present the following, which isn't especially profound but struck a chord for me: “By the way, nothing is more arbitrary than a modal logic « I am done with this logic, may I have another one ? » seems to be the motto of these guys.” (p. 24)
  2. Compare my alternative representation of arithmetic expressions that collapses addition and subtraction, and multiplication and division.
  3. A typically Girardian remark is that analytic tableaux are “the poor man's sequents”.

A brief but interesting discussion of The Blind Spot on Hacker News.

Review by Felipe Zaldivar.

[Other articles in category /math/logic] permanent link

Sun, 15 Oct 2017

Counting increasing sequences with Burnside's lemma

[ I started this article in March and then forgot about it. Ooops! ]

Back in February I posted an article about how there are exactly 715 nondecreasing sequences of 4 digits. I said that !!S(10, 4)!! was the set of such sequences and !!C(10, 4)!! was the number of such sequences, and in general $$C(d,n) = \binom{n+d-1}{d-1} = \binom{n+d-1}{n}$$ so in particular $$C(10,4) = \binom{13}{4} = 715.$$

I described more than one method of seeing this, but I didn't mention the method I had found first, which was to use the Cauchy-Frobenius-Redfeld-Pólya-Burnside counting lemma. I explained the lemma in detail some time ago, with beautiful illustrated examples, so I won't repeat the explanation here. The Burnside lemma is a kind of big hammer to use here, but I like big hammers. And the results of this application of the big hammer are pretty good, and justify it in the end.

To count the number of distinct sequences of 4 digits, where some sequences are considered “the same” we first identify a symmetry group whose orbits are the equivalence classes of sequences. Here the symmetry group is !!S_4!!, the group that permutes the elements of the sequence, because two sequences are considered “the same” if they have exactly the same digits but possibly in a different order, and the elements of !!S_4!! acting on the sequences are exactly what you want to permute the elements into some different order.

Then you tabulate how many of the 10,000 original sequences are left fixed by each element !!p!! of !!S_4!!, which is exactly the number of cycles of !!p!!. (I have also discussed cycle classes of permutations before.) If !!p!! contains !!n!! cycles, then !!p!! leaves exactly !!10^n!! of the !!10^4!! sequences fixed.

Cycle class Number
of cycles
How many
left fixed
41 10,000
3 6 1,000

2 3 + 8 = 11 100
1 6 10
  24 17,160

(Skip this paragraph if you already understand the table. The four rows above are an abbreviation of the full table, which has 24 rows, one for each of the 24 permutations of order 4. The “How many permutations?” column says how many times each row should be repeated. So for example the second row abbreviates 6 rows, one for each of the 6 permutations with three cycles, which each leave 1,000 sequences fixed, for a total of 6,000 in the second row, and the total for all 24 rows is 17,160. There are two different types of permutations that have two cycles, with 3 and 8 permutations respectively, and I have collapsed these into a single row.)

Then the magic happens: We average the number left fixed by each permutation and get !!\frac{17160}{24} = 715!! which we already know is the right answer.

Now suppose we knew how many permutations there were with each number of cycles. Let's write !!\def\st#1#2{\left[{#1\atop #2}\right]}\st nk!! for the number of permutations of !!n!! things that have exactly !!k!! cycles. For example, from the table above we see that $$\st 4 4 = 1,\quad \st 4 3 = 6,\quad \st 4 2 = 11,\quad \st 4 1 = 6.$$

Then applying Burnside's lemma we can conclude that $$C(d, n) = \frac1{n!}\sum_i \st ni d^i .\tag{$\spadesuit$}$$ So for example the table above computes !!C(10,4) = \frac1{24}\sum_i \st 4i 10^i = 715!!.

At some point in looking into this I noticed that $$\def\rp#1#2{#1^{\overline{#2}}}% \def\fp#1#2{#1^{\underline{#2}}}% C(d,n) = \frac1{n!}\rp dn$$ where !!\rp dn!! is the so-called “rising power” of !!d!!: $$\rp dn = d\cdot(d+1)(d+2)\cdots(d+n-1).$$ I don't think I had a proof of this; I just noticed that !!C(d, 1) = d!! and !!C(d, 2) = \frac12(d^2+d)!! (both obvious), and the Burnside's lemma analysis of the !!n=4!! case had just given me !!C(d, 4) = \frac1{24}(d^4 +6d^3 + 11d^2 + 6d)!!. Even if one doesn't immediately recognize this latter polynomial it looks like it ought to factor and then on factoring it one gets !!d(d+1)(d+2)(d+3)!!. So it's easy to conjecture !!C(d, n) = \frac1{n!}\rp dn!! and indeed, this is easy to prove from !!(\spadesuit)!!: The !!\st n k!! obey the recurrence $$\st{n+1}k = n \st nk + \st n{k-1}\tag{$\color{green}{\star}$}$$ (by an easy combinatorial argument1) and it's also easy to show that the coefficients of !!\rp nk!! obey the same recurrence.2

In general !!\rp nk = \fp{(n+k-1)}k!! so we have !!C(d, n) = \rp dn = \fp{(n+d-1)}n = \binom{n+d-1}d = \binom{n+d-1}{n-1}!! which ties the knot with the formula from the previous article. In particular, !!C(10,4) = \binom{13}9!!.

I have a bunch more to say about this but this article has already been in the oven long enough, so I'll cut the scroll here.

[1] The combinatorial argument that justifies !!(\color{green}{\star})!! is as follows: The Stirling number !!\st nk!! counts the number of permutations of order !!n!! with exactly !!k!! cycles. To get a permutation of order !!n+1!! with exactly !!k!! cycles, we can take one of the !!\st nk!! permutations of order !!n!! with !!k!! cycles and insert the new element into one of the existing cycles after any of the !!n!! elements. Or we can take on of the !!\st n{k-1}!! permutations with only !!k-1!! cycles and add the new element in its own cycle.)

[2] We want to show that the coefficients of !!\rp nk!! obey the same recurrence as !!(\color{green}{\star})!!. Let's say that the coefficient of the !!n^i!! term in !!\rp nk!! is !!c_i!!. We have $$\rp n{k+1} = \rp nk\cdot (n+k) = \rp nk \cdot n + \rp nk \cdot k $$ so the coefficient of the the !!n^i!! term on the left is !!c_{i-1} + kc_i!!.

[Other articles in category /math] permanent link

Wed, 20 Sep 2017

Gompertz' law for wooden utility poles

Gompertz' law says that the human death rate increases exponentially with age. That is, if your chance of dying during this year is !!x!!, then your chance of dying during next year is !!cx!! for some constant !!c>1!!. The death rate doubles every 8 years, so the constant !!c!! is empirically around !!2^{1/8} \approx 1.09!!. This is of course mathematically incoherent, since it predicts that sufficiently old people will have a mortality rate greater than 100%. But a number of things are both true and mathematically incoherent, and this is one of them. (Zipf's law is another.)

The Gravity and Levity blog has a superb article about this from 2009 that reasons backwards from Gompertz' law to rule out certain theories of mortality, such as the theory that death is due to the random whims of a fickle god. (If death were entirely random, and if you had a 50% chance of making it to age 70, then you would have a 25% chance of living to 140, and a 12.5% chance of living to 210, which we know is not the case.)

Gravity and Levity says:

Surprisingly enough, the Gompertz law holds across a large number of countries, time periods, and even different species.

To this list I will add wooden utility poles.

A couple of weeks ago Toph asked me why there were so many old rusty staples embedded in the utility poles near our house, and this is easy to explain: people staple up their yard sale posters and lost-cat flyers, and then the posters and flyers go away and leave behind the staples. (I once went out with a pliers and extracted a few dozen staples from one pole; it was very satisfying but ultimately ineffective.) If new flyer is stapled up each week, that is 52 staples per year, and 1040 in twenty years. If we agree that 20 years is the absolute minimum plausible lifetime of a pole, we should not be surprised if typical poles have hundreds or thousands of staples each.

But this morning I got to wondering what is the expected lifetime of a wooden utility pole? I guessed it was probably in the range of 40 to 70 years. And happily, because of the Wonders of the Internet, I could look it up right then and there, on the way to the trolley stop, and spend my commute time reading about it.

It was not hard to find an authoritative sounding and widely-cited 2012 study by electric utility consultants Quanta Technology.

Summary: Most poles die because of fungal rot, so pole lifetime varies widely depending on the local climate. An unmaintained pole will last 50–60 years in a cold or dry climate and 30-40 years in a hot wet climate. Well-maintained poles will last around twice as long.

Anyway, Gompertz' law holds for wooden utility poles also. According to the study:

Failure and breakdown rates for wood poles are thought to increase exponentially with deterioration and advancing time in service.

The Quanta study presents this chart, taken from the (then forthcoming) 2012 book Aging Power Delivery Infrastructures:

The solid line is the pole failure rate for a particular unnamed utility company in a median climate. The failure rate with increasing age clearly increases exponentially, as Gompertz' law dictates, doubling every 12½ years or so: Around 1 in 200 poles fails at age 50, around 1 in 100 of the remaining poles fails at age 62.5, and around 1 in 50 of the remaining poles fails at age 75.

(The dashed and dotted lines represent poles that are removed from service for other reasons.)

From Gompertz' law itself and a minimum of data, we can extrapolate the maximum human lifespan. The death rate for 65-year-old women is around 1%, and since it doubles every 8 years or so, we find that 50% of women are dead by age 88, and all but the most outlying outliers are dead by age 120. And indeed, the human longevity record is currently attributed to Jeanne Calment, who died in 1997 at the age of 122½.

Similarly we can extrapolate the maximum service time for a wooden utility pole. Half of them make it to 90 years, but if you have a large installed base of 110-year-old poles you will be replacing about one-seventh of them every year and it might make more sense to rip them all out at once and start over. At a rate of one yard sale per week, a 110-year-old pole will have accumulated 5,720 staples.

The Quanta study does not address deterioration of utility poles due to the accumulation of rusty staples.

[Other articles in category /tech] permanent link

Mon, 28 Aug 2017

Miscellanea about 24 puzzles

This is a collection of leftover miscellanea about twenty-four puzzles. In case you forgot what that is:

The puzzle «4 6 7 9 ⇒ 24» means that one should take the numbers 4, 6, 7, and 9, and combine them with the usual arithmetic operations of addition, subtraction, multiplication, and division, to make the number 24. In this case the unique solution is $$6\times\frac{7 + 9}{4}.$$ When the target number is 24, as it often is, we omit it and just write «4 6 7 9».

Prior articles on this topic:

How many puzzles have solutions?

For each value of !!T!!, there are 715 puzzles «a b c d ⇒ T». (I discussed this digression in two more earlier articles: [1] [2].) When the target !!T = 24!!, 466 of the 715 puzzles have solutions. Is this typical? Many solutions of «a b c d» puzzles end with a multiplication of 6 and 4, or of 8 and 3, or sometimes of 12 and 2—so many that one quickly learns to look for these types of solutions right away. When !!T=23!!, there won't be any solutions of this type, and we might expect that relatively few puzzles with prime targets have solutions.

This turns out to be the case:

ALT attribute suggestions welcome!

The x-axis is the target number !!T!!, with 0 at the left, 300 at right, and vertical guide lines every 25. The y axis is the number of solvable puzzles out of the maximum possible of 715, with 0 at the bottom, 715 at the top, and horizontal guide lines every 100.

Dots representing prime number targets are colored black. Dots for numbers with two prime factors (4, 6, 9, 10, 14, 15, 21, 22, etc.) are red; dots with three, four, five, six, and seven prime factors are orange, yellow, green, blue, and purple respectively.

Two countervailing trends are obvious: Puzzles with smaller targets have more solutions, and puzzles with highly-composite targets have more solutions. No target number larger than 24 has as many as 466 solvable puzzles.

These are only trends, not hard rules. For example, there are 156 solvable puzzles with the target 126 (4 prime factors) but only 93 with target 128 (7 prime factors). Why? (I don't know. Maybe because there is some correlation with the number of different prime factors? But 72, 144, and 216 have many solutions, and only two different prime factors.)

The smallest target you can't hit is 417. The following numbers 418 and 419 are also impossible. But there are 8 sets of four digits that can be used to make 416 and 23 sets that can be used to make 420. The largest target that can be hit is obviously !!6561 = 9⁴!!; the largest target with two solutions is !!2916 = 4·9·9·9 = 6·6·9·9!!.

(The raw data are available here).

There is a lot more to discover here. For example, from looking at the chart, it seems that the locally-best target numbers often have the form !!2^n3^m!!. What would we see if we colored the dots according to their largest prime factor instead of according to their number of prime factors? (I tried doing this, and it didn't look like much, but maybe it could have been done better.)

Making zero

As the chart shows, 705 of the 715 puzzles of the type «a b c d ⇒ 0», are solvable. This suggests an interesting inverse puzzle that Toph and I enjoyed: find four digits !!a,b,c, d!! that cannot be used to make zero. (The answers).

Identifying interesting or difficult problems

(Caution: this section contains spoilers for many of the most interesting puzzles.)

I spent quite a while trying to get the computer to rank puzzles by difficulty, with indifferent success.


Seven puzzles require the use of fractions. One of these is the notorious «3 3 8 8» that I mentioned before. This is probably the single hardest of this type. The other six are:

    «1 3 4 6»
    «1 4 5 6»
    «1 5 5 5»
    «1 6 6 8»
    «3 3 7 7»
    «4 4 7 7»

(Solutions to these (formatted image); solutions to these (plain text))

«1 5 5 5» is somewhat easier than the others, but they all follow pretty much the same pattern. The last two are pleasantly symmetrical.

Negative numbers

No puzzles require the use of negative intermediate values. This surprised me at first, but it is not hard to see why. Subexpressions with negative intermediate values can always be rewritten to have positive intermediate values instead.

For instance, !!3 × (9 + (3 - 4))!! can be rewritten as !!3 × (9 - (4 - 3))!! and !!(5 - 8)×(1 -9)!! can be rewritten as !!(8 - 5)×(9 -1)!!.

A digression about tree shapes

In one of the earlier articles I asserted that there are only two possible shapes for the expression trees of a puzzle solution:

(((a # b) # c) # d) ((a # b) # (c # d))
Form A Form B

(Pink square nodes contain operators and green round nodes contain numbers.)

Lindsey Kuper pointed out that there are five possible shapes, not two. Of course, I was aware of this (it is a Catalan number), so what did I mean when I said there were only two? It's because I had the idea that any tree that wasn't already in one of those two forms could be put into form A by using transformations like the ones in the previous section.

For example, the expression !!(4×((1+2)÷3))!! isn't in either form, but we can commute the × to get the equivalent !!((1+2)÷3)×4!!, which has form A. Sometimes one uses the associative laws, for example to turn !!a ÷ (b × c)!! into !!(a ÷ b) ÷ c!!.

But I was mistaken; not every expression can be put into either of these forms. The expression !!(8×(9-(2·3))!! is an example.

Unusual intermediate values

The most interesting thing I tried was to look for puzzles whose solutions require unusual intermediate numbers.

For example, the puzzle «3 4 4 4» looks easy (the other puzzles with just 3s and 4s are all pretty easy) but it is rather tricky because its only solution goes through the unusual intermediate number 28: !!4 × (3 + 4) - 4!!.

I ranked puzzles as follows: each possible intermediate number appears in a certain number of puzzle solutions; this is the score for that intermediate number. (Lower scores are better, because they represent rarer intermediate numbers.) The score for a single expression is the score of its rarest intermediate value. So for example !!4 × (3 + 4) - 4!! has the intermediate values 7 and 28. 7 is extremely common, and 28 is quite unusual, appearing in only 151 solution expressions, so !!4 × (3 + 4) - 4!! receives a fairly low score of 151 because of the intermediate 28.

Then each puzzle received a difficulty score which was the score of its easiest solution expression. For example, «2 2 3 8» has two solutions, one (!!(8+3)×2+2!!) involving the quite unusual intermediate value 22, which has a very good score of only 79. But this puzzle doesn't count as difficult because it also admits the obvious solution !!8·3·\frac22!! and this is the solution that gives it its extremely bad score of 1768.

Under this ranking, the best-scoring twenty-four puzzles, and their scores, were:

      «1 2 7 7» 3
    * «4 4 7 7» 12
    * «1 4 5 6» 13
    * «3 3 7 7» 14
    * «1 5 5 5» 15
      «5 6 6 9» 23
      «2 5 7 9» 24
      «2 2 5 8» 25
      «2 5 8 8» 45
      «5 8 8 8» 45
      «2 2 2 9» 47
    * «1 3 4 6» 59
    * «1 6 6 8» 59
      «2 4 4 9» 151
      «3 4 4 4» 151
    * «3 3 8 8» 152
      «6 8 8 9» 152
      «2 2 2 7» 155
      «2 2 5 7» 155
      «2 3 7 7» 155
      «2 4 7 7» 155
      «2 5 5 7» 155
      «2 5 7 7» 156
      «4 4 8 9» 162

(Something is not quite right here. I think «2 5 7 7» and «2 5 5 7» should have the same score, and I don't know why they don't. But I don't care enough to do it over.)

Most of these are at least a little bit interesting. The seven puzzles that require the use of fractions appear; I have marked them with stars. The top item is «1 2 7 7», whose only solution goes through the extremely rare intermediate number 49. The next items require fractions, and the one after that is «5 6 6 9», which I found difficult. So I think there's some value in this procedure.

But is there enough value? I'm not sure. The last item on the list, «4 4 8 9», goes through the unusual number 36. Nevertheless I don't think it is a hard puzzle.

(I can also imagine that someone might see the answer to «5 6 6 9» right off, but find «4 4 8 9» difficult. The whole exercise is subjective.)

Solutions with unusual tree shapes

I thought about looking for solutions that involved unusual sequences of operations. Division is much less common than the other three operations.

To get it right, one needs to normalize the form of expressions, so that the shapes !!(a + b) + (c + d)!! and !!a + (b + (c + d))!! aren't counted separately. The Ezpr library can help here. But I didn't go that far because the preliminary results weren't encouraging.

There are very few expressions totaling 24 that have the form !!(a÷b)÷(c÷d)!!. But if someone gives you a puzzle with a solution in that form, then !!(a×d)÷(b×c)!! and !!(a×d) ÷ (b÷c)!! are also solutions, and one or another is usually very easy to see. For example, the puzzle «1 3 8 9» has the solution !!(8÷1)÷(3÷9)!!, which has an unusual form. But this is an easy puzzle; someone with even a little experience will find the solution !!8 × \frac93 × 1!! immediately.

Similarly there are relatively few solutions of the form !!a÷((b-c)÷d)!!, but they can all be transformed into !!a×d÷(b-c)!! which is not usually hard to find. Consider $$\frac 8{\left(\frac{6 - 4}6\right)}.$$ This is pretty weird-looking, but when you're trying to solve it one of the first things you might notice is the 8, and then you would try to turn the rest of the digits into a 3 by solving «4 6 6 ⇒ 3», at which point it wouldn't take long to think of !!\frac6{6-4}!!. Or, coming at it from the other direction, you might see the sixes and start looking for a way to make «4 6 8 ⇒ 4», and it wouldn't take long to think of !!\frac8{6-4}!!.

Ezpr shape

Ezprs (see previous article) correspond more closely than abstract syntax trees do with our intuitive notion of how expressions ought to work, so looking at the shape of the Ezpr version of a solution might give better results than looking at the shape of the expression tree. For example, one might look at the number of nodes in the Ezpr or the depth of the Ezpr.


When trying to solve one of these puzzles, there are a few things I always try first. After adding up the four numbers, I then look for ways to make !!8·3, 6·4,!! or !!12·2!!; if that doesn't work I start branching out looking for something of the type !!ab\pm c!!.

Suppose we take a list of all solvable puzzles, and remove all the very easy ones: the puzzles where one of the inputs is zero, or where one of the inputs is 1 and there is a solution of the form !!E×1!!.

Then take the remainder and mark them as “easy” if they have solutions of the form !!a+b+c+d, 8·3, 6·4,!! or !!12·2!!. Also eliminate puzzles with solutions of the type !!E + (c - c)!! or !!E×\left(\frac cc\right)!!.

How many are eliminated in this way? Perhaps most? The remaining puzzles ought to have at least intermediate difficulty, and perhaps examining just those will suggest a way to separate them further into two or three ranks of difficulty.

I give up

But by this time I have solved so many twenty-four puzzles that I am no longer sure which ones are hard and which ones are easy. I suspect that I have seen and tried to solve most of the 466 solvable puzzles; certainly more than half. So my brain is no longer a reliable gauge of which puzzles are hard and which are easy.

Perhaps looking at puzzles with five inputs would work better for me now. These tend to be easy, because you have more to work with. But there are 2002 puzzles and probably some of them are hard.

Close, but no cigar

What's the closest you can get to 24 without hitting it exactly? The best I could do was !!5·5 - \frac89!!. Then I asked the computer, which confirmed that this is optimal, although I felt foolish when I saw the simpler solutions that are equally good: !!6·4 \pm\frac 19!!.

The paired solutions $$5 × \left(4 + \frac79\right) < 24 < 7 × \left(4 - \frac59\right)$$ are very handsome.

Phone app

The search program that tells us when a puzzle has solutions is only useful if we can take it with us in the car and ask it about license plates. A phone app is wanted. I built one with Code Studio.

Code Studio is great. It has a nice web interface, and beginners can write programs by dragging blocks around. It looks very much like MIT's scratch project, which is much better-known. But Code Studio is a much better tool than Scratch. In Scratch, once you reach the limits of what it can do, you are stuck, and there is no escape. In Code Studio when you drag around those blocks you are actually writing JavaScript underneath, and you can click a button and see and edit the underlying JavaScript code you have written.

Suppose you need to convert A to 1 and B to 2 and so on. Scratch does not provide an ord function, so with Scratch you are pretty much out of luck; your only choice is to write a 26-way if-else tree, which means dragging around something like 104 stupid blocks. In Code Studio, you can drop down the the JavaScript level and type in ord to use the standard ord function. Then if you go back to blocks, the ord will look like any other built-in function block.

In Scratch, if you want to use a data structure other than an array, you are out of luck, because that is all there is. In Code Studio, you can drop down to the JavaScript level and use or build any data structure available in JavaScript.

In Scratch, if you want to initialize the program with bulk data, say a precomputed table of the solutions of the 466 twenty-four puzzles, you are out of luck. In Code Studio, you can upload a CSV file with up to 1,000 records, which then becomes available to your program as a data structure.

In summary, you spend a lot of your time in Scratch working around the limitations of Scratch, and what you learn doing that is of very limited applicability. Code Studio is real programming and if it doesn't do exactly what you want out of the box, you can get what you want by learning a little more JavaScript, which is likely to be useful in other contexts for a long time to come.

Once you finish your Code Studio app, you can click a button to send the URL to someone via SMS. They can follow the link in their phone's web browser and then use the app.

Code Studio is what Scratch should have been. Check it out.


Thanks to everyone who contributed to this article, including:

  • my daughters Toph and Katara
  • Shreevatsa R.
  • Dr. Lindsey Kuper
  • Darius Bacon
  • everyone else who emailed me

[Other articles in category /math] permanent link

Mon, 21 Aug 2017

Recognizing when two arithmetic expressions are essentially the same

[ Warning: The math formatting in the RSS / Atom feed for this article is badly mutilated. I suggest you read the article on my blog. ]

In this article, I discuss “twenty-four puzzles”. The puzzle «4 6 7 9 ⇒ 24» means that one should take the numbers 4, 6, 7, and 9, and combine them with the usual arithmetic operations of addition, subtraction, multiplication, and division, to make the number 24. In this case the unique solution is !!6·\frac{7 + 9}{4}!!.

When the target number after the is 24, as it often is, we omit it and just write «4 6 7 9». Every example in this article has target number 24.

This is a continuation of my previous articles on this topic:

My first cut at writing a solver for twenty-four puzzles was a straightforward search program. It had a couple of hacks in it to cut down the search space by recognizing that !!a+E!! and !!E+a!! are the same, but other than that there was nothing special about it and I've discussed it before.

It would quickly and accurately report whether any particular twenty-four puzzle was solvable, but as it turned out that wasn't quite good enough. The original motivation for the program was this: Toph and I play this game in the car. Pennsylvania license plates have three letters and four digits, and if we see a license plate FBV 2259 we try to solve «2 2 5 9». Sometimes we can't find a solution and then we wonder: it is because there isn't one, or is it because we just didn't get it yet? So the searcher turned into a phone app, which would tell us whether there was solution, so we'd know whether to give up or keep searching.

But this wasn't quite good enough either, because after we would find that first solution, say !!2·(5 + 9 - 2)!!, we would wonder: are there any more? And here the program was useless: it would cheerfully report that there were three, so we would rack our brains to find another, fail, ask the program to tell us the answer, and discover to our disgust that the three solutions it had in mind were:

$$ 2 \cdot (5 + (9 - 2)) \\ 2 \cdot (9 + (5 - 2)) \\ 2 \cdot ((5 + 9) - 2) $$

The computer thinks these are different, because it uses different data structures to represent them. It represents them with an abstract syntax tree, which means that each expression is either a single constant, or is a structure comprising an operator and its two operand expressions—always exactly two. The computer understands the three expressions above as having these structures:

It's not hard to imagine that the computer could be taught to understand that the first two trees are equivalent. Getting it to recognize that the third one is also equivalent seems somewhat more difficult.

Commutativity and associativity

I would like the computer to understand that these three expressions should be considered “the same”. But what does “the same” mean? This problem is of a kind I particularly like: we want the computer to do something, but we're not exactly sure what that something is. Some questions are easy to ask but hard to answer, but this is the opposite: the real problem is to decide what question we want to ask. Fun!

Certainly some of the question should involve commutativity and associativity of addition and multiplication. If the only difference between two expressions is that one has !!a + b!! where the other has !!b + a!!, they should be considered the same; similarly !!a + (b + c)!! is the same expression as !!(a + b) + c!! and as !!(b + a) + c!! and !!b + (a + c)!! and so forth.

The «2 2 5 9» example above shows that commutativity and associativity are not limited to addition and multiplication. There are commutative and associative properties of subtraction also! For example, $$a+(b-c) = (a+b)-c$$ and $$(a+b)-c = (a-c)+b.$$ There ought to be names for these laws but as far as I know there aren't. (Sure, it's just commutativity and associativity of addition in disguise, but nobody explaining these laws to school kids ever seems to point out that subtraction can enter into it. They just observe that !!(a-b)-c ≠ a-(b-c)!!, say “subtraction isn't associative”, and leave it at that.)

Closely related to these identities are operator inversion identities like !!a-(b+c) = (a-b)-c!!, !!a-(b-c) = (a-b)+c!!, and their multiplicative analogues. I don't know names for these algebraic laws either.

One way to deal with all of this would to build a complicated comparison function for abstract syntax trees that tried to transform one tree into another by applying these identities. A better approach is to recognize that the data structure is over-specified. If we want the computer to understand that !!(a + b) + c!! and !!a + (b + c)!! are the same expression, we are swimming upstream by using a data structure that was specifically designed to capture the difference between these expressions.

Instead, I invented a data structure, called an Ezpr (“Ez-pur”), that can represent expressions, but in a somewhat more natural way than abstract syntax trees do, and in a way that makes commutativity and associativity transparent.

An Ezpr has a simplest form, called its “canonical” or “normal” form. Two Ezprs represent essentially the same mathematical expression if they have the same canonical form. To decide if two abstract syntax trees are the same, the computer converts them to Ezprs, simplifies them, and checks to see if resulting canonical forms are identical.

The Ezpr

Since associativity doesn't matter, we don't want to represent it. When we (humans) think about adding up a long column of numbers, we don't think about associativity because we don't add them pairwise. Instead we use an addition algorithm that adds them all at once in a big pile. We don't treat addition as a binary operation; we normally treat it as an operator that adds up the numbers in a list. The Ezpr makes this explicit: its addition operator is applied to a list of subexpressions, not to a pair. Both !!a + (b + c)!! and !!(a + b) + c!! are represented as the Ezpr

    SUM [ a b c - ]

which just says that we are adding up !!a!!, !!b!!, and !!c!!. (The - sign is just punctuation; ignore it for now.)

Similarly the Ezpr MUL [ a b c ÷ ] represents the product of !!a!!, !!b!!, and !!c!!. (Please ignore the ÷ sign for the time being.)

To handle commutativity, we want those [ a b c ] lists to be bags. Perl doesn't have a built-in bag object, so instead I used arrays and required that the array elements be in sorted order. (Exactly which sorted order doesn't really matter.)

Subtraction and division

This doesn't yet handle subtraction and division, and the way I chose to handle them is the only part of this that I think is at all clever. A SUM object has not one but two bags, one for the positive and one for the negative part of the expression. An expression like !!a - b + c - d!! is represented by the Ezpr:

SUM [ a c - b d ]

and this is also the representation of !!a + c - b - d!!, of !!c + a - d - b!!, of !!c - d+ a-b!!, and of any other expression of the idea that we are adding up !!a!! and !!c!! and then deducting !!b!! and !!d!!. The - sign separates the terms that are added from those that are subtracted.

Either of the two bags may be empty, so for example !!a + b!! is just SUM [ a b - ].

Division is handled similarly. Here conventional mathematical notation does a little bit better than in the sum case: MUL [ a c ÷ b d ] is usually written as !!\frac{ac}{bd}!!.

Ezprs handle the associativity and commutativity of subtraction and division quite well. I pointed out earlier that subtraction has an associative law !!(a + b) - c = a + (b - c)!! even though it's not usually called that. No code is required to understand that those two expressions are equal if they are represented as Ezprs, because they are represented by completely identical structures:

        SUM [ a b - c ]

Similarly there is a commutative law for subtraction: !!a + b - c = a - c + b!! and once again that same Ezpr does for both.

Ezpr laws

Ezprs are more flexible than binary trees. A binary tree can represent the expressions !!(a+b)+c!! and !!a+(b+c)!! but not the expression !!a+b+c!!. Ezprs can represent all three and it's easy to transform between them. Just as there are rules for building expressions out of simpler expressions, there are a few rules for combining and manipulating Ezprs.

Lifting and flattening

The most important transformation is lifting, which is the Ezpr version of the associative law. In the canonical form of an Ezpr, a SUM node may not have subexpressions that are also SUM nodes. If you have

  SUM [ a SUM [ b c - ] - … ]

you should lift the terms from the inner sum into the outer one:

  SUM [ a b c - … ]

effectively transforming !!a+(b+c)!! into !!a+b+c!!. More generally, in

   SUM [ a SUM [ b - c ]
       - d SUM [ e - f ] ]

we lift the terms from the inner Ezprs into the outer one:

   SUM [ a b f - c d e ]

This effectively transforms !!a + (b - c) - d - (e - f))!! to !!a + b + f - c - d - e!!.

Similarly, when a MUL node contains another MUL, we can flatten the structure.

Say we are converting the expression !!7 ÷ (3 ÷ (6 × 4))!! to an Ezpr. The conversion function is recursive and the naïve version computes this Ezpr:

      MUL [ 7 ÷ MUL [ 3 ÷ MUL [ 6 4 ÷ ] ] ]

But then at the bottom level we have a MUL inside a MUL, so the 4 and 6 in the innermost MUL are lifted upward:

      MUL [ 7 ÷ MUL [ 3 ÷ 6 4 ] ]

which represents !!\frac7{\frac{3}{6\cdot 4}}!!. Then again we have a MUL inside a MUL, and again the subexpressions of the innermost MUL can be lifted:

      MUL [ 7 6 4 ÷ 3 ]

which we can imagine as !!\frac{7·6·4}3!!.

The lifting only occurs when the sub-node has the same type as its parent; we may not lift terms out of a MUL into a SUM or vice versa.

Trivial nodes

The Ezpr SUM [ a - ] says we are adding up just one thing, !!a!!, and so it can be eliminated and replaced with just !!a!!. Similarly SUM [ - a ] can be replaced with the constant !!-a!!, if !!a!! is a constant. MUL can be handled similarly.

An even simpler case is SUM [ - ] which can be replaced by the constant 0; MUL [ ÷ ] can be replaced with 1. These sometimes arise as a result of cancellation.


Consider the puzzle «3 3 4 6». My first solver found 49 solutions to this puzzle. One is !!(3 - 3) + (4 × 6)!!. Another is !!(4 + (3 - 3)) × 6!!. A third is !!4 × (6 + (3 - 3))!!.

I think these are all the same: the solution is to multiply the 4 by the 6, and to get rid of the threes by subtracting them to make a zero term. The zero term can be added onto the rest of expression or to any of its subexpressions—there are ten ways to do this—and it doesn't really matter where.

This is easily explained in terms of Ezprs: If the same subexpression appears in both of a node's bags, we can drop it. For example, the expression !!(4 + (3 -3)) × 6!! starts out as

    MUL [ 6 SUM [ 3 4 - 3 ] ÷ ]

but the duplicate threes in SUM [ 3 4 - 3 ] can be canceled, to leave

    MUL [ 6 SUM [ 4 - ] ÷ ]

The sum is now trivial, as described in the previous section, so can be eliminated and replaced with just 4:

    MUL [ 6 4 ÷ ]

This Ezpr records the essential feature of each of the three solutions to «3 3 4 6» that I mentioned: they all are multiplying the 6 by the 4, and then doing something else unimportant to get rid of the threes.

Another solution to the same puzzle is !!(6 ÷ 3) × (4 × 3)!!. Mathematically we would write this as !!\frac63·4·3!! and we can see this is just !!6×4!! again, with the threes gotten rid of by multiplication and division, instead of by addition and subtraction. When converted to an Ezpr, this expression becomes:

    MUL [ 6 4 3 ÷ 3 ]

and the matching threes in the two bags are cancelled, again leaving

    MUL [ 6 4 ÷ ]

In fact there aren't 49 solutions to this puzzle. There is only one, with 49 trivial variations.

Identity elements

In the preceding example, many of the trivial variations on the !!4×6!! solution involved multiplying some subexpression by !!\frac 33!!. When one of the input numbers in the puzzle is a 1, one can similarly obtain a lot of useless variations by choosing where to multiply the 1.

Consider «1 3 3 5»: We can make 24 from !!3 × (3 + 5)!!. We then have to get rid of the 1, but we can do that by multiplying it onto any of the five subexpressions of !!3 × (3 + 5)!!:

$$ 1 × (3 × (3 + 5)) \\ (1 × 3) × (3 + 5) \\ 3 × (1 × (3 + 5)) \\ 3 × ((1 × 3) + 5) \\ 3 × (3 + (1×5)) $$

These should not be considered different solutions. Whenever we see any 1's in either of the bags of a MUL node, we should eliminate them. The first expression above, !!1 × (3 × (3 + 5))!!, is converted to the Ezpr

 MUL [ 1 3 SUM [ 3 5 - ] ÷ ]

but then the 1 is eliminated from the MUL node leaving

 MUL [ 3 SUM [ 3 5 - ] :- ]

The fourth expression, !!3 × ((1 × 3) + 5)!!, is initially converted to the Ezpr

 MUL [ 3 SUM [ 5 MUL [ 1 3 ÷ ] - ] ÷ ]

When the 1 is eliminated from the inner MUL, this leaves a trivial MUL [ 3 ÷ ] which is then replaced with just 3, leaving:

 MUL [ 3 SUM [ 5 3 - ] ÷ ]

which is the same Ezpr as before.

Zero terms in the bags of a SUM node can similarly be dropped.

Multiplication by zero

One final case is that MUL [ 0 … ÷ … ] can just be simplified to 0.

The question about what to do when there is a zero in the denominator is a bit of a puzzle. In the presence of division by zero, some of our simplification rules are questionable. For example, when we have MUL [ a ÷ MUL [ b ÷ c ] ], the lifting rule says we can simplify this to MUL [ a c ÷ b ]—that is, that !!\frac a{\frac bc} = \frac{ac}b!!. This is correct, except that when !!b=0!! or !!c=0!! it may be nonsense, depending on what else is going on. But since zero denominators never arise in the solution of these puzzles, there is no issue in this application.


The Ezpr module is around 200 lines of Perl code, including everything: the function that converts abstract syntax trees to Ezprs, functions to convert Ezprs to various notations (both MUL [ 4 ÷ SUM [ 3 - 2 ] ] and 4 ÷ (3 - 2)), and the two versions of the normalization process described in the previous section. The normalizer itself is about 35 lines.

Associativity is taken care of by the Ezpr structure itself, and commutativity is not too difficult; as I mentioned, it would have been trivial if Perl had a built-in bag structure. I find it much easier to reason about transformations of Ezprs than abstract syntax trees. Many operations are much simpler; for example the negation of SUM [ A - B ] is simply SUM [ B - A ]. Pretty-printing is also easier because the Ezpr better captures the way we write and think about expressions.

It took me a while to get the normalization tuned properly, but the results have been quite successful, at least for this problem domain. The current puzzle-solving program reports the number of distinct solutions to each puzzle. When it reports two different solutions, they are really different; when it fails to support the exact solution that Toph or I found, it reports one essentially the same. (There are some small exceptions, which I will discuss below.)

Since there is no specification for “essentially the same” there is no hope of automated testing. But we have been using the app for several months looking for mistakes, and we have not found any. If the normalizer failed to recognize that two expressions were essentially similar, we would be very likely to notice: we would be solving some puzzle, be unable to find the last of the solutions that the program claimed to exist, and then when we gave up and saw what it was we would realize that it was essentially the same as one of the solutions we had found. I am pretty confident that there are no errors of this type, but see “Arguable points” below.

A harder error to detect is whether the computer has erroneously conflated two essentially dissimilar expressions. To detect this we would have to notice that an expression was missing from the computer's solution list. I am less confident that nothing like this has occurred, but as the months have gone by I feel better and better about it.

I consider the problem of “how many solutions does this puzzle really have to have?” been satisfactorily solved. There are some edge cases, but I think we have identified them.

Code for my solver is on Github. The Ezpr code is in the Ezpr package in the file. This code is all in the public domain.

Some examples

The original program claims to find 35 different solutions to «4 6 6 6». The revised program recognizes that these are of only two types:

!!4 × 6 × 6 ÷ 6!!MUL [ 4 6 - ]
!!(6 - 4) × (6 + 6)!!MUL [ SUM [ 6 - 4 ] SUM [ 6 6 - ] ÷ ]

Some of the variant forms of the first of those include:

$$ 6 × (4 + (6 - 6)) \\ 6 + ((4 × 6) - 6) \\ (6 - 6) + (4 × 6) \\ (6 ÷ 6) × (4 × 6) \\ 6 ÷ ((6 ÷ 4) ÷ 6) \\ 6 ÷ (6 ÷ (4 × 6)) \\ 6 × (6 × (4 ÷ 6)) \\ (6 × 6) ÷ (6 ÷ 4) \\ 6 ÷ ((6 ÷ 6) ÷ 4) \\ 6 × (6 - (6 - 4)) \\ 6 × (6 ÷ (6 ÷ 4)) \\ \ldots

In an even more extreme case, the original program finds 80 distinct expressions that solve «1 1 4 6», all of which are trivial variations on !!4·6!!.

Of the 715 puzzles, 466 (65%) have solutions; for 175 of these the solution is unique. There are 3 puzzles with 8 solutions each («2 2 4 8», «2 3 6 9», and «2 4 6 8»), one with 9 solutions («2 3 4 6»), and one with 10 solutions («2 4 4 8»).

The 10 solutions for «2 4 4 8» are as follows:

!!4 × 8 - 2 × 4 !!SUM [ MUL [ 4 8 ÷ ] - MUL [ 2 4 ÷ ] ]
!!4 × (2 + 8 - 4) !!MUL [ 4 SUM [ 2 8 - 4 ] ÷ ]
!!(8 - 4) × (2 + 4) !!MUL [ SUM [ 8 - 4 ] SUM [ 2 4 - ] ÷ ]
!!4 × (4 + 8) ÷ 2 !!MUL [ 4 SUM [ 4 8 - ] ÷ 2 ]
!!(4 - 2) × (4 + 8) !!MUL [ SUM [ 4 - 2 ] SUM [ 4 8 - ] ÷ ]
!!8 × (2 + 4/4) !!MUL [ 8 SUM [ 1 2 - ] ÷ ]
!!2 × 4 × 4 - 8 !!SUM [ MUL [ 2 4 4 ÷ ] - 8 ]
!!8 + 2 × (4 + 4) !!SUM [ 8 MUL [ 2 SUM [ 4 4 - ] ÷ ] - ]
!!4 + 4 + 2 × 8 !!SUM [ 4 4 MUL [ 2 8 ÷ ] - ]
!!4 × (8 - 4/2) !!MUL [ 4 SUM [ 8 - MUL [ 4 ÷ 2 ] ] ÷ ]

A complete listing of every essentially different solution to every «a b c d» puzzle is available here. There are 1,063 solutions in all.

Arguable points

There are a few places where we have not completely pinned down what it means for two solutions to be essentially the same; I think there is room for genuine disagreement.

  1. Any solution involving !!2×2!! can be changed into a slightly different solution involving !!2+2!! instead. These expressions are arithmetically different but numerically equal. For example, I mentioned earlier that «2 2 4 8» has 8 solutions. But two of these are !! 8 + 4 × (2 + 2)!! and !! 8 + 4 × 2 × 2!!. I am willing to accept these as essentially different. Toph, however, disagrees.

  2. A similar but more complex situation arises in connection with «1 2 3 7». Consider !!3×7+3!!, which equals 24. To get a solution to «1 2 3 7», we can replace either of the threes in !!3×7+3!! with !!(1+2)!!, obtaining !!((1 + 2) × 7) + 3!! or !! (3×7)+(1 +2)!!. My program considers these to be different solutions. Toph is unsure.

It would be pretty easy to adjust the normalization process to handle these the other way if the user wanted that.

Some interesting puzzles

«1 2 7 7» has only one solution, quite unusual. (Spoiler) «2 2 6 7» has two solutions, both somewhat unusual. (Spoiler)

Somewhat similar to «1 2 7 7» is «3 9 9 9» which also has an unusual solution. But it has two other solutions that are less surprising. (Spoiler)

«1 3 8 9» has an easy solution but also a quite tricky solution. (Spoiler)

One of my neighbors has the license plate JJZ 4631. «4 6 3 1» is one of the more difficult puzzles.

What took so long?

Back in March, I wrote:

I have enough material for at least three or four more articles about this that I hope to publish here in the coming weeks.

But the previous article on this subject ended similarly, saying

I hope to write a longer article about solvers in the next week or so.

and that was in July 2016, so don't hold your breath.

And here we are, five months later!

This article was a huge pain to write. Sometimes I sit down to write something and all that comes out is dreck. I sat down to write this one at least three or four times and it never worked. The tortured Git history bears witness. In the end I had to abandon all my earlier drafts and start over from scratch, writing a fresh outline in an empty file.

But perseverance paid off! WOOOOO.

[ Addendum 20170825: I completely forgot that Shreevatsa R. wrote a very interesting article on the same topic as this one, in July of last year soon after I published my first article in this series. ]

[ Addendum 20170829: A previous version of this article used the notations SUM [ … # … ] and MUL [ … # … ], which I said I didn't like. Zellyn Hunter has persuaded me to replace these with SUM [ … - … ] and MUL [ … ÷ … ]. Thank you M. Hunter! ]

[ Yet more on this topic! ]

[Other articles in category /math] permanent link

Tue, 08 Aug 2017

That time I met Erdős

I should have written about this sooner, by now it has been so long that I have forgotten most of the details.

I first encountered Paul Erdős in the middle 1980s at a talk by János Pach about almost-universal graphs. Consider graphs with a countably infinite set of vertices. Is there a "universal" graph !!G!! such that, for any finite or countable graph !!H!!, there is a copy of !!H!! inside of !!G!!? (Formally, this means that there is an injection from the vertices of !!H!! to the vertices of !!G!! that preserves adjacency.) The answer is yes; it is quite easy to construct such a !!G!! and in fact nearly all random graphs have this property.

But then the questions become more interesting. Let !!K_\omega!! be the complete graph on a countably infinite set of vertices. Say that !!G!! is “almost universal” if it includes a copy of !!H!! for every finite or countable graph !!H!! except those that contain a copy of !!K_\omega!!. Is there an almost universal graph? Perhaps surprisingly, no! (Sketch of proof.)

I enjoyed the talk, and afterward in the lobby I got to meet Ron Graham and Joel Spencer and talk to them about their Ramsey theory book, which I had been reading, and about a problem I was working on. Graham encouraged me to write up my results on the problem and submit them to Mathematics Magazine, but I unfortunately never got around to this. Graham was there babysitting Erdős, who was one of Pách's collaborators, but I did not actually talk to Erdős at that time. I think I didn't recognize him. I don't know why I was able to recognize Graham.

I find the almost-universal graph thing very interesting. It is still an open research area. But none of this was what I was planning to talk about. I will return to the point. A couple of years later Erdős was to speak at the University of Pennsylvania. He had a stock speech for general audiences that I saw him give more than once. Most of the talk would be a description of a lot of interesting problems, the bounties he offered for their solutions, and the progress that had been made on them so far. He would intersperse the discussions with the sort of Erdősism that he was noted for: referring to the U.S. and the U.S.S.R. as “Sam” and “Joe” respectively; his ever-growing series of styles (Paul Erdős, P.G.O.M., A.D., etc.) and so on.

One remark I remember in particular concerned the $3000 bounty he offered for proving what is sometimes known as the Erdős-Túran conjecture: if !!S!! is a subset of the natural numbers, and if !!\sum_{n\in S}\frac 1n!! diverges, then !!S!! contains arbitrarily long arithmetic progressions. (A special case of this is that the primes contain arbitrarily long arithmetic progressions, which was proved in 2004 by Green and Tao, but which at the time was a long-standing conjecture.) Although the $3000 was at the time the largest bounty ever offered by Erdős, he said it was really a bad joke, because to solve the problem would require so much effort that the per-hour payment would be minuscule.

I made a special trip down to Philadelphia to attend the talk, with the intention of visiting my girlfriend at Bryn Mawr afterward. I arrived at the Penn math building early and wandered around the halls to kill time before the talk. And as I passed by an office with an open door, I saw Erdős sitting in the antechamber on a small sofa. So I sat down beside him and started telling him about my favorite graph theory problem.

Many people, preparing to give a talk to a large roomful of strangers, would have found this annoying and intrusive. Some people might not want to talk about graph theory with a passing stranger. But most people are not Paul Erdős, and I think what I did was probably just the right thing; what you don't do is sit next to Erdős and then ask how his flight was and what he thinks of recent politics. We talked about my problem, and to my great regret I don't remember any of the mathematical details of what he said. But he did not know the answer offhand, he was not able solve it instantly, and he did say it was interesting. So! I had a conversation with Erdős about graph theory that was not a waste of his time, and I think I can count that as one of my lifetime accomplishments.

After a little while it was time to go down to the auditorium for the the talk, and afterward one of the organizers saw me, perhaps recognized me from the sofa, and invited me to the guest dinner, which I eagerly accepted. At the dinner, I was thrilled because I secured a seat next to Erdős! But this was a beginner mistake: he fell asleep almost immediately and slept through dinner, which, I learned later, was completely typical.

[Other articles in category /math] permanent link