The Universe of Discourse


Wed, 01 Sep 2021

On having the right theorem

A recent Math StackExchange question asks “Prove every permutation of the alphabet contains a subset of six letters in order”. That is, you take a string of length 26 that contains each letter once; you can find a subsequence of six letters that is either increasing or decreasing.

Choosing a permutation at random, suppose we have:

    q y x u l i n g w o c k j d r p f v t s h e a z b m

Then the sequence x w v t s e b has length 7 and is in descending order. Or:

    t h e q u i c k b r o w n f x j m p s v r l z y d g

This contains the ascending sequence h q r s v y. Also the descending sequence t q o n m l d.

I thought about this for a while but couldn't make any progress. But OP had said “I know I have to construct a partially ordered set and possibly use Dilworth's Theorem…” so I looked up Dilworth's theorem.

It turns out I did actually know Dilworth's theorem. It is about partially-ordered sets. Dilworth's theorem says that if you partition a partially-ordered set into totally-ordered subsets, called ‘chains’, then the number of such chains is at least as big as the size of the largest “antichain”. An antichain is a subset in which no two elements are comparable.

This was enough of a hint, and I found the solution pretty quickly after that. Say that !!S[i]!! is the position of letter !!i!! in the string !!S!!. Define the partial order !!\prec!!: $$ i\prec j \qquad \equiv \qquad i < j \text{ and } S[i] < S[j] $$

That is, !!i\prec j!! means that !!i!! is alphabetically earlier than !!j!! and its position in !!S!! is to the left of !!j!!. This is obviously a partial ordering of the letters of the alphabet. Chains under !!\prec!! are, by definition, ascending sequences of letters from !!S!!. It's easy to show that antichains are descending sequences.

Partition !!S!! into chains. If any chain has length !!6!! or more, that is an ascending sequence of letters and we win.

If not, then no chain has more than 5 letters, and so there must be at least !!6!! chains, because !!5·5=25!! and there are !!26!! letters. Since there are at least !!6!! chains, Dilworth's theorem tells us there is an antichain of at least !!6!! letters, and hence a descending sequence of that length, and we win.

Once you have the idea to use Dilworth's theorem, the problem collapses. (You also have to invent the right !!\prec!! relation, but there's really only one possible choice.)

Maybe I'll write a followup article about how just having a theorem doesn't necessarily mean you have an algorithm. Mathematicians will say “partition !!S!! into chains,” but actually programming something like that might be nontrivial. Then finding the antichain among the chains might also be nontrivial.


[Other articles in category /math] permanent link

Sat, 28 Aug 2021

Mo in an alternate universe

In a transparent attempt to capitalize on the runaway success of The Wonderful Wizard of Oz, the publishers of L. Frank Baum's earlier book A New Wonderland re-released it under the title The Magical Monarch of Mo. What if this ploy had actually worked? Would the book have inspired a movie?

We're off to see the Monarch,
The Marvelous Monarch of Mo…

Naah, it kinda falls apart after that.


[Other articles in category /book] permanent link

How to fix hiring?

On Twitter, Mike Coutermarsh suggested:

Job interview: “algorithms”

Reality: “Turn a 127 message deep slack thread between 5 engineers into a decision”

I suppose this was meant facetiously but I think it might contain the germ of a good idea.

Applicants are usually given timed a programming quiz. What if instead, the candidate was supplied with the 127-message Slack thread and given 24 hours to write up a proposal document? I honestly think this might produce good results.

Such a submission would be extremely probative of the candidate's talents and abilities, including:

  • reading and understanding technical arguments
  • balancing engineering tradeoffs
  • foreseeing potential issues
  • writing clear English
  • planning
  • seriousness
  • writing coherent, well-organized, and persuasive documents

It is much more difficult to cheat on this task than on a typical programming exercise. The candidate certainly can't submit a prewritten essay that they found somewhere; that would be easy to detect. A candidate who can take someone else's prewritten essay and quickly rewrite it to plausibly appear original is probably quite well-qualified on many of the important metrics! (Plus an additional important one: the ability to do research. They had to locate, recognize, and read the essay they rewrote.)

It shouldn't be hard to change up the essay topic periodically, since the engineers will be producing several of those 127-message Slack threads every month. This also tends to impede cheating.

When a good candidate comes for an in-person interview, you have a ready-made topic of conversation. Instead of coding at the whiteboard, you can ask them to discuss their proposal.

Complaints that this would discriminate against candidates with poor command of English do not hold water. Good command of English is one of the job requirements, and the whole point of a job interview is to discriminate against unqualified candidates. Besides, if the hiring process encourages candidates to improve their English writing abilities, rather than cramming a bunch of red-black-tree algorithms, language trivia, or irrelevant brainteasters, so much the better for everyone.


[Other articles in category /misc] permanent link

Why is the S combinator an S?

The most important combinator in combinatory logic is the !!S!! combinator, defined simply:

$$ S x y z ⇒ (x z)(y z) $$

or in !!\lambda!!-calculus terms:

$$ S = \lambda x y z. (x z)(y z). $$

A wonderful theorem states that any !!\lambda!!-expression with no free variables can be converted into a combinator expression that contains only the combinators !!S, K,!! and !!I!!, where !!S!! is really the only interesting one of the three, !!I!! being merely the identity function, and !!K!! a constructor of constant functions:

$$ \begin{align} I x & = x \\ K x y & = x \\ \end{align} $$

In fact one can get along without !!I!! since !!S K K = I!!.

A not-too-infrequently-asked question is why the three combinators are named as they are. The !!I!! is an identity function and pretty obvious stands for “identity”.

Similarly the !!K!! constructs constant functions: !!K x!! is the combinator which ignores its argument and yields !!x!!. So it's not hard to imagine that !!K!! is short for Konstant, which is German for “constant”; no mystery there.

But why !!S!!? People typically guess that it stands for “substitution”, the idea being that if you have some application $$A\,B$$ then !!S!! allows one to substitute some term !!T!! for a free variable !!v!! in both !!A!! and !!B!! prior to the application:

$$ S\, A\, B\, T = A[v/T]\, B[v/T]. $$

Although this seems plausible, it's not correct.

Combinatory logic was introduced in a 1924 paper of Moses Schönfinkel. In it, he defines a family of combinators including the standard !!S!!, !!K!!, and !!I!!; he shows that only !!S!! and !!K!! are required. His initial set of combinators comprises the following:

$$ \begin{array}{cllrl} I & \textit{Identitätsfunktion} & \text{“identity function”}& I\,x =& x \\ C & \textit{Konstanzfunktion} & \text{“constancy function”} & C\,x\,y =& x \\ T & \textit{Vertauschungsfunktion} & \text{“swap function”} & T\,x\,y\,z=& x\,z\,y \\ Z & \textit{Zusammensetzungsfunktion} & \text{“composition function”} & Z\,x\,y\,z=& x\,(y\,z) \\ S & \textit{Verschmelzungsfunktion} & \text{“fusion function”} & S\,x\,y\,z=& x\,z\,(y\,z) \end{array} $$

(Schönfinkel also had combinators representing logical operations (one corresponding to the Sheffer stroke, which had been discovered in 1913), and to quantification, but those don't concern us right now.)

!!T!! and !!Z!! are now usually called !!C!! and !!B!!. These names probably originated in Curry's Grundlagen der kombinatorischen Logik (1930). Curry 1930 is probably also the origin of the change from !!C!! to !!K!!. I have no idea why Schönfinkel chose to abbreviate Konstanzfunktion as !!C!! instead of !!K!!. Curry notes that for !!I, K, B, C, S!! Schönfinkel has !!I, C, Z, T, S!!, but does not explain his changes. In Curry and Feys’ influential 1958 book on combinatory logic, the !!B!! and !!C!! combinators given names that are are literal translations of Schönfinkel's: “elementary permutator” and “elementary compositor”.

Returning to the !!S!! combinator, one sees that its German name in Schönfinkel's paper, Verschmelzungsfunktion, begins with the letter V, but so does Vertauschungsfunktion, so abbreviating either with V would have been ambiguous. Schönfinkel instead chose to abbreviate Verschmelzungsfunktion with S for its root schmelzen, “fusion”, and Vertauschungsfunktion with T for its root tauschen, “swap”. The word schmelzen is akin to English words “melt” and “smelt”.

The “swap” is straightforward: the !!T!! combinator swaps the order of the arguments to !!x!! in !!x\,y\,z!!: $$T\,x\,y\,z = x\,z\,y$$ but does not otherwise alter the structure of the expression.

But why is !!S!! the “melting” or “fusion” combinator? It's because Schönfinkel was interested in reducing abitrary mathematical expressions to combinators. He will sometimes have an application !!(f\, x)(g\, x)!! and he wants to ‘fuse’ the two occurrences of !!x!!. He can do this by rewriting the expression as !!S\, f\, g\, x!!. Schönfinkel says:

Der praktische Nutzen der Function !!S!! besteht ersichtlich darin, daß sie es ermöglicht, mehrmals auftresnde Veränderliche — und bis zu einem gewissen Grade auch individuelle Functionen — nur einmal auftreten zu lassen.

Clearly, the practical use of the function !!S!! will be to enable us to reduce the number of occurrences of a variable — and to some extent also of a particular function — from several to a single one.

(Translation from van Heijenoort, p. 362.)

So there you have it: the !!S!! combinator is so-named not for substitution, but because S is the first letter of schmelzen, ‘to fuse’.

References

  • Schönfinkel, M. “Über die Bausteine der mathematischen Logik” (“On the building-blocks of mathematical logic”), Mathematische Annalen (1969), p. 305–316; Berlin, Göttingen, Heidelberg.

    English translation in Van Heijenoort, Jean (ed.) From Frege to Gödel: a Source Book in Mathematical Logic, 1879–1931 (1967) pp. 355–366 Harvard University Press; Cambridge and London.

  • Curry, H.B. “Grundlagen der kombinatorischen Logik” (“Fundamentals of combinatory logic”), _American Journal of Mathematics Vol. 52, No. 3 (Jul., 1930), pp. 509-536.

  • Curry, H.B. and Robert Feys Combinatory Logic (1958) p. 152 North-Holland Publishing Company, Amsterdam.


[Other articles in category /math] permanent link

Thu, 22 Jul 2021

The convergents of 2x

Take some real number !!\alpha!! and let its convergents be !!c_0, c_1, c_2, \ldots!!. Now consider the convergents of !!2\alpha!!. Sometimes they will include !!2c_0, 2c_1, 2c_2, \ldots!!, sometimes only some of these.

For example, the convergents of !!\pi!! and !!2\pi!! are

$$ \begin{array}{rlc} \pi & \approx & \color{darkblue}{3},&&& \color{darkblue}{\frac{22}{7}}, & \color{darkblue}{\frac{333}{106}}, && \color{darkblue}{\frac{355}{113}}, & \color{darkblue}{\frac{103993}{33102}}, && \frac{104348}{33215}, & \color{darkblue}{\frac{208341}{66317}}, & \ldots \\ 2\pi & \approx & \color{darkblue}{6}, & \frac{19}{3}, & \frac{25}{4}, & \color{darkblue}{\frac{44}{7}}, & \color{darkblue}{\frac{333}{53}}, & \frac{377}{60}, & \color{darkblue}{\frac{710}{113}}, & \color{darkblue}{\frac{103393}{16551}}, & \frac{312689}{49766}, && \color{darkblue}{\frac{416682}{66317}}, & \ldots \end{array}
$$

Here are the analogous lists for !!\frac{1+\sqrt{5}}2!! and !!1+\sqrt5!!:

$$ \begin{array}{rlc} \frac12{1+\sqrt{5}}& \approx & 1, & 2, & \color{darkblue}{\frac32}, & \frac53, & \frac85, & \color{darkblue}{\frac{13}8}, & \frac{21}{13}, & \frac{34}{21}, & \color{darkblue}{\frac{55}{34}}, & \frac{89}{55}, & \frac{144}{89}, & \color{darkblue}{\frac{233}{144}}, & \frac{377}{233}, &\frac{610}{377} , & \color{darkblue}{\frac{987}{610} }, & \ldots \\ 1+\sqrt{5} & \approx & & & \color{darkblue}{3}, &&& \color{darkblue}{\frac{13}4}, &&& \color{darkblue}{\frac{55}{17}}, &&& \color{darkblue}{\frac{233}{72}}, &&& \color{darkblue}{\frac{987}{305}}, & \ldots \end{array} $$

This time all the convergents in the second list are matched by convergents in the first list. The number !!\frac{1+\sqrt5}{2}!! is notorious because it's the real number whose convergents converge the most slowly. I'm surprised that !!1+\sqrt5!! converges so much more quickly; I would not have expected the factor of 2 to change the situation so drastically.

I haven't thought about this at all yet, but it seems to me that a promising avenue would be to look at what Gosper's algorithm would do for the case !!x\mapsto 2x!! and see what simplifications can be done. This would probably produce some insight, and maybe a method for constructing a number !!\alpha!! so that all the convergents of !!2\alpha!! are twice those of !!\alpha!!.


[Other articles in category /math] permanent link

Fri, 09 Jul 2021

“Forensic” doesn't mean what I thought it did

Last week at work we released bad code, which had somehow survived multiple reviews. I was very interested in finding out how this happened, dug into the Git history to find out, and wrote a report. Originally I titled the report something like “Forensic analysis of Git history” (and one of my co-workers independently referred to the investigation as forensic) but then I realized I wasn't sure what “forensic” meant. I looked it up, and learned it was the wrong word.

A forensic analysis is one performed in the service of a court or court case. “Forensic” itself is from Latin forum, which is a public assembly place where markets were held and court cases were heard.

Forensic medicine is medicine in service of a court case, for example to determine a cause of death. For this reason it often refers to a postmortem examination, and I thought that “forensic” meant a postmortem or other retrospective analysis. That was the sense I intended it. But no. I had written a postmortem analysis, but not a forensic one.


[Other articles in category /lang/etym] permanent link

Thu, 08 Jul 2021

A simple dice-throwing game that seems hard to play

I ran into a fun math problem yesterday, easy to ask, easy to understand, but somewhat open-ended and seems to produce fairly complex behavior. It might be a good problem for a bright high school student to tinker with.

Consider the following one-player game. You start with a total of n points. On each turn, you choose to throw either a four-, six-, or eight-sided die, and then subtract the number thrown from your point total. The game continues until your total reaches zero (and you win) or goes below zero (and you lose).

This game seems surprisingly difficult to analyze. The computer analysis is quite easy, but what I mean is, if someone comes to you offering to pay you a dollar if you can win starting with !!n=9!! points, and it would be spoilsportish to say “just wait here for half an hour while I write this computer program”, what's your good move?

Is there even a way to make an educated guess, short of doing a full analysis? The !!n≤4!! strategy is obvious, but even for !!n=5!! you need to start calculating: rolling the d4 is safe. Rolling the d6 gives you a chance of wiping out, but also a chance of winning instantly; is that an improvement? (Spoiler: it is, quite substantially so! Your chance of winning increases from !!36\%!! to !!40.7\%!!.)

With the game as described, and optimal play, the probability of winning approaches !!45.66\%!! as the number of points increases, and the strategy is not simple: the best strategy for !!n≤20!! uses the d4 in 13 cases, the d6 in 4 cases, and the d8 in 3 cases:

$$\begin{array}{rcl} n & \text{Best play} & \text{Win %} \\ \hline 1 & 4\quad & 25.00\% \\ 2 & 4\quad & 31.25 \\ 3 & 4\quad & 39.06 \\ 4 & 4\quad & 48.83 \\ \hline 5 & 6 & 40.69\% \\ 6 & 6 & 47.47 \\ 7 & 4\quad & 44.01 \\ 8 & \quad8 & 47.04 \\ \hline 9 & 4\quad & 44.80\% \\ 10 & 4\quad & 45.83 \\ 11 & 4\quad & 45.42 \\ 12 & 4\quad & 45.77 \\ \hline 13 & 6 & 45.48 \% \\ 14 & \quad8 & 45.73 \\ 15 & 4\quad & 45.60 \\ 16 & \quad8 & 45.71 \\ \hline 17 & 4\quad & 45.63 \% \\ 18 & 4\quad & 45.67 \\ 19 & 4\quad & 45.65 \\ 20 & 6 & 45.67 \end{array} $$

It seems fairly clear (and not hard to prove) that when the die with fewest sides has !!d!! sides, the good numbers of points are multiples of !!d!!, with !!kd+1!! somewhat worse, and then !!kd+2, kd+3, \ldots!! generally better and better to the next peak at !!kd+d!!. But there are exceptions: even if !!d!! is not the smallest die, if you have a !!d!!-sided die, it is good to have !!d!! points, and when you do you should roll the !!d!!-sided die.

I did get a little more insight after making the chart above and seeing the 4-periodicity. In a comment on my Math SE post I observed:

There is a way to see quickly that the d4 is better for !!n=7!!. !!n=1!! is the worst possible position. !!n=2,3,!! and !!4!! are increasingly good; !!4!! is best because you can't lose and you might win outright. After that !!5!! is bad again, but not as bad as !!1!!, with !!6,7,8!! increasingly good. The pattern continues this way, with !!4k−3,4k−2,4k−1,4k!! being increasingly good, and then !!4k+1!! being worse again but better than !!4k−3!!. For !!n=7!!, the d6 allows one to land on !!\{1,2,3,4,5,6\}!!, and the d4 on !!\{3,4,5,6\}!!. But !!1!! is worse than !!5!! and !!2!! is worse than !!6!!, so prefer the d4.

The d4-d6-d8 case is unusually confusing, because for example it's not clear whether from 12 points you should throw d4, hoping to land on 8, or d6, hoping to land on 6. (I haven't checked but I imagine the two strategies perform almost equally well; similarly it probably doesn't matter much if you throw the d4 or the d6 first from !!n=10!!.)

That the d6 is best for !!n=13!! is very surprising to me.

Why !!45.66\%!!? I don't know. With only one die, the winning probability for large !!n!! converges to !!\frac2{n+1}!! which I imagine is a fairly straightforward calculation (but I have not done it). For more than one die, it seems much harder.

Is there a way to estimate the winning probability for large !!n!!, given the list of dice? Actually yes, a little bit: the probability of winning with just a d4 is !!\frac 25!!, and the d6 and d8 can't hurt, so we know the chance of winning with all three dice available will be somewhat more than !!40\%!!, as it is. The value of larger dice falls off rapidly with the number of sides, so for example with d4+d6 the chance of winning increases from !!40\%!! to almost !!45\%!!, and adding the d8 only nudges this up to !!45.66\%!!.

The probability of winning with a d2 is !!\frac 23!!, and if you have a d3 also the probability goes up to !!\frac 34!!, which seems simple enough, but if you add a d4 instead of the d3 instead it goes to !!68.965\%!!, whatever that is. And Dfan Schmidt tells me that d3 + d4 converges to !!\frac{512}{891}!!.

I wrote it up for Math StackExchange but nobody has replied yet.

Here's Python code to calculate the values. Enjoy.

[ Addendum: Michael Lugo points out that the d2+d4 probability (“!!68.965\%!!, whatever that is”) is simply !!\frac{20}{29}!!, and gives some other similar results. One is that d3+d4+d5 has a winning probability of !!\frac{16}{27}!!; the small denominator is surprising. ]


[Other articles in category /math] permanent link

Wed, 07 Jul 2021

Examples of dummy pronouns

Katara is interested in linguistics. When school was over for the year and she had time to think about things, I gave her all my old linguistics books. The other day for some reason I mentioned to her that I had known people who were engaged in formal research on the problem of how to get a computer to know what a pronoun referred to, and that this is very difficult.

(I once had a co-worker who claimed that it was simple: the pronoun always refers back to the nearest noun. It wasn't hard to go back in his Slack history and find a counterexample he had uttered a few minutes before.)

Today I wanted to tell Katara about dummy pronouns, which refer to nothing at all. I intended to send her the example from Wiktionary:

it is good to know that you are okay

I started my message:

Here's an interesting example of how hard it can be to find what a pronoun refers to

Then I realized I no longer needed the example.


[Other articles in category /lang] permanent link

Mon, 05 Jul 2021

Duckface in German

In English, this is called duckface:

Ariana Grande looking over her
shoulder with her lips abnormally everted

In German, I've learned, it's Schlauchbootlippen.

Schlauch is “tube”. A Schlauchboot is a tube-boat — an inflatable rubber dingy. Schlauchbootlippen means dinghy-lips.


[Other articles in category /lang/etym] permanent link

Sun, 27 Jun 2021

Dogs that look like board games

In Korean, “바둑이” (/badugi/) is a common name for a spotted dog, especially a black-spotted dog. This is because “바둑” (/baduk/) is the native Korean name for the game of go, in which round black and white stones are placed on a board.

In English, black-and-white spotted dogs are sometimes named “Checkers” for essentially the same reason.


[Other articles in category /lang/etym] permanent link

Mon, 19 Apr 2021

Odd translation choices

Recently I've been complaining about unforced translation errors. ([1] [2]) Here's one I saw today:

A
picture of two cows in a field.  One has a child-sized toy plastic car
on its head.  The cow with the car on its head is saying: “БИП-БИП ВАШ
УБЕР ПРИБЫЛ

The translation was given as:

“honk honk, your Uber has arrived”

“Oleg, what the fuck”

Now, the Russian text clearly says “beep-beep” (“бип-бип”), not “honk honk”. I could understand translating this as "honk honk" if "beep beep" were not a standard car sound in English. But English-speaking cars do say “beep beep”, so why change the original?

(Also, a much smaller point: I have no objection to translating “Что за херня” as “what the fuck”. But why translate “Что за херня, Олег?” as “Oleg, what the fuck” instead of “What the fuck, Oleg”?)

[ Addendum 20210420: Katara suggested that perhaps the original translator was simply unaware that Anglophone cars also “beep beep”. ]


[Other articles in category /lang] permanent link

Wed, 14 Apr 2021

More soup-guzzling

A couple of days ago I discussed the epithet “soup-guzzling pie-muncher”, which in the original Medieval Italian was brodaiuolo manicator di torte. I had compained that where most translations rendered the delightful word brodaiuolo as something like “soup-guzzler” or “broth-swiller”, Richard Aldington used the much less vivid “glutton”.

A form of the word brodaiuolo appears in one other place in the Decameron, in the sixth story on the first day, also told by Emilia, who as you remember has nothing good to say about the clergy:

… lo 'nquisitore sentendo trafiggere la lor brodaiuola ipocrisia tutto si turbò…

J. M. Rigg (1903), who had elsewhere translated brodaiuolo as “broth-guzzling”, this time went with “gluttony”:

…the inquisitor, feeling that their gluttony and hypocrisy had received a home-thrust…

G. H. McWilliam (1972) does at least imply the broth:

…the inquisitor himself, on hearing their guzzling hypocrisy exposed…

John Payne (1886):

the latter, feeling the hit at the broth-swilling hypocrisy of himself and his brethren…

Cormac Ó Cuilleanáin's revision of Payne (2004):

…the inquisitor himself, feeling that the broth-swilling hypocrisy of himself and his brethren had been punctured…

And what about Aldington (1930), who dropped the ball the other time and rendered brodaiuolo merely as “glutton”? Here he says:

… he felt it was a stab at their thick-soup hypocrisy…

Oh, Richard.

I think you should have tried harder.


[Other articles in category /lang] permanent link