Archive:
In this section: Subtopics:
Comments disabled |
Sun, 30 Nov 2008
License plate sabotage
So it is too with car license plates, and for a number of years I have
toyed with the idea of getting a personalized plate with
A plate number like Recently a car has appeared in my neighborhood that seems to be owned by someone with the same idea: In case it's hard to make out in the picture—and remember that that's the whole idea—the license number is 00O0O0.
(That's two letters and four digits.) If you are reading this in a
font in which 0 and O are difficult or impossible to
distinguish—well, remember that that's the whole idea.
Other Pennsylvanians should take note. Consider selecting
Variations on the Goldbach conjecture
- Every prime number is the sum of two even numbers.
- Every odd number is the sum of two primes.
- Every even number is the product of two primes.
Flag variables in Bourne shell programs
Suppose you want to set a flag variable, and then later you want to test it. You probably do something like this:
ifOr maybe you use ${IS_NAKED:-0} or some such
instead of "$IN_NAKED". Whatever.Today I invented a different technique. Try this on instead:
IS_NAKED=false ifThe arguments both for and against it seem to be obvious, so I won't make them. I have never seen this done before, but, as I concluded and R.J.B. Signes independently agreed, it is obvious once you see it. [ Addendum 20090107: some followup notes ]
Another note about Gabriel's Horn
Presumably people think it's paradoxical that the thing should have a finite volume but an infinite surface area. But since the horn is infinite in extent, the infinite surface area should be no surprise.
The surprise, if there is one, should be that an infinite object might
contain a merely finite volume. But we swallowed The pedigree for that paradox goes at least back to Zeno, so perhaps
Gabriel's Horn merely shows that there is still some life in it, even after
2,400 years.[ Addendum 2014-07-03: I have just learned that this same analogy was also described in this math.stackexchange post of 2010. ]
Gabriel's Horn is not so puzzling
finite
amount of paint.The calculations themselves do not lend much insight into what is going on here. But I recently read a crystal-clear explanation that I think should be more widely known. Take out some Play-Doh and roll out a snake. The surface area of the snake (neglecting the two ends, which are small) is the product of the length and the circumference; the circumference is proportional to the diameter. The volume is the product of the length and the cross-sectional area, which is proportional to the square of the diameter.
As you continue to roll the snake thinner and thinner, the volume stays the same, but the surface area goes to infinity. Gabriel's Horn does exactly the same thing, except without the rolling, because the parts of the Horn that are far from the origin look exactly the same as very long snakes. There's nothing going on in the Gabriel's Horn example that isn't also happening in the snake example, except that in the explanation of Gabriel's Horn, the situation is obfuscated by calculus. I read this explanation in H. Jerome Keisler's caclulus textbook. Keisler's book is an ordinary undergraduate calculus text, except that instead of basing everything on limits and on limiting processes, it is based on nonstandard analysis and explicit infinitesimal quantities. Check it out; it is available online for free. (The discussion of Gabriel's Horn is in chapter 6, page 356.) [ Addendum 20081110: A bit more about this. ]
Addenda to recent articles 200810
- I discussed representing
ordinal numbers in the computer and expressed doubt that the
following representation truly captured the awesome complexity of the
ordinals:
data Nat = Z | S Nat data Ordinal = Zero | Succ Ordinal | Lim (Nat → Ordinal) In particular, I asked "What about Ω, the first uncountable ordinal?" Several readers pointed out that the answer to this is quite obvious: Suppose*S*is some countable sequence of (countable) ordinals. Then the limit of the sequence is a countable union of countable sets, and so is countable, and so is not Ω. Whoops! At least my intuition was in the right direction.Several people helpfully pointed out that the notion I was looking for here is the "cofinality" of the ordinal, which I had not heard of before. Cofinality is fairly simple. Consider some ordered set *S*. Say that an element*b*is an "upper bound" for an element*a*if*a*≤*b*. A subset of*S*is*cofinal*if it contains an upper bound for every element of*S*. The*cofinality*of*S*is the minimum cardinality of its cofinal subsets, or, what is pretty much the same thing, the minimum order type of its cofinal subsets.So, for example, the cofinality of ω is ℵ _{0}, or, in the language of order types, ω. But the cofinality of ω + 1 is only 1 (because the subset {ω} is cofinal), as is the cofinality of any successor ordinal. My question, phrased in terms of cofinality, is simply whether any ordinal has uncountable cofinality. As we saw, Ω certainly does.But some uncountable ordinals have countable cofinality. For example, let ω _{n}be the smallest ordinal with cardinality ℵ_{n}for each*n*. In particular, ω_{0}= ω, and ω_{1}= Ω. Then ω_{ω}is uncountable, but has cofinality ω, since it contains a countable cofinal subset {ω_{0}, ω_{1}, ω_{2}, ...}. This is the kind of bullshit that set theorists use to occupy their time.A couple of readers brought up George Boolos, who is disturbed by extremely large sets in something of the same way I am. Robin Houston asked me to consider the ordinal number which is the least fixed point of the ℵ operation, that is, the smallest ordinal number κ such that |κ| = ℵ _{κ}. Another way to define this is as the limit of the sequence 0, ℵ_{0}ℵ_{ℵ0}, ... . M. Houston describes κ as "large enough to be utterly mind-boggling, but not so huge as to defy comprehension altogether". I agree with the "utterly mind-boggling" part, anyway. And yet it has countable cofinality, as witnessed by the limiting sequence I just gave.M. Houston says that Boolos uses κ as an example of a set that is so big that he cannot agree that it really exists. Set theory says that it does exist, but somewhere at or before that point, Boolos and set theory part ways. M. Houston says that a relevant essay, "Must we believe in set theory?" appears in Logic, Logic, and Logic. I'll have to check it out. My own discomfort with uncountable sets is probably less nuanced, and certainly less well thought through. This is why I presented it as a fantasy, rather than as a claim or an argument. Just the sort of thing for a future blog post, although I suspect that I don't have anything to say about it that hasn't been said before, more than once. Finally, a pseudonymous Reddit user brought up a paper of Coquand, Hancock, and Setzer that discusses just which ordinals *are*representable by the type defined above. The answer turns out to be all the ordinals less than ω^{ω}. But in Martin-Löf's type theory (about which more this month, I hope) you can actually represent up to ε_{0}. The paper is Ordinals in Type Theory and is linked from here.Thanks to Charles Stewart, Robin Houston, Luke Palmer, Simon Tatham, Tim McKenzie, János Krámar, Vedran Čačić, and Reddit user "apfelmus" for discussing this with me. [ Meta-addendum 20081130: My summary of Coquand, Hancock, and Setzer's results was utterly wrong. Thanks to Charles Stewart and Peter Hancock (one of the authors) for pointing this out to me. ] - Regarding homophones of
numeral words, several readers pointed out that in non-rhotic
dialects, "four" already has four homophones, including "faw" and
"faugh". To which I, as a smug rhotician, reply "feh".
One reader wondered what should be done about homophones of "infinity", while another observed that a start has already been made on "googol". These are just the sort of issues my proposed Institute is needed to investigate. One clever reader pointed out that "half" has the homophone "have". Except that it's not really a homophone. Which is just right!
Election results
#!/usr/bin/perl my $remain = 1232470800 - time(); $remain > 0 or print("It's finally over.\n"), exit; my @dur; for (60, 60, 24, 100000) { unshift @dur, $remain % $_; $remain -= $dur[0]; $remain /= $_; } my @time = qw(day days hour hours minute minutes second seconds); my @s; for (0 .. $#dur) { my $n = $dur[$_] or next; my $unit = $time[$_*2 + ($n != 1)]; $s[$_] = "$n $unit"; } @s = grep defined, @s; $s[-1] = "and $s[-1]" if @s > 2; print join ", ", @s; print "\n";
Atypical Typing
If you want to skip the notes and just read the talk, here it is.
## Talk abstractMany of the shortcomings of Java's type system were addressed by the addition of generics to Java 5.0. Java's generic types are a direct outgrowth of research into strong type systems in languages like SML and Haskell. But the powerful, expressive type systems of research languages like Haskell are capable of feats that exceed the dreams of programmers familiar only with mainstream languages.I did not say in the abstract that the talk was a retread of a talk I gave for the Perl mongers in 1999 titled "Strong Typing Doesn't Have to Suck. Nobody wants to hear that. Still, the talk underwent a major rewrite, for all the obvious reasons. In 1999, the claim that strong typing does not have to suck was surprising news, and particularly so to Perl Mongers. In 2008, however, this argument has been settled by Java 5, whose type system demonstrates pretty conclusively that strong typing doesn't have to suck. I am not saying that you must like it, and I am not saying that there is no room for improvement. Indeed, it's obvious that the Java 5 type system has room for improvement: if you take the SML type system of 15 years ago, and whack on it with a hammer until it's chipped and dinged all over, you get the Java 5 type system; the SML type system of the early 1990s is ipso facto an improvement. But that type system didn't suck, and neither does Java's. So I took out the arguments about how static typing didn't have to suck, figuring that most of the OOPSLA audience was already sold on this point, and took a rather different theme: "Look, this ivory-tower geekery turned out to be important and useful, and its current incarnation may turn out to be important and useful in the same way and for the same reasons."
In 1999, I talked about Hindley-Milner type systems, and although it
was far from clear at the time that mainstream languages would follow
the path blazed by the HM languages, that was exactly what happened.
So the HM languages, and Haskell in particular, contained some
features of interest, and, had you known then how things would turn
out, would have been worth looking at. But Haskell has continued to
evolve, and perhaps it
Or maybe another way to put it: If the adoption of functional
programming ideas into the mainstream took you by surprise, fair
enough, because sometimes these things work out and sometimes they
don't, and sometimes they get adopted and sometimes they don't. But
if it happens
## Haskell types are hard to explainI spent most of the talk time running through some simple examples of Haskell's type inference algorithm, and finished with a really spectacular example that I first saw in a talk by Andrew R. Koenig at San Antonio USENIX where the type checker detects an infinite-loop bug in a sorting function at compile time. The goal of the 1999 talk was to explain enough of the ML type system that the audience would appreciate this spectacular example. The goal of the 2008 talk was the same, except I wanted to do the examples in Haskell, because Haskell is up-and-coming but ML is down-and-going.It is a lot easier to explain ML's type system than it is to explain Haskell's. Partly it's because ML is simpler to begin with, but also it's because Haskell is so general and powerful that there are very few simple examples! For example, in SML one can demonstrate:
(* SML *) val 3 : int; val 3.5 : real;which everyone can understand.
But in Haskell, 3.5 has the type (Fractional . So you can't explain the types of literal numeric
constants without first getting into type classes.t) ⇒
t
The benefit of this, of course, is that you can write Similarly, in SML you can demonstrate some simple monomorphic functions: not : bool → bool real : int → real sqrt : real → real floor : real → intOf these, only not is simple in Haskell:
not :: Bool → Bool fromInteger :: (Num a) ⇒ Integer → a -- analogous to 'real' sqrt :: (Floating a) ⇒ a → a floor :: (RealFrac a, Integral b) ⇒ a → bThere are very few monomorphic functions in the Haskell standard prelude.
## SlidesI'm still using the same slide-generation software I used in 1999, which makes me happy. It's a giant pile of horrible hacks, possibly the worst piece of software I've ever written. I'd like to hold it up as an example of "worse is better", but actually I think it only qualifies for "bad is good enough". I should write a blog article about this pile of hacks, just to document it for future generations.
## Conference plenary sessionsThis was the first "keynote session" I had been to at a conference in several years. One of the keynote speakers at a conference I attended was such a tedious, bloviating windbag that I walked out and swore I would never attend another conference plenary session. And I kept that promise until last week, when I had to attend, because nowI was not only the bloviating windbag behind the lectern, but
an oath-breaker to boot. This is the "shameful confession" alluded to
on slide 3.
## On the other hand...One of the highest compliments I've ever received. It says "John McCarthy will be there. Mark Jason Dominus, too." Wow, I'm almost in the same paragraph with John McCarthy.
McCarthy didn't actually make it, unfortunately. But I did get to
meet Richard Gabriel and Gregor Kiczales. And Daniel Weinreb,
although I didn't know who he was before I met him. But now I'm glad
I met Daniel Weinreb. During my talk I digressed to say that anyone
who is appalled by Perl's regular expression syntax should take a look
at Common Lisp's
## More explaining of jokesAs I get better at giving conference talks, the online slides communicate less and less of the amusing part of the content. You might find it interesting to compare the 1999 version of this talk with the 2008 version.One joke, however, is too amusing to leave out. At the start of the talk, I pretended to have forgotten my slides. "No problem," I said. "All my talks these days are generated automatically by the computer anyway. I'll just rebuild it from scratch." I then displayed this form, which initialliy looked like this: I then filled out the form appropriately for OOPSLA: I pushed the button, and poof! Instant slides.
## Wadler's anecdoteI had the chance to talk to Philip Wadler, one of the designers of Haskell and of the Java generics system, before the talk. I asked him about the history of the generics feature, and he told me the following story: At this point in the talk I repeated an anecdote that Wadler told me. After he and Odersky had done the work on generics in theirgj and "Pizza" projects, Odersky was hired by Sun to
write the new Java compiler. Odersky thought the generics were a good
idea, so he put them into the compiler. At first the Sun folks always
ran the compiler with the generics turned off. But they couldn't rip
out the generics support completely, because they needed it in the
compiler in order to get it to compile its own source code, which
Odersky had written with generics. So Sun had to leave the feature
in, and eventually they started using it, and eventually they decided
they liked it. I related this story in the talk, but it didn't make
it onto the slides, so I'm repeating it here.I had never been to OOPSLA, so I also asked Wadler what the OOPSLA people would want to hear about. He mentioned STM, but since I don't know anything about STM I didn't say anything about it.
## View it onlineThe slides are online.
[ Addendum 20081031: Thanks to a Max Rabkin for pointing
out that Haskell's analogue of
A proposed correction to an inconsistency in English orthography
English also contains exactly one homophone of "one", namely "won". English does indeed contain two homophones of "two": "too" and "to". However, the expected homophones of "three" are missing. I propose to rectify this inconsistency. This is sure to make English orthography more consistent and therefore easier for beginners to learn. I suggest the following:
thrieI also suggest the founding of a well-funded institute with the following mission:
- Determine the meanings of these three new homophones
- Conduct a public education campaign to establish them in common use
- Lobby politicians to promote these new words by legislation, educational standards, public funding, or whatever other means are appropriate
- Investigate the obvious sequel issues: "four" has only "for" and "fore" as homophones; what should be done about this?
Happy Halloween. All Hail Discordia. [ Addendum 20081106: Some readers inexplicably had nothing better to do than to respond to this ridiculous article. ]
Representing ordinal numbers in the computer and elsewhere
The Turner paper is a must-read. It's about functional programming in languages where every program is guaranteed to terminate. This is more useful than it sounds at first. Turner's initial point is that the presence of ⊥ values in languages like Haskell spoils one's ability to reason from the program specification. His basic example is simple: loop :: Integer -> Integer loop x = 1 + loop xTaking the function definition as an equation, we subtract (loop x) from both sides and get 0 = 1which is wrong. The problem is that while subtracting (loop
x) from both sides is valid reasoning over the integers, it's not
valid over the Haskell Integer type, because Integer
contains a ⊥ value for which that law doesn't hold: 1 ≠ 0, but 1
+ ⊥ = 0 + ⊥.Before you can use reasoning as simple and as familiar as subtracting an expression from both sides, you first have to prove that the value of the expression you're subtracting is not ⊥. By banishing nonterminating functions, one also banishes ⊥ values, and familiar mathematical reasoning is rescued. You also avoid a lot of confusing language design issues. The whole question of strictness vanishes, because strictness is solely a matter of what a function does when its argument is ⊥, and now there is no ⊥. Lazy evaluation and strict evaluation come to the same thing. You don't have to wonder whether the logical-or operator is strict in its first argument, or its second argument, or both, or neither, because it comes to the same thing regardless. The drawback, of course, is that if you do this, your language is no longer Turing-complete. But that turns out to be less of a problem in practice than one would expect. The paper was so interesting that I am following up several of its precursor papers, including Abel's paper, about which the Turner paper says "The problem of writing a decision procedure to recognise structural recursion in a typed lambda calculus with case-expressions and recursive, sum and product types is solved in the thesis of Andreas Abel." And indeed it is. But none of that is what I was planning to discuss. Rather, Abel introduces a representation for ordinal numbers that I hadn't thought much about before.
I will work up to the ordinals via an intermediate example. Abel
introduces a type
The "1" here is not the number 1, but rather a base type that contains only one element, like Haskell's () type or ML's
unit type.
For concreteness, I'll write the single value of this type as '•'.
The ⊕ operator is the disjoint sum operator
for types. The elements of the type
The values of
Succ(Succ(Succ(0))) as usual. (In this
explanation, I omitted some technical details about recursive types.)So much for the natural numbers. Abel then defines a type of ordinal numbers, as: In this scheme, an ordinal is either left(left(•)),
which represents 0, or left(right(n)), which
represents the successor of the ordinal n, or
right(f), which represents the limit ordinal of the
range of the function f, whose type is Nat → Ord.We can define abbreviations:
Zero, 1 = Succ(0),
2 = Succ(1),
and so on. If we define a function id which maps Nat into Ord in
the obvious way:
id ::then ω = Lim(id). Then we easily get ω+1 =
Succ(ω), etc., and the limit of this function is 2ω:
plusomega ::We can define an addition function on ordinals:
This gets us another way to make 2ω: 2ω = Lim(λx.id(x) + ω).
Then this function multiplies a
timesomega ::and Lim(timesomega) is ω^{2}. We can go on like this.But here's what puzzled me. The ordinals are really, really big. Much too big to be a set in most set theories. And even the countable ordinals are really, really big. We often think we have a handle on uncountable sets, because our canonical example is the real numbers, and real numbers are just decimal numbers, which seem simple enough. But the set of countable ordinals is full of weird monsters, enough to convince me that uncountable sets are much harder than most people suppose.
So when I saw that Abel wanted to define an arbitrary ordinals as a
limit of a Well, maybe. I can't think of any reason why not. But it still doesn't seem right. It is a very weird sequence, and one that you cannot write down. Because suppose you had a notation for all the ordinals that you would need. But because it is a notation, the set of things it can denote is countable, and so a fortiori the limit of all the ordinals that it can denote is a countable ordinal, not Ω.
And it's all very well to say that the sequence starts out (0, ω,
2ω, ω
So my question to set theory experts: is I hate uncountable sets, and I have a fantasy that in the mathematics of the 23rd Century, uncountable sets will be looked back upon as a philosophical confusion of earlier times, like Zeno's paradox, or the luminiferous aether. [ Addendum 20081106: Not every limit ordinal is the least upper bound of some countable sequence of (countable) ordinals, and my guess that Ω is not was correct, but the proof is so simple that I was quite embarrassed to have missed it. More details here. ] [ Addendum 20160716: In the 8 years since I wrote this article, the link to Turner's paper at Middlesex has expired. Fortunately, Miëtek Bak has taken it upon himself to create an archive containing this paper and a number of papers on related topics. Thank you, M. Bak! ]
The Lake Wobegon Distribution
To take my favorite example: nearly everyone has an above-average number of legs. I wish I could remember who first brought this to my attention. James Kushner, perhaps? But the world abounds with less droll examples. Consider a typical corporation. Probably most of the employees make a below-average salary. Or, more concretely, consider a small company with ten employees. Nine of them are paid $40,000 each, and one is the owner, who is paid $400,000. The average salary is $76,000, and 90% of the employees' salaries are below average. The situation is familiar to people interested in baseball statistics because, for example, most baseball players are below average. Using Sean Lahman's database, I find that 588 players received at least one at-bat in the 2006 National League. These 588 players collected a total of 23,501 hits in 88,844 at-bats, for a collective batting average of .265. Of these 588, only 182 had an individual batting average higher than 265. 69% of the baseball players in the 2006 National League were below-average hitters. If you throw out the players with fewer than 10 at-bats, you are left with 432 players of whom 279, or 65%, hit worse than their collective average of 23430/88325 = .265. Other statistics, such as earned-run averages, are similarly skewed. The reason for this is not hard to see. Baseball-hitting talent in the general population is normally distributed, like this: Here the right side of the graph represents the unusually good hitters, of whom there aren't very many. The left side of the graph represents the unusually bad hitters; there aren't many of those either. Most people are somewhere in the middle, near the average, and there are about as many above-average hitters as below-average hitters in the general population. But major-league baseball players are not the general population. They are carefully selected, among the best of the best. They are all chosen from the right-hand edge of the normal curve. The people in the middle of the normal curve, people like me, play baseball in Clark Park, not in Quankee Stadium. Here's the right-hand corner of the curve above, highly magnified: As you can see here, the shape is not at all like the curve for the general population, which had the vast majority of the population in the middle, around the average. Here, the vast majority of the population is way over on the left side, just barely good enough to play in the majors, hanging on to their jobs by the skin of their teeth, subject at any moment to replacement by some kid up from the triple-A minors. The above-average players are the ones over on the right end, the few of the few.
Actually I didn't present the case strongly enough. There are around
800 regular major-league ballplayers in the USA, drawn from a population of
around 300 million, a ratio of one per 375,000. Well, no, the ratio is
smaller, since the U.S. leagues also draw the best players from Mexico,
Venezuela, Canada, the Dominican Republic, Japan, and elsewhere. The
curve above is (Note especially the numbers on the y-axis.)This has important implications for the analysis of baseball. A player who is "merely" above average is a rare and precious resource, to be cherished; far more players are below average. Skilled analysts know that comparisons with the "average" player are misleading, because baseball is full of useful, effective players who are below average. Instead, analysts compare players to a hypothetical "replacement level", which is effectively the leftmost edge of the curve, the level at which a player can be easily replaced by one of those kids from triple-A ball. In the Historical Baseball Abstract, Bill James describes some great team, I think one of the Cincinnati Big Red Machine teams of the mid-1970s, as "possibly the only team in history that was above average at every position". That's an important thing to know about the sport, and about team sports in general: you don't need great players to completely clobber the opposition; it suffices to have players that are merely above average. But if you're the coach, you'd better learn to make do with a bunch of players who are below average, because that's what you have, and that's what the other team will beat you with.
The right-skewedness of the right side of a normal distribution has
implications that are important outside of baseball. Stephen Jay
Gould wrote an essay about how he was
diagnosed with cancer
and given six months to live.
This sounds awful, and it is awful.
But six months was the My heavens, I just realized that what I've written is an article about the "long tail". I had no idea I was being so trendy. Sorry, everyone.
Sprague-Grundy theory
LetM. Penn was right there with the answer.
Yesterday, M. Penn asked a question to which Richard Penn asked:
M. Penn observed that there is a simple strategy for the 20-dot circle, but was not able to find one for the 19-dot circle. But solving such problems in general is made easy by the Sprague-Grundy theory, which I will explain in detail.
## 0. Short SpoilersBoth positions are wins for the second player to move.The 20-dot case is trivial, since any first-player move leaves a row of 17 dots, from which the second player can leave two disconnected rows of 7 dots each. Then any first-player move in one of these rows can be effectively answered by the second player in the other row. The 19-dot case is harder. The first player's move leaves a row of 16 dots. The second player can win by removing 3 dots to leave disconnected rows of 6 and 7 dots. After this, the strategy is complicated, but is easily found by the Sprague-Grundy theory. It's at the end of this article if you want to skip ahead. Sprague-Grundy theory is a complete theory of all finite impartial games, which are games like this one where the two players have exactly the same moves from every position. The theory says:
- Every such game position has a "value", which is a non-negative integer.
- A position is a second-player win if and only if its value is zero.
- The value of a position can be calculated from the values of the positions to which the players can move, in a simple way.
- The value of a collection of disjoint positions (such as two disconnected rows of dots) can be calculated from the values of its component positions in a simple way.
## 1. NimIn the game of Nim, one has some piles of beans, and a legal move is to remove some or all of the beans from any one pile. The winner is the player who takes the last bean. Equivalently, the winner is the last player who has a legal move.Nim is important because every position in every impartial game is somehow equivalent to a position in Nim, as we will see. In fact, every position in every impartial game is equivalent to a Nim position with at most one heap of beans! Since single Nim-heaps are trivially analyzed, one can completely analyze any impartial game position by calculating the Nim-heap to which it is equivalent.
## 2. Disjoint sums of gamesDefinition: The "disjoint sum"A # B of two games
A and B is a new game whose rules are as follows: a
legal move in A # B is either a move in A
or a move in B; the winner is the last player with a
legal move.Three easy exercises:
- # is commutative.
- # is associative.
- Let (
*a*,*b*,*c*...) represent the Nim position with heaps*a*,*b*,*c*, etc. Then the game (*a*,*b*,*c*,...) is precisely (*a*) # (*b*) # (*c*) # ... .
0 #for all games a. 0 is a win for the previous player: the next
player to move has no legal moves, and loses.We will call the next player to move "P1", and the player who just moved "P2". Note that a Nim-heap of 0 beans is precisely the 0 game.
## 3. Sums of Nim-heapsWe usually represent a single Nim-heap withn beans as
"∗n". I'll do that from now on.
We observed that ∗0 is a win for the second player. Observe now that
when
From now on we will use the symbol "=" to mean a weaker relation on
games than strict equality. Two games
Taking X = 0, the condition A = B implies that
both games have the same outcome in isolation: if one is a
first-player win, so is the other. But the condition is stronger than
that. Both ∗1 and ∗2 are first-player wins, but ∗1 ≠ ∗2, because
∗1 # ∗1 is a second-player win, while ∗2 # ∗1 is a first-player
win.
Exercise: ∗ It so happens that the disjoint sum of two Nim-heaps is equivalent to a single Nim-heap:
I'll omit the proof, which is pretty easy to find. ⊕ is often described as "write a and b in binary, and add, ignoring all carries." For example 1 ⊕ 2 = 3, and 13 ⊕ 7 = 10. This implies that ∗1 # ∗2 = ∗3, and that ∗13 # ∗7 = ∗10. Although I omitted the proof that # for Nim-heaps is essentially the ⊕ operation in disguise, there are many natural implications of this that you can use to verify that the claim is plausible. For example:
- The Nim-sum theorem implies that ∗0 is a neutral element for #, which we already knew.
- Since
*a*⊕*a*= 0, we have:∗ That is, ∗*a*# ∗*a*= ∗0 for all*a**a*# ∗*a*is a win for P2. And indeed, P2 has an obvious strategy: whatever P1 does in one pile, P2 does in the other pile. P2 never runs out of legal moves until after P1 does, and so must win. - Since
*a*⊕*a*= 0, we have, more generally:∗ No matter what*a*# ∗*a*#*X*=*X*for all*a*,*X**X*is, its outcome is the same as that of ∗*a*# ∗*a*#*X*. Why?Suppose you are the player with a winning strategy for playing *X*alone. Then it is easy to see that you have a winning strategy in ∗*a*# ∗*a*#*X*, as follows: ignore the ∗*a*# ∗*a*component, until your opponent moves in it, when you should copy their move in the other half of that component. Eventually the ∗*a*# ∗*a*part will be used up (that is, reduced to ∗0 # ∗0 = 0) and your opponent will be forced to move in*X*, whereupon you can continue your winning strategy there until you win. - According to the ⊕ operation, ∗1 # ∗2 = ∗3, and so ∗1 # ∗2
# ∗3 = ∗3 # ∗3 = 0, so P2 should have a winning strategy in ∗1 #
∗2 # ∗3. Which he does: If P1 removes any entire heap, P2 can win by
equalizing the remaining heaps, leaving ∗1 # ∗1 = 0 or ∗2 # ∗2 =
0, which he wins easily. If P1 equalizes any two heaps, P2 can remove
the third heap, winning the same way.
- Let's reconsider the game of the previous paragraph, but
change the ∗1 to something else.
2 ⊕ 3 ⊕
*x*> 0 so if ∗*x*≠ 1, ∗2 # ∗3 # ∗*x*= ∗*y*, where*y*>0. Since ∗*y*is a single nonempty Nim-heap, it is obviously a win for P1, and so ∗2 # ∗3 # ∗*x*should be equivalent, also a win for P1. What is P1's winning strategy in ∗2 # ∗3 # ∗*x*? It's easy. If*x*> 1, then P1 can reduce ∗*x*to ∗1, leaving ∗2 # ∗3 # ∗1, which we saw is a winning position. And if*x*= 0, then P1 can move to ∗2 # ∗2 and win.
## 4. The MEX ruleThe important thing about disjoint sums is that they abstract away the strategy. If you have some complicated set of Nim-heaps ∗a #
∗b # ... # ∗z, you can ignore them and pretend instead
that they are a single heap ∗(a ⊕ b ⊕ ... ⊕
z). Your best move in the compound heap can be easily worked
out from the corresponding best move in the fictitious single heap.
For example, how do you figure out how to play in ∗2 # ∗3 #
∗ Now, that is very facile, but ∗2 # ∗3 is not the same game as ∗1, because from ∗1 there is just one legal move, which is to ∗0. Whereas from ∗2 # ∗3 there are several moves. It might seem that your opponent could complicate the situation, say by moving from ∗2 # ∗3 to ∗3, which she could not do if it were really ∗1. But actually this extra option can't possibly help your opponent, because you have an easy response to that move, which is to move right back to ∗1! If pretending that ∗2 # ∗3 was ∗1 was good before, it is certainly good after you make it ∗1 for real. From ∗2 # ∗3 there are a whole bunch of moves:
Move to ∗3But you can disregard the first four of these, because they are reversible: if some player X has a winning strategy that
works by pretending that ∗2 # ∗3 is identical with ∗1, then the extra
options of moving to ∗2 and ∗3 won't help X's opponent, because
X can reverse those moves and turn the ∗2 # ∗3 component back
into ∗1. So we can ignore these options, and say that there's just
one move from ∗2 # ∗3 worth considering further, namely to ∗2 # ∗2 =
0. Since this is exactly the same set of moves that is available from
∗1, ∗2 # ∗3 behaves just like ∗1 in all situations, and have just
proved that ∗2 # ∗3 = ∗1.
Unlike the other moves, the move from ∗2 # ∗3 to ∗0 is
Considering this in more generality, suppose we have some game
position - If your
opponent plays in
*X*, then follow your strategy for*X*# ∗*M*, since the same move will also be available in*X*#*P*. - If your opponent makes
*P*into ∗*y*, with*y*<*M*, then they've discarded their extra options, which are now irrelevant; play as you would if they had moved from*X*# ∗*M*to*X*# ∗*y*. - If your opponent makes
*P*into ∗*y*, with*y*>*M*, then just move from ∗*y*to ∗*M*, leaving*X*# ∗*M*, which you can win.
MEX Theorem: If all the legal moves from a position P
are equivalent to Nim-heaps of sizes {s_{1}, ...,
s_{k}}, then P itself is equivalent to a
nim-heap of size MEX(s_{1}, ...,
s_{k}), where the MEX is the "Minimal EXcluded"
element of the set: the smallest nonnegative integer that is not in
the set.
For example, let's consider what happens if we augment Nim by adding a
special token, called ♦. A player may, in lieu of a regular
move, replace ♦ by a pile of beans of
Since the legal moves from ♦ are {∗1, ∗2, ∗3, ...} and the
MEX is 0, ♦ should behave like ∗0. That is, adding a ♦
token to any position should leave the outcome unaffected. And indeed
it does. If you have a winning strategy in game
Exercise: Let
## 5. The Sprague-Grundy theoryAn "impartial game" is one where both players have the same moves from every position.
Now let's consider Richard Penn's game, which is impartial. A legal move is to cross out any dot, and the adjacent dot or dots, if any.
The Sprague-Grundy theorem says that every row of dots in Penn's game
is equivalent to some Nim-heap. Let's tabulate the size of this heap
(the Nim-value) for each row of [οο] = ∗1 also, since the only legal move from [οο] is to [] = 0, and the MEX of {0} is 1. The legal moves from [οοο] are to [] = ∗0 and [ο] = ∗1, so {∗0, ∗1}, and the MEX is 2. So [οοο] = ∗2. Let's check that this is working. Since the Nim-value of [οοο] is 2, the theory predicts that [οοο] # ∗2 = 0 and so should be a win for P2. P2 should be able to pretend that [οοο] is actually ∗2. Suppose P1 turns the ∗2 into ∗1, moving to [οοο] # ∗1. Then P2 should turn [οοο] into ∗1 also, which he can do by crossing out an end dot and the adjacent one, leaving [ο] # ∗1, which he easily wins. If P1 turns ∗2 into ∗0, moving to [οοο] # ∗0, then P2 should turn [οοο] into ∗0 also, which he can do by crossing out the middle and adjacent dots, leaving [] # ∗0, which he wins immediately.
If P1 plays in the [οοο] component, she must move to [] or to [ο], each
equivalent to some Nim-heap of size Continuing our analysis of rows of dots: In Penn's game, the legal moves from [οοοο] are to [οο] and [ο]. Both of these have Nim-value ∗1, so the MEX is 0.
Easy exercise: Since [οοοο] is supposedly equivalent to ∗0, you
should be able to show that a player who has a winning strategy in
some game The legal moves from [οοοοο] are to [οοο], [οο], and [ο] # [ο]. The Nim-values of these three games are ∗2, ∗1, and ∗0 respectively, so the MEX is 3 and [οοοοο] = ∗3. The legal moves from [οοοοοο] are to [οοοο], [οοο], and [ο] # [οο]. The Nim-values of these three games are 0, 2, and 0, so [οοοοοο] = ∗1.
## 6. Richard Penn's game analyzed
The table says that a row of 19 dots should be a win for P1, if she reduces the Nim-value from 3 to 0. And indeed, P1 has an easy winning strategy, which is to cross the 3 dots in the middle of the row, replacing [οοοοοοοοοοοοοοοοοοο] with [οοοοοοοο] # [οοοοοοοο]. But no such easy strategy obtains in a row of 20 dots, which, indeed, is a win for P2.
The original question involved circles of dots, not rows. But from a
circle of But for the circle of 19 dots the answer is the same, a win for the second player. The first player must move to [ο × 16] = ∗2, and then the second player should win by moving to a 0 position. [ο × 16] must have such a move, because if it didn't, the MEX rule would imply that its Nim-value was 0 instead of 2. So what's the second player's zero move here? There are actually two options. The second player can win by playing to [ο × 14], or by splitting the row into [οοοοοο] # [οοοοοοο]. ## 7. Complete strategy for 19-bean circleJust for completeness, let's follow one of these purportedly winning moves in detail. I claimed that the second player could win by moving to [οοοοοο] # [οοοοοοο]. But what next?First recall that any isolated row of four dots, [οοοο], can be disregarded, because any first-player move in such a row can be answered by a second-player move that crosses out the rest of the row. And any pair of isolated rows of one or two dots, [ο] or [οο], can be similarly disregarded, because any move that crosses out one can be answered by a move that crosses out the other. So in what follows, positions like [οο] # [ο] # [οοοο] will be assumed to have been won by the second player, and we will say that the second player "has an easy win" if he has a move to such a position.
- The first player has three possible moves in the left [οοοοοο] component, as follows:
- If the first player moves to [οοοο] # [οοοοοοο], the second
player has an easy win by moving to [οοοο] # [οοοο].
- If the first player moves to [οοο] # [οοοοοοο] = ∗2 # ∗1, the
second player should reduce the left component to ∗1, by moving to [ο] #
[οοοοοοο]. Then no matter what the first player does, the second
player has an easy win.
- If the first player moves to [ο] # [οο] # [οοοοοοο] = ∗1 # ∗1 # ∗1, the second player can disregard the
[ο] # [οο] component. The second player instead plays to [ο] # [οο]
# [οοοο] and wins.
- If the first player moves to [οοοο] # [οοοοοοο], the second
player has an easy win by moving to [οοοο] # [οοοο].
- The first player has four moves in the right [οοοοοοο] component, as follows:
- If the first player moves to [οοοοοο] # [οοοοο] = ∗1 # ∗3,
the second player should move from ∗3 to ∗1. There must be a move
in [οοοοο] to a position with Nim-value 1. (If there weren't, [οοοοο]
would have Nim-value 1 instead of 3, by the MEX rule.) Indeed, the
second player can move to [οοοοοο] # [οο]. Now whatever the first
player does the second player has an easy win, either to [οοοο] or to
*X*#*X*for some row*X*. - If the first player moves to [οοοοοο] # [οοοο] = ∗1 # ∗0, the second player should move from ∗1 to ∗0.
There must be a move in [οοοοοο] to a position with Nim-value 0, and indeed there is: the second player moves
to [οοοο] # [οοοο] and wins.
- If the first player moves to [οοοοοο] # [ο] # [οοο] = ∗1 # ∗1 #
∗2, the second player can disregard the ∗1 # ∗1 component and
should move in the ∗2 component, to ∗0, which he does by eliminating
it entirely, leaving the first player with [οοοοοο] # [ο]. After any
move by the first player the second player has an easy win.
- If the first player moves to [οοοοοο] # [οο] # [οο] = ∗1 # ∗1 #
∗1, the second player has a number of good choices. The simplest
thing to do is to disregard the [οο] # [οο] component and move in the
[οοοοοο] to some position with Nim-value 0. Moving to [οοοο] # [οο] #
[οο] suffices.
- If the first player moves to [οοοοοο] # [οοοοο] = ∗1 # ∗3,
the second player should move from ∗3 to ∗1. There must be a move
in [οοοοο] to a position with Nim-value 1. (If there weren't, [οοοοο]
would have Nim-value 1 instead of 3, by the MEX rule.) Indeed, the
second player can move to [οοοοοο] # [οο]. Now whatever the first
player does the second player has an easy win, either to [οοοο] or to
But the important point here is not the strategy itself, which is hard
to remember, and which could have been found by computer search. The
important thing to notice is that computing the table of Nim-values
for each row of
data Mu f = In (f (Mu f))
data Mu f = In (f (Mu f)) -- (???)I bet a bunch of people reading this on Planet Haskell are nodding and saying "Oh, that!" When I first saw this I couldn't figure out what it was saying at all. It was totally opaque. I still have trouble recognizing in Haskell what tokens are types, what tokens are type constructors, and what tokens are value constructors. Code like (???) is unusually confusing in this regard. Normally, one sees something like this instead:
data Maybe f = Nothing | Just fHere f is a type variable; that is, a variable that ranges over
types. Maybe is a type constructor, which is like a function
that you can apply to a type to get another type. The most familiar
example of a type constructor is List:
data List e = Nil | Cons e (List e)Given any type f, you can apply the type constructor
List to f to get a new type List .
For example, you can apply fList to Int to get the
type List Int. (The Haskell built-in list type constructor
goes by the funny name of [], but works the same way. The
type [Int] is a synonym for ([] Int).)
Actually, type names are type constructors also; they're argumentless
type constructors. So we have type constructors like
data Either a b = Left a | Right b;Then the type Either Int String contains values like Left
37 and Right "Cotton Mather".
To keep track of how many arguments a type constructor has, one can
consider the, ahem, type, of the type constructor. But to avoid the
obvious looming terminological confusion, the experts use the word
"kind" to refer to the type of a type constructor. The kind of
Continuing the "Maybe" example above, Now here is a crucial point. In declarations of type constructors, such as these:
data Either a b = ... data List e = ... data Maybe f = ...the type variables a, b, e, and f actually
range over type constructors, not over types. Haskell can infer the
kinds of the type constructors Either, List, and
Maybe, and also the kinds of the type variables, from the
definitions on the right of the = signs. In this case, it
concludes that all four variables must have kind *, and so really do
represent types, and not higher-order type constructors. So you can't
ask for Either Int List because List is known to
have kind * → *, and Haskell needs a type constructor of kind * to
serve as an argument to Either. But with a different definition, Haskell might infer that a type variable has a higher-order kind. Here is a contrived example, which might be good for something, perhaps. I'm not sure:
data TyCon f = ValCon (f Int)This defines a type constructor TyCon with kind (* → *) → *,
which can be applied to any type constuctor f that has kind *
→ *, to yield a type. What new type? The new type TyCon
is isomorphic to the type ff of Int. For
example, TyCon List is basically the same as List
Int. The value Just 37 has type Maybe Int,
and the value ValCon (Just 37) has type TyCon
Maybe.
Similarly, the value Now that the jargon is laid out, let's look at (???) again:
data Mu f = In (f (Mu f)) -- (???)When I was first trying to get my head around this, I had trouble seeing what the values were going to be. It looks at first like it has no bottom. The token f here, like in the TyCon
example, is a variable that ranges over type constructors with kind *
→ *, so could be List or Maybe or [],
something that takes a type and yields a new type. Mu itself
has kind (* → *) → *, taking something like f and yielding a
type. But what's an actual value? You need to apply the value
constructor In to a value of type , and it's not immediately clear where to get such a
thing.f (Mu
f)
I asked on
data Mu Maybe = In (Maybe (Mu Maybe))So the In value constructor will take a value of type
Maybe (Mu Maybe) and return a value of type
Mu Maybe. Where do we get a value of type
Maybe (Mu Maybe)? Oh, no problem: the value Nothing
is polymorphic, and has type Maybe for all aa,
so in particular it has type Maybe (Mu Maybe). Whatever
Maybe (Mu Maybe) is, it is a Maybe-type, so it has a
Nothing value. So we do have something to get started
with.
Since In
Nothing, of type Mu Maybe, produces Just (In
Nothing), of type Maybe (Mu Maybe) again. We can repeat
the process as much as we want and produce as many values of type
Mu Maybe as we want; they look like these:
In Nothing In (Just (In Nothing)) In (Just (In (Just (In Nothing)))) In (Just (In (Just (In (Just (In Nothing)))))) ...And that's it, that's the type Mu Maybe, the set of those
values. It will look a little simpler if we omit the In
markers, which don't really add much value. We can just agree to omit
them, or we can get rid of them in the code by defining some semantic
sugar:
nothing = In Nothing just = In . JustThen the values of Mu Maybe look like this:
nothing just nothing just (just nothing) just (just (just nothing)) ...It becomes evident that what the Mu operator does is to close
the type under repeated application. This is analogous to the way the
fixpoint combinator works on values. Consider the usual definition of
the fixpoint combinator:
Y f = f (Y f)Here f is a function of type a → a. Y f
is a fixed point of f. That is, it is a value x of type
a such that f x = x. (Put x = Y
f in the definition to see this.)
The fixed point of a function
⊥
This actually finds the
Consider the example of
{ ⊥ }
The first item is the set that contains nothing but the bottom value,
which we might call
The
type
It might be worth pointing out that this is not the only such fixed
point, but is
So that's what
Y f = f (Y f) -- ordinary fixed-point operator data Mu f = In (f (Mu f)) -- (???)Given f, a function from values to values, Y(f)
calculates a value x such that x = f(x).
Given f, a function from types to types, Mu(f) calculates
a type T such that f(T) = T. That's why the
definitions are identical. (Except for that annoying In
constructor, which really oughtn't to be there.)
You can use this technique to construct various recursive datatypes.
For example,
data Number = Zero | Succ Number;Notice the structural similarity with the definition of Maybe:
data Maybe a = Nothing | Just a;One can similarly define lists:
data Mu f = In (f (Mu f)) data ListX a b = Nil | Cons a b deriving Show type List a = Mu (ListX a) -- syntactic sugar nil :: List a nil = In Nil cons :: a → List a → List a cons x y = In (Cons x y) -- for example ls = cons 3 (cons 4 (cons 5 nil)) -- :: List Integer lt = (cons 'p' (cons 'y' (cons 'x' nil))) -- :: List CharOr you could similarly do trees, or whatever. Why one might want to do this is a totally separate article, which I am not going to write today. Here's the point of today's article: I find it amazing that Haskell's type system is powerful enough to allow one to defined a fixed-point operator for functions over types. We've come a long way since FORTRAN, that's for sure. A couple of final, tangential notes: Google search for "Mu f = In (f (Mu f))" turns up relatively few hits, but each hit is extremely interesting. If you're trying to preload your laptop with good stuff to read on a plane ride, downloading these papers might be a good move.
The Peter Aczel thing seems to be less well-known that it should be.
It is a version of set theory that allows coinductive definitions of
sets instead of inductive definitions. In particular, it allows one
to have a set
Return return
There were two different pieces of code in this paper that wowed me. When I started this article, I was planning to write about #2. I decided that I would throw in a couple of paragraphs about #1 first, just to get it out of the way. This article is that couple of paragraphs. [ Addendum 20080917: Here's the article about #2. ]
Suppose you have a type that represents terms over some type
data Term v = TVar v -- Type variable | TInt -- Integer type | TString -- String type | Fun (Term v) (Term v) -- Function typeThere's a natural way to make the Term type constructor
into an instance of Monad:
instance Monad Term where return v = TVar v TVar v >>= f = f v TInt >>= f = TInt TString >>= f = TString Fun d r >>= f = Fun (d >>= f) (r >>= f)That is, the return operation just lifts a variable name to
the term that consists of just that variable, and the bind
operation just maps its argument function over the variable names in
the term, leaving everything else alone.
Jones wants to write a function,
The result of the unification algorithm should be a set of these
bindings, in this example saying that the input terms can be unified
by replacing the variable "a" with the term
The
unify :: TermOh, but not quite. Because unification can also fail. For example, if you try to unify the terms a → b and Int,
represented by Fun (TVar "a") (TVar "b") and TInt
respectively, the unfication should fail, because there is no term
that is an instance of both of those; one represents a function and
the other represents an integer. So unify does not actually
return a substitution of type v → Term v. Rather, it
returns a monad value, which might contain a substitution, if the
unification is successful, and otherwise contains an error value. To handle
the example above, the unify function will contain a case
like this:
unify TInt (Fun _ _) = fail ("Cannot unify" ....)It will fail because it is not possible to unify functions and integers.
If unification is successful, then instead of using unify should be the
trivial one, a function which takes x and returns TVar
for all variable names xx.
But we already have
such a function. This is just what we decided that
unify TInt TInt = return returnYep, in this case the unify function returns the
return function.Wheee!
At this point in the paper I was skimming, but when I saw That's my couple of paragraphs. I was planning to get to this point and then say "But that's not what I was planning to discuss. What I really wanted to talk about was...". But I think I'll break with my usual practice and leave the other thing for tomorrow. Happy Diada Nacional de Catalunya, everyone! [ Addendum 20080917: Here's the article about the other thing. ]
Factorials are not quite as square as I thought
Let
This actually occurs for
So while there are (of course) solutions to 12! =
Calculations with Mathematica by Mitch Harris show that one has
My thanks to M. Harris, and also to Stephen Dranger, who also wrote in with the results of calculations.
Having gotten this far, I then asked OEIS about the
sequence 1, 1, 3, 1, 9, 27, 15, 18, and (of course) was delivered
a
summary of the
current state of the art in
The question is known as "Brocard's problem", and was posed by Brocard
in 1876. No solutions are known with
The calculations for
[ The original version of this article contained some confusion about
whether
Factorials are almost, but not quite, square
$$n! = a^{2} - b^{2}\qquad (*)$$ will always have solutions whereb is small compared to
a. For example, we have 11! = 6318^{2} - 18^{2}.
But to get
Anyway, back to the subject at hand. Is there an example of
In related matters, it's rather easy to show that there are no
nontrivial examples with
It would be pretty cool to show that equation (*) implied This kept me amused for twenty minutes while I was in line for lunch, anyway. Incidentally, on the lunch line I needed to estimate √11. I described in an earlier article how to do this. Once again it was a good trick, the sort you should keep handy if you are the kind of person who needs to know √11 while standing in line on 33rd Street. Here's the short summary: √11 = √(99/9) = √((100-1)/9) = √((100/9)(1 - 1/100) = (10/3)√(1 - 1/100) ≈ (10/3)(1 - 1/200) = (10/3)(199/200) = 199/60. [ Addendum 20080909: There is a followup article. ]
runN revisited
Check it out. Thank you, M. Crane.
Period three and chaos
A Möbius function is simply a function of the form
One nice thing about the Möbius functions is that you can identify the
Möbius function The matrices are not quite identical with the Möbius functions, because the matrix and the matrix !!{ 2\, 0 \choose 0\,2}!! are the same Möbius function. So you really need to consider the set of matrices modulo the equivalence relation that makes two matrices equivalent if they are the same up to a scalar factor. If you do this you get a group of matrices called the "projective linear group", PGL(2). This takes us off into classical group theory and Lie groups, which I have been intermittently trying to figure out.
You can also consider various subgroups of PGL(2), such as the
subgroup that leaves the set {0, 1, ∞, -1} fixed. The reciprocal
function
In general a Möbius function has three degrees of freedom, since you
can choose the four constants
You may be worrying about the infinities here, but it's really nothing
much to worry about. If
(4 The only other thing you have to remember is that +∞ = -∞, because we're really living on the Riemann sphere. Or rather, we're living on the real part of the Riemann sphere, but either way there's only one ∞. We might call this space the "Riemann circle", but I've never heard it called that. And neither has Google, although it did turn up a bulletin board post in which someone else asked the same question in a similar context. There's a picture of it farther down on the right.
Anyway, most choices of
Here we have eight functions on the reals which make the group D_{4} under the operation of composition. For example,
if f(x) = (x-1)/(x+1), then
f(f(f(f(x)))) = x. Isn't
that nice?Anyway, none of that was what I was really planning to talk about. (You knew that was coming, didn't you?)
What I wanted to discuss was the function
A function with a periodic point of order three is not something you
see every day, and I was
somewhat surprised that as simple a function as 1/(1-
Or you can do a simpler calculation: since
This also gives you a simple matrix
I had noticed a couple of years ago that this 1/(1-
Well, what about it? Sharkovskii's theorem (I misspelled it in the
notebook) is a delightful generalization of the "Period three implies
chaos" theorem of Li and Yorke. It says, among other things, that if
a continuous function of the reals has a periodic point of order 3,
then it also has a periodic point of order
So what about Sharkovskii's theorem? Oh, it only applies to
The Sharkovskii thing is excellent. The Sharkovskii ordering of the integers is:
3 < 5 < 7 < 9 < ... n, then it also has a periodic point of
order m for all m > n in the Sharkovskii ordering.
So if the function has a periodic point of order 2, it must also have
a fixed point; if it has a periodic point of order 4, it must also
have a periodic point of order 2; if it has a periodic point of order
17, it must also have periodic points of all even orders and all odd
orders greater than 17, and so on.
The 1/(1- (The Li and Yorke paper also includes an example of a continuous function with a periodic point of order 5 but no periodic point of order 3. It's pretty simple.) Reading about Sharkovskii's theorem and related matters led me to the web pages of James A. Yorke (of Li and Yorke), and then to the book Chaos: An Introduction to Dynamical Systems that he did with Alligood and Sauer, which is very readable.
I was pleased to finally be studying this material, because it was a
very early inspiration to me. When I was about fourteen, my
cousin Alex, who is an analytic chemist, came to visit, and told me
about period-doubling and chaos in the logistic map. (It was all over
the news at the time.) The logistic map is just (The illustration here, which I copied from Wikipedia, uses r
instead of λ.)I was deeply impressed. For some reason I got the idea that I would need to understand partial differential equations to understand the chaos and the logistic map, so I immediately set out on a program to learn what I thought I would need to know. I enrolled in differential equations courses at Columbia University instead of in something more interesting. The partial differential equations turned out to be a sidetrack, but in those days there were no undergraduate courses in iterated dynamic systems. I am happy to discover that after only twenty-five years I am finally arriving at the destination. Cousin Alex also told me to carry a notebook and pen with me wherever I went. That was good advice, and it took me rather less time to learn.
Freshman electromagnetism questions: answer 3
I had asked:
And one day a couple of months ago it occurred to me that yes, of course the electron vibrates up and down, because that is how radio antennas work. The EM wave comes travelling by, and the electrons bound in the metal antenna vibrate up and down. When electrons vibrate up and down in a metal wire, it is called an alternating current. Some gizmo at the bottom end of the antenna detects the alternating current and turns it back into the voice of Don Imus. I thought about it a little more, and I realized that this vibration effect is also how microwave ovens work. The electromagnetic microwave comes travelling by, and it makes the electrons in the burrito vibrate up and down. But these electrons are bound into water molecules, and cannot vibrate freely. Instead, the vibrational energy is dissipated as heat, so the burrito gets warm. So that's one question out of the way. Probably I have at least three reader responses telling me this exact same thing. And perhaps someday we will all find out together...
Defunctionalization and Java
Consider, for example, the following Haskell program:
```
-- Haskell
aux f = f 1 + f 10
res x = aux (λz -> z + x)
```
The defunctionalization of this example is:
-- Haskell data Hold = HOLD Int fake_apply (HOLD a) b = a + b aux held = fake_apply held 1 + fake_apply held 10 res x = aux (HOLD x)I hope this will make the idea clear. M. Knowles cites the paper Defunctionalization at work by Olivier Danvy and Lasse R. Nielsen, which was lots of fun. (My Haskell example above is a simplification of the example from page 5 of Danvy and Nielsen.) Among other things, Danvy and Nielsen point out that this defunctionalization transformation is in a certain sense dual to the transformation that turns ordinary data structures into λ-terms in Church encoding. Church encloding turns data items like pairs or booleans into higher-order functions; defunctionalization turns them back again. Section 1.4 of the Danvy and Nielsen paper lists a whole bunch of contexts in which this technique has been studied and used, but one thing I didn't think I saw there is that this is essentially the transformation that Java programmers use when they want to use closures.
For example, suppose a Java programmer wants to write something like
```
-- Haskell
aux f = f 1 + f 10
res x = aux (λz -> z + x)
```
But they can't, because Java doesn't have closures.So instead, they do this:
/* Java */ class Hold { private int a; public Hold(int a) { this.a = a; } public int fake_apply(int b) { return this.a + b; } } private static int aux(Hold h) { return h.fake_apply(1) + h.fake_apply(10); } static int res(int x) { Hold h = new Hold(x); return aux(h); }Where the class Hold corresponds directly to the
data type Hold in the defunctionalized Haskell code.
Here is a real example. Consider GNU Emacs. When I enter text-mode
in Emacs, I want a bunch of subsystems to be notified. Emacs has a
In Java, the corresponding facility is called
But in Java the hooks are not functions, as they are in Emacs, because
in Java functions are not first-class entities. Instead, the hooks
are objects which conform to the
Here's another example. I wrote a recursive descent parser in Java a
while back. An
# Perl package ActionParser; sub new { my ($class, $parser, $action) = @_; bless { Parser => $parser, Action => $action } => $class; } # Just like the embedded parser, but invoke the action on success sub parse { my $self = shift; my $input = shift; my $result = $self->{Parser}->parse($input); if ($result->success) $self->{Action}->($result); # Invoke action } return $result; }Here the Action member is expected to be a closure, which is
automatically invoked if the parse succeeds. To use this, I would
write something like this:
# Perl my $missiles; ... my $parser = ActionParser->new($otherParser, sub { $missiles->launch() } ); $parser->parse($input);And then if the input parses correctly, the parser launches the missiles from the anonymous closure, which has captured the local $missiles object.But in Java, you have no closures. Instead, you defunctionalize, and represent closures with objects:
/* Java */ abstract class Action { void invoke(ParseResults results) {} } class ActionParser extends Parser { Action action; Parser parser; ActionParser(Parser p, Action a) { action = a; parser = p; } ParseResults Parse(Input input) { ParseResults res = this.parser.Parse(input); if (res.isSuccess) { this.action.invoke(res); } return res; } }To use this, one writes something like this:
/* Java */ class LaunchMissilesAction extends Action { Missiles m; LaunchMissilesAction(Missiles m) { this.m = m; } void invoke(ParseResults results) { m.launch(); } } ... Action a = new LaunchMissilesAction(missiles); Parser p = new ActionParser(otherParser, a); p.parse(input);The constructor argument missiles takes the place of a free
variable in a closure. The closure itself has been replaced with
an object from an ad hoc class, just as in Danvy and Nielsen's
formulation, the closure is replaced with a synthetic data object that
holds the values of the free variables. The invoke method
plays the role of fake_apply.
Now, it's not a particularly interesting observation that this
Addenda to recent articles 200805
- Regarding the bicameral mind
theory put forth in Julian Jaynes' book The Origin of
Consciousness in the breakdown of the Bicameral Mind, Carl
Witty informs me that the story "Sour Note on Palayata", by James
Schmitz, features a race of bicameral aliens whose mentality is
astonishingly similar to the bicameral mentality postulated by Julian
Jaynes. M. Witty describes it as follows:
The story features a race of humanoid aliens with a "public" and a "private" mind. The "public" mind is fairly stupid, and handles all interactions with the real world; and the "private" mind is intelligent and psychic. The private mind communicates psychically with the private minds of other members of the race, but has only limited influence over the public mind; this influence manifests as visions and messages from God. This would not be so remarkable, since Jaynes' theories have been widely taken up by some science fiction authors. For example, they appear in Neal Stephenson's novel Snow Crash, and even more prominently in his earlier novel The Big U, so much so that I wondered when reading it how anyone could understand it without having read Jaynes first. But Schmitz's story was published in 1956, twenty years before the publication of The Origin of Consciousness. - Also in connection with Jaynes: I characterized his theory as
"either a work of profound genius, or of profound crackpottery". I
should have mentioned that this
characterization was not lost on Jaynes himself. In his book, he
referred to his own theory as "preposterous".
- Many people wrote in with more commentary about my articles on
artificial Finnish
[1]
[2]:
- I had said that "[The one-letter word 'i'] appears in my sample
in connection with Sukselaisen
I hallitus, whatever that is". Several people
explained that this "I" is actually a Roman numeral 1, denoting the ordinal number
"first", and that Sukselaisen
I hallitus is the first government headed by V. J. Sukselaisen.
I had almost guessed this—I saw "Sukselaisen I" in the source material and guessed that the "I" was an ordinal, and supposed that "Sukselaisen I" was analogous to "Henry VIII" in English. But when my attempts to look up the putative King Sukselaisen I met with failure, and I discovered that "Sukselaisen I" never appeared without the trailing "hallitus", I decided that there must be more going on than I had supposed, as indeed there was. Thanks to everyone who explained this. - Marko Heiskanen says that the (fictitious) word
*yhdysvalmistämistammonit*is "almost correct", at least up to the nonsensical plural component "tammonit". The vowel harmony failure can be explained away because compound words in Finnish do not respect the vowel harmony rules anyway. - Several people objected to my program's generation of the word
"klee": Jussi Heinonen said "Finnish has quite few words that begin with
two consonants", and Jarkko Hietaniemi said "No word-initial "kl":s
possible in native Finnish words". I checked, and my sample Finnish
input contains "klassisesta", which Jarkko explained was a loanword,
I suppose from Russian.
Had I used a larger input sample, oddities like "klassisesta" would have had less influence on the output. - I acquired my input sample by selecting random articles from
Finnish Wikipedia, but my random sampling was rather unlucky, since it
included articles about Mikhail Baryshnikov (not Finnish), Dmitry Medvevev
(not Finnish), and Los Angeles (also not Finnish). As a result, the
input contained too many strange un-Finnish letters, like B, D, š, and
G, and so therefore did the output. I could have been more careful in
selecting the input data, but I didn't want to take the time.
Medvedev was also the cause of that contentious "klassisesta", since, according to Wikipedia, "Medvedev pitää klassisesta rock-musiikista". The Medvedev presidency is not even a month old and already he has this international incident to answer for. What catastrophes could be in the future? - Another serious problem with my artificial Finnish is that the
words were too long; several people complained about this, and the
graph below shows the problem fairly clearly:
*x*-axis is word length, and the*y*-axis is frequency, on a logarithmic scale, so that if 1/100 of the words have 17 letters, the graph will include the point (17, -2). The red line, "in.dat", traces the frequencies for my 6 kilobyte input sample, and the blue line, "pseudo.dat", the data for the 1000-character sample I published in the article. ("Ävivät mena osakeyhti...") The green line, "out.dat", is a similar trace for a 6 kb*N*=3 text I generated later. The long right tail is clearly visible. My sincere apologies to color-blind (and blind) readers.I am not sure exactly what happened here, but I can guess. The Markov process has a limited memory, 3 characters in this case, so in particular is has essentially no idea how long the words are that it is generating. This means that the word lengths that it generates should appear in roughly an exponential distribution, with the probability of a word of length *N*approximately equal to !!\lambda e^{-\lambda N} !!, where 1/λ is the mean word length.But there is no particular reason why word lengths in Finnish (or any other language) should be exponentially distributed. Indeed, one would expect that the actual distribution would differ from exponential in several ways. For example, extremely short words are relatively uncommon compared with what the exponential distribution predicts. (In the King James Bible, the most common word length is 3, then 4, with 1 and 8 tied for a distant seventh place.) This will tend to push the mean rightwards, and so it will skew the Markov process' exponential distribution rightwards as well. I can investigate the degree to which both real text and Markov process output approximate a theoretical exponential distribution, but not today. Perhaps later this month.
My thanks again to the many helpful Finnish speakers who wrote in on these and other matters, including Marko Heiskanen, Shae Erisson, Antti-Juhani Kaijanaho, Ari Loytynoja, Ilmari Vacklin, Jarkko Hietaniemi, Jussi Heinonen, Nuutti-Iivari Meriläinen, and any others I forgot to mention. - I had said that "[The one-letter word 'i'] appears in my sample
in connection with Sukselaisen
I hallitus, whatever that is". Several people
explained that this "I" is actually a Roman numeral 1, denoting the ordinal number
"first", and that Sukselaisen
I hallitus is the first government headed by V. J. Sukselaisen.
- My explanation of Korean
vowel harmony rules in that article is substantively correct, but
my description of the three vowel groups was badly wrong. I have
apparently forgotten most of the tiny bit I once knew about Middle
Korean. For a correct description, see
the Wikipedia article
or
this
blog post. My thanks to the anonymous author of the blog post for
his correction.
- Regarding the
transitivity of related-by-blood-ness, Toth András told me about a
(true!) story from the life of Hungarian writer Karinthy
Frigyes:
Karinthy Frigyes got married two times, the Spanish flu epidemic took his first wife away. A son of his was born from his first marriage, then his second wife brought a boy from his previous husband, and a common child was born to them. The memory of this the reputed remark: "Aranka, your child and my child beats our child." (The original Hungarian appears on this page, and the surprisingly intelligible translation was provided by M. Toth and the online translation service at webforditas.hu. Thank you, M. Toth. - Chung-chieh Shan tells me that the missing
document-viewer feature that I described is available in recent
versions of
`xdvi`. Tanaeem M. Moosa says that it is also available in Adobe Reader 8.1.2.
Glade
I wasn't sure how to do this, and my first draft just had a description. But the day before, I had happened to notice a new item that had appeared in the "Programming" menu on my Ubuntu computer: It said "Glade Interface Designer". I had started it up, for no particular reason, and tinkered with it for about two minutes. Glade lets you design a window interface, by positioning buttons and sliders and things, and then does something or other. At the time I didn't know what it would do, but I knew I could mock up the window I wanted, and I thought maybe I could screenshot the mockup for the blog article. The Glade thing was so easy to use that the easiest way to get a mockup of the dialog was to have Glade generate a complete, working windowing application, compile and run the application, and then screenshot the application. I got this done in about fifteen minutes.
The application I made doesn't actually I give Glade a big gold star. I went from having never heard of it to a working (although trivial) window application in one two-minute session and one fifteen-minute session. Maybe two big gold stars and a "Good work!" sticker. [ Addendum 20080530: I went ahead with making an application that actually does something. It worked. ]
More Glade
The outcome: big success. The application has a window with two input fields, a "+" button, and an output field that shows the sum of the input fields when you press the "+" button. It took about half an hour from start to finish, and the only thing I had to look up in the manual was the names of the functions that read and write the values of the text fields. Everything else I got through bricolage and tinkering with the autogenerated monkey code. The biggest problem that I encountered was that the application didn't exit when I clicked the close box, although the window disappeared. I figured out that the close box was sending a "delete" event and not a "destroy" event and fixed it up right quick.
Gtk+ and Glade Interface Designer get
A missing feature in document viewers
Typically, the viewer numbers all the pages sequentially, starting with 1. But many documents have some front matter, such as a table of contents, that is outside the normal numbering. For example, there might be a front cover page, and then a table of contents labeled with page numbers i through xviii, and then the main content of the document follows on pages 1 through 263. Computer programmers, I just realized, have a nice piece of jargon to describe this situation, which is very common. They speak of "logical" and "physical" pages. The "physical" page numbers are the real, honest-to-goodness numbers of the pages, what you get if you start at 1 and count up. The "logical" page numbers are the names by which the pages are referred. In the example document I described, physical page 1 is the front cover, physical page 2 is logical page i, physical page 19 is logical page xviii, physical page 20 is logical page 1, and so forth. The document has 282 physical pages, and the last one is logical page 263. Let's denote physical pages with square brackets and logical pages with curvy brackets. So "(xviii)" and "[19]" denote the same page in this document. Page (1) is page [20], and page (20) is page [39]. Page [1] has no logical designation, or perhaps it is something like "(front cover sheet)". Now the problem I want to discuss is as follows: Every viewer program has a little box where it displays the current page number, and the little box is usually editable. You scan the table of contents, find the topic you want to read about, and the table says that it's on page (165). Then you tell the document viewer to go to page 165, and it does, but it's not the page 165 you want, because the viewer gives you [165], which is actually (146). You actually wanted (165), which is page [184]. Then you curse, mentally subtract 146 (what you got) from 165 (what you wanted), add the result, 19, back to 165, getting 184, and then you ask for 184 to get 165. And if you're me you probably mess up one time in three and have to do it over, because subtraction is hard. But it would be extremely easy for viewer programs to mostly fix this. They need to support an option where you can click on the box and tell it "your page number is wrong here". Maybe you would right-click the little page-number box, and the process would pop up a dialog: Then you would type in 146 (which you can see at the bottom of the page you're viewing) and click "OK". From then on the process would know that the logical and physical page numbers differed by 19, and it would subtract 19 from the number in the little box until you told it something else. You could then type 165 into the little box, and the process would think "well, you asked for (165), and I know that (165) is really [184] because you told me earlier that [165] is really (146)" and then you would get [184], which is what you wanted. And when you scrolled down from (165) to (166), the program would think "ho, you just went from [184] to [185], so I will change the display in the little box and display [185]-19 = (166) there". But no, none of them do this. The document itself should carry this information, and some of them do, sometimes. But not every document will, so viewers should support this feature, which is useful anyway. Some document formats support internal links, but most documents do not use those features, and anyway they are useless when what you are trying to do is look up a reference from someone else's bibliography: "(See Ogul, pp. 662–664.)" This is not a complete solution, but it's an almost complete solution, and it can be implemented unilaterally, by which I mean that the document author and the viewer program author need not agree on anything. It's really easy to do.
[ Addendum 20080521: Chung-chieh Shan informs me that current versions
of [ Addendum 20080530: How I made the dialog box graphic. ]
Luminous band-aids
The band-aid itself is circular, about 1.5 cm in diameter. It is sealed between two pieces of paper, each about an inch square, that have been glued together along the four pairs of edges. There is a flap at one edge that you pull, and then you can peel the two glued-together pieces of paper apart to get the band-aid out. As I peeled apart the two pieces of paper in the dark, there was a thin luminous greenish line running along the inside of the wrapper at the place the papers were being pulled away from each other. The line moved downward following the topmost point of contact between the papers as I pulled the papers apart. It was clearly visible in the dark. I've never heard of anything like this; the closest I can think of is the thing about how wintergreen Life Savers glow in the dark when you crush them. My best guess is that it's a static discharge, but I don't know. I don't have pictures of the phenomenon itself, and I'm not likely to be able to get any. But the band-aids look like this:
Have any of my Gentle Readers seen anything like this before? A cursory Internet search has revealed nothing of value.
[ Addendum 20180911: The phenomenon is well-known; it is called
More artificial Finnish
Of the various comments I received, perhaps the most interesting was from Ilmari Vacklin. ("Vacklin", huh? If my program had generated "Vacklin", the Finns would have been all over the error.) M. Vacklin pointed out that a number of words in my sample output violated the Finnish rules of vowel harmony. (M. Vacklin also suggested that my article must have been inspired by this comic, but it wasn't. I venture to guess that the Internet is full of places that point out that you can manufacture pseudo-Finnish by stringing together a lot of k's and a's and t's; it's not that hard to figure out. Maybe this would be a good place to mention the word "saippuakauppias", the Finnish term for a soap-dealer, which was in the Guinness Book of World Records as the longest commonly-used palindromic word in any language.) Anyway, back to vowel harmony. Vowel harmony is a phenomenon found in certain languages, including Finnish. These languages class vowels into two antithetical groups. Vowels from one group never appear in the same word as vowels from the other group. When one has a prefix or a suffix that normally has a group A vowel, and one wants to join it to a word with group B vowels, the vowel in the suffix changes to match. This happens a lot in Finnish, which has a zillion suffixes. In many languages, including Finnish, there is also a third group of vowels which are "neutral" and can be mixed with either group A or with group B. Modern Korean does not have vowel harmony, mostly, but Middle Korean did have it, up until the early 16th century. The Korean alphabet was invented around 1443, and the notation for the vowels reflected the vowel harmony: [ Addendum 20080517: The following paragraph about vowel harmony contains significant errors of fact. I got the groups wrong. ] The first four vowels in this illustration, with the vertical lines, were incompatible with the second four vowels, the ones with the horizontal lines. The last two vowels were neutral, as was another one, not shown here, which was written as a single dot and which has since fallen out of use. Incidentally, vowel harmony is an unusual feature of languages, and its presence in Korean has led some people to suggest that it might be distantly related to Turkish. The vowel harmony thing is interesting in this context for the following reason. My pseudo-Finnish was generated by a Markov process: each letter was selected at random so as to make the overall frequency of the output match that of real Finnish. Similarly, the overall frequency of two- and three-letter sequences in pseudo-Finnish should match that in real Finnish. Is this enough to generate plausible (although nonsensical) Finnish text? For English, we might say maybe. But for Finnish the answer is no, because this process does not respect the vowel harmony rules. The Markov process doesn't remember, by the time it gets to the end of a long word, whether it is generating a word in vowel category A or B, and so it doesn't know which vowels it whould be generating. It will inevitably generate words with mixed vowels, which is forbidden. This problem does not come up in the generation of pseudo-English. None of that was what I was planning to write about, however. What I wanted to do was to present samples of pseudo-Finnish generated with various tunings of the Markov process.
The basic model is this: you choose a number
Then you start generating text at random. You pick a sequence of
For example, suppose we have
Whether to count capital letters as the same as lowercase, and what to do about punctuation and spaces and so forth, are up to the designer.
Here, as examples, are some samples of pseudo-English, generated with
various
*N*=0- Lt per f idd et oblcs hs hae:uso ar w aaolt y tndh rl ohn otuhrthpboleel.ee n synenihbdrha,spegn.
*N*=1- Cachand t wim, heheethas anevem blsant ims, andofan, ieahrn anthaye s, lso iveeti alll t tand, w.
*N*=2- Ged hich callochbarthe of th to tre said nothem, and rin ing of brom. My and he behou spend the.
*N*=3- Sack one eved of and refor ther of the hand he will there that in the ful, when it up unto rangers.
N parameter. The N=3 sample has mostly real words, and
the few nonsense ones it contains ("eved", "ful") are completely
plausible English. N=2, on the other hand, is mostly nonsense,
although it's mostly plausible nonsense. Even "callochbarthe" is
almost plausible. (The unfortunate "chb" in the middle is just bad
luck. It occurs because Genesis 36 mentions Baalhanan the son of
Achbor.) The N=1 sample is recognizably bogus; no English word
looks like "ieahrn", and the triple "l" in "alll" is nearly
impossible. (I did once write to Jesse Sheidlower, an editor of the Big Dictionary, to ask his advice about
whether "ballless" should be hyphenated.)
I have prepared samples of pseudo-Finnish of various
qualities. The input here was a bunch of text I copied out of Finnish
Wikipedia. (Where else? If you need Finnish text in 1988, you get it
from the Usenet
*N*=0- Vtnnstäklun so so rl sieesjo.Aiijesjeäyuiotiannorin traäl.N vpojanti jonn oteaanlskmt enhksaiaaiiv oenlulniavas. Rottlatutsenynöisu iikännam e lavantkektann eaagla admikkosulssmpnrtinrkudilsorirumlshsmoti,anlosa anuioessydshln.Atierisllsjnlu e.Itatlosyhi vnko ättr otneän akho smalloailäi jiaat kajvtaopnasneilstio tntin einteaonaiimotn:r apoya oruasnainttotne wknaiossäelaäinoev aobrs,vteorlokynv. Aevsrikhanä tp s s oälnlke rvmi il ynae nara ign ssm lkimttbhineaatismäi tst lli ahaltineshne kr keöunv ah s itenh s .Ia pa elstpnanmnuiksriil anaalnttt mr ti.Ooa ka eee eiiei,tnees äusee a nanhetv.Iopkijeatatits,i l eklbiik suössmap tioaotaktdiir rkeaviohiesotkeagarihv nnadvö jlape öt kaeakmjkhykoto tnt iunnuyknnelu rutliie.Leva eiriaösnaj,rk oyumtsle,iioa,aspa aeiaä wsuinn eta y tvati klssviutkuaktmlpnheomi.T akapskushhnuksnhnnheaaaaussitseminmpnamäiaä pät.Kaaaabl unnionuhnpa iaes,outka.Cväinvkshvrnlteeoea rmi re suodmpr autlysa tnliaanäass. Srs rnvrtsita kmidusvjn tii.
*N*=1- Ava pän svun kerekent lsita batävomenasttenerga kovosuujalules rma punntäni rtraliksainoi van eukällä. Enäkukänesinntampalä ttan kolpäsäkyönsllvitivenestakkesenelussivaliite kuuksä kttteni einsuekeita kuterissalietäkilpöikalit ojatäjä pinsin atollukole idoitenn kkaorhjajasteden en vuolynkoiverojaa hta puon ehalan vaivä ihoshäositi. Hde setua tämpitydi makta jasyn sää oinncgrkai jeeten. Ljalanekikeri toiskkksypohoin ta yö atenesällväkeesaatituuun. Paait pukata tuon ktusumitttan zagaleskli va kkanäsin siikutytowhenttvosa veste eten vunovivä. Vorytellkeeni stan jä taa eka kaine ja kurenntonsin kyn o nta ja. Aisst urksetaka. Hotimivaa ta mppussternallai ja. Hdä on koraleerermohtydelen on jon. Rgienon kulinoilisälsa ja holälimmpa vitin, kukausoompremänn ra, palestollebilsen kaalesta, oina. Blilullaushoingiötideispaanoksiton, mulurklimi kermalli pota atebau lmomarymin kypa hta vanon tin kela vanaspoita s kulitekkäjen jäleetuolpan, veesalekäilin oii. Häreli. Ymialisstermimpriekaksst on.
*N*=2- Omaalis onino osa josa hormastaaraktse tyi altäänä tyntellevääostoidesenä, la siä vuansilliana inöön akalkuulukempellys kisä nen myöhelyaminenkiemostamahti omuonsa onite oni kusissa. Kungin sykynteillalkaai ellahasiteisuunnaja eroniemmin javai musuuasinä, sittan tusuovatkryt tormon vuolisenitiivansaliuotkietjuuta sensa. Kutumppalvinen. Vaikintolat hän ja kilkuossa osa koiseuvo keyhdysvisakeemppolowistoisijouliuodosijolasissän muoli ogro soluksi valuksasverix intetormon patlantaan et muiksen paiettaatulun kan vuomesyklees ovain pun. Sesva sa hänerittämpiraun tyi vuoden sälisen sän yhtiit, set tämpiraalletä. Senssaikanoje leemp:tabeten ain raa olliukettyi su. Solulukuuttellerrotolit hee säkinessa hän sekketäärinenvaikeihakti umallailuksin sestunno klossi ilunuta. Klettisaa osen vua vuola, jani ja hinangia en ta kaineemonimien polin barkiviäliukkuta joseseva. Ebb rautta onistärään on ml jokoulistä oheksi anoton allysvallelsiliineuvoja kutuko ala ulkietutablohitkain. Ituno.
*N*=3- Ävivät mena osakeyhti yhdysvalmiininäkin rakenne tuliitä hermoni ja umpirauhastui liin baryshnikoneja. Ain viljelukuullisää olisäke spesideksyylikoliittu latvia. Helsina hän solukeskuksen kannumme, peri palkin vieskeinä sisään on orgaan poikanssisäätelukauno klee laisenäläinen tavastui kauno on länteen muttava hän voimista kilometsästymistettäjän lehtiöiksitoreisö. Sitoutuvat mukalle. Ainettiin sisäke suomaihin, jouluun. Verenkilpalveli valtaineen opisteri poli ohjasionee rakennuttikolan aivastisenäläistuu kehittisetoja, rajahormaailmanajan kulkopuolesti kuluu mooliitoutuvat ovat olle. Ainen yhdysvaltai valiolähtiöiksi vasta, S. Muidentilaisteri jotka verenkirovin verenkiehumistä nelle väliaivoittynyt baleviiliukoisiin maailmestavarasta, jokakuudessa laisu. Sai rakeyhti yhtiö eli gluksessa. Ebbin, ja linnosakkeen hormonien I hallistehtiin kilpirasvua jaajana hormaailusta kunnetteluskäyttöön suomalaivat yhdysvalmistämistammonit veteet olimistuvatta. Hormon oli rautta.
N=2 sample,
let me explain that this is the standard abbreviation for
"millilitra". The "i" in the N=3 sample was a puzzle, since
Marko Heiskanen assures me that Finnish has no one-letter words. But
it appears in my sample in connection with Sukselaisen
I hallitus, whatever that is, so I capitalized it.I must say that I found "yhdysvalmistämistammonit" rather far-fetched, even in Finnish. But then I discovered that "yhdeksänkymmenvuotiaaksi" and "yhdysvalloissakaan" are genuine, so who am I to judge? [ Addendum 20080601: Some additional notes. ]
did actually know English, I could not
properly
evaluate it.But around that time the Internet was just beginning to get into full swing. The Finnish government was investing a lot of money in networking infrastructure, and a lot of people in Finland were starting to appear on the Internet.
I have a funny story about that: Around the same time, a colleague
named Marc Edgar approached me in the computer lab to ask if I knew of
any Internet-based medium he could use to chat with his friend at the
University of Oulu. I thought at first that he was putting me on (and
maybe he was) because in 1989 the University of Oulu was just about
the A new set of Finnish-language newsgroups had recently appeared on Usenet, and people posted to them in Finnish. So I had access to an unlimited supply of computer-readable Finnish text, something which would have been unthinkable a few years before, and I could do the experiment in Finnish.
Uttavalon estaa ain pahalukselle? Min omatunu selle menneet hy, toista. Palveljen alh tkö an välin oli ei alkohol pisten jol elenin. Että, ille, ittavaikki oli nim tor taisuuristä usein an sie a in sittä asia krista sillo si mien loinullun, herror os; riitä heitä suurinteen palve in kuk usemma. Tomalle, äs nto tai sattia yksin taisiä isiäk isuuri illää hetorista. Varsi kaikenlaineet ja pu distoja paikelmai en tulissa sai itsi mielim ssän jon sn ässäksi; yksen kos oihin! Jehovat oli kukahdol ten on teistä vak kkiasian aa itse ee eik tse sani olin mutta todistanut t llisivat oisessa sittä on raaj a vaisen opinen. Ihmisillee stajan opea tajat ja jumalang, sitten per sa ollut aantutta että voinen opeten. Ettuj, jon käs iv telijoitalikantaminun hä seen jälki yl nilla, kkeen, vaaraajil tuneitteistamaan same? In those days, the world was 7-bit, and Finnish text was posted in a Finnish national variant of ASCII that caused words like "tkö an välin" to look like "tk| an v{lin". The presence of the curly braces heightened the apparent similarity, because that was all you could see at first glance. At the time I was pleased, but now I think I see some defects. There are some vowelless words, such as "sn" and "t", which I think doesn't happen in Finnish. Some other words look defective: "ssän" and "kkeen", for example. Also, my input sample wasn't big enough, so once the program generated "alk" it was stuck doing the rest of "alkohol". Still, I think this could pass for Finnish if the reader wasn't paying much attention. I was satisfied with the results of the experiment, and was willing to believe that randomly-contructed English really did look enough like English to fool a non-English-speaking observer. [ Addendum 20080514: There is a followup to this article. ] [ Addendum 20080601: Some additional notes. ]
Human consciousness (which Jaynes describes and defines in considerable detail) is a relatively recent development, dating back at most only about 3,000 years or so. That is the shocking part of the theory. Most people probably imagine consciousness arising much, much earlier, perhaps before language. Jaynes disagrees. In his theory, language, and in particular its mediation of thought through the use of metaphors, is an essential prerequisite for consciousness. And his date for the development of consciousness means that human consciousness would postdate several other important developments, such as metalworking, large-scale agriculture, complex hierarchical social structures, and even writing. Jaynes thinks that the development of consciousness is a historical event and is attested to by written history. He tries to examine the historical record to find evidence not only of preconscious culture, but of the tremendous upheavals that both caused and were the result of the arrival of consciousness. If preconscious humans farmed, built temples and granaries, and kept records, they must have had some sort of organizing behavior that sufficed in place of consciousness. Jaynes believes that prior to the development of consciousness, humans had a very different mentality. When you or I need to make a decision, we construct a mental narrative, in which we imagine ourselves trying several courses of action, and attempt to predict the possible consequences. Jaynes claims that Bronze Age humans did not do this. What then? Instead, says Jaynes, the two halves of the brain were less well-integrated in preconscious humans than they are today. The preconscious mentality was "bicameral", with the two halves of the brain operating more independently, and sometimes at odds with each other. The left hemisphere, as today, was usually dominant. Faced with a difficult decision, preconscious human would wait, possibly undergoing (and perhaps even encouraging) an increasingly agitated physical state, until they heard the voice of a god directing them what to do. These hallucinated voices were generated by the right hemisphere of the brain, and projected internally into the left hemisphere. For example, when the Iliad says that the goddess Athena spoke to Achilles, and commanded and physically restrained him from killing Agamemnon, it is not fabulating: Achilles' right brain hallucinated the voice of the Goddess and restrained him. In Jaynes' view, there is a large amount of varied literary, anthropological, and neurological evidence supporting this admittedly bizarre hypothesis. For example, he compares the language used in the Biblical Book of Amos (bicameral) with that in Ecclesiastes (conscious). He finds many examples of records from the right period of history bewailing the loss of the guidance of the gods, the stilling of their voices, and the measures that people took, involving seers and prophets, to try to bring the guiding voices back. Jaynes speculates that mental states such as schizophrenia, which are frequently accompanied by irresistible auditorily hallucinated commands, may be throwbacks to the older, "bicameral" mental state.
Whether you find the theory amazingly brilliant or amazingly stupid, I
urge to to withhold judgment until you have read the book.
It is a
fat book, and there is a mass of fascinating detail. As I implied,
it's either a work of profound genius or of profound crackpottery, and
I'm not sure which. (Yaakov Sloman tells me that the response to
Wittgenstein's Tractatus Logico-Philosophicus was
similarly ambivalent when it was new. I think the consensus is now on
the genius side.) Either way, it is quite fascinating. There needs
to be Anyway, I did not mean to get into this in so much detail. The reason I brought this up is that because of my continuing interest in Jaynes' theory, and how it is viewed by later scholars, I am reading Muses, Madmen, and Prophets: Rethinking the History, Science, and Meaning of Auditory Hallucination by Daniel B. Smith. I am not very far into it yet, but Smith has many interesting things to say about auditory hallucinations, their relationship to obsessive-compulsive disorder, and other matters. On page 37 Smith mentions a paper, which as he says, has a wonderful title: "Involuntary Masturbation as a Manifestation of Stroke-Related Alien Hand Syndrome". Isn't that just awesome? It gets you coming and going, like a one-two punch. First there's the involuntary masturbation, and while you're still reeling from that it follows up with "alien hand syndrome". To save you the trouble of reading the paper, I will summarize. The patient is a 72-year-old male. He has lesions in his right frontal lobe. He is experiencing "alien hand syndrome", where his hand seems to be under someone else's control, grabbing objects, like the TV remote control, or grabbing pieces of chicken off his plate and feeding them to him, when what he wanted to do was feed himself with the fork in his right hand. "During his hospital stay, the patient expressed frustration and dismay when he realized that he was masturbating publicly and with his inability to voluntarily release his grasp of objects in the left hand." Reaction time tests of his hands revealed that when the left hand was under his conscious control, it suffered from a reaction time delay, but when it was under the alien's control, it didn't. Whee, freaky.
At that moment, the novice was enlightened...
(I have changed the name of the other person.)
Recounting the rationals
Let
And
The sequence of values of
1 1 2 1 3 2 3 1 4 3 5 2 5 3 4 1 5 4 7 3 8 5 7 2 7 5 8 3 7 4 5 ...Now consider the sequence b(n) / b(n+1). This is just
what you get if you
take two copies of the b(n) sequence and place one over the other, with
the bottom one shifted left one place,
like this:
1 1 2 1 3 2 3 1 4 3 5 2 5 3 4 1 5 4 7 3 8 5 7 2 7 5 8 3 7 4 5 ... - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 1 2 1 3 2 3 1 4 3 5 2 5 3 4 1 5 4 7 3 8 5 7 2 7 5 8 3 7 4 5 ...Reading each pair as a rational number, we get the sequence b(n) /
b(n+1), which is 1/1, 1/2, 2/1, 1/3, 3/2, 2/3, 3/1, 1/4,
4/3, 3/5, 5/2, ... .Here is the punchline: This sequence contains each positive rational number exactly once.
If you are just learning to read math papers, or you think you might
like to learn to read them, the
paper in which this is proved would be a good place to start. It
is serious research mathematics, but elementary. It is very short.
The result is very elegant. The proofs are straightforward. The
techniques used are typical and widely applicable; there is no weird
ad-hockery. The discussion in the paper is sure to inspire you to
tinker around with it more on your own. All sorts of nice things turn
up. The Thanks to Brent Yorgey for bringing this to my attention. I saw it in this old blog article, but then discovered he had written a six-part series about it. I also discovered that M. Yorgey independently came to the same conclusion that I did about the paper: it would be a good first paper to read. [ Addendum 20080505: Brad Clow agrees that it was a good place to start. ]
A few notes on "The Manticore"
Here are a few miscellaneous notes about The Manticore.
## Early memoriesHere is David Staunton's earliest memory, from chapter 2, section 1. (Page 87 in my Penguin paperback edition.)
Here is the earliest memory of Francis Cornish, the protagonist of Davies' novel What's Bred in the Bone (1985):
It was in a garden that Francis Cornish first became truly aware of himself as a creature observing a world apart from himself. He was almost three years old, and he was looking deep into a splendid red peony.That is the opening sentence of part two, page 63 in my Penguin Books copy.
## The sideboardThis is from chapter 3 of The Manticore, David's diary entry of Dec. 20:
Inside, it is filled with ... gigantic pieces of furniture on which every surface has been carved within an inch of its life with fruits, flowers, birds, hares, and even, on one thing which seems to be an altar to greed but is more probably a sideboard, full-sized hounds; six of them with real bronze chains on their collars.The following quotation is from Davies' 1984 New York Times article "In a Welsh Border House, the Legacy of the Victorians", a reminiscence of the house his father lived in after his retirement in 1950:
Until my father had it dismantled and removed to a stable, the Great Hall was dominated by what I can only call an altar to gluttony against the south wall. It was a German sideboard of monumental proportions that the Naylors had acquired at the Great Exhibition of 1851. Every fruit, flower, meat, game, and edible was carved on it in life size, including four large hounds, chained to the understructure with wooden chains, so cunningly wrought that they could be moved, like real chains.This is reprinted in The Enthusiasms of Robertson Davies, Judith Skelton Grant, ed.
## What do Canadians think of Saints?Davies has said on a number of occasions that in Fifth Business he wanted to write about the nature of sainthood, and in particular how Canadians would respond if they found that they had a true saint among them. For example, in his talk "What May Canada Expect from Her Writers?" (reprinted in One Half of Robertson Davies, pp. 139–140) he says:
For many years the question occurred to me at intervals: What would Canada do with a saint, if such a strange creature were to appear within our borders? I thought Canada would reject the saint because Canada has no use for saints, because saints hold unusual opinions, and worst of all, saints do not pay. So in 1970 I wrote a book, called Fifth Business, in which that theme played a part.Fifth Business does indeed treat this theme extensively and subtly. In The Manticore he is somewhat less subtle. A perpetual criticism I have of Davies is that he is never content to trust the reader to understand him. He always gets worried later that the reader is not clever enough, and he always comes back to hammer in his point a little more obviously. For example, Fifth Business ends with the question "Who killed Boy Staunton?" and a cryptic, oracular answer. But Davies was unable to resist the temptation to explain his answer for the benefit of people unable or unwilling to puzzle out their own answers, and the end of The Manticore includes a detailed explanation. I think there might be an even plainer explanation in World of Wonders, but I forget. I have a partly-finished essay in progress discussing this tendency in Davies' writing, but I don't know when it will be done; perhaps never. What would Canada think of a saint? Fifth Business is one answer, a deep and brilliant one. But Davies was not content to leave it there. He put a very plain answer into The Manticore. This is again from David's diary entry of Dec. 20 (p. 280):
Eisengrim's mother had been a dominant figure in his own life. He spoke of her as "saintly," which puzzles me. Wouldn't Netty have mentioned someone like that?David's old nurse Netty did indeed mention Eisengrim's mother, although David didn't know that that was who was being mentioned. The mention appears in chapter 2, section 6, p. 160:
She had some awful piece of lore from Deptford to bring out. It seems there had been some woman there when she was a little girl who had always been "at it" and had eventually been discovered in a gravel pit, "at it" with a tramp; of course this woman had gone stark, staring mad and had to be kept in her house, tied up.If you want to know what Robertson Davies thinks that Canada would make of a saint, but you don't want to read and ponder Fifth Business to find out, there you have it in one sentence. [ Addendum: The New York Times review of The Manticore is interesting for several reasons. The title is misspelled in the headline: "The Manitcore". The review was written by a then-unknown William Kennedy, who later became the author of Ironweed (which won the Pulitzer Prize) and other novels. Check it out. ]
Is blood a transitive relation?
It might seem at first glance that "is related to" is transitive, but, at least under conventional definitions, it isn't, because my wife is not related to my cousins. (I was once invited to speak at Haverford College, and, since I have no obvious qualifications in the topic on which I was speaking, I was asked how I had come to be there. I explained that it was because my wife's mother's younger brother's daughter's husband's older brother's wife was the chair of the mathematics department. Remember, it's not what you know, it's who you know.) I think I had sometimes tried to turn "related to" into a transitive relation by restricting it to "is related to by blood". This rules out the example above of my wife being unrelated to my cousins, because my relationship with my wife is not one of blood. I don't quite remember using "related by blood" as an example of a transitive relation, but I think I might have, because I was quite surprised when I realized that it didn't work. I spent a lot of time that morning going over my counterexample in detail, writing it up in my head, as it were. I was waiting around in Trevose for the mechanic to finish examining my car, and had nothing better to do that morning. If I had had a blog then, I would probably have posted it. But it is a good thing that I didn't, because in spite of all my thought about it, I missed something important.
The example is as follows.
Well, this is all very well, but the reason I have filed it under oops/, and the reason it's a good thing I
didn't post it on my (then nonexistent) blog is that this elaborate
counterexample contains a much simpler one: I wish I had some nice conclusion to draw here, but if there's something I could learn from it I can't think would it might be.
Suffering from "make install"
After building There are at least five solutions to this common problem. Uh-oh. If solution #1 had worked, people would not have needed to invent solution #2. If solution #2 had worked, people would not have needed to invent solution #3. Since there are five solutions, there is a good chance that none of them work. You can, I am informed:
- Set
`PREFIX=X`when building the`Makefile` - Set
`INSTALLDIRS=vendor`and`VENDORPREFIX=X`when building the`Makefile`- Or maybe instead of
`VENDORPREFIX`you need to set`INSTALLVENDORLIB`or something - Or maybe instead of setting them while building the
`Makefile`you need to set them while running the`make install`target
- Or maybe instead of
- Set
`LIB=X/lib`when building the`Makefile` - Use
`PAR` - Use
`local::lib`
PAR does; I just want to install the
damn modules into X/lib where the application can find them.
Some of these items fail because they just plain fail. For example,
the first thing everyone says is that you can just set
The manual (which goes by the marvelously
obvious and easily-typed name of
It turns out there is a simple solution that doesn't involve
travelling to New Jersey. The first thing you have to do is give up
entirely on trying to use
So no. But the solution is actually simple. The normal module build
process (as distinct from the install process) puts all this crap
under
In fact, the modules are installed into the proper subdirectories of
For modules with a shared library, you need to copy
MODULE/blib/arch/auto/* into
I remember suffering over this at least ten years ago, when a student
in a class I was teaching asked me how to do it and I let
This is one of those days when I am not happy with software. It
sometimes surprises me how many of those days involve
Dennis Ritchie once said that "
I would like to end this article on a positive note. If you haven't
already, please read Recursive
The "z" command: output filtering
The idea of
z grepand it does approximately the same as: zgrepor you could do: z sedand it would do the same as: zsedif there were a zsed command, although there isn't.
Much of the discussion has concerned a problem with the implementation,
which is that the names of the original compressed files are not
available to the command, due to the legerdemain
Here
## Another possible solutionAt the time I wrote the first article, it occurred to me briefly that it would be possible to havez capture the output of the
command and attempt to translate /proc/self/fd/3 back to
env-2.blog.gz or whatever is appropriate, because although
the subcommand does not know the original filenames, z itself
does. The code would look something like this. Instead of ending by
execing the command, as the original version of z
did:
exec $command, @ARGV; die "Couldn't run '$command': $!.\n";this revised version of z, which we might call zz,
would end with the code to translate back to the original
filenames:
open my($out), "-|", $command, @ARGV or die "Couldn't run '$command': $!.\n"; while (<$out>) { s{/proc/self/fd/(\d+)}{$old[$1]}g; print; }Here @old is an array that translates from file descriptors
back to the original filename.At the time, I thought of doing this, and my immediate thought was "well, that is so obviously a terrible idea that it is not worth even mentioning", so I left it out. But since then at least five people have written to me to suggest it, so it appears that it is not obviously a terrible idea. I had to think a little deeper about why I thought it was a terrible idea.
Really the question is why I think this is a more terrible idea than
the original
The output of the command has a certain format, a certain structure.
We don't know ahead of time what that structure is, but it can be
described for any particular command. For instance, the output of
Similarly, the output of
The original behavior of
With
What if the original sequence was understood as part of a sequence of 2-byte
integers? The result is not even properly aligned. What if that
initial 2f was a count? The resulting count (66) is much too long.
The result would be utterly garbled and unintelligible to
I suppose the short summary here is that
One correspondent argued that the garbling is very unlikely, and
proposed various techniques to make it even less likely, mostly by
rewriting the input filenames to various long random strings. But I
felt then that this was missing the point, and I still do. He
On the other hand, this correspondent made a good point that if the
output of Complete modified source code is available. (Diffs from previous version.)
The "z" command: alternative implementations
For a detailed discussion, see the previous article.
Fixing this flaw seems difficult-to-impossible. As I said earlier,
the trick is to fool the command into reading from a pipe when it
thinks it is opening a file, and this is precisely what
The output is an improvement, but it is not completely solved, and the cost is that the process and file management are much more complicated. In fact, the cost is so high that you have to wonder if it might not be simpler to replace z with a shell script that
copies the data to a temporary directory, uncompresses the files, and
runs the command on the uncompressed files, perhaps something along
these lines:
#!/bin/sh DIR=/tmp/$$ mkdir $DIR COMMAND=$1 shift cp -p "$@" $DIR cd $DIR gzip -d * $COMMAND *This has problems too, but my point is that if you are willing to accept a crappy, semi-working solution along the lines of the FIFO one, simpler ones are at hand. You can compare the FIFO version directly with the shell script, and I think the FIFO version loses. The z implementation I have is a
solution in a different direction, and different tradeoffs, and so
might be preferable to it in a number of ways. But as I said, I don't know yet. [ Addendum 20080325: Several people suggested a fix that I had considered so unwise that I didn't even mention it. But after receiving the suggestion repeatedly, I wrote an article about it. ]
z-commands
But for anything else, you either need to uncompress the files, or
build a special tool. I have a utility that scans the web logs of
while (<>) { ... do something with $_ ... }The <> operator implicitly scans all the lines in all the
files named in the command-line arguments, opening a new file each
time the previous one is exhausted. To decompress the files on the fly, one can preprocess the command-line arguments:
for (@ARGV) { if (/\.gz$/) { $_ = "gzip -dc $_ |"; } } while (<>) { ... do something with $_ ... }The for loop scans the command-line arguments, replacing each
one that has the form foo.gz with gzip -dc foo.gz |.
Perl's magic open semantics treat filenames specially if they end with
a pipe symbol: a pipe to a command is opened instead. Of course, anyone can
think of half a dozen ways in which this can go wrong. But Larry
Wall's skill in making such tradeoffs has been a large factor in Perl's
success.
But it bothered me to have to make this kind of change in every
program that wanted to handle compressed files. We have
But after I got to thinking about it, I decided that I could write a
single
zsed -e 's/:.*//' * | ...where the * matches some files that have .gz
suffixes and some that haven't, one would write:
z sed -e 's/:.*//' * | ...and it would Just Work. That's the idea, anyway.
If
# hypothetical implementation of z # my $command = shift; for (@ARGV) { if (/\.gz$/) { $_ = "gzip -dc $_ |"; } } exec $command, @ARGV; die "Couldn't run command '$command': $!\n";But sed is not written in Perl, and has no magic open. So I
have to play a trickier trick:
for my $file (@ARGV) { if ($file =~ /\.gz$/) { unless (open($fhs[@fhs], "-|", "gzip", "-cd", $file)) { warn "Couldn't open file '$file': $!; skipping\n"; next; } my $fd = fileno $fhs[-1]; $_ = "/proc/self/fd/$fd"; } } # warn "running $command @ARGV\n"; exec $command, @ARGV; die "Couldn't run command '$command': $!\n";This is a stripped-down version to illustrate the idea. For various reasons that I explained yesterday, it does not actually work. The complete, working source code is here.
The idea, as before, is that the program preprocesses the command-line
arguments. But instead of replacing the arguments with pipe commands, which
are not supported by
The trick depends crucially on having
The
But here's the same thing with z:
The problem is even more glaring in the case of commands like wc:
So perhaps This is article #300 on my blog. Thanks for reading. [ Addendum 20080322: There is a followup to this article. ] [ Addendum 20080325: Another followup. ]
Closed file descriptors: the answer
my $command = shift; for my $file (@ARGV) { if ($file =~ /\.gz$/) { my $fh; unless (open $fh, "<", $file) { warn "Couldn't open $file: $!; skipping\n"; next; } my $fd = fileno $fh; $file = "/proc/self/fd/$fd"; } } exec $command, @ARGV; die "Couldn't run command '$command': $!\n";When the loop exits, $fh is out of scope, and the
filehandle it contains is garbage-collected, closing the file."Duh."
Several people suggested that it was because open files are not
preserved across an Abhijit Menon-Sen ran a slightly different test than I did:
% z cat foo.gz bar.gz cat: /proc/self/fd/3: No such file or directory cat: /proc/self/fd/3: No such file or directoryAs he said, this makes it completely obvious what is wrong, since the two files are both represented by the same file descriptor.
Closed file descriptors
my $command = shift; for my $file (@ARGV) { if ($file =~ /\.gz$/) { my $fh; unless (open $fh, "<", $file) { warn "Couldn't open $file: $!; skipping\n"; next; } my $fd = fileno $fh; $file = "/proc/self/fd/$fd"; } } exec $command, @ARGV; die "Couldn't run command '$command': $!\n";The idea here is that this program, called z, will preprocess
the arguments of some command, and then run the command
with the modified arguments. For some of the command-line arguments,
here the ones named *.gz, the original file will be replaced
by the output of some file descriptor. In the example above, the
descriptor is attached to the original file, which is pointless. But
once this part of the program was working, I planned to change the code
so that the descriptor would be
attached to a pipe instead.Having written something like this, I then ran a test, which failed:
"Aha," I said instantly. "I know what is wrong. Perl set the close-on-exec flag on file descriptor 3."
You see, after a successful
So there is an easy fix for the problem: I just set
Huh, something is still wrong.
Maybe I misspelled
Maybe
Nevertheless I put in
use Fcntl; .... my $flags = fcntl($fh, F_GETFD, 0); fcntl($fh, F_SETFD, $flags & ~FD_CLOEXEC);And try it again:
Huh.
I then wasted a lot of time trying to figure out an easy way to
tell if the file descriptor was actually open after the Now your job is to figure out what is wrong. It took me a shockingly long time. No need to email me about it; I have it working now. I expect that you will figure it out faster than I did, but I will also post the answer on the blog tomorrow. Sometime on Friday, 21 March 2008, this link will start working and will point to the answer. [ Addendum 20080321: I posted the answer. ]
Drawing lines
As part of my job I had to produce the following display: The idea here is that the user can fill in the names of three organisms into the form blanks, and the application will find all the studies in its database which conclude that those organisms are related in the indicated way. For example, the user can put "whale" and "hippo" in the top two blanks and "cow" in the bottom one, and the result will be all the studies that conclude (perhaps among other things) that whales and hippos are more closely related to each other than either is to cows. (I think "cothurnocystis bifida" is biologist jargon for cows.) If you wanted to hear more about phylogeny, Java programming, or tree algorithms, you are about to be disappointed. The subject of my article today is those fat black lines. The first draft of the page did not have the fat black lines. It had some incredibly awful ASCII-art that was not even properly aligned. Really it was terrible; it would have been better to have left it out completely. I will not make you look at it. I needed the lines, so I popped down the "graphics" menu on my computer and looked for something suitable. I tried the Gimp first. It seems that the Gimp has no tool for drawing straight lines. If someone wants to claim that it does, I will not dispute the claim. The Gimp has a huge and complex control panel covered with all sorts of gizmos, and maybe one of those gizmos draws a straight line. I did not find one. I gave up after a few minutes. Next I tried Dia. It kept selecting the "move the line around on the page" tool when I thought I had selected the "draw another line" tool. The lines were not constrained to a grid by default, and there was no obvious way to tell it that I wanted to draw a diagram smaller than a whole page. I would have had to turn the thing into a bitmap and then crop the bitmap. "By Zeus's Beard," I cried, "does this have to be so difficult?" Except that the oath I actually uttered was somewhat coarser and less erudite than I have indicated. I won't repeat it, but it started with "fuck" and ended with "this". Here's what I did instead. I wrote a program that would read an input like this:
>-v-< '-+-`and produce a jpeg file that looks like this: Or similarly this:
.---, | >--, '---` '-Becomes this: You get the idea. Now I know some of you are just itching to write to me and ask "why didn't you just use...?", so before you do that, let me remind you of two things. First, I had already wasted ten or fifteen minutes on "just use..." that didn't work. And second, this program only took twenty minutes to write. The program depends on one key insight, which is that it is very, very easy to write a Perl program that generates a graphic output in "PBM" ("portable bitmap") format. Here is a typical PBM file:
P1 10 10 1111111111 1000000001 1000000001 1001111001 1001111001 1001111001 1001111001 1000000001 1000000001 1111111111The P1 is a magic number that identifies the file format; it
is always the same. The 10 10 warns the processor that the
upcoming bitmap is 10 pixels wide and 10 pixels high. The following
characters are the bitmap data.
I'm not going to insult you by showing the 10×10 bitmap image
that this represents.
PBM was invented about twenty years ago by Jef Poskanzer. It was
intended to be an interchange format: say you want to convert images
from format There are also PGM (portable graymap) and PPM (portable pixmap) formats for grayscale and 24-bit color images as well. They are only fractionally more complicated.
Because these formats are so very, very simple, they have been widely adopted.
For example, the JPEG reference implementation includes a sample
Writing a Perl program to generate a P?M file, and then feeding the
output to The program may be interesting as an example of this technique, and possibly also as a reminder of something else. The Perl community luminaries invest a lot of effort in demonstrating that not every Perl program looks like a garbage heap, that Perl can be as bland and aseptic as Java, that Perl is not necessarily the language that most closely resembles quick-drying shit in a tube, from which you can squirt out the contents into any shape you want and get your complete, finished artifact in only twenty minutes and only slightly smelly. No, sorry, folks. Not everything we do is a brilliant, diamond-like jewel, polished to a luminous gloss with pages torn from one of Donald Knuth's books. This line-drawing program was squirted out of a tube, and a fine brown piece of engineering it is.
#!/usr/bin/perl my ($S) = shift || 50; $S here is "size". The default is to turn every character in
the input into a 50×50 pixel tile. Here's the previous example
with $S=10:
my ($h, $w); my $output = []; while (<>) { chomp; $w ||= length(); $h++; push @$output, convert($_); }The biggest defect in the program is right here: it assumes that each line will have the same width $w. Lines all must be
space-padded to the same width. Fixing this is left as an easy
exercise, but it wasn't as easy as padding the inputs, so I didn't do it.The magic happens here:
open STDOUT, "| pnmscale 1 | cjpeg" or die $!; print "P1\n", $w * $S, " ", $h * $S, "\n"; print $_, "\n" for @$output; exit;The output is run through cjpeg to convert the PBM data to
JPEG. For some reason cjpeg doesn't accept PBM data, only
PGM or PPM, however, so the output first goes through
pnmscale, which resizes a P?M input. Here the scale factor
is 1, which is a no-op, except that pnmscale happens to turn
a PBM input into a PGM output. This is what is known in the business
as a "trick". (There is a pbmtopgm program, but it does
something different.)
If we wanted gif output, we could have used
I'm going to omit the details of
# sub convert_ch { my @rows; my $ch = shift; my $up = $ch =~ /[<|>^'`+]/i; my $dn = $ch =~ /[<|>V.,+]/i; my $lt = $ch =~ /[-<V^,`+]/i; my $rt = $ch =~ /[->V^.'+]/i;These last four variables record whether the tile has a line from its center going up, down, left, or right respectively. For example, " |" produces a tile with lines coming up and down from the
center, but not left or right. The /i in the regexes is
because I kept writing v instead of V in the
inputs.
my $top = int($S * 0.4); my $mid = int($S * 0.2); my $bot = int($S * 0.4);The tile is divided into three bands, of the indicated widths. This probably looks bad, or fails utterly, unless $S is a multiple
of 5. I haven't tried it. Do you think I care? Hint: I haven't tried
it.
my $v0 = "0" x $S; my $v1 = "0" x $top . "1" x $mid . "0" x $bot; push @rows, ($up ? $v1 : $v0) x $top;This assembles the top portion of the tile, including the "up" line, if there is one. Note that despite their names, $top also
determines the width of the left portion of the tile, and
$bot determines the width of the right portion. The letter
"v" here is for "vertical".
Perhaps I should explain for the benefit of the readers of Planet
Haskell (if any of them have read this far and not yet fainted with
disgust) that "$a x $b" in Perl is like
my $ls = $lt ? "1" : "0"; my $ms = ($lt || $rt || $up || $dn) ? "1" : "0"; my $rs = $rt ? "1" : "0"; push @rows, ($ls x $top . $ms x $mid . $rs x $bot) x $mid;This assembles the middle section, including the "left" and "right" lines.
push @rows, ($dn ? $v1 : $v0) x $bot;This does the bottom section.
return @rows; }And we are done. Nothing to it. Adding diagonal lines would be a fairly simple matter. Download the complete source code if you haven't seen enough yet. There is no part of this program of which I am proud. Rather, I am proud of the thing as a whole. It did the job I needed, and it did it by 5 PM. Larry Wall once said that "a Perl script is correct if it's halfway readable and gets the job done before your boss fires you." Thank you, Larry. No, that is not quite true. There is one line in this program that I'm proud of. I noticed after I finished that there is exactly one comment in this program, and it is blank. I don't know how that got in there, but I decided to leave it in. Who says program code can't be funny?
Throttling qmail
Since I moved house, I have all sorts of internet-related problems that I didn't have before. I used to do business with a small ISP, and I ran my own web server, my own mail service, and so on. When something was wrong, or I needed them to do something, I called or emailed and they did it. Everything was fine. Since moving, my ISP is Verizon. I have great respect for Verizon as a provider of telephone services. They have been doing it for over a hundred years, and they are good at it. Maybe in a hundred years they will be good at providing computer network services too. Maybe it will take less than a hundred years. But I'm not as young as I once was, and whenever that glorious day comes, I don't suppose I'll be around to see it. One of the unexpected problems that arose when I switched ISPs was that Verizon helpfully blocks incoming access to port 80. I had moved my blog to outside hosting anyway, because the blog was consuming too much bandwidth, so I moved the other plover.com web services to the same place. There are still some things that don't work, but I'm dealing with them as I have time.
Another problem was that a lot of sites now rejected my SMTP
connections. My address was in a different netblock. A Verizon DSL
netblock. Remote SMTP servers assume that anybody who is dumb enough
to sign up with Verizon is also too dumb to run their own MTA. So any
mail coming from a DSL connection in Verizonland
The solution here (short of getting rid of Verizon) is to relay the
mail through Verizon's SMTP relay service. But but but.
If my machine sends more than
So what happens if someone sends a message to one of the
500-subscriber email lists that I host here?
One solution is to get a better mail provider. Lorrie has an
Earthlink account that comes with outbound mail relay service. But
they do the same thing for the same reason. My Dreamhost subscription
comes with an outbound mail relay service. But they do the same thing
for the same reason. My So far there are at least five solutions that are on the "eh, maybe, if I have to" list:
- Get a non-suck ISP
- Find a better mail relay service
- Hack SASL into
`qmail`and send mail through`Pobox.com` - Do some skanky thing with
`serialmail` - Get rid of
`qmail`in favor of postfix, which presumably supports SASL
It also occurred to me in the shower this morning that the old ISP might be
willing to sell me mail relaying and nothing else, for a small fee.
That might be worth pursuing. It's gotta be easier than turning
The
There have been some people on the
I hacked
Here's how it works. Whenever
The program is not strictly correct. It has some race conditions.
Suppose the policy limits So let's see? What else is interesting about this program? I made at least one error, and almost made at least one more. The almost-error was this: The original design for the program was something like:
- do
- lock the history file, read it, and unlock it
- lock the history file, update it, and unlock it
- send the message
A
update the file after process B reads but before B
updates it. Then B's update will destroy A's.One way to fix this is to have the processes append to the history file, but never remove anything from it. That is clearly not a sustainable strategy. Someone must remove expired entries from the history file. Another fix is to have the read and the update in the same critical section:
- lock the history file
- do
- read the history file
- update the history file and unlock it
- send the message
qmail-remote process
can make progress. I had decided that I wanted to try to retain the
concurrency, and so I wasn't willing to accept this.
Cleaning the history file could be done by a separate process that
periodically locks the file and rewrites it. But instead, I have the
- do
- lock the history file, read it, and unlock it
- lock the history file, read it, update it, and unlock it
- send the message
Here's a mistake that I
while (@last >= $msgs) { my $oldest = $last[0]; my $age = time() - $oldest; my $zzz = $time - $age + int(rand(3)); $zzz = 1 if $zzz < 1; # Log("Sleeping for $zzz secs"); sleep $zzz; shift @last while $last[0] < time() - $time; load_policy(); }The throttling policy is expressed by two numbers, $msgs and $time,
and the program tries to send no more than $msgs messages per
$time seconds. The @last array contains a list of
Unix epoch timestamps of the times at which the messages of the last
$time seconds were sent.
So the loop condition checks to see if fewer than $msgs
messages were sent in the last $time seconds. If not, the
program continues immediately, possibly posting its message. (It
rereads the history file first, in case some other messages have been
posted while it was asleep.)
Otherwise the program will sleep for a while. The first three lines in the loop calculate how long to sleep for. It sleeps until the time the oldest message in the history will fall off the queue, possibly plus a second or two. Then the crucial line:
shift @last while $last[0] < time() - $time;which discards the expired items from the history. Finally, the call to load_policy() checks to see if the policy has changed, and
the loop repeats if necessary.
The bug is in this crucial line. if
shift @last while @last && $last[0] < time() - $time;Whoops. I noticed this this morning when my system's load was around 12, and eight or nine qmail-remote processes were collectively eating 100% of
the CPU. I would have noticed sooner, but outbound deliveries hadn't
come to a complete halt yet.
Incidentally, there's another potential problem here arising from the
concurrency. A process will complete the sleep loop in at most
Here's the code. I hereby place it in the public domain. It was written between 1 AM and 3 AM last night, so don't expect too much.
"Boolean" or "boolean"?
... a logical negation function ... takes a boolean argument and returns a boolean result.I worried for some time about whether to capitalize "boolean" here. But writing "Boolean" felt strange enough that I didn't actually try it to see how it looked on the page. I looked at the the Big Dictionary, and all the citations were capitalized. But the most recent one was from 1964, so that was not much help. Then I tried Google search for "boolean capitalized". The first hit was a helpful article by Eric Lippert. M. Lippert starts by pointing out that "Boolean" means "pertaining to George Boole", and so should be capitalized. That much I knew already. But then he pointed out a countervailing consideration:
English writers do not usually capitalize the eponyms "shrapnel" (Henry Shrapnel, 1761-1842), "diesel" (Rudolf Diesel, 1858-1913), "saxophone" (Adolphe Sax, 1814-1894), "baud" (Emile Baudot, 1845-1903), "ampere" (Andre Ampere, 1775-1836), "chauvinist" (Nicolas Chauvin, 1790-?), "nicotine" (Jean Nicot, 1530-1600) or "teddy bear" (Theodore Roosevelt, 1858-1916).Isn't that a great paragraph? I just had to quote the whole thing. Lippert concluded that the tendency is to capitalize an eponym when it is an adjective, but not when it is a noun. (Except when it isn't that way; consider "diesel engine". English is what it is.) I went back to my example to see if that was why I resisted capitalizing "Boolean":
... takes a boolean argument and returns a boolean result.Hmm, no, that wasn't it. I was using "boolean" as an adjective in both places. Wasn't I? Something seemed wrong. I tried changing the example:
... takes an integer argument and returns an integer result.Aha! Notice "integer", not "integral". "Integral" would have been acceptable also, but that isn't analogous to the expression I intended. I wasn't using "boolean" as an adjective to modify "argument" and "result". I was using it as a noun to denote a certain kind of data, as part of a noun phrase. So it is a noun, and that's why I didn't want to capitalize it. I would have been happy to have written "takes a boolean and returns a boolean", and I think that's the controlling criterion. Sorry, George.
Uniquely-decodable codes revisited
Alan Morgan wrote to ask if there was a difference between
uniquely-decodable (UD) codes for strings and for streams. That is,
is there a code for which every finite string is UD, but for which
some I pondered this a bit, and after a little guessing came up with an example: { "a", "ab", "bb" } is UD, because it is a suffix code. But the stream "abbbbbbbbbb..." can be decoded in two ways. After I found the example, I realized that I shouldn't have needed to guess, because I already knew that you sometimes have to see the last symbol of a string before you can know how to decode it, and in such a code, if there is no such symbol, the decoding must be ambiguous. The code above is UD, but to decode "abbbbbbbbbbbbbbb" you have to count the "b"s to figure out whether the first code word is "a" or "ab". Let's say that a code is UD+ if it has the property that no two infinite sequences of code words have the same concatenation. Can we characterize the UD+ codes? Clearly, UD+ implies UD, and the example above shows that the converse is not true. A simple argument shows that all prefix codes are UD+. So the question now is, are there UD+ codes that are not prefix codes? I don't know. [ Addendum 20080303: Gareth McCaughan points out that { "a", "ab" } is UD+ but not prefix. ]
[ Note: You are cautioned that this article is in the oops section of my blog, and so you should understand it as a description of a mistake that I have made. ] For more than a year now my day job has involved work on a large project written entirely in Java. I warned my employers that I didn't have any professional experience in Java, but they hired me anyway. So I had to learn Java. Really learn it, I mean. I hadn't looked at it closely since version 1.2 or so.
Java 1.5 has parametrized types, which they call "generics". These
looked pretty good to me at first, and a big improvement on the cruddy
1975-style type system Java had had before. But then I made a
shocking discovery: If In particular, you cannot:
For a couple of weeks I went around muttering to myself about what idiots the Java people must be. "Geez fuckin' Louise," I said. "The Hindley-Milner languages have had this right for twenty years. How hard would it have been for those Java idiots to pick up a damn book or something?"
I was, of course, completely wrong in all respects. The assignment
above leads to problems that are obvious if you think about it a bit,
and that should have been obvious to me, and would have been, except
that I was so busy exulting in my superiority to the entire Java
community that I You would like to be able to do this:
But if you do, you are setting up a type failure:
Here listOfSoldiers and listOfGenerals refer to the
same underlying object, so we put a common soldier into that object
back on line 4, and then took it out again on line 5. But line 5 is
expecting it to be a General, and it is not. So we either have a type
failure right there, or else we have a General variable that holds a
a Soldier
object, and then on line 6 a mere private is
allowed to order an attack, causing a run-time type failure. If we're
lucky.
The language designers must forbid one of these operations, and the
best choice appears to be to forbid the assignment of the The canonical Java generics tutorial has an example just like this one, to explain precisely this feature of Java generics. I would have known this, and I would have saved myself two weeks of grumbling, if I had picked up a damn book or something.
Furthermore, my premise was flawed. The H-M languages (SML, Haskell,
Miranda, etc.) have
The naive approach for SML is simple. It says that if
ref :The ref function takes a value and produces a reference to
it, like & in C; if the original value had type α then
the result has type ref α. The ! function takes a reference
of type ref α and dereferences it, returning the value of type α that
it refers to, like * in C. And the := function,
usually written infix, takes a reference and a value, stores the value
into the place that the reference points to, replacing what was there
before, and returns nothing. So for example:
val a = "Kitty cat"; (* a : string *) val r = ref a; (* r : ref string *) r := "Puppy dog"; print !r;This prints Puppy dog. But this next example fails, as you
would hope and expect:
val a = "Kitty cat"; val r = ref a; r := 37; (* fails *)because r has type ref string, but 37
has type int, and := requires that the type of
the value on the right match the type referred to by the reference on
the left. That is the obvious, naive approach. What goes wrong, though? The canonical example is:
fun id x = x (* id :The key here is that a is a variable of type ref (α → α), that is,
a reference to a cell that can hold a function whose argument is any
type α and whose return value is the same type. Here it holds a
reference to id, which is the identity function.
Then we define a logical negation function,
Then we dereference I've already talked about SML way longer than I planned to, and I won't belabor you further with explanations of the various schemes that were hatched over the years to try to sort this out. Suffice it to say that the problem is still an open research area. Java, of course, is all references from top to bottom, so this issue obtrudes. The Java people do not know the answer either.
The big error that I made here was to jump to the conclusion that the
Java world must be populated with idiots who know nothing about type
theory or Haskell or anything else that would have tipped them off to
the error I thought they had committed. Probably most of them know
nothing about that stuff, but there are a lot of them, and presumably
As P.J. Plauger once said in a similar circumstance, there is a name for people who are so stupid that they think everyone else is stupid instead. Maybe I won't be that person next time. [ Addendum 20201021: [Here's the original post by Plauger](https://groups.google.com/g/comp.std.c/c/aPENvL9wZFw/m/II24s61rOcoJ), in case I want to find it again, or someone else is interested. ]
More rational roots of polynomials
Suppose you have a polynomial
But consider
So we have an extension of the rational root theorem: instead of
guessing that some factor My notes conclude with:
Is this really less work than just trying all the divisors ofLet's find out.
As in the previous article, say
So we need to find three consecutive integers that respectively divide 12, 15, and 16. The Britannica has no specific technique for this; it suggests doing it by eyeball. In this case, 2–3–4 jumps out pretty quickly, giving the root 3, and so does 6–5–4, which is the root -5. But the method also yields a false root: 4–3–2 suggests that -3 might be a root, and it is not.
Let's see how this goes for a harder example. I wrote a little
Haskell program that generated the random polynomial
That required a fair amount of mental arithmetic, and I screwed up and got 502 instead of 512, which I only noticed because 502 is not composite enough; but had I been doing a non-contrived example, I would not have noticed the error. (Then again, I would have done the addition on paper instead of in my head.) Clearly this example was not hard enough because 2–3–4 and 4–5–6 are obviously solutions, and it will not always be this easy. I increased the range on my random number generator and tried again.
The next time, it came up with the very delightful polynomial
I decided to try one more example, of intermediate difficulty. The
program first gave me
This is just past my mental arithmetic ability; I got 34884 instead of 34864 in the first row, and balked at factoring 61446 in my head. But going ahead (having used the computer to finish the arithmetic), the 17 and 19 in the first and last rows are suggestive, and there is indeed a 17–18–19 to be found. Following up on the 19 in the first row suggests that we look for 19–20–21, which there is, and following up on the 11 in the last row, hoping for a 9–10–11, finds one of those too. All of these are roots, and I do have to admit that I don't know any better way of discovering that. So perhaps the method does have some value in some cases. But I had to work hard to find examples for which it made sense. I think it may be more reasonable with 18th-century technology than it is with 21st-century technology.
Happy Leap Day! Persian edition
If I understand the rules correctly, to determine if a Persian year is
a leap year, one applies the following algorithm to the Persian year
number
- Let
*a*= (*y*+ 2345) % 2820. - If
*a*is 2819,*y*is a leap year. Otherwise, - Let
*b*=*a*% 128. - If
*b*< 29, let*c*=*b*. Otherwise, let*c*= (*b*- 29) % 33. - If
*c*= 0,*y*is not a leap year. Otherwise, - If
*c*is a multiple of 4,*y*is a leap year. Otherwise, -
*y*is not a leap year.
This produces 683 leap years out of every 2820, which means that the average calendar year is 365.24219858 days. How does this compare with the Dominus calendar? It is indeed more accurate, but I consider 683/2820 to be an unnecessarily precise representation of the vernal equinox year, especially inasmuch as the length of the year is changing. And the rule, as you see, is horrendous, requiring either a 2,820-entry lookup table or complicated logic. Moreover, the Persian and Gregorian calendar are out of sync at present. Persian year 1387, which begins next month on the vernal equinox, is a leap year. But the intercalation will not take place until the last day of the year, around 21 March 2009. The two calendars will not sync up until the year 2092/1470, and then will be confounded only eight years later by the Gregorian 100-year exception. After that they will agree until 2124/1502. Clearly, even if it were advisable to switch to the Persian calendar, the time is not yet right. I found this Frequently Asked Questions About Calendars page extremely helpful in preparing this article. The Wikipedia article was also useful. Thanks again to Roland Young for bringing this matter to my attention.
Happy Leap Day!
Everyone knows that our calendar periodically contains an extra day, known to calendar buffs as an "intercalary day", to help make it line up with the seasons, and that this intercalary day is inserted at the end of February. But, depending on how you interpret it, this isn't so. The extra day is actually inserted between February 23 and February 24, and the rest of February has to move down to make room.
I will explain. In Rome,
23 February was a holiday called When Julius Caesar reformed the calendar in 46, he specified that there would be a single intercalary day every four years much as we have today. As in the old calendar, the intercalary day was inserted after Terminalia, although February was no longer truncated. So the extra day is actually 24 February, not 29 February. Or not. Depends on how you look at it. Scheduling intercalary days is an interesting matter. The essential problem is that the tropical year, which is the length of time from one vernal equinox to the next, is not an exact multiple of one day. Rather, it is about 365¼ days. So the vernal equinox moves relative to the calendar date unless you do something to fix it. If the tropical year were exactly 365¼ days long, then four tropical years would be exactly 1461 days long, and it would suffice to make four calendar years 1461 days long, to match. This can be accomplished by extending the 365-day calendar year with one intercalary day every four years. This is the Julian system. Unfortunately, the tropical year is not exactly 365¼ days long. It is closer to 365.24219 days long. So how many intercalary days are needed? It suffices to make 100,000 calendar years total exactly 36,524,219 days, which can be accomplished by adding a day to 24,219 years out of every 100,000. But this requires a table with 100,000 entries, which is too complicated. We would like to find a system that requires a simpler table, but which is still reasonably accurate. The Julian system requires a table with 4 entries, but gives a calendar year that averages 365.25 days long, which is 0.00781 too many. Since this is about 1/128 day, the Julian calendar "gains a day" every 128 years or so, which means that the vernal equinox slips a day earlier every 128 years, and eventually the daffodils and crocuses are blooming in January. Not everyone considers this a problem. The Islamic calendar is only 355 days long, and so "loses" 10 days per year, which means that after 18 years the Islamic new year has moved half a year relative to the seasons. The annual Islamic holy month of Ramadan coincided with July-August in 1980 and with January-February in 1997. The Muslims do intercalate, but they do it to keep the months in line with the phases of the moon.
Still, supposing that we do consider this a problem, we would like to
find an intercalation scheme that is simple and accurate. This is
exactly the problem of finding a simple rational approximation to
0.24219. If Any time the question is "find a simple rational approximation to a number" the answer is likely to involve continued fractions. 365.24219 is equal to:
$$ 365 + {1\over \displaystyle 4 + {\strut 1\over\displaystyle 7 + {\strut 1\over\displaystyle 1 + {\strut 1\over\displaystyle 3 + {\strut 1\over\displaystyle 24 + {\strut 1\over\displaystyle 6 + \cdots }}}}}}$$ which for obvious reasons, mathematicians abbreviate to [365; 4, 7, 1, 3, 24, 6, 2, 2]. This value is exact. (I had to truncate the display above because of a bug in my TeX formula tool: the full fraction goes off the edge of the A0-size page I use as a rendering area.)As I have mentioned before, the reason this horrendous expression is interesting is that if you truncate it at various points, the values you get are the "continuants", which are exactly the best possible rational approximations to the original number. For example, if we truncate it to [365], we get 365, which is the best possible integer approximation to 365.24219. If we truncate it to [365; 4], we get 365¼, which is the Julian calendar's approximation. Truncating at the next place gives us [365; 4, 7], which is 365 + 1/(4 + 1/7) = 356 + 1/(29/7) = 365 + 7/29. In this calendar we would have 7 intercalary days out of 29, for a calendar year of 365.241379 days on average. This calendar loses one day every 1,234 years.
The next convergent is [365; 4, 7, 1] = 8/33, which requires 8
intercalary days every 33 years for an average calendar year of
0.242424 days. This schedule gains only one day in 4,269 years and so
is actually The real question, however, is not whether the table can be made smaller but whether the rule can be made simpler. The rule for the Gregorian calendar requires second-order corrections:
- If the year is a multiple of 400, it is a leap year; otherwise
- If the year is a multiple of 100, it is not a leap year; otherwise
- If the year is a multiple of 4, it is a leap year.
And one frequently sees computer programs that omit one or both of the exceptions in the rule. The 8/33 calendar requires dividing by 33, which is its most serious problem. But it can be phrased like this:
- Divide the year by 33. If the result is 0, it is not a leap year. Otherwise,
- If the result is divisible by 4, it is a leap year.
Furthermore, the rule as I gave it above has another benefit: it matches the Gregorian calendar this year and will continue to do so for several years. This was more compelling when I first proposed this calendar back in 1998, because it would have made the transition to the new calendar quite smooth. It doesn't matter which calendar you use until 2016, which is a leap year in the Gregorian calendar but not in the 8/33 calendar as described above. I may as well mention that I have modestly named this calendar the Dominus calendar. But time is running out for the smooth transition. If we want to get the benefits of the Dominus calendar we have to do it soon. Help spread the word! [ Pre-publication addendum: Wikipedia informs me that it is not correct to use the tropical year, since this is not in fact the time between vernal equinoxes, owing to the effects of precession and nutation. Rather, one should use the so-called vernal equinox year, which is around 365.2422 days long. The continued fraction for 365.2422 is slightly different from that of 356.24219, but its first few convergents are the same, and all the rest of the analysis in the article holds the same for both years. ] [ Addendum 20080229: The Persian calendar uses a hybrid 7/29 and 8/33 system. Read all about it. ] [ Addendum 20240229: Alas, the clock has run out on my attempt to get the world to adopt the Dominus calendar. ]
Algebra techniques that don't work, except when they do
She says "I stopped him before he factored out the
I was a bit surprised by this, because the work so far seemed
reasonable to me. I think the only mistake was not dividing the whole
thing by 3 in the first step. But it is not too late to do that, and
even without it, you can still make progress.
If you get rid of the extra factor of 3 in the first place, the thing
is even easier, because you have
Now obviously, this is not always going to work, but it works often enough
that it would have been the first thing I would have tried.
It is a lot quicker than calculating But probably the student did not have enough ingenuity or number sense to correctly carry off this technique (he didn't notice the 3), so that M. Hirta's advice to just use the damn quadratic formula already is probably good. Still, I wonder if perhaps such students would benefit from exposure to this technique. I can guess M. Hirta's answer to this question: these students will not benefit from exposure to anything.
[ Addendum 20080228: Robert C. Helling points out that I could have
factored the 45 in the first place, without any algebraic
manipulations. Quite so; I completely botched my explanation of what
I was doing. I meant to point out that once you have
[ Addendum 20080301: There is a followup to this article. ]
Uniquely-decodable codes
For example, consider
## Coding theoryIn coding theory, the property has the awful name "unique decodability". The idea is that you have some input symbols, and each input symbol is represented with one output symbol, which is one of the strings fromS. Then suppose you receive some message like
"abbab". Can you figure out what the original input was? For
S_{2}, yes: it must have been ZY. But for
S_{1}, no: it could have been either YZ or XZX.
In coding theory, the strings are called "code words" and the set of
strings is a "code". So the question is how to tell whether a code is
uniquely-decodable.
One obvious way to take a non-uniquely-decodable code and turn it into a uniquely-decodable code is to
append delimiters to the code words. Consider
But if you don't want to transmit the extra delimiters, you can save
bandwidth by making your code uniquely-decodable even without delimiters. The
delimiters are a special case of a more general principle, which is
that a
The proof of this is pretty simple: you have some concatenation of
code words, say
There is a straightforward correspondence between prefix codes and
trees; the code words can be arranged at the leaves of a tree, and
then to decode some concatenation Prefix codes include, as a special case, codes where all the words are the same length. For those codes, the tree is balanced, and has all branches the same length. But uniquely-decodable codes need not be prefix codes. Most obviously, a suffix code is uniquely-decodable and may not be a prefix code. For example, {"a", "aab", "bab", "bb" } is uniquely-decodable but is not a prefix code, because "a" is a prefix of "aab". The proof of uniquely-decodableness is obvious: this is just the last prefix code example from before, with all the code words reversed. If there were two sequences of words with the same concatenation, then the reversed sequences of reversed words would also have the same concatenation, and this would show that the code of the previous paragraph was not uniquely-decodable. But that was a prefix code, and so must be uniquely-decodable. But codes can be uniquely-decodable without being either prefix or suffix codes. For example, { "aabb", "abb", "bb", "bbba" } is uniquely-decodable but is neither a prefix nor a suffix code. Rik wanted a method for deciding. I told Rik about the prefix code stuff, which at least provides a sufficient condition for uniquely-decodableness, and then started poking around to see what else I could learn. Ahem, I mean, researching. I suppose that a book on elementary coding theory would have a discussion of the problem, but I didn't have one at hand, and all I could find online concerned prefix codes, which are of more practical interest because of the handy tree method for speedy decoding. But after tinkering with it for a couple of days (and also making an utterly wrong intermediate guess that it was undecidable, based on a surface resemblance to the Post correspondence problem) I did eventually figure out an algorithm, which I wrote up and released on CPAN, my first CPAN post in about a year and a half.
## An exampleThe idea is pretty simple, and I think best illustrated by an example, as so many things are. We will consider { "ab", "abba", "b" } again. We want to find two sequences of code words whose concatenations are the same. So say we wantpX_{1} = qY_{1}, where
p and q are code words and
X_{1} and Y_{1} are some longer strings. This can only happen
if
p and q are different lengths
and if one is a prefix of the other, since otherwise the two strings
pX_{1} and
qY_{1} don't begin with the same symbols. So we
consider just the cases where
p is a prefix of q, which means
that in this example we want to find
"ab"X_{1} = "abba"Y_{1}, or, equivalently,
X_{1} = "ba"Y_{1}.
Now
Now
The first of these, "b"
where the last line of the table is exactly the solution we seek.
Following the other one, "bba"
But this last equation is essentially the same as the
## The algorithmHaving seen an example, here's the description of the algorithm. We will tabulate solutions toXs = Y, where X and
Y are sequences of code words, for various strings s.
If s is empty, we win.
We start the tabulation by looking for pairs of keywords
If
If If we encounter some queue item that we have seen before, we can discard it; this will prevent us from going in circles. If the next queue item is the empty string, we have proved that the code is not uniquely-decodable. (Alternatively, we can stop just before queueing the empty string.) If the queue is empty, we have investigated all possibilities and the code is uniquely-decodable.
## PseudocodeHere's the summary:
**Initialization:**For each pair of code words*c*_{1}and*c*_{2}with*c*_{1}*s*=*c*_{2}, put*s*in the queue.**Main loop:**Repeat the following until termination- If the queue is empty, terminate. The code is uniquely-decodable.
- Otherwise:
- Take an item
*s*from the queue. - For each code word
*c*:- If
*c*=*s*, terminate. The code is not uniquely-decodable. - If
*cs'*=*s*, and*s'*has not been seen before, queue*s'*. - If
*c*=*ss'*, and*s'*has not been seen before, queue*s'*.
- If
- Take an item
To this we can add a little bookkeeping so that the algorithm emits
the two ambiguous sequences when the code is not uniquely-decodable. The implementation I
wrote uses a hash to track which strings
## NotesAs I said, I suspect this is covered in every elementary coding theory text, but I couldn't find it online, so perhaps this writeup will help someone in the future.
After solving this problem I meditated a little on my role in the
programming community. The kind of job I did for Rik here is a
familiar one to me. When I was in college, I was the math guy who
hung out in the computer lab with the hackers. Periodically one of
them would come to me with some math problem: "Crash, I am writing a
ray tracer. If I have a ray and a triangle in three dimensions, how
can I figure out if the ray intersects the triangle?" And then I
would go off and figure out how to do that and come back with the
algorithm, perhaps write some code, or perhaps provide some
instruction in matrix computations or whatever was needed. In physics
class, I partnered with Jim Kasprzak, a physics major, and we did all
the homework together. We would read the problem, which would be some
physics thing I had no idea how to solve. But Jim understood physics,
and could turn the problem from physics into some mathematics thing
that Puzzle: Is { "ab", "baab", "babb", "bbb", "bbba" } uniquely-decodable? If not, find a pair of sequences that concatenate to the same string.
Research question: What's the worst-case running time of the
algorithm? The queue items are all strings that are strictly shorter
than the longest code word, so if this has length Project to do: Reimplement in Haskell. Compare with Perl implementation. Meditate on how they can suck in such completely different ways. [ There is a brief followup to this article. ] [ Addendum 20170318: Marius Buzea has brought to my attention that this is exactly the Sardinas-Patterson Algorithm. The Wikipedia article claims that the worst-case running time is !!O(nk)!! where !!k!! is the number of code words and !!n!! is their total length. ]
Crappiest literary theory this month
Cornaptious
Webster's came up with nothing. Nothing but "corniculate", anyway, which didn't appear to be related. At that point we had exhausted our meager resources. That's what things were like in those days. The episode stuck with me, though, and a few years later when I became the possessor of the First Edition of the Oxford English Dictionary, I tried there. No luck. Some time afterwards, I upgraded to the Second Edition. Still no luck. Years went by, and one day I was reading The Lyre of Orpheus, by Robertson Davies. The unnamed Dean of the music school describes the brilliant doctoral student Hulda Schnakenburg:
"Oh, she's a foul-mouthed, cornaptious slut, but underneath she is all untouched wonderment.""Aha," I said. "So this is what they were reading that time." More years went by, the oceans rose and receded, the continents shifted a bit, and the Internet crawled out of the sea. I returned to the problem of "cornaptious". I tried a Google book search. It found one use only, from The Lyre of Orpheus. The trail was still cold.
But wait! It also had a suggestion: "Did you mean:
Ho!
Fifty-six hits for "carnaptious", all from books about Scots and
Irish. And the OED does list "carnaptious". "Sc. and Irish dial." it
says. It means bad-tempered or quarrelsome. Had Davies spelled it
correctly, we would have found it right away, because "carnaptious"
So that's that then. A twenty-year-old spelling error cleared up by Google Books. [ Addendum 20080228: The Dean's name is Wintersen. Geraint Powell, not the Dean, calls Hulda Schnakenburg a cornaptious slut. ]
Acta Quandalia
The least interesting number
This reads like a joke, and it is tempting to dismiss it as a trite bit of foolishness. But it has rather interesting and deep connections to other related matters, such as the Grelling-Nelson paradox and Gödel's incompleteness theorem. I plan to write about that someday.
But today my purpose is only to argue that there
$$\sum_{i=1}^\infty {10}^{-i!} = 0.1100010000000000000001000\ldots$$ Why is this number of any concern? In 1844 Joseph Liouville showed that there was an upper bound on how closely an irrational algebraic number could be approximated by rationals.L can be
approximated much more closely than that, and so must therefore be
transcendental. This was the proof of the existence of transcendental
numbers.
The only noteworthy mathematical property possessed by
Liouville's theorem shows how to construct many transcendental
numbers, but the construction generates many similar numbers. For
example, you can replace the 10 with a 2, or the The argument in Berry's paradox fails for the real numbers: since the real numbers are not well-ordered, the set of uninteresting real numbers need have no smallest element, and in fact (by Berry's argument) does not. Liouville's number is not the smallest number of its type, nor the largest, nor anything else of interest.
If someone were to come along and prove that Liouville's number was
the
Trivial theorems
Except of course it wasn't Acta Quandalia (which would
never commit such a silly error) and it didn't concern
This would not qualify as a major screwup under my definition in the original article, since the theorems are true, but it certainly would have been rather embarrassing. Journals are not supposed to publish papers about the properties of the empty set. Hmm, there's a thought. How about a Journal of the Properties of the Empty Set? The editors would never be at a loss for material. And the cover almost designs itself. Handsome, isn't it? I See A Great Need!
Ahem. Anyway, if the folklore in question is true, I suppose the
mathematicians involved might have felt proud rather than ashamed,
since they could now boast of having completely solved the problem of
partially uniform Is this story true? Are there any examples? Please help me, gentle readers. [ Addendum 20210828: Mathematician [Carl Ludig Siegel once wrote in a letter](http://docmadhattan.fieldofscience.com/2019/12/carl-ludwig-siegel-number-thoery-pig.html) “I am afraid that mathematics will perish before the end of this century if the present trend for senseless abstraction — as I call it: theory of the empty set — cannot be blocked up.” ]
Steganography in 1665: correction
Phil Rodgers has pointed out that a "physique" is not an emetic, as I thought, but a laxative. Are there any among you who doubt that Bruce Schneier can shoot sluggbullets out of his ass? Let the unbelievers beware!
Steganography in 1665
He told us a very handsome passage of the King's sending him his message ... in a sluggbullet, being writ in cypher, and wrapped up in lead and swallowed. So the messenger come to my Lord and told him he had a message from the King, but it was yet in his belly; so they did give him some physique, and out it come.Sure, Bruce Schneier can mount chosen-ciphertext attacks without even choosing a ciphertext. But dare he swallow a "sluggbullet" and bring it up again to be read?
Silly me. Bruce Schneier can probably cough up a sluggbullet [ Addendum 20080205: A correction. ]
Major screwups in mathematics: example 1
Readers suggested several examples, and I got lucky and turned up one on my own.
Some of the examples were rather obscure technical matters, where
Professor Snorfus publishes in Acta Quandalia that all
partially uniform
## General remarksI would like to make some general remarks first, but I don't quite know yet what they are. Two readers independently suggested that I should read Proofs and Refutations by Imre Lakatos, and raised a number of interesting points that I'm sure I'd like to expand on, except that I haven't read the book. Both copies are checked out of the Penn library, which is a good sign, and the interlibrary loan copy I ordered won't be here for several days.Still, I can relate a partial secondhand understanding of the ideas, which seem worth repeating. Whether a result is "correct" may be largely a matter of definition. Consider Lakatos' principal example, Euler's theorem about polyhedra: Let F, E, and V be the number of faces, edges,
and vertices in a polyhedron. Then F - E + V =
2. For example, the cube has (F, E, V) = (6, 12,
8), and 6 - 12 + 8 = 2.
Sometime later, someone observed that Euler's theorem was false for
polyhedra with holes in them. For example, consider the object
shown at right. It has (
Can we say that Euler was wrong? Not really. The question hinges on
the definition of "polyhedron". Euler's theorem is proved for
"polyhedra", but we can see from the example above that it only holds
for "simply-connected polyhedra". If Euler proved his theorem at a
time when "polyhedra" was implicitly meant
"simply-connected", and the generally-understood definition changed
out from under him, we can't hold that against Euler. In fact, the
failure of Euler's theorem for the object above suggests that maybe we
Okay, enough introductory remarks. My first example is unquestionably a genuine error, and from a first-class mathematician.
## Mathematical backgroundSome terminology first. A "formula" is just that, for example something like this:
$$\displaylines{ ((\forall a.\lnot R(a,a)) \wedge\cr (\forall b\forall c.R(b,c)\to\lnot R(c,b))\wedge\cr (\forall d\forall e\forall f.(R(d,e)\wedge R(e,f)\to R(d,f))) \to\cr (\forall x\exists y.R(y,x)) }$$ It may contain a bunch of quantified variables (a, b,
c, etc.), relations (like R), and logical connectives
like ∧. A formula might also include functions and constants
(which I didn't) or equality symbols (there are none here).
One can ask whether the formula is true (or, in the jargon, "valid"),
which means that it must hold regardless of how one chooses the set
$$ \forall a\exists b.(P(a)\wedge P(b))\to P(a) $$ This is true regardless of the meaning we ascribe toP, and
regardless of the set from which a and b are required to
be drawn.
The longer formula above, which requires that Gödel famously showed that it is an undecidable problem to determine whether a given formula of arithmetic is satisfiable. That is, there is no method which, given any formula, is guaranteed to tell you correctly whether or not there is some interpretation in which the formula is true. But one can limit the form of the allowable formulas to make the problem easier. To take an extreme example, just to illustrate the point, consider the set of formulas of the form:
∃ a, b, etc. are each either 0 or 1, all one needs to do
to decide whether the formula is satisfiable is to try every possible
assignment of 0 and 1 to the n variables and see whether
R(a,b,...) is true in any of the
2^{n} resulting cases. If so, the formula is satisfiable, if not
then not.
## Kurt Gödel, 1933One would like to prove decidability for a larger and more general class of formulas than the rather silly one I just described. How big can the class of formulas be and yet be decidable?It turns out that one need only consider formulas where all the quantifiers are at the front, because there is a simple method for moving quantifiers to the front of a formula from anywhere inside. So historically, attention has been focused on formulas in this form.
One fascinating result concerns the class of formulas called [∃ q...∃z,
with exactly two ∀ quantifiers, with no intervening
∃s. These formulas may contain arbitrary relations amongst the variables,
but no functions or constants, and no equality symbol. [∃^{*}∀^{2}∃^{*}, all, (0)] is
decidable: there is a method which takes any formula in this form and
decides whether it is satisfiable. But if you allow three
∀ quantifiers (or two with an ∃ in between) then the set
of formulas is no longer decidable. Isn't that freaky?
The decidability of the class [∃
In conclusion, I would still like to remark that Theorem I can also be proved, by the same method, for formulas that contain the identity sign. ## OopsThis was believed to be true for more than thirty years, and the result was used by other mathematicians to prove other results. But in the mid-1960s, Stål Aanderaa showed that Gödel's proof would not actually work if the formulas contained equality, and in 1983, Warren D. Goldfarb proved that Gödel had been mistaken, and the satisfiability of formulas in the larger class wasnot
decidable.
## SourcesGödel's original 1933 paper is Zum Entscheidungsproblem des logischen Funktionenkalküls (On the decision problem for the functional calculus of logic) which can be found on pages 306–327 of volume I of his Collected Works. (Oxford University Press, 1986.) There is an introductory note by Goldfarb on pages 226–231, of which pages 229–231 address Gödel's error specifically.I originally heard the story from Val Tannen, and then found it recounted on page 188 of The Classical Decision Problem, by Egon Boerger, Erich Grädel, and Yuri Gurevich. But then blog reader Jeffrey Kegler found the Goldfarb note, of which the Boerger-Grädel-Gurevich account appears to be a summary. Thanks very much to everyone who contributed, and especially to M. Kegler.
(I remind readers who have temporarily forgotten, that [ Addendum 20080206: Another article in this series. ] [ Addendum 20200206: A serious mistake by Henri Lebesgue. ]
Addenda to recent articles 200801
- As a result of my research into the Harriet Tubman mural that was
demolished in 2002, I learned that it had been repainted last year
at 2950 Germantown Avenue.
- A number of readers, including some honest-to-God Italians, wrote
in with explanations of Boccaccio's term
*milliantanove*, which was variously translated as "squillions" and "a thousand hundreds".The "milli-" part suggests a thousand, as I guessed. And "-anta" is the suffix for multiples of ten, found in "quaranta" = "forty", akin to the "-nty" that survives in the word "twenty". And "nove" is "nine". So if we wanted to essay a literal translation, we might try "thousanty-nine". Cormac Ó Cuilleanáin's choice of "squillions" looks quite apt. - My article about clubbing
someone to death with a loaded Uzi neglected an essential
technical point. I repeatedly said that
for my $k (keys %h) { if ($k eq $j) { f($h{$k}) } } could be replaced with: f($h{$j}) But this is only true if `$j`actually appears in`%h`. An accurate translation is:f($h{$j}) if exists $h{$j} I was, of course, aware of this. I left out discussion of this because I thought it would obscure my point to put it in, but I was wrong; the opposite was true. I think my original point stands regardless, and I think that even programmers who are unaware of the existence of `exists`should feel a sense of unease when presented with (or after having written) the long version of the code.An example of this error appeared on PerlMonks shortly after I wrote the article. - Robin Houston provides another example of a
nonstandard adjective in mathematics: a quantum group is
*not*a group.We then discussed the use of nonstandard adjectives in biology. I observed that there seemed to be a trend to eliminate them, as with "jellyfish" becoming "jelly" and "starfish" becoming "sea star". He pointed out that botanists use a hyphen to distinguish the standard from the nonstandard: a "white fir" is a fir, but a "Douglas-fir" is not a fir; an "Atlas cedar" is a cedar, but a "western redcedar" is not a cedar. Several people wrote to discuss the use of "partial" versus "total", particularly when one or the other is implicit. Note that a total order is a special case of a partial order, which is itself a special case of an "order", but this usage is contrary to the way "partial" and "total" are used for functions: just "function" means a total function, not a partial function. And there are clear cases where "partial" is a standard adjective: partial fractions are fractions, partial derivatives are derivatives, and partial differential equations are differential equations. - Steve Vinoski posted a very interesting solution to my
question about how to set Emacs file modes: he suggested
that I could define a replacement
`aput`function. - In my utterly useless review of Robert Graves' novel King
Jesus I said "But how many of you have read I,
Claudius and Suetonius? Hands? Anyone? Yeah, I didn't think
so." But then I got email from James Russell, who said he had indeed
read both, and that he knew
*just*what I meant, and, as a result, was going directly to the library to take out King Jesus. And he read the article on Planet Haskell. Wow! I am speechless with delight. Mr. Russell, I love you. From now on, if anyone asks (as they sometimes do) who my target audience is, I will say "It is James Russell." - A number of people wrote in with examples of "theorems" that were
believed proved, and later turned out to be false. I am preparing a
longer article about this for next month. Here are some teasers:
- Cauchy
apparently "proved" that if a sum of continuous functions converges
pointwise, then the sum is also a continuous function, and this error
was widely believed for several years.
- I just learned of a major
screwup by none other than Kurt Gödel concerning the decidability
of a certain class of sentences of first-order arithmetic which went
undetected for thirty years.
- Robert Tarjan proved in the
1970s that the time complexity of a certain algorithm for the
union-find problem was slightly worse than linear. And several people
proved that this could not be improved upon. But Hantao Zhang has a paper
submitted to STOC
2008 which, if it survives peer review, shows that that the
analysis is wrong, and the algorithm is actually
*O*(*n*).[ Addendum 20160128: Zhang's claim was mistaken, and he retracted the paper. ] - Finally, several people, including John Von Neumann, proved that the
axioms of arithmetic are consistent. But it was shown later that no
such proof is possible.
- Cauchy
apparently "proved" that if a sum of continuous functions converges
pointwise, then the sum is also a continuous function, and this error
was widely believed for several years.
- A number of people wrote in with explanations of "more than twenty states"; I
will try to follow up soon.
Unnecessary imprecision
McCain has won all of the state's 57 delegates, and the last primary before voters in more than 20 states head to the polls next Tuesday.Why "more than 20 states"? Why not just say "23 states", which is shorter and conveys more information? I'm not trying to pick on CTV here. A Google News search finds 42,000 instances of "more than 20", many of which could presumably be replaced with "26" or whatever. Well, I had originally written "most of which", but then I looked at some examples, and found that the situation is better than I thought it would be. Here are the first ten matches:
- Australian Stocks Complete Worst Month in More Than 20 Years
- It said the US air force committed more than 20 cases of aerial espionage by U-2 strategic espionage planes this month.
- Farmland prices have climbed more than 20% over the past year in many Midwestern states...
- "We have had record-breaking growth in our monthly shipments, as much as more than 20 percent improvements per month," said Christopher Larkins, President...
- More than 20 people, including a district officer, were injured when two bombs exploded outside a stadium in the town yesterday...
- By a vote of 14-7, the Senate Finance Committee last night voted to deliver $500 tax rebates to more than 20 million American senior citizens...
- 9 killed, more than 20 injured in bus accident
- While Tuesday's results may not lock up the nomination for either candidate, Democrats will have their say in more than 20 states...
- Facing the potential anointment of his rival, John McCain, Romney has less than a week to convince voters in more than 20 states that...
- More than 20 Aberdeen citizens qualified for elections as April ...
#2 may be legitimate, if the number of cases of aerial espionage is not known with certitude, or if the anonymous source really did say "more than 20". Similarly #4 is entirely off the hook since it is a quotation. #3 may be legitimate if the price of farmland is uncertain and close to 20%. #5 is probably a loser. #7 is definitely a loser: it was the headline of an article that began "Nine people were killed and at least 22 injured when...". The headline could certainly have been "9 killed, 22 injured in bus accident". #8 and #9 are losers, but they are the same example with which I began the article, so they don't count. #10 is a loser. So I have, of eight examples (disregarding #8 and #9) three certain or near-certain failures (#5, #7, and #10), one certain non-failure (#4), and four cases to which I am willing to extend the benefit of the doubt. This is not as bad as I feared. I like when things turn out better than I thought they would. But I really wonder what is going on with all these instances of "more than 20 states". Is it just sloppy writing? Or is there some benefit that I am failing to appreciate?
Ramanujan's congruences
Looking at this, anyone could conjecture that
But there are other such congruences. For example, according to Partition Congruences and the Andrews-Garvan-Dyson Crank:
$$ p(17\cdot41^4k + 1122838) = 0 \pmod{17} $$ Isn't mathematics awesome?
The Census Bureau's data file
The data is available from the Census Bureau's web site. It is a CSV file. Most of the file contains actual data, like this:
20220,,"Dubuque, IA",Metropolitan Statistical Area,"92,384","91,603","91,223","90,635","89,571","89,216","89,265","89,156","89,143"Experienced data mungers will feel a sense of foreboding as they look at the commas in those numerals. Commas are for people, and if the data file is written for people, rather than for computers, then getting the computer to read it is going to require at least a little bit of suffering. Indeed, the rest of the data is rather dirty. There is a useless header: table with row headers in column A and column headers in rows 3 through 4 (leading dots indicate sub-parts),,,,,,,,,,,,^M "Table 1. Annual Estimates of the Population of Metropolitan and Micropolitan Statistical Areas: April 1, 2000 to July 1, 2006",,,,,,,,,,,,^M CBSA Code,"Metro Division Code",Geographic area,"Legal/statistical area description",Population estimates,,,,,,,"April 1, 2000",^M ,,,,"July 1, 2006","July 1, 2005","July 1, 2004","July 1, 2003","July 1, 2002","July 1, 2001","July 1, 2000",Estimates base,Census^M ,,Metropolitan statistical areas,,,,,,,,,,^MAnd there is a similarly useless footer on the bottom of the file. Any program that wants to use this data has to trim off the header and the footer, or ignore them, or the user will have to trim them off manually.
(I've translated ASCII CR characters to Well, all this is minor. My real complaint is that some of the state name abbreviations are garbled:
```
19740,,"Denver-Aurora, CO1",Metropolitan Statistical Area,"2,408,750","2,361,778","2,326,126","2,299,879","2,276,592","2,245,030","2,193,737","2,179,320","2,179,240"
```
Notice that it says CO1 rather than CO, short for
"Colorado". I was fortunate to notice this garbling. Since it
occurred on the line for Denver (among others) the result was that the
program was unable to locate the population of Denver, which is the
capital of Colorado, and a mandatory part of the program's output. So
it raised a warning. Then I went in and manually corrected the
CO1 to say CO. I also added a check to the program
to make sure that it recognized all the state abbreviations; I should
have had this in there in the first place.Then I sent email to an acquaintance who works for the Census Bureau (identity suppressed to protect the innocent), pointing out the errors so that they could be corrected.
My contact checked with the people who produced the data, and informed
me that, according to them,
"1Broomfield, CO was formed from parts of Adams, Boulder, Jefferson, and Weld Counties, CO on November 15, 2001 and was coextensive with Broomfield city.",,,,,,,,,,,,^M "For purposes of presenting data for metropolitan and micropolitan statistical areas for Census 2000, Broomfield is treated as if it were a county at the time of the 2000 census.",,,,,,,,,,,,^MA footnote. I realize now that that footer was not as useless as I thought it was. Wow. A footnote. Wow. I would like to suggest the following as a basic principle of computerized data processing:
Data files should contain data. Not metadata. Not explanations. Not little essays. And not footnotes. Just the data.There's a larger issue here about confusing content and presentation. But "Data files should contain data" is simpler and easier to remember. I suspect that this file was exported from a spreadsheet program, probably Excel. Spreadsheet programs desperately want you to confuse content and presentation. This is why one should not use a spreadsheet as a database. I now recall another occasion when I had to deal with data that was exported from a spreadsheet that was pretending to be a database. It was a database of products made by a large cosmetics company. A typical record looked like this:
"Soft-Pressed Powder Blusher","618J-05","Warm, natural-looking powder colour for all skins. Wide range of shades-subtle to vibrant. With applicator brush.","Cheeks","Nudes","Chestnut Blush","All","","19951201","Yes","","14.5",""The 618J-05 here is a product code. Bonus
points if you see what's coming next.
```
"Water-Dissolve Cream Cleanser","6.61E+01","Creamy cleanser for drier, more sensitive skins. Dissolves even the most tenacious makeups.","Cleansers","","","","Sub I, I, II","19951201","Yes","1","14.5",""
```
That 6.61E+01 should have been 661E-01, but Excel
decided that it was a numeral, in scientific notation, and put it into
normal form.
Back to the Census Bureau, which almost screwed me by putting a footnote
on a state name. What if they had decided to put footnotes on the
population figures? Then I would have been No, wait! It's all become clear. That's why they put the commas in the numerals! [ Addendum 20080129: My Census Bureau contact tells me that the authors of the data file have seen the wisdom of my point of view, in spite of my unconstructive and unhelpful feedback (I said "Wow, that is an incredibly terrible idea") and are planning to address the issue in the next release of the data. Hooray for happy endings! ]
[ Addendum 20080129: My Census Bureau contact tells me
that they
Nonstandard adjectives in mathematics
The property is not really attached to the adjective itself. Red emeralds are not emeralds, so "red" is nonstandard when applied to emeralds. Fake expressions of sympathy are still expressions of sympathy, however insincere. "Toy" often goes both ways: a toy fire engine is not a fire engine, but a toy ball is a ball and a toy dog is a dog. Adjectives in mathematics are rarely nonstandard. An Abelian group is a group, a second-countable topology is a topology, an odd integer is an integer, a partial derivative is a derivative, a well-founded order is an order, an open set is a set, and a limit ordinal is an ordinal. When mathematicians want to express that a certain kind of entity is similar to some other kind of entity, but is not actually some other entity, they tend to use compound words. For example, a pseudometric is not (in general) a metric. The phrase "pseudo metric" would be misleading, because a "pseudo metric" sounds like some new kind of metric. But there is no such term.
But there is one glaring exception. A partial function is If I were more quixotic, I would propose that partial functions be called "partialfunctions" instead. Or perhaps "pseudofunctions". Or one could go the other way and call them "normal relations", where "normal" can be replaced by whatever adjective you prefer—ejective relations, anyone? I was about to write "any of these would be preferable to the current confusion", but actually I think it probably doesn't matter very much. [ Addendum 20080201: Another example, and more discussion of "partial". ] [ Addendum 20081205: A contravariant functor is not a functor. ] [ Addendum 20090121: A hom-set is not a set. ] [ Addendum 20110905: A skew field is not a field. The Wikipedia article about division rings observes that this use of "skew" is counter to the usual behavior of adjectives in mathematics. ] [ Addendum 20120819: A snub cube is not a cube. Several people have informed me that a quantum group is not a group. ]
[ Addendum 20140708: nLab refers to the [ Addendum 20160505: The gaussian integers contain the integers, not vice versa, so a gaussian integer is not in general an integer. ] [ Addendum 20190503: Timon Salar Gutleb points out that affine spaces were at one time called “affine vector spaces” so that every vector space was an affine vector space, but not vice versa. ] [ Addendum 20221106: the standard term for this appears to be “privative adjective”. ] [ Addendum 20240508: [Robin Houston asks if there is such a thing as an incorrect proof](https://mathstodon.xyz/@robinhouston/112401229287518162). [Simon Tatham points out](https://mathstodon.xyz/@simontatham@hachyderm.io/112404198589351199) that a “manifold with boundary” is not a manifold. ]
Emacs and alists
Yesterday I upgraded Emacs, and since it was an upgrade, something that
had been working for me for fifteen years stopped working, because
that's what "upgrade" means. My
(aput 'auto-mode-alist "\\.pl\\'" (function cperl-mode)) (aput 'auto-mode-alist "\\.t\\'" (function cperl-mode)) (aput 'auto-mode-alist "\\.cgi\\'" (function cperl-mode)) (aput 'auto-mode-alist "\\.pm\\'" (function cperl-mode)) (aput 'auto-mode-alist "\\.blog\\'" (function text-mode)) (aput 'auto-mode-alist "\\.sml\\'" (function sml-mode))I should explain this, since I imagine that most readers of this blog are like me in that they touch Emacs Lisp only once a year on Saint Vibrissa's Day. An alist ("association list") is a common data
structure in Lisp programs. It is a list of pairs; the first element
of each pair is a key, and the second element is an associated value.
The pairs in the special auto-mode-alist variable have regexes
as their keys and functions as their values. Whenever Emacs opens a
new file, it scans this alist, until it finds a regex that matches the
name of the file. It then executes the associated function. Thus the
effect of the first line above is to have Emacs enable the
cperl-mode function on any file whose name ends in ".pl".
The
When I upgraded emacs, this broke. The
I asked about this on IRC, and was told that the correct way to do
this, if I did not want to
(mapc (lambda (x) (when (eq 'perl-mode (cdr x)) (setcdr x 'cperl-mode))) (append auto-mode-alist interpreter-mode-alist))The effect of this is to scan over auto-mode-alist (and also
interpreter-mode-alist, a related variable) looking for any
association whose value was the perl-mode function, and
using setcdr to replace perl-mode with
cperl-mode.
(This does not address the issue of what to do with
I was totally boggled. Choosing the right editing mode for a file is
a basic function of emacs. I could not believe that the best and
simplest way to add or change associations was to use
Apparently the issue was that if
This seemed very unlikely to me. You see, an alist is a
I looked in the manual, and reported that the I pondered the manual some more and found this passage:
However, association lists have their own advantages. Depending on your application, it may be faster to add an association to the front of an association list than to update a property.That is, it is expressly endorsing the technique of adding a new item to the front of an alist in order to override any later item that might have the same key.
After finding that the add-to-the-front technique really did work, I
reasoned that if someday Emacs stopped searching alists sequentially,
I would not be in any more trouble than I had been today when they
removed the So I did not take the advice I was given. Instead, I left it pretty much the way it was. I did take the opportunity to clean up the code a bit:
(push '("\\.pl\\'" . cperl-mode) auto-mode-alist) (push '("\\.t\\'" . cperl-mode) auto-mode-alist) (push '("\\.cgi\\'" . cperl-mode) auto-mode-alist) (push '("\\.pm\\'" . cperl-mode) auto-mode-alist) (push '("\\.blog\\'" . text-mode) auto-mode-alist) (push '("\\.sml\\'" . sml-mode) auto-mode-alist)The push function simply appends an element to the front of a
list, modifying the list in-place.But wow, the advice I got was phenomenally bad. It was bad in a really interesting way, too. It reminded me of the advice people get on the #math channel, where some guy comes in with some question about triangles and gets the category-theoretic viewpoint on triangles as natural transformations of something or other. The advice was bad because although it was correct, it was completely devoid of common sense. [ Addendum 20080124: It has been brought to my attention that the Emacs FAQ endorses my solution, which makes the category-theoretic advice proposed by the #emacs blockheads even less defensible. ]
[ Addendum 20080201: Steve Vinoski
suggests replacing the
Smallest state capitals
At the other end of the scale, of course, we have state capitals like
Boston, Denver, Atlanta, and Honolulu that Well, James, it only took me thirty years, but here it is. I tried to resolve the question manually a few weeks ago, by browsing Wikipedia for the populations of likely candidates. Today I took a more methodical approach, downloading the U.S. Census Bureau's July 2006 estimates for populations of metropolitan areas, and writing a couple of little programs to grovel the data. I had to augment the Census Bureau's data with two items: Annapolis, MD, and Montpelier, VT are not large enough to be included in the metropolitan area data file. I used U.S. Census 2006 estimates for these cities as well. I discarded one conurbation: the Census Bureau includes a "Metropolitan Division" in New Hampshire that consists of Rockingham and Strafford counties; this was the most populous identified area in New Hampshire. It didn't seem entirely germane to the question, so I took it out. On the other hand, including it doesn't change the results much: its population is 416,000, compared with Manchester-Nashua's 402,000. The results follow.
Vermont is an interesting outlier here. It makes fourth place not
because it has a large city, but because its capital, Montpelier, is
so The x-axis is the
population of the state capital; the
y-axis is the quotient. (Both axes are log scale.) Vermont is
the leftmost point, near the top. The large collection of points on
the x-axis are of course the nineteen states for which the
capital and largest city coincide.[ Addendum 20080129: Some remarks about the format of the Census Bureau's data file. ] [ Addendum 20090217: A comparison of the relative sizes of each state's largest and second-largest cities. ]
Utterly Useless Book Reviews (#1 in a series?)
Okay, here's the explanation. Robert Graves was a novelist and a poet. (He himself said he was a poet who wrote novels so that he could earn enough money to write poetry.) I, Claudius is his best-known work. It is a history of the Roman emperors from the end of the reign of Julius Caesar up to the coronation of Claudius, told from the point of view of Claudius, who, though most of the book, is viewed by most of the other characters as harmless and inept, perhaps mentally deficient, or perhaps merely a doofus. It is this inept doofosity that explains his survival and eventual ascension to the Imperial throne at a time when everyone else in line for it was being exiled, burnt, poisoned, or disemboweled. The book is still in print, and in the 1970s, the BBC turned it into an extremely successful TV miniseries starring Derek Jacobi (as Claudius, obviously) and a lot of other actors who subsequently became people you have heard of. (Patrick Stewart! With hair!) Graves was a classical scholar, and based his novel on the historical accounts available, principally The Twelve Caesars of Suetonius. Suetonius wrote his history after all the people involved were dead, and his book reads like a collection of anecdotes placed in approximately chronological order. Suetonius seems to have dug up and recorded as fact every scurrilous rumor he could find. Some of the rumors are contradictory, and some merely implausible. When Graves turned The Twelve Caesars into I, Claudius, he resolved this mass of unprocessed material into a coherent product. The puzzling trivialities are explained. The contradictions are cleared up. Sometimes the scurrilous rumors are explained as scurrilous rumors; sometimes Claudius explains the grain of truth that lies at their center. Other times the true story, as related by Claudius, is even worse than the watered-down version that came to Suetonius's ears. Suetonius mentions that, as emperor, Claudius tried to introduce three new letters into the alphabet. Huh? In Graves' novel, this is foreshadowed early, and when it finally happens, it makes sense. In King Jesus, then, Graves has done for the Bible what he did for Suetonius in I, Claudius. He takes a mass of material, much of it misreported, or partly-forgotten stories written down a generation later, and reconstructs a plausible history from which that mass of material could have developed. The miracles are explained, without requiring anything supernatural or magical, but, at the same time, without becoming any less miraculous. There is a story that Borges tells about the miracles performed by the Buddha, who generally eschewed miracles as being too showy. But Borges tells the story that one day the Buddha had to cross a desert, and seven different gods each gave him a parasol to shade his head. The Buddha did not want to offend any of the gods, so he split himself into seven Buddhas, and each one crossed the desert using a different parasol. He performed a miracle of politeness. (The trouble with Borges's stories is that you never know which ones he read in some obscure 17th-century book, and which ones he made up himself. I spent a whole year thinking how clever Borges had been to have invented the novelist Adolfo Bioy Casares, with his alphabetical initials, and then one day I was in the bookstore and came upon the Adolfo Bioy Casares section. Oops.)
Anyway, Graves lets Jesus have the miracles, and they
are indeed miraculous, but they are miracles of kindness and insight, not
miracles of stage magic. When Graves explains the miracles, you say
"oh, of course", I recommend it. Check it out. [ Addendum 20080201: James Russell has read both I, Claudius and Twelve Caesars. ]
Help, help!
Przemek Klosowski wrote to offer me physics help, and also to ask about introspection on Perl objects. Specifically, he said that if you called a nonexistent method on a TCL object, the error message would include the names of all the methods that would have worked. He wanted to know if there was a way to get Perl to do something similar. There isn't, precisely, because Perl has only a conventional distinction between methods and subroutines, and you Just Have To Know which is which, and avoid calling the subroutines as methods, because the Perl interpreter has no idea which is which. But it does have enough introspection features that you can get something like what you want. This article will explain how to do that. Here is a trivial program that invokes an undefined method on an object:
use YAML; my $obj = YAML->new; $obj->nosuchmethod;When run, this produces the fatal error:
Can't locate object method "nosuchmethod" via package "YAML" at test.pl line 4.( YAML in this article is just an example; you don't have to
know what it does. In fact, I don't know what it does.)Now consider the following program instead:
```
use YAML;
use Help 'YAML';
my $obj = YAML->new;
$obj->nosuchmethod;
```
Now any failed method calls to YAML objects, or objects of
YAML's subclasses, will produce a more detailed error
message:
Unknown method 'nosuchmethod' called on object of class YAML Perhaps try: Bless Blessed Dump DumpFile Load LoadFile VALUE XXX as_heavy (inherited from Exporter) die (inherited from YAML::Base) dumper_class dumper_object export (inherited from Exporter) export_fail (inherited from Exporter) export_ok_tags (inherited from Exporter) export_tags (inherited from Exporter) export_to_level (inherited from Exporter) field freeze global_object import (inherited from Exporter) init_action_object loader_class loader_object new (inherited from YAML::Base) node_info (inherited from YAML::Base) require_version (inherited from Exporter) thaw warn (inherited from YAML::Base) ynode Aborting at test.pl line 5Some of the methods in this list are bogus. For example, the stuff inherited from Exporter should almost certainly not be
called on a YAML object.
Some of the items may be intended to be called as functions, and not
as methods. Some may be functions imported from some other module. A
common offender here is
Even when the items in the list really are methods, they may be
undocumented, internal-use-only methods, and may disappear in future
versions of the
But even with all these warnings,
The real reason for this article is to present the code for
Here's the code:
package Help; use Carp 'croak'; sub import { my ($selfclass, @classes) = @_; for my $class (@classes) { push @{"$class\::ISA"}, $selfclass; } } sub AUTOLOAD { my ($bottom_class, $method) = $AUTOLOAD =~ /(.*)::(.*)/; my %known_method; my @classes = ($bottom_class); while (@classes) { my $class = shift @classes; next if $class eq __PACKAGE__; unshift @classes, @{"$class\::ISA"}; for my $name (keys %{"$class\::"}) { next unless defined &{"$class\::$name"}; $known_method{$name} ||= $class; } } warn "Unknown method '$method' called on object of class $bottom_class\n"; warn "Perhaps try:\n"; for my $name (sort keys %known_method) { warn " $name " . ($known_method{$name} eq $bottom_class ? "" : "(inherited from $known_method{$name})") . "\n"; } croak "Aborting"; } sub help { $AUTOLOAD = ref($_[0]) . '::(none)'; goto &AUTOLOAD; } sub DESTROY {} 1;
When any part of the program invokes |