The Universe of Discourse

Tue, 21 May 2019

Super-obscure bug in my code

Say $dt is a Perl DateTime object.

You are allowed to say

  $dt->add( days => 2 )
  $dt->subtract( days => 2 )

Today Jeff Boes pointed out that I had written a program that used

  $dt->add({ days => 2 })

which as far as I can tell is not documented to work. But it did work. (I wrote it in 2016 and would surely have noticed by now if it hadn't.) Jeff told me he noticed when he copied my code and got a warning. When I tried it, no warning.

It turns out that

  $dt->add({ days => 2 })
  $dt->subtract({ days => 2 })

both work, except that:

  1. The subtract call produces a warning (add doesn't! and Jeff had changed my add to subtract)

  2. If you included an end_of_month => $mode parameter in the arguments to subtract, it would get lost.

Also, the working-ness of what I wrote is a lucky fluke. It is undocumented (I think) and works only because of a quirk of the implementation. ->add passes its arguments to DateTime::Duration->new, which passes them to Params::Validate::validate. The latter is documented to accept either form. But its use by DateTime::Duration is an undocumented implementation detail.

->subtract works the same way, except that it does a little bit of preprocessing on the arguments before calling DateTime::Duration->new. That's where the warning comes from, and why end_of_month won't work with the hashref form.

(All this is as of version 1.27. The current version is 1.51. Matthew Horsfall points out that 1.51 does not raise a warning, because of a different change to the same interface.)

This computer stuff is amazingly complicated. I don't know how anyone gets anything done.


Mon, 20 May 2019

Alphabetical order in Korean

Alphabetical order in Korean has an interesting twist I haven't seen in any other language.

(Perhaps I should mention up front that Korean does not denote words with individual symbols the way Chinese does. It has a 24-letter alphabet, invented in the 15th century.)

Consider the Korean word “문어”, which means “octopus”. This is made up of five letters ㅁㅜㄴㅇㅓ. The ㅁㅜㄴ are respectively equivalent to English ‘m’, ‘oo’ (as in ‘moon’), and ‘n’. The ㅇ is silent, just like ‘k’ in “knit”. The ㅓ is a vowel we don't have in English, partway between “saw” and “bud”. Confusingly, it is usually rendered in Latin script as ‘eo’. (It is the first vowel in “Seoul”, for example.) So “문어” is transliterated to Latin script as “muneo”, or “munŏ”, and approximately pronounced “moon-aw”.

But as you see, it's not written as “ㅁㅜㄴㅇㅓ” but as “문어”. The letters are grouped into syllables of two or three letters each. (Or, more rarely, four or even five.)

Now consider the word “무해” (“harmless”). This word is made of the four letters ㅁㅜㅎㅐ. The first two, as before, are ‘m’, ‘oo’. The ㅎ is ‘h’ and the ‘ㅐ’ is a vowel that is something like the vowel in “air”, usually rendered in Latin script as ‘ae’. So it is written “muhae” and pronounced something like “moo-heh”.

ㅎ is the last letter of the alphabet. Because ㅎ follows ㄴ, you might think that 무해 would follow 문어. But it does not. In Korean, alphabetization is also done at the syllable level. The syllable 무 comes before 문, because it is a proper prefix, so 무해 comes before 문어. If the syllable break in 문어 were different, causing it to be spelled 무너, it would indeed come before 무해. But it isn't, so it doesn't. (“무너” does not seem to be an actual word, but it appears as a constituent in words like 무너지다 (“collapse”) and 무너뜨리다 (“demolish”) which do come before 무해 in the dictionary.)
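Here is a minimal Python sketch of the two orderings. It assumes the words are written with precomposed Unicode Hangul syllables (U+AC00 through U+D7A3) and uses the standard arithmetic layout of that block to split each syllable into its letters. The function names `split_syllable`, `letters`, and `syllable_key` are made up for this illustration, and the "flat" ordering simply compares letters by code point, which is only a rough stand-in for a naive letter-by-letter sort.

```python
# Letters in the order used by the Unicode Hangul syllable block.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"            # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"         # 21 vowels
TAILS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
         "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
         "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]   # 27 finals, plus "none"

def split_syllable(ch):
    """Split one precomposed syllable into (lead, vowel, tail) indices."""
    index = ord(ch) - 0xAC00
    if not 0 <= index < 19 * 21 * 28:
        raise ValueError(f"not a precomposed Hangul syllable: {ch!r}")
    lead, rest = divmod(index, 21 * 28)
    vowel, tail = divmod(rest, 28)
    return lead, vowel, tail

def letters(word):
    """Flatten a word into its individual letters, ignoring syllable breaks."""
    return "".join(LEADS[l] + VOWELS[v] + TAILS[t]
                   for l, v, t in map(split_syllable, word))

def syllable_key(word):
    """Sort key that compares syllable by syllable; 'no final' sorts first."""
    return [split_syllable(ch) for ch in word]

words = ["문어", "무해", "무너지다"]
print([letters(w) for w in words])
print(sorted(words, key=letters))       # flat, letter-by-letter order
print(sorted(words, key=syllable_key))  # syllable-level order
```

In the syllable-level key, 무 has no final consonant, so its tail index is 0 and it sorts ahead of 문 regardless of what follows. That is the "proper prefix" rule in action.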

As far as I know, there is nothing in Korean analogous to the English alphabet song.

Or to alphabet soup! Koreans love soup! And they love the alphabet, so why no hangeul-tang? There is a hundred dollar bill lying on the sidewalk here, waiting to be picked up.

[ Previously, but just barely related: Medieval Chinese typesetting technique. ]


Fri, 17 May 2019

What's the difference between 0/0 and 1/0?

Last year a new Math Stack Exchange user asked “What's the difference between !!\frac00!! and !!\frac10!!?”

I wrote an answer I thought was pretty good, but the question was downvoted and deleted as “not about mathematics”. This is bullshit, but what can I do?

I can repatriate my answer here, anyway.

This long answer has two parts. The first one is about the arithmetic, and is fairly simple, and is not very different from the other answers here: neither !!\frac10!! nor !!\frac00!! has any clear meaning. But your intuition is a good one: if one looks at the situation more carefully, !!\frac10!! and !!\frac00!! behave rather differently, and there is more to the story than can be understood just from the arithmetic part. The second half of my answer tries to go into these developments.

The notation !!\frac ab!! has a specific meaning:

The number !!x!! for which $$x\cdot b=a.$$

Usually this is simple enough. There is exactly one number !!x!! for which !!x\cdot 7=21!!, namely !!3!!, so !!\frac{21}7=3!!. There is exactly one number !!x!! for which !!x\cdot 4=7!!, namely !!\frac74!!, so !!\frac74\cdot4=7!!.

But when !!b=0!! we can't keep the promise that is implied by the word "the" in "The number !!x!! for which...". Let's see what goes wrong. When !!b=0!! the definition says:

The number !!x!! for which $$x\cdot 0=a.$$

When !!a\ne 0!! this goes severely wrong. The left-hand side is zero and the right-hand side is not, so there is no number !!x!! that satisfies the condition. Suppose !!x!! is the ugliest gorilla on the dairy farm. But the farm has no gorillas, only cows. Any further questions you have about !!x!! are pointless: is !!x!! a male or female gorilla? Is its fur black or dark gray? Does !!x!! prefer bananas or melons? There is no such !!x!!, so the questions are unanswerable.

When !!a!! and !!b!! are both zero, something different goes wrong:

The number !!x!! for which $$x\cdot 0=0.$$

It still doesn't work to speak of "The number !!x!! for which..." because any !!x!! will work. Now it's like saying that !!x!! is ‘the’ cow from the dairy farm. But there are many cows, so questions about !!x!! are still pointless, although in a different way: Does !!x!! have spots? I dunno man, what is !!x!!?

Asking about this !!x!!, as an individual number, never makes sense, for one reason or the other, either because there is no such !!x!! at all (!!\frac a0!! when !!a\ne 0!!) or because the description is not specific enough to tell you anything (!!\frac 00!!).
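The two failure modes can be mimicked with a brute-force search. This is only a rough sketch: the helper name `solutions` and the small range of candidate integers are assumptions made for the example, and a finite search is of course no substitute for the argument above.

```python
# Look for "the number x for which x * b == a" among a small,
# arbitrarily chosen sample of integers.
CANDIDATES = range(-5, 6)

def solutions(a, b):
    """Return every candidate x that satisfies x * b == a."""
    return [x for x in CANDIDATES if x * b == a]

print(solutions(21, 7))  # exactly one candidate works
print(solutions(1, 0))   # no candidate works: the gorilla
print(solutions(0, 0))   # every candidate works: 'the' cow
```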

If you are trying to understand this as a matter of simple arithmetic, using analogies about putting cookies into boxes, this is the best you can do. That is a blunt instrument, and for a finer understanding you need to use more delicate tools. In some contexts, the two situations (!!\frac00!! and !!\frac10!!) are distinguishable, but you need to be more careful.

Suppose !!f!! and !!g!! are some functions of !!x!!, each with definite values for all numbers !!x!!, and in particular !!g(0) = 0!!. We can consider the quantity $$q(x) = \frac{f(x)}{g(x)}$$ and ask what happens to !!q(x)!! when !!x!! gets very close to !!0!!. The quantity !!q(0)!! itself is undefined, because at !!x=0!! the denominator is !!g(0)=0!!. But we can still ask what happens to !!q!! when !!x!! gets close to zero, but before it gets all the way there. It's possible that as !!x!! gets closer and closer to zero, !!q(x)!! might get closer and closer to some particular number, say !!Q!!; we can ask if there is such a number !!Q!!, and if so what it is.

It turns out we can distinguish quite different situations depending on whether the numerator !!f(0)!! is zero or nonzero. When !!f(0)\ne 0!!, we can state decisively that there is no such !!Q!!. For if there were, it would have to satisfy !!Q\cdot 0=f(0)!! which is impossible; !!Q!! would have to be a gorilla on the dairy farm. There are a number of different ways that !!q(x)!! can behave in such cases, when its denominator approaches zero and its numerator does not, but all of the possible behaviors are bad: !!q(x)!! can increase or decrease without bound as !!x!! gets close to zero; or it can do both depending on whether we approach zero from the left or the right; or it can oscillate more and more wildly, but in no case does it do anything like gently and politely approaching a single number !!Q!!.

But if !!f(0) = 0!!, the answer is more complicated, because !!Q!! (if it exists at all) would only need to satisfy !!Q\cdot 0=0!!, which is easy. So there might actually be a !!Q!! that works; it depends on further details of !!f!! and !!g!!, and sometimes there is and sometimes there isn't. For example, when !!f(x) = x^2+2x!! and !!g(x) = x!! then !!q(x) = \frac{x^2+2x}{x}!!. This is still undefined at !!x=0!! but at any other value of !!x!! it is equal to !!x+2!!, and as !!x!! approaches zero, !!q(x)!! slides smoothly in toward !!2!! along the straight line !!x+2!!. When !!x!! is close to (but not equal to) zero, !!q(x)!! is close to (but not equal to) !!2!!; for example when !!x=0.001!! then !!q(x) = \frac{0.002001}{0.001} = 2.001!!, and as !!x!! gets closer to zero !!q(x)!! gets even closer to !!2!!. So the number !!Q!! we were asking about does exist, and is in fact equal to !!2!!. On the other hand if !!f(x) = x!! and !!g(x) = x^2!! then there is still no such !!Q!!: away from zero, !!q(x) = \frac{x}{x^2} = \frac1x!!, which increases or decreases without bound as !!x!! approaches zero.
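To watch both behaviors numerically, here is a short Python sketch. It uses exact rational arithmetic from the standard `fractions` module to avoid rounding noise; the function names `q_removable` and `q_blowup` are invented for this illustration.

```python
from fractions import Fraction

def q_removable(x):
    # f(x) = x^2 + 2x and g(x) = x; both are zero at x = 0
    return (x * x + 2 * x) / x

def q_blowup(x):
    # f(x) = x and g(x) = x^2; both are zero at x = 0
    return x / (x * x)

# Approach zero through x = 1/10, 1/100, ... and, for the second
# quotient, from the negative side as well.
for k in range(1, 6):
    x = Fraction(1, 10 ** k)
    print(x, q_removable(x), q_blowup(x), q_blowup(-x))
```

The second column settles toward 2, while the third and fourth run off in opposite directions, so no single !!Q!! can serve for the second quotient.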

The details of how this all works, when there is a !!Q!! and when there isn't, and how to find it, are very interesting, and are the basic idea that underpins all of calculus. The calculus part was invented first, but it bothered everyone because although it seemed to work, it depended on an incoherent idea about how division by zero worked. Trying to frame it as a simple matter of putting cookies into boxes was no longer good enough. Getting it properly straightened out was a long process that took around 150 years, but we did eventually get there and now I think we understand the difference between !!\frac10!! and !!\frac00!! pretty well. But to really understand the difference you probably need to use the calculus approach, which may be more delicate than what you are used to. But if you are interested in this question, and you want the full answer, that is definitely the way to go.


Thu, 02 May 2019

Mathematical jargon failures

A while back I wrote an article about confusing and misleading technical jargon, drawing special attention to botanists’ indefensible misuse of the word “berry” and then to the word “henge”, which archaeologists use to describe a class of Stonehenge-like structures of which Stonehenge itself is not a member.

I included a discussion of mathematical jargon and generally gave it a good grade, saying:

Nobody hearing the term “cobordism” … will think for an instant that they have any idea what it means … they will be perfectly correct.

But conversely:

The non-mathematician's idea of “line”, “ball”, and “cube” is not in any way inconsistent with what the mathematician has in mind …

Today I find myself wondering if I gave mathematics too much credit. Some mathematical jargon is pretty bad. Often brought up as an example are the topological notions of “open” and “closed” sets. It sounds as if they should be exclusive and exhaustive — surely a set that is open is not closed, and vice versa? — but no, there are sets that are neither open nor closed and other sets that are both. Really the problem here is entirely with “open”. The use of “closed” is completely in line with other mathematical uses of “closed” and “closure”. A “closed” object is one that is a fixed point of a closure operator. Topological closure is an example of a closure operator, and topologically closed sets are its fixed points.

(Last month someone asked on Stack Exchange if there was a connection between topological closure and binary operation closure and I was astounded to see a consensus in the comments that there was no relation between them. But given a binary operation !!\oplus!!, we can define an associated closure operator !!\text{cl}_\oplus!! as follows: !!\text{cl}_\oplus(S)!! is the smallest set !!\bar S!! that contains !!S!! and for which !!x,y\in\bar S!! implies !!x\oplus y\in \bar S!!. Then the binary operation !!\oplus!! is said to be “closed on the set !!S!!” precisely if !!S!! is closed with respect to !!\text{cl}_\oplus!!; that is if !!\text{cl}_\oplus(S) = S!!. But I digress.)
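For a finite example, the closure operator !!\text{cl}_\oplus!! can be computed by repeatedly adding results until nothing new appears. The following Python sketch makes assumptions that are not part of the definition above: the names `closure` and `is_closed` are invented, the operation chosen is addition modulo 12, and the loop only terminates when the closure is finite.

```python
def closure(seed, op):
    """Smallest superset of `seed` closed under the binary operation `op`.

    Only terminates if that closure is finite.
    """
    closed = set(seed)
    while True:
        # Everything reachable in one more application of op
        new = {op(x, y) for x in closed for y in closed} - closed
        if not new:
            return closed
        closed |= new

def is_closed(s, op):
    """The operation is 'closed on s' exactly when s is a fixed point."""
    return closure(s, op) == set(s)

def add_mod_12(x, y):
    return (x + y) % 12

print(closure({4}, add_mod_12))
print(is_closed({4}, add_mod_12), is_closed({0, 4, 8}, add_mod_12))
```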

Another example of poor nomenclature is “even” and “odd” functions. This is another case where it sounds like the terms ought to form a partition, as they do in the integers, but that is wrong; most functions are neither even nor odd, and there is one function that is both. I think what happened here is that first an “even” polynomial was defined to be a polynomial whose terms all have even exponents (such as !!x^4 - 10x^2 + 1!!) and similarly an “odd” polynomial. This already wasn't great, because most polynomials are neither even nor odd. But it was not too terrible. And at least the meaning is simple and easy to remember. (Also you might like the product of an even and an odd polynomial to be even, as it is for even and odd integers, but it isn't, it's always odd. As far as even-and-oddness is concerned the multiplication of the polynomials is analogous to addition of integers, and to get anything like multiplication you have to compose the polynomials instead.)
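The parity claims about products and compositions are easy to check mechanically. In this Python sketch a polynomial is a dict mapping exponents to nonzero integer coefficients; that representation and the helper names (`add`, `mul`, `compose`, `is_even`, `is_odd`) are assumptions made for the example.

```python
def add(p, q):
    """Sum of two polynomials, dropping terms that cancel."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {e: c for e, c in out.items() if c != 0}

def compose(p, q):
    """The polynomial p(q(x))."""
    result = {}
    for e, c in p.items():
        term = {0: c}
        for _ in range(e):          # term = c * q(x)**e
            term = mul(term, q)
        result = add(result, term)
    return result

def is_even(p):
    return all(e % 2 == 0 for e in p)

def is_odd(p):
    return all(e % 2 == 1 for e in p)

even = {4: 1, 2: -10, 0: 1}   # x^4 - 10x^2 + 1
odd = {3: 1, 1: 1}            # x^3 + x

print(mul(even, odd), is_odd(mul(even, odd)))   # like even + odd = odd
print(is_even(mul(odd, odd)))                   # like odd + odd = even
print(is_odd(compose(odd, odd)))                # like odd * odd = odd
print(is_even(compose(even, odd)), is_even(compose(odd, even)))
print(is_even({}), is_odd({}))                  # the zero polynomial is both
```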

And once that step had been taken it was natural to extend the idea from polynomials to functions generally: odd polynomials have the property that !!p(-x) = -p(x)!!, so let's say that an odd function is one with that property. If an odd function is analytic, you can expand it as a Taylor series and the series will have only odd-degree terms even though it isn't a polynomial.

There were two parts to that journey, and each one made some sense by itself, but by the time we got to the end it wasn't so easy to see where we started from. Unfortunate.

I tried a web search for bad mathematics terminology and the top hit was this old blog article by my old friend Walt. (Not you, Walt, another Walt.) Walt suggests that

the worst terminology in all of mathematics may be that of !!G_\delta!! and !!F_\sigma!! sets…

I can certainly get behind that nomination. I have always hated those terms. Not only do they partake of the dubious open-closed terminology I complained of earlier (you'll see why in a moment), but all four letters are abbreviations for words in other languages, and not the same language. A !!G_\delta!! set is one that is a countable intersection of open sets. The !!G!! is short for Gebiet, which is German for an open neighborhood, and the !!\delta!! is for Durchschnitt, which is German for set intersection. And on the other side of the Ruhr Valley, an !!F_\sigma!! set, which is a countable union of closed sets, is from French fermé (“closed”) and !!\sigma!! for somme (set union). And the terms themselves are completely opaque if you don't keep track of the ingredients of this unwholesome German-French-Greek stew.

This put me in mind of a similarly obscure pair that I always mix up, the type I and type II errors. One of them is when you fail to ignore something insignificant, and the other is when you fail to notice something significant, but I don't remember which is which and I doubt I ever will.

But the one I was thinking about today that kicked all this off is, I think, worse than any of these. It's really shameful, worthy to rank with cucumbers being berries and with Stonehenge not being a henge.

These are all examples of elliptic curves:

[ Figure: graphs of several elliptic curves ]

These are not:

[ Figure: several ellipses ]

That's right, ellipses are not elliptic curves, and elliptic curves are not elliptical. I don't know who was responsible for this idiocy, but if I ever meet them I'm going to kick them in the ass.

[ Addendum 20200510: Several people have earnestly explained to me how this terminological disaster came about. Please be assured that I am well aware of the history here. The situation is similar to the one that gave us “even” and “odd” functions: a long chain of steps each of which made some sense individually, but whose concatenation ended in a completely different place. This MathOverflow post has a good summary. ]

[ Addendum 20200510: Mark Badros has solved the “Type I / II” problem for me. They point out that in the story of the Boy Who Cried Wolf, there are two episodes. In the first episode, the boy and the villagers commit a Type I error by reacting to the presence of a wolf when there is none. In the second episode, they commit a Type II error by failing to react to the actual wolf. Thank you! ]
