The Universe of Discourse
           
Fri, 29 Feb 2008

Happy Leap Day! Persian edition
Roland Young has brought to my attention that the Persian calendar uses a hybrid 7/29 and 8/33 system. I was going to post this as an addendum to today's Leap Day article, but it got too long.

If I understand the rules correctly, to determine if a Persian year is a leap year, one applies the following algorithm to the Persian year number y. (Note that the current Persian year is not 2008, but 1386. Persian year 1387 will begin on the vernal equinox.) I will write a % b to denote the remainder when a is divided by b. Then:

  1. Let a = (y + 2345) % 2820.
  2. If a is 2819, y is a leap year. Otherwise,
  3. Let b = a % 128.
  4. If b < 29, let c = b. Otherwise, let c = (b - 29) % 33.
  5. If c = 0, y is not a leap year. Otherwise,
  6. If c is a multiple of 4, y is a leap year. Otherwise,
  7. y is not a leap year.
(Perl source code is available.)

This produces 683 leap years out of every 2820, which means that the average calendar year is 365.24219858 days.

How does this compare with the Dominus calendar? It is indeed more accurate, but I consider 683/2820 to be an unnecessarily precise representation of the vernal equinox year, especially inasmuch as the length of the year is changing. And the rule, as you see, is horrendous, requiring either a 2,820-entry lookup table or complicated logic.

Moreover, the Persian and Gregorian calendar are out of sync at present. Persian year 1387, which begins next month on the vernal equinox, is a leap year. But the intercalation will not take place until the last day of the year, around 21 March 2009. The two calendars will not sync up until the year 2092/1470, and then will be confounded only eight years later by the Gregorian 100-year exception. After that they will agree until 2124/1502. Clearly, even if it were advisable to switch to the Persian calendar, the time is not yet right.

I found this Frequently Asked Questions About Calendars page extremely helpful in preparing this article. The Wikipedia article was also useful. Thanks again to Roland Young for bringing this matter to my attention.


[Other articles in category /calendar] permanent link

Note on point-free programming style
This old comp.lang.functional article by Albert Y. C. Lai, makes the point that Unix shell pipeline programming is done in an essentially "point-free" style, using the shell example:

    grep '^X-Spam-Level' | sort | uniq | wc -l
and the analogous Haskell code:

    length . nub . sort . filter (isPrefixOf "X-Spam-Level")
Neither one explicitly mentions its argument, which is why this is "point-free". In "point-free" programming, instead of defining a function in terms of its effect on its arguments, one defines it by composing the component functions themselves, directly, with higher-order operators. For example, instead of:

  foo x y = 2 * x + y
one has, in point-free style:

  foo = (+) . (2 *)
where (2 *) is the function that doubles its argument, and (+) is the (curried) addition function. The two definitions of foo are entirely equivalent.

As the two examples should make clear, point-free style is sometimes natural, and sometimes not, and the example chosen by M. Lai was carefully selected to bias the argument in favor of point-free style.

Often, after writing a function in pointful style, I get the computer to convert it automatically to point-free style, just to see what it looks like. This is usually educational, and sometimes I use the computed point-free definition instead. As I get better at understanding point-free programming style in Haskell, I am more and more likely to write certain functions point-free in the first place. For example, I recently wrote:

        soln = int 1 (srt (add one (neg (sqr soln))))
and then scratched my head, erased it, and replaced it with the equivalent:

        soln = int 1 ((srt . (add one) . neg . sqr) soln)
I could have factored out the int 1 too:
        soln = (int 1 . srt . add one . neg . sqr) soln
I could even have removed soln from the right-hand side:

        soln = fix (int 1 . srt . add one . neg . sqr)
but I am not yet a perfect sage.

Sometimes I opt for an intermediate form, one in which some of the arguments are explicit and some are implicit. For example, as an exercise I wrote a function numOccurrences which takes a value and a list and counts the number of times the value occurs in the list. A straightforward and conventional implementation is:

        numOccurrences x []     = 0
        numOccurrences x (y:ys) = 
                if (x == y) then 1 + rest
                else                 rest
            where rest = numOccurrences x ys
but the partially point-free version I wrote was much better:

        numOccurrences x = length . filter (== x)
Once you see this, it's easy to go back to a fully pointful version:

        numOccurrences x y = length (filter (== x) y)
Or you can go the other way, to a point-free version:

        numOccurrences = (length .) . filter . (==)
which I find confusing.

Anyway, the point of this note is not to argue that the point-free style is better or worse than the pointful style. Sometimes I use the one, and sometimes the other. I just want to point out that the argument made by M. Lai is deceptive, because of the choice of examples. As an equally biased counterexample, consider:

        bar x = x*x + 2*x + 1
which the automatic converter informs me can be written in point-free style as:

        bar = (1 +) . ap ((+) . join (*)) (2 *)
Perusal of this example will reveal much to the attentive reader, including the definitions of join and ap. But I don't think many people would argue that it is an improvement on the original. (Maybe I'm wrong, and people would argue that it was an improvement. I won't know for sure until I have more experience.)

For some sort of balance, here is another example where I think the point-free version is at least as good as the pointful version: a recent comment on Reddit suggested a >>> operator that composes functions just like the . operator, but in the other order, so that:

        f >>> g = g . f
or, if you prefer:

        >>> f g x = g(f(x))
The point-free definition of >>> is:

        (>>>) = flip (.)
where the flip operator takes a function of two arguments and makes a new function that does the same thing, but with the arguments in the opposite order. Whatever your feelings about point-free style, it is undeniable that the point-free definition makes perfectly clear that >>> is nothing but . with its arguments in reverse order.


[Other articles in category /prog/haskell] permanent link

Happy Leap Day!
I have an instructive followup to yesterday's article all ready to go, analyzing a technique for finding rational roots of polynomials that I found in the First Edition of the Encyclopædia Britannica. A typically Universe-of-Discourse kind of article. But I'm postponing it to next month so that I can bring you this timely update.

Everyone knows that our calendar periodically contains an extra day, known to calendar buffs as an "intercalary day", to help make it line up with the seasons, and that this intercalary day is inserted at the end of February. But, depending on how you interpret it, this isn't so. The extra day is actually inserted between February 23 and February 24, and the rest of February has to move down to make room.

I will explain. In Rome, 23 February was a holiday called Terminalia, sacred to Terminus, the god of boundary markers. Under the calendars of the Roman Republic, used up until 46 BCE, an intercalary month, Mercedonius, was inserted into the calendar from time to time. In these years, February was cut down to 23 days (and good riddance; nobody likes February anyway) and Mercedonius was inserted at the end.

When Julius Caesar reformed the calendar in 46, he specified that there would be a single intercalary day every four years much as we have today. As in the old calendar, the intercalary day was inserted after Terminalia, although February was no longer truncated.

So the extra day is actually 24 February, not 29 February. Or not. Depends on how you look at it.

Scheduling intercalary days is an interesting matter. The essential problem is that the tropical year, which is the length of time from one vernal equinox to the next, is not an exact multiple of one day. Rather, it is about 365¼ days. So the vernal equinox moves relative to the calendar date unless you do something to fix it. If the tropical year were exactly 365¼ days long, then four tropical years would be exactly 1461 days long, and it would suffice to make four calendar years 1461 days long, to match. This can be accomplished by extending the 365-day calendar year with one intercalary day every four years. This is the Julian system.

Unfortunately, the tropical year is not exactly 365¼ days long. It is closer to 365.24219 days long. So how many intercalary days are needed?

It suffices to make 100,000 calendar years total exactly 36,524,219 days, which can be accomplished by adding a day to 24,219 years out of every 100,000. But this requires a table with 100,000 entries, which is too complicated.

We would like to find a system that requires a simpler table, but which is still reasonably accurate. The Julian system requires a table with 4 entries, but gives a calendar year that averages 365.25 days long, which is 0.00781 too many. Since this is about 1/128 day, the Julian calendar "gains a day" every 128 years or so, which means that the vernal equinox slips a day earlier every 128 years, and eventually the daffodils and crocuses are blooming in January.

Not everyone considers this a problem. The Islamic calendar is only 355 days long, and so "loses" 10 days per year, which means that after 18 years the Islamic new year has moved half a year relative to the seasons. The annual Islamic holy month of Ramadan coincided with July-August in 1980 and with January-February in 1997. The Muslims do intercalate, but they do it to keep the months in line with the phases of the moon.

Still, supposing that we do consider this a problem, we would like to find an intercalation scheme that is simple and accurate. This is exactly the problem of finding a simple rational approximation to 0.24219. If p/q is close to 0.24219, then one can introduce p intercalary days every q years, and q is the size of the table required. The Julian calendar takes p/q = 1/4 = 0.25, for an error around 1/128. The Gregorian calendar takes p/q = 97/400 = 0.2425, for an error of around 1/3226. Again, this means that the Gregorian calendar gains a day on the seasons every 3,226 years or so. Can we do better?

Any time the question is "find a simple rational approximation to a number" the answer is likely to involve continued fractions. 365.24219 is equal to:

 $$ 365 + {1\over \displaystyle 4 + {\strut 1\over\displaystyle 7 + {\strut 1\over\displaystyle 1 + {\strut 1\over\displaystyle 3 + {\strut 1\over\displaystyle 24 + {\strut 1\over\displaystyle 6 + \cdots }}}}}}$$

which for obvious reasons, mathematicians abbreviate to [365; 4, 7, 1, 3, 24, 6, 2, 2]. This value is exact. (I had to truncate the display above because of a bug in my TeX formula tool: the full fraction goes off the edge of the A0-size page I use as a rendering area.)

As I have mentioned before, the reason this horrendous expression is interesting is that if you truncate it at various points, the values you get are the "continuants", which are exactly the best possible rational approximations to the original number. For example, if we truncate it to [365], we get 365, which is the best possible integer approximation to 365.24219. If we truncate it to [365; 4], we get 365¼, which is the Julian calendar's approximation.

Truncating at the next place gives us [365; 4, 7], which is 365 + 1/(4 + 1/7) = 356 + 1/(29/7) = 365 + 7/29. In this calendar we would have 7 intercalary days out of 29, for a calendar year of 365.241379 days on average. This calendar loses one day every 1,234 years.

The next convergent is [365; 4, 7, 1] = 8/33, which requires 8 intercalary days every 33 years for an average calendar year of 0.242424 days. This schedule gains only one day in 4,269 years and so is actually more accurate than the Gregorian calendar currently in use, while requiring a table with only 33 entries instead of 400.

The real question, however, is not whether the table can be made smaller but whether the rule can be made simpler. The rule for the Gregorian calendar requires second-order corrections:

  1. If the year is a multiple of 400, it is a leap year; otherwise
  2. If the year is a multiple of 100, it is not a leap year; otherwise
  3. If the year is a multiple of 4, it is a leap year.

And one frequently sees computer programs that omit one or both of the exceptions in the rule.

The 8/33 calendar requires dividing by 33, which is its most serious problem. But it can be phrased like this:

  1. Divide the year by 33. If the result is 0, it is not a leap year. Otherwise,
  2. If the result is divisible by 4, it is a leap year.
The rule is simpler, and the weird exceptions come every 33 years instead of every 100. This means that people are more likely to remember them. If you are a computer programmer implementing calendar arithmetic, and you omit the 400-year exception, it may well happen that nobody else will catch the error, because most of the time there is nobody alive who remembers one. (Right now, many people remember one, because it happened, for the second time ever, only 8 years ago. We live at an unusual moment of history.) But if you are a computer programmer who omits the exception in the 8/33 calendar, someone reviewing your code is likely to speak up: "Hey, isn't there some exception when the result is 0? I think I remember something like that happening in third grade."

Furthermore, the rule as I gave it above has another benefit: it matches the Gregorian calendar this year and will continue to do so for several years. This was more compelling when I first proposed this calendar back in 1998, because it would have made the transition to the new calendar quite smooth. It doesn't matter which calendar you use until 2016, which is a leap year in the Gregorian calendar but not in the 8/33 calendar as described above. I may as well mention that I have modestly named this calendar the Dominus calendar.

But time is running out for the smooth transition. If we want to get the benefits of the Dominus calendar we have to do it soon. Help spread the word!

[ Pre-publication addendum: Wikipedia informs me that it is not correct to use the tropical year, since this is not in fact the time between vernal equinoxes, owing to the effects of precession and nutation. Rather, one should use the so-called vernal equinox year, which is around 365.2422 days long. The continued fraction for 365.2422 is slightly different from that of 356.24219, but its first few convergents are the same, and all the rest of the analysis in the article holds the same for both years. ]

[ Addendum 20080229: The Persian calendar uses a hybrid 7/29 and 8/33 system. Read all about it. ]


[Other articles in category /calendar] permanent link