The Universe of Discourse


Sun, 29 Apr 2007

Your age as a fraction, again
In a recent article, I discussed methods for calculating your age as a fractional year, in the style of (a sophisticated) three-and-a-half-year-old. For example, as of today, Richard M. Stallman is (a sophisticated) 54-and-four-thirty-thirds-year-old; tomorrow he'll be a 54-and-one-eighth-year-old.

I discussed several methods of finding the answer, including a clever but difficult method that involved fiddling with continued fractions, and some dead-simple brute force methods that take nominally longer but are much easier to do.

But a few days ago on IRC, a gentleman named Mauro Persano said he thought I could use the Stern-Brocot tree to solve the problem, and he was absolutely right. Application of a bit of clever theory sweeps away all the difficulties of the continued-fraction approach, leaving behind a solution that is clever and simple and fast.

Here's the essence of it: We consider a list of intervals that covers all the positive rational numbers; initially, the list contains only the interval (0/1, 1/0). At each stage we divide each interval in the list in two, by chopping it at the simplest fraction it contains.

To chop the interval (a/b, c/d), we split it into the two intervals (a/b, (a+c)/(b+d)), ((a+c)/(b+d)), c/d). The fraction (a+c)/(b+d) is called the mediant of a/b and c/d. It's not obvious that the mediant is always the simplest possible fraction in the interval, but it is true.

So we start with the interval (0/1, 1/0), and in the first step we split it at (0+1)/(1+0) = 1/1. It is now two intervals, (0/1, 1/1) and (1/1, 1/0). At the next step, we split these two intervals at 1/2 and 2/1, respectively; the resulting four intervals are (0/1, 1/2), (1/2, 1/1), (1/1, 2/1), and (2/1, 1/0). We split these at 1/3, 2/3, 3/2, and 3/1. The process goes on from there:

0/1                 1/0                
0/1         1/1         1/0        
0/1     1/2     1/1     2/1     1/0    
0/1   1/3   1/2   2/3   1/1   3/2   2/1   3/1   1/0  
0/1 1/4 1/3 2/5 1/2 3/5 2/3 3/4 1/1 4/3 3/2 5/3 2/1 5/2 3/1 4/1 1/0

Or, omitting the repeated items at each step:

0/1                 1/0                
          1/1                  
      1/2           2/1          
    1/3       2/3       3/2       3/1      
  1/4   2/5   3/5   3/4   4/3   5/3   5/2   4/1  

If we disregard the two corners, 0/1 and 1/0, we can see from this diagram that the fractions naturally organize themselves into a tree. If a fraction is introduced at step N, then the interval it splits has exactly one endpoint that was introduced at step N-1, and this is its parent in the tree; conversely, a fraction introduced at step N is the parent of the two step-N+1 fractions that are introduced to split the two intervals of which it is an endpoint.

This process has many important and interesting properties. The splitting process eventually lists every positive rational number exactly once, as a fraction in lowest terms. Every fraction is simpler than all of its descendants in the tree. And, perhaps most important, each time an interval is split, it is divided at the simplest fraction that the interval contains. ("Simplest" just means "has the smallest denominator".)

This means that we can find the simplest fraction in some interval simply by doing binary tree search until we find a fraction in that interval.

For example, Placido Polanco had a .368 batting average last season. What is the smallest number of at-bats he could have had? We are asking here for the denominator of the simplest fraction that lies in the interval [.3675, .3685).

  • We start at the root, which is 1/1. 1 is too big, to we move left down the tree to 1/2.
  • 1/2 = .5000 and is also too big, so we move left down the tree to 1/3.
  • 1/3 = .3333 and is too small, so we move right down the tree to 2/5.
  • 2/5 = .4000 and is too big, so go left to 3/8, which is the mediant of 1/3 and 2/5.
  • 3/8 = .3750, so go left to 4/11, the mediant of 1/3 and 3/8.
  • 4/11 = .3636, so go right to 7/19, the mediant of 3/8 and 4/11.
  • 7/19 = .3684, which is in the interval, so we are done.
If we knew nothing else about Polanco's batting record, we could still conclude that he must have had at least 19 at-bats. (In fact, he had 35 hits in 95 at-bats.)

Calculation of mediants is incredibly simple, even easier than adding fractions. Tree search is simple, just compare and then go left or right. Calculating whether a fraction is in an interval is simple too. Everything is simple simple simple.

Our program wants to find the simplest fraction in some interval, say (L, R). To do this, it keeps track of l and r, initially 0/1 and 1/0, and repeatedly calculates the mediant m of l and r. If the mediant is in the target interval, the function is done. If the mediant is too small, set l = m and continue; if it is too large set r = m and continue:

        # Find and return numerator and denominator of simplest fraction
        # in the range [$Ln/$Ld, $Rn/$Rd)
        #
        sub find_simplest_in {
            my ($Ln, $Ld, $Rn, $Rd) = @_;
            my ($ln, $ld) = (0, 1);
            my ($rn, $rd) = (1, 0);
            while (1) {
                my ($mn, $md) = ($ln + $rn, $ld + $rd);
        #	print "  $ln/$ld  $mn/$md  $rn/$rd\n";
                if (isin($Ln, $Ld, $mn, $md, $Rn, $Rd)) {
                    return ($mn, $md);
                } elsif (isless($mn, $md, $Ln, $Ld)) {
                    ($ln, $ld) = ($mn, $md);
                } elsif (islessequal($Rn, $Rd, $mn, $md)) {
                    ($rn, $rd) = ($mn, $md);
                } else {
                    die;
                }
            }
        }
(In this program, rn and rd are the numerator and the denominator of r.)

The isin, isless, and islessequal functions are simple utilities for comparing fractions.

        # Return true iff $an/$ad < $bn/$bd
        sub isless {
            my ($an, $ad, $bn, $bd) = @_;
            $an * $bd < $bn * $ad;
        }

        # Return true iff $an/$ad <= $bn/$bd
        sub islessequal {
            my ($an, $ad, $bn, $bd) = @_;
            $an * $bd <= $bn * $ad;
        }

        # Return true iff $bn/$bd is in [$an/$ad, $cn/$cd).
        sub isin {
            my ($an, $ad, $bn, $bd, $cn, $cd) = @_;
            islessequal($an, $ad, $bn, $bd) and isless($bn, $bd, $cn, $cd);
        }
The asymmetry between isless and islessequal is because I want to deal with half-open intervals.

Just add a trivial scaffold to run the main function and we are done:

        #!/usr/bin/perl

        my $D = shift || 10;
        for my $N (0 .. $D-1) {
            my $Np1 = $N+1;
            my ($mn, $md) = find_simplest_in($N, $D, $Np1, $D);
            print "$N/$D - $Np1/$D : $mn/$md\n";
        }
Given the argument 10, the program produces this output:

        0/10 - 1/10 : 1/11
        1/10 - 2/10 : 1/6
        2/10 - 3/10 : 1/4
        3/10 - 4/10 : 1/3
        4/10 - 5/10 : 2/5
        5/10 - 6/10 : 1/2
        6/10 - 7/10 : 2/3
        7/10 - 8/10 : 3/4
        8/10 - 9/10 : 4/5
        9/10 - 10/10 : 9/10
This says that the simplest fraction in the range [0/10, 1/10) is 1/11; the simplest fraction in the range [3/10, 4/10) is 1/3, and so forth. The simplest fractions that do not appear are 1/5, which is beaten out by the simpler 1/4 in the [2/10, 3/10) range, and 3/5, which is beaten out by 2/3 in the [6/10, 7/10) range.

Unlike the programs from the previous article, this program is really fast, even in principle, even for very large arguments. The code is brief and simple. But we had to deploy some rather sophisticated number theory to get it. It's a nice reminder that the sawed-off shotgun doesn't always win.

This is article #200 on my blog. Thanks for reading.


[Other articles in category /math] permanent link