The Universe of Discourse


Sat, 17 Jun 2006

Excellent numbers
Another programmer presented the Philadelphia Perl Mongers with this problem, apropos of something else; he had gotten the problem from a friend of his. The problem is to find "excellent numbers". A number n is excellent if it has an even number of digits, and if when you chop it into a front half a and a back half b, you have b2 - a2 = n. For example, 48 is excellent, because 82 - 42 = 48, and 3468 is excellent, because 682 - 342 = 4624 - 1156 = 3468.

This other programmer had written a program to do brute-force search, which wasn't very successful, because there aren't very many excellent numbers. He was presenting it to the Perl Mongers because in the course of trying to speed up the program, he had learned something interesting: using the Memoize module to memoize the squaring function makes a program slower, not faster. But that is another matter for another article.

A slightly less brutal brute-force search locates excellent numbers quite easily. Suppose the number n has 2k digits, and that it is the concatenation of a and b, each with k digits. We want b2 - a2 = n. Since n is the concatenation of a and b, it is equal to a·10k + b. So we want:

a·10k + b = b2 - a2

Or equivalently:

a·(10k + a) = b2 - b

Let's say that a number of the form b2 - b is "rectangular". So our problem is simply to find k-digit numbers a for which a·(10k + a) is rectangular.

That is, instead of brute-forcing n, which has 2k digits, we need only do a brute-force search on a, which has only k digits. This is a lot faster. To find all 10-digit excellent numbers, we are now searching only 90,000 values of a instead of 9,000,000,000 values of n. This is feasible, whereas the larger search isn't.

All we need is some way to determine whether a given number q is rectangular, and that is not very hard to do. One way is simply to precompute a table of all rectangular numbers up to a certain size, and look up q in the table.

But another way is to notice that it is sufficient to be able to extract the "rectangular root" of q. That's the number b such that b2 - b = q. If we can make a good guess about what b might be, then we can calculate b2-b and see if we get back q. This is analogous to checking whether a number s is a perfect square by guessing its square root r and then checking to see if r2 = s. (Of course, it only works if you can guess right!)

But how can we guess the rectangular root of q? The computer has a built-in square root function, but no rectangular root function. But that's okay, because they are very nearly the same thing. Rectangular numbers are very nearly squares: when b is large, b2 - b is relatively very close to b2. So if we are given some number q, and we want to know if it has the form b2 - b, we can get a very good estimate of the size of b just by taking √q.

For example, is 29750 a rectangular number? √29750 = 172.48, so if 29750 is rectangular, it will be 1732 - 173 = 29756. So, no.

Is 1045506 rectangular? √1045506 = 1022.4998, so if 1045506 is rectangular, it will be 10232-1023 = 1045506—check!

The only possible problem is that our assumption that b2-b and b2 will be close may not hold when b is too small. So we might need to put a special case in our program to use a different test for rectangularity when q is too small. But it turns out that the test works fine even for q as small as 2: Is 2 rectangular? √2 = 1.4, so if 2 is rectangular, it will be 22-2—check!

So we code up an is_rectangular function:

        sub is_rectangular {
          my $q = shift;
          my $b = 1 + int(sqrt($q));  # Guess
          if ($b * $b - $b == $q) {   # Check guess
            return $b;                # Aha!
          } else {
            return;                   # Not rectangular
          }
        }
This function returns false if its argument is not rectangular; and it returns the rectangular root otherwise. We will need the rectangular root, because that's exactly the lower half of an excellent number.

Now we need a function that tells us whether we have some number a that might be the upper half of an excellent number:

        my $k = shift || 3;
        my $p = 10**$k;

        sub is_upper_half_of_excellent_number {
          my $a = shift;
          return is_rectangular($a * ($p + $a));
        }
$k here is the number of digits in a. We'll let it be a command-line argument, and default to 3.

We could infer k from a itself, but we'll hardwire it for two reasons. One reason is speed. The other is that we might want to accept a number like 0003514284 as being excellent, where here a has some leading zeroes; fixing k is one way to do this. We also precalculate the constant p = 10k, which we need for the calculation.

Now we just write the brute-force loop:

        for my $a (0 .. $p-1) {
          if(my $b = is_upper_half_of_excellent_number($a)) {
            print "$a$b\n";
          }
        }
The program instantaneously coughs up:

        01
        10101
        16128
        34188
        140400
        190476
        216513
        300625
        334668
        416768
        484848
        530901
        6401025
        8701276
Most of these are correct. For example, 9012 - 5302 = 811801 - 280900 = 530901. 5132 - 2162 = 263169 - 46656 = 216513.

The ones at the top of the list are correct, if you remember about the leading zeroes: 1882 - 0342 = 35344 - 1156 = 034188, and 0012 - 0002 = 1 - 0 = 000001.

The seven-digit numbers at the bottom don't work, because the program has a bug: we forgot to enforce the restriction that b must also have exactly k digits; the last two numbers in the display have four-digit b's. So we need to add another test to the main loop:

        for my $a (0 .. $p-1) {
          if(my $b = is_upper_half_of_excellent_number($a)) {
            print "$a$b\n" if length($b) == $k;
          }
        }
This eliminates the wrong answers from the tail of the list. It also skips over cases where b is too short, and needs leading zeroes, such as a=000, b=001, but fixing that is both easy and unimportant.

The ten-digit solutions are:

        0101010101
        3333466668
        4848484848
        4989086476
I think this is interesting because it shows how a very little bit of mathematical analysis, and some very sloppy numerical work, can turn an intractable problem into a tractable one. We could have worked real hard and maybe come up with some way to generate excellent numbers with no search. But instead, we did just a little work, and that was enough to narrow down the search enormously, to the point where it's practical.

The other thing that I think is interesting is that the other programmer was trying to solve his performance problem with optimization tricks. But when you are searching over 9,000,000,000 cases, optimization tricks don't work. At 1,000 cases per second, 9,000,000,000 cases takes about 104 days to finish. If you can optimize your program to speed it up by 50%, it will still take 52 days to finish. But cutting the 9,000,000,000 cases to 90,000 cuts the run time from 104 days to 90 seconds.

Not to say that optimizing programs never works, of course. But you have to know when optimization might work and when it can't. In this case, micro-optimization wasn't going to help; the only way to fix an algorithm that does a brute-force search of 9,000,000,000 cases is to find a better algorithm. The point of this article is to show that sometimes finding a better algorithm can be pretty easy.

[ Addendum 20060620: I wrote a followup article about how it's not a coincidence that 4848484848 is excellent. ]

[ Addendum 20160105: brian d foy has a whole website about his work on this problem. ]


[Other articles in category /math] permanent link