The Universe of Discourse


Thu, 05 Apr 2007

More sawed-off shotguns

[ Note: because this article is in the oops section of my blog, I intend that you understand it as a description of a mistake that I have made. ]

Abhijit Menon-Sen wrote to me to ask for advice in finding the smallest triangular number that has at least 500 divisors. (That is, he wants the smallest n such that both n = (k2 + k)/2 for some integer k and also ν(n) ≥ 500, where ν(n) is the number of integers that divide n.) He said in his note that he believed that brute-force search would take too long, and asked how I might trim down the search.

The first thing that occurred to me was that ν is a multiplicative function, which means that ν(ab) = ν(a)ν(b) whenever a and b are relatively prime. Since n and n-1 are relatively prime, we have that ν(n(n-1)) = ν(n)·ν(n-1), and so if T is triangular, it should be easy to calculate ν(T). In particular, either n is even, and ν(T) = ν(n/2)·ν(n-1), or n is odd, and ν(T) = ν(n)·ν((n-1)/2).

So I wrote a program to run through all possible values of n, calculating a table of ν(n), and then the corresponding ν(n(n-1)/2), and then stopping when it found one with sufficiently large ν.

        my $N = 1;
        my $max = 0;
        while (1) {
            my $n   = $N % 2 ? divisors($N) : divisors($N/2);
            my $np1 = $N % 2 ? divisors(($N+1)/2) : divisors($N+1);
            if ($n * $np1 > $max) {
                $max = $n * $np1;
                print "N=$N; T=", $N*($N+1)/2, "; nd()=$max\n";
            }
            last if $max >= 500;
            $N++;
        }
There may be some clever way to quickly calculate ν(n) in general, but I don't know it. But if you have the prime factorization of n, it's easy: if n = p1a1p2a2... then ν(n) = (a1 + 1)(a2 + 1)... . This is a consequence of the multiplicativity of ν and the fact that ν(pn) is clearly n+1. Since I expected that n wouldn't get too big, I opted to factor n and to calculate ν from the prime factorization:

        my @nd;
        sub divisors {
            my $n = shift;
            return $nd[$n] if $nd[$n];
            my @f = factor($n);
            my $ND = 1;
            my $cur = 0;
            my $curct = 0;
            while (@f) {
                my $next = shift @f;
                if ($next != $cur) {
                    $cur = $next;
                    $ND *= $curct+1;
                    $curct = 1;
                } else {
                    $curct++;
                }
            }
            $ND *= $curct+1;
            return $ND;
        }
Unix comes with a factor program that factors numbers pretty quickly, so I used that:
        sub factor {
            my $r = qx{factor $_[0]};
            my @f = split /\s+/, $r;
            shift @f;
            return @f;
        }
This found the answer, 76,576,500, in about a minute and a half. (76,576,500 = 1 + 2 + ... + 12,375, and has 576 factors.) I sent this off to Abhijit.

I was rather pleased with myself, so I went onto IRC to boast about my cleverness. I posed the problem, and rather than torment everyone there with a detailed description of the mathematics, I just said that I had come up with some advice about how to approach the problem that turned out to be good advice.

A few minutes later one of the gentlemen on IRC, who goes by "jeek", (real name T.J. Eckman) asked me if 76,576,500 was the right answer. I said that I thought it was and asked how he'd found it. I was really interested, because I was sure that jeek had no idea that ν was multiplicative or any of that other stuff. Indeed, his answer was that he used the simplest possible brute force search. Here's jeek's program:

        $x=1; $y=0; 
        while(1) { 
          $y += $x++; $r=0; 
          for ($z=1; $z<=($y ** .5); $z++) { 
            if (($y/$z) == int($y/$z)) { 
              $r++; 
              if (($y/$z) != ($z)) { $r++; } 
           }
          } 
          if ($r>499) {print "$y\n";die}
        }
(I added whitespace, but changed nothing else.)

In this program, the variable $y holds the current triangular number. To calculate ν(y), this program just counts $z from 1 up to √y, incrementing a counter every time it discovers that z is a divisor of y. If the counter exceeds 499, the program prints y and stops. This takes about four and a half minutes.

It takes three times as long, but uses only one-third the code. Beginners may not see this as a win, but it is a huge win. It is a lot easier to reduce run time than it is to reduce code size. A program one-third the size of another is almost always better—a lot better.

In this case, we can trim up some obvious inefficiencies and make the program even smaller. For example, the tests here can be omitted:

              if (($y/$z) != ($z)) { $r++; }
It can yield false only if y is the square of z. But y is triangular, and triangular numbers are never square. And we can optimize away the repeated square root in the loop test, and use a cheaper and simpler $y % $z == 0 divisibility test in place of the complicated one.

        while(1) { 
          $y += $x++; $r=0;
          for $z (1 .. sqrt($y)) {
            $y % $z == 0 and $r+=2; 
          }
          if ($r>499) {print "$x $y\n";die}
        }
The program is now one-fifth the size of mine and runs in 75 seconds. That is, it is now smaller and faster than mine.

This shows that jeek's approach was the right one and mine was wrong, wrong, wrong. Simple programs are a lot easier to speed up than complicated ones. And I haven't even consider the cost of the time I wasted writing my complicated program when I could have written Jeek's six-liner that does the same thing.

So! I had to write back to my friend to tell him that my good advice was actually bad advice.

The sawed-off shotgun wins again!

[ Addendum 20070405: Robert Munro pointed out an error in the final version of the program, which I have since corrected. ]

[ Addendum 20070405: I said that triangular numbers are never square, which is manifestly false, since 1 is both, as is 36. I should have remembered this, since some time in the past three years I investigated this exact question and found the answer. But it hardly affects the program. The only way it could cause a failure is if there were a perfect square triangular number with exactly 499 factors; such a number would be erroneously reported as having 500 factors instead. But the program reports a number with 576 factors instead. Conceivably this could actually be a misreported perfect square with only 575 factors, but regardless, the reported number is correct. ]

[ Addendum 20060617: I have written an article about triangular numbers that are also square. ]


[Other articles in category /oops] permanent link