The Universe of Discourse

Wed, 15 Mar 2006

Why pi is 3
At the end of my post about why π is so peculiar, I said:

Simon [Cozens] also asked me why the number came out to be around 3, rather than around 5 or 57, and there I was on much shakier ground. I did not have any clever insights, and all I could do was itemize a bunch of stuff that seemed to bear on the issue. It will probably appear here in a future article.

1. The most obviously germane fact I came up with was this: Inscribe a regular hexagon in the unit circle. Such a hexagon obviously has a perimeter of 6. The circle goes through the same six points, but instead of taking direct paths between them, it takes a circuitous route, so its perimeter is a bit more than 6. Therefore pi is a bit more than 3.

Several people have written to me to point this out, and nobody has pointed out anything different, which I think supports my contention that this is the most obviously germane fact available.

Also, as I replied to M. Cozens:

I do not know any way to calculate the perimeter of a circle without considering it as a limiting case of a polygon with a lot of very short sides, so I think any investigation of why pi is 3.14 and not something else will have to start here.

If you circumscribe a hexagon around the circle, a little basic geometry reveals an upper bound: π < 2√3. By using polygons with more sides, you get better bounds. With a square, you get only that 2√2 < π < 4, for example; the hexagon improves this to 3 < π < 2√3. About 2200 years ago Archimedes did the calculation for 96-gons and got the value correct to two decimal places: 3 + 10/71 < π < 3 + 1/7.

(This raises an interesting question: with a 96-gon, you would expect the bounds to involve things like √3, like the hexagon does. Where do the weird fractions 10/71 and 1/7 come from? Answer: A bound of the type 2√3 was of limited use to the Greeks, because it replaces the poorly-understood number π with another poorly-understood number √3. So Archimedes replaced surds with rational approximations; for example, early on he replaces √3 with the rational approximation 265/153. (See Dr. Chuck Lindsey's detailed explanation of Archimedes' calculation, and my explanation of where 265/153 comes from.) I'd like to work through this and see what he would have come up with if he had done the exact calculation, but it'll take me some time.)

Anyway, the other items I sent to M. Cozens were:

2. The shortest curve that can enclose a unit area has length 2π. (Or conversely, the largest area that can be enclosed by a unit path is 1/4π.)

This is going to depend strongly on the Euclidean metric again. I don't know how to extend a general metric to give an area measure; indeed, I'm not sure yet of a sensible way to ask the question. I have to think about it. (Yes, I'm sure someone has already studied this, and I could simply look it up, but I will get a lot more out of the answer if I think about the question myself for a while before peeking in the back of the book. There is, as they say, no royal road to geometry.)

The "wordless" proof of the Pythagorean theorem shows that I'll have to be very careful in making the extension from length to area in Manhattan:

(For more about metrics, including the Manhattan metric, see my article on metric spaces.)

Independent of the metric, this proof demonstrates that the two small white thingies on the left have the same area as the large white thingy on the right. In Euclidean space, this equality establishes the Pythagorean theorem. It had better not do so in Manhattan, because the Pythagorean theorem is false in Manhattan; in the diagram, c is not equal to √(a² + b²) but to a + b.

I think I've convinced myself that a square with side s in Manhattan still has area s². And I'm pretty sure that those two white thingies on the left are squares, and so have areas a² and b², respectively.

But this implies that the large white thing on the right has area a² + b², and therefore that it is not a square, and does not have area c² as labeled, because c = a + b, and c² is not equal to a² + b².

Squares in Manhattan are required to have edges that are parallel to the coordinate axes. I think. I don't know where to go next; maybe I'll figure it out on the way home from work today.

The next item is my second favorite, after the observation about the inscribed hexagon:

3. If you put a penny on the table, then you can get at most six other pennies to touch it at the same time. This is closely related to the fact that 6 is the largest integer less than 2π. Analogous results hold in higher dimensions. The area of a sphere is 4π, or about 12.5; you can get 12 spheres to touch another sphere at the same time, but not 13.

This business of the spheres touching a central sphere is known as the "kissing number problem"; we say that the "kissing number in two dimensions" is 6, and the "kissing number in three dimensions" is 12.

After this item, I was pretty much out of circle-related facts. So I switched tactics and tried to look at things that seemed completely unrelated to circles:

4. &pi satisfies the equation:
x - x³/6 + x⁵/120 - x⁷/5040 + ... = 0
This was the only thing I could come up with that seemed both fairly elementary, and at the same time a good way to get π out without putting it in to begin with.

M. Cozens had observed that you can get π by using integral calculus to calculate the area of a unit circle:

But he complained (rightly, I think) that this gives you no insight at all as to where the π comes from, because the π sneaks in as arccos(1) when you do the trig substitution. And since the π was one of the principal ingredients in the recipe for the cosine function to begin with, all that has happened is that you got out what you put in. (The cosine function relates the length of a circular arc to the length of a related straight segment. π gets in there because the segment has length 0 when the arc has length π/2. But then you're right back at the mystery of why a complete circular arc has length 2π.) The integral acquires the π from the cosine function, and the cosine function got it directly from the length of the circumference.

So I started trying to come up with ways to get π that seem to have nothing to do with circles. The infinite polynomial was the first thing I came up with.

It can be related to the circles, but not easily, which I think is an advantage. You need to be able to relate it to circles, or else it doesn't tell you anything about why the perimeter of the circle is pi. But it mustn't be too closely related, because I think that items 1-3 probably exhaust what can be gotten directly from the circles.

I just know that some smart person out there is itching to point out that the polynomial is just the Maclaurin expansion for sin(π), and of course that is how I came up with it. (What, did you think it was just a lucky guess?) But if you did not know about the Maclaurin series, you might be quite shocked to discover that π was a zero of this expression. The terms are already starting to get small by the time you get to π⁹/362880, so in spite of the transcendentality of π we have a 9th-degree polynomial of which it is almost a zero, a polynomial that is based on elementary notions, in which there is no obvious circle.

Item 5 was the Buffon's needle problem, but I said that the appearance of π there appeared to be an obvious consequence of its appearing as the perimeter of a unit circle, so let's pass on to the next thing.

6. The probability that two randomly-selected integers are relatively prime is 6/π². I said:

This gets π out, without putting it in anywhere obvious, but does not seem to me to be elementary. And how you could relate it to the circle, I have no idea.

But now, the relationship with circles seems somewhat clearer to me. You can turn this into a geometry problem like this: You are standing at the origin, looking out on an infinite orchard of apple trees. There is a tree at (a, b) for every pair of integers. The trees have zero width, but when one tree is directly behind another tree, it is blocked and you cannot see it. What fraction of the trees are visible?

There is one visible tree for each (rational) direction you can look in. So there's a relationship between the points on a circle and the visible trees.

(Digression: if the trees have positive diameter, only a finite number are visible from the origin. If the diameter is d, let the number of visible trees be v(d). Estimate v. I believe this problem is still open.)

7. The next item I mentioned was that 1 + 1/4 + 1/9 + ... = π²/6. This is probably related to the orchard thing somehow.

It might be that a good understanding of this identity will lead one to a good understanding of why π is a bit more than 3. It might also be that it has some relationship with the circle. But I told M. Cozens that if he wanted someone to make sense out of this, "you really need to be talking to someone with expertise in analytic number theory, instead of to me." I'll stand by that.

8. Finally, I pointed out that π does not appear only in circles; it also appears in spheres. For example, the volume of a unit sphere is 4π/3. By this time I was scraping the barrel. It is pretty obvious that π is going to get into spheres because spheres are just stacks of circles, and π is already in the circles. Adding together a bunch of line segments that have no relation more complicated than a square root is one thing; it is surprising to see π come out of that. But adding together a bunch of circles that all involve π and getting out something that involves π again is no surprise.

As I said, the inscribed hexagon thing sweems the most germane, followed closely by the kissing number.

A couple of people have written to me to point out that π also appears in a number of constants and laws from physics, such as Coulomb's law. I believe that these appearances are invariably derived from the appearance of π as the circumference, and, in many cases, that this is quite obvious. I'll address this in detail in a future article. The inclusion of π in these formulas signals their dependence on Euclidean space, which has some interesting implications, since general relativity claims that real space is non-Euclidean: we shouldn't expect Coulomb's law to hold over large distances, for example. I imagine that this is old news to the astrophysicists, but it might be a surprise to the physics graduate students.

[Other articles in category /math] permanent link