In this section:
Mon, 22 Jun 2015
In late April I served a residency at Recurse Center, formerly known as Hacker School. I want to write up what I did before I forget.
Recurse Center bills itself as being like a writer's retreat, but for programming. Recursers get better at programming four days a week for three months. There are some full-time instructors there to help, and periodically a resident, usually someone notable, shows up for a week. It's free to students: RC partners with companies that then pay it a fee if they hire a Recurser.
I got onto the RC chat system and BBS a few weeks ahead and immediately realized that it was going to be great. I am really wary about belonging to groups, but I felt like I fit right in at RC, in a way that I hadn't felt since I went off to math camp at age 14. Recurse Center isn't that different from math camp now that I think about it.
The only prescribed duty of a resident is to give a half-hour talk on Monday night, preferably on a technical topic. I gave mine on the history and internals of lightweight hash structures in programming languages like Python and Perl. (You can read all about that if you want to.)
Here's what else I did:
There was a lot of other stuff. I met or at least spoke with around 90% of the seventy or so Recursers who were there with me. I attended the daily stand-up status meetings with a different group each time. I ate lunch and dinner with many people and had many conversations. I went out drinking with Recursers at least once. The RC principals kindly rescheduled the usual Thursday lightning talks to Monday so I could attend. I met Erik Osheim for lunch one day. And I baked cookies for our cookie-decorating party!
It was a great time, definitely a high point in my life. A thousand thanks to RC, to Rachel Vincent and Dave Albert for essential support while I was there, and to the facilitators, principals, and especially to the other Recursers.
Thu, 02 Apr 2015
We had a party last week for Toph's 7th birthday, at an indoor rock-climbing gym, same as last year. Last year at least two of the guests showed up and didn't want to climb, so Lorrie asked me to help think of something for them to do if the same thing happened this year. After thinking about it, I decided we should have cookie decorating.
This is easy to set up and kids love it. I baked some plain sugar cookies, bought chocolate, vanilla, and strawberry frosting, several tubes of edible gel, and I mixed up five kinds of colored sugar. We had some colored sprinkles and little gold dragées and things like that. I laid the ingredients out on the table in the gym's side room with some plastic knives and paintbrushes, and the kids who didn't want to climb, or who wanted a break from climbing, decorated cookies. It was a great success. Toph's older sister Katara had hurt her leg, and couldn't climb, so she helped the littler kids with cookies. Even the tiny two-year-old sister of one of the guests was able to participate, and enjoyed playing with the dragées.
(It's easy to vary the project depending on how much trouble you want to take. I made the cookies from scratch, which is pretty easy, but realized later I could have bought prefabricated cookie batter, which would have been even easier. The store sold colored sugar for $3.29 for four ounces, which is offensive, so I went home and made my own. You put one drop of food coloring per two ounces of sugar in a sealed container and shake it up for a minute, for a total cost of close to zero; Toph helped with this. I bought my frosting, but when my grandmother used to do it she'd make a simple white frosting from confectioners' sugar and then color it with food coloring.)
I was really pleased with the outcome, and not just because the guests liked it, but also because it is a violation of gender norms for a man to plan a cookie-decorating activity and then bake the cookies and prepare the pastel-colored sugar and so forth. (And of course I decorated some cookies myself.) These gender norms are insidious and pervasive, and to my mind no opportunity to interfere with them should be wasted. Messing with the gender norms is setting a good example for the kids and a good example for other dads and for the rest of the world.
I am bisexual, and sometimes I feel that it doesn't affect my life very much. The sexual part is mostly irrelevant now; I fell in love with a woman twenty years ago and married her and now we have kids. I probably won't ever have sex with another man. Whatever! In life you make choices. My life could have swung another way, but it didn't.
But there's one part of being bisexual that has never stopped paying dividends for me, and that is that when I came out as queer, it suddenly became apparent to me that I had abandoned the entire gigantic structure of how men are supposed to behave. And good riddance! This structure belongs in the trash; it completely sucks. So many straight men spend a huge amount of time terrified that other straight men will mock them for being insufficiently manly, or mocking other straight men for not being sufficiently manly. They're constantly wondering "if I do this will the other guys think it's gay?" But I've already ceded that argument. The horse is out of the barn, and I don't have to think about it any more. If people think what I'm doing is gay, that's a pill I swallowed when I came out in 1984. If they say I'm acting gay I'll say "close, but actually, I'm bi, and go choke on a bag of eels, jackass."
You don't have to be queer to opt out of straight-guy bullshit, and I think I would eventually have done it anyway, but being queer made opting out unavoidable. When I was first figuring out being queer I spent a lot of time rethinking my relationship to society and its gender constructions, and I felt that I was going to have to construct my own gender from now and that I no longer had the option of taking the default. I wasn't ever going to follow Rule Number One of Being a Man (“do not under any circumstances touch, look at, mention, or think about any dick other than your own”), so what rules was I going to follow? Whenever someone tried to pull “men don't” on me, (or whenever I tried to pull it on myself) I'd immediately think of all the Rule Number One stuff I did that “men don't” and it would all go in the same trash bin. Where (did I say this already?) it belongs.
Opting out frees up a lot of mental energy that I might otherwise waste worrying about what other people think of stuff that is none of their business, leaving me more space to think about how I feel about it and whether I think it's morally or ethically right and whether it's what I want. It means that if someone is puzzled or startled by my pink sneakers, I don't have to care, except I might congratulate myself a little for making them think about gender construction for a moment. Or the same if people find out I have a favorite flower (CROCUSES YEAH!) or if I wash the dishes or if I play with my daughters or watch the ‘wrong’ TV programs or cry or apologize for something I did wrong or whatever bullshit they're uncomfortable about this time.
Opting out frees me up to be a feminist; I don't have to worry that a bunch of men think I'm betraying The Team, because I was never on their lousy team in the first place.
And it frees me up to bake cookies for my kid's birthday party, to make a lot of little kids happy, and to know that that can only add to, not subtract from, my identity. I'm Dominus, who loves programming and mathematics and practicing the piano and playing with toy octopuses and decorating cookies with a bunch of delightful girls.
This doesn't have to be a big deal. Nobody is likely to be shocked or even much startled by Dad baking cookies. But these tiny actions, chipping away at these vile rules, are one way we take tiny steps toward a better world. Every kid at that party will know, if they didn't before, that men can and do decorate cookies.
And perhaps I can give someone else courage to ignore some of that same bullshit that prevents all of us from being as great as we could and should be, all those rules about stuff men aren't supposed to do and other stuff women aren't supposed to do, that make everyone less. I decided about twenty years ago that that was the best reason for coming out at all. People are afraid to be different. If I can be different, maybe I can give other people courage and comfort when they need to be different too. As a smart guy once said, you can be a light to the world, like a city on a hilltop that cannot be hidden.
And to anyone who doesn't like it, I say:
Sun, 20 Jul 2014
As I've discussed elsewhere, I once wrote a program to enumerate all the possible quilt blocks of a certain type. The quilt blocks in question are, in quilt jargon, sixteen-patch half-square triangles. A half-square triangle, also called a “patch”, is two triangles of fabric sewn together, like this:
Then you sew four of these patches into a four-patch, say like this:
Then to make a sixteen-patch block of the type I was considering, you take four identical four-patch blocks, and sew them together with rotational symmetry, like this:
It turns out that there are exactly 72 different ways to do this. (Blocks equivalent under a reflection are considered the same, as are blocks obtained by exchanging the roles of black and white, which are merely stand-ins for arbitrary colors to be chosen later.) Here is the complete set of 72:
It's immediately clear that some of these resemble one another, sometimes so strongly that it can be hard to tell how they differ, while others are very distinctive and unique-seeming. I wanted to make the computer classify the blocks on the basis of similarity.
My idea was to try to find a way to get the computer to notice which blocks have distinctive components of one color. For example, many blocks have a distinctive diamond shape in the center.
Some have a pinwheel like this:
which also has the diamond in the middle, while others have a different kind of pinwheel with no diamond:
I wanted to enumerate such components and ask the computer to list which blocks contained which shapes; then group them by similarity, the idea being that that blocks with the same distinctive components are similar.
The program suite uses a compact notation of blocks and of shapes that makes it easy to figure out which blocks contain which distinctive components.
Since each block is made of four identical four-patches, it's enough just to examine the four-patches. Each of the half-square triangle patches can be oriented in two ways:
Here are two of the 12 ways to orient the patches in a four-patch:
Each 16-patch is made of four four-patches, and you must imagine that the
four-patches shown above are in the upper-left position in the
16-patch. Then symmetry of the 16-patch block means that triangles with the
same label are in positions that are symmetric with respect to the
entire block. For example, the two triangles labeled
Triangles must be colored opposite colors if they are part of the same patch, but other than that there are no constraints on the coloring.
A block might, of course, have patches in both orientations:
All the blocks with diagonals oriented this way are assigned
descriptors made from the letters
Once you have chosen one of the 12 ways to orient the diagonals in the
four-patch, you still have to color the patches. A descriptor like
In each case, all four diagonals run from northwest to southeast. (All other ways of coloring this four-patch are equivalent to one of these under one or more of rotation, reflection, and exchange of black and white.)
We can describe a patch by listing the descriptors of the eight triangles, grouped by which triangles form connected regions. For example, the first block above is:
because there's an isolated white
The other five
All six have
I made up a list of the descriptors for all 72 blocks; I think I did
this by hand. (The work directory contains a
and it will only be significant if the
in which case you get this distinctive and interesting-looking hook component.
There is only one block that includes this distinctive hook component;
it has descriptor
and the blocks formed from such patches always have a distinctive half-diamond component on each edge, like this:
(The stippled areas vary from block to block, but the blocks with
The blocks listed at http://hop.perl.plover.com/quilt/analysis/images/ee.html
all have the
Other distinctive components have similar short descriptors. The two pinwheels I
mentioned above are
so the full sixteen-patch looks like this:
where the stippled parts can vary. A look at the list of blocks with
I had made a list of the descriptors for each of the the 72 blocks, and from this I extracted a list of the descriptors for interesting component shapes. Then it was only a matter of finding the component descriptors in the block descriptors to know which blocks contained which components; if the two blocks share two different distinctive components, they probably look somewhat similar.
Then I sorted the blocks into groups, where two blocks were in the same group if they shared two distinctive components. The resulting grouping lists, for each block, which other blocks have at least two shapes in common with it. Such blocks do indeed tend to look quite similar.
This strategy was actually the second thing I tried; the first thing didn't work out well. (I forget just what it was, but I think it involved finding polygons in each block that had white inside and black outside, or vice versa.) I was satisfied enough with this second attempt that I considered the project a success and stopped work on it.
The complete final results were:
It may also be interesting to browse the work directory.
Tue, 24 Dec 2013
Every week one of the founders of my company sends around a miscellaneous question, and collates the answers, which are shared with everyone on Monday. This week the question was “What's the best advice you've ever heard?”
My first draft went like this:
I thought about that for a while and wondered if I could think of anything else to write down. I added:
I thought about that for a while and decided that nothing was a much cleverer thing to say, and I had better take my own advice. So I scrubbed it all out.
I did finally find a good answer. I told everyone that when I was fifteen, my cousin Alex, who is a chemistry professor, told me never to go anywhere without a pen and paper. That may actually be the best advice I have ever received, and I do think it beats out the TA's.
Fri, 10 Feb 2012
I abandon my abusive relationship with Facebook
This interview with Eben Moglen provides many of the reasons, and was probably as responsible as anything for my decision.
But the straw that broke the camel's back was a tiny one. What finally pushed me over the edge was this: "People who can see your info can bring it with them when they use apps.". This time, that meant that when women posted reviews of men they had dated on a dating review site, the review site was able to copy the men's pictures from Facebook to insert into the reviews. Which probably was not what the men had in mind when they first posted those pictures to Facebook.
This was, for me, just a little thing. But it was the last straw because when I read Facebook's explanation of why this was, or wasn't, counter to their policy, I realized that with Facebook, you cannot tell the difference.
For any particular appalling breach of personal privacy you can never guess whether it was something that they will defend (and then do again), or something that they will apologize for (and then do again anyway). The repeated fuckups for which they are constantly apologizing are indistinguishable from their business model.
So I went to abandon my account, and there was a form they wanted me to fill out to explain why: "Reason for leaving (Required)": One choice was "I have a privacy concern.":
There was no button for "I have 53 privacy concerns", so I clicked that one. A little yellow popup box appeared, which you can see in the screenshot. It said:
Please remember that you can always control the information that you share and who can see it. Before you deactivate, please take a moment to learn more about how privacy works on Facebook. If there is a specific question or concern you have, we hope you'll let us know so we can address it in the future.It was really nice of Facebook to provide this helpful reminder that their corporation is a sociopath: "Please remember that you can always control the information that you share and who can see it. You, and my wife, Morgan Fairchild."
Facebook makes this insane claim in full innocence, expecting you to believe it, because they believe it themselves. They make this claim even after the times they have silently changed their privacy policies, the times they have silently violated their own privacy policies, the times they have silently opted their users into sharing of private information, the times they have buried the opt-out controls three pages deep under a mountain of confusing verbiage and a sign that said "Beware of The Leopard".
There's no point arguing with a person who makes a claim like that. Never mind that I was in the process of deactivating my account. I was deactivating my account, and not destroying it, because they refuse to destroy it. They refuse to relinquish the personal information they have collected about me, because after all it is their information, not mine, and they will never, ever give it up, never. That is why they allow you only to deactivate your account, while they keep and continue to use everything, forever.
But please remember that you can always control the information that you share and who can see it. Thanks, Facebook! Please destroy it all and never let anyone see it again. "Er, no, we didn't mean that you could have that much control."
This was an abusive relationship, and I'm glad I decided to walk away.
[ Addendum 20120210: Ricardo Signes points out that these is indeed an option that they claim will permanently delete your account, although it is hard to find. ]
Mon, 13 Jun 2011
Longshot bets by time travelers
If you're a time traveller, one way to make money is by placing bets on events that you already know the outcomes of. The stock market is only the most obvious example of this; who wouldn't have liked to have invested in IBM in 1916?I never finished this article, and I no longer remember where I planned to take it from there. I think I probably abandoned it because I realized that nothing else I could say would be as interesting as the Bosom Buddies thing. I don't think I could think of any examples that were less likely-seeming.
Until today. I wonder what sort of odds you could have gotten in 1995 betting that within twenty years, the creators of "What Would Brian Boitano Do" ("Dude! Don't say 'pigfucker' in front of Jesus!") would win the Tony Award for "Best Musical".
Thu, 05 May 2011
The toilet snake costs ten dollars. It is a three-foot long steel spring with a pointy corkscrew on the end. You work it down the toilet pipe while twisting it. It either breaks up the blockage, pokes the blockage far enough down the pipe that it can be flushed away, or else the corkscrew tangles in the blockage and you can pull it out. Professional plumbers use a much larger, motorized version to unclog sewer pipes.
People on the train stare at you if you are carrying around a toilet snake, I learned.
When I got home from work I snaked the toilet, and lo and behold, it disgorged a plastic comb. Mission accomplished, most likely.
Later on, Toph, who is three now and fascinated with all things poop, asked me for the umpteenth time why the toilet was clogged. This time I had an answer. "It clogged because someone flushed a comb down it. Do you have any idea who might have done that?"
Toph considered. "I don't know..." she said, thoughtfully. Then her face brightened. "Maybe Katara did it!"
Fri, 05 Nov 2010
Linus Torvalds' Greatest Invention
On Wednesday night I gave a talk for the Philadelphia Linux Users' Group on "Linus Torvalds' Greatest Invention". I was hoping that some people would show up expecting it to be Linux, but someone spilled the beans ahead of time and everyone knew it was going to be about Git. But maybe that wasn't so bad because a bunch of people showed up to hear about Git.
But then they were probably disappointed because I didn't talk at all about how to use Git. I talked a little about what it would do, and some about implementation. I used to give a lot of talks of the type "a bunch of interesting crap I threw together about X", but in recent years I've felt that a talk needs to have a theme. So for quite a while I knew I wanted to talk about Git internals, but I didn't know what the theme was.
Then I realized that the theme could be that Git proves that Linus was really clever. With Linux, the OS was already designed for him, and he didn't have to make a lot of the important decisions, like what it means to be a file. But Git is full of really clever ideas that other people might not have thought of.
Anyway, the slides are online if you are interested in looking.
Thu, 21 May 2009
Tue, 17 Feb 2009
It appears that the second-largest city in New York state is some place called (get this) "Buffalo". Okay, whatever. But that got me wondering if New York was the state with the greatest disparity between its largest and second-largest city. Since I had the census data lying around from a related project (and a good thing too, since the Census Bureau website moved the file) I decided to find out.
The answer is no. New York state has only one major city, since its next-largest settlement is Buffalo, with 1.1 million people. (Estimated, as of 2006.) But the second-largest city in Illinois is Peoria, which is actually the punchline of jokes. (Not merely because of its small size; compare Dubuque, Iowa, a joke, with Davenport, Iowa, not a joke.) The population of Peoria is around 370,000, less than one twenty-fifth that of Chicago.
But if you want to count weird exceptions, Rhode Island has everyone else beat. You cannot compare the sizes of the largest and second-largest cities in Rhode Island at all. Rhode Island is so small that it has only one city, Seriously. No, stop laughing! Rhode Island is no laughing matter.
The Articles of Confederation required unanimous consent to amend, and Rhode Island kept screwing everyone else up, by withholding consent, so the rest of the states had to junk the Articles in favor of the current United States Constitution. Rhode Island refused to ratify the new Constitution, insisting to the very end that the other states had no right to secede from the Confederation, until well after all of the other twelve had done it, and they finally realized that the future of their teeny one-state Confederation as an enclave of the United States of America was rather iffy. Even then, their vote to join the United States went 34–32.
But I digress.
Actually, for many years I have said that you can impress a Rhode Islander by asking where they live, and then—regardless of what they say—remarking "Oh, that's near Providence, isn't it?" They are always pleased. "Yes, that's right!" The census data proves that this is guaranteed to work. (Unless they live in Providence, of course.)
Here's a joke for mathematicians. Q: What is Rhode Island? A: The topological closure of Providence.
Okay, I am finally done ragging on Rhode Island.
Here is the complete data, ordered by size disparity. I wasn't sure whether to put Rhode Island at the top or the bottom, so I listed it twice, just like in the Senate.
Some of this data is rather odd because of the way the census bureau aggregates cities. For example, the largest city in New Jersey is Newark. But Newark is counted as part of the New York City metropolitan area, so doesn't count separately. If it did, New Jersey's quotient would be 5.86 instead of 1.35. I should probably rerun the data without the aggregation. But you get oddities that way also.
I also made a scatter plot. The x-axis is the population of the largest city, and the y-axis is the population of the second-largest city. Both axes are log-scale:
Nothing weird jumps out here. I probably should have plotted population against quotient. The data and programs are online if you would like to mess around with them.
I gratefully acknowledge the gift of Tim McKenzie. Thank you!
Sun, 15 Feb 2009
Milo of Croton and the sometimes failure of inductive proofs
"Did it work?" I asked.
"No," said Ranjit. "A newborn calf already weighs like a hundred pounds."
Usually you expect the induction step to fail, but sometimes it's the base case that gets you.
Fri, 13 Feb 2009
Anyway, it's been on my to-do list for about three years, so what the hell.
Around 1986, I heard it claimed that Ronald Reagan did not have practical qualifications for the presidency, because he had not been a lawyer or a general or anything like that, but rather an actor. "An actor?" said this person. "How does being an actor prepare you to be President?"
I pointed out that he had also been the Governor of California.
But it doesn't even stop there. Who says some actor is qualified to govern California? Well, he had previously been president of the Screen Actors' Guild, which seems like a reasonable thing for the Governor of California to have done.
Around 1992, I was talking to a woman who claimed that the presidency was not open to the disabled, because the President was commander-in-chief of the army, he had to satisfy the army's physical criteria, and they got to disqualify him if he couldn't complete basic training, or something like that. I asked how her theory accommodated Ronald Reagan, who had been elected at the age of 68 or whatever. Then I asked how the theory accommodated Franklin Roosevelt, who could not walk, or even stand without assistance, and who traveled in a wheelchair.
I was once harangued by someone for using the phrase "my girlfriend." "She is not 'your' girlfriend," said this knucklehead. "She does not belong to you."
Sometimes you can't think of the right thing to say at the right time, but this time I did think of the right thing. "My father," I said. "My brother. My husband. My doctor. My boss. My congressman."
My notes also suggest a long article about dumb theories in general. For example, I once read about someone who theorized that people were not actually smaller in the Middle Ages than they are today. We only think they were, says this theory, because we have a lot of leftover suits of armor around that are too small to fit on modern adults. But, according to the theory, the full-sized armor got chopped up in battles and fell apart, whereas the armor that's in good condition today is the armor of younger men, not full-grown, who outgrew their first suits, couldn't use them any more, and hung then on the wall as mementoes. (Or tossed them in the cellar.)
I asked my dad about this, and he wanted to know how that theory applied to the low doorways and small furniture. Heh.
I think Herbert Illig's theory is probably in this category, Herbert Illig, in case you missed it, believes that this is actually the year 1712, because the years 614–911 never actually occurred. Rather, they were created by an early 7th-century political conspiracy to rewrite history and tamper with the calendar. Unfortunately, most of the source material is in German, which I cannot read. But it would seem that cross-comparisons of astronomical and historical records should squash this theory pretty flat.
In high school I tried to promulgate the rumor that John Entwistle and Keith Moon were so disgusted by the poor quality of the album Odds & Sods that they refused to pose for the cover photograph. The rest of the band responded by finding two approximate lookalikes to stand in for Moon and Enwistle, and by adopting a cover design calculated to obscure the impostors' faces as much as possible.
This sort of thing was in some ways much harder to pull off successfully in 1985 than it is today. But if you have heard this story before, please forget it, because I made it up.
Addendum 20150513: I have several times heard an argument against hate speech laws on the grounds that no other law punishes a person not for what they did but for what was in their mind. There may be good arguments against hate speech laws, this is not one. If you are the lookout for a failed bank robbery, you will be charged with bank robbery, despite having done nothing but stand on a public streetcorner; if you were the getaway driver you will be charged with bank robbery even if you paid the parking meter. Honest mistakes are distinguished from from fraud based on state of mind: fraud requires an intent to deceive. Assault, battery, and even homicide can be defensible if you are acting in self-defense, which requires a belief that one is in imminent danger. Identical behavior can result in a charge of criminal negligence, manslaughter, second-degree murder, or first-degree murder depending only on the perpetrator's state of mind. Criminal conspiracy is just two people talking in a room, nothing illegal about that, they might have been discussing a mystery novel, except they weren't, they were planning a crime. And so on.
I would like to acknowledge the generous gift of Jack Kennedy. Thank you very much!
This acknowledgement is not intended to be apropos of this blog post. I just decided I should start acknowledging gifts, and this happened to be the first post since I made the decision.
[ Addendum 20090214: A similar equivocation of "your" is mentioned by Plato. ]
Mon, 01 Dec 2008
License plate sabotage
So it is too with car license plates, and for a number of years I have toyed with the idea of getting a personalized plate with II11I11I or 0OO0OO00 or some such, on the theory that there is no possible drawback to having the least legible plate number permitted by law. (If you are reading this post in a font that renders 0 and O the same, take my word for it that 0OO0OO00 contains four letters and four digits.)
A plate number like O0OO000O increases the chance that your traffic tickets (or convictions!) will be thrown out because your vehicle has not been positively identified, or that some trivial clerical error will invalidate them.
Recently a car has appeared in my neighborhood that seems to be owned by someone with the same idea:
In case it's hard to make out in the picture—and remember that that's the whole idea—the license number is 00O0O0. (That's two letters and four digits.) If you are reading this in a font in which 0 and O are difficult or impossible to distinguish—well, remember that that's the whole idea.
Other Pennsylvanians should take note. Consider selecting OO0O0O, 00O00, and other plate numbers easily confused with this one. The more people with easily-confused license numbers, the better the protection.
Tue, 29 Jan 2008
The Census Bureau's data file
The data is available from the Census Bureau's web site. It is a CSV file. Most of the file contains actual data, like this:
20220,,"Dubuque, IA",Metropolitan Statistical Area,"92,384","91,603","91,223","90,635","89,571","89,216","89,265","89,156","89,143"Experienced data mungers will feel a sense of foreboding as they look at the commas in those numerals. Commas are for people, and if the data file is written for people, rather than for computers, then getting the computer to read it is going to require at least a little bit of suffering. Indeed, the rest of the data is rather dirty. There is a useless header:
table with row headers in column A and column headers in rows 3 through 4 (leading dots indicate sub-parts),,,,,,,,,,,,^M "Table 1. Annual Estimates of the Population of Metropolitan and Micropolitan Statistical Areas: April 1, 2000 to July 1, 2006",,,,,,,,,,,,^M CBSA Code,"Metro Division Code",Geographic area,"Legal/statistical area description",Population estimates,,,,,,,"April 1, 2000",^M ,,,,"July 1, 2006","July 1, 2005","July 1, 2004","July 1, 2003","July 1, 2002","July 1, 2001","July 1, 2000",Estimates base,Census^M ,,Metropolitan statistical areas,,,,,,,,,,^MAnd there is a similarly useless footer on the bottom of the file. Any program that wants to use this data has to trim off the header and the footer, or ignore them, or the user will have to trim them off manually.
(I've translated ASCII CR characters to ^M sequences so that you can see that although the lines of the file are CR-LF terminated, some of the items contain extra LFs for no particular reason.)
Well, all this is minor. My real complaint is that some of the state name abbreviations are garbled:
19740,,"Denver-Aurora, CO1",Metropolitan Statistical Area,"2,408,750","2,361,778","2,326,126","2,299,879","2,276,592","2,245,030","2,193,737","2,179,320","2,179,240"Notice that it says CO1 rather than CO, short for "Colorado". I was fortunate to notice this garbling. Since it occurred on the line for Denver (among others) the result was that the program was unable to locate the population of Denver, which is the capital of Colorado, and a mandatory part of the program's output. So it raised a warning. Then I went in and manually corrected the CO1 to say CO. I also added a check to the program to make sure that it recognized all the state abbreviations; I should have had this in there in the first place.
Then I sent email to an acquaintance who works for the Census Bureau (identity suppressed to protect the innocent), pointing out the errors so that they could be corrected.
My contact checked with the people who produced the data, and informed me that, according to them, CO1 was not an error. Rather, the 1 was a footnote mark, directing me to a footnote at the bottom of the file:
"1Broomfield, CO was formed from parts of Adams, Boulder, Jefferson, and Weld Counties, CO on November 15, 2001 and was coextensive with Broomfield city.",,,,,,,,,,,,^M "For purposes of presenting data for metropolitan and micropolitan statistical areas for Census 2000, Broomfield is treated as if it were a county at the time of the 2000 census.",,,,,,,,,,,,^MA footnote.
I realize now that that footer was not as useless as I thought it was. Wow. A footnote. Wow.
I would like to suggest the following as a basic principle of computerized data processing:
Data files should contain data.Not metadata. Not explanations. Not little essays. And not footnotes. Just the data.
There's a larger issue here about confusing content and presentation. But "Data files should contain data" is simpler and easier to remember.
I suspect that this file was exported from a spreadsheet program, probably Excel. Spreadsheet programs desperately want you to confuse content and presentation. This is why one should not use a spreadsheet as a database.
I now recall another occasion when I had to deal with data that was exported from a spreadsheet that was pretending to be a database. It was a database of products made by a large cosmetics company. A typical record looked like this:
"Soft-Pressed Powder Blusher","618J-05","Warm, natural-looking powder colour for all skins. Wide range of shades-subtle to vibrant. With applicator brush.","Cheeks","Nudes","Chestnut Blush","All","","19951201","Yes","","14.5",""The 618J-05 here is a product code. Bonus points if you see what's coming next.
"Water-Dissolve Cream Cleanser","6.61E+01","Creamy cleanser for drier, more sensitive skins. Dissolves even the most tenacious makeups.","Cleansers","","","","Sub I, I, II","19951201","Yes","1","14.5",""That 6.61E+01 should have been 661E-01, but Excel decided that it was a numeral, in scientific notation, and put it into normal form.
Back to the Census Bureau, which almost screwed me by putting a footnote on a state name. What if they had decided to put footnotes on the population figures? Then I would have been really screwed, because it would have been completely undetectable.
No, wait! It's all become clear. That's why they put the commas in the numerals!
[ Addendum 20080129: My Census Bureau contact tells me that the authors of the data file have seen the wisdom of my point of view, in spite of my unconstructive and unhelpful feedback (I said "Wow, that is an incredibly terrible idea") and are planning to address the issue in the next release of the data. Hooray for happy endings! ]
[ Addendum 20080129: My Census Bureau contact tells me that they do sometimes put footnotes on the data items, so don't laugh too hard at my remark about the commas. ]
Wed, 23 Jan 2008
Smallest state capitals
At the other end of the scale, of course, we have state capitals like Boston, Denver, Atlanta, and Honolulu that are their state's largest cities. For these states, the population quotient is 1, its theoretical minimum.
Well, James, it only took me thirty years, but here it is.
I tried to resolve the question manually a few weeks ago, by browsing Wikipedia for the populations of likely candidates. Today I took a more methodical approach, downloading the U.S. Census Bureau's July 2006 estimates for populations of metropolitan areas, and writing a couple of little programs to grovel the data.
I had to augment the Census Bureau's data with two items: Annapolis, MD, and Montpelier, VT are not large enough to be included in the metropolitan area data file. I used U.S. Census 2006 estimates for these cities as well.
I discarded one conurbation: the Census Bureau includes a "Metropolitan Division" in New Hampshire that consists of Rockingham and Strafford counties; this was the most populous identified area in New Hampshire. It didn't seem entirely germane to the question, so I took it out. On the other hand, including it doesn't change the results much: its population is 416,000, compared with Manchester-Nashua's 402,000.
The results follow.
Vermont is an interesting outlier here. It makes fourth place not because it has a large city, but because its capital, Montpelier, is so very small. I tried doing some scatter plots, to see if anything else jumped out, but they weren't very illuminating. If anything, the data is suprisingly evenly distributed. Here's an example:
The x-axis is the population of the state capital; the y-axis is the quotient. (Both axes are log scale.) Vermont is the leftmost point, near the top. The large collection of points on the x-axis are of course the nineteen states for which the capital and largest city coincide.
[ Addendum 20080129: Some remarks about the format of the Census Bureau's data file. ]
[ Addendum 20090217: A comparison of the relative sizes of each state's largest and second-largest cities. ]
Tue, 04 Dec 2007
An Austrian coincidence
(Andy Lester recently referred to my blog as "the single most intelligent blog out there". I'll make you eat those words, Andy!)
Thu, 13 Sep 2007
Girls of the SEC
This isn't the first time I've had this reaction on learning about a Playboy pictorial; last time was probably in August 2002 when I saw the "Women of Enron" cover. (I am not making this up.) I wasn't aware of The December 2002 feature, "Women of Worldcom" (I swear I'm not making this up), but I would have had the same reaction if I had been.
I know that in recent years the Playboy franchise has fallen from its former heights of glory: circulation is way down, the Playboy Clubs have all closed, few people still carry Playboy keychains. But I didn't remember that they had fallen quite so far. They seem to have exhausted all the plausible topics for pictorial features, and are now well into the scraping-the-bottom-of-the-barrel stage. The June 1968 feature was "Girls of Scandinavia". July 1999, "Girls of Hawaiian Tropic". Then "Women of Enron" and now "Girls of the SEC".
How many men have ever had a fantasy about sexy SEC employees, anyway? How can you even tell? Sexy flight attendants, sure; they wear recognizable uniforms. But what characterizes an SEC employee? A rumpled flannel suit? An interest in cost accounting? A tendency to talk about the new Basel II banking regulations? I tried to think of a category that would be less sexually inspiring than "SEC employees". It's difficult. My first thought was "Girls of Wal-Mart." But no, Wal-Mart employees wear uniforms.
If you go too far in that direction you end up in the realm of fetish. For example, Playboy is unlikely to do a feature on "girls of the infectious disease wards". But if they did, there is someone (probably on /b/), who would be extremely interested. It is hard to imagine anyone with a similarly intense interest in SEC employees.
So what's next for Playboy? Girls of the hospital gift shops? Girls of State Farm Insurance telephone customer service division? Girls of the beet canneries? Girls of Acadia University Grounds and Facilities Services? Girls of the DMV?
[ Pre-publication addendum: After a little more research, I figured out that SEC refers here to "Southeastern Conference" and that Playboy has done at least two other features with the same title, most recently in October 2001. I decided to run the article anyway, since I think I wouldn't have made the mistake if I hadn't been prepared ahead of time by "Women of Enron". ]
[ Addendum 20070913: This article is now on the first page of Google searches for "girls of the sec playboy". ]
[ Addendum 20070915: The article has moved up from tenth to third place. Truly, Google works in mysterious ways. ]
Fri, 20 Jul 2007
One of the biology interns asked a me a good one a couple of weeks ago: he asked how, if Perl runs Perl scripts, and the OS is running Perl, what is running the OS? Now that is a tough question to answer. I explained about logic gates, and how the logic gates are built into trivial arithmetic and memory circuits, how these are then built up into ALUs and memories, and how these in turn are controlled by microcode, and finally how the logical parts are assembled into a computer. I don't know how understandable it was, but it was the best I could do in five minutes, and I think I got some of the idea across. But I started and finished by saying that it was basically miraculous.
My daughter Katara asks a ton of questions, some better than others. On any given evening she is likely to ask "Daddy, what are you doing?" about fifteen times, and "why?" about fifteen million times. "Why" can be a great question, but sometimes it's not so great; Katara asks both kinds. Sometimes it's in response to "I'm eating a sandwich." Then the inevitable "why?" is rather annoying.
Some of the "why" questions are nearly impossible to answer. For example, we see a lady coming up the street toward us. "Is that Susanna?" "No." "Why is it not Susanna?"
I think what's happening here is that having discovered this magic word that often produces interesting information, Katara is employing it whenever possible, even when it doesn't make sense, because she hasn't yet learned when it works and when not. Why is that not Susanna? Hey, you never know when you might get an interesting answer. But there might be something else going on that I don't appreciate.
But the nice thing about Katara's incessant questions is that she listens to and remembers the answers, ponders them deeply, and then is likely to come back with an insightful followup when you least expect it.
the official site for the official story. My wife provided the helpful analogy with the composter.) As I expected, Katara was interested, and thought this over; she confirmed that they turned poop into soil, and then asked what they made pee into. I was not prepared for that one, and I had to promise her I would find out later. It took me some Internet research time to find out about denitrogenation.
Speaking of poop, last month Katara asked a puzzler: why don't birds use toilets? I think this was motivated by our earlier discussion of bird poop on our car.
In Make Way for Ducklings there's a picture of the friendly policeman Michael, running back to his police box to order a police escort to help the ducklings across Beacon Street. He's holding his billy club. Katara asked what that was for. I thought a moment, and then said "It's for hitting people with." Later I wondered if I had given an inaccurate or incomplete answer, so I asked around, and did some reading. It appears I got that one right. Some folks I know suggested that I should have said it was for hitting bad people, but I'd rather stick to the plain facts, and leave out the editorializing.
So I was reading the Mattingly book this evening, and Katara was eating and playing with Play-Doh on the kitchen floor. After the eleventh repetition of "Daddy, what are you doing?" "Reading." I decided to tell Katara what I was reading about. I said that I was reading about ships, that ships are big boats; they carry lots of men and guns. Katara asked why they carried guns, and I explained that often the ships carried treasure, like spices or gold or jewels or cloth, and that pirates tried to steal it. Katara asked if the cloth was like a wash cloth, and I said no, it was more like the kind of cloth that Mommy makes quilts from, or like the silk that her silk dress is made of. I explained about the pirates, which she seemed to understand, because toddlers know all about people who try to take stuff that isn't theirs. And then she asked the question I couldn't answer: Why were there men on the ships, but no women?
I was totally stumped; I don't even know where to begin explaining to a three-year-old why there are no women on ships in 1588. The only answers I could think of had to do with women's traditional roles, with European mores, social constructions of gender, and so on, all stuff that wouldn't help. Sometimes women were smuggled aboard ship, but I wasn't going to say that either.
I don't usually give up, but this time I gave up. This is a tough question of the first order, easy to ask, hard to answer. It's a lot easier to explain wastewater treatment.