The Universe of Discourse

Fri, 01 Jul 2016

Don't tug on that, you never know what it might be attached to

This is a story about a very interesting bug that I tracked down yesterday. It was causing a bad effect very far from where the bug actually was.


The emacs text editor comes with a separate utility, called emacsclient, which can communicate with the main editor process and tell it to open files for editing. You have your main emacs running. Then somewhere else you run the command

     emacsclient some-files...

and it sends the main emacs a message that you want to edit some-files. Emacs gets the message and pops up new windows for editing those files. When you're done editing some-files you tell Emacs, by typing C-# or something, it it communicates back to emacsclient that the editing is done, and emacsclient exits.

This was more important in the olden days when Emacs was big and bloated and took a long time to start up. (They used to joke that “Emacs” was an abbreviation for “Eight Megs And Constantly Swapping”. Eight megs!) But even today it's still useful, say from shell scripts that need to run an editor.

Here's the reason I was running it. I have a very nice shell script, called also, that does something like this:

  • Interpret command-line arguments as patterns
  • Find files matching those patterns
  • Present a menu of the files
  • Wait for me to select files of interest
  • Run emacsclient on the selected files

It is essentially a wrapper around menupick, a menu-picking utility I wrote which has seen use as a component of several other tools. I can type

    also Wizard

in the shell and get a menu of the files related to the wizard, select the ones I actually want to edit, and they show up in Emacs. This is more convenient than using Emacs itself to find and open them. I use it many times a day.

Or rather, I did until this week, when it suddenly stopped working. Everything ran fine until the execution of emacsclient, which would fail, saying:

 emacsclient: can't find socket; have you started the server?

(A socket is a facility that enables interprocess communication, in this case between emacs and emacsclient.)

This message is familiar. It usually means that I have forgotten to tell Emacs to start listening for emacsclient, by running M-x server-start. (I should have Emacs do this when it starts up, but I don't. Why not? I'm not sure.) So the first time it happened I went to Emacs and ran M-x server-start. Emacs announced that it had started the server, so I reran also. And the same thing happened.

 emacsclient: can't find socket; have you started the server?

Finding the socket

So the first question is: why can't emacsclient find the socket? And this resolves naturally into two subquestions: where is the socket, and where is emacsclient looking?

The second one is easily answered; I ran strace emacsclient (hi Julia!) and saw that the last interesting thing emacsclient did before emitting the error message was

    stat("/mnt/tmp/emacs2017/server", 0x7ffd90ec4d40) = -1 ENOENT (No such file or directory)

which means it's looking for the socket at /mnt/tmp/emacs2017/server but didn't find it there.

The question of where Emacs actually put the socket file was a little trickier. I did not run Emacs under strace because I felt sure that the output would be voluminous and it would be tedious to grovel over it.

I don't exactly remember now how I figured this out, but I think now that I probably made an educated guess, something like: emacsclient is looking in /mnt/tmp; this seems unusual. I would expect the socket to be under /tmp. Maybe it is under /tmp? So I looked under /tmp and there it was, in /tmp/emacs2017/server:

    srwx------ 1 mjd mjd 0 Jun 27 11:43 /tmp/emacs2017/server

(The s at the beginning there means that the file is a “Unix-domain socket”. A socket is an endpoint for interprocess communication. The most familiar sort is a TCP socket, which has a TCP address, and which enables communication over the internet. But since ancient times Unix has also supported Unix-domain sockets, which enable communication between two processes on the same machine. Instead of TCP addresses, such sockets are addressed using paths in the filesystem, in this case /tmp/emacs2017/server. When the server creates such a socket, it appears in the filesystem as a special type of file, as here.)

I confirmed that this was the correct file by typing M-x server-force-delete in Emacs; this immediately caused /tmp/emacs2017/server to disappear. Similarly M-x server-start made it reappear.

Why the disagreement?

Now the question is: Why is emacsclient looking for the socket under /mnt/tmp when Emacs is putting it in /tmp? They used to rendezvous properly; what has gone wrong? I recalled that there was some environment variable for controlling where temporary files are put, so I did

       env | grep mnt

to see if anything relevant turned up. And sure enough there was:


When programs want to create tmporary files and directories, they normally do it in /tmp. But if there is a TMPDIR setting, they use that directory instead. This explained why emacsclient was looking for /mnt/tmp/emacs2017/socket. And the explanation for why Emacs itself was creating the socket in /tmp seemed clear: Emacs was failing to honor the TMPDIR setting.

With this clear explanation in hand, I began to report the bug in Emacs, using M-x report-emacs-bug. (The folks in the #emacs IRC channel on Freenode suggested this. I had a bad experience last time I tried #emacs, and then people mocked me for even trying to get useful information out of IRC. But this time it went pretty well.)

Emacs popped up a buffer with full version information and invited me to write down the steps to reproduce the problem. So I wrote down

     % export TMPDIR=/mnt/tmp
     % emacs

and as I did that I ran those commands in the shell.

Then I wrote

     In Emacs:
     M-x getenv TMPDIR
     (emacs claims there is no such variable)

and I did that in Emacs also. But instead of claiming there was no such variable, Emacs cheerfully informed me that the value of TMPDIR was /mnt/tmp.

(There is an important lesson here! To submit a bug report, you find a minimal demonstration. But then you also try the minimal demonstration exactly as you reported it. Because of what just happened! Had I sent off that bug report, I would have wasted everyone else's time, and even worse, I would have looked like a fool.)

My minimal demonstration did not demonstrate. Something else was going on.

Why no TMPDIR?

This was a head-scratcher. All I could think of was that emacsclient and Emacs were somehow getting different environments, one with the TMPDIR setting and one without. Maybe I had run them from different shells, and only one of the shells had the setting?

I got on a sidetrack at this point to find out why TMPDIR was set in the first place; I didn't think I had set it. I looked for it in /etc/profile, which is the default Bash startup instructions, but it wasn't there. But I also noticed an /etc/profile.d which seemed relevant. (I saw later that the /etc/profile contained instructions to load everything under /etc/profile.d.) And when I grepped for TMPDIR in the profile.d files, I found that it was being set by /etc/profile.d/, which the sysadmins had installed. So that mystery at least was cleared up.

That got me on a second sidetrack, looking through our Git history for recent changes involving TMPDIR. There weren't any, so that was a dead end.

I was still puzzled about why Emacs sometimes got the TMPDIR setting and sometimes not. That's when I realized that my original Emacs process, the one that had failed to rendezvous with emacsclient, had not been started in the usual way. Instead of simply running emacs, I had run

    git re-edit

which invokes Git, which then runs


which is a Perl program I wrote that does a bunch of stuff to figure out which files I was editing recently and then execs emacs to edit them some more. So there are several programs here that could be tampering with the environment and removing the TMPDIR setting.

To more accurately point the finger of blame, I put some diagnostics into the git-re-edit program to have it print out the value of TMPDIR. Indeed, git-re-edit reported that TMPDIR was unset. Clearly, the culprit was Git, which must have been removing TMPDIR from the environment before invoking my Perl program.

Who is stripping the environment?

To confirm this conclusion, I created a tiny shell script, /home/mjd/bin/git-env, which simply printed out the environment, and then I ran git env, which tells Git to find git-env and run it. If the environment it printed were to omit TMPDIR, I would know Git was the culprit. But TMPDIR was in the output.

So I created a Perl version of git-env, called git-perlenv, which did the same thing, and I ran it via git perlenv. And this time TMPDIR was not in the output. I ran diff on the outputs of git env and git perlenv and they were identical—except that git perlenv was missing TMPDIR.

So it was Perl's fault! And I verified this by running perl /home/mjd/bin/git-re-edit directly, without involving Git at all. The diagnostics I had put in reported that TMPDIR was unset.

WTF Perl?

At this point I tried getting rid of get-re-edit itself, and ran the one-line program

    perl -le 'print $ENV{TMPDIR}'

which simply runs Perl and tells it to print out the value of the TMPDIR environment variable. It should print /mnt/tmp, but instead it printed the empty string. This is a smoking gun, and Perl no longer has anywhere to hide.

The mystery is not cleared up, however. Why was Perl doing this? Surely not a bug; someone else would have noticed such an obvious bug sometime in the past 25 years. And it only failed for TMPDIR, not for other variables. For example

    FOO=bar perl -le 'print $ENV{FOO}'

printed out bar as one would expect. This was weird: how could Perl's environment handling be broken for just the TMPDIR variable?

At this point I got Rik Signes and Frew Schmidt to look at it with me. They confirmed that the problem was not in Perl generally, but just in this Perl. Perl on other systems did not display this behavior.

I looked in the output of perl -V, which says what version of Perl you are using and which patches have been applied, and wasted a lot of time looking into CVE-2016-2381, which seemed relevant. But it turned out to be a red herring.

Working around the problem, 1.

While all this was going on I was looking for a workaround. Finding one is at least as important as actually tracking down the problem because ultimately I am paid to do something other than figure out why Perl is losing TMPDIR. Having a workaround in hand means that when I get sick and tired of looking into the underlying problem I can abandon it instantly instead of having to push onward.

The first workaround I found was to not use the Unix-domain socket. Emacs has an option to use a TCP socket instead, which is useful on systems that do not support Unix-domain sockets, such as non-Unix systems. (I am told that some do still exist.)

You set the server-use-tcp variable to a true value, and when you start the server, Emacs creates a TCP socket and writes a description of it into a “server file”, usually ~/.emacs.d/server/server. Then when you run emacsclient you tell it to connect to the socket that is described in the file, with

    emacsclient --server-file=~/.emacs.d/server/server

or by setting the EMACS_SERVER_FILE environment variable. I tried this, and it worked, once I figured out the thing about server-use-tcp and what a “server file” was. (I had misunderstood at first, and thought that “server file” meant the Unix-domain socket itself, and I tried to get emacsclient to use the right one by setting EMACS_SERVER_FILE, which didn't work at all. The resulting error message was obscure enough to lead me to IRC to ask about it.)

Working around the problem, 2.

I spent quite a while looking for an environment variable analogous to EMACS_SERVER_FILE to tell emacsclient where the Unix-domain socket was. But while there is a --socket-name command-line argument to control this, there is inexplicably no environment variable. I hacked my also command (responsible for running emacsclient) to look for an environment variable named EMACS_SERVER_SOCKET, and to pass its value to emacsclient --socket-name if there was one. (It probably would have been better to write a wrapper for emacsclient, but I didn't.) Then I put

    EMACS_SERVER_SOCKET=$TMPDIR/emacs$(id -u)/server

in my Bash profile, which effectively solved the problem. This set EMACS_SERVER_SOCKET to /mnt/tmp/emacs2017/server whenever I started a new shell. When I ran also it would notice the setting and pass it along to emacsclient with --socket-name, to tell emacsclient to look in the right place. Having set this up I could forget all about the original problem if I wanted to.

But but but WHY?

But why was Perl removing TMPDIR from the environment? I didn't figure out the answer to this; Frew took it to the #p5p IRC channel on, where the answer was eventually tracked down by Matthew Horsfall and Zefrem.

The answer turned out to be quite subtle. One of the classic attacks that can be mounted against a process with elevated privileges is as follows. Suppose you know that the program is going to write to a temporary file. So you set TMPDIR beforehand and trick it into writing in the wrong place, possibly overwriting or destroying something important.

When a program is loaded into a process, the dynamic loader does the loading. To protect against this attack, the loader checks to see if the program it is going to run has elevated privileges, say because it is setuid, and if so it sanitizes the process’ environment to prevent the attack. Among other things, it removes TMPDIR from the environment.

I hadn't thought of exactly this, but I had thought of something like it: If Perl detects that it is running setuid, it enables a secure mode which, among other things, sanitizes the environment. For example, it ignores the PERL5LIB environment variable that normally tells it where to look for loadable modules, and instead loads modules only from a few compiled-in trustworthy directories. I had checked early on to see if this was causing the TMPDIR problem, but the perl executable was not setuid and Perl was not running in secure mode.

But Linux supports a feature called “capabilities”, which is a sort of partial superuser privilege. You can give a program some of the superuser's capabilities without giving away the keys to the whole kingdom. Our systems were configured to give perl one extra capability, of binding to low-numbered TCP ports, which is normally permitted only to the superuser. And when the dynamic loader ran perl, it saw this additional capability and removed TMPDIR from the environment for safety.

This is why Emacs had the TMPDIR setting when run from the command line, but not when run via git-re-edit.

Until this came up, I had not even been aware that the “capabilities” feature existed.

A red herring

There was one more delightful confusion on the way to this happy ending. When Frew found out that it was just the Perl on my development machine that was misbehaving, he tried logging into his own, nearly identical development machine to see if it misbehaved in the same way. It did, but when he ran a system update to update Perl, the problem went away. He told me this would fix the problem on my machine. But I reported that I had updated my system a few hours before, so there was nothing to update!

The elevated capabilities theory explained this also. When Frew updated his system, the new Perl was installed without the elevated capability feature, so the dynamic loader did not remove TMPDIR from the environment.

When I had updated my system earlier, the same thing happened. But as soon as the update was complete, I reloaded my system configuration, which reinstated the capability setting. Frew hadn't done this.


  • The system configuration gave perl a special capability
  • so the dynamic loader sanitized its environment
  • so that when perl ran emacs,
  • the Emacs process didn't have the TMPDIR environment setting
  • which caused Emacs to create its listening socket in the usual place
  • but because emacsclient did get the setting, it looked in the wrong place


This computer stuff is amazingly complicated. I don't know how anyone gets anything done.

[ Addendum 20160709: Frew Schmidt has written up the same incident, but covers different ground than I do. ]

[ Addendum 20160709: A Hacker News comment asks what changed to cause the problem? Why was Perl losing TMPDIR this week but not the week before? Frew and I don't know! ]

[Other articles in category /tech] permanent link

Sun, 01 May 2016


It will suprise nobody to learn that when I was a child, computers were almost unknown, but it may be more surprising that typewriters were unusual.

Probably the first typewriter I was familiar with was my grandmother’s IBM “Executive” model C. At first I was not allowed to touch this fascinating device, because it was very fancy and expensive and my grandmother used it for her work as an editor of medical journals.

The “Executive” was very advanced: it had proportional spacing. It had two space bars, for different widths of spaces. Characters varied between two and five ticks wide, and my grandmother had typed up a little chart giving the width of each character in ticks, which she pasted to the top panel of the typewriter. The font was sans-serif, and I remember being a little puzzled when I first noticed that the lowercase j had no hook: it looked just like the lowercase i, except longer.

The little chart was important, I later learned, when I became old enough to use the typewriter and was taught its mysteries. Press only one key at a time, or the type bars will collide. Don't use the (extremely satisfying) auto-repeat feature on the hyphen or underscore, or the platen might be damaged. Don't touch any of the special controls; Grandma has them adjusted the way she wants. (As a concession, I was allowed to use the “expand” switch, which could be easily switched off again.)

The little chart was part of the procedure for correcting errors. You would backspace over the character you wanted to erase—each press of the backspace key would move the carriage back by one tick, and the chart told you how many times to press—and then place a slip of correction paper between the ribbon and the paper, and retype the character you wanted to erase. The dark ribbon impression would go onto the front of the correction slip, which was always covered with a pleasing jumble of random letters, and the correction slip impression, in white, would exactly overprint the letter you wanted to erase. Except sometimes it didn't quite: the ribbon ink would have spread a bit, and the corrected version would be a ghostly white letter with a hair-thin black outline. Or if you were a small child, as I was, you would sometimes put the correction slip in backwards, and the white ink would be transferred uselessly to the back of the ribbon instead of to the paper. Or you would select a partly-used portion of the slip and the missing bit of white ink would leave a fragment of the corrected letter on the page, like the broken-off leg of a dead bug.

Later I was introduced to the use of Liquid Paper (don't brush on a big glob, dot it on a bit at a time with the tip of the brush) and carbon paper, another thing you had to be careful not to put in backward, although if you did you got a wonderful result: the typewriter printed mirror images.

From typing alphabets, random letters, my name, and of course qwertyuiops I soon moved on to little poems, stories, and other miscellanea, and when my family saw that I was using the typewriter for writing, they presented me with one of my own, a Royal manual (model HHE maybe?) with a two-color ribbon, and I was at last free to explore the mysteries of the TAB SET and TAB CLEAR buttons. The front panel had a control for a three-color ribbon, which forever remained an unattainable mystery. Later I graduated to a Smith-Corona electric, on which I wrote my high school term papers. The personal computer arrived while I was in high school, but available printers were either expensive or looked like crap.

When I was in first grade our classroom had acquired a cheap manual typewriter, which as I have said, was an unusual novelty, and I used it whenever I could. I remember my teacher, Ms. Juanita Adams, complaining that I spent too much time on the typewriter. “You should work more on your handwriting, Jason. You might need to write something while you’re out on the street, and you won't just be able to pull a typewriter out of your pocket.”

She was wrong.

[Other articles in category /tech] permanent link

Thu, 27 Feb 2014

2banner, which tells you when someone else is looking at the same web page

I was able to release a pretty nice piece of software today, courtesy of my employer, ZipRecruiter. If you have a family of web pages, and whenever you are looking at one you want to know when someone else is looking at the same page, you can use my package. The package is called 2banner, because it pops up a banner on a page whenever two people are looking at it. With permission from ZipRecruiter, I have put it on github, and you can download and use it for free.

A typical use case would be a customer service organization. Say your users create requests for help, and that the customer service reps have to answer the requests. There is a web page with a list of all the unserviced requests, and each one has a link to a page with details about what is requested and how to contact the person who made the request. But it might sometimes happes that Fred and Mary independently decide to service the same request, which is at best a waste of effort, and at worst confusing for the customer who gets email from both Fred and Mary and doesn't know how to respond. With 2banner, when Mary arrives at the customer's page, she sees a banner in the corner that says Fred is already looking at this page!, and at the same time a banner pops up in Fred's browser that says Mary has started looking at this page! Then Mary knows that Fred is already on the case, and she can take over a different one, or Fred and Mary can communicate and decide which of them will deal with this particular request.

You can similarly trick out the menu page itself, to hide the menu items that someone is already looking out.

I wanted to use someone else's package for this, but I was not able to find one, so I wrote one myself. It was not as hard as I had expected. The system comprises three components:

  • The back-end database for recording who started looking at which pages and when. I assumed a SQL database and wrote a component that uses Perl's DBIx::Class module to access it, but it would be pretty easy throw this away and use something else instead.

  • An API server that can propagate notifications like “user X is now looking at page Y” and “user X is no longer looking at page Y” into the database, and which can answer the question “who else is looking at page Y right now?”. I used Perl's Catalyst framework for this, because our web app already runs under it. It would not be too hard to throw this away and use something else instead. You could even write a standalone server using HTTP::Server, and borrow most of the existing code, and probably have it working in under an hour.

  • A JavaScript thingy that lives in the web page, sends the appropriate notifications to the API server, and pops up the banner when necessary. I used jQuery for this. Probably there is something else you could use instead, but I have no idea what it is, because I know next to nothing about front-end web programming. I was happy to have the chance to learn a little about jQuery for this project.

Often a project seems easy but the more I think about it the harder it seems. This project was the opposite. I thought it sounded hard, and I didn't want to do it. It had been an outstanding request of the CS people for some time, but I guess everyone else thought it sounded hard too, because nobody did anything about it. But the longer I let it percolate around in my head, the simpler it seemed. As I considered it, one difficulty after another evaporated.

Other than the jQuery stuff, which I had never touched before, the hardest part was deciding what to do about the API server. I could easily have written a standalone, special-purpose API server, but I felt like it was the wrong thing, and anyway, I didn't want to administer one. But eventually I remembered that our main web site is driven by Catalyst, which is itself a system for replying to HTTP requests, which already has access to our database, and which already knows which user is making each request.

So it was natural to say that the API was to send HTTP requests to certain URLs on our web site, and easy-peasy to add a couple of handlers to the existing Catalyst application to handle the API requests, query the database, and send the right replies.

I don't know why it took me so long to think of doing the API server with Catalyst. If it had been someone else's suggestion I would probably feel dumb for not having thought of it myself, because in hindsight it is so obvious. Happily, I did think of it, because it is clearly the right solution for us.

It was not too hard to debug. The three components are largely independent of one another. The draft version of the API server responded to GET requests, which are easier to generate from the browser than the POST requests that it should be getting. Since the responses are in JSON, it was easy to see if the API server was replying correctly.

I had to invent techniques for debugging the jQuery stuff. I didn't know the right way to get diagnostic messages out of jQuery, so I put a big text area on the web page, and had the jQuery routines write diagnostic messages into it. I don't know if that's what other people do, but I thought it worked pretty well. JavaScript is not my ideal language, but I program in Perl all day, so I am not about to complain. Programming the front end in JavaScript and watching stuff happen on the page is fun! I like writing programs that make things happen.

The package is in ZipRecruiter's GitHub repository, and is available under a very lenient free license. Check it out.

(By the way, I really like working for ZipRecruiter, and we are hiring Perl and Python developers. Please send me email if you want to ask what it is like to work for them.)

[Other articles in category /tech] permanent link

Mon, 16 Dec 2013

Things do get better
I flew back from Amsterdam on Friday, and the plane had an in-flight entertainment system that offered to show me movies or play me music. That itself is a reasonable thing to try, I think, because the flight is dull. But until this flight, I never felt that the promise had been fulfilled. Usually, in my experience, these things offer four or five awful movies that you can only imagine watching while strapped into a chair Clockwork Orange style, and one canned selection of music from each of nine genres. So the in-flight entertainment system is yet another perpetrator of oppression by mass media and yet another shovelful of the least-common denominator culture that mass media fosters.

I don't think that least-common-denominator culture distributed by mass media is the worst evil perpetrated by the 20th century, but I do seriously think it is on the list of the top ten.

But not this time. Digital information technology has improved to the point that the in-flight system was able to offer me several dozen movies, a few of which I actually wanted to see, and a large selection of music, much more than I could possibly listen to during the seven-hour flight. And one of those selections was the 9th Symphony of Philip Glass.

I spent a large part of the flight alternately listening to the symphony, which I had not heard before, and marveling that it was there at all. “Who on earth,” I wondered, “thought it would be a good idea to put that in there?” I can't imagine there are that many people who want to listen to Philip Glass on a long airplane flight. But it seems that the technology has advanced to the point that the programming people have extra space they need to fill, so much extra space that it doesn't matter if they throw in some Philip Glass just in case, because why not?

I imagine it will get better from here too. Perhaps the next flight will offer me not just one selection from Philip Glass, but every possible selection from John Adams. But I think the in-flight entertainment system has crossed a critical threshold, and I will mock it no longer.

(My thanks to whatever crazy person decided to include Philip Glass on KLM flight 6053 last Friday. It brought me a lot of pleasure and helped pass the slow hours across the north Atlantic.)

[ Addendum 20150501: Unable to find a copy online, I asked my wife to get my a CD of the 9th Symphony for my birthday, and it is as wonderful as I remembered. Here's another way things got better: I put the CD into my laptop, to rip some MP3s from it, and discovered that Orange Mountain Music had saved me the trouble; the CS was pre-equipped with audio files in MP3, FLAC, and Ogg Vorbis format. ]

[Other articles in category /tech] permanent link

Tue, 24 Sep 2013

The shittiest project I ever worked on
Sometimes in job interviews I've been asked to describe a project I worked on that failed. This is the one I always think of first.

In 1995 I quit my regular job as senior web engineer for Time-Warner and became a consultant developing interactive content for the World-Wide Web, which was still a pretty new thing at the time. Time-Warner taught me many things. One was that many large companies are not single organizations; they are much more like a bunch of related medium-sized companies that all share a building and a steam plant. (Another was that I didn't like being part of a large company.)

One of my early clients was Prudential, which is a large life insurance, real estate, and financial services conglomerate based in Newark, New Jersey—another fine example of a large company that actually turned out to be a bunch of medium-sized companies sharing a single building. I did a number of projects for them, one of which was to produce an online directory of Prudential-affiliated real estate brokers. I'm sure everyone is familiar with this sort of thing by now. The idea was that you would visit a form on their web site, put in your zip code or town name, and it would extract the nearby brokers from a database and present them to you on a web page, ordered by distance.

The project really sucked, partly because Prudential was disorganized and bureaucratic, and partly because I didn't know what I was doing. I quoted a flat fee for the job, assuming that it would be straightforward and that I had a good idea of what was required. But I hadn't counted on bureaucratic pettifoggery and the need for every layer of the management hierarchy to stir the soup a little. They tweaked and re-tweaked every little thing. The data set they delivered was very dirty, much of it garbled or incomplete, and they kept having to fix their exporting process, which they did incompletely, several times. They also changed their minds at least once about which affiliated real estate agencies should be in the results, and had to re-send a new data set with the new correct subset of affiliates, and then the new data would be garbled or incomplete. So I received replacement data six or seven times. This would not have been a problem, except that each time they presented me with a file in a somewhat different format, probably exported from some loser's constantly-evolving Excel spreadsheet. So I had to write seven or eight different versions of the program that validated and loaded the data. These days I would handle this easily; after the first or second iteration I would explain the situation: I had based my estimate on certain expectations of how much work would be required; I had not expected to clean up dirty data in eight different formats; they had the choice of delivering clean data in the same format as before, renegotiating the fee, or finding someone else to do the project. But in 1995 I was too green to do this, and I did the extra work for free.

Similarly, they tweaked the output format of the program repeatedly over weeks: first the affiliates should be listed in distance order, but no, they should be listed alphabetically if they are in the same town and then after that the ones from other towns, grouped by town; no, the Prudential Preferred affiliates must be listed first regardless of distance, which necessitated a redelivery of the data which up until then hadn't distinguished between ordinary and Preferred affiliates; no wait, that doesn't make sense, it puts a far-off Preferred affiliate ahead of a nearby regular affiliate... again, this is something that many clients do, but I wasn't expecting it and it took a lot of time I hadn't budgeted for. Also these people had, I now know, an unusually bad case of it.

Anyway, we finally got it just right, and it had been approved by multiple layers of management and given a gold star by the Compliance Department, and my clients took it to the Prudential Real Estate people for a demonstration.

You may recall that Prudential is actually a bunch of medium-sized companies that share a building in Newark. The people I was working with were part of one of these medium-sized companies. The real estate business people were in a different company. The report I got about the demo was that the real estate people loved it, it was just what they wanted.

“But,” they said, “how do we collect the referral fees?”

Prudential Real Estate is a franchise operation. Prudential does not actually broker any real estate. Instead, a local franchisee pays a fee for the use of the name and logo and other services. One of the other services is that Prudential runs a national toll-free number; you can call this up and they will refer you to a nearby affiliate who will help you buy or sell real estate. And for each such referral, the affiliate pays Prudential a referral fee.

We had put together a real estate affiliate locator application which let you locate a nearby Prudential-affiliated franchisee and contact them directly, bypassing the referral and eliminating Prudential's opportunity to collect a referral fee.

So I was told to make one final change to the affiliate locator. It now worked like this: The user would enter their town or zip code; the application would consult the database and find the contact information for the nearby affiliates, it would order them in the special order dictated by the Compliance Department, and then it would display a web page with the addresses and phone numbers of the affiliates carefully suppressed. Instead, the name of each affiliate would be followed by the Prudential national toll-free number AND NOTHING ELSE. Even the names were suspect. For a while Prudential considered replacing each affiliate's name with a canned string, something like "Prudential Real Estate Affiliate", because what if the web user decided to look up the affiliate in the Yellow Pages and call them directly? It was eventually decided that the presence of the toll-free number directly underneath rendered this risk acceptably small, so the names stayed. But everything else was gone.

Prudential didn't need an affiliate locator application. They needed a static HTML page that told people to call the number. All the work I had put into importing the data, into formatting the output, into displaying the realtors in precisely the right order, had been a complete waste of time.

[ Addendum 20131018: This article is available in Chinese. ]

[Other articles in category /tech] permanent link

Thu, 11 Jul 2013


This is a public service announcement.

This is not a picture of a cobbled street:

Rather, these stones are "Belgian block", also called setts.

Cobblestones look like this:

I took these pictures in front of the library of the American Philosophical Society on South 5th Street in Philadelphia. South 5th Street is paved with Belgian block, and the lane beside the APS is cobbled. You can just barely distinguish them in this satellite photograph.

[Other articles in category /tech] permanent link

Wed, 25 May 2011

Why use a digital stadiometer?

A couple of years ago I wrote an article about a stadiometer (height-measuring device) that used an optical scanner to read a Gray-coded height off the scale.

The article periodically shows up on places like Reddit and Hacker News, and someone often asks why the stadiometer is so complex. Most recently, for example:

How is this an advance on looking at a conventionally numbered ruler (with a similar bracket to touch the top of the head) and writing down the number? It's technological and presumably expensive, but it isn't delivering any discernible benefit that I can see.
Not long after I wrote the original article, I was back at the office, so I asked one of the senior doctors about it. She said that the manual stadiometers were always giving inaccurate readings and that they constantly had to have the service guys in to recalibrate them. The electronic stadiometer, she said, is much more reliable.

"But it's a really expensive stadiometer," I said.

"The service calls on the manual stadiometers were costing us a fortune."

This stadiometer transmits its reading via radio to a portable digital display. For this doctors' office, the portable display is a red herring. They had the display mounted on the wall right next to the stadiometer. I asked if they ever took it down and moved it around; the doctor said they never did.

At the time I observed that the answer was mundane and reasonable, but not something that one would be able to deduce. In the several discussions of the topic, none of the people speculating have guessed the correct answer.

When I was working on Red Flags talks, people would send me code, and I would then fix it up to be better code. Often you see code written in what seems to be the worst possible way, and the obvious conclusion is that the author is a complete idiot, or maybe just mentally ill. Perhaps this is sometimes the case, but when I took the time to write back and ask why the author had done it the way they did, there was usually a reasonable answer.

Here's an example that stands out in my memory. A novice once sent me a program he had written that did some sort of tedious file-munging job in Perl, selecting files and copying some of them around in the filesystem. It was a bad program in many ways, but what was most striking about it was that there were many functions to perform operations on lists of filenames, and whenever one of these functions called another, it passed the list of data by writing it to a temporary file, which the called function would then read back.

The diagram at right shows the structure of the program. Rectangles with rounded corners indicate subroutines; dotted rectangles are the temporary files they use for argument passing.

I suggested to the author that it would have been easier to have passed the data using the regular argument passing techniques, and his reply astounded me, because it was so reasonable: he said he had used the temporary files as a debugging measure, because that way he could inspect the files and see if the contents were correct.

I was thunderstruck. I had been assuming that the programmer was either a complete beginner, who didn't even know how to pass arguments to a function, or else a complete blockhead. But I was utterly wrong. He was just someone who needed to be introduced to the debugger. Or perhaps the right suggestion for him would be to call something like this from inside the functions that needed debugging:

        sub dump_arguments {
          my ($file) = (caller)[4];
          open my($f), ">", $file or die "$file: $!";
          print $f join("\n", @_, "");
But either way, this was clearly a person who was an order of magnitude less incompetent than I initially imagined from seeing the ridiculous code he had written. He had had a specific problem and had chosen a straightforward and reasonably effective way to address it. But until I got the correct explanation, the only explanation I could think of was unlimited incompetence.

This is only one of many such examples. Time and time again people would send me perfectly idiotic code, and when I asked why they had done it that way the answer was not that they were idiots, but that there was some issue I had not appreciated, some problem they were trying to solve that was not apparent. Not to say that the solutions were not inept, or badly engineered, or just plain wrong. But there is a difference between a solution that is inept and one that is utterly insane. These appeared at first to be insane, but on investigation turned out to be sane but clumsy.

I said a while back that it is a good idea to get in the habit of assuming that everything is more complex than you imagine. I think there is parallel advice here: assume that bad technical decisions are made rationally, for reasons that are not apparent.

[Other articles in category /tech] permanent link

Sat, 12 Dec 2009

On failing open
An axiom of security analysis is that nearly all security mechanisms must fail closed. What this means is that if there is an uncertainty about whether to grant or to deny access, the right choice is nearly always to deny access.

For example, consider a login component that accepts a username and a password and then queries a remote authentication server to find out if the password is correct. If the connection to the authentication server fails, or if the authentication server is down, the login component must decide whether to grant or deny access, in the absence of correct information from the server. The correct design is almost certainly to "fail closed", and to deny access.

I used to teach security classes, and I would point out that programs sometimes have bugs, and do the wrong thing. If your program has failed closed, and if this is a bug, then you have an irate user. The user might call you up and chew you out, or might chew you out to your boss, and they might even miss a crucial deadline because your software denied them access when it should have granted access. But these are relatively small problems. If your program has failed open, and if this is a bug, then the user might abscond with the entire payroll and flee to Brazil.

(I was once teaching one of these classes in Lisbon, and I reached the "flee to Brazil" example without having realized ahead of time that this had greater potential to offend the Portuguese than many other people. So I apologized. But my hosts very kindly told me that they would have put it the same way, and that in fact the Mayor of Lisbon had done precisely what I described a few years before. The moral of the story is to read over the slides ahead of time before giving the talk.)

But I digress. One can find many examples in the history of security that failed the wrong way.

However, the issue is on my mind because I was at a job interview a few weeks ago with giant media corporation XYZ. At the interview, we spent about an hour talking about an architectural problem they were trying to solve. XYZ operates a web site where people can watch movies and TV programs online. Thy would like to extend the service so that people who subscribe to premium cable services, such as HBO, can authenticate themselves to the web site and watch HBO programs there; HBO non-subscribers should get only free TV content. The problem in this case was that the authentication data was held on an underpowered legacy system that could serve only a small fraction of the requests that came in.

The solution was to cache the authentication data on a better system, and gather and merge change information from the slow legacy system as possible.

I observed during the discussion that this was a striking example of the rare situation in which one wants the authentication system to fail open instead of closed. For suppose one grants access that should not be granted. Then someone on the Internet gets to watch a movie or an episode of The Sopranos for free, which is not worth getting excited about and which happens a zillion times a day anyhow.

But suppose the software denies access that should have been granted. Then there is a legitimate paying customer who has paid to watch The Sopranos, and we told them no. Now they are a legitimately irate customer, which is bad, and they may call the support desk, costing XYZ Corp a significant amount of money, which is also bad. So all other things being equal, one should err on the side of lenity: when in doubt, grant access.

I would like to thank Andrew Lenards for his gift.

[Other articles in category /tech] permanent link

Fri, 30 May 2008

A missing feature in document viewers
It often happens that I'm looking at some multi-page document, such as a large PDF file, with a viewer program, say Adobe's Acrobat Reader, or Gnome Document Viewer, and the page numbers don't match.

Typically, the viewer numbers all the pages sequentially, starting with 1. But many documents have some front matter, such as a table of contents, that is outside the normal numbering. For example, there might be a front cover page, and then a table of contents labeled with page numbers i through xviii, and then the main content of the document follows on pages 1 through 263.

Computer programmers, I just realized, have a nice piece of jargon to describe this situation, which is very common. They speak of "logical" and "physical" pages. The "physical" page numbers are the real, honest-to-goodness numbers of the pages, what you get if you start at 1 and count up. The "logical" page numbers are the names by which the pages are referred. In the example document I described, physical page 1 is the front cover, physical page 2 is logical page i, physical page 19 is logical page xviii, physical page 20 is logical page 1, and so forth. The document has 282 physical pages, and the last one is logical page 263.

Let's denote physical pages with square brackets and logical pages with curvy brackets. So "(xviii)" and "[19]" denote the same page in this document. Page (1) is page [20], and page (20) is page [39]. Page [1] has no logical designation, or perhaps it is something like "(front cover sheet)".

Now the problem I want to discuss is as follows: Every viewer program has a little box where it displays the current page number, and the little box is usually editable. You scan the table of contents, find the topic you want to read about, and the table says that it's on page (165). Then you tell the document viewer to go to page 165, and it does, but it's not the page 165 you want, because the viewer gives you [165], which is actually (146). You actually wanted (165), which is page [184].

Then you curse, mentally subtract 146 (what you got) from 165 (what you wanted), add the result, 19, back to 165, getting 184, and then you ask for 184 to get 165. And if you're me you probably mess up one time in three and have to do it over, because subtraction is hard.

But it would be extremely easy for viewer programs to mostly fix this. They need to support an option where you can click on the box and tell it "your page number is wrong here". Maybe you would right-click the little page-number box, and the process would pop up a dialog:

Then you would type in 146 (which you can see at the bottom of the page you're viewing) and click "OK". From then on the process would know that the logical and physical page numbers differed by 19, and it would subtract 19 from the number in the little box until you told it something else. You could then type 165 into the little box, and the process would think "well, you asked for (165), and I know that (165) is really [184] because you told me earlier that [165] is really (146)" and then you would get [184], which is what you wanted. And when you scrolled down from (165) to (166), the program would think "ho, you just went from [184] to [185], so I will change the display in the little box and display [185]-19 = (166) there".

But no, none of them do this.

The document itself should carry this information, and some of them do, sometimes. But not every document will, so viewers should support this feature, which is useful anyway.

Some document formats support internal links, but most documents do not use those features, and anyway they are useless when what you are trying to do is look up a reference from someone else's bibliography: "(See Ogul, pp. 662–664.)"

This is not a complete solution, but it's an almost complete solution, and it can be implemented unilaterally, by which I mean that the document author and the viewer program author need not agree on anything. It's really easy to do.

[ Addendum 20080521: Chung-chieh Shan informs me that current versions of xdvi have this feature. I was unaware of this, because the version installed on my machine was compiled in celebration of the 1926 Philadelphia Sesquicentennial Exhibition and so predates the addition of this feature. ]

[ Addendum 20080530: How I made the dialog box graphic. ]

[Other articles in category /tech] permanent link

Thu, 01 May 2008

At that moment, the novice was enlightened...
Presented without further comment, a conversation I had yesterday on IRC. I am yrlnry:

--> You are now talking on #ubuntu
23:37<yrlnry>I upgraded to HH this afternoon. Since the upgrade, when I select a URL in gnome-terminal and then pick the "open this link" menu item, the link doesn't open in my browser. Instead, I get a dialog that says "Could not open the address "http://...": There was an error launching the default action command associated with this location." How can I fix this, or find out what the "error" was?
23:38<lpkmgj> yrlnry: this happeds in Windows
   yrlnry: i get that in Windows 2
23:39<yrlnry> lpkmgj: thanks! that fixed my problem!
 <lpkmgj>yrlnry: sarcasm?
 <yrlnry>lpkmgj: No!
 <lpkmgj>yrlnry: right ....
23:40<yrlnry>lpkmgj: WHen you said that, I realized that the problem was that HH had installed Firefox 3, and that the terminal program wants to use the default browser, which is FF2, which is no longer present since the upgrade.
 <yrlnry>lpkmgj: so I told FF3 to make itself the default browser, and the problem went away.
 <lpkmgj>yrlnry: oh, well glad i helped : )

(I have changed the name of the other person.)

[Other articles in category /tech] permanent link

Tue, 20 Mar 2007

How big is a five-gallon jug?
Office water coolers in the United States commonly take five-gallon jugs of water. You are probably familiar with these jugs, but here is a picture of a jug, to refresh your memory. A random graduate student has been provided for scale:

Here's today's riddle: Can you estimate the volume of the jug in cubic feet? "Estimate" means by eyeballing it, not by calculating, measuring, consulting reference works, etc. But feel free to look at an actual jug if you have one handy.

Once you've settled on your estimate, compare it with the correct answer, below.

It is about 2/3 of a cubic foot.

One gallon contains about 231 cubic inches. Five gallons contain about 1155 cubic inches.

One cubic foot contains 12×12×12 = 1728 cubic inches.

Hard to believe, isn't it? ("Strange but true.") I took one of these jugs around my office last year, asking everyone to guess how big it was; nobody came close. People typically guessed that it was about three times as big as it actually is.

This puzzle totally does not work anywhere except in the United States. The corresponding puzzle for the rest of the world is "Here is a twenty-liter jug. Can you guess the volume of the jug in liters?" I suppose this is an argument in favor of the metric system.

[Other articles in category /tech] permanent link

Wed, 14 Mar 2007

The Spite House

New York's Architectural Holdouts
New York's Architectural Holdouts
with kickback
no kickback
The subject of really narrow buildings came up on Reddit last week, and my post about the "Spite House" was well-received. Since pictures of it seem to be hard to come by, I scanned the pictures from New York's Architectural Holdouts by Andrew Alpern and Seymour Durst.

The book is worth checking out, particularly if you are familiar with New York. The canonical architectural holdout occurs when a developer is trying to assemble a large parcel of land for a big building, and a little old lady refuses to sell her home. The book is full of astonishing pictures: skyscrapers built with holdout buildings embedded inside them and with holdout buildings wedged underneath them. Skyscrapers built in the shape of the letter E (with the holdouts between the prongs), the letter C (with the holdout in the cup), and the letter Y (with the holdout in the fork).

Photo credit: Jerry Callen
When Henry Siegel, a New York store owner, got news in 1898 that Macy's was going to build a gigantic new flagship store on Herald Square, he bought the corner lot for $375,000 to screw over his competitors. The Herald Square Macy's still has a notch cut out of its corner; see the picture at right. The Macy's store on Queens Boulevard is in the shape of a perfect circle, except for the little bit cut out of one side where the proverbial old lady (this time named Mary Sendek) refused to sell a 7×15-foot back corner of her lot for $200,000 because she wanted her dog to have a place to play. (Here's a satellite view of the building. The notch is clearly visible at the northwest corner, facing 55th Avenue.)

But anyway, the Spite House. The story, as told by Alpern and Durst, is that around 1882, Patrick McQuade wanted to build some houses on 82nd Street at Lexington Avenue. The adjoining parcel of land, around the corner on Lexington, was owned by Joseph Richardson, shown at left. If McQuade could acquire this parcel, he would be able to extend his building all the way to Lexington Avenue, and put windows on that side of the building. No problem: the parcel was a strip of land 102 feet long and five feet wide along Lexington, useless for any other purpose. Surely Richardson would sell.

McQuade offered $1,000, but Richardson demanded $5,000. Unwilling to pay, McQuade started building his houses anyway, complete with windows looking out on Richardson's five-foot-wide strip, which was unbuildable. Or so he thought.

Richardson built a building five feet wide and 102 feet long, blocking McQuade's Lexington Avenue windows. (Click the pictures for large versions.)

The building soon became known as the "Spite House". The photograph above was taken around 1895. Lexington Avenue is torn up for maintenance in this picture.

Richardson took advantage of a clause in the building codes that allowed him to build bay window extensions in his building. This allowed him to extend its maximum width 2'3" beyond the boundary of the lot. (Alpern and Durst say "In those days, such encroachments on the public sidewalks were not prohibited.") The rooms of the Spite House were in these bay window extensions, connected by extremely narrow hallways:

As you can see, the Spite House was divided into two dwellings, each with a separate entrance, four floors, and two rooms on each floor. The rooms were 7'3" wide and were connected by hallways 3'4" wide.

After construction was completed, Richardson moved into the Spite House and lived there until he died in 1897. The pictures below and at left are from that time.

The edge-on photograph below, showing the Spite House's 3'4" frontage on 82nd Street, was taken in 1912.

The Spite House was demolished in 1915.

Picture credits

The photograph of the Macy's Herald Square store is copyright ©2004 Jerry Callen, and is used with permission.

All other pictures and photographs are in the public domain. I took them from pages 122–124 of the book New York's Architectural Holdouts, by Alpern and Durst. The original sources, as given by Alpern and Durst, are as follows:

Collection of Andrew Alpern.

January 1897 issue of Scientific American.

New York Journal, 5 June 1897
New York Public Service Commission

[Other articles in category /tech] permanent link

Mon, 20 Mar 2006

The 20 most important tools
Forbes magazine recently ran an article on The 20 Most Important Tools. I always groan when I hear that some big magazine has done something like that, because I know what kind of dumbass mistake they are going to make: they are going to put Post-It notes at #14. The Forbes folks did not make this mistake. None of their 20 items were complete losers.

In fact, I think they did a pretty good job. They assembled a panel of experts, including Don Norman and Henry Petroski; they also polled their readers and their senior editors. The final list isn't the one I would have written, but I don't claim that it's worse than one I would have written.

Criticizing such a list is easy---too easy. To make the rules fair, it's not enough to identify items that I think should have been included. I must identify items that I think nearly everyone would agree should have been included.

Unfortunately, I think there are several of these.

First, to the good points of the list. It doesn't contain any major clinkers. And it does cover many vitally important tools. It provokes thought, which is never a bad thing. It was assembled thoughtfully, so one is not tempted to dismiss any item without first carefully considering why it is in there.

Here's the Forbes list:

  1. The Knife
  2. The Abacus
  3. The Compass
  4. The Pencil
  5. The Harness
  6. The Scythe
  7. The Rifle
  8. The Sword
  9. Eyeglasses
  10. The Saw
  11. The Watch
  12. The Lathe
  13. The Needle
  14. The Candle
  15. The Scale
  16. The Pot
  17. The Telescope
  18. The Level
  19. The Fish Hook
  20. The Chisel
The Forbes list has some restrictions. "Tools" must be simple, portable physical implements. Fundamental machines are omitted; most notably, this excludes "the lever" and "the wheel". (The invention of real importance there is not the wheel, but the axle. But that's another article for another time.) Inventions like fire, glassblowing, the computer, gunpowder, the windmill, and written language are ruled out, not because they are unimportant, but because they are not "tools" in the sense of being fairly simple, portable physical implements. They belong on some list, but not this one. (That didn't stop Don Norman from writing a ponderous and obvious essay about how the Forbes list was the wrong list to make. I know Don Norman has his fans, but I've never understood why.)


The Pencil
The Pencil
with kickback
no kickback
The Forbes items are also allowed to stand for categories. For example, "the Rifle" really stands for portable firearms, including muskets and such. "The pencil" includes pens and writing brushes. (Why put "the pencil" and not "the pen"? I imagine Henry Petroski arguing about it until everyone else got tired and gave up.) The spoon, had they included it, would have stood for eating utensils in general.

But here is my first quibble: it's not really clear why some items stood for whole groups, and others didn't. The explanatory material points out that five other items on the list are special cases of the knife: the scythe, lathe, saw, chisel, and sword. The inclusion of the knife as #1 on the list is, I think, completely inarguable. The power and the antiquity of the knife would put it in the top twenty already.

Consider its unmatched versatility as well and you just push it up into first place, and beyond. Make a big knife, and you have a machete; bigger still, and you have a sword. Put a knife on the end of a stick and you have an axe; put it on a longer stick and you have a spear. Bend a knife into a circle and you have a sickle; make a bigger sickle and you have a scythe. Put two knives on a hinge or a spring and you have shears. Any of these could be argued to be in the top twenty. When you consider that all these tools are minor variations on the same device, you inevitably come to the conclusion that the knife is a tool that, like Babe Ruth among baseball players, is ridiculously overqualified even for inclusion with the greatest.

But Forbes people gave the sword a separate listing (#8), and a sword is just a big knife. It serves the same function as a knife and it serves it in the same mechanical way. So it's hard to understand why the Forbes people listed them separately. If you're going to list the sword separately, how can you omit the axe or the spear? Grouping the items is a good idea, because otherwise the list starts to look like the twenty most important ways to use a knife. But I would have argued for listing the sword, axe, chisel, and scythe under the heading of "knife".

I find the other knifelike devices less objectionable. The saw is fundamentally different from a knife, because it is made and used differently, and operates in a different way: it is many tiny knives all working in the same direction. And the lathe is not a special case of the knife, because the essential feature of the lathe is not the sharp part but the spinning part. (I wouldn't consider the lathe a small, portable implement, but more about that below.)


I said that I was required to identify items that everyone would agree are major omissions. I have two such criticisms. One is that the list has room for six cutting tools, but no pounding tools. Where is the club? Where is the hammer? I could write a whole article about the absurdity of omitting the hammer. It's like leaving Abraham Lincoln off of a list of the twenty greatest U.S. presidents. It's like leaving Albert Einstein off of a list of the twenty greatest scientists. It's like leaving Honus Wagner off of a list of the twenty greatest baseball players.

No, I take it back. It's not like any of those things. Those things should all be described as analogous to leaving the hammer of the list of the twenty most important tools, not the other way around.

Was the hammer omitted because it's not a simple, portable physical implement? Clearly not.

Was the hammer omitted because it's an abstract fundamental machine, like the lever? Is a hammer really just a lever? Not unless a knife is just a wedge.

Is the hammer subsumed in one of the other items? I can't see any candidates. None of the other items is for pounding.

Did the Forbes panel just forget about it? That would have been weird enough. Two thousand Forbes readers, ten editors, and Henry Petroski all forgot about the hammer? Impossible. If you stop someone on the street and ask them to name a tool, odds are that they will say "hammer". And how can you make a list of the twenty most important tools, include the chisel as #20, and omit the hammer, without which the chisel is completely useless?

The article says:

We eventually came up with a list of more than 100 candidate tools. There was a great deal of overlap, so we collapsed similar items into a single category, and chose one tool to represent them. That left us with a final list of 33 items, each one a part of a particular class or style of tool; for instance, the spoon is representative of all eating utensils.

Perhaps the hammer was one of the 13 classes of tools that didn't make the cut? The writer of the article, David M. Ewalt, kindly provided me with a complete list of the 33 classes, including the also-rans. The hammer was not with the also-rans; I'm not sure if I find that more or less disturbing.


Well, enough about hammers. The 13 classes that did not make the cut were:

  • spoon
  • longbow
  • broom
  • paper clip
  • computer mouse
  • floppy disk
  • syringe
  • toothbrush
  • barometer
  • corkscrew
  • gas chromatograph
  • condom
  • remote control
Presumably some of these would have been cleaned up for publication, had they been selected for the top 20. For example, "longbow" should obviously be "bow". So I don't want to criticize these too much. The omissions seem more striking to me than the inclusions. But some of the inclusions are just too strange to let pass without comment, and some of those comments will help us understand what should be on the list and what shouldn't be.

"Gas chromatograph" seems to be someone's attempt to steer the list away from ancient inventions and to include some modern tools on the list. This is a worthy purpose. But I wish that they had thought of a better representative than the gas chromatograph. It seems to me that most tools of modern invention serve only very specialized purposes. The gas chromatograph is not an exception. I've never used a gas chromatograph. I don't think I know anyone who has. I've never seen a gas chromatograph. I might well go to the grave without using one. How is it possible that the gas chromatograph is one of the 33 most important tools of all time, beating out the hammer?

With "syringe", I imagine the authors were thinking of the hypodermic needle, but maybe they really were thinking of the syringe in general, which would include the meat syringe, the vacuum pipette, and other similar devices. If the latter, I have no serious complaint; I just wanted to point out the possible misunderstanding.

"Paper clip" is just the kind of thing I was afraid would appear. The paper clip isn't one of the top hundred most important tools, perhaps not even one of the top thousand. If the hammer were annihilated, civilization would collapse within twenty-four hours. If the paper clip were annihilated, we would shrug, we would go back to using pins, staples, and ribbons to bind our papers, and life would go on. If the pin isn't qualified for the list, the paper clip isn't even close.

I was speechless at the inclusion of the corkscrew in a list of essential tools that omits both bottles and corks, reduced to incoherent spluttering. The best I could do was mutter "insane".

I don't know exactly what was intended by "remote control", but it doesn't satisfy the criteria. The idea of remote control is certainly important, but this is not a list of important ideas or important functions but important tools. If there were a truly universal remote control that I could carry around with me everywhere and use to open doors, extinguish lights, summon vehicles, and so on, I might agree. But each particular remote control is too specialized to be of any major value.

Putting the computer mouse on the list of the twenty (or even 33) most important tools is like putting the pastrami on rye on the list of the twenty most important foods. Tasty, yes. Important? Surely not. In the same class as the soybean? Absurd.

The floppy disk is already obsolete.

Other comparisons

The telescope

Returning to the main list, eyeglasses and telescopes are both special cases of the lens, but their fundamentally different uses seem to me to clearly qualify them for separate listing; fair enough. I'm not sure I would have included the telescope, though. Is the telescope the most useful and important object of its type? Maybe I'm missing something, but it seems to me that most of the uses of the telescope are either scientific or military. The military value of the telescope is not in the same class as the value of the sword or the rifle. The scientific value of the telescope, however, is enormous. So it's on it scientific credentials that the telescope goes into the list, if at all.

But the telescope has a cousin, the microscope. Is the telescope's scientific value comparable to that of the microscope? I would argue that it is not. Certainly the microscope is much more widely used, in almost any branch of science you could name except astronomy. The telescope enabled the discovery that the earth is not the center of the universe, a discovery of vast philosophical importance. Did the microscope lead to fundamental discoveries of equal importance? I would argue that the discovery of microorganisms was at least as important in every way.

Arguing that "X is in the list, so Y should be too" is a slippery slope that leads to a really fat list in which each mistaken inclusion justifies a dozen more. I won't make that argument in this article. But the reverse argument, that "Y isn't in the list so X shouldn't be either", is much safer. If the microscope isn't important enough to make the list, then neither is the telescope.

The level

This is the only tool on the list that I thought was a serious mistake, not quite on the order of the Post-It note, but silly in the same way, if to a much lesser degree. It is another item of the type exemplified by the telescope, an item that is on the list, but whose more useful and important cousin is omitted. Why the level and not the plumb line? The plumb line does everything a level does, and more. The level tells you when things are horizontal; the plumb line tells you when they are horizontal or vertical, depending on what you need. The plumb line is simpler and older. The plumb line finds the point or surface B that is directly below point A; the level does nothing of the kind.

I'm boggled; I don't know what the level is doing there. But the fact that my most serious complaint about any particular item is with item #18 shows how well-done I think the list is overall.


The needle made the list at #13, but thread did not. A lot of sewing things missed out. Most of these, I think, are not serious omissions. The spinning wheel, for example: hand-spinning works adequately, although more slowly. The thimble? Definitely not in the top twenty. The button, with frogs and other clasps included? Maybe, maybe not. But one omission is serious, and must be considered seriously: the loom. I suppose it was eliminated for being too big; there can be no other excuse. But the lathe is #12, and the lathe is not normally small or portable.

There are small, portable lathes. But there are also small, portable looms, hand looms, and so on. I think the loom has a better claim to being a tool in this sense than a lathe does. Cloth is surely one of the ten most important technological inventions of all time, up there with the knife, the gun, and the pot. Cloth does not belong on the Forbes list, because it is not a tool. But omission of the loom surprises me.


Similarly, the omission of the windmill is quite understandable. But what about the quern? Flour is surely a technology of the first importance., Grain can be ground into flour without a windmill, and in many places was or still is. This morning I planned to write that it must have been omitted because it is hardly used any more, but then I thought a little harder and realized that I own not one but two devices that are essentially querns. (One for grinding coffee beans, the other for peppercorns.) I wouldn't want to argue that the quern is on the top twenty, but I think it's worth considering.

Male bias?

In fact, the list seems to omit a lot of important handicraft and home items that have fallen into disfavor. Male bias, perhaps? I briefly considered writing this article with the male-bias angle as the main point, but it's not my style. The authors might learn something from consideration of this question anyway.

The pot made the list, but not the potter's wheel. An important omission, perhaps? I think not, that a good argument could be made that the potter's wheel was only an incremental improvement, not suitable for the top twenty.

I do wonder what happened to rope; here I could only imagine that they decided it wasn't a "tool". (M. Ewalt says that he is at a loss to explain the omission of rope.) And where's the basket? Here I can't imagine what the argument was.


With the mention of baskets, I can't put off any longer my biggest grievance about the list: Where is the bag?

The bag! Where is the bag?

I will say it again: Where is the bag?

Is the bag a small, portable implement? Yes, almost by definition. "Stop for a minute and think about what you've done today--every job you've accomplished, every task you've completed." begins the Forbes article. Did I have my bag with me? I did indeed. I started the day by opening up a bag of grapes to eat for breakfast. Then I made my lunch and put it in a bag, which I put into another, larger bag with my pens and work papers. Then I carried it all to work on my bicycle. Without the bag, I couldn't have carried these things to work. Could I have gotten that stuff to work without a bag? No, I would not have had my hands free to steer the bicycle. What if I had walked, instead of riding? Still probably not; I would have dropped all the stuff.

The bag, guys!. Which of you comes to work in the morning without a bag? I just polled the folks in my office; thirteen of fourteen brought bags to work today. Which of you carries your groceries home from the store without a bag? Paleolithic people carried their food in bags too. Did you use a lathe today? No? A telescope? No? A level? A fish hook? A candle? Did you use a bag today? I bet you did. Where is the bag?

The only container on the Forbes list is the pot. Could the bag be considered to be included under the pot? M. Ewalt says that it was, and it was omitted for that reason. I believe this is a serious error. The bag is fundamentally different from the pot. I can sum up the difference in one sentence: the pot is for storage; the bag is for transportation.

Each one has several advantages not possessed by the other. Unlike the pot, the bag is lightweight and easy to carry; pots are bulky. You can sling the bag over your shoulder. The bag is much more accommodating of funny-shaped objects: It's much easier to put a hacked-up animal or a heterogeneous bunch of random stuff into a bag than into a pot. My bag today contains some pads of paper, a package of crackers, another bag of pens, a toy octopus, and a bag of potato chips. None of this stuff would fit well into a pot. The bag collapses when it's empty; the pot doesn't.

The pot has several big advantages over the bag:

  1. The pot is rigid. It tends to protect its contents more than a bag would, both from thumping and banging, and from rodents, which can gnaw through bags but not through pots.

  2. The pot is impermeable. This means that it is easy to clean, which is an important health and safety issue. Solids, such as grain or beans, are protected from damp when stored in pots, but not in bags. And the pot, being impermeable, can be used to store liquids such as food and lighting oils; making a bag for storing liquids is possible but nontrivial. (Sometimes permeability is an advantage; we store dirty laundry in bags and baskets, never pots.)

  3. The pot is fireproof, and so can be used for cooking. Being both fireproof and impermeable, the pot enables the preparation of soup, which increases the supply of available food and the energy that can be extracted from the food.

The bag probably predates the pot. To make pots, you must locate a suitable source of clay, shape it, and sun-dry or bake it. To make a bag requires nothing more than to grab a large animal skin by the corners. The bag doesn't get as much notice by anthropologists---not because it's less important, but because it's not as durable. We have potsherds that are thirteen thousand years old. All the bags that old have long since turned to dust.

I have no objection to Forbes' inclusion of the pot on their list, none at all. In fact, I think that it should be put higher than #16. But the bag needs to be listed too.

Other possible omissions

After the hammer, the bag, and rope, I have no more items that I think are so inarguable that they are sure substitutes for items in Forbes' list. There are items I think are probably better choices, but I think it is arguable, and, as I explained at the beginning of the article, I don't want to take cheap shots. Any list of the 20 most important tools will leave out a lot of important tools; switching around which tools are omitted is no guarantee of an objectively better list. For discussion purposes only, I'll mention tongs (including pliers), baskets, and shovels. Of the items on Forbes' near-miss list that I would want to consider are the bow, the broom and the spoon.

Revised list

Here, then, is my revised list. It's still not the list I would have made up from scratch, but I wanted to try to retain as much of the Forbes list as I could, because I think the items at the bottom are judgement calls, and there is plenty of room for reasonable disagreement about any of them.

Linguists found a while ago that if you ask subjects to judge whether certain utterances are grammatically correct or not, they have some difficulty doing it, and their answers do not show a lot of agreement with other subjects'. But if you allow them an "I'm not sure" category, they have a lot less difficulty, and you do see a lot of agreement about which utterances people are unsure about. I think a similar method may be warranted here. Instead of the tools that are in or out of the list, I'm going to make two lists: tools that I'm sure are in the list, and tools that I'm not sure are out of the list.

The Big Eight, tools that I think you'd have to be crazy to omit, are:

  1. Knife (includes sword, axe, scythe, chisel, spear, shears, scissors)
  2. Hammer (includes club, mace, sledgehammer, mallet)
  3. Bag (includes wineskin, water skin, leather bottle, purse)
  4. Pot (includes plate, bowl, pitcher, rigid bottle, mortar)
  5. Rope (includes string and thread)
  6. Harness (includes collar and yoke)
  7. Pen (includes pencil, writing brush, etc.)
  8. Gun (includes rifle and musket, but not cannon)
The lesser twelve, the tools that I'm not sure are off the list, are:

  1. Compass
  2. Plumb line (includes level)
  3. Sewing needle
  4. Candle (includes lamp, lantern, torch)
  5. Ladder
  6. Eyeglasses (includes contact lenses)
  7. Saw
  8. Balance
  9. Fishhook
  10. Lathe
  11. Abacus (includes counting board)
  12. Microscope
My lists merge the sword, scythe, and chisel under the knife. This frees up space for the hammer, the bag, and rope, which I think were Forbes' most serious omissions. The only other omission I felt that I had to correct was the ladder; I removed the watch to make room, although I had misgivings about that.

The other adjustments are minor: The pot got a big promotion, from #16 to #4. The pencil is represented by the pen, instead of the other way around. The rifle is teamed with the musket as "the gun". The telescope is replaced with the microscope. The level is replaced with the plumb line. The scale is replaced by the balance, which is more a terminological difference than anything else.

The omission of mine that worries me the most is the basket. I left it out because although it didn't seem very much like either the pot or the bag, it did seem too much like both of them. I worry about omitting the pin, but I'm not sure it qualifies as a "tool".

If I were to get another 13 slots, I might include:

  1. Basket
  2. Broom
  3. Horn
  4. Pry bar
  5. Quern
  6. Radio (Walkie-talkies)
  7. Scraper
  8. Shovel
  9. Spoon
  10. Tape
  11. Tongs
  12. Touchstone
  13. Welding torch
[ Addendum 20120628: National Geographic reports the discovery of the oldest known "purse", estimated to be between 4200 and 4500 years old. The purse itself has disintegrated, leaving only its exterior decorations: a hundred dog teeth. ]

[Other articles in category /tech] permanent link