The Universe of Discourse

Mark Dominus (陶敏修)
mjd@pobox.com

12 recent entries

An anecdote about backward compatibility
My new git utility `what-changed-twice` needs a new name
Mystery of the quincunx's missing quincunx
The fivefold symmetry of the quince
A descriptive theory of seasons in the Mid-Atlantic
Claude and I write a utility program
A puzzle about balancing test tubes in a centrifuge
Proof by insufficient information
Willie Singletary will you please go now?
How our toy octopuses got revenge on a Philadelphia traffic court judge
Does someone really have to do the dirty jobs?
The mathematical past is a foreign country

Archive:

2026: J
2025: JF M A MJ
JASOND
2024: JF M A M J
J ASOND
2023: JF M A M J
J A S O N D
2022: J F M A M J
JAS O N D
2021: J F M AMJ
J A S O N D
2020: J F M A M J
J A S O N D
2019: JFM A M J
J A S O N D
2018: J F M A M J
J A S O N D
2017: J F M A M J
J A S O N D
2016: JF M A M J
JASON D
2015: JFM A M J
J A S O N D
2014: J F M AMJ
JASON D
2013: JFMAMJ
JAS OND
2012: J F MAMJ
JASOND
2011: JFMAM J
JASOND
2010: JFMAMJ
JA S O ND
2009: J F MAM J
JASOND
2008: J F M A M J
JAS O ND
2007: J F M A M J
J A S O N D
2006: J F M A M J
JAS O N D
2005: O N D

Subtopics:

Mathematics 245

Programming 100

Language 95

Miscellaneous 75

Book 50

Tech 49

Etymology 35

Haskell 33

Oops 30

Unix 27

Cosmic Call 25

Math SE 25

Law 22

Physics 21

Perl 17

Biology 16

Brain 15

Calendar 15

Food 15

Comments disabled

Thu, 02 Feb 2006

Preventing multiple SMTP connections
As I believe I said somewhere else, I have a lot of ideas, and only about one in ten turns out really well in the end. About seven-tenths get abandoned before the development stage, and then of the three that survive to development, two don't work.

The MIT CS graduate student handbook used to have a piece of advice in it that I think is very wise. It said that two-thirds of all research projects are failures. The people that you (as a grad student at MIT) see at MIT who seem to have one success after another are not actually successful all the time. It's just that they always have at least three projects going on at once, and you don't hear about the ones that are flops. So the secret of success is to always be working on several projects at once. (I wish I could find this document again. It used to be online.)

Anyway, this note is about one of those projects that I carried through to completion, but that was a failure.

The project was based in the observation that spam computers often make multiple simultaneous connections to my SMTP server and use all the connections to send spam at once. It's very rare to get multiple legitimate simultaneous connections from the same place. So rejecting or delaying simultaneous connections may reduce incoming spam and load on the SMTP server.

My present SMTP architecture spawns a separate process for each incoming SMTP connection. So an IPC mechanism is required for the separate processes to communicate with each other to determine that several of them are servicing the same remote IP address. Then all but one of the processes can commit suicide. If the incoming email is legitimate, the remote server will try the other messages later.

The IPC mechanism I chose was this: Whenever an SMTP server starts up, servicing a connection from address a.b.c.d, it creates an empty file named smtplock/a/b/c/d and gets an exclusive file lock on it. It's OK if the file already exists. But if the file is already locked, the server should conclude that some other server is already servicing that address, and commit suicide.

This worked really well for a few weeks. Then one day I came home from work and found my system all tied into knots. The symptoms were similar to those that occurred the time I used up all the space on the root partition, so I checked the disk usage right away. But none of the disks were full. A minute later, I realized what the problem was: creating a file for each unique IP address that sent mail had used up all the inodes, and no new files could be created.

The IPC strategy I had chosen gobbled up about twenty thousand inodes a day. At the time I had implemented it, I had had a few hundred thousand inodes free on the root partition. Whoops.

Compounding this mistake, I had made another. I didn't provide any easy way to shut off the feature! It would have been trivial to do so. All I had to do was predicate its use on the existence of the smtplock directory. Then I could have moved the directory aside and the servers would have instantly stopped consuming inodes. I would then have been able to remove the directory at my convenience. Instead, the servers "gracefully" handle the non-existence of smtplock by reporting a temporary failure back to the remote system and aborting the transmission. To shut off the feature, I had to modify, recompile, and re-install the server software.

Duh.

[ Addendum 20060202: There is a brief followup to this article. ]

[Other articles in category /oops] permanent link