The Universe of Disco


Sun, 23 Oct 2022

This search algorithm is usually called "group testing"

Yesterday I described an algorithm that locates the ‘bad’ items among a set of items, and asked:

does this technique have a name? If I wanted to tell someone to use it, what would I say?

The answer is: this is group testing, or, more exactly, the “binary splitting” version of adaptive group testing, in which we are allowed to adjust the testing strategy as we go along. There is also non-adaptive group testing in which we come up with a plan ahead of time for which tests we will perform.

I felt kinda dumb when this was pointed out, because:

  • A typical application (and indeed the historically first application) is for disease testing
  • My previous job was working for a company doing high-throughput disease testing
  • I found out about the job when one of the senior engineers there happened to overhear me musing about group testing
  • Not only did I not remember any of this when I wrote the blog post, I even forgot about the disease testing application while I was writing the post!

Oh well. Thanks to everyone who wrote in to help me! Let's see, that's Drew Samnick, Shreevatsa R., Matt Post, Matt Heilige, Eric Harley, Renan Gross, and David Eppstein. (Apologies if I left out your name, it was entirely unintentional.)

I also asked:

Is the history of this algorithm lost in time, or do we know who first invented it, or at least wrote it down?

Wikipedia is quite confident about this:

The concept of group testing was first introduced by Robert Dorfman in 1943 in a short report published in the Notes section of Annals of Mathematical Statistics. Dorfman's report – as with all the early work on group testing – focused on the probabilistic problem, and aimed to use the novel idea of group testing to reduce the expected number of tests needed to weed out all syphilitic men in a given pool of soldiers.

Eric Harley said:

[It] doesn't date back as far as you might think, which then makes me wonder about the history of those coin weighing puzzles.

Yeah, now I wonder too. Surely there must be some coin-weighing puzzles in Sam Loyd or H.E. Dudeney that predate Dorfman?

Dorfman's original algorithm is not the one I described. He divides the items into fixed-size groups of n each, and if a group of n contains a bad item, he tests the n items individually. My proposal was to always split the group in half. Dorfman's two-pass approach is much more practical than mine for disease testing, where the test material is a body fluid sample that may involve a blood draw or sticking a swab in someone's nose, where the amount of material may be limited, and where each test offers a chance to contaminate the sample.

Wikipedia has an article about a more sophisticated of the binary-splitting algorithm I described. The theory is really interesting, and there are many ingenious methods.

Thanks to everyone who wrote in. Also to everyone who did not. You're all winners.

[ Addendum 20221108: January First-of-May has brought to my attention section 5c of David Singmaster's Sources in Recreational Mathematics, which has notes on the known history of coin-weighing puzzles. To my surprise, there is nothing there from Dudeney or Loyd; the earliest references are from the American Mathematical Monthly in 1945. I am sure that many people would be interested in further news about this. ]


[Other articles in category /prog] permanent link