The Universe of Discourse


Thu, 12 Jan 2006

Medieval Chinese typesetting technique
One of my longtime fantasies has been to write a book called Quipus and Abacuses: Digital Information Processing Before 1946. The point being that digital information processing did exist well before 1946, when large-scale general-purpose electronic digital computers first appeared. (Abacuses you already know about, and a future blog posting may discuss the use of abacuses in Roman times and in medieval Europe. Quipus are bunches of knotted cords used in Peru to record numbers.) There are all sorts of interesting questions to be answered. For instance, who first invented alphabetization? (Answer: the scribes at the Great Library in Alexandria, around 200 BCE.) And how did they do it? (Answer to come in a future blog posting.) How were secret messages sent? (Answer: lots of steganography.) How did people do simple arithmetic with crappy Roman numerals? (Answer: abacuses.) How were large quantities of records kept, indexed, and searched? How were receipts made when the recipients were illiterate?

Here's a nice example. You may have heard that the Koreans and the Chinese had printing presses with movable type before Gutenberg invented them in Europe. How did they organize the types?

In Europe, there is no problem to solve. You have 26 different types for capital letters and 26 for small letters, so you make two type cases, each divided into 26 compartments. You put the capital letter types in the upper case and the small letter types in the lower case. (Hence the names "uppercase letter" and "lowercase letter".) You put some extra compartments into the cases for digits, punctuation symbols, and blank spaces. When you break down a page, you sort the types into the appropriate compartments. There are only about 100 different types, so whatever you do will be pretty easy.

However, if you are typesetting Chinese, you have a much bigger problem on your hands. You need to prepare several thousand types just for the common characters. You need to store them somehow, and when you are making up a page to be printed you need to find the required types efficiently. The page may require some rare characters, and you either need to have up to 30,000 rarely-used types made up in advance or some way to quickly make new types as needed. And you need a way to sort out the types and put them away in order when the page is complete.

(I'm sure some reader is itching to point out that Korean is written with a phonetic alphabet, hangul, which avoids the problem by having only 28 letters. But in fact that is wrong for two reasons. First, the layout of Korean writing requires that a type be made for each two- or three-letter syllable. And second, perhaps more to the point, movable type presses were used in Korea before the invention of hangul, before Korean even had a written form. Movable type was invented in Korea around 1234 CE; hangul was first promulgated by Sejong the Great in 1443 or 1444. The first Korean movable type presses were used to typeset documents in Chinese, which was the language of scholarship and culture in Korea until the 19th century.)

In fact, several different solutions were adopted. The earliest movable types in China were made of clay mixed with glue. These had the benefit of being cheap. Copper types were made later, but had two serious disadvantages. First, they were very expensive. And second, since much of their value could be recovered by melting them down, the government was always tempted to destroy them to recover the copper, which did indeed happen.

Wang Chen (王禎), in 1313, writes that the types were organized as follows: There were two circular bamboo tables, each seven feet across and with one leg in the middle; the tabletops were mounted on the legs so that they could rotate. One table was for common types and the other for the rare, one-off types. The top of each table was divided into eight sections, and in each section, types were arranged in their numerical order according to their listing in the Book of Rhymes, an early Chinese dictionary that organized the characters by their sounds.

To set the type for a page, the compositors would go through the proof and label each character with its code number from the Book of Rhymes. One compositor would then read from the list of numbers while the other, perched on a seat between the two rotating tables, would select the types from the tables. Wang doesn't say, but one supposes that the compositors would first put the code numbers into increasing order before starting the search for the right types. This would have two benefits: First, it would enable a single pass to be made over the two tables, and second, if a certain character appeared multiple times on the page, it would allow all the types needed for that character to be picked up at once.
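Here is a minimal Python sketch of that supposed procedure: count how many times each code number appears on the page, then walk the codes in increasing order. It is only an illustration of the guess above, not something Wang describes; the function name `picking_list` and the code numbers are made up.

```python
from collections import Counter

def picking_list(page_codes):
    """Return (code, count) pairs in increasing code order.

    Sorting allows a single pass over the tables, and counting
    lets every copy of a repeated character be picked up at once.
    """
    counts = Counter(page_codes)
    return sorted(counts.items())

# Hypothetical Book of Rhymes code numbers for the characters on one page.
page = [412, 17, 903, 17, 412, 17, 55]

for code, count in picking_list(page):
    print(f"code {code}: pick {count} type(s)")
```

The compositor reading aloud would then call out each code once, along with the number of types wanted.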

The types would then be inserted into the composition frame. If a character was needed for which there was no type, one was made on the spot. Wang Chen's types were made of wood. The character was carefully written on very thin paper, which was then pasted upside-down onto a blank type slug. A wood carver with a delicate chisel would then cut around the character into the wood.

(Source: Invention of printing in China and its spread westward. Thomas Francis Carter, 1925.)

In 1776 a great printing project was overseen by Jin Jian (Chin Chien), also using wooden types. Jin left detailed instructions about how the whole thing was accomplished. By this time the Book of Rhymes had been superseded.

The Imperial K'ang Hsi Dictionary (K'ang-hsi tzu-tien or Kāngxī Zìdiǎn, 康熙字典), written between 1710 and 1716, was the gold standard for Chinese dictionaries at the time, and to some extent, still is, since it set the pattern for the organization of Chinese characters that is still followed today. If you go into a store and buy a Chinese dictionary (or a Chinese-English dictionary) that was published last week, its organization will be essentially the same as that of the Imperial K'ang Hsi Dictionary. Since readers may be unfamiliar with the organization of Chinese dictionaries, I will try to explain.

Characters are organized primarily by a "classifier", more usually called a "radical" today. The typical Chinese character incorporates some subcharacters. For example, the character for "bright" is clearly made up of the characters for "sun" and "moon"; the character for "sweat" is made up of "water" and "shield". (The "shield" part is not because of anything relating to a shield, but because it sounds like the word for "shield".) Part of each character is designated its radical. For "sweat", the radical is "water"; for "bright" it is "sun". How do you know that the radical for "bright" is "sun" and not "moon"? You just have to know.

What about characters that are not so clearly divisible? They have radicals too, some of which were arbitrarily designated long ago, and some of which were designated based on incorrect theories of etymology. So some of it is arbitrary. But all ordering of words is arbitrary to some extent or another. Why does "D" come before "N"? No reason; you just have to know. And if you have ever seen a first-grader trying to look up "scissors" in the dictionary, you know how difficult it can be. How do you know it has a "c"? You just have to know.

Anyway, a character has a radical, which you can usually guess in at most two or three tries if you don't already know it. There are 214 radicals in all, and they are ordered first by number of strokes, and then in an arbitrary but standard order among the ones with the same number of strokes. The characters in the dictionary are listed in order by their radical. Then, among all the characters with a particular radical, the characters are ordered by the number of strokes used in writing them, from least to most. This you can tell just by looking at the characters. Finally, among characters with the same number of strokes and the same radical, the order is arbitrary. But it is still standardized, because it is the order used by the Imperial K'ang Hsi Dictionary.

So if you want to look up some character like "sweat", you first identify the radical, which is "water", and has four strokes. You look in the radical index among the four-stroke radicals, of which there are about three dozen, until you find "water", and this refers you to the section of the dictionary where all the characters with the "water" radical are listed. You turn to this section, and look through it for the subsection for characters that have seven strokes. Among these characters, you search until you find the one you want.
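The ordering just described amounts to a three-part sort key: strokes in the radical, then the radical's standard position, then strokes in the character. The Python sketch below shows the idea on a toy list. Everything in it is an assumption made for illustration: the three-radical table, the "placeholder" entries, and their stroke counts are invented, and the final arbitrary tie-break from the dictionary is left out.

```python
# (name, strokes) for a few radicals, listed in an assumed standard order.
RADICALS = [("person", 2), ("sun", 4), ("water", 4)]
RADICAL_INFO = {name: (strokes, rank)
                for rank, (name, strokes) in enumerate(RADICALS)}

def sort_key(entry):
    """Order by radical strokes, then radical rank, then character strokes."""
    gloss, radical, total_strokes = entry
    radical_strokes, rank = RADICAL_INFO[radical]
    return (radical_strokes, rank, total_strokes)

# (gloss, radical, strokes); the placeholder entries are hypothetical.
entries = [
    ("sweat", "water", 7),
    ("bright", "sun", 8),
    ("placeholder A", "water", 5),
    ("placeholder B", "person", 6),
]

ordered = sorted(entries, key=sort_key)
print([gloss for gloss, _, _ in ordered])
```

A real dictionary would add a fourth component to the key, the fixed position of the character among others with the same radical and stroke count.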

This is the solution to the problem of devising an ordering for the characters in the dictionary. Since this ordering was (and is) well-known, Jin used it to organize his type cases. He writes:

Label and arrange twelve wooden cabinets according to the names of the twelve divisions of the Imperial K'ang Hsi Dictionary. The cabinets are 5'7" high, 5'1" wide, 2'2" deep with legs 1'5" high. Before each one place a wooden bench of the same height as the cabinet's legs; they are convenient to stand on when selecting type. Each case has 200 sliding drawers, and each drawer is divided into eight large and eight small compartments, each containing four large or four small type. Write the characters, with their classifiers and number of strokes, on labels on the front of each drawer.

When selecting type, first examine the make-up of the character for its corresponding classifier, and then you will know in which case it is stored. Next, count the number of strokes, and then you will know in which drawer it is. If one is experienced in this method, the hand will not err in its movements.

There are some rare characters that are seldom used, and for which few type will have been prepared. Arrange small separate cabinets for them, according to the twelve divisions mentioned above, and place them on top of each type case where they may be seen at a glance.

(Source: A Chinese Printing Manual, 1776. Translated by Richard C. Rudolph, 1954.)

The size measurements here are misleading. The translator says that the "inch" used here is the Chinese inch of the time, which is about 32.5 mm, not the 25.4 mm of the modern inch. He does not say what is meant by "foot"; I assume 12 inches. That means that the type cases are actually 7'2" high, 6'6" wide, and 2'9" deep (218 cm × 198 cm × 84 cm), with legs 1'10" high (55 cm), in modern units.
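The conversion can be checked with a few lines of Python. This is a sketch under the same assumptions as the paragraph above: a Chinese inch of 32.5 mm and a foot of 12 such inches. The function name `to_modern` is made up for the example.

```python
CHINESE_INCH_MM = 32.5   # the translator's figure
MODERN_INCH_MM = 25.4
INCHES_PER_FOOT = 12     # assumed; the translator does not define the foot

def to_modern(feet, inches):
    """Convert a length in Chinese feet and inches.

    Returns (modern feet, modern inches, millimetres), with the
    modern figure rounded to the nearest whole inch.
    """
    mm = (feet * INCHES_PER_FOOT + inches) * CHINESE_INCH_MM
    modern_inches = round(mm / MODERN_INCH_MM)
    modern_feet, rest = divmod(modern_inches, 12)
    return modern_feet, rest, mm

dimensions = {"height": (5, 7), "width": (5, 1), "depth": (2, 2), "legs": (1, 5)}
for name, (feet, inches) in dimensions.items():
    ft, inch, mm = to_modern(feet, inches)
    print(f"{name}: {ft}'{inch}\" ({mm / 10:.1f} cm)")
```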

(Addendum 20060116: The quote doesn't say, but the illustration in Jin's book shows that the cabinets have 20 rows of 10 drawers each.)

One puzzle I have not resolved is that there do not appear to be enough type drawers. Jin writes that there are twelve cabinets with 200 drawers each; each drawer contains 16 compartments, and each compartment four type. This is enough space for 153,600 types (remember that you need multiple copies of the common characters), but Jin reports that 250,000 types were cut for his project. Still, it seems clear that the technique is feasible.
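For the record, here is the arithmetic behind that puzzle as a short Python sketch. The figures are the ones quoted from Jin; the variable names are mine, and the last line only shows how many types each compartment would have to hold on average to fit everything, which is not a claim about what Jin actually did.

```python
cabinets = 12
drawers_per_cabinet = 200
compartments_per_drawer = 16   # eight large plus eight small
types_per_compartment = 4

compartments = cabinets * drawers_per_cabinet * compartments_per_drawer
capacity = compartments * types_per_compartment
types_cut = 250_000            # the number Jin reports

shortfall = types_cut - capacity
needed_per_compartment = types_cut / compartments

print(f"capacity:  {capacity}")
print(f"shortfall: {shortfall}")
print(f"types per compartment needed to fit all: {needed_per_compartment:.2f}")
```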

Another puzzle is that I still don't know what the "twelve divisions" of the Imperial K'ang Hsi Dictionary are. I examined a copy in the library and I didn't see any twelve divisions. Perhaps some reader can enlighten me.

As in Wang's project, one compositor would first go over the proof page, making a list of which types needed to be selected, and how many; new types were cut from wood as needed. Then compositors would visit the appropriate cases to select the types as necessary; another compositor would set the type, and then the page would be printed, and the type broken down. These activities were always going on in parallel, so that page n was being printed while page n+1 was being typeset, the types for page n+2 were being selected, and page n-1 was being broken down and its types returned to the cabinet.
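This is an ordinary four-stage pipeline, and a small Python sketch makes the overlap explicit. It assumes, purely for illustration, that every stage takes one equal time step per page; the stage names and the `schedule` function are my own labels, not Jin's.

```python
# Stages in the order a page passes through them.
STAGES = ["selecting types", "typesetting", "printing", "breaking down"]

def schedule(step, num_pages):
    """Map each busy stage to the page it handles at a given time step.

    Pages are numbered from 1, and page p enters the first stage at
    step p, so the stage with lag k is working on page step - k.
    """
    active = {}
    for lag, stage in enumerate(STAGES):
        page = step - lag
        if 1 <= page <= num_pages:
            active[stage] = page
    return active

for step in range(1, 8):
    print(step, schedule(step, num_pages=4))
```

At any step in the middle of a long job, the stage handling page n for printing sits between typesetting on page n+1 and breaking down on page n-1, as described above.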

