The Universe of Discourse


Thu, 25 May 2023

Egyptian crocodile hieroglyphs in Unicode

A while back Rik Signes brought my attention to the Unicode codepoint with the long and peculiar name TELUGU FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR (U+0c7e) and this inspired me to write an article about what it was for.

Recently I was looking into how Egyptian hieroglyphic characters are encoded in Unicode. The possible character set is quite large; for example here's the name of the god Osiris:

Hieroglyph consisting of three
components.  At right, the figure of a beared, kneeling man.  At left,
a polygon representing a throne, above a human eye.

Is this a single codepoint? No, there are codepoints for the three components of the hieroglyph (the kneeling bearded man, the eye, and the polygon thingy that represents a throne), and then some combining characters to say how they should be assembled, also combining characters to indicate notations like cartouches and rubrics.

(I learned this hieroglyphic with the eye part uppermost and the man and throne side-by-side below, but I suppose Egyptian spelling must have changed over the millennia.)

The codepoints themselves have disappointing names like EGYPTIAN HIEROGLYPHIC SIGN A049, which I think is the designation for the man-with-beard component. But the original proposal hints at something greater. It suggests, and immediately rejects, a descriptive nomenclature including such names as BABY CHICK, OWL, HARE,

    STANDING MONKEY HOLDING SEVERED HEAD

and

    RECUMBENT CROCODILE WITH COBRA HEADDRESS AND FLAGELLUM

But still, what could have been? TELUGU FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR is only 51 characters long, but RECUMBENT CROCODILE WITH COBRA HEADDRESS AND FLAGELLUM is 54.

If you want to look it up, it is known as EGYPTIAN HIEROGLYPHIC SIGN I098, found on pages 36–37 of the formal proposal. The suggested glyph looks like this:

Hieroglyphic symbol
of a recumbent crocodile with cobra headdress and flagellum.


Addendum: Rik informs me that he brought the Telugu fraction to my attention not, as I remembered, because it was longest but because it was curious. At the time the longest designation was

    ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM

which has since been supplanted by the twins

    BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT AND MIDDLE RIGHT TO LOWER CENTRE
    BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT AND MIDDLE LEFT TO LOWER CENTRE

I still regret that STANDING MONKEY HOLDING SEVERED HEAD is not the name of a Unicode codepoint.

Hieroglyphic symbol of
a standing monkey holding a severed head.

[ Addendum 20230526: More hieroglyphic monkeys holding stuff. ]


[Other articles in category /tech] permanent link