Date: 2025-06-16

Submitted by Charles Belov:

I've been browsing with interest through the proposed Unicode 17 changes, which are currently undergoing a comment period. While I don't have the knowledge to comment intelligently on the proposals, it's good to see that the Unicode Consortium is actively improving language access.



I'm puzzled that some new characters have been added to the existing CJK Unified Ideographs Extension C (6 characters) and CJK Unified Ideographs Extension E (12 characters) rather than to a new extension. Most interesting, though, is the apparently brand-new CJK Unified Ideographs Extension J, with over 4,000 added characters.

I found the following characters of special interest:



– 323B0 looks like the character 五 with the bottom stroke missing.

– 323B3 looks like an arrangement of three 三s – does it possibly mean the same as 九?

– 32501, while not a match for the biang character in complexity, is nevertheless quite a stroke pile: the 厂 radical enclosing a 3 by 3 array of the character 有.

– 3261E is the character 乙 in a circle, which doesn't look quite right to me as a legit Chinese character.

– 326FB seems sexist to me: three 男 over one 女.

– 33143, similarly to 32501, has ⻌ enclosing a 3 by 3 array of the character 日.
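For readers who want to look at these characters themselves, the hexadecimal values in the list above can be turned into characters programmatically. What follows is only a minimal Python sketch, not anything taken from the Unicode proposal; the helper name `describe` is hypothetical. Whether a glyph actually displays depends on having a font that covers it, and whether a name is reported depends on the Unicode version bundled with the Python in use.

```python
import unicodedata

# Code points singled out in the post, written as hexadecimal strings.
CODE_POINTS = ["323B0", "323B3", "32501", "3261E", "326FB", "33143"]


def describe(hex_code_point):
    """Return basic facts about one code point given as a hex string."""
    value = int(hex_code_point, 16)
    char = chr(value)
    return {
        "label": f"U+{value:04X}",
        "char": char,
        # Each plane holds 0x10000 code points; plane 0 is the BMP.
        "plane": value >> 16,
        "utf8_bytes": len(char.encode("utf-8")),
        "utf16_units": len(char.encode("utf-16-be")) // 2,
        # Falls back when this Python's Unicode data predates the character.
        "name": unicodedata.name(char, "(not in this Python's Unicode data)"),
    }


if __name__ == "__main__":
    for cp in CODE_POINTS:
        info = describe(cp)
        print(info["label"], info["char"], "plane", info["plane"], info["name"])
```

All six values lie outside the Basic Multilingual Plane, so each needs four bytes in UTF-8 and a surrogate pair in UTF-16, which is one reason older software sometimes mishandles such characters.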



Alas, macOS does not yet support the biang character, so I can't include it in this email. Hopefully someday.

Character additions

VHM:

Note that, as it has been since the beginning of Unicode, CJK gobbles up the vast majority of all code points (see Mair and Liu 1991).

What does this fact tell us about the Chinese writing system, particularly in comparison with other writing systems? How does one account for this gross disparity, and what does it mean?

The average number of strokes in a Chinese character is roughly 12.

The average number of strokes in a letter of the English alphabet is 1.9.

The average number of syllables in an English word is 1.66 (and 5 letters).

The average number of syllables in a Chinese word is roughly 2 (and 24 strokes).

The average number of words in an English sentence is 15-20.

The average number of words in a Chinese sentence is 25 (ballpark figure; see here).

Chinese has more than 100,000 characters.

English has 26 letters.

Total number of English words: over 600,000 (Oxford English Dictionary)

Total number of Chinese words: a little over 370,000 (Hànyǔ dà cídiǎn 漢語大詞典 [Unabridged dictionary of Sinitic])

und so weiter
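The per-word figures above follow from simple multiplication, and the same arithmetic can be carried one step further to sentences. The sketch below uses only the rough averages quoted in this post; the function names are hypothetical, and the Chinese word figure assumes one character per syllable, as the "24 strokes" estimate implies.

```python
# Rough averages quoted in the post (all approximate).
STROKES_PER_CHARACTER = 12
STROKES_PER_LETTER = 1.9
LETTERS_PER_ENGLISH_WORD = 5
CHARACTERS_PER_CHINESE_WORD = 2  # assumption: one character per syllable
ENGLISH_WORDS_PER_SENTENCE = (15, 20)  # low and high ends of the range
CHINESE_WORDS_PER_SENTENCE = 25


def strokes_per_word():
    """Return (English, Chinese) average strokes per word."""
    english = LETTERS_PER_ENGLISH_WORD * STROKES_PER_LETTER
    chinese = CHARACTERS_PER_CHINESE_WORD * STROKES_PER_CHARACTER
    return english, chinese


def strokes_per_sentence():
    """Return ((English low, English high), Chinese) strokes per sentence."""
    english_word, chinese_word = strokes_per_word()
    english = tuple(n * english_word for n in ENGLISH_WORDS_PER_SENTENCE)
    chinese = CHINESE_WORDS_PER_SENTENCE * chinese_word
    return english, chinese


if __name__ == "__main__":
    print("strokes per word (English, Chinese):", strokes_per_word())
    print("strokes per sentence (English range, Chinese):", strokes_per_sentence())
```

On these ballpark inputs, an English word comes to about 9.5 strokes against 24 for a Chinese word, and a Chinese sentence to about 600 strokes against roughly 140 to 190 for an English one. The results are only as good as the averages fed in.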
