The economics of Chinese character usage
Under the above rubric, my friend Apollo Wu sent around a note (copied below) about the economic impact of the use of Chinese characters in the operation of his business. Since Apollo was for many years (from 1973 to 1998) a top translator in the Chinese Translation Service at United Nations headquarters in New York, he knows whereof he speaks. Among other interesting tidbits that I heard from Apollo over the decades was that, of the official languages of the United Nations (Arabic, Mandarin Chinese, English, French, Russian, and Castilian Spanish), Chinese was by far the least efficient and most expensive to process.
Apollo's note:
Our company tried to pay two PRC women workers, namely Kang Xi 康肸 and Li Di 李頔 for the services provided by them. However, we failed because our Bank in Shenzhen (Nanyang Commercial Bank) did not accept their names [VHM: here Apollo provided an attached screen shot of an electronic communication from the bank informing his company that the names were cuòwù 错误 ("mistaken") and consequently could not be entered in the bank's computer system].
Prior to making payments, we did have a hard time trying to find out how to input their names with our Pinyin input system. We failed to input the Xi 肸 of the first person as it is not included in the Sogou Pinyin Input (I don't know how my colleague ultimately managed to find the character). The whole affair was very time-consuming. We were at the end of our rope when the Bank refused to accept these Chinese names as valid. It may very well be that the ID issuing Authority in China uses a larger character set in creating ID cards, while different banks use different character sets. It really caused a lot of trouble for everybody.
I would like to write something to highlight this incident in the context of language and economy. In contrast to the practice in the PRC, banks in Hong Kong accept alphabetized names alone, without any problem. The Pinyin or English names coupled with their respective account numbers provide sufficient accuracy, while avoiding a lot of possible trouble in the use of Chinese text.
By the way, a previous rejection of payment was due to a homophonous character used by us, as Guo Yicheng was mistakenly written as 郭意诚 instead of 郭意成. Subsequently we had to call long distance to find out what character was used by the recipient.
This sort of costly, frustrating delay due to problems with characters is especially galling to Apollo, now a software developer in Hong Kong who has business contacts all over the world.
What is particularly ironic in this case is that xī 肸, while certainly not an everyday word, is contained in the phenomenally popular Xīnhuá zìdiǎn 新华字典 (Xinhua [New China] Character Dictionary), which has sold hundreds of millions of copies in numerous editions and is considered to be the standard dictionary of single characters for common use. The Xīnhuá zìdiǎn contains approximately 10,000 characters, counting variant forms. Considering that there are upwards of 80,000 characters in the total set that font managers must contend with, any character that is in Xīnhuá zìdiǎn cannot be considered to be truly obscure. In fact, the definition for xī 肸 in Xīnhuá zìdiǎn says merely this: "commonly used in a person's name"! So Apollo is not the only businessman or other type of bank client (not to mention workers in post offices, hospitals, colleges and universities, etc.) who is going to be having trouble with people who unfortunately have the character xī 肸 in their name.
(For purposes of comparison with the 10,000 characters in the Xīnhuá zìdiǎn and the 80,000+ in mega databases, "full literacy" in modern Chinese requires a knowledge of between 3,000 and 4,000 characters.)
There are still some more bizarre aspects to the question of xī 肸. Apparently, people who have this character in their names are either oblivious of its meaning or just don't care. Why they would choose it is beyond me. Because it sounds good? But there are more than 200 other characters pronounced xī (forget about the hundreds more pronounced xi in one of the other three tones), some with verifiable, felicitous meanings. Because it looks good? Well, I wouldn't vouch for that; to me it actually looks a little ungainly and ill-proportioned (though I'm prepared for disagreement on that point).
Once we start digging more deeply into xī 肸, the mysteries proliferate, such as that it may also be written with 兮 as the phonophore on the right hand side rather than the awkward (to me) group of four strokes that is there now. And, if we try really hard by going to big dictionaries with classical citations, we discover, lo and behold, that this same character xī 肸 is also pronounced bì (no phonological resemblance to xī!), in which case it signifies the name of an ancient place in what is now Shandong Province. Burrowing still more deeply, we find that xī 肸 actually does have a meaning (or rather did have a meaning two millennia or so ago), namely, to indicate a spreading or crescendoing sound. Much later, but only when reduplicated, mind you, it came to be used as onomatopoeia for the sound of laughter. In combination with other characters, xī 肸 also entered into disyllabic expressions meaning such things as "adjust / shake / vibrate ornaments / decorations"; "spreading, dispersing" (as of vapors); "thorough interpenetration of numinosity / spirituality"; "dimly discernible, diaphanous". Once again, I must stress that xī 肸 only acquires these evocative meanings when used in disyllabic expressions with other characters.
Anyway, why anyone (or, more likely, anyone's parents) would want to choose xī 肸 for their personal name — other than that lots of other people have done so — is beyond me, especially in light of the fact that we now know it is causing problems in information processing systems. My advice to everyone: don't inflict xī 肸 on your child, and stay away from it yourself. It will only give you trouble in this modern world of electronic data processing.
I needn't spend so much time and energy on dí 頔. I will say only the following:
It's not in Xīnhuá zìdiǎn (10,000 character range).
It is in Hànyǔ dà cídiǎn 汉语大词典 (Unabridged Dictionary of Sinitic) (23,000 character range).
It is fairly often used in personal names.
It is said (in a character dictionary dating back about a thousand years) to mean "good(ness)".
In a word, characters cost (a lot), both for learners and for users, especially in the Electronic Age.
Matt said,
September 2, 2011 @ 12:09 pm
I had the idea that many Chinese don't choose their children's names themselves…instead they visit a "fortune teller" who looks at the stars or the moon or something to decide a name that will give the child good luck (or wealth, or whatever) throughout their life. Maybe these fortunetellers feel that to make the parent feel they've gotten good value for their money, they need to choose somewhat obscure characters that the parent wouldn't have thought of themselves…
But maybe this is trying to lead the Language Log into Freakonomics territory.
Bruce Rusk said,
September 2, 2011 @ 12:38 pm
To Matt's observation above, one could add that many families have naming systems such that, for example, all the children of a certain generation must have names containing a certain graphic element (usually the semantic on the left). In large families one might have to go to fairly obscure characters to find suitable names.
Adara said,
September 2, 2011 @ 2:28 pm
Yup, lots of fortune-tellers are coming up with Chinese/Taiwanese people's names, often with obscure characters for whatever reason. The fortune-tellers use the child's 8-character birthdate reading and come up with a few options, and the parents pick the one they like.
A common "rare" character used in Taiwanese girls' names is 妏 wen4, but usually pronounced wen2 because it "sounds nicer" that way.
Ian F. said,
September 2, 2011 @ 6:26 pm
Interestingly, of the two mentioned characters only xi 肸 is in the Shuowen Jiezi, and its entry is brief to the point of incomprehensibility. Suffice it to say it says the character comes from a meaning-meaning combination shi 十 ("ten" or "many") and yi 䏌 ("vibrate", "ancient dance").
Andy Averill said,
September 2, 2011 @ 8:09 pm
I wonder if there's a similar problem in Japan. Do they use kanji for personal names?
Philip Spaelti said,
September 2, 2011 @ 10:01 pm
@Andy Averill: I wonder if there's a similar problem in Japan. Do they use kanji for personal names?
Yes, they use Kanji for personal names in Japan. In official contexts the Kanji is in fact the only thing that is registered, and one generally needs to "get it right," i.e., it matters which exact Kanji variant one uses. My wife has an extremely common name with simple characters, and she only ever writes it that way, but officially her name is registered with archaic variants that even just a few years back were not present in some electronic character sets.
An additional problem in Japanese is readings. Generally parents choose names by sound as much as they choose the Kanji. Most Kanji have multiple readings and in name contexts this is multiplied, that is many Kanji have a large extra number of "name readings." Sometimes parents even inadvertently choose "mistaken" readings, that is they choose a Kanji in the mistaken belief that it can be read a certain way. Such children then have an "illegal" reading.
Both of these problems — the archaic variants and "mistake" readings — mean that you can be told by an official person that you have gotten your own name wrong!
Bob Violence said,
September 3, 2011 @ 6:24 am
In official contexts the Kanji is in fact the only thing that is registered, and one generally needs to "get it right," i.e., it matters which exact Kanji variant one uses. My wife has an extremely common name with simple characters, and she only ever writes it that way, but officially her name is registered with archaic variants that even just a few years back were not present in some electronic character sets.
I'm a little surprised by this, since I had always understood the situation to be the opposite — that a person can write their name with whatever characters they want, but that officially registered names can only use kanji from the jōyō ("common use") and jinmeiyō ("personal name") character sets (2,997 characters altogether, some of which are just traditional and simplified variants of the same character), and that characters outside those sets have to be registered using substitute characters or kana. I do know for a fact that it is possible for Japanese personal names to be registered using only kana and with no kanji at all.
Bob Violence said,
September 3, 2011 @ 6:37 am
…and here's an older post from Prof. Mair on the naming situation in (mainland) China. AFAIK the situation is unchanged: whether a rare/unsupported character is allowed depends on how much the officials you deal with are willing to indulge you, and there's no centralized list of allowed characters.
Nathan Hopson said,
September 3, 2011 @ 7:04 am
Bob and Philip are forgetting the interesting point that personal names can also be bestowed in hiragana and katakana. In practice, this only applies to women, but it is an interesting exception to the "orthography > pronunciation"-ism of the family registration bureaucracy, which can lead to what Bob calls "illegal readings" (assume you mean ateji?) in some cases.
The very simple reason for this is legibility, in the sense used by James Scott in "Seeing like a State," for example. The name which can be made legible by the state is the only legal designator that counts — how that is read is unimportant to a system whose function is to subsume individuals and families into the state for taxation, conscription, etc. Pronunciation doesn't matter, because documents don't talk.
Bob Violence said,
September 3, 2011 @ 7:12 am
Bob and Philip are forgetting the interesting point that personal names can also be bestowed in hiragana and katakana.
I mentioned that, and I've known a couple such people myself. Naturalized foreigners are also allowed to take all-kana names if they so choose.
Terry Collmann said,
September 3, 2011 @ 8:37 am
Hah! On British birth certificates, a newborn child's name is rendered with the forename(s) in upper-and-lowercase and the surname(s) in all-caps. When my wife and I registered our daughter's birth, we thought we had registered her name as Firstforename Secondforename Wife'ssurname MYSURNAME, ie three forenames, one surname. When we tried to get our daughter a passport, the Passport Office refused to accept that version of her name, because on the birth certificate it was written as Firstforename Secondforename WIFE'SSURNAME MYSURNAME, ie two forenames, two surnames. So even in the West, an official can tell you that you've got your own child's name wrong …
Andy Averill said,
September 3, 2011 @ 12:02 pm
@Terry Collmann, yes but in the UK it's entirely possible to have 2 (or even 3) surnames, eg Helena Bonham Carter etc etc. Maybe they just assumed that that's what you meant?
Hung Lee said,
September 3, 2011 @ 5:44 pm
What are the odds of having two employees with such funny (I mean unusual) names as Kang Xi 康肸 and Li Di 李頔 in one company? Their parents must be oddballs to have given their children such names, with characters that essentially no one recognizes, much less can pronounce.
B.Ma said,
September 3, 2011 @ 8:39 pm
@Hung Lee, if one in 10000 people are "oddballs", then there are 130000 oddballs in the PRC. So I don't think it's unusual to have 2 odd names cropping up in one person's career; he hasn't told us about all the boring names he has encountered.
This is definitely one of the disadvantages of Pinyin. How many different names can be spelled Li Di? With Guoyu Romazi this name can be spelled up to 16 different ways (don't know how to use it myself obviously). Now if the PRC used Cantonese or some other language that kept finals and more vowels, then there would be even less of a problem. The other disadvantage is that someone decided to use C, Q, R, X, and Z for sounds that don't correspond to their usual English sounds, so all my friends from the PRC have to pronounce their names incorrectly.
I don't know if this has anything to do with PRC policy, but there is nothing mandating you give the government your "real" name. Personally, I have 3 names, a bit like the Chinese emperors of old. A "government" name which is on all my official stuff, boring but not so common as to cause problems in everyday life; a "real" name which is only used within close family, which contains a quite rare character, and a "public" name which I use to introduce myself in general. In addition I also have a Western name, of which the first is English and the middle name is Spanish, so when using those languages I use the relevant name.
Wonder whether anyone else has such a complicated name. One contender is a friend from Mauritius, where one of his ancestors' full Chinese name became his surname. In addition he has a personal Chinese name, a first name and a middle name, so it is 2 English names followed by 5 Chinese characters, the true surname being the 3rd character (though he uses the 5th, which was actually his ancestor's "nickname", as in Ah-____).
David said,
September 4, 2011 @ 12:58 am
Created in 1981, GB2312 was the standard computer character encoding in the PRC, and it included only 6,763 Chinese characters. (Unicode started to supplant it in 1993.) So characters in the Xīnhuá zìdiǎn could have been truly obscure in a computer sense.
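The point about character sets can be checked, at least roughly, with a few lines of Python. What follows is only a sketch under stated assumptions: the helper name `unencodable_chars` is hypothetical, and Python's built-in `gb2312`, `gbk` and `gb18030` codecs stand in for whatever character tables a given bank's system actually uses, which the post does not tell us. The function simply reports which characters of a name a given legacy encoding cannot represent.

```python
def unencodable_chars(text, encoding="gb2312"):
    """Return the characters of `text` that `encoding` cannot represent.

    Hypothetical helper for illustration; it relies only on Python's
    standard codecs, not on any real bank's character table.
    """
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # This character has no code point in the chosen encoding.
            missing.append(ch)
    return missing


if __name__ == "__main__":
    # Names taken from the post; the encodings are assumptions about
    # what a legacy system might use.
    for name in ["康肸", "李頔", "郭意成"]:
        for enc in ("gb2312", "gbk", "gb18030"):
            print(name, enc, unencodable_chars(name, enc))
```

Running the loop over the three encodings shows, for each name, whether a rejection could plausibly come from the encoding alone; an empty list means every character in the name is representable in that encoding.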
Victor Mair said,
September 4, 2011 @ 6:55 am
From Cecilia Segawa Seigle:
Philip Spaelti said,
@Andy Averill: I wonder if there's a similar problem in Japan. Do they use kanji for personal names?
Yes, they use Kanji for personal names in Japan. In official contexts the Kanji is in fact the only thing that is registered, and one generally needs to "get it right," i.e., it matters which exact Kanji variant one uses.
====
The statement that "In official contexts the Kanji is in fact the only thing that is registered" is not right, because many names in katakana or hiragana can be registered as official names. My mother's legal name was はま子. My younger cousin's name is すみれ. But personally, if they wished to sign their names using kanji, they were free to do so.
When new parents face the project of naming a new baby, most parents sincerely wish the child to have a good, healthy, fortunate, bright and happy life, so they name the child with the best kanji they can think of, or with a name after some person they love, or on occasion, with a lucky character that they have been advised to use by a fortune teller or diviner or some such person if they are particularly old-fashioned.
But on occasion, the parents (mainly the father) are so perverse that they would use a fairly ordinary kanji, but choose a reading that is absolutely impossible for anyone to figure out. My understanding is that they are within their legal rights to choose a special reading for any kanji when naming a child. The child will have to struggle throughout his/her life with defending the reading of his/her name. My name is written 淑子 and is read よしこ. This is very normal. But there are many other ways of reading it. And I think (I am not absolutely sure, but I have heard) that if my parents had chosen to read it マリア [VHM: Maria], they would have been within their rights, although they might have been advised at the city hall not to be so eccentric.
Then another kind of perverse parent would choose a character that is not in the usual 漢和辞典 or even in 大漢和辞典, and they will run into the kind of problem Professor Mair is discussing in his essay. My feeling is that both types of parents are extremely unkind to the child, making the child's life difficult and miserable. In the first case, it won't be too bad because the kanji is included in the electronic conversion; you just use the ordinary reading to obtain the right character. But the second case is very very difficult. Professor Mair is right to advise people "don't inflict xī 肸 on your child, and stay away from it yourself. It will only give you trouble in this modern world of electronic data processing."
Now about the difficulty I run into very often in my research and writing. I have encountered many names of Edo period essayists that I can't figure out the reading of, or I cannot find the character in my computer's Kanji conversion capacity. My solution for figuring out the reading is usually going to Morohashi. And going to Google, I search by various readings suggested by Morohashi. Very often, I can find a listing, if the person is a writer or a historical figure, no matter how obscure the person is. Of course there are many cases I cannot find. But I have been lucky most of the time. When I find that person, I just copy the kanji from the Google listing and reproduce it into my writing or bibliography, or whatever I need it for.
JREL said,
September 7, 2011 @ 1:35 am
@B.Ma:
I don't mean to say that this would solve all problems, but if tone marks were used more consistently in Pinyin, it would reduce at least some confusion. Other writing systems are quite happy with diacritics, why not Pinyin?
This is a complaint I often hear from English speakers: Pinyin letters sound different from English letters. Well, tough. After all, if ‹ch› can be [ʧ] in English, [ʃ] in French, and [x] in German, why can't it be [tʂʰ] in Pinyin? Remember: letters ≠ sounds. Pinyin is a Latin-based writing system that is as different from English as any other, and should be treated as such.
~flow said,
September 7, 2011 @ 12:40 pm
@"‹ch› can […] [x] in German"—worse! ‹ch› can be either [x] or [ç], depending on position. It's [x] in ‹Kuchen› ('cake') but [ç] in ‹Kuhchen› (rare; a diminutive of ‹Kuh›: 'small cow').
just as an aside, when you hear either ['ku:çən] or ['ku:xən], you can in principle not know whether to write it ‹Kuchen›, ‹Kuhchen›, ‹Quchen›, or ‹Kuuchen›, although likelihood is very strongly in favor of the first two. similarly, upon reading any of these spellings, you cannot derive the pronunciation from the spelling alone with certainty (although the correct solutions are the ones that are probably the statistically most favored ones)—you must know these words before to be sure.
i seem to remember that some decades ago, some germans fell prey to official arbitrariness when they were told their surnames had been re-spelled from using ‹hs› to using ‹ß›, the reasoning being that ‹hs› 'does not conform to german spelling rules', which is surprising in a country that has no officially prescribed spelling rules, no mention of the german language in the constitution, and is home to about 16 million people (20% of the entire population) with 'migrationshintergrund' ('migrational background').
Wentao said,
September 10, 2011 @ 9:31 am
A person named 佛肸 (bi4 xi1) appeared in the Analects 《论语·阳货》, arguably the most influential work ever written in Chinese. Although it's certainly not a commonly used character in modern days, I think all scholars and readers of classics, or anyone who attended a traditional school, should know this character, since the Analects is a book as fundamental as the Pentateuch.
Actually the story about 佛肸 is quite famous; it's one of the three occasions when Confucius himself was tempted to assume office under someone unworthy, against his moral principles. Ms Kang's parents' choice might be eccentric, but no more than Western parents' in choosing a biblical baby name, e.g. Gideon and Nehemiah.
By the way, the name sounds the same as the era name of the great Manchu emperor, 康熙.
GummySnake said,
October 15, 2011 @ 1:23 am
Not to mention the character "肸" is not even included in 《现代汉语通用字表》 or 《通用规范汉字表》.
Lareina said,
October 27, 2011 @ 12:39 pm
That is why a more efficient system, like a modified pinyin system, should be invented and used more often in daily life.