Homographobia


From the pages of Xin Tang, Mark Swofford has resurrected a classic piece by John DeFrancis entitled "Homographobia."  Here's Mark's post.  The entire essay may be found here.  A pdf of the whole issue of Xin Tang 6, in which John's essay appears, is available here.

This is the opening paragraph of John's essay:

Homographobia is a disorder characterized by an irrational fear of ambiguity when individual lexical items which are now distinguished graphically lose their distinctive features and become identical if written phonemically. The seriousness of the disorder appears to be in direct proportion to the increase in number of items with identical spelling that phonemic rendering might bring about. The aberration may not exist at all among people favored by writing systems that are already closely phonemic, such as Spanish and German. It exists to a mild degree among readers of a poorly phonemic (actually morphophonemic) writing system such as English, some of whom suffer anxiety reactions at the thought of the confusion that might arise if, for example, rain, rein, and reign were all written as rane. It exists in its most virulent form among those exposed to Chinese characters, which, among all the writing systems ever created, are unique in their ability to convey meaning under extreme conditions of isolation.


Since the so-called homograph problem is invariably brought up by opponents of Chinese romanization, John's demolition of this bugaboo in XT 6 is crucial for allaying the irrational fear of romanization that plagues many otherwise reasonable individuals.  In his inimitable style, John uses a combination of powerfully marshaled evidence and clever wit to allay fears of homography in romanized Chinese.  His citation of homographs and near-homographs in Hawaiian and in Vietnamese is particularly entertaining.

My rule of thumb is always this:  if homography were a problem in (more or less) phonetic scripts based on real, spoken languages, then homophony would be a problem in the speech of such languages.
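
To make the rain/rein/reign example from John's opening paragraph concrete, here is a minimal sketch in Python. It assumes a tiny hand-made table of phonemic respellings. The table, the respellings other than "rane", and the function name `group_homographs` are all hypothetical illustrations, not anything from the essay. The sketch only counts which conventional spellings would collapse into one written form; it says nothing about whether readers would find the result ambiguous in context.

```python
from collections import defaultdict

# Hypothetical toy lexicon: conventional spelling -> invented phonemic respelling.
# Only "rane" comes from the essay; the other entries are made up for contrast.
TOY_RESPELLINGS = {
    "rain": "rane",
    "rein": "rane",
    "reign": "rane",
    "cane": "cane",
    "lane": "lane",
}


def group_homographs(respellings):
    """Return the phonemic forms shared by more than one conventional spelling."""
    groups = defaultdict(list)
    for word, phonemic in respellings.items():
        groups[phonemic].append(word)
    # Keep only forms that merge two or more formerly distinct spellings.
    return {form: sorted(words) for form, words in groups.items() if len(words) > 1}


if __name__ == "__main__":
    print(group_homographs(TOY_RESPELLINGS))
```

With this toy table, the only merged form is "rane", which stands in for three spellings that are already a single form in speech.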



64 Comments

  1. Avery said,

    September 27, 2010 @ 11:43 pm

    …except that written and spoken languages can be very different beasts, and the written form tends to use the more complex and, for Japanese and Chinese, homophonically ambiguous forms of words.

    [(myl) It's certainly true that spoken and written forms are generally different, and that this difference is larger in some languages (and genres of speech or writing) than others.

    But if your argument were valid, then it should be difficult or impossible to follow someone reading aloud in Chinese or Japanese. Is this true? The existence of fairly large numbers of Chinese and Japanese audiobooks suggests that it's false. At librivox.org, Chinese, with 506 works, is the second-largest language after English. On your argument, all of these works should be essentially incomprehensible, right? And this 1988 NYT article asserts that more than 20 years ago, listening to books on tape was increasingly popular among Japanese commuters. How could this possibly be true, if "the written form tends to use the more complex and […] homophonically ambiguous forms of words", to the point that a phonologically-based writing system would be difficult or impossible to comprehend?

    This discussion is very stereotyped and predictable:

    A: If homography were a problem in phonologically-based writing for a given language, then homophony would be a problem in speech.

    B: But speech and writing are different, and Language X in particular has a written form where homophony is especially pernicious.

    A: So reading out loud is impossible to understand in Language X?

    For some reason, the people who insist on position B have never (as far as I know) done the simple experiments that would prove them right.

    Now, it's clear that reading is a highly overlearned activity, and switching systems is a pain at best. English speakers who are unsympathetic to this problem should try reading a (piece of a) novel written in IPA some time. But that's not a good reason to make empty (I think) arguments about what's psychologically (as opposed to culturally) possible and impossible.]

  2. Disfraz said,

    September 28, 2010 @ 12:04 am

    The problem isn't that the phonetic form doesn't make sense so much as that it involves teaching yourself how to read again. It's pretty well-documented that fluent readers recognise word forms, not individual letters, which is why wirtnig lkie tihs siltl mkaes smoe aomunt of snese. So, for a fluent reader of Japanese, changing 我輩は猫である to わがはいわねこである (or worse, wagahaiwanekodearu) is going to be, while not indecipherable, still a complete nuisance, because the word forms are all different. Maybe it would be worth it in the long run, but in the short term it would be terrible.

  3. Disfraz said,

    September 28, 2010 @ 12:07 am

    Pardon me, that should be 吾輩. SCIM betrays me again.

  4. Linus said,

    September 28, 2010 @ 12:07 am

    I don't know, the syllabic diversity is so comparatively low in Chinese (and especially Japanese) that I feel like if they were Romanized the incidence of homographs would be high enough to make things really ambiguous. Korean used to be written with Chinese too, but though Hangul's often cited as a "perfect" phonetic writing system, even they need to occasionally use Hanja ruby characters to set their homophones apart. So I hear.

    I mean, I read his essay, but I'm still not totally convinced – maybe I'm just one of the phobics though. There's a lot of culture in Chinese characters, I'm not sure anyone would want to lose them anyway.

  5. phspaelti said,

    September 28, 2010 @ 1:17 am

    @Disfraz: I've seen that bogus argument paraded in Japanese elementary school textbooks too. Of course you should write: wagahaiwa neko dearu, or something similar.
    Ifyouremovewordspacesinenglishpeoplecantreaditeither.

    All hiragana text (when used) is usually written with word breaks.

  6. Sinnombre said,

    September 28, 2010 @ 1:32 am

    It's true that in spoken modern standard Mandarin, people don't have trouble resolving homophones, and that written modern standard Mandarin is similar enough to the spoken version that writing it in a phonetic system should not create a problem with homographs either.

    However, the current writing system has the advantage of making older varieties more accessible. Historical linguists seem to agree that Chinese used to have a richer phonetic system, and many words that used to sound distinct are now homophones. I doubt that an audiobook version of anything written during the Warring States, or even during the Tang dynasty, would be easily comprehensible to anyone familiar with Ancient Chinese but not already familiar with the text. Also, monolingual Cantonese speakers would not have an easier time, and may have an even harder time, reading phonetically-rendered Mandarin than they do now with Mandarin written in characters.

    Some may doubt the importance of making ancient Chinese more accessible, or making non-Mandarin speaking Chinese read and write in Mandarin. The character system does, however, break down temporal and geographical barriers.

  7. fs said,

    September 28, 2010 @ 1:59 am

    Part of the beauty of, for example, the Japanese writing system is the distance it maintains from spoken Japanese. It is entirely commonplace to have two differently written words that are pronounced identically, as well as two differently pronounced words that are written identically. Consequently, a fairly complex disambiguation process occurs both when committing words to print and when reading printed words aloud. In combination with 振り仮名 (ふりがな, furigana, i.e. guides to phonetic decryption printed alongside lines in a text), this gives rise to, among other things, a rich tradition of punnery and double-meanings in Japanese printed matter, which is a delight to immerse oneself in. Indeed total conversion of Japanese to a romanized writing system would probably hinder the mundane conveyance of meaning very little, but what's the fun in that? :)

  8. Outis said,

    September 28, 2010 @ 2:17 am

    Exactly as fs suggests, in both Chinese and Japanese literature and pop culture, homophones and puns provide a rich source of word play, without which the language-culture would be very, very different. You need look no further than any corporate names or product brands in Chinese — almost every brand/corporate name conveys some kind of meaning that cannot be easily transmitted through a purely phonetic system. Of course, you can still tell people that "kekoukele" means "delicious bliss", but the impact would be so much diminished compared to the simple and immediate 可口可乐.

    Previously, VM had written about how Chinese readers use a different part of the brain from phonetic system readers. In my personal experience, when listening to Chinese/Japanese, written characters also appear in my head. It would be interesting to research whether Chinese listeners — people who are accustomed to associating sound with characters — also have different brain activity from phonetic script listeners.

  9. fs said,

    September 28, 2010 @ 2:19 am

    Just to clarify a bit – I maintain that it's not just "a sense of cultural tradition" which causes people to cling to Chinese characters. There is a very linguistically involved layer to the writing system which would be lost if romanization were adopted wholesale (in Japanese, at least). For a simple example, the word [mirM] is written in the two phonemic scripts as みる and ミル (rare) respectively, or romanized as "miru" in most romanizations. The word itself means "to view" or "to look at". But using kanji, you can disambiguate it as 見る (to simply look at something) or 観る (to view something). This is not necessarily something that can or should be done simply by context – rather the use of the "wrong one" in writing would itself insinuate some sort of meaning, and this practice is often used to great effect.

    A more extreme example would be the word とる/トル/toru, meaning approximately "to take", which can be written with kanji in a plethora of ways, such as 取る (generic, common), 採る (to pick [fruit], to take [an attitude], to adopt [a proposal], to draw [blood (samples)]), 撮る (to take [a photograph]), 捕る (to take [into custody]), 執る (to take [command of], to take [the trouble], to attend to [business]), 獲る (to take [a prize], to catch [fish]), 摂る (to ingest, to take [in]), 盗る (to take [an object belonging to someone else]), and 録る (to take [notes], to take [a recording]). In fact, I'm sure someone out there is audacious enough to coin more ways to write "toru", simply by taking some kanji X associated with some nuance of meaning they want to convey and sticking the annotation "とる" in a furigana line next to the text. As you can imagine, this creates quite the opportunity for subtlety in written material which is impossible to achieve in the spoken language.

    I suspect that my own personal variety of "homographobia" is activated not by the threat of the orthographic merging of totally unrelated lexemes, but that of closely related lexemes, which in an analysis of the spoken language might even be considered identical lexemes. From what I understand of the article, this point seems not to be addressed.

    [(myl) There were many benefits to making Latin and Greek the foundations of the European educational system. Could we realistically argue that restoring this practice would be a wise investment of thousands of hours of class and study time for each of hundreds of millions of pupils? I don't think so, as much as I value what I myself got out of being put through a partial version of this process.

    No doubt that the current Japanese writing system, which similarly requires thousands of hours of classroom and study time to master, creates all sorts of opportunities for subtle word-play and the like. Is it worth the social cost? It's not up to me to answer this question, but merely observing that the current system has some benefits is not an answer.

    Reference to various (no doubt real) cultural benefits is the usual fall-back position, after the argument about unmanageable homophony is abandoned. This argument would be more persuasive if it was coupled with a realistic assessment of the costs.]

  10. fs said,

    September 28, 2010 @ 2:23 am

    Er, excuse me – I both conducted the phonetic transcription of みる incorrectly and apparently forgot to convert X-SAMPA to IPA in my previous comment. [mirM] should be [mi4M], which in IPA is [miɾɯ].

  11. Janne said,

    September 28, 2010 @ 2:44 am

    Japanese already has alternative writing systems to kanji – three of them, in fact; people know all of them well, and two of them are already very well adapted to the Japanese language. There is also no real linguistic barrier to using either of them in place of a kanji, and people do in fact use kana in place of – or in addition to – kanji in various situations.

    There can be some social stigma in not using kanji in certain contexts, but that does not include most informal social settings such as messages between friends and the like, which also happen to be the kind of setting that's most conducive to language innovation.

    In short, if dropping kanji in favour of kana would represent a significant net gain for the users of the language, all the elements needed for such a shift are already present and have been for a long time.

    But people aren't dropping kanji. If anything, the amount and variation of kanji in regular use is increasing, not decreasing; no doubt in part due to the new-found ease of use with electronic media. The latest standard list of fundamental kanji has added a couple of hundred characters while dropping only half a dozen, and that list is largely altered in response to popular use rather than trying to drive it.

    People tend to optimize their language fairly well over time, and here kanji use is not dropping despite every opportunity to do so. It seems, in short, that the benefits you suggest for dropping kanji are not in fact there – or not strong enough to compensate for the drawbacks – for the actual users of the language.

    [(myl) Those who are already literate in the current system have an enormous individual and collective investment of cultural capital in the system that they've learned. It's definitely not in their interest to change it. (The same thing could be said about the current English writing system, which is also arguably a disaster from the point of view of the difficulties that it poses for learners, but which suits those who have mastered it very well indeed.)

    The argument is not whether current skilled users of the system would be individually better off if they (for example) increased their use of kana and decreased their use of kanji. The question is whether the enormous investment now required to become a skilled user is Worth It from the point of view of the society as a whole.

    This is a rather abstract discussion, because the real-world possibilities for reform are now essentially zero, just as in the case of the English writing system. But admitting this unfortunate fact doesn't license the conclusion that the current situation is in fact the Best of All Possible Worlds.]

  12. Disfraz said,

    September 28, 2010 @ 2:47 am

    @phspaelti:

    My argument wasn't that it's harder without spaces, though of course it is. わがはいは ねこである is still not that much easier to read, because of the word form issue. It takes slightly longer to parse わがはいは or wagahai-wa than 吾輩は, and the accumulation of those delays makes reading a whole page of kana very difficult. As VM said, it's like trying to read English written entirely in IPA: even if you're familiar with IPA, finding the meaning in the transcription (the sound is easy, obviously) is much harder than it would be with standard English.

  13. fs said,

    September 28, 2010 @ 2:52 am

    @myl: Right you are. I am paying no attention to the social cost, but am only considering this from the point of view of how "interesting" it makes the language, subjectively of course, to me, and perhaps to others who share my view. As a non-native speaker (or reader/writer, I suppose I should say) living outside the Japanese language's sphere of common usage, I have been learning Japanese purely for interest's sake and not out of any necessity to use it in daily life, which probably goes a long way towards explaining why I take this stance :)

    As Janne has pointed out above, though, I don't really have much to fear in the way of kanji disappearing from Japanese. He is absolutely right in saying that computers have revolutionized the way people think about kanji – I, for example, have never even tried to study how to write kanji (or even kana for that matter, though that's not hard), which makes it require a lot less time and effort. The ratio of the number of kanji I can read to the number I can write must be in the dozens by now, and it doesn't hinder me one bit in communication, thanks to 漢字変換-style input methods, which are ubiquitous today.

  14. Pomplemoose said,

    September 28, 2010 @ 3:34 am

    I don't really understand this argument that we should reductively examine the costs and benefits of something so inherently wrapped up in culture and history. What is our desired end state exactly? Maximized efficiency? But we could just as easily begin maximizing efficiency by eliminating Christmas, football, and art museums, surely? The beauty of being human is that each one of us has different abilities and passions, and these differences are reflected not only on an individual level but on the level of cultures, ethnicities, nation states, etc. When our traditions are directly harmful to individual liberty or intolerant of outsiders, dissenters and minorities we may wish to exert effort to change them, but with something that's obviously "expensive" but doesn't seem to be negatively affecting human welfare in any way, such as kanji or oil painting, why should we desire to alter the status quo?

    [(myl) There are many who would take issue with your "doesn't seem to be negatively affecting human welfare in any way" as it applies to the Japanese writing system. Let's ignore the fact that it forces schoolchildren to devote many years of painful memorization to an attempt to acquire a skill that (say) Finnish children routinely acquire in the first couple of months of first grade, and instead focus on one aspect of the outcome, namely the fact that adult literacy — at the level needed to read the newspaper, for example — is apparently quite low. (It's hard to know what the figures really are — see here for some discussion.) Or we could consider the effect of this writing system on educational outcomes for the children of immigrants, who are likely to become more numerous as Japan's demographic changes create pressures to bring in workers.

    It's strange to consider a national writing system as if it were a complex artistic skill like oil painting. Learning to read and write is not an optional skill that people with interest and time can choose to pursue, as part of the fulfillment of their "individual abilities and passions". It's an absolute requirement for progressing in the educational system and for getting a decent job.]

  15. Iulus said,

    September 28, 2010 @ 4:36 am

    I get where you're coming from about homography and I certainly understand your concern that "thousands of hours of classroom time" might be wasted. But, just as you're criticizing defenders of Hanzi/Kanji, I don't think you have adequate research to back up your claims. Han characters allow students to learn the morphology of a language in a way that highly phonetic scripts don't. Foreigners learning Japanese, and I would presume Chinese as well, learn to form and distinguish words with characters in a way that simply isn't possible for Ancient Greek or Latin for example. English offers similar benefits, but to a lesser extent. Yes, these systems are more daunting, but in a fully phoneticized writing system for English or Japanese, there would be no reasonable way to distinguish all the possible meanings of "こう" (92 possible Kanji/morphemes come up with Microsoft's IME!). I must say that personally, I can barely read Japanese even when it is all hiragana, because I haven't studied the Kanji enough. And I have high conversational proficiency, enough that I would feel confident in my ability to live and work in Japan.

    But of course, as you say, actual research is needed before anyone starts sanely advocating the destruction or preservation of the current Japanese or Chinese writing systems, regardless of the goodness of their intentions.

    [(myl) As I wrote in response to an earlier comment, I think that this is an "academic" (in the derogatory sense) discussion. The chances of reform (whether in Japan or in the U.S.) seem to be essentially nil, and so questions about whether the ultimate outcome would be better or worse (and for whom, and in what way) are empty. Given that, I'm also not sure that it's worth doing a lot of research on the subject. But some experiments (like the question of whether Japanese native speakers can understand formal written texts read out loud) are easy, and the predictable outcome would at least flush the silly arguments about excessive homophony.]

  16. phspaelti said,

    September 28, 2010 @ 4:39 am

    @Disfraz: "It takes slightly longer to parse わがはいは or wagahai-wa than 吾輩は, and the accumulation of those delays makes reading a whole page of kana very difficult"

    This is just nonsense. If readers use romaji every day that will of course be faster. Same for kana.

  17. Jongseong Park said,

    September 28, 2010 @ 7:24 am

    There are cases where it makes sense to distinguish homophones in spelling, namely where the homophones are the result of the surface neutralization of different underlying phonemes.

    In French, cent and sans are homophones, but in certain contexts the normally silent phonemes /t/ and /z/ surface.

    In Korean, word-final d, t, s, ss, j, and ch are all neutralized as /t/, but the distinctions reappear in certain contexts, such as before a particle starting with a vowel sound. Korean orthography respects these underlying phonemes and distinguishes these homophones in spelling.

    Linus: Korean used to be written with Chinese too, but though Hangul's often cited as a "perfect" phonetic writing system, even they need to occasionally use Hanja ruby characters to set their homophones apart. So I hear.

    Korean wasn't written with Chinese; Koreans used Classical Chinese as their literary language. It is true that there are many homophones among Sino-Korean words, but this poses no problem as one can virtually always tell from context which words are meant. In any case, not all Sino-Korean words are used equally often, so it is relatively rare to have commonly used words that are homophones.

    These days, when hanja (Chinese characters) is shown, it is probably because it is a relatively uncommon Sino-Korean term, or because the author wants to emphasize the etymology. Distinguishing homophones is not a primary reason at all. We certainly don't need hanja to distinguish homophones. By the way, most often the hanja is shown in parentheses next to the hangul spelling, so the term 'ruby' is misleading in this case.

    I don't think homophones are especially prevalent in Sino-Korean words as opposed to native ones. A native word like jida has multiple meanings and really has to be counted as multiple words that happen to be homophonous.

  18. David J. Littleboy said,

    September 28, 2010 @ 7:37 am

    The idea that there are significant numbers of Japanese who can't _read_ the newspapers strikes me as nuts. In everything I've done here over 30 years now, I've never run into anything but people doing just fine, thank you. And not just engineers and academics; that includes both music and bowling. The referenced article evaluates literacy as being able to write all 2000 or so kanji. But that's asking the question in a way that's designed to give the desired answer. No one has any trouble understanding or communicating because they miss a dot or stroke here or there; especially now that everyone has kana kanji conversion on their cell phone.

    Meanwhile, if you listen to the news, watch law/medical dramas, and the like on the tube, you'll note that the technical terms are all kanji. Having the kanji underneath functions as a sort of mnemonic system for keeping things straight: in English, technical terms tend to be Latin/Greek rooted and distinct from more informal terms; Japanese does that by technical mouthfuls being strings of kanji. When I was a kid, my father had a cough and came back from the doctor with a diagnosis that sounded very knowledgeable. Until we looked it up and found it was Latin for "bad cough". When the doctor here tells me I have 老眼 I know I'm getting old, or that I have 口内炎, I know he's winging it, without running to a Latin dictionary.

    Whatever, I'm with fs on this one. Kanji are fun and the Japanese enjoy their language. Telling them that it's a bad idea reeks of cultural imperialism. (Sorry to be in your faces here, but it really does.)

  19. Darryl Shpak said,

    September 28, 2010 @ 8:07 am

    @phspaelti: "This is just nonsense. If readers use romaji every day that will of course be faster. Same for kana."

    You can see the same effect in English. Look at an older manuscript written with letterforms that most people are no longer familiar with. It takes far longer to read: not because that style of writing is worse, just because it's unfamiliar.

    An extreme example as applied to Japanese would be someone like me that happens to know a small handful of Japanese words. Show me hai, neko, or sakura written in romaji and I'll understand it instantly, but the katakana, hiragana, or kanji forms are completely indecipherable to me.

  20. Rick said,

    September 28, 2010 @ 8:21 am

    I think it is curious no one mentions Turkish in these discussions. Turkish changed from a modified Arabic script to a roman-based script in the 1920's, with the result (so I have been told) of cutting years off the time it takes to learn to read. But today few Turks are able to read pre-1920 Turkish.

  21. George said,

    September 28, 2010 @ 9:15 am

    @Rick: "Turkish changed from a modified Arabic script to a roman-based script in the 1920's, with the result (so I have been told) of cutting years off the time it takes to learn to read."

    I would be interested to know if this is correct. If so, it might have something to do with Turkish-language adaptability to an abjad script. I find English transliterations to be more difficult to decipher than native Arabic words. But, the Arabic script, I think, works quite well for Arabic speakers writing Arabic.

    It may also have had something to do with social identity and a desire to be 'Western.'

  22. Jongseong Park said,

    September 28, 2010 @ 9:16 am

    This may be an unpopular opinion here, but linguists, with their focus on spoken language, may not always be best equipped to tackle what goes on at the level of the written language. I get the feeling that linguists often regard written language as mere straightforward visual representation of speech, with the logical conclusion that phonetic orthography is best.

    Most studies about reading that I've come across are by cognitive psychologists. We would do well to take what they have to say on this subject into account in any discussion of orthographic reform. Take the word superiority effect: we are better at recognizing whole words rather than letters. Shouldn't we then focus on making the spellings of words recognizable rather than striving for strict phonetic regularity?

    What I am worried about rather than excessive homophony is excessive heterography that obscures shared underlying forms. The single Korean morpheme dad (root of the verb 'to close') appears in forms written dad-da, dad-eul, dad-i, dad-hi, etc. If we follow pronunciation, they would be written datta, dadeul, daji, and dachi. We would then have more trouble recognizing this morpheme and it would slow down reading (not to mention the difficulties of looking it up in a dictionary). This would be quite unnecessary since the pronunciations given these underlying forms are entirely predictable. I'm not making up straw men here; various people have argued for an entirely phonetic orthography of Korean since its creation, including Syngman Rhee, the first South Korean president.

    At least we should recognize that reading is not the same process as listening. We don't sound out what we read in our mind in order to understand the meaning of the passage. There are so many factors that come into reading that have nothing to do with spoken language—the finding that lowercase text is read faster than uppercase text, for example.

    Note that this type of discussion isn't entirely academic. There are many languages that have no standardized orthography.

    [(myl) The points that you raise about morphological and lexical transparency are important ones, and need to be considered in the design of an orthography. But your characterization of the attitude of linguists toward these questions is inaccurate. No one (as far as I know) is suggesting that a phonologically-based orthography for Japanese ought to represent vowel devoicing, or that pinyin ought to be reformed to represent the lenition or deletion of word-medial consonants that regularly occurs in Mandarin Chinese.

    In the case of languages for which standard or widely-taught orthographies don't exist, this question often takes the form of worrying about whether to mark tone in cases where it changes with phrasal (or other) context. Steven Bird has argued that the right answer is "no", e.g. here and here.

    But the one idea that NEVER comes up in such discussions (I hope) is the idea that the right solution would be to invent a set of a few thousand complex logograms — or to borrow some from Chinese for the purpose.]

  23. J.W. Brewer said,

    September 28, 2010 @ 9:45 am

    With all due respect, the late Prof. DeFrancis does seem to have been rather a monomaniac on this subject. The tribute to him written on LL by Prof. Mair (http://languagelog.ldc.upenn.edu/nll/?p=1077) features the striking phrase: "Annoyed by what he considered to be the backtracking of Mao Zedong and the waffling of Zhou Enlai with regard to their earlier commitment to romanization . . ." I remember thinking at the time I first read that, that if you can look past the tens of millions of corpses and hundreds of millions of ruined lives chargeable to the most tyrannical regime in human history (ok, perhaps in absolute rather than relative terms) to be upset about their failure to adopt your own pet reform proposal, you must be, at best, the very model of the absent-minded professor.

    And of course despite its perceived orthographic handicaps, Japan has for the last century and a half pretty consistently been the most economically and technologically successful society using a non-IE language out there, despite having been bombed to rubble in the middle of that period. Not a very good marketing slogan to tell them that if only they'd gone all the way to romaji earlier and thus freed up all that extra time in elementary school for other subjects, they might by now hope to be as prosperous as the Philippines or Indonesia. (I'm leaving out Vietnam b/c of the confounding communist-brutality variable.)

  24. language hat said,

    September 28, 2010 @ 9:58 am

    It continues to amaze me that people are so invested in the current, astoundingly inefficient and socially harmful, writing systems of China and Japan that even in a context devoid of practical significance, where the person pointing out said inefficiency explicitly says that "the real-world possibilities for reform are now essentially zero," the defenders of said systems default to the standard sky-is-falling absurdities that come up every time the subject is raised. Why not just say "Yes, it's a terrible system and if it didn't already exist it would be the last one you'd want, but since there are several thousand years invested in it, we're stuck with admiring its esthetic pleasures"?

    "I remember thinking at the time I first read that, that if you can look past the tens of millions of corpses and hundreds of millions of ruined lives chargeable to the most tyrannical regime in human history (ok, perhaps in absolute rather than relative terms) to be upset about their failure to adopt your own pet reform proposal, you must be, at best, the very model of the absent-minded professor."

    With all due respect, that's the very worst sort of ad hominem. Are you seriously saying that no one is allowed to say anything about Mao and Zhou other than that they were mass murderers? ("You know, Mao said something in 1948 that is relevant to…" "COMMIE!") Please.

  25. atp said,

    September 28, 2010 @ 10:19 am

    How will Japan ever become a prosperous, modernized nation if they are held back by such a primitive writing system? Indeed, China could never hope to become the world's second largest economy when they are hobbled by those oh-so-difficult to learn characters!

    It's a good thing that there are foreigners smart enough to point out the errors of such cumbersome writing systems. No doubt when China and Japan adopt a phonetic writing system they will become as prosperous as Vietnam.

  26. John Cowan said,

    September 28, 2010 @ 10:24 am

    It's not just that Classical Chinese is difficult to read out loud, it's impossible to do so reliably. Yuen-Ren Chao's famous Lion-Eating Poet In the Stone Den demonstrates this with a text of 92 characters, each of which is pronounced shi in one of the four tones. It's perfectly readable, but spoken out loud it's gibberish. (The syllables are a little more diverse in the other Sinitic languages, but still probably gibberish.)

    If Chinese were written in a purely vernacular style, romanization would be straightforward. Unfortunately, the more elevated the style, the more excursions into Classical Chinese there are, and the less possible it is to romanize it. Chao devised a romanization called General Chinese that attempts to partially solve this problem by maintaining every phonemic distinction that any of Mandarin, Cantonese, Wu, Sino-Japanese (both go'on and kan'on readings), Sino-Korean, and Sino-Vietnamese maintains. The result looks very strange, of course: his name is "Dhyao qiuan-remm", but if you can remember the rules, it works.

    The Turkish writing reform was only a small part of the overall Turkish language reform. Ottoman Turkish was so saturated with Persian and Perso-Arabic loan words and loan syntax that ordinary Turks could not understand it even read out loud, never mind read it for themselves. Indeed, when some poets began to write using actual Turkish words and morphemes in the early part of the last century, it was taken as a portent of revolution, which indeed it was.

  27. J.W. Brewer said,

    September 28, 2010 @ 10:41 am

    Well, Prof. Mair was obviously focusing in his tribute on Prof. DeFrancis' linguistic interests and thus may have understandably omitted broader context concerning the decedent's attitudes toward the mainland Chinese government of the 1949-76 era. But it was, shall we say, not unknown for 20th century western intellectuals and academics to become enamored of brutal third-world tyrannies precisely because they might be willing to carry into effect the intellectuals' pet reform projects that would have no chance of being adopted in a democratic and/or traditional society. Maybe the rationalization is that if the eggs are being broken anyway, why not chime in with your own suggestions for the omelette?

    There's actually a good empirical question here: how many changes of writing system have, historically, been voluntarily adopted by reasonably democratically-governed language communities? Ataturk is certainly not my idea of a liberal democrat, and romanization in Indonesia/Malaysia/Vietnam/the Philippines was the product of colonialism. Maybe some post-Communist polities have de-Cyrillicized (not sure if I'm coining that word . . .) voluntarily and with broad popular support (although it's not like those Central Asian -stans are very democratic)? I've heard anecdotally that the more pro-independence factions of the Montenegrin population are less likely to use Cyrillic, with those grumpy about no longer being in the same nation-state as Serbia tending in the opposite direction, although there you already have two scripts existing side by side so it's easier for further shifts to occur in a grass-roots-driven kind of way. And maybe some spontaneous-evolution things are happening on the ground now in Singapore and Hong Kong where a very high %age of Sinophones are also Anglophones and thus fully comfortable typing/texting etc. in romanized form. But if as a matter of history and politics we know that kanji+kana is highly unlikely to be displaced in Japan except by brutal and illiberal means in pursuit of a supposedly enlightened end (and MacArthur apparently had other fish to fry), isn't spending scholarly time and energy on complaining about the status quo sort of like tweaking the details of Esperanto so it will be all ready to implement Come the Revolution?

    Overall, I continue to be struck by the interesting fact that the standard cultural taboos of modern academic linguistics (no language is inherently better than any other, there's no such thing as a primitive language, comprehensive reform proposals aimed at making a language's grammar or lexicon more "rational" are the sign of prescriptivist crackpots or authoritarians) apparently do not apply to writing systems.

  28. Jongseong Park said,

    September 28, 2010 @ 11:06 am

    J. W. Brewer: Overall, I continue to be struck by the interesting fact that the standard cultural taboos of modern academic linguistics (no language is inherently better than any other, there's no such thing as a primitive language, comprehensive reform proposals aimed at making a language's grammar or lexicon more "rational" are the sign of prescriptivist crackpots or authoritarians) apparently do not apply to writing systems.

    There is no double standard going on here.

    I have seen no objective evidence that one natural language is better than any other, etc., and I don't think academic linguists have either. It is not as if the reason for these 'taboos' is political correctness.

    On the other hand, I find it to be blindingly obvious that writing systems and orthographies are not created equally. Some are clearly more rational, efficient, and more suited to the languages they represent. I would like to hear someone argue that the complicated Chinese writing system is the best one can come up with for the language, or that current English orthography is the best one imaginable.

  29. John Roth said,

    September 28, 2010 @ 11:27 am

    @phspaelti:

    Ifyouremovewordspacesinenglishpeoplecantreaditeither.

    You obviously have not listened in on computer programmers arguing whether this form, or

    IfYouRemoveWordBreaksEnglishPeopleCantReadItEither

    or

    ifYouRemoveWordBreaksEnglishPeopleCantReadItEither

    or

    If_you_remove_word_spaces_in_english_people_cant_read_it_either

    is more readable. The debate can get vicious, especially when someone drags in 30 year old experimental evidence where the experimenters didn't allow for training effects.

    As far as I can tell, those neural networks can adapt to anything, and will stay adapted as long as the person continues using them regularly. The length of time it takes to train is probably more important.

  30. wally said,

    September 28, 2010 @ 11:58 am

    @ George and Rick

    My feeble understanding of it is that the Arabic script with its deemphasis on vowels was particularly ill suited for Turkish with its vowel harmony. And I have seen the benefit described not so much as a reduction in the time it takes to learn, but as literacy fairly quickly going from a tiny percentage to about normal for comparable countries. And yes, now only a few scholars can read the old inscriptions.

  31. J.W. Brewer said,

    September 28, 2010 @ 12:09 pm

    But English itself certainly isn't the best language imaginable, insofar as I can be irked by, oh just for example its loss (in standard form and my own dialect) of singular/plural distinctions in second person pronouns and its failure to offer contrasting sets of first person plural pronouns for inclusive and exclusive use. The background assumption is perhaps that all languages have strengths and weaknesses and somehow (through some undescribed Panglossian process) net out equally, with any improvement in one dimension of a given language somehow guaranteed to be accompanied by an offsetting loss elsewhere? And of course precision is a good thing but complexity is perhaps a bad thing yet the same feature may equally deserve both labels.

    To say something positive about DeFrancis, I do think that it is perfectly legitimate and useful to inquire into and criticize the self-serving and complacent mythology that many language communities may have about the unique wonderfulness of the way that they happen, for contingent historical reasons, to do things. Nor do I have any PC hangups about Western academics potentially contributing to the demythologization of non-Western cultures. On the other hand, DeFrancis' tack of characterizing his intellectual adversaries as suffering from a form of mental illness is perhaps a rhetorical strategy that should be employed sparingly, even if one assumes a jocose intent will be obvious to the reader.

  32. fs said,

    September 28, 2010 @ 1:29 pm

    J.W. Brewer: I don't think it's a matter of languages' "strengths" and "weaknesses" balancing out – it's just that what we perceive as a strength or a weakness in another language is wholly based upon our experience with our own native language or the languages in which we are fluent.

  33. Alex said,

    September 28, 2010 @ 2:08 pm

    When I clicked on the post, I thought "homographobia" referred to the practice of trying to "expose" gay people by analysing their handwriting. That is, homophobia+graphology. It seems to have been a surprisingly popular use of the pseudoscience of graphology. A quick googling yields "Sexual Deviations as Seen in Handwriting" by Marie Bernard 1990(!) and "The ABCs of Handwriting Analysis" by Claude Santoy.

    But I see my etymological analysis was mistaken.

  34. George said,

    September 28, 2010 @ 2:17 pm

    Wally: "My feeble understanding of it is that the Arabic script with its deemphasis on vowels was particularly ill suited for Turkish with its vowel harmony."

    This would make sense if vowel harmony is phonemic. Actually, I was wondering if Turkish morphology might cause problems with the Arabic script. It is always given as the example of an agglutinating language. (Unfortunately, I know very little about Turkish).

  35. SeanH said,

    September 28, 2010 @ 2:25 pm

    in a fully phoneticized writing system for English or Japanese, there would be no reasonable way to distinguish all the possible meanings of "こう" (92 possible Kanji/morphemes come up with Microsoft's IME!)

    With the awareness that I'm playing a scripted part in this debate, English has no reasonable way to distinguish all the 130-ish possible meanings of "run", but we do quite well nonetheless.

  36. Jongseong Park said,

    September 28, 2010 @ 2:47 pm

    SeanH: English has no reasonable way to distinguish all the 130-ish possible meanings of "run", but we do quite well nonetheless.

    Exactly. And this is also the point I wanted to make with the example of jida in Korean. A quick look in the Pyojun Gugeo Daesajeon dictionary reveals 11 separate entries for this word, with 28 meanings in total. None of them are Sino-Korean.

    Sure, there are a lot of homophones among the Sino-Korean words, but there are also a lot of homophones in native words. We may be cheating a bit since we're relying on examples of semantic expansion for 'run' and 'jida' to inflate our number of meanings, but it wouldn't surprise me if there were several cases of single Chinese morphemes acquiring several different meanings and coming to be represented by different characters in spite of the shared etymologies (the creation of the character for the feminine third person pronoun 她 comes to mind).

  37. zoetrope said,

    September 28, 2010 @ 3:36 pm

    @George: Having studied Turkish for a bit, I can't say I can see how its morphology would cause problems if it were written in Arabic script. In fact, I don't even think vowel harmony would be a big problem; it might even help readers by diminishing the amount of guesswork for words with unwritten vowels.

  38. fs said,

    September 28, 2010 @ 3:39 pm

    Jongseong Park: Or the adoption of the Latin digraph "TA" for a gender-neutral animate third-person pronoun! (renren.com, a major Chinese social networking site, uses it, I believe.)

  39. VMartin said,

    September 28, 2010 @ 3:42 pm

    Let's ignore the fact that it forces schoolchildren to devote many years of painful memorization to an attempt to acquire a skill that (say) Finnish children routinely acquire in the first couple of months of first grade, and instead focus on one aspect of the outcome….

    Once I read that there were two kinds of schools at the end of the 19th century in Central Europe: classical ones, where Greek and Latin were mandatory, and "Realschulen", which were more practically and technically focused. Yet the great physicists who founded modern physics were all educated in the classical schools (Einstein included).

    Another question is what purpose schools really serve. Aren't they also places to put children while adults are working? In that case it is better for children to learn Kanji if they can use it later. What would they learn instead? What the GDP per capita of the USA is? Or how long it took Columbus to cross the Atlantic? Or how to sum values in Excel?

  40. Faith said,

    September 28, 2010 @ 4:23 pm

    @ JW Brewer — the adoption of the Roman alphabet for Ladino took place very easily, without a central government at all, far less an autocratic one.

  41. Xmun said,

    September 28, 2010 @ 6:40 pm

    The word ought obviously to be "homographophobia", but it's been reduced by haplology, compression, contraction, syncope, syncopation, elision, or something (take your pick: and how many more words are there for this?).

  42. David J. Littleboy said,

    September 28, 2010 @ 10:32 pm

    "English has no reasonable way to distinguish all the 130-ish possible meanings of "run", but we do quite well nonetheless."

    But those 130 or so possible meanings were created with the awareness that you couldn't distinguish them in writing, so there aren't any problems distinguishing them. So that argument doesn't apply to the situation where the possible meanings were created under the assumption that they'd be distinguishable in writing.

  43. Anthony said,

    September 29, 2010 @ 1:38 am

    Part of the beauty of, for example, the Japanese writing system is the distance it maintains from spoken Japanese. It is entirely commonplace to have two differently written words that are pronounced identically, as well as two differently pronounced words that are written identically.

    I wonder how well this argument would be received, either by a native speaker of English, or someone trying to learn it:

    Part of the beauty of, for example, the English writing system is the distance it maintains from spoken English. It is entirely commonplace to have two differently written words that are pronounced identically, as well as two differently pronounced words that are written identically.

  44. xah lee said,

    September 29, 2010 @ 2:03 am

    DeFrancis appears to me like an academic idiot.

    he rants on in monotone, but his writing is unclear, complex, hard to read, and without cogency or humour.

    in a piece on Chinese, why didn't he throw in some real chinese characters to illustrate the point? I have a hard time reading his pinyin adorned with english explanations. Is the target of this piece non-chinese-speaking plebeians??

    in what seems to be deep research, why didn't he actually illustrate examples or cite statistics regarding chinese homographs and homophones? instead, he borrows hawaiian and vietnamese to prove a point, by analogy??

    quote:
    «The relative simplicity of Chinese will become even greater if, as many advocate, tone indication is used only when necessary to avoid ambiguity. According to Yin Binyong (personal communication 2/7/85), tests made on written materials indicate that Chinese needs to add one of its four tone marks only on one word (cir) in twenty. According to my own count, French …»

    why didn't he actually give hard points instead of waving hands with his “personal communications” friends?

    quote:
    «A rational approach along the lines indicated above will doubtless confirm the conclusion reached by Chao (1959: 10) that Chinese as a whole is “neither much more nor much less ambiguous than most other languages.” It would logically seem to follow from this that a phonemic writing system for Chinese on the whole would also be neither much more nor much less ambiguous than other phonemic systems of writing such as English, Spanish, German, and Russian. In other words, it seems to be an elementary truism that a Pinyin orthography that is truly based on speech (of course at various levels), and that is provided with a minimum number of judiciously determined special spellings to avoid attested occurrences of unacceptable ambiguity in realistic contexts, can function as a simple and practical orthography for Chinese. The implementation of such an orthography appears to offer the best possibility for curing all but the completely hopeless cases of homographobia.»

    The above conclusion is so ridiculous that it just won't fly for any native chinese living in taiwan or china who was not born an imbecile.

    when was this piece written? where was it published? what's the audience, the context?

  45. ahkow said,

    September 29, 2010 @ 2:54 am

    So what does the name of the journal Xin Tang mean? New Soup? Letter Sugar? New Sugar? New Sweet? New Tang? Heart Sweet? Heart Hot? Heart Lie [Down]? Letter Hot? New Pond? New Hall? Heart Hall?

    Without the English name I wouldn't be able to figure it out precisely (New China = Xin Tang (as in the dynasty)). Some of the options are wacky, of course, but others are plausible (eg. Heart Pond, Heart Hall, for literary magazines).

  46. Guyllaume said,

    September 29, 2010 @ 3:18 am

    Romanizing the Chinese language would be an extremely foolish idea. It leads to far too much ambiguity in the language. Yes, I agree that for most uses, romanizing everyday Mandarin sentences would not be too onerous, but I can think of numerous situations in which that would not be the case:

    1. What happens to personal and place names? It's impossible to tell from romanization the meaning behind someone's name.

    2. The less vernacular the style of writing, the harder romanized Chinese is to understand. If you switch to romanization, classical Chinese (or even vernacular Chinese with large elements of classical writing) becomes very difficult to read.

    3. What about Chinese idioms (e.g. the 4 character sayings)? Chinese idioms are hard enough to remember as it is…the characters often provide the reader with a clue to the meaning.

    4. What about Chinese puns ("shuang guan yu")? Chinese puns often only work because Chinese is such a visual language i.e. you use 2 characters with different meanings but the same sound. Chinese visual puns are a huge component of the written language and give the written language much of its humour. If you romanize, you take that away.

    5. One of the problems with characters is that people often forget how to write them. But no one forgets how to read them! I am ok with using pinyin for computer input (makes life easier). But it would be incredibly inefficient to print entire books in pinyin – the problem isn't reading characters, it's writing them!

  47. xah lee said,

    September 30, 2010 @ 2:48 am

    more comment… wrote a whole blog of it here
    • 〈John DeFrancis Idiot on Chinese Language〉
    http://xahlee.org/Periodic_dosage_dir/bangu/John_DeFrancis_idiot_on_Chinese.html

    here's extra info not from my previous comment.

    About Pinyin

    Note: the journal Xin Tang, where DeFrancis's article was published, seems to be a pro-pinyin journal. All articles in it are in pinyin. And the name Xin Tang (新唐) means New China. The journal is published in the USA. Possibly it's a politically backed publication.

    The idea of, and desire for, changing the Chinese writing system to a Latin alphabet system was common in the early 1900s among scholars and educators. Remember, China was devastated by World War 2 with Japan and by the civil war between its communist party and the nationalist party Kuomintang (國民黨). The push for modernization was strong. (See: 花样的年华 (Age of Blossom))

    The People's Republic of China officially introduced Pinyin in 1958, and Simplified Chinese characters were also introduced around the same time.

    Pinyin was primarily a system for annotating pronunciation, but with the thought of using it to completely replace the character system. That never happened. Instead, simplified characters remain the writing system. Pinyin is used just as pronunciation symbols, mostly in education, and it is also widely used to input Chinese on computers today.

    Here's some quote from Wikipedia:…

    Can Pinyin Practically Replace Chinese Characters

    It is an interesting question: to what degree would the ambiguity of increased homographs in pinyin affect pinyin as the sole writing system for Chinese? Another way of asking the same question is: if all Chinese characters of the same sound were replaced by identical characters, how would it affect the effectiveness of written communication?

    Note that puns based on homophones happen daily in Chinese newspaper articles, and puns in shop names are frequent too. And person names and street names rely heavily on different characters to differentiate them. If we imagine that Chinese chars magically disappear in China and Taiwan and are replaced by pinyin, it will have a major impact on Chinese culture.

    If you simply show a page of pinyin to a Chinese reader, what can be read in 20 seconds might now take 2 minutes. This is mostly due to unfamiliarity. But suppose Chinese readers grew up with pinyin as the writing system: what would be affected or changed due to the increased homographs? (without spending time on this, i'd guess the number of homographs would increase perhaps at least 100-fold.)

    DeFrancis's article doesn't provide any technical info, but only tries to mock the questioner.

    Today, the China modernization crisis is gone, and the worry about the difficulty of Chinese characters for computer processing is also a thing of the past. There is no desire in China or Taiwan to replace the Chinese writing system with an alphabet. However, pinyin as a writing system is still an interesting technical question about the Chinese language.

    Japan also has the same problem. The Japanese writing system is based on phonetic alphabets and Chinese characters (called kanji). Japan also had the desire to eliminate the cumbersome Chinese characters, facing pretty much the same problem as alphabetizing Chinese. However, Japan clearly has not eliminated Chinese characters.

    Similar situations arise in Korean and Vietnamese. One would be interested to know how North Korea and Vietnam solved the problem, even though these are completely different languages. For example, how large was the increase in homographs in these languages? DeFrancis provides no info on this whatsoever.

  48. Jongseong Park said,

    September 30, 2010 @ 8:28 am

    @xah lee: You may want to look at my previous comments about Korean. You can certainly find many cases where Chinese characters would disambiguate homophones in Korean. But I don't for a moment buy that this means that we should revert to using Chinese. Literary Korean is reasonably close to the spoken language, and homophony is not a problem in spoken Korean.

    Remember, Koreans wrote in Classical Chinese in the pre-modern period. The notion that Chinese characters were used to write Korean is false; this is no more possible than writing English with Chinese characters. Korean could only be adequately recorded after the invention of the Korean alphabet. Nevertheless, words of Literary Chinese origin (Sino-Korean words) were often written with Chinese characters in many texts otherwise written with the Korean alphabet. There were varying degrees of this; in extreme styles, all Sino-Korean words, even Sino-Korean elements in compound words with native elements, were written in Chinese characters.

    Such texts would literally be unreadable for most younger Koreans today, as no one except specialized scholars reads or writes in Literary Chinese any more (much as a general English-speaking audience can't be expected to read the Greek alphabet any more). North Korea banned Chinese characters from the beginning; in South Korea, the 'mixed style' persisted for a few decades, but Chinese characters became used less and less. Nowadays, most texts produced in Korean don't use Chinese characters at all; you would probably be more likely to find the Latin alphabet mixed into Korean texts. At this point, the continued use of Chinese characters in writing Korean is mainly for prestige reasons.

    The main arguments for reintroducing Chinese characters centre around maintaining the connection with tradition (it is a shame after all that younger Koreans can't read books written mere decades ago because of the Chinese characters) and with neighbouring cultures (although China now uses simplified characters, complicating this argument somewhat). The disambiguation of some homophones is a bonus, but is not essential in any way; tens of millions of Koreans who now write without using Chinese characters can tell only too well from experience that homographs or not, they are getting along perfectly fine using only the Korean alphabet.

    You were curious how Korean solved the problem of increased homographs. The answer, as I suggested in previous comments, is that this was not a problem to begin with, any more than using the same graphemes 'homo' for the elements in homograph (from Greek, 'same') and in Homo sapiens (from Latin, 'man') is a problem of English orthography (which, let's face it, has far more important things to worry about).

  49. Greg Morrow said,

    September 30, 2010 @ 3:46 pm

    Guyllaume, I might suggest reading the rest of the thread. As Professor Liberman points out more than once, if there is an ambiguity problem in a quasi-phonetic writing system, there is an ambiguity problem in the spoken language. By inspection, a living spoken language cannot have an ambiguity problem: language features and circumlocutions will rapidly evolve to get around the problem, because a spoken language has the single-minded goal of effective communication.

    Accordingly, if a writing system introduces ambiguity, it is because the written language it is attempting to represent has in fact diverged substantially from the spoken language. This happens all the time: written Latin versus spoken Italian in the era before Dante; Bokmål versus Nynorsk in Norwegian (IIUC), etc. But you have to understand that that's what's happening before you declare the problem intractable.

    So it is entirely likely that a quasi-phonetic writing system suitable for how those crazy kids in Beijing talk today will be incompatible with the current standard written language because the current standard written language was designed at a much earlier stage of evolution of the spoken language.

    Writing systems can have different goals with different tradeoffs — if the goal is "represent how folks talk today", that's one thing; if it's "maximize access to classical writing", that's another.

  50. Jongseong Park said,

    September 30, 2010 @ 4:47 pm

    Case in point: The poem "Shī Shì shí shī shǐ" 《施氏食獅史》 in Classical Chinese. It consists entirely of syllables that are pronounced as 'shi' in different tones in the Standard Mandarin readings of the Chinese characters. This kind of ambiguity arises precisely because Classical Chinese has not been a living spoken language for over a thousand years.

    Bokmål vs Nynorsk is maybe not the best example; both are artificial written standards and neither is based on a single spoken form. It's just that Bokmål sticks close to literary Danish while Nynorsk is a distillation of elements from different dialects intentionally chosen to steer clear of literary Danish. I don't really know if you can say that Nynorsk is significantly closer to the way people speak than Bokmål.

  51. Guyllaume said,

    October 1, 2010 @ 4:38 am

    Hmm. Well yes there are ambiguities which are due to the use of classical elements in Chinese writing which do not reflect vernacular speech. But there are also plenty of elements of ambiguity in Chinese which only occur because of homophony in the spoken language, not because of any particular divergence between spoken and written Chinese.

    I do agree though that there are language features used conventionally in spoken Chinese which are used to clarify the ambiguity. The most common are (a) describing the shape of a character and (b) using the character in a well-known phrase or multi-character word. These conventions are never used in Chinese writing, as of course the ambiguity cannot occur when reading Chinese characters.

    Are there any other languages out there which have the same excessive degree of homophony as Mandarin?

  52. Jongseong Park said,

    October 1, 2010 @ 9:12 am

    Spoken language adapts to homophony. If homophonous words consistently create ambiguities in communication, then they are avoided using numerous strategies. An amusing example concerns those speakers of English in the US for whom 'pen' and 'pin' have become homophones through sound change—they use the term 'ink-pen' for what most speakers of English would simply refer to as a pen.

    If spoken Chinese has to clarify ambiguities using conventions never used in written Chinese, then this is by definition a divergence between spoken and written Chinese. The homophony in the spoken language in this case arises from importing the register of written language into spoken language, i.e. speaking as one would write, not as one would speak naturally.

  53. Greg Morrow said,

    October 1, 2010 @ 10:55 am

    The development of "snow-ski" to clarify a distinction with water-skiing is another example of the spoken language adapting to work around ambiguity. (You could imagine the word-peevers decrying the "redundancy", but they would be wrong to do so.)

  54. xah lee said,

    October 1, 2010 @ 7:21 pm

    @Jongseong Park. Thanks for the valuable info on Korean.

    i'd like to add a few points to this discussion, parts of which follow Guyllaume's sentiment/comment right above this post.

    • Korean, Japanese, Chinese, Vietnamese are 4 entirely independent langs sharing no roots. So the degree to which each of these langs has chinese chars as part of its writing system, and how it may have gotten rid of them, has very little to do with whether chinese can adopt pinyin as its sole writing system.

    • in daily talking of chinese, it is quite common, a few times a day or at least once every few days, that we ask the other person what is meant because of a homophone. (this especially happens with person names. When introducing ourselves, we often verbally indicate the chars used in our names)

    • if chars with the same sound get replaced by the same spelling, as in pinyin, what's a good estimate for the increase in homographs? my guess from experience of using a dict is that it's at least 100 fold, and probably a few hundred. (one can just look up a dict. In an index, chars are ordered by sound. For each sound, there are anywhere from 5 to 20 or more chars. The same can be done by trying a basic pinyin input system.) This figure shouldn't be hard to come by… has anyone seen it? One can easily get a corpus from online sites, and rather trivial scripting can obtain the answer.

    • written lang doesn't have body language to help with context. Context here includes who is talking, how many are in the conversation (1 on 1, friend group, lecture), the relationship of the talkers, the situation and subject of the talk, timing (pauses, speed variation in an utterance), intonation etc. So, let's say the problem of homophones in chinese speech is rated at 1. I estimate that these contexts and real-time interaction get rid of the problems of homophones by say 99%. So, writing with the same number of what are now homographs would increase the problem 99 fold.

    If daily speech already uses extra talk to get around problems with homophones, the written one can't do that. I'm not sure it makes sense to claim that this is caused by writing and speech having diverged. Because otherwise there's no lang whose speech and written text haven't diverged.

    • side question: is there a study somewhere on which lang has what percentage of homophones by statistical analysis of transcribed recorded speech? (am mostly interested in this for japanese, korean, chinese, due to a shared characteristic of their one-sound-per-char pronunciation as compared to say french, spanish, english.)

    PS am chinese, grew up in taiwan till 14. Lived mostly in north america since.
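
    The "rather trivial scripting" mentioned above might look something like the following minimal sketch. It groups characters by their pinyin spelling and reports how many collapse onto each one. The `SAMPLE_READINGS` table is a tiny hand-picked illustration, not a corpus or a dictionary, and the function names are hypothetical; a real estimate would need a full reading table and word-level (not just character-level) counts.

```python
from collections import defaultdict

# Tiny hand-picked sample (an assumption for illustration only, not a corpus).
# Each character is mapped to one common Mandarin reading in pinyin.
SAMPLE_READINGS = {
    "香": "xiāng",
    "湘": "xiāng",
    "箱": "xiāng",
    "乡": "xiāng",
    "雨": "yǔ",
    "语": "yǔ",
    "羽": "yǔ",
    "人": "rén",
}


def group_by_spelling(readings):
    """Return a dict mapping each spelling to the list of items written that way."""
    groups = defaultdict(list)
    for item, spelling in readings.items():
        groups[spelling].append(item)
    return dict(groups)


def mean_group_size(groups):
    """Average number of items sharing one spelling."""
    return sum(len(items) for items in groups.values()) / len(groups)


groups = group_by_spelling(SAMPLE_READINGS)

# Print the spellings with the most characters first.
for spelling, items in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    print(spelling, len(items), "".join(items))

print("mean characters per spelling:", round(mean_group_size(groups), 2))
```

    Run over whole words from a segmented corpus rather than single characters, the same grouping would give a number closer to what a reader of romanized text would actually face, since most modern words are more than one syllable long.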

  55. Jongseong Park said,

    October 2, 2010 @ 11:14 am

    • side question: is there a study somewhere on which lang has what percentage of homophones by statistical analysis of transcribed recorded speech? (am mostly interested in this for japanese, korean, chinese, due to a shared characteristic of their one-sound-per-char pronunciation as compared to say french, spanish, english.)

    You might be interested in "A cross-linguistic quantitative study of homophony" by Jinyun Ke, though it mostly concerns various Chinese dialects.

    I'm not sure where you are getting "one-sound-per-char" for Japanese and Korean, or what you mean exactly. A single Chinese character has multiple possible readings in Japanese, so there is a bit of an opposite problem of homography of different sounds. Korean would only use Chinese characters for Sino-Korean vocabulary, a limited and morphologically constrained part of Korean vocabulary. Many native Japanese and Korean morphemes are polysyllabic and show complex morphological alternation (conjugation, ablaut, etc.). This contrasts with Chinese where at the basic level most morphemes are monosyllabic, even though most words are disyllabic today. Each Chinese character represents a morpheme 90% of the time, but only a small portion of single syllables used in Japanese or Korean would be morphemes by themselves.

    I just don't see what characteristics of Japanese and Korean that would be relevant to homophony are shared with Chinese but not with French, Spanish, and English.

  56. Martin Ellison said,

    October 3, 2010 @ 3:43 am

    re John Cowan: is there any information on Chao's General Chinese romanisation beyond the brief description in Wikipedia? Preferably in English.

    One advantage of Chinese characters is that the student does not need to learn how to spell. Think of all the time that could be saved if students did not need to learn the correct alphabetic encoding of each word.

  57. Ellen K. said,

    October 3, 2010 @ 5:27 pm

    And how is having to learn how to write characters an advantage over having to learn how to spell?

    Even in English, I can't see that Chinese has the advantage. And many languages are much easier to learn to spell than English.

    Plus, in learning to spell, we don't have to learn the correct alphabetic encoding of each word. We can guess for words we don't know. How likely one is to get the correct spelling depends on the language, and one's spoken dialect and accent, still, one will usually get close enough to be understood. Is there any chance of guessing the right character for a Chinese word, at least closely enough to be understood?

  58. Ellen K. said,

    October 3, 2010 @ 5:28 pm

    Or was that supposed to be an ironic comment?

  59. David J. Littleboy said,

    October 4, 2010 @ 12:21 am

    Probably ironic. But I remember reading in print somewhere that the number of dizzy spellings in English is roughly the same order of magnitude as the number of characters you have to learn for reasonable literacy in Japanese or Chinese (2,000 (J) to 5,000 (C), but I suspect this is an overestimate).

    Whatever, just as I couldn't survive without spell checking, most Japanese couldn't survive without kana-kanji conversion. I remember reading an essay by one of my favorite novelists in which she said "Writing a letter without checking a dictionary to get the kanji correct is as rude to your readers as it is for a woman to go out in public without makeup." Now if this were our CEO (who dabs on a bit of lipstick hurriedly for only the most formal of occasions) it would have been sarcasm…

  60. C said,

    October 5, 2010 @ 4:48 pm

    Every time this issue is raised here, pro-romanization people go on and on about how inefficient it is for schoolchildren (and foreign language learners) to have to learn all these thousands of characters. The implication is that if only they didn't have to bother with those, they would be able to use their time much more efficiently and learn something more useful.

    I cannot think of any area of knowledge that Japanese and Chinese children are lacking in because of the extra time required to learn to read and write their languages. If the students of these countries were falling behind the rest of the world and failing to meet certain academic standards, then the inefficiency argument would hold more weight. As it is, the most they miss out on is some extra free time that would otherwise be spent watching television or playing video games.

    Seriously, it's not like they're cutting math or science classes in order to squeeze in character learning. The only reason it takes "years and years" to teach children the characters is because they are simultaneously learning the vocabulary words (and accompanying complex concepts) that are built with these symbols. An adult can certainly learn all the daily use characters in Japanese in less than a year.

    Also, do any of you really think that no one in Japan (or China) can read anything before having finished officially learning the characters? In Japanese at least, it is quite easy for children to learn to read new characters just from repeatedly seeing them accompanied by ruby characters, in manga for instance. Just because school instruction is slow and spread out over many years does not mean that all children spend that much time learning to read and write. It's the same as in the West: children who are avid readers quickly learn more vocabulary (and characters) than their peers.

    At any rate, it takes a long time to learn to read and write properly, regardless of the language being taught. Someone said that children in Finland only need a few months of instruction before they are able to read and write, but obviously those children are not reading or writing at the same level that a high school student or college graduate could. Sure, Chinese children can't read everything they encounter after only a few months of first grade, but neither can the children of other countries.

    Whenever this topic comes up, it is claimed that illiteracy is high in China and Japan, and there are plenty of anecdotal examples of smart people forgetting how to write a character. No one forgets how to read them, though. (Except in South Korea, where they don't have to use them much after school.) If someone is born into the right economic and social position, the "opacity and difficulty" of the Chinese script is simply not a barrier to literacy. If someone is born into the wrong economic and social class, literacy is difficult to achieve no matter how easy the writing system is, as evidenced by the many illiterate people in Spanish-speaking countries who are not helped by the advantage of having a very simple phonetic writing system.

    As long as China and Japan continue to be strong economically and technologically, and continue to educate their students to meet international standards, the arguments about opacity, inefficiency, and barriers to literacy hold no weight. If having a cumbersome writing system was honestly such a big burden, there is no way that a country using it could be successful in modern times.

    I don't like being an apologist for these writing systems, since I myself do see problems with them (at least with Chinese; I don't see any problems with Japanese, since the number of characters required for literacy is simply not that many). However, you guys should really come up with some better arguments than "it's hard" or hypothetical workarounds involving "starting from scratch" that don't deal with the messy cultural implications. Assuming that eradicating some sort of fear of homonyms would make it simpler to scrap an entire system is hilarious. Most of the arguments I hear sound like language students whining about how much effort they have to put in when learning a language that uses Chinese characters, and bitching about how everything would be so much simpler and rosier if only those backward Chinese could see the brilliant clarity and logic of a romanized system.

    It's weird how linguists can be so enthusiastic about saving small, rare languages, yet at the same time be so eager to drive an old, widely used writing system extinct. Reading and writing are more complex and involved than simply transferring sounds back and forth from a piece of paper, and even though there are simpler ways of writing Chinese without characters, that doesn't automatically make it a good idea to switch to one of those. I don't buy the idea that advocates of romanization are simply concerned about literacy and efficiency. The way they paint everyone who disagrees with them as irrational or romanticizing the East is suspect. It just makes them sound bitter about the fact that their clearly superior, simple, logical, and efficient system is never going to catch on.

    Come on guys, you love diversity in languages and grammatical systems, why are you advocating for uniformity in writing systems? If the problems with Chinese characters were really that serious and actually had an impact on the ability of a country to compete at the worldwide level, or the ability of citizens to express themselves and produce creative works, then your system would already be in use. Technological advancements have decimated most of the previous arguments for romanization, and the attempts to make good new arguments are falling short.

    Sorry to go off on a tangent, but I lose a bit of respect for Victor Mair every time he speaks about romanization, especially in posts like "English and Science in China and Japan." The fact that scientists communicate in English, regardless of what writing system their native languages use, should not have been "mind-boggling" to him. That he thinks that this would not be happening if Asian languages had switched to romanized systems is laughable.

    "Thirty years ago, I predicted that all of this (the rapid shift to English) would happen IF East Asian countries did not aggressively expand the applications of Romanization for their own languages. To my mind at the time, this was simply a foregone conclusion due to the archaic nature of sinographic writing and the relatively inflexible phonetic representational ability of syllabic writing in comparison with alphabetic scripts."

    Seriously? Does he really think that the use of English in science is in any way unique to East Asia, or actually has anything to do with the writing system? Surely a person as smart as him should be aware of the fact that science is an intensely collaborative field, and publishing things in English is entirely about ensuring that the material is understood by as many people as possible, not a sign of the innate inferiority of the scientists' native language and writing system. Posts like this show an intensely filtered view of Chinese characters; presumably if he had been shown an Italian scientific journal published in English he would not have come to the conclusion that it was the fault of an archaic writing system or language. Linking every example of English use in East Asia to a failure to romanize is strange and short-sighted.

  61. john riemann soong said,

    October 5, 2010 @ 9:00 pm

    the sheer amount of skepticism here that Chinese could /function/ with a romanised writing system is appalling.

    no one is suggesting that the Chinese writing system should be replaced. but note that katakana essentially uses sort of the same thing, and it's getting very popular.

    I as a Han Chinese thoroughly support DeFrancis in his essay. It's a disorder: people simply don't THINK about it!

    The Chinese Department in my school insists on a character-heavy treatment to first-year Chinese students, when what they need to work on is their SPOKEN foundation. Speech is the foundation of language — not writing. Writing systems die out without a spoken counterpart; on the other hand, spoken counterparts do . Writing systems are babies, nursing on the mothers that are the spoken languages.

    body language? body language as I observe with Chinese, isn't a big factor in resolving homophony. piecing out the puzzle of homophony is an automatic task by our brains based on syntactic and grammatical cues. In fact, that's pretty much what GRAMMAR is for — providing redundancy to resolve ambiguity.

    Pullum (and other writers here) have written on the irrational fear of ambiguity time and time again guys. If homophony were such a problem without body language, this would in fact be an extremely unacceptable situation to most children — who would promptly change it. I think the 99% resolution rate cited for body language is extremely generous. Most of the time the ambiguity is resolved through grammar.

  62. C said,

    October 8, 2010 @ 8:47 pm

    @john riemann soong

    "the sheer amount of skepticism here that Chinese could /function/ with a romanised writing system is appalling."

    I don't doubt it could function that way, and neither do many other people who don't like the idea of romanizing everything. It would be functional and people could understand each other, but some subtleties of language, especially written language, would certainly be lost. We don't think that's worth the supposed benefits of no longer using characters.

    "no one is suggesting that the Chinese writing system should be replaced"

    Except for all the pro-romanization people promoting exactly that. That's the main thing many people get upset at, which makes them resort to silly arguments about homophones. Seriously, I have no idea what drives so many people to enthusiastically support extinguishing an entire unique writing system, and think they are being rational. At any rate, the homophony argument isn't much sillier than the pro-romanization argument that schoolchildren are wasting time on characters that could be better spent doing… well, something.

    "Speech is the foundation of language — not writing. Writing systems die out without a spoken counterpart; on the other hand, spoken counterparts do . Writing systems are babies, nursing on the mothers that are the spoken languages."

    Writing influences spoken language greatly, especially when new words are created by combining characters, particularly in Japanese. Also, Chinese characters are no longer tied to one specific spoken language. This is not a writing system that is in any danger of dying out, so it's strange to have a desire to actively abolish it. You may not be one of the people who wants to do that, but many pro-romanization people would be delighted if they could make that happen.

  63. Michael LC said,

    November 22, 2010 @ 5:10 pm

    Suppose in English two, too, and to were merged into TO. Would an article titled "To Hammer Candles" still make sense? Does it mean to pound candles with a hammer, or that two hammers have formed in a stock chart?

    [(myl) What would it mean if you spoke it?]

  64. Michael LC said,

    November 22, 2010 @ 6:04 pm

    "I don't doubt it could function that way, and neither do many other people who don't like the idea of romanizing everything. It would be functional and people could understand each other, but some subtleties of language, especially written language, would certainly be lost. We don't think that's worth the supposed benefits of no longer using characters."

    I don't doubt it either. However it would lead to changes in product labels. For example, a product my parents bought would in pinyin be "xiāng liánróng." Is that fragrant lotus paste or Hunan lotus paste or some other kind of lotus paste? Turns out it is Hunan lotus paste. Obviously the product label would have to be changed if it had to be labeled in pinyin. As it is, it's too ambiguous if written in pinyin.
