The impact of phonetic inputting on Chinese languages


The vast majority of people, both inside and outside of China, input characters on cell phones, computers, and other electronic devices via Hanyu Pinyin or other phonetic script.  Naturally, this has had a huge impact on the relationship between users of the Chinese script and their command of the characters, since they are no longer directly writing the characters through neuro-muscular coordination and effort.  Instead, their electronic devices do the writing of the characters for them by converting the Pinyin or other phonetic inputting to the desired characters, resulting in the widely lamented phenomenon of "character amnesia", which we have touched upon in dozens of LL posts.

There has in recent years been a lot of stuff and nonsense bandied about concerning how Chinese character inputting led to the development of predictive typing, whereas the actuality is that the extreme cumbersomeness of the Chinese writing system necessitated the development of one kind of predictive typing (other predictive algorithms were already in use long before) to rescue the characters from hasty extinction.

Fatuous, sensationalistic claims for the superiority of Sinographs over the alphabet are refuted by more sober analyses, such as that presented in the following article:

"How the QWERTY Keyboard Is Changing the Chinese Language", by Chip Chick (3/13/12)

Even though it was written seven years ago, the points this article makes are still relevant and valid today, especially because they were obscured by the thick smokescreen of misinformation about how technologically advanced Chinese characters allegedly are that was being purveyed during the intervening years.  (In the not too distant future, David Moser will write a guest post exposing the fallacy of the assertion that Chinese characters were the source of predictive typing.)

Starting with the first two paragraphs, here are some key portions of the well-informed, balanced Chip Chick article:

Think about how easy and natural it is to type an English word. Writing a word down on paper and typing a word on a computer are done using the same process – letter by letter. Made a mistake? Pop back and fix the spelling error. It’s easy to overlook how simple the whole process is, because it is done so naturally. But, users of the Roman alphabet have it easy in the world of computers. Things don’t go as smoothly in other languages around the world, which have had to adapt to the now-ubiquitous QWERTY keyboard.

…Estimates say a Chinese speaker needs to know 2,000 different characters to achieve basic literacy, with well-educated speakers knowing upwards of 5,000. 5,000 characters aren’t going to fit onto a keyboard – at least, not one anyone will want to use.


Yang Yaju, a Chinese language teacher at National Sun Yat-Sen University, remarked, “For me, typing English is much faster than Chinese…Besides, computer software can check and correct English easily, but it doesn’t work as well in Chinese.” Chinese computing is in danger of failing to keep up with the needs of its users.

So, what does the future hold? Predictive algorithms are improving, cutting down on the need for selection boxes in phonetic input methods. They aren’t perfect, but they are faster. The attentive typist can spot and fix typos fairly quickly. Still, that leaves the most common methods of Chinese computing completely lacking in the richness and complexity of the natural written language. The practicality issue is being addressed, but the cultural problem of character amnesia persists.

The article pays due regard to shape-based entry methods such as Cangjie and assesses touchscreens and trackpads that enable users to write characters with a fingertip or a stylus, but in the end it squarely confronts the inescapable drawbacks of such complicated, user-unfriendly systems in comparison with the ease and efficiency of phonetic inputting (the article naturally focuses on the 26 letters of the alphabet, but also gives an adequate accounting of bopomofo [Mandarin phonetic symbols]).  As the author puts it daintily, shape-based, component composition inputting requires "a steep learning curve".  Beyond that, such methods are slower, more frustrating, and result in a disappointingly large number of mistakes.  These are things I have witnessed repeatedly and have often written about on Language Log.

Some people just dispense with the conversion from Pinyin to characters and communicate directly with Pinyin.  You don't even need tones if you divide up the syllable strings into words (as we've learned in recent posts) according to grammatically rational orthographical rules.  Here's an example:

Anthony Tao
Zhongguo xuezhe David Moser cengjing shuo guo, women wanquan keyi zhi yong pinyin lai goutong (zui qima jiandan de zaixian jiaoliu). Huan yi ju hua shuo, shei ye dou bu xuyao Zhongguo wenzi!
Dangran, wenzi geng haokan, geng you wenhua.

I read that as easily and flawlessly as I read English.
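
For readers who want to turn tone-marked Pinyin into the toneless kind shown in the sample above, here is a minimal Python sketch. It is an illustration only: the function name `strip_tones` is hypothetical, and the tone-marked input is a re-marking of the last line of the sample, supplied here for demonstration. It assumes the text uses the standard four tone diacritics and that the diaeresis on ü should be kept, since ü and u are different vowels.

```python
import unicodedata

# Combining marks used for the four Pinyin tones:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(text: str) -> str:
    """Remove Pinyin tone marks but keep the diaeresis on ü."""
    # Decompose each letter into base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop only the tone marks; the diaeresis (U+0308) is left alone.
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose so that u + diaeresis becomes a single ü again.
    return unicodedata.normalize("NFC", kept)


print(strip_tones("Dāngrán, wénzì gèng hǎokàn, gèng yǒu wénhuà."))
```

The word spacing is untouched, so the result keeps the word division that makes toneless Pinyin readable in the first place.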


Selected readings

etc. (this is just a tiny sampling of the scores of LL posts relevant to Chinese character inputting)

[Thanks to Shaun Lim]


  1. Tom davidson said,

    December 9, 2019 @ 10:14 am

    Would like to see comments on the impact that character amnesia will have on brush pen calligraphers and creators of Chinese paintings who write in 楷, 隷, 行, or 草. Anyone?

  2. John Rohsenow said,

    December 9, 2019 @ 2:14 pm

    About 20-25 yrs ago when we still had large clunky terminals on our desks and no easy to use Chinese character input systems, I discovered my TA from the PRC in the office typing away. Turns out she was writing to her husband back in China using only HYPY (w/o tones). Seemed to work just fine for them. :-)

  3. Luke said,

    December 9, 2019 @ 2:34 pm

    Although writing/typing out "Dangran, wenzi geng haokan, geng you wenhua" is much more practical than writing "當然,文字更好看,更有文化", I fail to see the abolition of Chinese characters happening anytime soon due to the cultural implications of replacing a native writing system that has gone through continuous evolution for thousands of years with a phonetic system that may be seen as appeasing to the west (especially as Pinyin, based on the Latin alphabet, supplanted the use of Zhuyin, a native phonetic system, in mainland China).

    Yes, obviously Simplified characters are in many instances different from Traditional characters, but both of them carry much more cultural heritage than Pinyin.

  4. Not a naive speaker said,

    December 9, 2019 @ 4:35 pm

    I just tried to use "Google Translate" to translate the above example in pinyin. The input was (as expected) detected as Chinese. The English translation was the same text as the pinyin text. After clicking the arrows two times came some partially English text:

    Chinese scholar James Moser once said that women's Wanquan could go back and forth with pinyin (zui qima jiandan de zaixian jialiu). Huanju gathered flowers and said, Xie Duya is not Chinese language!

    Dang Ran, Wenzi Geng Haokan, Geng Youwenhua.

    Google has to do their homeworks.

  5. sicherhalten said,

    December 9, 2019 @ 4:44 pm

    "I read that as easily and flawlessly as I read English."

    I guess this sentence is just odd. Why wouldn't you? Isn't that what you are used to? I'm not Chinese, but I would have thought that someone from China would be used to reading block by block.

  6. Victor Mair said,

    December 9, 2019 @ 5:14 pm


    @Luke:

    Although many people (mostly Chinese), some of them highly respected and influential, did call for the abolition of characters during the last century, I have not been hearing many such calls during recent decades, and I don't think anyone is or was trying to "appease" the West by doing so (they were more interested in pragmatic results). Instead, things are just taking their natural course ("tīngqízìrán 聽其自然"), emerging digraphia, etc., all of which have been securely documented on Language Log.

    @Not a native speaker:

    I don't know why you would expect Google Translate to recognize that passage as "Chinese". "Chinese" is written with Sinographs. I find it quite amazing that Google Translate recognized it as "Chinese". Google Translate is incredibly good for being able to recognize simplified and traditional characters and for being able to shift back and forth between them, also for producing high quality translations of Mandarin written in Sinographs.

    If there ever comes a time when there is a substantial number of people reading and writing Pinyin Mandarin, then Google Translate will in all likelihood produce a translation tool for Pinyin Mandarin. That won't be for a while yet, because only a few of us (including Chinese colleagues and friends) exchange messages in Pinyin, and even fewer write literature, news, documents, etc. in Pinyin Mandarin.

    You have to do your homeworks.


    @sicherhalten:

    It's not merely a matter of parsing words or not. The main question is that this little text is written in Pinyin, whereas the vast majority of all Chinese reading and writing is done with characters. Most Chinese people simply are not used to reading and writing texts written in Pinyin, though my wife and I have carried out experiments which demonstrate that many Chinese are capable of quickly becoming fluent readers and writers of Pinyin Mandarin. After all, they already know the underlying language.

  7. Ellen K. said,

    December 9, 2019 @ 5:16 pm

    @Not a naive speaker

    The part in parentheses, when put in by itself, gets translated to "At least simple online communication" if taken from the original, or "At least simple online streaming" after losing the O in the last word. I've no clue on the accuracy of the translation, but it's English, and even grammatical.

  8. Chester Draws said,

    December 9, 2019 @ 6:10 pm

    I fail to see the abolition of [Chinese characters →] Latin happening anytime soon due to the cultural implications of replacing a [native writing system →] International language that has gone through continuous evolution for thousands of years with a [phonetic system →] vulgar language that may be seen as appeasing to [the west →] the illiterate.

    So the Chinese are going to cripple themselves for centuries using a written language that is not fit for purpose any more? All because they don't want to use a much better system because it happens to be associated with the West (not invented, mind, because it was invented east of Europe)?

    There were people who opposed "Arabic" numerals in the West, preferring the completely hopeless Roman numerals for cultural reasons. There is a word for people like that — idiots.

  9. Jeffrey said,

    December 9, 2019 @ 8:06 pm

    As a longtime lurker of Language Log, one who has read all of Victor Mair's entries on Hanyu Pinyin, I decided when I came here to China to ONLY learn Chinese in Pinyin. When I arrived at this university, I told all of my teachers that I would not write a single hanzi. They were a bit perplexed, but they allowed it.

    When we did our tingxie, I would write an entire sentence in Hanyu Pinyin while all of the other students wrote a single character. When the teachers came to my desk, they would check the tone marks and grammar of my sentence.

    In my classes here, the teachers never really taught Pinyin, not in the full way that even Chinese primary school children get. I taught myself all of the fine points and really worked on my tones. Because the teachers started teaching hanzi immediately, there was no time to work on tones in class, so now in my third semester a lot of my classmates are still guessing about tones. When we do choral responses of hanzi text, it's torture to listen to the tones getting murdered all around me.

    One more point: as Victor has mentioned in a previous post, once you reach a certain level of Chinese, it's easier to write and read WITHOUT the tone marks. After about ten months of classes, I have now internalized all of the tone marks for the Pinyin words and it's easier for me to read Hanyu Pinyin text without the tone marks.

    I'm very direct with my teachers. I tell them that I have no idea why the Chinese are using a type of old writing system that people in the West stopped using three thousand years ago.

    The other day, we were practicing the budan…erqie correlative structure, so, smiling, I said: Hanzi budan gulao erqie meiyong.

    Everyone in class laughed. They know me. My gracious teacher smiled in return and said: Hanzi budan haokan erqie youyong.

  10. Luke said,

    December 9, 2019 @ 8:10 pm

    @Chester Draws
    Please explain why you think Chinese characters are no longer fit for purpose. As far as I can tell, the advent of IMEs eliminates the cumbersome task of writing/remembering to write Chinese characters, which also takes care of the issue of character amnesia.

    Now you may be asking, why don't we just cut out the middle man and type Pinyin directly then? I'm not against the use of phonetic systems to write out Mandarin. The simple fact of the matter is that Zhuyin is far superior in representing Mandarin phonology than Pinyin, and more efficient too. Finals that are represented with up to three letters in Pinyin can easily be represented with one character in Zhuyin, likewise initials such as /ʂ/ are represented by Zhuyin with one character rather than two letters (sh) in Pinyin.

    Furthermore, Pinyin contracts a lot of its sounds, /weɪ̯/ becomes ui, which may not be too much of a hindrance for native speakers, but it does mean that it is nowhere near as effective at representing Mandarin phonology as Zhuyin. The use of the same letters in Pinyin to represent entirely different sounds is also an issue in Pinyin, the pairs /i/ & /ʐ̩/ and /u/ & /y/ come to mind, the latter of which causes an inconsistency with the Pinyin rendition of 有 (you) which was not contracted to yu as that would have conflicted with 雨 (yu).

    Make no mistake, I am more against the use of Pinyin than I am against the replacement of the native script with a phonetic system. However, I am still arguing that while it may be more practical to use a phonetic system to write Mandarin, pragmatically it is difficult, as there is little evidence of widespread backing; as Victor has stated, there have not been many calls to abolish Chinese characters recently.

  11. Victor Mair said,

    December 9, 2019 @ 9:25 pm


    @Jeffrey:

    I salute you, I applaud you. You instinctively know how to go about learning Chinese languages. My wife, who was one of the best Chinese language teachers ever to walk the face of the earth, always said that her smartest students (including those at Harvard, Swarthmore, Bryn Mawr, Haverford, Oberlin, Penn, and the University of Washington) rebelled against learning characters, and she did her level best to accommodate them.

  12. Jeffrey said,

    December 9, 2019 @ 10:45 pm

    @Prof. Mair,

    Thank you. I really doubt that I belong to that group of smart students your wife taught, but I am a practical student. I knew what I wanted to learn when I stepped off the plane at Pudong airport. I wanted to learn the Chinese language system in the fastest and easiest way, and I knew that would be by only using Hanyu Pinyin.

    To me, the Chinese language system, a term I'm using to avoid referring to the multiple varieties of Chinese spoken here, is absolutely fascinating. I'm a beginning student, so each day I learn something really interesting about this system. For example, today I figured out that the particle -zhe can be added to state words to indicate ongoing state. You can say, "Men kaizhe ne." You can also say, "Ni mangzhe ne?" We first learned to use -zhe for ongoing action, but here it can also indicate ongoing state. That's such a simple and clever way of handling this issue.

    I'm finding that the language system is filled with lots of these simple and clever solutions to creating grammatical and semantic meaning.


    I'm sure that for many of you the hanzi are also equally fascinating, but that's not my interest in the language. I'm interested in the phonology, morphology, and grammar of the language, not the hanzi writing system.

  13. Jonathan Smith said,

    December 9, 2019 @ 11:09 pm

    By coincidence I was just looking at Surendran & Levow, The Functional Load of Tone in Mandarin is as High as that of Vowels, wherein (unsurprisingly given their title) the authors remark that "some language reformers have suggested that tones do not need to be represented in a revised alphabet. Our result suggests that such an alphabet would be as hard to use as an alphabet that represented tones but not vowels." Food for thought…

  14. Jonathan Smith said,

    December 9, 2019 @ 11:13 pm

    Although note their results suggest only that two such hypothetical alphabets would be about equally information-lossy, not that they would be objectively "hard"

  15. Martin said,

    December 10, 2019 @ 2:37 am

    When I type using jyutping, I just haphazardly pause to make a selection when it "feels right". How do/would the orthographical rules work when it comes to the particle-salad that is Cantonese?

  16. Chris Partridge said,

    December 10, 2019 @ 4:01 am

    What will the impact of voice recognition systems be? I don’t speak Chinese but I understand that voice-to-text systems are now quite reliable thanks to AI. Can forgetful Chinese testers simply dictate what they want to say and let the smartphone remember the right characters?

  17. Chris Partridge said,

    December 10, 2019 @ 4:03 am

    Testers = Texters. Damn you, autocorrect (should have used Google Dictate.)

  18. Jon said,

    December 10, 2019 @ 4:27 am

    It strikes me that there are many parallels between the Chinese resistance to changing to alphabetic script and American resistance to metrication. To the rest of the world, there is no contest in either case: the newer system is far easier to learn and use, the old system is cumbersome, requires more memorisation and leads to more mistakes. It is inconceivable that any country would change from the new system to the old. Yet there are many who claim that the old system is in fact far superior.
    In both cases there are possible reasons for not changing: the cost of changing, the effort, the disruption, the fact that much older written material will become redundant, the difficulties for older people, etc.
    But usually the objections raised against metrication and alphabetisation are not these reasonable ones, but spurious ones. Those opposing change often claim that the old system is inherently more suitable and natural. I have even read Americans claiming that Fahrenheit is easier and more natural than Celsius. For those using the simpler systems, these claims are ludicrous.

  19. cliff arroyo said,

    December 10, 2019 @ 4:31 am

    My own ideas about tone in pinyin (former learner of Vietnamese some basic knowledge about Mandarin) would be that both the rigorous representation of surface tones and no tones would both be problematic.
    Intuitively I think that maybe always writing the same morpheme with the same tone (and dropping the first tone mark) would be a good compromise. The functional load of distinguishing first and neutral tones seems to be extremely small.

  20. Antonio L. Banderas said,

    December 10, 2019 @ 5:22 am

    an inconsistency with the Pinyin rendition of 有 (you) which was not contracted to yu as that would have conflicted with 雨 (yu).

    Can somebody please elaborate on this? What other examples are there?

  21. Philip Taylor said,

    December 10, 2019 @ 5:23 am

    Cliff — I think that one needs to differentiate between writing as a normal part of communication and writing for pedagogic/didactic reasons. For the former, I would argue that conventional tone-marked pinyin is less ambiguous, and whilst arguably unnecessary for native speakers is a marked boon for others. For pedagogic purposes, however, I believe that didactic material (at least for beginners) should make the effects of tone sandhi explicit.

  22. cliff arroyo said,

    December 10, 2019 @ 5:41 am

    "arguably unnecessary for native speakers"

    arguably is the important word, that's what research is for (though no one seems to want to do research on tone input and reading)

    I think a major problem is that no one's come up with an easy and convenient tone input system similar to telex for vietnamese (which allows users to enter ascii input and see the right tones in real time). Or have they?
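
As a rough illustration of what an ASCII-only tone entry scheme for Pinyin could look like, the sketch below converts "numbered Pinyin" (a digit 1 to 5 typed after each syllable) into tone-marked text. It is a batch converter rather than a real-time input method like Telex, and it is a sketch under stated assumptions: the function names are hypothetical, `v` is assumed to stand for ü (a common keyboard convention), and the placement rules used are the usual ones (mark a or e if present, the o in ou, otherwise the last vowel).

```python
import re
import unicodedata

# Combining marks for tones 1-4; tone 5 (neutral) is left unmarked.
TONE_COMBINING = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}
VOWELS = "aeiouü"


def mark_syllable(syllable: str, tone: str) -> str:
    """Place the tone mark on one syllable using the usual placement rules."""
    syllable = syllable.replace("v", "ü")  # assumption: 'v' is typed for ü
    if tone not in TONE_COMBINING:
        return syllable  # neutral tone: no mark
    lower = syllable.lower()
    if "a" in lower:
        pos = lower.index("a")
    elif "e" in lower:
        pos = lower.index("e")
    elif "ou" in lower:
        pos = lower.index("o")
    else:
        # Otherwise the last vowel carries the mark (e.g. "liu2" -> "liú").
        positions = [i for i, ch in enumerate(lower) if ch in VOWELS]
        if not positions:
            return syllable
        pos = positions[-1]
    marked = syllable[: pos + 1] + TONE_COMBINING[tone] + syllable[pos + 1:]
    # Compose base letter and mark into a single precomposed character.
    return unicodedata.normalize("NFC", marked)


def numbered_to_marked(text: str) -> str:
    """Convert numbered Pinyin such as 'hao3kan4' into tone-marked Pinyin."""
    return re.sub(
        r"([A-Za-zü]+?)([1-5])",
        lambda m: mark_syllable(m.group(1), m.group(2)),
        text,
    )


print(numbered_to_marked("Dang1ran2, wen2zi4 geng4 hao3kan4, geng4 you3 wen2hua4."))
```

A live input method would apply the same mapping on each keystroke instead of over a finished string; the sketch only shows that the mapping itself is simple.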

  23. JK said,

    December 10, 2019 @ 6:06 am

    I did find one somewhat obscure example the other day of two medical words with the same pronunciation — 心率 heart rate, and 心律 heart rhythm. I wonder how audiences distinguish them in medical lectures, perhaps there is enough context for them to figure it out.

  24. Luke said,

    December 10, 2019 @ 7:48 am

    @Antonio L. Banderas
    /joʊ̯/ is contracted to 'iu' under Pinyin, /i/ should also be a 'Y' when it's used as an initial, meaning that /joʊ̯/ should be written as 'Yu' in Pinyin normally, however whilst /y/ may be written as 'Yü' officially, more often than not, Pinyin drops the diacritics meaning that /joʊ̯/ must be written as 'You' to avoid conflicting with /y/ 'Yu'.
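
One way to see how the spellings "you" and "yu" are kept apart is to write out the standard Hanyu Pinyin spelling conventions as a small function. The sketch below is illustrative only: the function name `spell` and the choice to pass in an initial plus a full underlying final (such as `iou`, `uei`, `uen`, `ü`) are assumptions made for the example, and tones are ignored.

```python
def spell(initial: str, final: str) -> str:
    """Spell one syllable from an initial ('' for none) and an underlying final."""
    if initial:
        # After an initial, three finals are written in contracted form.
        contracted = {"iou": "iu", "uei": "ui", "uen": "un"}
        final = contracted.get(final, final)
        # After j, q, x the two dots of ü are dropped (no plain u occurs there).
        if initial in ("j", "q", "x") and final.startswith("ü"):
            final = "u" + final[1:]
        return initial + final

    # No initial: finals starting with i, u, ü are written with y or w.
    if final.startswith("ü"):
        return "yu" + final[1:]
    if final.startswith("i"):
        if final in ("i", "in", "ing"):
            return "y" + final      # i is the only vowel, so y is added
        return "y" + final[1:]      # otherwise i is replaced by y
    if final.startswith("u"):
        if final == "u":
            return "wu"             # u is the only vowel, so w is added
        return "w" + final[1:]      # otherwise u is replaced by w
    return final


print(spell("", "iou"), spell("l", "iou"))   # 有-type vs 六-type syllables
print(spell("", "ü"), spell("n", "ü"))       # 雨-type vs 女-type syllables
print(spell("", "uei"), spell("g", "uei"))
```

Under these conventions the contraction to `iu` applies only after an initial, so the syllable with no initial keeps its `o` and comes out as "you", while "yu" is reserved for the final ü.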

  25. David Marjanović said,

    December 10, 2019 @ 8:50 am

    My own ideas about tone in pinyin (former learner of Vietnamese some basic knowledge about Mandarin) would be that both the rigorous representation of surface tones and no tones would both be problematic.
    Intuitively I think that maybe always writing the same morpheme with the same tone (and dropping the first tone mark) would be a good compromise. The functional load of distinguishing first and neutral tones seems to be extremely small.

    Pinyin orthography does specify that the same morpheme should always be written with the same tone.

  26. Victor Mair said,

    December 10, 2019 @ 8:53 am

    As someone who taught himself Russian well enough to travel around in Leningrad / St. Petersburg and Moscow and to read texts in my fields (Sinology and Central Asian archeology), I would compare marking and not marking tones in Sinitic languages to marking and not marking stress in Russian. During the early stages of my learning, I appreciated having the stress marks, especially since stress in Russian changes the quality of the vowels. Later on, however, when I became more proficient in Russian, the stresses and vowel changes came naturally, and I actually preferred not to have the stresses marked, as I found them duōyú 多餘 ("superfluous; unnecessary") — thinking in Mandarin, as I often do.

  27. Victor Mair said,

    December 10, 2019 @ 8:57 am

    From an anonymous colleague:

    Same old stuff rehashed over and over. Nothing here that Mair, DeFrancis, and Hannas didn't say 3 decades ago. Lost in all of this is the critical argument made even longer ago, by Japanese scholars in the 1930's, namely, that character recognition is directly tied to the motor skills needed for writing them. Stop writing the characters (because phonetic input is available) and sooner or later you'll lose the ability to recognize them.

  28. Karl said,

    December 10, 2019 @ 9:25 am

    When I studied Chinese, pinyin was introduced later. In retrospect, there was little downside. Zhuyin was used first. The rationale was that Chinese sounds differed so much from those of English that pinyin could be misleading. And Zhuyin was, like kana, relatively easy to learn.

    My study of Chinese in the United States started in January, 1986 and continued there until I arrived at Peking University as a language student in September 1987 (a total of two academic years of language instruction). Instructors were Arthur Chen and Clara Sun at Univ. of Wisconsin–Madison. The course of study included an intensive second-year course over the summer of 1987. We started in first year with complex characters and Zhuyin, and moved to simplified and pinyin later, in the second year. The whole experience was not for the faint of heart, but outstanding, something I could only appreciate in retrospect.

    The approach at Madison was holistic in the sense that we had to "do it all," with reading, writing, speaking, and listening roughly equally weighted. There were many contact hours: I think the advanced first-year track, which I joined in spring, 1987, had three lectures and four or five TA sessions a week. Not surprisingly, talent and work determined performance. One classmate astounded me with his recall of characters, whereas I was better at speaking. Many fell by the wayside. I was young and full of energy, but decided I wouldn't be able to really nail the language without living in a Chinese-speaking environment. That goal served as motivation.

    When I arrived at Beida, I found myself much more advanced than classmates who had done a similar level of coursework (or even quite a bit more). Many American students had okay speaking skills, but really couldn't read or write. They had, however, been steeped in pinyin. German students had the opposite imbalance.

    Anyway, the solid base provided by Prof. Chen and Sun Laoshi helped me later, when it came to a legal career in Hong Kong (among other things) when it was necessary to have serious Chinese reading comprehension skills. And I've always been grateful that they approached language instruction in a way that provided a solid basis for later work in the language. It was sad to see bright and capable students disadvantaged by poor instruction at the start of their foreign-language learning experience.

    Of course pinyin is now indispensable to me as an input method, but I've never considered it more than an organizational tool and don't think that I was somehow disadvantaged by how it was not treated as a core element for learning the language.

  29. cliff arroyo said,

    December 10, 2019 @ 11:34 am

    "Pinyin orthography does specify that the same morpheme should always be written with the same tone"

    That's not the impression I get from spellings I've seen…

    I imagine the main benefit of tone marking is less tones per se and more to create a larger pool of visually distinct syllables. So some degree of tone marking might be less important for the extremely literate but helpful for many others (including Chinese people who learn Mandarin as a second rather than first language).

  30. Philip Taylor said,

    December 10, 2019 @ 11:50 am

    Cliff — "I imagine the main benefit of tone marking is less tones per se and more to create a larger pool of visually distinct syllables". I think that different groups of people potentially derive different benefits. For me, a relative newcomer to Putonghua/HYPY in relation to many contributors to these threads, I find the fact that the tone markers automatically create a pitch-contour in my mind an enormous benefit, whereas Cantonese Jyut6ping3 completely fails to achieve the same.

    Normally I am not a visual person, far preferring text to icons, but the HYPY tone markers are so well chosen that I cannot imagine a better system. This is not to suggest that others do not derive more benefit from "a larger pool of visually distinct syllables", only to report that I am not one of that group. In fact, until the pitch contours have established themselves in my mind, I genuinely do not know what HYPY means.

  31. Antonio L. Banderas said,

    December 10, 2019 @ 1:56 pm

    @cliff arroyo "That's not the impression I get from spellings I've seen…"

    Please add some examples; are you referring to "Linguistic Pinyin" as opposed to "Character Pinyin"?

  32. cliff arroyo said,

    December 10, 2019 @ 4:11 pm

    "Linguistic Pinyin" as opposed to "Character Pinyin"?

    There's a difference? Most of the pinyin I've seen is in language courses and some syllables definitely have different tones written such as


  33. Julian said,

    December 10, 2019 @ 6:30 pm

    What Victor said about Russian.
    Remember that a phonetic writing system can have two purposes:
    1. to represent the language completely and accurately – eg International Phonetic Alphabet. Useful for learners who are using written learning materials.
    2. to contain information that is enough but, for the sake of brevity, not more than necessary to remind readers of the intended word, assuming they are already competent in the language. (Linear B syllabary for Mycenaean Greek; alphabets without vowel signs; Pitman's shorthand; common English abbreviations like Mr, Mrs, Dr, Wm [=William], Bros [=Brothers]…)

    And there can be various blends, such as English spelling, which contains more than enough information for 2 (mostly), but not enough to be accurate in the sense of 1 (often).

    Russian with accents/pinyin with tone marks is type 1; without, type 2, in that it relies on the reader to supply missing information to complete the spoken word.

    There's no law that says that a 'good' writing system should be of type 1. A type 2 system can be fine and fit for purpose providing it has enough information and is pitched to its users' level of competence. With supplements for learners if desired, such as marking accents/tones in learning materials.

  34. Victor Mair said,

    December 10, 2019 @ 7:25 pm


    @Julian:

    Your remarks are much appreciated.

    Most Semitic scripts don't even indicate vowels, yet somehow their users manage quite well without them.

  35. Jonathan Smith said,

    December 11, 2019 @ 12:21 am


    As fluent readers of English, all commenters will understand that a phonetic writing system need not be of your "type 1", and as far as I can tell, we also all agree that toneless pinyin is (or would be) perfectly functional as a writing system for fluent-speaker users. Nonetheless, for my part, I find toneless pinyin less than satisfying largely because non-representation of tone — a simple function of the lack of a ready means for such in our Roman alphabet — is in effect a crude cultural imposition. Explicit indication of lexical tone in Romanized Chinese, to which we might profitably compare explicit indication of phonemically long vowels in Romanized Japanese, is in my view a welcome signal of respect for and understanding of the special and distinctive, not to say defining, features of the language under representation.

    And tone in a language like Chinese really shouldn't be compared to stress in Russian or English: as is intuitively obvious and as is demonstrated statistically in the study I linked, lexical tone participates in meaningful contrasts at a level which massively surpasses our word stress. I'm not sure why these features should come up so often as approximate analogues of tone… maybe because both are squiggly marks over the actual sounds that are easier to just leave out?

  36. cliff arroyo said,

    December 11, 2019 @ 1:20 am

    "And tone in a language like Chinese really shouldn't be compared to stress in Russian or English"

    If only there were a tonal language once written with Chinese characters and now written in Latin script…. there might be something to learn there….

  37. Andreas Johansson said,

    December 11, 2019 @ 6:30 am

    Regarding the functional load of tones and vowels, as Prof. Mair points out, there are plenty of languages which make do without vowel representation (well, most of them indicate at least some vowels at least some of the time, but they don't indicate vowels with near consistency as Latin-script languages do). Is there any research on the impact of this, if any, on native speaking literacy etc?

  38. Victor Mair said,

    December 11, 2019 @ 7:59 am

    @cliff arroyo:

    "If only there were a tonal language once written with Chinese characters and now written in Latin script…. there might be something to learn there…."

    We have one like that, Dungan, only it was / is written in Cyrillic, another phonetic script beside Latin, but it also has been written in Xiao'erjing (historical [Perso-Arabic]), Latin (historical), and Pinyin — not in Sinographs since late imperial times.

    "Dungan-English dictionary"

    "Dungan: a Sinitic language written with the Cyrillic alphabet" (4/20/13)

    "'Jesus' in Dungan" (7/16/14)

    "Writing Sinitic languages with phonetic scripts" (5/20/16)

    Implications of the Soviet Dungan Script for Chinese Language Reform


    "Sinitic languages without the Sinographic script" (3/5/19)

    By the way, Chinese characters don't tell us the tones of the morphosyllables they represent. They don't tell us the vowels or consonants either.

  39. Philip Taylor said,

    December 11, 2019 @ 8:31 am

    I think that Cliff was referring to Vietnamese, Victor …

  40. Victor Mair said,

    December 11, 2019 @ 8:35 am

    If native speaker users of a phonetic script for the representation of their tonal language habitually omit explicit tonal indications — which actually happens to be the case much of the time — nobody is imposing that practice on them. For whatever reasons, and it is easy to think of many, that is their own choice.

  41. Victor Mair said,

    December 11, 2019 @ 8:40 am

    Perhaps some Language Log readers were formerly unaware of Dungan. It certainly is something good to know about.

  42. cliff arroyo said,

    December 11, 2019 @ 8:47 am

    "Is there any research on the impact of this, if any, on native speaking literacy etc?"

    Supposedly, Turkish literacy drastically increased with the introduction of the current Latin-based alphabet though I'm not sure about any official figures.

    IIRC literacy figures are not that inspiring in most Arabic speaking countries but how much of that is alphabet vs diglossia vs dysfunctional political systems is hard to suss out.

    The main point is that what works fine for highly educated super-literates doesn't necessarily work for the majority of less-educated everyday users who'll be the ones depending on it.

  43. Andreas Johansson said,

    December 11, 2019 @ 10:16 am

    @cliff arroyo:

    Thanks. In the Turkish case, I suspect it may be similarly hard to suss out what was due to the switch of writing system and what to other factors. If I understand, the written standard language was at the same time moved much closer to the spoken vernacular, and of course there were wide-ranging changes in the wider society.

  44. The Other Mark P said,

    December 11, 2019 @ 2:52 pm

    "Please explain why you think Chinese characters are no longer fit for purpose. As far as I can tell, the advent of IMEs eliminates the cumbersome task of writing/remembering to write Chinese characters, which also takes care of the issue of character amnesia."

    It doesn't "take care of character amnesia". It merely reduces the burden. What causes the burden in the first place is that an incredible amount of memory is required to learn even a moderate number of words. Which means Chinese children are using their brains to rote learn words when they could be using them to remember concepts. It's a complete waste of everyone's time, for literally no benefit.

    Chinese characters can't be alphabetised. If I see an obscure word, I open my dictionary and can search a word. I see an obscure Chinese word, no such luck.

    Which must cause computer programmers no end of trouble. How do you store and list the names of people, goods, etc? That is the modern world, and using a non-alphabetic system is putting an enormous cost on Chinese computing as a result. The solution is, of course, to translate to an alphabetic system, do the computing and translate back at the end. We already do that with numbers into binary, of course, but that at least is a very simple one to one match. With Chinese there's always going to be slips and mistakes.

    Finally, Chinese will never become the world's number one language while it is character based. The number of people who are prepared to put the massive effort in just to be able to read it will always be tiny. If learning spoken Mandarin is easy, which I believe it is, it is a massive fail to then make it unnecessarily difficult to learn to read.

  45. Terpomo said,

    December 11, 2019 @ 3:43 pm

    F tns crry smlr mnts f fnctnl ld t vwls, Nglsh wrttn wtht vwls sms t m lk a prfctly gd cmprsn t tnl lnggs wrttn wthtt tns, nd t's nrmlly vry ntllgbl t ntv spkrs.

  46. IA said,

    December 11, 2019 @ 8:24 pm

    For those interested —

    Olli Salmi's Dungan-English dictionary is online at:

    As for the Perso-Arabic script (variously known as 小經、消經、狹經、小兒經、小兒錦、小錦……etc 文字 — and concerning which names it is my understanding that despite online folk etymology ["Children's Writing" etc] to the contrary, the 兒 is not semantic but only indicates rhotacisation of the preceding syllable), which has been at one time or another used by Hui on both sides of the border (thus including the so-called 'Dungans' on the eastern side), a couple years back was published the following three-volume work:

  47. Hello - Chinese said,

    December 12, 2019 @ 12:14 am

    "character amnesia", that's exactly what I got, after so many years of not needing to hand writen the Chinese characters. We just don't have the chance to practice this basic skills, except when you need to sign your names on some paper documents.

  48. Philip Taylor said,

    December 12, 2019 @ 5:49 am

    TOMP — "Chinese children are using their brains to rote learn words when they could be using them to remember concepts". I doubt very much whether the two activities are mutually exclusive, and I suspect that rote learning (as advocated and taught by Nevile Gwynne, for example) actually "strengthens" the brain and thereby facilitates its use for purposes other than rote learning (e.g., gaining a better understanding of concepts).

  49. Ellen K. said,

    December 12, 2019 @ 10:03 am

    Writing English leaving out the vowels but otherwise writing it with standard spelling is really not the same thing as scripts that leave out vowels, it seems to me.

  50. Victor Mair said,

    December 12, 2019 @ 10:44 am

    @Ellen K.

    How so?

    The real question is whether leaving out the tones of Mandarin written with a phonetic script is comparable to leaving out the vowels of English, as another commenter alleged.

    Wouldn't leaving out the vowels of Mandarin written in a phonetic script be a more apt comparison?

  51. Jonathan Smith said,

    December 12, 2019 @ 1:58 pm

    In pursuing such a writing-systems thought experiment, we ought to differentiate between, among others, what may be subjectively desirable or ideal to any one observer, what is most practical given obtaining circumstances, what is measurably most learnable or readable, and what is possible given native-speaker users, who will clearly tolerate a considerable degree of deficiency with respect to phonetics, phonemes, or other aspect of linguistic structure.

    And this is all largely academic to this point… the facts on the ground to the great majority of Chinese observers are revealed in the OP, where A. Tao first remarks that pinyin is useable for "zui qima jiandan de zaixian jiaoliu" ("at a minimum, for simple online exchanges"), which sets the bar for the capabilities of PY humorously low, and later that Chinese characters, in contradistinction to pinyin, are "wenzi" ("writing" [not "mosquitoes" :P]).

  52. Ellen K. said,

    December 12, 2019 @ 3:19 pm

    @Victor Mair

    Because writing systems without vowels are designed without vowels. In English we actually lose information about the consonants when we remove the vowels. Is rr one r or two with a vowel in between? Is sh a single sound, or an s then an h?

    And, possibly, because of differences in the languages themselves.

    I would not assume that writing systems designed without vowels are as hard to read as English without vowels, which, if Terpomo's comment above is any indication, is not at all easy to read.

  53. Frans said,

    December 14, 2019 @ 4:33 am

    @Victor Mair

    "Stop writing the characters (because phonetic input is available) and sooner or later you'll lose the ability to recognize them."

    That doesn't a priori sound very plausible, since surely it's much easier to recognize than to reproduce. I realize (to some degree) that character recognition involves a complex order of brush strokes, but is that integral to a general understanding or more akin to an orthographic edge case like flower and flour?

  54. Philip Taylor said,

    December 14, 2019 @ 5:17 am

    In the verbal (as opposed to the written) world, I would respectfully disagree with Frans. I personally find it much easier to (attempt to) (re)produce an utterance in a foreign language that I speak poorly than to recognize the same utterance when spoken by a native speaker.

    For hanzi/kanji, I think that it is probably no more difficult to learn to write the simple ones badly than it is to learn to read them, but to learn to write them with calligraphic perfection will require a lifetime of practice and then some …

  55. liuyao said,

    December 15, 2019 @ 9:32 am

    In regard to the OP, I'm more curious about a related question: How do people who speak a different topolect than standard Mandarin cope with character inputting? Do they have to mentally "sound out" the word in Mandarin or have they somehow rote-memorized the pinyin spelling but associated it with their own pronunciation? (That's bound to happen even within standard Mandarin, in what linguists would call sound change. One day some pinyin "letter" will get to represent a different sound. It sort of happened with Ho 何 -> He.)

    I know the argument has been raised (and countered) in the many previous posts that characters are more inclusive, and form the common writing system for all Chinese languages. It doesn't let you write everything you want to express, even in Pekingese. But I can read Hu Shih, Lu Xun, Liang Qichao, not to mention Literary Chinese which requires more training. Had they been writing in a phonetic system, they'd need to be translated into Mandarin. Today, most writers can speak good Mandarin, except those from Hong Kong. I would still like to read news from Hong Kong by their people, even if it's loaded with Cantonese words.

    I was reminded of this yesterday as I was listening to Zigeunerweisen, a violin piece. Because I don't speak German, I don't know what this word means, and I could never spell it without the aid of Google. Even if I hear how it's pronounced, and learn the individual morphemes, I might need to learn enough "surrounding" words, if not the whole language, to remember this word. (By the way, in Chinese it is translated as 吉普賽之歌, which adds to my confusion because it says "Gypsy's song".) My theory is that the German name was stuck in English because a hundred years ago a person who listened to or played/sang classical music was expected to know enough German, not to mention all the Italian.

    I have many examples of this sort. (It must be my character-based upbringing that makes me always ask the meaning of a word when I see it, just like how VHM would always ask the sound of a character.) I've mentioned dinosaur names before; now kids had better learn some basic Mandarin to pronounce e.g. Huayangosaurus and Yi—no kidding, that's the shortest genus name. Would you say that Latin in science is equally antiquarian and burdens the students—and patients, for example—unnecessarily? If we don't expect Latin to go away in medicine (due to inertia, or elitism), I don't see how characters could fade out given that the Chinese are constantly immersed in characters.

  56. Victor Mair said,

    December 15, 2019 @ 10:02 am

    The characters will never "fade out". They are part — a very important part — of our human heritage. However, as I have said before in various ways and will say again in a forthcoming post within the next few days, the proportion and nature of their usage in our time and in coming generations will continue to evolve as part of what I have often referred to as "emerging digraphia".
