The Opacity and Difficulty of the Chinese Script

My class on the Chinese script has around 36 students in it. About half of them are native speakers from Taiwan, the Mainland, Singapore, and Hong Kong (most of these are graduate students who already have M.A.'s from overseas universities or are finishing up their Ph.D.'s). About a quarter of the class are native speakers of Japanese, Korean, or Vietnamese. The remaining quarter or so are Americans who have studied Mandarin anywhere from two to twelve years.

Today, I made the students close their computers, electronic dictionaries, and all their books and papers, then asked them to write down on a piece of paper the simplified and traditional characters for Taiwan and beneath that what the meaning or origin of the name is. In the top right corner they indicated whether they were native speakers or how many years they had studied Chinese (I also should have asked them to indicate where they were from, but neglected to do so). The results:

  • only 2 students could write both forms correctly
  • only 4 students could write both forms partially correctly
  • only 10 students could write one form correctly
  • about 10 students could write one form partially correctly
  • the remainder of the students could not write either form correctly, including a couple of the native speakers
  • most students who had taken up to 6 years of Chinese couldn't write either form correctly

[If you want to give yourself the same quiz, before reading further, the answer is here.]

Only 2 students said that the name means "Terrace Bay" (which the characters seem to indicate), and only 1 student knew correctly that "Taiwan" ultimately derives from the early Dutch transcription of an Austronesian tribal name (and I think that I told him that last year in another class). About half of the remaining students hazarded all kinds of strange guesses about the meaning of the name, with the other half having no clue whatsoever.

The students felt embarrassed by the results of the experiment, but I told them not to feel bad because Chinese is a "damn hard" script.

Then one of the brightest students in the class (with an M.A. in history and languages from Peking University) said that it doesn't really matter that they couldn't write the characters for Taiwan — despite the fact that the name is constantly in the news — because they do virtually all of their writing in Chinese using computers or handheld devices that enable them to access characters through pinyin.

[Mark Swofford wrote:

And both of those are within the 600 most common Hanzi — the ones supposedly "everybody" knows if they're literate in Mandarin. If a class of mainly grad students and native speakers at an Ivy League U.S. university couldn't manage those, what do you suppose that says about China's claim of greater than 90 percent literacy in reading and writing Chinese characters?

To meet even the PRC's entirely inadequate standard for literacy, even a peasant in the countryside supposedly knows a minimum of *1,500* Chinese characters. Hah!]



  1. Chris said,

    September 18, 2008 @ 12:44 pm

    I hope the Language Log will discuss John McCain's recent interview with Spanish-language radio. He seems to misunderstand a question he's asked, and when he realizes his confusion, he plows ahead without correcting. His answer leaves the listener with the impression that he doesn't know who the Prime Minister of Spain is, or where Spain is, or something. My guess is the error is due to a speech/listening glitch, but maybe not. In any case, I'm interested in your take on it. Apologies for posting my request here in an unrelated thread. Didn't know how best to ask a general question of the log. Thanks.

  2. Rob P. said,

    September 18, 2008 @ 12:46 pm

    Could those same students have *read* the characters? I would suspect (because it was true of me when I briefly studied Chinese) that the students can recognize far more characters than they can write. Not to defend the Chinese government's clearly amplified literacy statistics, but if the measurement is reading v. writing, very different results might come up.

  3. John Cowan said,

    September 18, 2008 @ 12:48 pm

    Hanzi literacy for the masses: every man his own royal amanuensis. Hah! is correct.

  4. Trevor Stone said,

    September 18, 2008 @ 1:34 pm

    Have you tried a similar experiment with students and native speakers of English? I certainly don't know the derivation of many famous place names. And I wouldn't be surprised if a lot of folks were to misspell names like Massachusetts and Connecticut. But, like Chinese writers and pinyin software, we've got spell checkers all over the place. And even folks who can't spell those places can read them correctly. I agree with Rob… Chinese is easier to read than write… and so is English (though the gap is less).

  5. john riemann soong said,

    September 18, 2008 @ 1:48 pm

    Is dyslexia especially more common for Chinese?

  6. Ben Heilers said,

    September 18, 2008 @ 1:57 pm

    My own (limited) personal experience agrees with Rob and Trevor. My fiancee's parents never attended college, but both read the entire Sing Tao newspaper each day. I am not in a position to critique their written skill though, but they are certainly literate.

    Also, I've only taken a year of *conversational* Cantonese, meaning I have no interest in learning to read or write until I can hold a few conversations. But even I know that Toi4 Waan1 (rendered in Cantonese Yale romanization) literally means "platform bay", and can read and sloppily write the character for platform. With that in mind, I wonder about your students' skill level. Perhaps they were not interested in learning to read and write Chinese and instead focused on learning English. I heard rumors of parents in Hong Kong persuading their children to learn English before learning Chinese.

  7. Joe said,

    September 18, 2008 @ 2:25 pm

    How many strokes do those characters have!? I took a quick peek at them, but I study Japanese, not Chinese. All I could tell from them is that there are many good reasons for the simplified script …

    I see that they're just made out of common pieces that I recognize even if I don't know whether they're true radicals or not, but two tiny copies of 'thread' should add more than enough strokes to any character. It has to be close to 30 strokes!

    And they're so badly squished together. I had to blow things up to 72 point font to get a good look at them. Otherwise they look like tiny ink blot tests.

  8. Bryn LaFollette said,

    September 18, 2008 @ 3:44 pm

    Along the lines Trevor suggests, I wonder how many of your students would have been able to explain the derivation of the name Liverpool (or even Pennsylvania, for that matter). The presence of the words "liver" and "pool" are certainly no less there literally than 臺 (tai2) or 灣 (wan1) are in Taiwan. I guess I just don't see how it says anything about the Chinese script to demonstrate an ignorance of the derivation of a place name.

    Also, as with Ben and Rob and Trevor, in my experience it's a lot easier to read Chinese characters than it is to write them. Also, in my subjective experience, I find I have an easier time reading and understanding Chinese when written in Chinese script than I do when reading material in Pinyin, at least in a compositional setting, but I'll admit my control of Mandarin is far from perfect, and I came to it with many years of Japanese already.

  9. Stuart said,

    September 18, 2008 @ 4:27 pm

    "that it doesn't really matter that they couldn't write the characters for Taiwan — despite the fact that the name is constantly in the news — because they do virtually all of their writing in Chinese using computers or handheld devices that enable them to access characters through pinyin."

    This is a sentiment I come across among speakers of Urdu, Hindi and Punjabi too. I empathise with your students, because I never write devanagari by hand, only by computer using an IME. Many of my native-speaking Indic friends do likewise. Texting seems to be accelerating the use of Roman characters to write Indic languages; I wonder if a similar phenomenon is noticeable in the Chinese languages?

    Thank you for the information on the Austronesian origin of "Taiwan", too. The links between the aboriginal Taiwanese and NZ Maaori have been brought to public attention more frequently of late, and it's fascinating to learn that the ancestors of NZ's first inhabitants have still managed to leave such a prominent mark on their own homeland as well.

  10. mark said,

    September 18, 2008 @ 5:55 pm

    Imagine if some freak event (EM field bomb etc) wiped out all of our technology- no-one would be able to write Chinese anymore! We would have to cut hanzi out of the People's Daily and glue them onto paper.

  11. dr pepper said,

    September 18, 2008 @ 6:31 pm

    We would learn to write our names and a few labelling words but all else would be in the hands of a specially educated scribal class. Which of course is how we got the chinese character set to begin with.

  12. TB said,

    September 18, 2008 @ 6:47 pm

    I find the idea that people are "illiterate" or that Chinese characters are "in decline" because people can't write them by hand completely idiotic. If you can't chisel Roman characters into a marble block, does that mean you are necessarily illiterate in Latin? Why should literacy be tied to archaic technology? The pen is a fine tool, but it is no longer the best tool for writing Chinese characters. Why use it as a measure of literacy?

    Unless you have to write by hand often, it's completely understandable that you might forget how to write any character. When I was in middle school, there was a powerful peer pressure to stop writing in cursive, and use print. I started writing again in college, and found I couldn't remember how to write the capital G. I guess I had become illiterate.

  13. Matt said,

    September 18, 2008 @ 8:04 pm

    Yeah, I don't disagree with your central (and habitual) claim that the Chinese script is opaque and difficult, but what this experiment reinforces for me is the need to separate "inability to write" and "inability to read" (and maybe even separate "inability to write" further into "inability to write by hand" and "inability to produce via software input system") when having a serious conversation about these matters.

    And the etymology thing seems like a red herring to me, too, since (as Bryn demonstrated) the problem (?) exists in other languages as well. There may be a quantitative difference, but I don't see a qualitative one.

  14. Stuart said,

    September 18, 2008 @ 8:07 pm

    Unless you have to write by hand often, it's completely understandable that you might forget how to write any character.

    Thanks for expressing this so clearly, TB. I started learning Hindi some time after cerebral palsy really started making a difference to my motor skills, and so I was never able to get comfortable drawing the devanagari script by hand. I still love the script and do not like reading transliterated Hindi, even if I only ever type it myself. Of course, devanagari is a doddle compared to Chinese characters, so your comments make a great deal of sense. Thanks again.

  15. Louis said,

    September 19, 2008 @ 3:12 am

    The simplified script is the same as the Japanese name, so maybe it would have been interesting to ask where they're from. I couldn't say what it means though- I would've guessed 'big bay'

    The same has been said for Japanese and English- the use of word processors is making the ability to write words correctly a lot less essential. I have heard of university students using Word's autocorrect feature to turn txt spk into proper English. With Chinese characters as long as you recognise the correct one when you type it you're fine.

  16. Kellen said,

    September 19, 2008 @ 6:59 am

    "With Chinese characters as long as you recognise the correct one when you type it you're fine."

    agreed. i can't write a single character well (that is, prettily) except ones as simple as 回 or 丹, but can text and type a huge number of characters. i've never taken classes and mostly use chinese to send text messages, so i've never needed to hand write any of them. my phone doesn't have predictive texting so sometimes i type the wrong character, but it also means it's forced me to learn more of them more accurately than if i were just using my computer (which does predict). most people, however, tend to not care how exact the characters are in text messages as long as the sound is right. so even if i do mess up, no one tells me.

  17. James Wimberley said,

    September 19, 2008 @ 7:31 am

    Chris: I think the answer is to email one or more of the bloggers with a request for a post and thread. (Matt Yglesias has special request threads from time to time, a good idea.)
    But since I'm here, may I second you? I wonder if McCain was (in part) thrown by the interviewer's use of president as a title for Zapatero, correct in Spanish usage but unusual in English since his office is functionally more like a prime minister.

  18. GAC said,

    September 19, 2008 @ 9:47 am

    I generally agree here that your experiment is really not an accurate measure of literacy. I have only studied Chinese for a year, but I know that the characters that I can write correctly by hand I can only do so through a good bit of practice, and I suspect that when I'm no longer in classes that require hand-written workbooks, that skill will fade. And of course, like any English speaker I often have trouble spelling long or highly irregular words — though it's only rarely that I find a word that stumps me.

    Another thing I can say is that the etymology test is most definitely extraneous. Etymologies of toponyms are trivia — there's nothing in etymologies that is essential for understanding the word itself. If you were to give a quiz to Americans about various English toponyms, they might get some right, but many will be opaque. I can easily see the etymology of my home state's name — West Virginia — and trace it back through the name of the old colony of Virginia back to a couple fuzzy stories about its origin. But all I know about, say, Hawai'i is that it's probably derived from Hawai'ian in some way — and I have no clue on Alaska — or even a few of the states in the contiguous forty-eight (Oregon baffles me). Think, how many people would really associate "Nevada" with snow (from Sierra Nevada?). And how many people know that California was named for a mythical paradise in an old Spanish adventure novel?

  19. Constance said,

    September 19, 2008 @ 5:22 pm

    It's absolutely absurd that he would say that. Chinese children aren't taught the same way as American children are. They drill characters endlessly. They have to repeat a single character hundreds of times, starting from kindergarten. And so by a certain age, everyone will know the characters simply because they were engraved into their minds.

  20. Therese said,

    September 19, 2008 @ 10:05 pm

    Constance — tell that to my coworker here in Hong Kong who couldn't write the word for "pineapple" (鳳梨) and didn't know how to even describe it until I had written it out by hand. And this is quite a common word! Drilling doesn't help if you don't use it, and IME has made people lazy. Thankfully there're writing pads for those of us who are insane and like to handwrite everything.

  21. Adrienne said,

    September 20, 2008 @ 10:06 am

    Therese: Pineapple is referred to as 菠羅 about 90% of the time in Hong Kong. 鳳梨 is a Taiwanese (or Chinese?) name for it. It's not surprising that your coworker was unfamiliar with the term.

    And about the test, I agree with the various commenters who noted that this is not as bad as it sounds. Most people, not just the Chinese, do not know the origins of common place names. And as a native speaker from HK, I can say with great certainty that a majority of my classmates at high school would've been able to write both the simplified and traditional forms for 臺灣, since I was able to, and my Chinese level was well nigh at the bottom of the class.

    One last point: even in traditional-script dominated media I rarely see the traditional form of 台 anymore, so it's not surprising that expatriate students in an American university would not know how to write 臺.

  22. David Marjanović said,

    September 20, 2008 @ 12:53 pm

    Another thing I can say is that the etymology test is most definitely extraneous. Etymologies of toponyms are trivia — there's nothing in etymologies that is essential for understanding the word itself.

    Clearly, the intent was to figure out if people know what the characters mean by themselves/in any other context ("terrace" and "bay").

  23. blahedo said,

    September 22, 2008 @ 1:39 am

    I'm a little taken aback by all the posters that claim we are in a post-handwriting world (especially the one that made an analogy to chiseling in rock)—do these people never take notes on paper? How do you grade papers or write comments on a draft you've been asked to edit? Make to-do lists? Label diagrams on a whiteboard? Write out a "meeting moved to room Q-231" or similar sign?

    I've been using computers for nearly my entire life and am perfectly capable of using a computer to write anytime that's suitable. But there are far, far, far too many contexts where I need to write something on an ad-hoc basis on a non-electronic surface, where grabbing a computer and finding a printer would put a significant crimp on my effectiveness. I definitely do not buy this claim that it "doesn't matter" if people can only write with the aid of a computer—this is, and will remain for a long time yet if not forever, a serious handicap to full literacy.

  24. GAC said,

    September 22, 2008 @ 7:09 pm

    No, hand-writing is definitely still useful, of course. And, of course, there are still plenty of people who don't have access to modern electronics at all. But computers are becoming more and more ubiquitous and in many places electronics are used more and more.

    And of course, in a society where technology is ubiquitous, even when people must hand-write quick notes or ad-hoc signs and stuff, some may have a device available that they can use to look up the spelling of an unfamiliar word or jog their memory for a half-forgotten character.

  25. Mark Liberman said,

    September 22, 2008 @ 7:46 pm

    Many of the commenters seem to be missing the point of the post — or at least an assumption strongly underlying the post — which is that use of an alphabetic writing system such as pinyin or bopomofo would remove the need for a dozen or so years of Hanzi training, and also make it possible for people to write much more easily even after that training. As Victor's experiment shows, many highly-educated people (including highly-educated native speakers) can't write common Chinese words without the help of a computer. This is nothing like whether or not you can chisel letters in stone — the question is whether or not you're able to retain active knowledge of the characters themselves, however executed. The task is a terribly difficult one; the claims of high levels of literacy (in pre-computer days as well as now) have apparently been exaggerated if not completely faked; and the existence of digital aids is apparently creating a situation in which even highly-educated people don't retain (and perhaps don't ever really acquire) active control of the writing system.

    It seems unlikely that orthographic reform is culturally or politically feasible, at least in the immediate future, but that doesn't stop people from pointing out the problems.

  26. Tiff said,

    September 22, 2008 @ 11:19 pm

    While the Chinese writing system obviously has shortcomings, the flip side is true, too… It is rather similar to having united all of Europe under one writing system, regardless of spoken language. It obviously has shortcomings, but the writing system is really more efficient for the language… i.e. if you write 'bing' which bing do u mean? Even adding a tone marker isn't specific enough. There are too many bing1's. Simplified characters was a good step, and really the only feasible one.

    In terms of Chinese dyslexics, I am one… it's a different part of your brain than in English. So, if you are dyslexic in English, you won't be in Chinese… and vice versa. They have no real statistics yet on the frequency.

  27. GAC said,

    September 23, 2008 @ 8:53 am

    The idea that the Chinese writing system "unites" many languages has been fairly well debunked. Of course, you can get *some* information from a Japanese text if you know only Mandarin, and probably even more from a colloquial Cantonese text — but vocabulary and grammatical differences will probably keep you from getting the whole message.

    I'm also very skeptical of how important the homophone differentiation is. Mandarin already has a way to deal with having so many homophones — highly productive compounding. Plus, while bing1 written by itself is ambiguous, if I put it in a sentence: ni3 xi3 bu4 xi3huan1 bing1qi2lin2 (translation: "do you like ice cream?"), then it's not so hard to identify the bing1.

  28. Ash said,

    October 9, 2008 @ 5:21 am

    I live in Taiwan and have for the last 3 years. I attend graduate school here. The 臺 form is almost never used even in formal contexts. The fact that people can't write it has to do with the fact that they don't need to. Even in report writing for graduate level courses (2/3 of my classmates are native speakers), the 台 form is used. So when people say that 臺 is in the 600 most commonly used characters, they are wrong. 台 is, but 臺 isn't, even in Taiwan.
    The thing is, the name tai2 wan1 is a sound loan. Sound loans don't require one to read the "meaning" of the character, only the sound. So,
    Chinese people don't think of Malaysia (馬來西亞) as "horse-come-west-asia" but as "ma3-lai2-xi1-ya4".
    The argument being presented seems to be: The Chinese characters 台灣 don't give the same meaning as the original aboriginal name of the island by way of Dutch pronunciation, ergo the Chinese script sucks.
    If that isn't contorted "logic" then I don't know what is. I doubt that the writer of that article would be able to tell us the meaning of the word "america" off the top of his head. I don't know the author and don't want to comment on his possible motives, but the test given was certainly no test of the adequacy of the Chinese script.

  29. Ash said,

    October 9, 2008 @ 5:30 am

    One more thing (comment on M. Liberman): As Victor's experiment shows, many highly-educated people (including highly-educated native speakers) can't write common Chinese words without the help of a computer.
    That is simply not true. I challenge you to find a Taiwanese person who doesn't know how to write 台灣 in the way which it is written 98% of the time. You'd get the impression from such a statement that people here (or in Hong Kong, Singapore, etc.) can't read and write. They most certainly can read and write the things they need to read and write. The character 臺 simply doesn't enter into the equation. Sure you can trip people up on some characters which they don't use in daily life, but that says nothing about literacy rates or the script itself. The "experiment" is ill conceived. If you want to prove people can't read and write, then choose commonly used characters. It's like T. Stone said above, you can't use words like Massachusetts and Connecticut to test English literacy.

  30. Bob said,

    January 31, 2009 @ 12:27 am

    I teach English to college-age students in Beijing, and I am learning Chinese. I often ask them what might seem like trivial questions about how to write certain characters, or words, and what they mean. An example is 孑孓, the answer of which only about 5% of students know. I use it in part of my lesson on how to learn English. Quite often, the highly literate and educated students have to check their electric dictionaries or phones to verify one of my simple "quizzes" – like what are the three (or more) homophones for "rong2 hua4", both how to write them and what they mean. (I use that to illustrate the differences between 'melt', 'thaw' and 'dissolve' in English; but I also use things like that to show that they are still not done learning Chinese even though it is their native language, they are highly literate, and went through the rigorous language training as children in their recent past.) I also make it a point to learn new things about English to set a good example.

  32. oohkuchi said,

    May 17, 2009 @ 7:32 am

    I don't know how many times I have read about the difficulty of Chinese characters. But it is NOT the main problem. You master the basic few thousand for everyday reading purposes in a few years (reading, that is, not writing). Then you reach the real obstacles, the two things that stop nearly all foreigners from reading Chinese quickly and smoothly for at least ten years. These are the massive new vocabulary burden, and the vagueness of the grammar. The much vaunted lack of 'endings' at this stage in your studies is not a blessing, it is a bloody nuisance. Singular/plural, object/subject, definite/indefinite, past/present, all has to be worked out from context. In hard text, you cannot even easily distinguish noun from verb sometimes. This is the real reason why so few foreigners can read Chinese well. (It is also why switching to pinyin would only make things even harder as the differences between the thousands of homophones would become invisible).

  33. Chris said,

    June 7, 2010 @ 7:50 am

    "(It is also why switching to pinyin would only make things even harder as the differences between the thousands of homophones would become invisible)."

    Does this mean such texts could never be read aloud and comprehended?

  34. Wentao said,

    April 30, 2011 @ 9:07 am

    On the contrary, I think most of my family and friends can at least get the simplified form of Taiwan 台湾 correct, and all of them can read the traditional form.

