Language is not script and script is not language


Trying to clear up the confusion between the two is a battle we have been waging for decades, and nowhere is the problem more severe than in the study of Sinitic languages and the Sinographic script.  The crisis (not a "danger + opportunity"!) has come to the surface again this month with the appearance of a new book by Jing Tsu titled Kingdom of Characters: The Language Revolution That Made China Modern (Riverhead Books, 2022).

The publication of Tsu's book has generated a lot of excitement, publicity, and reviews.  Here I would like to call attention to the brief remarks of an anonymous correspondent (a famous, reclusive linguist) that are right on target:

Reimagining "antiquated" Chinese

Reproduced below is the text of a book review in Science that you may not have seen. It is classified as "Linguistics", though the reviewer is a historian at Cal State Poly, Pomona. Notice that Chinese is assumed to be "antiquated" and in need of being "reimagined"!  There is simply no sign of Science understanding the difference between a human language and a writing system. This is consistent with the way they have always treated linguistics; they have no idea what the subject really is.

Here are the beginning and ending of the review to which our linguist colleague is referring:

Science 14 JANUARY 2022 • VOL 375 ISSUE 6577 page 151

An antiquated language, reimagined
A new tome traces efforts to unify, reform, and modernize the Chinese language

By Zuoyue Wang

As China’s scientific, technological, and economic developments continue to propel its rapid rise and geopolitical tensions, there is a hunger for information on the country’s past and present. Among the new crop of books being published to meet this demand, Kingdom of Characters by Yale professor of East Asian languages and literatures Jing Tsu stands out as a lively and insightful history of the intersection of China’s information technology systems and its language revolution. The book is a richly documented, riveting, and scholarly rigorous transnational account of how Chinese evolved from a hard-to-learn script entrenched in the beleaguered Middle Kingdom in the 19th century to a global language in the 21st century.

Tsu devotes the last chapter of Kingdom of Characters to the globalization of the Chinese language. In the book’s conclusion, she recounts a contentious 2018 conference in Hanoi, Vietnam, where a Unicode group adjudicated new Chinese characters proposed for international acceptance. This episode illustrates a larger point: that even as the Chinese language goes global, it still stirs cultural and geopolitical tensions, especially in an era when China seeks to expand its influence in the world.

Jing Tsu was featured in this Language Log post:  "How many more Chinese characters are needed?" (10/25/16).  Two years later, she was still in quest of ways to inject more Sinographs into Unicode, which led her to participate in that "contentious 2018 conference in Hanoi, Vietnam" (10/22-26/2018).  Professor Tsu was also very much in evidence at a still later conference, as recounted by a linguist-computer scientist who represented one of the big players there (part of a much longer series of communications about the conference from him to me):

I encountered Jing Tsu once at a meeting when it took place in Shenzhen (10/21-25/2019). She seemed to understand virtually nothing about Unicode and what it does and is for. (Unicode is in essence a giant registry for letters/characters of all writing systems to be used on IT systems, together with databases about properties of letters/characters, e.g., uppercase vs. lowercase, whether something is a number, etc.) Jing Tsu had somehow been allowed to be an "observer" for our meetings and attended them without a clue of what was going on. (She also promised the chair to hold one of our meetings at Yale; normally they are always either somewhere in Silicon Valley or in East Asia. Due to Covid, the Yale location seems to have been abandoned for the time being.) We have certain procedures to follow when we decide whether a new Sinograph gets "encoded" (added to the Unicode character database) or whether it gets rejected (because it is merely a graphical variant of an existing character). There are many borderline cases.

Anyway, she kept asking me and other people very very strange questions during the whole week. For example: "Oh, you went to the Unicode conference! Was it like this meeting here?" [No! It was quite different; that one was about the giant Unicode standard in general, with many introductory lectures, not a working meeting for CJKV experts.] "What are emoji?" Here is an answer of mine, close to what I told her: Emoji originated as vendor-specific small images expressing conversational moods, and at some point Japanese telecommunication carriers agreed on a standardized set of emojis, presumably to facilitate their use in textual exchanges such as via SMS. And that standardized Japanese set was part of what convinced Unicode to add emoji to its character set, after long debates. And now anyone can propose to have an emoji added to Unicode. The notion itself is a bit variable, just like the notion of "letter" or "computer" or "Chinese" or just about any human concept is fuzzy around the edges and might change or shift over time. That's it.

But she kept asking me the same … question ("But what are emoji really?") another 2-3 times without showing any sort of indication that she had actually understood any of what I said. And then she asked other questions about our work repeatedly, as it relates to the different standardization documents and organizations we are working with. Her questions were repetitive and confusing, and at no time did I feel that she understood any of it. And, mind you, the topics and answers we discussed weren't necessarily technical.
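The correspondent's description of Unicode as a registry of characters plus databases of their properties can be made concrete with a few lines of Python. This is only an illustrative sketch using the standard-library `unicodedata` module; the helper name `describe` and the sample characters are assumptions chosen for the example, not anything taken from the meetings described above.

```python
import unicodedata


def describe(ch):
    """Look up a few Unicode Character Database properties for one character.

    `describe` is a hypothetical helper written for this illustration.
    """
    return {
        "code_point": f"U+{ord(ch):04X}",              # the registry entry
        "name": unicodedata.name(ch, "<unnamed>"),      # official character name
        "category": unicodedata.category(ch),           # e.g. Lu = uppercase letter
        "numeric": unicodedata.numeric(ch, None),       # None if not a number
    }


# Sample characters from three scripts (Latin, Han, Hangul), chosen arbitrarily.
for ch in ["A", "a", "7", "五", "中", "한"]:
    print(ch, describe(ch))
```

Note that every property here belongs to a character in a script. Nothing in the lookup says which language a text is written in: the code point for 中 is the same whether it appears in a Chinese or a Japanese sentence.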

It is clear that Professor Tsu was obsessed by the relationship between emoji and Chinese characters.  Another topic that has preoccupied her is the development of the Chinese typewriter and what Zuoyue Wang calls the "Chinese script’s incompatibility with the Western alphabet".

A key to the "modernization" of the Chinese language & script nexus is what I have called the phoneticization of the latter (see the list of "Selected readings" for some references).  Thus, as William C. Hannas has opined, the alphabet has both threatened and rescued the Chinese writing system.


Selected readings


[h.t. Don Keyser]


  1. Jim Breen said,

    January 23, 2022 @ 9:01 pm

    Of course Chinese characters are a language. Dan Brown said so in one of his novels, so it must be true.

  2. AntC said,

    January 24, 2022 @ 12:54 am

    Quaint that the link to Science's review gives camera-copy (despite the URL showing it's an 'epdf', and I couldn't find a more presentable format on its website). Its kerning and dithering make it quite unpleasant to read (on my screen, at least), and I guess far worse on many a tablet.

    If only there were some technology to encode text content and layout so it could be fluidly adapted to a variety of formats and display devices. Wait … So apart from this 'Science' publication not understanding Language vs Script, they seem not to know much about Script full stop; nor about publishing.

    This poor rendering is presumably deliberate, to encourage me to "Get full access to this article".

  3. ~flow said,

    January 24, 2022 @ 2:14 am

    It's sad to see when people with little understanding of the subject matter get such big publicity, which just helps to disseminate fake knowledge. How people can earnestly write and believe in drivel like "Chinese evolved from a hard-to-learn script entrenched in the beleaguered Middle Kingdom in the 19th century to a global language in the 21st century" is beyond me.

    As for the analysis of 危機 as 危險機會 (and the critique of it) I am a bit on the fence though. When we learn languages including our own and encounter factually or seemingly, overtly or opaquely compound words we certainly do have a drive to analyze them into their parts: so a blackbird is a black bird; a typewriter is something (or someone?) that writes type (or types writing?); a breakwater is something that breaks water (waves); a strawberry is a berry that grows (or is preferably grown on?) straw and so on. It can be misleading, you have to guess the missing parts, the 'solution' is often hardly unique (there are many kinds of black birds that are not called blackbirds, only one kind gets the label), sometimes it makes no sense to the modern user (as in cranberry), and one has to account for and give precedence to the conventional meaning over the 'etymologically correct one', all of that. But then I go to and find this explanation: [危机]是有危险又有机会的时刻,是给测试决策和问题解决能力的一刻,是人生、团体、社会发展的转折点,生死攸关、利益转移,有如分叉路。 so it does look like at least this single native source has bought into the very American way of thinking of 危機 'crisis' as 危險機會 'dangerous opportunity'. One must add that a plethora of Chinese words are formed and can be understood this way, just like their English counterparts, by filling in (presumably) missing parts, e.g. 簡明 is 簡單明瞭 &cpp. Of course, that it works sometimes doesn't mean it's correct for 危機 at all (the word could be unanalyzable like 蝴蝶 after all) or that the correct expansion is indeed 危險機會; however, one can likewise not exclude the possibility that a sizeable number of native speakers would agree with the explanation offered by and criticized above. So, like I said, I'm sitting on the fence here, between a rock and a hard place, and the jury is still out.

  4. alex said,

    January 24, 2022 @ 4:14 am

    It's funny, I was going to use the same quote that ~flow did:

    "and scholarly rigorous transnational account of how Chinese evolved from a hard-to-learn script entrenched in the beleaguered Middle Kingdom in the 19th century to a global language in the 21st century"

    The amount of evolution is like going from hand-washing clothes to using a machine.

    Yes, the alphabet saved the Chinese script, as did technology. Perhaps it's a curse, as if not for technology maybe they would have finally transitioned to pinyin. The kids will suffer for another decade if not more.

  5. DJL said,

    January 24, 2022 @ 9:09 am

    a famous, reclusive linguist?

  6. David Marjanović said,

    January 24, 2022 @ 10:49 am

    This poor rendering is presumably deliberate, to encourage me to "Get full access to this article".

    丁丁! We have a winner.

    a famous, reclusive linguist?

    Scientists get famous by publishing, not by meeting people.

  7. J said,

    January 24, 2022 @ 10:52 am

    To add to the 危机 discussion, Chinese media and leaders at all levels have used phrases like 化危为机、危中寻机 a lot in recent years as well (sometimes with quotes around "危" and "机")

  8. Victor Mair said,

    January 24, 2022 @ 11:16 am

    This was not meant to be a post about the fallacy of dangerous opportunities, a topic that we've addressed dozens of times, going back almost to the beginning of this blog, although it turns out that the problems of morphological analysis raised by the allure of the (mostly) morphosyllabic script do relate to the matter of the relationship between language vs. script that this post is about.

    Native speakers are as prone to misinterpreting the origins of words as second language learners are (perhaps more so) — folk etymologies, false etymologies, etc.

    "How much wood would a woodchuck chuck if a woodchuck could chuck wood?"

    Such pitfalls abound in the history of Sinitic languages, e.g., the words for "phoenix", "butterfly", "balloon lute", even words as common as "country", "university", "grammar", "literature", and so forth and so on; we've covered hundreds of them on Language Log.

    I think it's about time for us to do a general debunking of overly enthusiastic etymologizing. The last general one I can remember was in 2005, "Etymology as argument". Give me a week or so to find the time.

  9. Aelfric said,

    January 24, 2022 @ 12:19 pm

    Feel the need to add that I was watching some B-movie this weekend, and there was this bit of dialog: "What language is that?" "Cuneiform!" I winced.

  10. Stephen Hart said,

    January 24, 2022 @ 1:48 pm

    "If only there were some technology to encode text content and layout so it could be fluidly adapted to a variety of formats and display devices."

    Slightly off topic, but the latest macOS, Monterey, includes very good built-in OCR. I just selected and copied directly from the image of the article and pasted this (though I did have to fix the drop cap A):

    As China's scientific, technological, and
    economic developments continue to
    propel its rapid rise and geopolitical
    tensions, there is a hunger for infor-
    mation on the country's past and pres-
    ent. Among the new crop of books
    being published to meet this demand,
    Kingdom of Characters by Yale professor
    of East Asian languages and literatures
    Jing Tsu…

  11. KWillets said,

    January 24, 2022 @ 3:51 pm

    Wired has an excerpt here:

    This section seems less controversial; it has little linguistic discussion but rather a history of one encoding and input method.

  12. Bob Ladd said,

    January 24, 2022 @ 5:38 pm

    There's also a longish review of Jing Tsu's book in a recent New Yorker. The review skirts around the edges of the language/script confusion, and mostly avoids falling in. But it's pretty clear that the notion of a clear distinction between language and writing is subtler than most linguists realize (or remember), and that it's a difficult idea for literate people everywhere.

  13. Beirne said,

    January 24, 2022 @ 8:20 pm

    I listened to her interview on the New York Times Book Review podcast, and at about 15:50, when explaining how the Chinese figured out a way to make a typewriter type a character in four keystrokes or fewer, she compares it to English words, which she says average a bit over five keystrokes. I assume the numbers are correct, but she is comparing Chinese characters to English words, which suggests a serious lack of understanding of characters. An awful lot of English words require multiple Chinese characters to get the equivalent meaning. There is no single character for "bicycle", for example. So either she doesn't understand how Chinese characters work or she distorts facts to make a point.

  14. Terpomo said,

    January 25, 2022 @ 12:05 am

    It's not only Chinese that suffers from that. It seems like Korean is commonly conflated with Hangul; I hear that many Koreans take Hangul Day as an opportunity to opine about the Korean language rather than the script itself. Korean translations are also often labeled 한글판 "Hangul version"; I know it would make me a pedantic jerk but I almost want to publish a "한글판" of something that's literally just the original text transliterated in Hangul to make a point.

  15. John Swindle said,

    January 25, 2022 @ 8:25 am

    One or more publicly funded health plans in Hawaii offer a phone number to call for help in 繁体中文 (i.e., Traditional Chinese). I wonder whether it sounds different from Simplified Chinese.

    Once we’ve distinguished script from language, though, might not Professor Tsu still have something interesting to say about the former?

  16. AntC said,

    January 25, 2022 @ 3:25 pm

    By coincidence, a language is not a script has been the main stumbling-block in tracing the origins of Arabic — according to this 2019 lecture.

    Dr Al-Jallad traces inscriptions in a tongue that is a precursor to Arabic as far back as the first half of the first century BCE, but in a variety of scripts (including Greek, which very helpfully captures the vowels). It wasn't until nearly the time of the Prophet that identifiably 'Arabic script' appears.

  17. David C. said,

    January 25, 2022 @ 6:38 pm

    It can go the other way too. For some time (and perhaps it is still the case), British Airways had the option to select "官话" as one of the interface languages within the in-flight entertainment system. Probably someone just googled "Mandarin" and took what they saw from the Wikipedia entry.

    The confounding of script and language has also been compounded by the fact that 中文 (zhongwen, Chinese script) became the politically neutral term to denote Standard Mandarin Chinese, over 國語 Guoyu or 普通話 Putonghua.

  18. Andrew Usher said,

    January 25, 2022 @ 11:42 pm

    Perhaps some of the difficulty lies in the distinction between a specific script and writing in general. The statement "script is not language" may well be taken (as I did at first) as a declaration that written language is not language, with which almost everyone would disagree.

    There are a number of languages that have changed script (usually to Latin) and have not thereby broken continuity as a language, and that's what I might use to illustrate the point.

    k_over_hbarc at

  19. Bathrobe said,

    January 25, 2022 @ 11:56 pm

    Buruma's review at the New Yorker (mentioned by Bob Ladd) is well worth reading.

    Buruma does not skirt around the written/spoken distinction; he sticks to talking about the written language, which also encompassed the abandonment of Classical Chinese and the adoption of baihua. What is particularly interesting about the Buruma article is his invocation of the political aspect of language modernisation (with a swipe or two at Tsu).
