Sinitic is a group of languages, not a single language


Pro-Cantonese sign in Hong Kong:

A man holds a sign professing his love for Cantonese as he attends a Hong Kong rally in 2010 against mainland China’s bid to champion Mandarin over Cantonese. Picture: AFP

The sign says (in Cantonese):

ngo5 oi3 gwong2dung1waa2 ("I love Cantonese")

m4 sik1 bou1dung1gwaa1 ("I don't know Putonghua [Modern Standard Mandarin / MSM]").

Note that Pǔtōnghuà / Pou2tung1waa6*2 普通話 ("MSM") is here written punningly as bou1dung1gwaa1 煲冬瓜 ("stewed winter melon").

It could also be written with another pun: paau4*2dung1gwaa1 刨冬瓜 ("shaved winter melon").

The above photograph and caption are from this sensible article by Lisa Lim in the South China Morning Post, "Language Matters" (9/29/17):

Why it’s hard to argue there is one Chinese language

To a linguist ‘the Chinese language’ is a family of languages – not dialects – that for the most part are mutually unintelligible and written different ways; an appreciation of this variety would help discussions about language policy.

Biographical note in the SCMP:

Lisa Lim has worked in Singapore, the UK, Amsterdam, and Sri Lanka, and is now Associate Professor and Head of the School of English at the University of Hong Kong. She is co-editor of the journal Language Ecology, founder of the website and co-author of Languages in Contact (Cambridge University Press, 2016).

Although some things the author says may be open to discussion (e.g., "Chinese" is comparable to the Romance or Germanic "families", is a branch of the Sino-Tibetan family, etc.), much of what she says is spot on (e.g., most of the "Chinese" language groups are mutually unintelligible, her calling into question referring to these groups as "dialects", and so forth).

Modern written Chinese is technically not bound to any specific variety, though it mostly represents the grammar and vocabulary of Mandarin. But Cantonese has its own written forms, for both formal (“High”) and colloquial (“Low”) varieties. The latter flourishes in Hong Kong, where, for instance, one finds 瞓 (fan) for “sleep” in addition to the more formal 睡 (sèoih).

[VHM:  Nobody would understand you if you used the term fan3 瞓 in Mandarin, even if you pronounced it fèn à la mandarin.]

In classrooms, Chinese texts are often taught using H Cantonese, with Putonghua pronunciation having little currency – for example, the word for “no, not”, realised as 唔 (m̀h) in colloquial Cantonese, is written as 不 in Standard Chinese, pronounced bù in Putonghua, but the formal H Cantonese pronunciation bāt is likely to be used. There is even Hong Kong Written Chinese, influenced by Cantonese and English.

Official references to these various systems are often blurred and confused under the label “Chinese language”. Parents’ and policymakers’ worries about students’ “Chinese language” proficiency, as well as the medium-of-instruction debate, will continue, with issues of mother-tongue-based education and national-vs-local identity at their core. A more nuanced appreciation of all that “Chinese language” encompasses will go a long way towards more fruitful discussions.

[VHM:  These are the last three paragraphs of the article.]

What a breath of fresh air Lisa Lim's article is!

[Thanks to Bob Bauer and Abraham Chan]


  1. Jenny Chu said,

    October 12, 2017 @ 11:50 pm

    Also interesting:

    1. The sign says 广东话 and not 廣東話 … :) well, I guess it has a certain target audience in mind. But the logo at the bottom (the name of the organization?) uses 廣人 and not 广人.

    2. "Official references to these various systems are often blurred and confused under the label “Chinese language”." –> this is true, but what's left out is that they are often deliberately blurred and confused by people who want to be able to interpret the references as they like.

  2. Mark Meckes said,

    October 13, 2017 @ 5:29 am

    As someone quite ignorant about Sinitic, I'd be curious to hear some of the discussion there is to be had about whether Sinitic is comparable to Romance or Germanic. More or less diverse? More or less mutually intelligible? Longer or shorter since it presumably used to be mutually intelligible? etc.

  3. richardelguru said,

    October 13, 2017 @ 5:52 am

    So this is the inverse of the old quip about a language being a dialect with an army?

  4. Victor Mair said,

    October 13, 2017 @ 7:39 am

    No, it's not a quip.

    See, among many others, the following posts:

    "Uyghur as a "dialect" — NOT" (10/1/13) — search for "army" in the comments at several places

    "Intelligibility and the language / dialect problem" (10/11/14)

    "Spoken Hong Kong Cantonese and written Cantonese" (8/29/13)

    "Devil-language" (5/25/14)

    "English is a Dialect of Germanic; or, The Traitors to Our Common Heritage" (9/4/13)

    For more on "dialect" and "topolect", do the following Google searches:

    victor mair language log dialect

    victor mair language log topolect

  5. Coby Lubliner said,

    October 13, 2017 @ 7:40 am

    @richardelguru: That quip was ironic, intended by its author, Max Weinreich, to represent the prevailing attitude of the authorities of his day, not his own. His specialty, after all, was Yiddish.

  6. nick m said,

    October 13, 2017 @ 10:30 am


    So the quip could be uttered (perhaps substituting "topolect" for "dialect") in just the same spirit as Max Weinreich's, by speakers of Cantonese or Shanghainese etc., about the Mandarin-speaking authorities in Beijing?

  7. Bart said,

    October 14, 2017 @ 4:43 am

    I’d like to repeat the question of Mark Meckes. Here’s how I’d put it (like Mark, avoiding tedious definitions of ‘family’, ‘group’ etc).

    Are the languages in the set that is called Sinitic comparable in degree of similarity to those in the whole set of languages called Germanic?
    OR Are the languages in the set that is called Sinitic comparable in degree of similarity to those in the whole set of languages called Indo-European?

    It seems a rather basic question but I've rarely seen it discussed.

  8. Alex said,

    October 14, 2017 @ 8:46 pm

    On similarities between Chinese topolects and Romance/Germanic languages:

    I make this comparison all the time when talking to interested friends who have some knowledge of European languages. Every time, though, I stress that it's just an approximation and that the situation with Chinese can't really be shoved into a different framework like that.

    I am also not a real Sinologist, but I do have a great interest in this topic. I've read countless articles and discussions, and I've traveled around China and spoken to a wide range of people about it.

    If we suppose that Standard Mandarin is equivalent to a broad Latin American standard of Spanish:
    I would map Sichuan dialect to a small regional language of Spain such as Asturian. (Very unfamiliar at first but with enough listening practice comprehension improves without additional study)
    Cantonese would be French. (Cannot learn just by listening, study of vocabulary is necessary).
    Taiwanese Hokkien would be Romanian. (Noticeably different grammar as well as vocabulary)
    Shanghainese would be… Sicilian? Hakka is Romansh? The metaphor starts to break up once you try to add more than three or four varieties.

    One thing that this metaphor does preserve is how isolated words can often be understood. You can't really have a discussion using two different varieties, but you can absolutely say "oh, you say parler, we say parlare!" Also, a speaker of several Chinese topolects absolutely has an advantage learning more – just like Romanian is made easier if you know French, Italian, and Spanish. I had a professor who natively spoke Mandarin, Taiwanese Hokkien, and Hakka, and she *was* able to learn to understand Cantonese just by listening.

  9. Yilin said,

    October 16, 2017 @ 10:59 am

    This reminds me of a book I once read about how to record dialect vocabulary. It offers the following methods: use a known character, check for a misused character and find the actual one, etc. It also says that creating a new character should be the last resort.

    I have seen a lot of native speakers of Cantonese who use a 'pinyin' input method to get Chinese characters that sound like Cantonese when they type on the computer, e.g., 我听日翻屋企 sounds like '我明天回家' (I will go home tomorrow) in Cantonese, but it means nothing semantically in either Mandarin or Cantonese. Not every 'different written form' leads to a different word.

    So 'different written form' just means it's a kind of way to record dialect vocabulary, I guess.

  10. Eidolon said,

    October 19, 2017 @ 6:39 pm

    "Are the languages in the set that is called Sinitic comparable in degree of similarity to those in the whole set of languages called Germanic? OR Are the languages in the set that is called Sinitic comparable in degree of similarity to those in the whole set of languages called Indo-European? It seems a rather basic question but I've rarely seen it discussed."

    It might be rarely discussed because studies often vary wildly on the standard of measurement for language similarity, with different researchers applying different methodologies; lexical and phonological interference from the common language also has a huge influence on the end result, and the fact that the vast majority of Chinese today know some degree of Standard Mandarin, with younger generations being increasingly fluent in it, makes it difficult to assess factors such as mutual intelligibility between Sinitic languages. It also varies from person to person. Two Spanish speakers may have different degrees of mutual intelligibility with German, even when neither has ever learned it.

    That said, I'll cite a few numbers. Keep in mind that these numbers vary massively between different studies, at times by as much as 40%, so don't be surprised by studies that don't agree…

    Lexical similarity: Tang et al. (2008) estimated Beijing-Cantonese at 24%. They also measured Wu-Mandarin at ~30%, Min-Mandarin at ~20%, Sichuanese-Mandarin at ~45%, and northern varieties of Mandarin – ie between Beijing, Shandong, Shaanxi, etc. – generally at ~60-70%. Northeast Mandarin should be even closer though it wasn't evaluated, since it is practically mutually intelligible with Standard Mandarin.

    The lexical similarity between English and French is, by a popular measure, 27%. English-Russian, 24%. English-German, 56% to 60%. Most Romance languages are a lot closer – ie Spanish-French 75%, Romanian-French 75%, etc.

    Based on this comparison, the difference between Beijing and Cantonese is like the difference between English and Russian, or English and French – ie, across Indo-European language families within Europe. The difference between Beijing and Min might be more like the difference between Russian and German, which are less like each other than Russian and English.
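
    The matching in the paragraph above can be reproduced mechanically. The following is only a rough sketch in Python: it uses the percentages quoted in this comment and nothing else, and it assumes (purely for illustration) that a quoted range such as 56-60% can be represented by its midpoint. The names SINITIC, EUROPEAN, midpoint and closest_european are made up for this example. Since the figures come from different studies with different methods, the output is a rough pairing, not a measurement.

```python
# Lexical-similarity figures quoted in the comment, in percent.
# Single figures are stored as (x, x); ranges as (low, high).
SINITIC = {
    "Beijing-Cantonese": (24, 24),
    "Wu-Mandarin": (30, 30),
    "Min-Mandarin": (20, 20),
    "Sichuanese-Mandarin": (45, 45),
    "Northern Mandarin varieties": (60, 70),
}
EUROPEAN = {
    "English-French": (27, 27),
    "English-Russian": (24, 24),
    "English-German": (56, 60),
    "Spanish-French": (75, 75),
    "Romanian-French": (75, 75),
}


def midpoint(rng):
    """Represent a (low, high) range by its midpoint (an assumption)."""
    low, high = rng
    return (low + high) / 2


def closest_european(pair):
    """Return the quoted European pair whose figure is nearest to the Sinitic pair's."""
    target = midpoint(SINITIC[pair])
    return min(EUROPEAN, key=lambda name: abs(midpoint(EUROPEAN[name]) - target))


for pair in SINITIC:
    print(f"{pair}: closest quoted European pair is {closest_european(pair)}")
```

    With these figures, Beijing-Cantonese pairs with English-Russian and Wu-Mandarin with English-French, in line with the comparison drawn above.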

    On the other hand, northern varieties of Mandarin can be said to form a family similar to Germanic or Romance. The difference between northern Mandarin and southwestern Mandarin – ie Sichuanese – can be said to be either the difference between distant languages of the same family, or very similar languages from different families. Again, this is with respect to Europe.

    So all in all, I'd say the situation in China compares well with the situation in Europe, with respect to language diversity. The Chinese Sinitic languages are like the European Indo-European languages. The one difference, however, is that educated Chinese today can almost universally *write* Standard Mandarin, while the same is not the case in Europe due to spoken and written English – the European lingua franca – being closely associated, and Russians/Ukrainians being generally poor at both. Thus, educated Chinese can generally communicate with each other through writing, even though their Standard Mandarin proficiency might be just as poor as, or poorer than, European proficiency in English.

  11. Kanreian said,

    October 22, 2017 @ 5:54 am

    @Professor Victor Mair:

    I noticed that your personal website still says "Professor, Chinese Language and Literature". Maybe "Language" should be changed to "Languages"?
