Colloquial Cantonese and Taiwanese as mélange languages


Charles Belov writes:

My understanding was that Hong Kong newspapers, newscasts, and popular Cantonese songs use literary Chinese exclusively while Hong Kong star magazines and Cantonese hip-hop (e.g., LMF, Softhard) use colloquial Cantonese exclusively. But today as I was walking along, an old Beyond song, 俾面派对, was earworming me and it suddenly hit me that, unlike most Cantonese songs, and like Cantonese hip-hop, which it isn't, it includes colloquial Cantonese, specifically 唔 and 佢 (and, as it turns out, "D").

Now that I am home and can listen to the song, it also contains literary 不 and 的, both of which have colloquial Cantonese equivalents. (There may be other examples of both literary and colloquial in this song but this is the extent of my Cantonese.) I'm puzzled, especially about 不 (literary "not") and 唔 (colloquial "not") appearing in the same song. My expectation would have been that any particular Cantonese work, unless it's including direct quotes, would either just use literary Chinese or just use colloquial Cantonese, but not mix the two. Could you please shed some light as to what is going on here? Is this common and just something I haven't noticed before because my Chinese is so minimal, or is this a unicorn?

Music video with lyrics here.

My impression of colloquial Cantonese is quite different from that of Charles, i.e., that there is no hard and fast line between pure topolectal Cantonese (jyut6 jyu5 粵語) and standard Chinese (zung1 man4*2/4 中文), but that they exist on a continuum of more or less of one and the other (see the book by Donald Snow listed in the "Selected readings" below), and that in practice there is no such thing as "pure" written Cantonese.

To confirm my impression, I asked Bob Bauer, author of ABC Cantonese-English Comprehensive Dictionary, what he thought of the language in the video.  His reply:

Your LangLog reader seems to assume there should be consistency in writing styles of Hong Kong Chinese – on the one hand, the text is written completely in standard Chinese, or on the other, everything is written in colloquial Cantonese – that is, without mixing together different linguistic styles and varieties in one text.

But this is not how writers write in Hong Kong. What I observe every day is “messy mishmash”, the lack of consistency.

Yesterday one of the headlines on the front page of Apple Daily was: 疫捲五GYM房 jik6 gyun2 ng5 zim1 fong4/2 ‘the (corona)virus spreads to five gyms’. This word GYM房 zim1 fong4/2 ‘a place where a person uses specially-designed equipment to achieve and maintain physical fitness’ has become a typical Hong Kong Cantonese loanword from English that can be mixed with standard Chinese on the front page of Hong Kong’s most popular newspaper.

At any rate, for the moment I don’t have an elegant, or even adequate theory to explain this mixing phenomenon in Hong Kong written Chinese – other than to simply characterize the Hong Kong Chinese writer as having only one brain in which is stored “the Chinese language” as one broad, amorphous category, and not as individually-defined, discretely-segregated categories of standard Chinese, colloquial Cantonese, Hong Kong Chinese, Hong Kong English loanwords, mainland Chinese, etc.

Don Snow, an authority on written Cantonese, explained the situation as follows:

While it might seem reasonable to assume that texts written in Cantonese would faithfully and consistently follow the norms of the spoken language, it is actually very common for texts to be written in a mix of Cantonese and other varieties of Chinese. Sometimes it is easy to describe what the "game rules" for such mixed-code texts are; for example, in some texts the narration is all in Mandarin/Standard Chinese and only direct quotes are written in Cantonese. However, it is also not at all unusual for texts to mix the two varieties in ways that make it hard to explain why a Cantonese or Mandarin word is used where it is. So, with regard to the Beyond song, I think the best way to explain what is going on is to say that the lyrics are mainly in Mandarin (which is normal for Cantonese pop songs) but that it is also peppered with Cantonese words to give it a clear Cantonese accent.

Of course, Cantonese speakers would probably not perceive this as being a mix of Mandarin and Cantonese, but rather a style in which the base language is a somewhat literary form of written Chinese – of the type taught in Hong Kong schools – and the spicing comes from colloquial words, the kind a Hong Kong school teacher would tell students not to use when writing.

Another colleague who has been long resident in Hong Kong replied thus:

Yes, your reader seems confused about the nature of modern Cantonese. "Normal" Canto just mixes everything. It's true that newspaper articles and student essays will tend to use the "standard" writing system and phrasing shared by the various Chinese languages, so usually they're easily intelligible to someone from Shanghai or Taipei. But there's often a bit of Canto flavour mixed in — phrasing that's a bit different. In everyday speech, though, anything goes. I can't imagine any popular music or speech not using 唔 and 佢. But I also wouldn't think of 不 as "literary," per se. It would be interesting for someone knowledgeable (not me!) to look into when exactly 不 is used and when 唔. For example, 不如 is used all the time colloquially for suggestions; I don't think anyone would say 唔如.

Yes, apparently "俾面派对" really is a "giving-face party." Wong Ka-kui 黃家駒 didn't like being forced to attend all these industry social events to get ahead in the music business. He just wanted to play rock and roll!

Charles also sent along a music video from Taiwan that mixes things up quite a bit:

I ran across a Mandarin music video from Taiwan that uses some apparently Taiwanese cultural terms written using Latin characters (not to mention that the group's name is Fun4) as well as some conventions which seem strange to me.

0:30 High (???)

1:39 kuso and Orz (I was able to find these from a search; kuso (said as a word in the song) originally stood for unintentionally funny bad video games and has expanded to mean anything funny, while orz (spelled out in the song) represents a kneeling person emoticon.)

1:46 written but not sung tiny 我的 in the middle of a line

2:21-3:01 they are inserting small circles and, in a couple cases, a long spacing line; I'm guessing this might be a stylistic choice but my nearly non-existent Chinese is unable to make a guess as to their purpose.

Although the group is clearly singing in decent Mandarin, they feel free to play around with all sorts of other linguistic spices. I will not go into the Taiwan song in the depth and detail we expended on the Hong Kong song, but will just mention two terms that stand out:

Kuso is a term used in East Asia for the internet culture that generally includes all types of camp and parody. In Japanese, kuso (糞,くそ) means fuck, shit, damn, and bullshit, and is often said as an interjection. It is also used to describe outrageous matters and objects of poor quality. This definition of kuso was brought into Taiwan around 2000 by young people who frequently visited Japanese websites, and it quickly became an internet phenomenon, spreading to Taiwan and Hong Kong and subsequently to the rest of China.


Orz is a posture emoticon representing a kneeling, bowing, or comically fallen over person.  The O represents the head, the r represents the torso and arm, and the z represents legs.

(source, source)

It would be a good exercise for Language Log readers to ponder why this amorphousness exists for Cantonese, Taiwanese, and other Sinitic topolects and what its implications are for their future.


Selected readings


[Thanks to Pui Ling Tang]



  1. Zheng-sheng Zhang said,

    March 15, 2021 @ 5:07 pm

    My two cents:

    I think written Cantonese is neither completely homogeneous nor a random mishmash. Two considerations may be relevant:

    1. The proportion of Cantonese vs. non-Cantonese elements may depend on the audience and subject matter. Apple Daily may be different from other papers in having more of a local flavor. The exact proportion can be investigated quantitatively.

    2. Whether a given element is grammatical or lexical. I have noticed that 不 in lexical elements (as in 不如) cannot be replaced with 唔.

  2. David C. said,

    March 15, 2021 @ 7:17 pm

    Cantopop lyrics usually seek to maintain rhyming, even when it's at the expense of meaning, so the mixing of written and colloquial language occurs quite a bit to make a rhyme work. Cantopop in the 1960s up to the early 1990s and beyond even included literary Chinese in the mix. The Wikipedia article (in Chinese) for one of Hong Kong's best known lyricists James Wong Jim describes it as a transition from a literary/vulgar dichotomy in the 1960s to one that was a fluid cross between wenyan (classical Chinese), baihua (modern standard Mandarin), and colloquial Cantonese that Wong created. As the article mentions, his Cantonese adaptation of "It's a Small World", 世界真細小, is a wonderful example. In just the first three lines, you'll see all three at play:


    A more recent example of mixing written and colloquial language in Cantonese is Internet slang 是咁的 (literally "that's how it is", meaning "let me explain") – a mix of the colloquial 係咁嘅 and the written 是這樣的.

    The continuum present in diglossic regions is well described in previous posts on the Language Log. Two that come readily to mind are Québec French and Swiss German, which also mix "high" and "low" dialect/language varieties in popular culture.

  3. Krogerfoot said,

    March 18, 2021 @ 11:30 pm

    The Wikipedia source on Japanese kuso gives the impression that the word "means" fuck, but only in the sense that it's used as an interjection. It more literally means shit/crap and is appended to nose and eyes to cover the substances that we excrete there, as well as the metaphorical meaning of low quality: 下手くそ hetakuso "[be] crap [at something], i.e. no good, clumsy, bad at"; foolishness:「しかしー」「しかしもくそもあるもんか」"But—" "Don't give me that 'but' shit!"

    "Fuck" really overstates how profane kuso is in Japanese. Japanese children use it as an interjection all the time with nary a batted eye.

  4. Chas Belov said,

    March 19, 2021 @ 2:50 pm

    Thank you for this thorough discussion. I will certainly follow up on the references.

  5. KIRINPUTRA said,

    March 20, 2021 @ 9:56 am

    Some mixed-up thoughts (mine) on this theme…

    In “native” terms, the “mishmashery” is just between “Chinese script” & “English script”.

    In pre-modern times, erudite people knew what “pure” “refined” (high-register) writing was. But there was no concept of pure “vulgar” writing.

    In modern times, “multiple high registers” has happened: Mandarin, Japanese, & English on top of the old “Hanmun” (漢文).

    Madame Iûⁿ Chúi Sim’s diary for 1928: many entries written in romanized Hokkien (speaking in pre-1947, “pre-Taioanese” terms); other entries in what she would’ve thought of as Hànbûn (漢文) cut with “Japanese” (katakana). There’s mixing in almost every direction, script-wise & language-wise. Even the romanized parts are partly (romanized) Hànbûn.

    So Middle China’s vernacular writing revolution streamlined things for Middle China. For the Deep South & Formosa, it just officially imposed another high register. The tropical literati accepted almost reflexively.

    The end game of today’s (reversible) trends & policy is in sight: death of Taioanese & later Cantonese; regional “vulgarisms” & a higher percentage of “English” script use as lingering markers of identity.

    Core issue seems to be that Cantophone & Taioanese society have outsourced their war functions. Use of “Chinese”, “English”, and “Japanese” scripts is like “investing” in armies & navies. The blend (“portfolio”) of scripts a 人 or group uses is a reflection of identity. Rediscovered war functions — nothing hardcore, just what Vietnam & S. Korea have — could “make Cantonese & Taioanese great again”.
