Cantonese "here"

The first comment to my post on "Multilingual voting signs" (11/9/12) was by Alinear, who stated that cǐ chù 此處 ("this place") sounds like Cantonese to him.  As a matter of fact, as reader ahkow pointed out in the second comment, cǐ chù 此處 ("this place") is simply the literary / classical Chinese way of writing "here".  Both cǐ 此 ("this") and chù 處 ("place") occur on the oracle bones, so this means that they have been a part of Sinitic vocabulary since around 1200 BC.  Where they might have come from before that time remains to be determined.

A search on CantoDict yields 1 single-character entry and 29 polysyllabic entries that mean "here" or are related to that English word in one way or another.  Having studied a bit of Cantonese, and having consulted with Cantonese friends to verify what I remember from my Cantonese classes, I identified ni1dou6 呢度 as the most authentic and common Cantonese word for "here".

Now, ni1dou6 呢度 ("here") is a very high-frequency term that is a basic part of the vocabulary, yet it is noteworthy that neither of the morphemes of which it is composed is of obvious Sinitic derivation.  In Mandarin, 呢 has three pronunciations:

ní — as part of the word nízi 呢子, signifying a type of heavy woolen cloth

ne — used at the end of a sentence to indicate a rhetorical, special, or alternative interrogative

nī — a technical term in esoteric Buddhism

In the Cantonese word for "here", ni1dou6 呢度, the ni1 is being used to transcribe the sound of a morpheme for which no Chinese character exists.  This phenomenon of morphemes that do not match any known Chinese character is extremely common in Cantonese, Taiwanese, Shanghainese, and other non-Mandarin topolects.  The same holds true for colloquial forms of Mandarin, such as Sichuanese and Pekingese.  What seems ironic is that it is often the highest frequency morphemes in the spoken languages that lack a good Sinitic pedigree.

The other morpheme in the Cantonese word for "here", namely dou6 度, is also being used to transcribe the sound of a Cantonese morpheme for which no Chinese character exists.  Because there are no proper characters for writing Cantonese ni1dou6 ("here"), I have long suspected that it is not Sinitic at all, but that it derives from some substrate language that was present in southeast China long before the so-called Han people started moving in from the north.  I believe that much of the elaborate particle system and many other aspects, elements, and features of Cantonese also derive from substrate, non-Sinitic languages, but in this post I shall concentrate only on ni1dou6 ("here") to keep it within manageable limits and to serve as a powerful example of a characteristic of Cantonese that is generally ignored.

Bob Bauer, the doyen of Cantonese linguistics, has confirmed my suspicion about the non-Sinitic origin of ni1dou6 ("here"):

There is a good case to be made that Cantonese 呢度 is not Chinese but has originated from some non-Chinese languages of the area.  Last night I just happened to be looking at page 46 of Zhan Bohui's 詹伯慧 masterful work, Guǎngdōng yuè fāngyán gàiyào 广东粤方言概要 = An Outline of Yue Dialects in Guangdong (2002, Jinan Univ. Press), which has a sentence at the top saying that Cantonese has some words that are shǎoshù mínzú de cí 少數民族的詞 (words that are from ethnic minority languages); below that are three sets of paired lexical items with words from Zhuang 壯, Li (Hlai) 黎, and Yao Mian 瑤勉 on the left and the comparable Cantonese words on the right. The last item on the page is Yao Mian "[dǎu³³] 指示地點的語素,相當於這裏、那裏的裏 ('morpheme indicating location; equivalent to the 裏 of 這裏 [here] and 那裏 [there]')". On the right are the comparable Cantonese lexical items: [tou²²] 呢度、嗰度 comparable to Mandarin 這裏 ("here") and Mandarin 那裏 ("there").

According to page 231 of Yáozú yǔyán jiǎn zhì《瑤族語言簡志》 (A Brief Description of Yao Language) (1982) by 毛宗武 et al., Yao Mian 瑤勉 [na:i¹ dau¹] is equivalent to Mandarin 這裏.

On pages 111 and 112 of A Handbook of Comparative Tai (1977), Li Fang Kuei reconstructed Proto-Tai *nii [= *ni:] and *hnii [= *n̥i:] with tone C2 for 'this, here', based on a number of forms he listed on pages 112 and 113 from Tai languages in which the vowel is [i], [i:], [ai], or [nei] and the tone either C1 or C2.

Page 802 of《壯語方言研究》by Zhang Junru et al. 張均如等 lists the forms for "這" and "這裏" in 36 Zhuang dialects; 33 dialects all have similar forms of [ni], [nei], [nai], etc.

Comparable forms for 'this' and 'here' are listed on page 870 of Wang Jun's 王均的《壯侗語族語言簡誌》for 11 Tai languages.

For those who are interested, the enormous difference between Cantonese and Mandarin may be seen from this Wikipedia article that gives Cantonese words in the first column, their Mandarin equivalents in the second column, and annotations in the third column.  Similar analyses could be carried out for many other basic words in Cantonese, and indeed in Taiwanese, Shanghainese, and other Sinitic topolects.

Just looking at the words for "here" in the twenty different topolects across the length and breadth of China that are presented in Hànyǔ fāngyán cíhuì 汉语方言词汇 (A Lexicon of Sinitic Topolects), 2nd. ed., p. 559a, only six of them — all so-called Mandarin topolects — resemble Modern Standard Mandarin (MSM) zhèlǐ 这里 ("here").  All the rest vary wildly in their phonology and morphology, and ten of the thirty-one forms cited cannot even be written with characters, while others require special characters or transcriptional use of characters.  Thus well over a third of all the words for "here" in Sinitic languages may be suspected of being derived from non-Sinitic substrate languages.

What is true of the words for "here" in Sinitic languages is true of much of the basic vocabulary of the non-Mandarin topolects.  It is mostly only higher-level and literary vocabulary that is similar across the full range of topolects and can be written fully with the standard set of characters.

Chinese historical linguistics is in its infancy.  There is still so much work that needs to be done with regard to phonology, morphology, and etymology.  Similarly, for a language family that has over a billion speakers, the classification of its constituent branches, languages, and dialects has barely begun, with most scholars still maintaining that there is only a single Sinitic language throughout all time and space.

As I have done before, I again make an appeal to view Sinitic as a language family or large group of languages, not a single language with countless mutually unintelligible varieties.  See this chapter on the classification of Sinitic in the Festschrift for Alain Peyraube:  "The Classification of Sinitic Languages: What is 'Chinese'?" by Victor H. Mair (梅 維恒).

[1st draft November, 2012]

[Thanks to Don Snow]


  1. Eidolon said,

    October 15, 2015 @ 1:38 pm

    There is a fundamental difference between viewing Sinitic as a language family and viewing it as a large group of languages [that aren't even related]. I thought the former was already proven? Is there any reason to believe that Sinitic isn't a language family the way, say, the Romance languages are?

  2. Rubrick said,

    October 15, 2015 @ 4:32 pm

    with most scholars still maintaining that there is only a single Sinitic language throughout all time and space.

    Wow, is this really true? Most people, no doubt, but I'd gotten the impression that serious linguistic scholars knew better. If not, I'd wonder what the threshold for "scholarship" is considered to be.

  3. David Moser said,

    October 16, 2015 @ 1:01 am

    I just installed a new Windows 10 operating system on my work computer, and noticed that the name of the icon that used to be "My Computer" 我的电脑, or just 计算机 "computer", has now been changed to ci3 dian4nao3 此电脑 "This Computer." Evidently the use of ci3 is increasing.

  4. ErikF said,

    October 16, 2015 @ 6:51 am

    @David Moser: Possibly that is true, but in this case Windows 10 changed the wording in English as well so the change is likely a branding issue; all of my computers show "This PC" now instead of "My Computer".

  5. JQ said,

    October 17, 2015 @ 1:44 pm


    From Vista onwards, the English version of "My Computer" became just "Computer".

    I would have translated "This PC" as 本电脑 rather than 此 though.

  6. Alain lucas said,

    October 18, 2015 @ 9:27 am

    About "most scholars still maintaining that there is only a single Sinitic language throughout all time and space"

  7. Victor Mair said,

    October 18, 2015 @ 2:08 pm

    From Bob Bauer:

    In your LL post on Cantonese ni1 'here' you wrote:

    ni1 呢 is being used to transcribe the sound of a morpheme for which no Chinese character exists.

    While this is certainly true, one could say more: I would revise it as follows:

    The Chinese character 呢 is being used to transcribe the sound of the Cantonese morpheme which is the semantic equivalent of 這 zhe4 in standard Chinese but is etymologically unrelated to it or any other standard Chinese character. In order to transcribe the Cantonese morpheme, writers have had to borrow a character with a pronunciation that matches or is similar enough to the Cantonese morpheme, and so 呢 was chosen for this purpose, but its original meaning is ignored/suppressed when it is used in this way.

  8. Chas Belov said,

    October 18, 2015 @ 5:35 pm

    As for the linked Wikipedia article, I find it interesting that the presumably MSM column is labeled "京", which I am taking to literally mean "Beijing speech." Yet, my understanding from reading Language Log is that there are notable differences between actual colloquial Beijing speech and MSM, so the label would appear to be not quite precise. Am I interpreting this issue correctly?

  9. Eidolon said,

    October 19, 2015 @ 10:56 am

    @Chas Belov: There are certainly differences between actual colloquial Beijing speech and MSM. The article looks to be referring to MSM, which it calls 京話, i.e., "capital speech."

