Ask Language Log: Unnecessary disyllabism?


From Thorin Engeseth:

I was doing some reading this morning on the magpie, and the Wikipedia page states:

Similarly, in China, magpies are seen as an omen of good fortune. This is even reflected in the Chinese word for magpie, simplified Chinese: 喜鹊; traditional Chinese: 喜鵲; pinyin: xǐquè, in which the first character means "happiness".

I'm almost entirely illiterate when it comes to the languages of China, so I took to Google Translate just to see how it would translate the two characters from both simplified and traditional script. In both, the first is translated as "like; to be happy", while the second is "magpie". My question is: if the second character itself can be translated as "magpie", if Google Translate is correct here, then is the first character still necessary?

This is a very common phenomenon in vernacular / colloquial Sinitic languages, esp. Mandarin.  Most words in Mandarin are disyllabic, so a word will often be composed of two synonyms or near-synonyms or a core morpheme preceded by a modifier, etc., sometimes even antonyms.  This predilection for disyllabism is so strong that it governs rhetoric, rhythm, syntax, word formation, and other fundamental aspects of usage in Sinitic languages.

See, for example:

Perry Link, An Anatomy of Chinese:  Rhythm, Metaphor, Politics (Cambridge:  Harvard University Press, 2013).

Jerome L. Packard, The Morphology of Chinese: A Linguistic and Cognitive Approach (Cambridge:  Cambridge University Press, 2000).

As for how, why, and when this marked tendency toward disyllabism in Sinitic languages began, it is an extremely complicated problem, one which is far beyond the scope of this humble post.  In any attempt to arrive at a satisfactory answer, however, here are some factors that would need to be taken into account:

1. the diminution of the phonetic inventory

2. the morphosyllabic nature of the script

3. the impingement of polysyllabic Buddhist and other foreign languages

4. literary values, styles, and genres

Undoubtedly there were other operative factors, but these are fundamental.

So powerful and profound is the disyllabism of Sinitic languages that it even heavily affects the Sinoxenic vocabularies of neighboring states.  For example, in "The role of Vietnamese dissyllabism in exploring Vietnamese words of Chinese origin" (November 15, 2002; March 16, 2003), dchph recognizes that, though disyllabism is a relatively later development in both Sinitic and Vietnamese, once set in motion it was intimately intertwined in the development of the lexicons of both languages.

Approvingly, the author quotes Chou Fa-Kao, “Monosyllabism of Chinese Reconsidered”, Tsing-hua Journal of Chinese Studies, 14.1-2 (1982), 106 of 105-109:

Following Kennedy and de Francis [sic], Eugene Chin said: “If we admit that words, not morphemes, are the construction material of Chinese, we cannot but admit that Chinese is polysyllabic. If we may use the majority rule here, we will have no trouble establishing the fact that Chinese is dissyllabic [sic].”

[N.B.:  Although the html version of dchph's article cited above is readable, it lacks a full list of abbreviations and glossary, as well as bibliography and appendices.  All of these may be found in this pdf with the title "Introduction to Sinitic-Vietnamese Studies".  Unfortunately, typographically the latter version of the paper is a mess because of incomplete, unsuccessful font conversion.]

To return to Thorin's question, if we just said "que", it would not be clear which of more than 60 morphosyllables with that pronunciation we were referring to.  Even if we specified that it was fourth tone, "què", that still wouldn't help, because the overwhelming majority of those 60+ "que" are actually "què".  On the other hand, if we say "xǐquè", people will immediately understand that we mean "magpie", because there is only one other commonly used word with the two syllables "xi + que", and that is xīquē 稀缺 ("rare; scarce").  Even if somebody botched the tones badly, context would make clear whether or not the speaker was talking about a magpie.
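The disambiguation argument can be made concrete with a toy lookup.  What follows is only a minimal sketch: the lexicon is a tiny hand-picked sample (an assumption for illustration, nowhere near the 60+ "que" morphosyllables mentioned above), and the names LEXICON, strip_tones, and candidates are hypothetical.

```python
import unicodedata

# Tiny hand-picked sample, for illustration only. It is NOT the full
# inventory of "que" morphosyllables; glosses are rough.
# Each entry: (characters, toned pinyin, gloss)
LEXICON = [
    ("鹊", "què", "magpie"),
    ("却", "què", "but; yet"),
    ("确", "què", "true; firm"),
    ("雀", "què", "sparrow"),
    ("缺", "quē", "lack"),
    ("瘸", "qué", "lame"),
    ("喜鹊", "xǐquè", "magpie"),
    ("稀缺", "xīquē", "rare; scarce"),
]


def strip_tones(pinyin):
    """Remove tone diacritics by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def candidates(pinyin, ignore_tone=False):
    """Return (characters, gloss) pairs in LEXICON matching the pronunciation.

    With ignore_tone=True, tone marks are stripped from both sides first,
    which models a listener who only caught the segmental syllables.
    """
    def norm(s):
        return strip_tones(s) if ignore_tone else unicodedata.normalize("NFC", s)

    key = norm(pinyin)
    return [(chars, gloss) for chars, toned, gloss in LEXICON if norm(toned) == key]


if __name__ == "__main__":
    print(candidates("que", ignore_tone=True))    # bare syllable, tone unknown
    print(candidates("què"))                      # fourth tone specified
    print(candidates("xique", ignore_tone=True))  # disyllable, tones botched
    print(candidates("xǐquè"))                    # disyllable with tones
```

Even in this small sample, specifying the fourth tone still leaves several candidates for "què", whereas the disyllable narrows the field to two without tones and to one with them.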

[Thanks to Adam Albright]


  1. Bathrobe said,

    November 27, 2017 @ 3:38 pm

    Over a thousand years ago the Japanese government decided on a policy of writing Japanese place names in Chinese characters. The policy was 二字良字 (two characters, good characters). This policy at times ran roughshod over the actual native naming. For instance, Awa in Shikoku was written 阿波. Awaji island, on the way to Awa, was written 淡路 ('Awa road').

  2. Bruce Rusk said,

    November 27, 2017 @ 3:50 pm

    We might note the same ambiguity in English: one of the meanings of "pie" (now rare of course) is the bird, but we generally use the full form "magpie."

  3. WSM said,

    November 27, 2017 @ 4:12 pm

    I've noticed many dialects such as Min and Cantonese tend to use fewer compounds for written representations: these representations thus resemble more closely the classical language, which is generally (though not exclusively) oriented towards monograms. Could the existence of more tones in many of these dialects account for the apparently reduced need for disambiguation using compounds?

  4. A1987dM said,

    November 27, 2017 @ 4:38 pm

    Same reason why English speakers say "seagull", "windowsill" etc.?

  5. Ian said,

    November 27, 2017 @ 5:20 pm

    Are most Sinitic words also monomorphemic? I.e., are all these disyllabic words composed of one morpheme, and as such most morphemes are two syllables?

    Regarding the development of disyllabism, some people propose that homophone avoidance was one reason for it. I still read, however, that Chinese languages have a lot of homophones, but most of this appears to be non-scholarly, and then goes on to compare words with the same segmental structure but with different tones. Do Chinese languages actually have a lot of exactly homophonous, disyllabic words (words composed of two syllables with exactly the same segmental structure and tones) or is it a case of most people conflating Chinese *characters* (half a word, in most cases) with Chinese words?

  6. Joyce Melton said,

    November 28, 2017 @ 12:36 am

    Somehow this reminds me of a joke we used to tell after Vietnamese class. "Are you sleeping or just stupid?" we would ask. The answer was always, "I'm resting but not thinking." This is funny in Vietnamese to someone learning the language. The important words in each sentence differ only by a tone mark but both tones are pronounced the same in the southern dialect. No wonder they use a lot of bisyllabic construction, like attaching a classifier to most nouns.

  7. Jon said,

    November 28, 2017 @ 1:32 am

    Bruce Rusk:
    The mag in magpie is short for Margaret. There was a habit of giving human names for birds, such as Tom tit, Jenny wren, Jack daw, Robin redbreast.

  8. Chris Button said,

    November 28, 2017 @ 2:04 pm

    This reminds me of that famous Y.R. Chao quote regarding Classical (i.e. not Modern) Chinese: "the monosyllabic myth is one of the truest myths in Chinese mythology".

    The notion of where one syllable ends and another begins is a thorny problem that does not seem to afflict analyses of Chinese as much as English. For example, some people (quite correctly in my opinion) break a word like "paper" as pap.er while others would prefer pa.per. While the writing system certainly makes Chinese syllables explicit, I wonder if any argument could be made in terms of phonotactics? In Proto-Indo-European, the "vowels" /e/ and /o/ (or rather /ə/ and /a/) pattern as the sonorants /j/, /w/, /r/, /l/, /n/, /m/. In Old Chinese, the "vowels" /ə/ and /a/ pattern with /j/ and /w/ (and to a certain extent possibly /r/) due to their greater sonority than the other sonorants /l/, /n/, /m/, /ŋ/. This allows for "consonant clusters" in Proto-Indo-European that do not occur in Old Chinese. Perhaps this could add confusion in terms of syllabification? Along with misled (in my opinion) notions of unconditioned vowel-breaking, another area of contention in Old Chinese is to what extent affixes/pre-syllables etc can be reconstructed without them simply being applied as wild-cards to account for various hard to explain consonantal phenomena (I am certainly guilty of over-use here).

  9. Keith said,

    November 28, 2017 @ 2:38 pm

    This reminds me of something told to me by a then colleague, in around 1998. He was married to a Chinese woman, and was learning Mandarin, and talked about the word for "book", for example, as being "that instance of a book-like object which is a book"…

    As regards "magpie", French also has a word "pie" for that bird, but it is often named "pie voleuse" (i.e. "thieving magpie"), although there is not any other bird named "pie" that is considered more honest. I wonder if this is also a mechanism for disambiguation between homophones, though there are fewer of them. The only ones that come to mind are "pis", udder of a cow, ewe, etc. and "Pie", the pope's name Pius.

  10. WSM said,

    November 28, 2017 @ 4:55 pm

    @Chris Button the whole analogy with syllables in western languages such as English (for sake of argument) continues to be deeply problematic, though: a better analogy of Chinese representations (trying to avoid the word "morpheme") such as xi3que4 (喜鹊) would be to English compounds such as "papercut", which are quite obviously not atomic, in contrast to truly atomic (and meaningless) units of sound such as "pa" and "per". That something different, and perhaps more complicated, is going on in Chinese is suggested by the fact that "bisyllabic" "words" (each of these characterizations can be challenged!) such as tiao4wu3 (to dance/跳舞) can be broken up *in speech* into smaller units in phrases such as tiao4 qi3 wu3 lai2 le (跳起舞来了), suggesting that the so-called "syllables" tiao4 and wu3 carry semantic meaning, unlike their putative Western analogues.

  11. Michael Watts said,

    November 28, 2017 @ 5:36 pm

    Discussion of English analogues seems to have missed the American dialectal disambiguation of "ink-pins" (or "pens") from "stick-pins" (or "pins").

  12. Ellen K. said,

    November 28, 2017 @ 6:03 pm

    @Michael Watts

    Also PIN numbers. I once had the experience of someone saying what I thought was "I need my pen", and I was really confused till I realized what she needed was her PIN.

  13. Chas Belov said,

    November 29, 2017 @ 2:42 am

    Or trisyllabic, as in Cantonese hambahlaang, altogether.

  14. Adam F said,

    November 29, 2017 @ 2:46 am

    As others have noted, there are similar things in English ("seagull", "magpie"; ISTR the "cran" in "cranberry" means "cranberry") but they are not usually transparent to native speakers, except for the sort of people who read Language Log. ;-)

  15. Silas S. Brown said,

    November 29, 2017 @ 7:47 am

    @Chris Button: Yes, English syllabification can be done in more than one way, although there are established conventions in print, which comes up both in general typography (LaTeX etc) and in the typesetting of music (printing syllables to be sung with their corresponding musical notes). Things get more interesting when one or more of those syllables is an affix: a typical example in computational linguistics is “unionized”, which in politics is “union + ized” but in chemistry “un + (ion + ized)” (not to be confused with deionized). I suppose the Chinese translation of the Maya "Long Count" calendar (the one that rolled over in 2012), 长历法, is more likely to be thought of as 长 + 历法 (long + calendar system) than 长历 + 法 (long calendar + method), and I suppose the 3-character word for "epidermis", 表皮层, is more likely to be thought of as 表皮 + 层 than 表 + 皮层 (the latter wouldn't make sense if 皮层 refers to a cortex in the brain), but it's entirely possible people don't mentally break these up at all. (I never realised "jackdaw" contained the name Jack.) Then there are 3-character versions of things like “surround” (包围 + 围住 = 包围住) and “recognise / make out” (辨认 + 认出 = 辨认出) where it doesn't really make sense to break the 3 into 2 + 1 or 1 + 2 because the 1 at the end wouldn't easily have the relevant meaning. Incidentally I'm told London Heathrow Airport used to have a sign saying “Airport long stay car park courtesy vehicle pickup point” with no indication of how those words are grouped (at least they didn't write “air port”, as they presumably would have done when flying was a new thing), and it was reportedly common to see foreigners standing in front of this sign trying to work it out.

  16. Chris Button said,

    November 29, 2017 @ 12:15 pm

    @ WSM

    I think your point is valid insofar as the languages evolved differently, with very different stress-patterning and phonotactic constraints. However, if we stick with the English syllable "er" mentioned earlier, suppose we assign it a character like 者 with a semantic overlap with Chinese as an agent doing something – e.g. "dance者" (dancer / 舞者). In that sense, surely "者" is no more valid as an independent entity in Chinese than "er" is in English?

    From a purely synchronic perspective (i.e. regardless of the actual evolution of the words in question), we could also then use the same character for comparative forms like "tall 者" (taller), although alternatively a different homophonous character could be used for this particular function. We could then further extend it to words in which "er" is not even a suffix like "pap者" (paper) and compare this to examples in Chinese of characters not being used for any semantic function in binomial compounds (蝴蝶 being the famous example). Having said all that, I do realize it is starting to sound a lot like the evolution of hiragana out of kanji originally used as furigana in Japanese!

    @ Silas S. Brown

    I prefer the approach that associates as much as possible to the stressed syllable – hence /peɪp.r̩/ not /peɪ.pr̩/ for "paper". The latter incorrectly implies that the second /p/ would be aspirated as an onset and that the first syllable would not have a slightly shorter nucleus as a result of the voiceless obstruent coda. Such an approach is technically subject to phonotactic constraints, but even then I'm still drawn to aligning with the stressed syllable regardless.

  17. John Swindle said,

    November 29, 2017 @ 2:24 pm

    English-language radio broadcasts from Radio Pyongyang, as it was known then, used to refer regularly to "the Democratic Pea Pulls Republic of Korea." It has long seemed to me that this factoid would someday be relevant here. Perhaps not quite yet.

  18. Chris Button said,

    November 29, 2017 @ 3:36 pm

    @ John Swindle

    That's a great example. They seem to be using the "p" coda of the first syllable (which should therefore be unaspirated and clipping the "eo" nucleus slightly) as the onset of the second syllable (making it aspirated and not causing clipping in the first syllable) presumably because they are making no stress distinction by treating ˈpeop.le as ˈpeoˈple (I'm assuming the /ʊ/ vowel implied by "pull" as opposed to the /ʌ/, or more accurately /ɐ/, that in American English tends to conflate with unstressed schwa is more about the word it ends up sounding like in English than any phonetic rounding of the vowel)

  19. John Swindle said,

    November 29, 2017 @ 4:52 pm

    @ Chris Button

    Yes, unrounded, I think, thanks.

  20. WSM said,

    November 29, 2017 @ 5:28 pm

    @ Chris Button – yeah, wrt 蝴蝶, I'm not trying to argue that *all* such constructions are necessarily divisible compounds, but I do think the "syllables, not characters" mantra ignores a large number of constructions which are quite clearly reducible to character-level units of meaning.

    As for 者, I'd argue that it does carry meaning by itself (unlike Classical grammatical particles such as 也 or 哉), i.e. as a shorthand of "的人/东西", and therefore can't really be said to have no standalone meaning, even if it can't be *used* by itself (whether the character can be used by itself, doesn't seem to have anything to do with whether it has any meaning by itself). A vague analogy might be to "seemann" and "landsmann" etc in German (though admittedly "mann" can of course be used by itself! This still doesn't seem to be a controlling difference to me).

  21. Chris Button said,

    November 30, 2017 @ 12:48 pm

    @ WSM

    yeah, wrt 蝴蝶, I'm not trying to argue that *all* such constructions are necessarily divisible compounds, but I do think the "syllables, not characters" mantra ignores a large number of constructions which are quite clearly reducible to character-level units of meaning.

    I think that is possibly conditioned by the following:
    – Different levels of opacity conditioned by different sound changes. These changes themselves are often heavily conditioned by different levels of loanword influence since much of modern English does not come directly from Old English hence the original syllables behind the compounded syllables are no longer clear
    – Different levels of application of inflectional/derivational paradigms. This is connected to my point above regarding a growing use of these in Old Chinese reconstructions which is not always entirely warranted in my opinion.
    – Different levels of influence of orthography in terms of syllable perception.

    In short, a disyllabic word like "husband" does not seem particularly divisible unless one is told that it effectively represents the modern syllables "house" and "bond" in terms of its Old English form. On the other hand, "paper" is a loanword and then the question becomes one of how distant it was originally and what changes it went through in various languages before it reached English via French.

    As for 者, I'd argue that it does carry meaning by itself (unlike Classical grammatical particles such as 也 or 哉), i.e. as a shorthand of "的人/东西", and therefore can't really be said to have no standalone meaning, even if it can't be *used* by itself (whether the character can be used by itself, doesn't seem to have anything to do with whether it has any meaning by itself).

    While I personally think "er" could be argued in the relevant contexts to be as applicable as 者 in terms of having an inherent meaning in and of itself (especially if one starts looking at where this "er" originally came from in terms of its broader Indo-European origins), there are certain issues with syllabification in English that do still cause trouble. My approach to syllabification generally follows that of Wells' "Longman Pronunciation Dictionary". The only area I struggle with is when phonotactic constraints come into play. For example "action" is analysed as /æk.ʃn̩/ since /kʃ/ is not viewed as a viable coda combination, yet the last two syllables of "education" are analysed as /keɪʃ.n̩/ since /ʃ/ is a viable coda even if it does not ever actually occur after the nucleus /eɪ/ as an independent morpheme (although maybe a dialectal pronunciation of "cash" could suffice here). From a morphological perspective, it would be much better to have /ækʃ.n̩/ and /keɪʃ.n̩/ since that then allows "ion" (reduced to /n̩/ or /ən/) to be the "syllable" added to "act" and "educate". As I write this, I'm now wondering if these troubles largely stem from the external origins of such words and how much that can be applied to other such problematic cases? The irony of course is that if "action" had just kept the Middle English /s/ (since the /t/ had already palatalised in French before the borrowing) rather than palatalising separately in English to /ʃ/ (as was discussed on an earlier LL topic) then there would not be a phonotactic violation in modern English since /æks.n̩/ is perfectly viable with "axe" as the first syllable.

  22. WSM said,

    November 30, 2017 @ 5:05 pm

    @Chris Button all very interesting. One additional point perhaps worth considering is that Modern Standard Mandarin is still a very young language (just over 100 years old!), and as such remains influenced by the monogram-focused classical language to an extent that may lessen as it takes on more of a life of its own (as it has often been observed doing, in this blog). Some of the relatively straightforward divisibility of Mandarin compounds reminds me of how many English neologisms coined by writers like Donne in the 16th and 17th centuries were obvious combinations of roots borrowed from Latin and Greek, a trend English eventually grew out of. Whether MSM does the same is contingent on many factors, most certainly including matters of orthography which as you note most certainly do affect syllable perception.

  23. ajay said,

    December 1, 2017 @ 12:44 pm

    Same reason why English speakers say "seagull", "windowsill" etc.?

    Well, there are other kinds of sills: doorsills and mudsills, for example. And "gull" originally meant "shore bird", so it might have been useful to make it clear that this was the sort of gull that you saw well out to sea, as opposed to a similar-looking wader like an oystercatcher, which you wouldn't.

  24. Chris Button said,

    December 1, 2017 @ 9:06 pm


    I'm not so sure about that since Classical Chinese was not really a vernacular form of the language even at its outset. I suppose that makes its origins a little different from Classical Latin, although ultimately the continued use of Classical Latin as a written form regardless of all the spoken Romance languages that had spawned from it would probably be a good comparison to how Classical Chinese was used as a written form regardless of all the vernacular varieties of Chinese being spoken contemporaneously.

  25. WSM said,

    December 3, 2017 @ 10:52 am

    Do we know that "Classical Chinese" (say, the form recorded around the Warring States period) was never a vernacular? Kind of wonder what the point of a text like "Records of Warring States", a manual for speeches, would be in that case.

  26. Luke B said,

    December 4, 2017 @ 3:29 pm

    I came across a paper by Geoffrey Sampson on a closely related topic just last week:

    "Why does the history of Chinese phonological merger / phonemic loss contradict the principle of (excessive) homophony avoidance in sound change?", Sampson asks. The obvious answer is the move to disyllabicity, but S rejects this explanation. One argument involves the prevalence of synonym compounding:

    "One leading disyllable-creation process was a type of compounding which conjoins synonyms or near-synonyms. Li and Thompson (1987: 819) gave examples such as 疲乏 pífá “tired-tired = tired”, 防守 fángshŏu “defend-defend = defend”, 放棄 fàngqì “loosen-abandon = to give up”. From what they wrote one might suppose that this process occurred mainly with verbal meanings, but there are also many examples with other grammatical functions, e.g. 朋友 “friend”, 民族 “a race”, 墳墓 “a grave”, etc. etc. If synonym compounds of this type arose earlier than the phoneme mergers, that would imply that Chinese adopted a habit of saying the same thing twice even though saying it once would have been unambiguous. Is it realistic that any speech community would adopt such a pointlessly redundant habit of speech? I am not aware of any empirical evidence that the Chinese did so."

    The other half of this supposed paradox is that had widespread phonemic merger predated mass compounding, there would have been an implausible period during which people could not understand each other due to excessive homonymy in the lexicon.

    However, the phonemic functional load problem ranges over an entire lexicon, and there is no consideration by Sampson of incremental maintenance of 'homeostasis': one phoneme pair merges, one set of words moves towards disyllabicity, another phoneme pair merges, another set of words moves towards disyllabicity, etc. There would only be a paradox if the entire lexicon had shifted in one fell swoop from well-delineated to massive rampant homophony, which is implausible.

    Also, for someone so well-versed in an East Asian language, I am surprised by the following footnote elaborating on the synonym compound 'problem'.

    "Synonym compounds are of course only one type of Chinese compound, and it may well be that they seem disproportionately salient to Western linguists because European languages contain little or nothing that is analogous. But that very fact strengthens my point. I know of no language other than Chinese which uses compounding of synonyms as a word-formation technique, so there must presumably be some special reason why Chinese uses it. I cannot think of any alternative to the pressure of homophony as an explanation."

    You don't have to go beyond Vietnamese, which has a much larger syllabic palette than Mandarin, to find another language rich in synonym compounds—one key reason there, of course, being mass borrowing from Chinese.

  27. Chris Button said,

    December 7, 2017 @ 11:27 am

    My favorite one in French is aujourd'hui "today" which literally means "on the day of on this day" (jour means "day" and hui has the same origin as Spanish hoy "today")
