There is a widespread misconception that Chinese languages are monosyllabic. That is purely an artifact of the writing system, since the average Chinese word is about two syllables long. Typical examples: zhuōzi 桌子 ("table"), fēijī 飛機 ("airplane"), péngyǒu 朋友 ("friend"), qìchē 汽車 ("car"), huǒchē 火車 ("train"), fángzi 房子 ("house"), and so on. Even in Classical Chinese (or Literary Sinitic), there were many words longer than one syllable, e.g., húdié 蝴蝶 ("butterfly"), fènghuáng 鳳凰 ("phoenix"), shānhú 珊瑚 ("coral"), wēiyí 委蛇 / 逶迤 ("sinuous; winding; meandering"), jūnzǐ 君子 ("gentleman; superior man; person of noble character; sovereign; ruler; lord; m'lord"), and so on.
It will probably come as a shock to most readers of Language Log that not even all Chinese characters are monosyllabic.
When I first went to Beijing in 1981 to read Dunhuang manuscripts in the National Library, I saw the following words engraved on a horizontal wooden plaque hanging over the entrance to the rare book room where I sat every day to read scrolls and booklets:
Běijīng túshūguǎn shànběn shūshì
"Rare Book Room of the Beijing [National] Library".
I'm not absolutely certain of the last two characters, and the first two characters may have been preceded by something like guólì 國立 or guójiā 國家 (both would mean "national" in this context). But I remember very clearly these characters: Běijīng ?? = 北京圕.
Even though I had never seen the third character, 圕, I knew from the context that it must be equal to túshūguǎn 圖書館 ("library"). When I asked the librarians on duty how to pronounce the mystery character, they said matter-of-factly, "túshūguǎn".
Thus was the myth of the innate monosyllabism of the Chinese language, and even of Chinese writing, a myth with which students are indoctrinated worldwide, forever happily shattered for me.
圕 comes up in pinyin input if you type "tuan", at least on my computer. That reading seems to be some sort of abbreviation (beginning and ending) of "tushuguan".
圕 was a real character widely used among the Communists at Yan'an during the 1930s and 1940s, and also after the founding of the People's Republic of China, in the 1950s. It evidently remained in use as late as the 1980s, when I saw it in Beijing. This character is said to have been invented by a library science expert named Du Dingyou (杜定友) in 1914.
Once exposed to túshūguǎn 圕 ("library"), I kept my eyes open for other polysyllabic characters. They weren't hard to find. In fact, there were hundreds of such polysyllabic characters, and they still pop up from time to time simply because they are easier and faster to write than the groups of characters that they are intended to supplant. The authorities, however, in their ongoing quest to "standardize" Chinese language and writing, have attempted to outlaw such polysyllabic characters (with a few exceptions, one of which I shall mention below).
Whereas 圕 is trisyllabic, one character that was very popular among Communist writers is quadrisyllabic, namely the graph that stands for
shèhuì zhǔyì 社會主義 (simplified form: 社会主义) ("socialism").
As I noted above, although the government tries to stamp out such handy polysyllabic characters in the name of standardization, several of them continue in use, even by state corporations, and may be found in official dictionaries. One example of such a character that is still in wide circulation is the bisyllabic graph 瓩 (U+74E9). It is pronounced qiānwǎ and is equivalent to 千瓦 ("kilowatt").
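For readers who want to poke at these graphs on a computer, here is a minimal Python sketch. It covers only the two polysyllabic characters mentioned in this post that exist as single Unicode characters, 圕 and 瓩. The table name `POLYSYLLABIC` and the helper functions are hypothetical names of my own, and the syllable count simply assumes that each character of the expanded spelling stands for one syllable.

```python
# Hypothetical lookup table (not from any library): each polysyllabic graph
# maps to its multi-character spelling and the reading given in the post.
POLYSYLLABIC = {
    "圕": ("圖書館", "túshūguǎn"),  # "library"
    "瓩": ("千瓦", "qiānwǎ"),       # "kilowatt"
}


def code_point(char: str) -> str:
    """Format the code point of a single character as U+XXXX."""
    return f"U+{ord(char):04X}"


def expand(text: str) -> str:
    """Replace every known polysyllabic graph with its full spelling."""
    return "".join(POLYSYLLABIC.get(ch, (ch,))[0] for ch in text)


def syllable_count(char: str) -> int:
    """Count syllables, assuming one syllable per character of the expansion."""
    return len(POLYSYLLABIC[char][0])


if __name__ == "__main__":
    print(code_point("瓩"))      # U+74E9
    print(expand("北京圕"))      # 北京圖書館
    print(syllable_count("圕"))  # 3
```

The point of the sketch is only that one code point can stand for several syllables: `len("圕")` is 1, yet the reading the librarians gave has three syllables.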
Polysyllabic characters are by no means a phenomenon of the 20th century. Indeed, I found a number of them in Dunhuang manuscripts dating back well over a thousand years, including this one:
In Modern Standard Mandarin, this graph is pronounced púsà and is the equivalent of the two characters 菩薩 (simplified form: 菩萨) (an abbreviated transcription of the Sanskrit word "bodhisattva" or Pali "bodhisatta" — "enlightened being").
In fact, we can trace the existence of polysyllabic graphs back to the earliest stage of the Chinese script, namely, that of Shang oracle bone inscriptions (OBI), about 1200 BC.
Polysyllabic characters are very common in OBI, but they mostly occur in certain limited situations. Two-syllable names of ancestors are perhaps most often written with single (or combined) graphs, though they are also commonly written with two separate graphs. These include names like Shàng Jiǎ 上甲, Shì Guǐ 示癸, Mǔ Yǐ 母乙, etc. Trisyllabic names like Kāng Zǔ Dīng 康且(祖)丁 can also be written in the space of a single graph. Similarly, certain sacrificial terms, like xiǎo láo 小牢 ("lesser lao sacrifice [consisting of an ovicaprid and a pig]"), can be written with combined characters. Set phrases are also not infrequently written with polysyllabic characters, like shàngxià 上下 ("above and below"), xiàshàng 下上 ("below and above"), dàjí 大吉 ("greatly auspicious"), shòu yòu 受又(祐) ("receive blessings"), etc. The last of these (at least) pretty clearly isn't a single word, but it can still be written with a combined graph.
Numbers are also very commonly written in combined forms, whether just multisyllabic numerals written together, like liù bǎi 六百 ("six hundred") or shísān 十三 ("thirteen"), or numbers written together with the objects they quantify, like qīshí rén 七十人 ("seventy people").
In modern varieties of Chinese, there is considerable phonological as well as distributional and semantic evidence for polysyllabic words and fixed phrases. Judging from the above evidence, it would seem that the concept of polysyllabic words and phrases is also firmly embedded in the history of Sinitic writing systems, almost as deeply as the partly-contrary notion that morpheme = syllable = character.
[Thanks are due to Maddie Wilcox for asking me about the phenomenon of polysyllabic graphs and to Matt Anderson for the Shang data; Tom Bishop created the special characters with the Wenlin CDL system and Richard Cook checked Unicode numbers for obscure characters.]