The look, feel, and sound of Dungan language


Here are a couple of YouTube videos by way of example:

Title page item in Modern Standard Mandarin (MSM):

Zěnme láile, jiù nàme zǒu diào ne.


"However it came, that's the way it goes."


"However you got it, that's the way you lose it."

Title page item in Modern Standard Mandarin (MSM):

Hàipà láng le, jiù bù yǎng yáng le.


"If you are afraid of wolves, then don't raise sheep."

The two parts are called respectively:

Лохуэйхуэйди кугәр 东干(回族 Dungan Proverbs Дунганские Пословицы 3


Лохуэйхуэйди кугәр 东干(回族 Dungan Proverbs Дунганские Пословицы 1 Часть

Neil Kubler comments:

It's interesting to note that, as in other types of Central Plains Mandarin, Beijing [zh- ch- sh-] before the vowels [-u] or [-o] become [pf- pfʰ- f-], so that, for example, 水 shuǐ “water” is pronounced [fei].

Dungan is considered to be a Mandarin topolect, but after its speakers were separated for a century and a half from their original home in northwest China (Shaanxi, Ningxia, Gansu, Qinghai) by an epic, long-distance migration to the Central Asian areas of what are now Kazakhstan and Kyrgyzstan, their language has diverged considerably from that of their congeners who stayed behind. The distance between Dungan and the language of its origin was further increased by the absorption of vocabulary from Arabic, Persian, Turkic, and Russian, and by the Dungans having their language written successively in the Arabic, Latin, and Cyrillic alphabets for lengthy periods of time up to the present (they were illiterate in Chinese characters).

Dungan is unique in that it is one of the few varieties of Chinese that is not normally written using Chinese characters. Originally, the Dungan, who were Muslim descendants of the Hui, wrote their language in an Arabic-based alphabet known as Xiao'erjing. The Soviet Union banned all Arabic scripts in the late 1920s, which led to a Latin orthography based on Janalif. The Latin orthography lasted until 1940, when the Soviet government promulgated the current Cyrillic-based system. Xiao'erjing is now virtually extinct in Dungan society, but it remains in limited use by some Hui communities in China.

The writing system is based on the standard 3-tone dialect. Tone marks or numbering do not appear in general-purpose writing, but are specified in dictionaries, even for loanwords. The tones are specified using the soft sign, hard sign, or nothing.


Wikipedia articles:

Dungan people

Dungan language

The Berkeley linguist, William S-Y. Wang, after reading my article on "Implications of the Soviet Dungan Script for Chinese Language Reform" (1990; see "Selected readings" below), became very interested in studying the Dungan language to see how writing a Sinitic language with an alphabet might affect its development. He even went to Dungan communities in Central Asia with one of his graduate students, who he hoped would write a dissertation on the subject. Much to his surprise and disappointment, he and his student could understand very little of the Dungans' speech and writing, so, because of this limited intelligibility, he abandoned the project. I myself remain deeply interested in Dungan as living proof that a Sinitic language can be written with an alphabetic script.


Selected readings



  1. Philip Taylor said,

    October 15, 2020 @ 11:07 am

    A query about the phraseology of the title, if I may ? If this were about (for example) Tibetan rather than Dungan, I would expect the title to read "The look, feel, and sound of the Tibetan language", and the same for any other language of which I can think (although some might require a plural). Is there something about Dungan that requires the definite article to be omitted, or is there some special significance that we (your readers) are intended to infer from the lack of an article ?

  2. Victor Mair said,

    October 15, 2020 @ 11:25 am

    It has several dialects.

    I wanted to refer both to the written form and the spoken form.

  3. Ben said,

    October 15, 2020 @ 11:28 am

    Cyrillic actually seems like a fantastic alphabet to choose for Mandarin. It already encodes a wider variety of coronal affricates, so there is no need to use digraphs or arbitrary associations like Pinyin (z, c, zh, ch). And it uses a larger vowel inventory (again avoiding Pinyin's arbitrary vowel combos), including inherent [j] glides that you can use for the palatal sounds (Pinyin xi, ji, qi). Finally, using the hard and soft signs for tones is brilliant.

  4. gds555 said,

    October 15, 2020 @ 5:31 pm

    Given that Dungan speakers’ decision to form a linguistic exclave eventually had the salutary effect of their language being written alphabetically, speakers of the major Sinitic languages would be well justified in saying “You’ve a better script than we have, Dunganned in”.

  5. Michael Watts said,

    October 15, 2020 @ 11:05 pm

    And it uses a larger vowel inventory (again avoiding Pinyin's arbitrary vowel combos)

    What arbitrary vowel combos? The combinations aren't arbitrary at all, though in a couple of cases a vowel is omitted for no good reason.

    The weird vowels are all indicated by a single letter.

  6. Philip Taylor said,

    October 16, 2020 @ 1:45 am

    I think it is possible, Michael, that by "arbitrary vowel comb[inations]", Ben may have been referring to the overloading of vowel symbols such as Pinyin "i".

    But I would respectfully disagree with Ben regarding the use of the Cyrillic hard and soft signs for tones; IMVHO, the Pinyin system conveys the required tone absolutely brilliantly, at least to those whose minds form an immediate mapping from diacritic shape to tone contour.

  7. Bob Ladd said,

    October 16, 2020 @ 2:58 am

    @Philip Taylor:
    But surely pinyin i is not overloaded if you consider a phonemic analysis of the Chinese vowel system (admittedly not straightforward). Arguably the sounds in xi, shi, and si are all just allophones of the same phoneme. From the point of view of a speaker of, say, Hindi or Thai, you could equally well say that p and t are "overloaded" in English because they spell both aspirated and unaspirated stops.

    As for using letters (in this case the hard sign and soft sign) for tones, I'm with Ben. For one thing, writers are far less likely to omit them, which pinyin users – annoyingly – do with the tone diacritics all the time. There's also limited evidence (from the days of Y R Chao's national romanisation) that spelling tones into the letter string rather than writing them with diacritics helps second language learners with non-tonal native languages acquire new lexical items complete with the tone more accurately.

  8. John Swindle said,

    October 16, 2020 @ 3:14 am

    Although, as we see in the examples, it seems that Dungan mostly omits the tone marks. But the system shares with Gwoyeu Romatzuh the advantage that tone marking is easy to type.

  9. Philip Taylor said,

    October 16, 2020 @ 4:33 am

    Bob, I do not dispute that the "i" sounds in xi, shi, and si are allophones (i.e., are in complementary distribution), as are (for example) the "u" sounds of bu and qu, but from the perspective of a learner they are a considerable source of confusion. "One symbol, one sound" would really make life simpler for learners, IMHO.

  10. Philip Taylor said,

    October 16, 2020 @ 5:06 am

    John, the French have no trouble typing diacritics, the Czech, Germans, etc., likewise, because they have keyboards designed for their particular languages. Whether there is a hardware Hanyu Pinyin keyboard I do not know, but I can certainly re-configure my standard 101-key IBM keyboard to allow easy entry of Hanyu Pinyin with tone marks using (for example) the PinyinTones system — Ni3 ka4n ka4n, wo3 ke3yi3 za4i zhe4ngmi2ng ! -> Nǐ kàn kàn, wǒ kěyǐ zài zhèlǐ zhèngmíng !.
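    The digit-after-vowel convention in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual PinyinTones implementation: the function name `mark_tones` is hypothetical, and the sketch assumes that each tone digit (1 to 4) is typed immediately after the vowel that should carry the mark.

```python
import re
import unicodedata

# Combining diacritics for the four Mandarin tones (the neutral tone is unmarked).
TONE_MARKS = {
    "1": "\u0304",  # macron, first tone
    "2": "\u0301",  # acute, second tone
    "3": "\u030C",  # caron, third tone
    "4": "\u0300",  # grave, fourth tone
}


def mark_tones(text: str) -> str:
    """Replace each vowel + tone digit with the tone-marked vowel.

    Hypothetical helper: assumes the digit directly follows the vowel
    that carries the tone mark, as in "ka4n" or "za4i".
    """
    def repl(match: re.Match) -> str:
        vowel, digit = match.group(1), match.group(2)
        # Compose vowel + combining mark into a single precomposed character.
        return unicodedata.normalize("NFC", vowel + TONE_MARKS[digit])

    return re.sub(r"([aeiouüAEIOUÜ])([1-4])", repl, text)


print(mark_tones("Ni3 ka4n ka4n, wo3 ke3yi3"))
```

    Because the digit is consumed together with its vowel, letters typed after the digit (the "n" of "ka4n") pass through unchanged.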

  11. Michael Watts said,

    October 16, 2020 @ 8:56 am

    I do not dispute that the "i" sounds in xi, shi, and si are allophones (i.e., are in complementary distribution), as are (for example) the "u" sounds of bu and qu

    I don't think the definition provided here works. It shows just as clearly that the vowel /y/ in qu is an allophone of the vowel in shi.

    It definitely doesn't work for bu and qu, since there are many minimal pairs contrasting those vowels, such as 路 / 律.

    I believe that the traditional Chinese analysis says that the syllables qi and chi both use a zero vowel. This runs into problems when ni and li rhyme with qi, but are not considered to use a zero vowel.

  12. Bob Ladd said,

    October 16, 2020 @ 10:38 am

    Michael Watts is right that a phoneme definition based on complementary distribution doesn't work very well for the case at issue here, but it's because there is really multiple complementary distribution. Pinyin sh, ch, zh can never be followed by IPA [i], and pinyin x, q, j can never be followed by IPA [ɤ] (or [ɵ] or however you want to transcribe it). So we could do what Philip Taylor suggests and use different symbols for the vowel, but as soon as we do that, then the pairs of consonants can be considered to be in complementary distribution, meaning that x and sh, q and ch, and j and zh are just pairs of allophones of the same phoneme. This is the analysis on which Wade-Giles is partly based: chi vs chih and ch'i vs ch'ih. (But W-G is not consistent: for pinyin xi and shi they use different transcriptions for both the consonant and the vowel, namely hsi and shih.)

    The problem is that, as in many languages in that part of the world, some of the phonetic features seem to be features of SYLLABLES (what the Firthians called "prosodies") rather than segments. This actually applies to Michael Watts's example of 路 / 律 as well: with the vowel /u/ we have a notably dark (velarised) initial /l/ whereas with /ü/ we have a very clear (palatalised) /l/. Which is conditioning which? (Here pinyin is not consistent: with xu and shu the vowels are treated as "allophones" and the consonant is the conditioning factor, whereas with lu and lü the dark /l/ and clear /l/ are treated as allophones and the vowel is the conditioning factor).

    It's probably no accident that it was Y R Chao, a polyglot polymath Chinese/American linguist of the first half of the 20th century, who wrote a classic early paper on phoneme theory entitled "The non-uniqueness of phonemic solutions of phonetic systems".

  13. Michael Watts said,

    October 16, 2020 @ 11:37 am

    (Here pinyin is not consistent: with xu and shu the vowels are treated as "allophones" and the consonant is the conditioning factor, whereas with lu and lü the dark /l/ and clear /l/ are treated as allophones and the vowel is the conditioning factor)

    I don't agree. Pinyin considers xu and shu to use fully distinct vowels. You can see this by looking at any syllable chart; xu/qu/ju will all appear in the same row/column with yu/nü/lü, categorized by their vowel, while shu/chu/zhu will all be grouped with wu/bu/lu/etc.

    The different vowels in xu and shu are represented the same way, but so are the different vowels of ni and shi, which are certainly not considered allophones.

    (The different vowels of yan and an are also represented the same way, but in that case I suspect it might be because they're considered to be allophones?)

  14. Michael Watts said,

    October 16, 2020 @ 11:41 am

    (I do agree with the larger point that the phonetic features seem to be features of syllables and that there isn't a unique solution to describing them in terms of phonemes. But I think it's clear that the perspective taken by pinyin is that xu and shu differ in both their initial consonant and their vowel.)

  15. Andreas said,

    October 16, 2020 @ 12:07 pm

    From the perspective of getting roughly correct pronunciations from outsiders, it wouldn't have been a bad idea to merge ch, zh, sh and q, j, x as ch, j, sh, and represent the vowel of chi, zhi, shi with something else. You'd also have to distinguish more consistently between u and ü.

    (This is not to say, of course, that that ought to have been a high priority when designing pinyin. There are obviously other desiderata that may reasonably be judged more important.)

  16. Luke said,

    October 16, 2020 @ 3:01 pm

    In the Yale Romanization of Mandarin, ch, zh, sh, and q, j, x are represented by ch, j, and sh/sy (take a guess at which one is /ʂ/ and which is /ɕ/), with /ɨ/ represented by -r.

    I'd say Yale is more practical for native English speakers but isn't nearly as efficient for romanizing Mandarin as Pinyin (in terms of average amount of letters per syllable).

  17. David C. said,

    October 16, 2020 @ 9:07 pm

    Perhaps to point this out more explicitly, it was not an oversight of the Hanyu Pinyin system to write ju, qu, xu as they are, even though they do not share the same vowel as zhu, chu, shu etc.

    From note 4 of the section on vowels in Hanyu Pinyin (汉语拼音方案):

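    ü行的韵母跟声母 j,q,x 拼的时候,写成 ju(居),qu(区),xu(虚),ü上两点也省略;但是跟声母 n,l 拼的时候,仍然写成 nü(女),lü(吕)。

    [When finals of the ü row are combined with the initials j, q, x, they are written ju (居), qu (区), xu (虚), and the two dots over the ü are omitted; but when combined with the initials n, l, they are still written nü (女), lü (吕).]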

    The diaeresis is omitted for simplicity given that the "u" (乌) vowel never follows j, q, x, as everyone has already mentioned, so there is no inconsistency here.
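    The spelling rule in note 4 is mechanical enough to sketch in Python. This is a minimal illustration, assuming a syllable is supplied as a separate initial and final with the final written in its underlying form (with ü); the function name `spell_syllable` is hypothetical and covers only the j/q/x versus n/l cases named in the note.

```python
# Initials after which plain "u" never occurs, so the dots on ü can be dropped.
DOTS_OMITTED_AFTER = {"j", "q", "x"}


def spell_syllable(initial: str, final: str) -> str:
    """Join an initial and a final, applying the ü-dot omission of note 4.

    Hypothetical helper: `final` is given in its underlying form, e.g. "ü", "üe".
    """
    if final.startswith("ü") and initial in DOTS_OMITTED_AFTER:
        # After j, q, x the two dots are omitted: jü -> ju, xüe -> xue.
        final = "u" + final[1:]
    # After n and l the dots stay, because nu/lu contrast with nü/lü.
    return initial + final


for initial, final in [("j", "ü"), ("q", "ü"), ("x", "ü"), ("n", "ü"), ("l", "ü"), ("l", "u")]:
    print(initial, final, "->", spell_syllable(initial, final))
```

    The dots are kept only where dropping them would collapse a real contrast, which is the point being made about lu and lü.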

  18. Michael Watts said,

    October 16, 2020 @ 11:39 pm

    I think it bears some mention that modern Chinese are likely to spell the syllable "lv" instead of "lü", presumably because that is how you type those characters in a pinyin input system.

  19. Philip Taylor said,

    October 17, 2020 @ 1:42 am

    David, I fully accept that the overloading of the symbol "u" in Hanyu Pinyin is intentional, but even a full knowledge of that fact does not automatically help the beginner when he or she is seeking to pronounce a word with which he or she was previously unfamiliar. As one's experience grows (and I speak as someone who started studying Mandarin Chinese quite late in life), application of the rule does start to become second nature, but initially it is a source of confusion, just as is the overloading of the symbols "a", "i", etc.

  20. maidhc said,

    October 17, 2020 @ 3:19 am

    gds555: Thank you!

  21. Terpomo said,

    October 17, 2020 @ 2:11 pm

    The Palladius Cyrillization also exists, for standard Mandarin; however, it has no provision for tones (other than just superscript numbers, though using pinyin tone diacritics wouldn't be too much of a stretch) and uses -нь for final -n whereas just -н is -ng!
    I don't know if I want to applaud or smack you upside the head, or both.
    @Michael Watts, Philip Taylor
    What about the -ian final actually being pronounced -ien? That's a good example of arbitrariness? For that matter, the falling element in the "ao" diphthong is /ʊ/ so there's no real good reason to spell it as ⟨ao⟩ rather than ⟨au⟩.
    @Bob Ladd, Philip Taylor
    Wiktionary's IPA and the Zhuyin spelling both agree that si/zi/ci/shi/zhi/chi/ri are in fact pronounced with syllabic consonants. And [ɕ] as realization of /s/ before /i/ is an areal feature, it's also present in Japanese, Korean, and I believe Mongolian. Between that, and the fact that historical szc (Middle Chinese 精清從心邪 initials) palatalize to xjq before /i/ I'm more inclined to analyze xjq as allophones of szc rather than sh/zh/ch.

  22. Terpomo said,

    October 17, 2020 @ 3:29 pm

    Oh, and another point: If the alveolo-palatals were allophones of the retroflexes, you'd expect one corresponding to R.

  23. Philip Taylor said,

    October 18, 2020 @ 4:10 am

    Terpomo — "What about the -ian final actually being pronounced -ien ?" — yes, that was exactly the example I had in mind when I wrote "the overloading of the symbols 'a', …".

  24. Bob Ladd said,

    October 18, 2020 @ 5:59 am

    Terpomo – Good point about the retroflexes, but it's also just another example of how there is a LOT of multiple complementary distribution in the system, and it shows that deciding what is an allophone of what is not straightforward at all. Even pinyin and zhuyin disagree in places about what conditions what (e.g. the final written -ong in pinyin (as in 中) is identified with the sound in yong (e.g. 用) in pinyin but with the sound in weng (e.g. 翁) in zhuyin).
    On the other hand, in many cases pinyin and zhuyin agree on treating some phonetic differences as allophonic, for example the -ian vs -an differences that Philip Taylor treats as "overloading" the symbol /a/. In a system designed by native speakers for native speakers (unlike, say, the Yale romanisation, which as Luke says is good for English speakers), it's natural to expect allophonic differences to be ignored, and this seems to be what's going on with -ian and -an. Here the "overloading" is purely in the eyes of the non-native speaker, as in my original example of a Hindi or Thai speaker suggesting that English spelling is confusing because it uses the letter P for "two different sounds" in pill and spill.

  25. ~flow said,

    October 18, 2020 @ 8:08 am

    Mandarin lacks almost any kind of the morphophonemic processes that languages like English and German are so rich in. It is because of these processes that, from paradigms like Fach–Fächer [fax]–[fεçɐ] in conjunction with the observed complementary distribution of [x] and [ç] (which co-occur only with 'dark' and 'light' vowels, respectively), we may conclude that [x] and [ç] are allophones of a single phoneme; write it /x/. Moreover, as has been pointed out above, we can be certain that it is the change in the vowel that causes the change in the consonant and not vice versa. All these details go into the classical idea of the phoneme.

    In Mandarin there are no words that have a regular change from, say, [i] (as in 西) in one form to [ɿ] (as in 四) or [ʅ] (as in 是) in another form of their morphophonemic paradigms, because there are no morphophonemic paradigms to speak of (except for Erhua 兒化, as in e.g. 畫畫兒 'paint a painting'). Therefore, subsuming [i, ɿ, ʅ] into a single phoneme /i/ is only supported by complementary distribution. And although Pinyin chooses to identify the unrounded high front vowel [i] with the two apical vowels, there are other solutions possible; for instance, one can argue that there is a single phoneme underlying the initial of re 熱, the coda of er 兒, and the vowels of si 四 and shi 是. This could lead to an orthography that writes 西四是熱兒 as xi, sr, shr, re, er, which is certainly not less clear or economical than what Pinyin does (and bears some similarity to the Yale transcription).

    Likewise, although Pinyin x, q, j is in complementary distribution to both s, c, z and sh, ch, zh, analyzing these six sounds into anything less than six phonemes is technically possible, but let's keep in mind that x, q, j is *also* in complementary distribution to h, k, g. All we can say is that in front of high front vowels [i, y] the consonants written as s–sh–h, c–ch–k, z–zh–g are neutralized to x–q–j, so that it is undecidable whether we should write 西其及 as ?(si, ci, zi), as ?(shi, chi, zhi), or as ?(hi, ki, gi) to get an economical orthography. Phonologically all these solutions are on a more or less equal footing, although relationships to earlier forms of the language and/or to existing close variants of Mandarin may make some solutions appear as more desirable than others. Note that elements of each solution have been historically used by different Mandarin transcriptions, e.g. Wade-Giles writes 西 as hsi, and the postal transcription has Nanking, Peking (albeit for other than phonological reasons).

  26. Victor Mair said,

    October 18, 2020 @ 1:23 pm

    Under the title "The First Linguistic Dictionary of Dungan", International Journal of Eurasian Linguistics 1 (2019), 353-356, Juha Janhunen has a review of Olli Salmi, Dungan-English Dictionary. Хуэйзў-Англия хуадян. Manchester: Eastbridge Books, 2018. xii, 406 pp. ISBN 978-1-78869-154-3. I have a pdf of the review.

    At the end of the review, Janhunen lists the following three references:

    Cunvazo, Jusup [Юсуп Цунвазы]. 1984. Хуэйзў йүянди лэйүан хуадян [An etymological dictionary of the Dungan language]. Frunze: “Ilim” chŭbanshè.

    Imazov, M. X. [M. X. Имазов] 1977. Орфография дунганского языка [The orthography of the Dungan language]. Frunze: Izdatel’stvo “Ilim”.

    Salmi, Olli. 1979. An outline of the historical phonology of the dialect of Chengguan, in Linxian, China. Studia Orientalia 51/6: 1–18.

  27. David Marjanović said,

    October 18, 2020 @ 1:40 pm

    What about the -ian final actually being pronounced -ien? That's a good example of arbitrariness?

    It's not "pronounced -ien" in the sense of rhyming with -en. The a in yan and -ian is pronounced [æ].

    the falling element in the "ao" diphthong is /ʊ/

    (I assume you mean the actual sound [ʊ] and aren't postulating a phoneme for it.)

    I'm not sure I've actually ever heard it that way. Often it's [ɒ]. Some (tens or hundreds of millions of) people even smooth the whole diphthong to [ɑ], giving them a phonemic contrast between a central (that's a) and a back unrounded open vowel. The same people turn ai into [æ].

    And [ɕ] as realization of /s/ before /i/ is an areal feature, it's also present in Japanese, Korean, and I believe Mongolian.

    Yes, and all of Tungusic. It's a fairly straightforward consequence of the lack of a /s/-/ʃ/ contrast (which some of the languages in question now have, but in restricted ways).

    But the Standard Mandarin x isn't [ɕ]. It's the dorso-palatal sibilant, something the IPA doesn't have a symbol for because there don't seem to be any languages where this rare sound contrasts with [ɕ] or [ç] or [sʲ]. I think the use of [ɕ] is restricted to southern accents that don't retroflex and thus have fewer sibilants to distinguish.

  28. Victor Mair said,

    October 21, 2020 @ 4:11 pm

    For an excellent blog post on the Dungans, including interesting information about Svetlana Rimsky-Korsakoff Dyer, great-granddaughter of the composer and one of the leading scholars on the Dungans, see "Chinese-Russian Muslims: the Dungan people", posted on 13/04/2020 by
