Xiongnu (Hunnic) Shanyu

One of the most hotly debated questions in early Chinese studies is the origin and pronunciation of the title of the ruler of the Xiongnu (Huns), which is written with these two Sinographs, 單于. The current scholarly consensus is that the Modern Standard Mandarin (MSM) pronunciation should be chányú. Similarly, though it is much contested, the prevailing reading of the name of the son of the first Xiongnu ruler, Tóumàn, is Mòdú (r. 209-174 BC):

Modun, Maodun, Modu (simplified Chinese: 冒顿单于; traditional Chinese: 冒頓單于; pinyin: Mòdùn Chányú ~ Màodùn Chányú, c. 234 – c. 174 BCE), also known as Mete khan across a number of Turkic languages, was the son of Touman and the founder of the empire of the Xiongnu. He came to power by ordering his men to kill his father in 209 BCE.


The following is a guest post by Penglin Wang, which takes a different approach and offers a novel source for the Hunnic title. The state he refers to is Shanshan, better known as Loulan, which would make its language Indo-European (Tocharian or Gandhari Prakrit), for which see here.

For caṃkura as a Gandhari Prakrit title, see A Dictionary of Gāndhārī here.


Xiongnu Shanyu (單于) and Shanshan Caṃkura (An Official Title)

Historians have pointed to Xiongnu as a regional power interacting with and exploiting statelets in the Western Regions, especially in the Tarim Basin, which featured urban centers known for their cultural brilliance. The title shanyu dates from the late third century BCE; it was used for the Xiongnu leader Touman and his son Maodun, who committed patricide to usurp the shanyu throne, and was adopted by their successors. Since shanyu and caṃkura (čaṃkura) functioned as official titles among the Xiongnu and the Shanshan rulers respectively, any etymological connection between them depends on their phonetic comparability. Hanshu (94.3828) decisively resolved the pronunciation of the Xiongnu title as equivalent to shanyu (善于 ["good at"]). Leading phonologists have converged in reconstructing *‘źi̯än for shan and *gi̯u for yu (Karlgren 1923:251 #854, 371 #1317), or similar forms for each. More significant for the latter syllable is the use of the character 于 in Yutian (于窴) in Shiji (123.3169) and Yutian (于闐) in Hanshu (96.3881) as Chinese transcriptions of the name Khotan, with both forms containing the same yu used in shanyu. As its name spread early through the Tarim Basin into Inner Asia and beyond, Khotan became known to people with different linguistic backgrounds. In profiling Khotan, Datang Xiyuji, completed in 646, offered several of its alternative names in different languages: Qusadanna (瞿薩旦那), meaning ‘the milk of earth’ as translated into Chinese; Huanna (涣那) in the local language; Yudun (于遁) in Xiongnu; and Qudan (屈丹) in Indic. These names for Khotan were probably first used considerably earlier.
The Xiongnu name Yudun for Khotan recurs in Xin Tangshu (221.6235), compiled in the mid-eleventh century. Its survival in the Chinese sources matters for the study of the title shanyu because Yudun shares the same character 于. It can thus serve as strong evidence that the yu of both shanyu and Yudun recurrently corresponds to the ku of Shanshan (鄯善 or Kroraina) caṃkura and the kho of Shanshan Khotaṃna, as written in Kharoṣṭhī script in the third century. A third instance of 于 is found in the Xiongnu title huyu (護于, Hanshu 94.3827), which is in turn connected with the Shanshan official title ogu. Based on these correspondences I propose that the pronunciation of shanyu be adjusted to ǰanğu, with the ra of caṃkura unrepresented, and that of huyu to ɣoğu.


References and Sources

Burrow, T. 1937. The Language of the Kharoṣṭhī Documents from Chinese Turkestan. Cambridge: Cambridge University Press.

Datang xiyuji (The great Tang records on the Western Regions). https://zh.wikisource.org/wiki/大唐西域記/.

Hanshu (漢書 Book of Han) compiled by Ban Gu in 82. http://hanchi.ihp.sinica.edu.tw/ihp/hanji.htm.

Karlgren, B. 1923. Analytic dictionary of Chinese and Sino-Japanese. Paris: Librairie Orientaliste Paul Geuthner.

Shiji (史記 Records of the Grand Historian) written by Sima Qian in 91 BC. http://hanchi.ihp.sinica.edu.tw/ihp/hanji.htm.

Xin Tangshu (新唐書 New Book of Tang) compiled under Ouyang Xiu in 1060. http://hanchi.ihp.sinica.edu.tw/ihp/hanji.htm.


Selected readings


  1. Peter B. Golden said,

    July 16, 2021 @ 12:59 pm

    This is a minefield, but an interesting one. Some random notes: More recent reconstructions of 單于 should have been noted, cf. Baxter and Sagart, Old Chinese. A New Reconstruction (Oxford: OUP, 2014): 260,331: MC dzyen.hju < W. Han dar-ɦwa (< OC [d]ar + ɢw(r)a and 單于 tan hju, also Schuessler, Minimal Old Chinese and Later Han Chinese (Honolulu: University of Hawai’i Press, 2009): 255 [24-21az, a], 50 [1-23a]: Old Chin. *dan wa, Late Han dźan wɑ, Middle Chin. źjän ju or OC *tân wa, LHan tɑn wɑ MC tân ju. Beckwith, Empires of the Silk Road (Princeton: PUP, 2009):387, n.7 suggests that the OC reconstruction rendered “something like” *darġa and then *danġa which he compares with the later Činggisid-era Mongol title daruġači in Yuan China (daruġa elsewhere). Beckwith suggests that the title may not be original with the Xiongnu. Mong. daruġa, it should be noted, is from daru- “to press, press down” and is rendered in Turkic as basqaq (title well known in the Ulus of Jochi) *bağatur) of the Chinese accounts, Christopher I. Beckwith and Gisaburo N. Kiyose† “Apocope of Late Old Chinese Short ă: Early Central Asian Loanword and Old Japanese Evidence for Old Chinese Disyllabic Morphemes.” Acta Orientalia Academiae Scientiarum Hungaricae 71.2 (2018): 154: Mòdùn 冒顿 MC. *mǝk twǝn3 Hvatäna, Hvataṃ Hvaṃna, Hvana, Hvani and Hvaṃ-kṣīra, Soġd. xwδnyk (H. Bailey, The Culture of the Sakas in Ancient Iranian Khotan (Delmar, NY: Caravan Books, 1982): 2-3). In Turkic Udun (Kāšġarī, Dīwān Luġāt al-Turk, ed. trans. R. Dankoff, 1982-1985, I: 50 and the 11th century Uyğur translation by Šiŋqo Šeli Tutuŋ of the biography of Xuanzang, V-6321 et passim; Tuguševa, L. Ju. (ed, trans.), 1991. Ujgurskaja versija biografii Sjuan’-czana (fragmenty iz leningradskogo rukopisnogo sobranija Instituta vostokovedenija AN SSSR (Moskva: Nauka): 77, et passim, 293-294.

  2. Chris Button said,

    July 16, 2021 @ 4:26 pm

    cf. Baxter and Sagart, Old Chinese. A New Reconstruction (Oxford: OUP, 2014): 260,331: MC dzyen.hju < W. Han dar-ɦwa (< OC [d]ar + ɢw(r)a and 單于 tan hju

    Beckwith, Empires of the Silk Road (Princeton: PUP, 2009):387, n.7 suggests that the OC reconstruction rendered “something like” *darġa and then *danġa which he compares with the later Činggisid-era Mongol title daruġači in Yuan China (daruġa elsewhere).

    I'd agree with Beckwith that there was no original labial segment in 于. Where Baxter & Sagart have /ɢw/, I would simply go with /ʁ/ and treat the rounding as a later development associated with the uvular articulation. I don't see how Beckwith's "ġ" /ɣ/ can account for the later rounding, but at least his /ɣ/ and my /ʁ/ are close.

    Separately a straight shift of coda /r/ to /n/ is unconvincing for me (those alternations in OC phonetic series come from /r/ merging with /l/ and there being an occasional /l/ ~ /n/ confusion in OC). In this case, we might simply be dealing with the uvular articulation of the onset in /ʁ/ 于 coloring the /n/ coda of 單 to make its use for /r/ in the source phonetically appropriate.

  3. R. Fenwick said,

    July 17, 2021 @ 5:31 am

    Regarding the suggestion that transcriptions of the name of Khotan with the character 于 offer support for a reconstruction *gi̯u: the claim is fine in itself, but it’s served equally well or better with one or more of the alternative reconstructions of Baxter and Sagart (*ɢʷ(r)a) as well as Zhengzhang’s Old Chinese (*ɢʷa) and Coblin’s Eastern Han (*gjwah). I agree with Peter Golden that it seems peculiar not to cite more recent reconstructions. While Karlgren’s work was important, it’s also very close to a century old and a great deal of fruitful work has been done in the field since then.

    Personally, on this topic I find Sasha Vovin’s hypothesis—building upon the earlier suggestion of Edwin Pulleyblank to etymologise the Xiongnu titles 單于 and 護于 as Yeniseian (leaving the hypothesis of daruγa as an internal Mongolic derivative as a mere folk-etymology)—more compelling. It’s been discussed here in comments before, and I’ll happily content myself with just linking to the post here. What doesn’t appear to have been mentioned before is that a Yeniseian origin would also match well with the interconnection suggested by Stefan Georg between Xiongnu 撐犁 ‘sky, heaven’, Common Turkic *teŋrī, Middle Mongol tngri ‘id.’, and Proto-Yeniseian *tɨŋgᴠr ‘high’ (cp. Ket tɨ́ŋkɨl, Yugh tɨŋgɨl, Pumpokol tokardu ‘id.’). Not to mention that this very same Xiongnu lexeme appears also in the chanyu’s full titulary as attested in the Hanshu, viz. 撐犁孤塗單于, giving some indirect support to the idea that 單于 might also be of Yeniseian origin.

  4. Chris Button said,

    July 17, 2021 @ 6:49 am

    (*ɢʷ(r)a) as well as Zhengzhang’s Old Chinese (*ɢʷa)

    Let's at least get those pesky raised ʷ out of the Old Chinese forms and assume the later rounding in the syllable was a secondary development. Also *ɢ- is hardly likely from an articulatory perspective, so let's stick with *ʁ- please.

  5. Chris Button said,

    July 17, 2021 @ 6:57 am

    Common Turkic *teŋrī, Middle Mongol tngri ‘id.’, and Proto-Yeniseian *tɨŋgᴠr ‘high’ (cp. Ket tɨ́ŋkɨl, Yugh tɨŋgɨl, Pumpokol tokardu ‘id.’).

    The recent mistaken reconstructions of 天 with an original lateral have pulled 天 away from these forms, but the correspondence is tempting. At least Schuessler (2007:495, 2009:319) still keeps *tʰ for 天 (the original phonetic component is 丁 after all).

  6. R. Fenwick said,

    July 18, 2021 @ 4:42 am

    @Chris Button: your comment reads as a direct rebuke to me, but I can only cite the forms as reconstructed by the researchers I named; offering critique of the internal Sinitic reconstructions is well beyond my expertise.

    (Also in my defence, as a Caucasologist I'm also just very used to phonologies others would dismiss as typologically extreme: especially in the context of Old Chinese likely having a vertical vowel system, another typological rarity that you've explained is nonetheless what's supported by the data, I suppose I'm primed to view even phonemic labialisation as entirely unsurprising—it's a perfectly normal consequence of the formation of a vertical vowel system. And my Caucasian background means I don't really flinch at *ɢ any more either; at least *ɢ and *ɢʷ must be reconstructed for Proto-North-West Caucasian, and perhaps also *ɢʲ and *ɢᶣ too.)

  7. Chris Button said,

    July 18, 2021 @ 5:28 am

    @ R. Fenwick

    No not a rebuke; more a plea to the field. So nothing personal intended. Although, as I think you’re aware, personally I always love your NWC contributions to Language Log.

    The comparison between the viability of vertical vowel systems and voiced uvular plosives does obscure one key point though. The vertical vowel system is a question of underlying phonology (as we all know, all those NWC languages are full of vowels on the surface when people actually speak—as would have been Old Chinese); a voiced uvular plosive is however an articulatory challenge on the surface. That’s not to say voiced uvular plosives don’t exist (we have an IPA symbol after all) but they are incredibly rare. There’s also a good reason why we don’t have a uvular nasal in Old Chinese too.

    A broader connected question is what is the aim of historical reconstruction. If we’re just going for abstract symbols or notations of differences without any regard for how things may surface, then it just becomes an exercise in algebraic symbols. That then gives license to keep piling on other variables to rigidly account for unexpected or less common divergences, rather than just look at it from the perspective of surface variation between speakers in a living language at any point in time.

  8. Chris Button said,

    July 18, 2021 @ 5:53 am

    Oh and as for ʷ, the issue is that I don’t see a need for it in this particular word in Old Chinese since we can account for the rounding via the uvular. Incidentally, the same thing happens in coda position. The Pulleyblank formulation that I use is his 1977-8 proposal for -ʁ and -q where others may have something like -w and -kʷ or -kw. The distributional lack of an otherwise likely nasal counterpart is readily accounted for if a uvular reconstruction is adopted. Furthermore the inconsistency of rounding in Middle Chinese reflexes is better accounted for by the uvular itself not being rounded with the secondary rounding then occurring in some instances. Why Pulleyblank abandoned that proposal is another matter (personally I think it’s brilliant).

  9. David Marjanović said,

    July 18, 2021 @ 6:44 am

    Separately a straight shift of coda /r/ to /n/ is unconvincing for me

    Do you mean in this particular case, or as a matter of typological probability? Because r/n shifts in both directions are reconstructed elsewhere.

    (I also prefer Baxter & Sagart's explanation through different dialects, for which they present some independent evidence, over "occasional confusion" as a matter of scientific principle: it's a lot more easily falsifiable.)

    There’s also a good reason why we don’t have a uvular nasal in Old Chinese too.

    Uvular nasals are about as common as [q] phonetically. A phonemic distinction between velar and uvular nasals is what's extremely rare (quite a bit more so than [ɢ] and /ɢ/).

  10. Chris Button said,

    July 18, 2021 @ 7:28 am

    @ David Marjanović

    I don’t quite follow your comment about uvular nasals. And, to be clear, I’m not talking about allophones of nasals before uvular consonants, but standalone uvular nasals.

    Unfortunately I’m not sure I’m completely following your comment on dialects either. Languages are living things, and whether you call it “dialect” or “allofamic variation” (or whatever) doesn’t really matter when reconstructing it. What matters is the evidence, how much you have of it and whether it’s linguistically plausible on internal and typological grounds.

