Tuoba and Xianbei: Turkic and Mongolic elements of the medieval and contemporary Sinitic states

James Millward sent in a very interesting and important communication (copied in full below) touching upon the ethnic composition, a thousand and more years ago, of what has now become the People's Republic of China (PRC), especially its Turkic and Proto-Turkic components, together with its proto-Mongolic and para-Mongolic congeners.

Since it is of crucial significance for the early middle, middle, and modern history of the East Asian Heartland (EAH) and Extended East Asian Heartland (EEAH) (see the second item by Victor H. Mair in the "Selected readings"), this is a topic that I have long wanted to address in extenso on Language Log, so I welcome Professor Millward's timely submission on the origins and identification of "Tuoba".

Inasmuch as this lengthy post is chiefly about a group called Tuoba (in Modern Standard Mandarin [MSM] pronunciation of the Sinitic / Sinographic transcription of their ethnonym), supposedly a clan of a people called Xianbei (MSM pronunciation of the Sinitic / Sinographic transcription of their ethnonym), and because it is a very thorny and complicated issue having contemporary political implications, we had better gain a modicum of familiarity with who the Tuoba and Xianbei were, as well as where and when they lived.


The Tuoba (Middle Chinese: *tʰak-bɛt), also known as the Taugast or Tabgach (Old Turkic: Tabγač), was a Xianbei clan in Imperial China.

During the Sixteen Kingdoms period in northern China, the Tuoba clan established and ruled the dynastic state of Dai from 310 to 376. In 386, the Tuoba clan restored Dai, only to rename the dynasty "Wei" (known retroactively in Chinese historiography as the "Northern Wei") in the same year. The Northern Wei was a powerful dynasty that unified northern China after the Sixteen Kingdoms period and became increasingly sinicized. As a result, from 496, the name "Tuoba" disappeared by an edict of the Emperor Xiaowen of Northern Wei, who adopted the Han surname of Yuan (元). After the Northern Wei split into the Eastern Wei and Western Wei in 535, the Western Wei briefly restored the Tuoba name in 554.

A branch of the Tanguts originally bore the surname Tuoba, but their chieftains were subsequently bestowed the Chinese surnames Li (李) and Zhao (趙) by the Tang dynasty and the Song dynasty respectively. The Tangut Tuoba clan later adopted the surname Weiming (嵬名) and eventually established the Western Xia dynasty in northwestern China.

The Tuoba and their Rouran enemies descended from common ancestors. The Weishu stated that the Rourans were of Donghu origins and Tuoba originated from Xianbei, who were also Donghu's descendants. The Donghu ancestors of Tuoba and Rouran were most likely proto-Mongols. Nomadic confederations of Inner Asia were often linguistically diverse, and Tuoba Wei comprised the para-Mongolic Tuoba as well as assimilated Turkic peoples such as Hegu (紇骨) and Yizhan (乙旃); consequently, about one quarter of the Tuoba tribal confederation was composed of Dingling elements as Tuoba migrated from northeastern Mongolia to northern China.

Alexander Vovin (2007) identifies the Tuoba language as a Mongolic language. On the other hand, Juha Janhunen proposed that the Tuoba might have spoken an Oghur Turkic language. René Grousset, writing in the early 20th century, identifies the Tuoba as a Turkic tribe. According to Peter Boodberg, a 20th-century scholar, the Tuoba language was essentially Turkic with Mongolic admixture. Chen Sanping observed that the Tuoba language contains both elements. Liu Xueyao stated that the Tuoba may have had their own language which should not be assumed to be identical with any other known languages.



The Xianbei (/ʃjɛnˈbeɪ/; Chinese: 鮮卑; pinyin: Xiānbēi) were a Proto-Mongolic ancient nomadic people that once resided in the eastern Eurasian steppes in what is today Mongolia, Inner Mongolia, and Northeastern China. They originated from the Donghu people who splintered into the Wuhuan and Xianbei when they were defeated by the Xiongnu at the end of the 3rd century BC. The Xianbei were largely subordinate to larger nomadic powers and the Han dynasty until they gained prominence in 87 AD by killing the Xiongnu chanyu Youliu. However, unlike the Xiongnu, the Xianbei political structure lacked the organization to pose a concerted challenge to the Chinese for most of their time as a nomadic people.

After suffering several defeats by the end of the Three Kingdoms period, the Xianbei migrated south and settled in close proximity to Han society and submitted as vassals, being granted the titles of dukes. As the Xianbei Murong, Tuoba, and Duan tribes were one of the Five Barbarians who were vassals of the Western Jin and Eastern Jin dynasties, they took part in the Uprising of the Five Barbarians as allies of the Eastern Jin against the other four barbarians, the Xiongnu, Jie, Di and Qiang.

The Xianbei were at one point all defeated and conquered by the Di-led Former Qin dynasty before it fell apart not long after its defeat in the Battle of Fei River by the Eastern Jin. The Xianbei later founded their own dynasties and reunited northern China under the Northern Wei dynasty. These states opposed and promoted sinicization at one point or another but trended towards the latter and had merged with the general Chinese population by the Tang dynasty. The Northern Wei also arranged for ethnic Han elites to marry daughters of the Tuoba imperial clan in the 480s. More than fifty percent of Tuoba Xianbei princesses of the Northern Wei were married to southern Han men from the imperial families and aristocrats from southern China of the Southern dynasties who defected and moved north to join the Northern Wei.

Paul Pelliot tentatively reconstructs the Later Han Chinese pronunciation of 鮮卑 as */serbi/, from *Särpi, after noting that Chinese scribes used 鮮 to transcribe Middle Persian sēr (lion) and 卑 to transcribe foreign syllable /pi/; for instance, Sanskrit गोपी gopī "milkmaid, cowherdess" became Middle Chinese 瞿卑 (ɡɨo-piᴇ) (> Mand. qúbēi).

On the one hand, *Särpi may be linked to the Mongolic root *ser ~ *sir, which means "crest, bristle, sticking out, projecting, etc." (cf. Khalkha сэрвэн serven), possibly referring to the Xianbei's horses (semantically analogous with the Turkic ethnonym Yabaqu < Yapağu 'matted hair or wool', later 'a matted-haired animal, i.e. a colt'). On the other hand, the Book of Later Han and the Book of Wei stated that, before becoming an ethnonym, Xianbei had been a toponym referring to the Great Xianbei mountains (大鮮卑山), which are now identified as the Greater Khingan range (simplified Chinese: 大兴安岭; traditional Chinese: 大興安嶺; pinyin: Dà Xīng'ān Lǐng).

Shimunek (2018) reconstructs *serbi for Xiānbēi and *širwi for 室韋 Shìwéi < MC *ɕiɪt̚-ɦʉi. This same root might be the origin of ethnonym Sibe.


Now we come to the significant new materials submitted by James Millward:

In a 2019 speech regarding ethnic affairs, PRC President Xi Jinping included the following historical summary:
[VHM:  All MSM transcriptions in this section have been added by me.]

Wǒmen yōujiǔ de lìshǐ shì gè mínzú gòngtóng shūxiě de. Zǎo zài xiānqín shíqí, wǒguó jiù zhújiàn xíngchéngle yǐ yánhuáng huáxià wèi níngjù héxīn,“wǔ fāngzhīmín” gòng tiānxià de jiāoróng géjú. Qín guó “shū tóngwén, chē tóng guǐ, liàng tóng héng, xíng tóng lún”, kāiqǐle Zhōngguó tǒngyī de duō mínzú guójiā fāzhǎn de lìchéng. Cǐhòu, wúlùn nǎge mínzú rù zhǔ zhōngyuán, dōu yǐ tǒngyī tiānxià wéi jǐrèn, dōu yǐ Zhōnghuá wénhuà de zhèngtǒng zìjū. Fēnlì rú Nánběicháo, dōu zì xǔ Zhōnghuá zhèngtǒng; duìzhì rú Sòng Liáo Xià Jīn, dōu bèi chēng wéi “táohuā shí”; tǒngyī rú Qín Hàn, Suí Táng, Yuán Míng Qīng, gèng shì “liù hé tóng fēng, jiǔ zhōu gòng guàn”. Qín Hàn xióngfēng, dà táng qìxiàng, kāng qián shèngshì, dōu shì gè mínzú gòngtóng zhùjiù de lìshǐ. Jīntiān, wǒmen shíxiàn Zhōngguó mèng, jiù yào jǐnjǐn yīkào gè zú rénmín de lìliàng.

(Xi Jinping, “Speech at the National Commendation Meeting for Minzu Unity and Progress” (Zai quanguo minzu tuanjie jinbu biaozhang dahui shang de jianghua), 27 September 2019, posted on the website of the National Ethnic Affairs Commission of the PRC)
My question is about táohuā shí 桃花石 "peach blossom stone," but first, here's my translation of Xi's passage: 
Our long history is written by all ethnic groups.  As early as the pre-Qin era, our country gradually formed into an amalgamated configuration from a nucleus coalesced from the Hua-Xia people under Emperor Yan and the Yellow Emperor, and the “peoples of five directions” sharing Tianxia [VHM:  "All-Under-Heaven"].  Qin’s standardization of written script, carriage axle-lengths, weights and measures, and customs and values, launched the process of development of China’s unified multi-national state. After that, no matter what minzu came into the Central Plain to live, all saw unifying Tianxia as their duty, and all considered themselves orthodox in Chinese cultural terms.  If divided like the Southern and Northern Dynasties, they all bragged that they were Chinese orthodox (Zhonghua zhengtong).  If in mutual confrontation, like Song, Liao, Xixia (Tangut) and Jin, all were called Tabghach [i.e., all were called “China” by outsiders].  When unified like Qin and Han, Sui and Tang, Yuan, Ming and Qing, it was even more a case of “customs and civilization in the Six Directions are all as one, laws and decrees apply uniformly across the Nine Divisions.”  The heroic style of Qin and Han, the vital spirit of Great Tang, the prosperous age of Kangxi and Qianlong emperors [of the Qing] were all history collectively forged by every minzu.  Today, in realizing the China Dream, we must closely rely on the strength of the people of every minzu.
[VHM:  Note that Millward leaves mínzú 民族 untranslated.  It can mean lots of things:  nation, nationality, people, ethnic group, race, volk.  Since these are all touchy terms, it is understandable why the translator would shy away from them.  See the "Selected readings" for in-depth discussions of mínzú 民族.]

There were a number of on-line articles explaining the quotes and allusions in this speech of Xi's, and one of the things they explained was "taohuashi".  This Chinese wiki about it states in its first line that the name "taohuashi" is Old Turkic and mentions three theories about the origins of the word Tabghach, which as you know came to be used in a generic sense for "China" in such places as the Orkhon [Old Turkic] inscriptions and even a Byzantine text.  I still haven't figured out how to use the medieval and ancient Chinese pronunciation dictionaries — are there online versions now?  I really should, but in the meanwhile, what do you think of the etymologies linking Tabghach to "Táng jiā 唐家" or “Dà Hàn 大汉”?  I am suspicious of both, since they seem so sino-centric and apparently phonetically distant from Tabghach.  I don't know the Tang or pre-Tang pronunciation of tuòbá 拓跋; but seemingly you'd need a metathesis to get the labial b before the alveolar (or further back) g / k/ kh.  I don't know how probable that is.  

Notes by VHM:

Trying to link "Táng jiā 唐家" or “Dà Hàn 大汉” to Tabgach makes no sense.  Phonologically they are remote from Tabgach, even in Middle Sinitic reconstruction.  Semantically they both strain credulity, since the former means "Tang family" and the latter means "Great Han", whereas "Tabgach" is surely the transcription of a non-Sinitic ethnonym.

tuòbá 拓跋

Middle Sinitic: /tʰɑk̚  buɑt̚/



táohuāshí 桃花石

Middle Sinitic: /dɑu  hˠua  d͡ʑiᴇk̚/

Etymology 1



      1. peach blossom stone (a pink-coloured mineral produced in Hunan)

Etymology 2

Phono-semantic matching of Turkic; compare Old Turkic (tabγač, Tuoba, a Xianbei clan; the Chinese). Compare 拓跋 (MC tʰɑk̚ buɑt̚), 拓拔氏 (MC tʰɑk̚ bʉɐt̚ d͡ʑiᴇX).

Proper noun


      1. (obsolete) Name of China used by people in Central Asia in the 13th century.


Anyway, do you have any opinion on these three theories about "Tabghach"?  My surmise is that when bringing this up, Xi's speech-writer thought he was saying "Táng jiā 唐家" in an erudite way, and not pointing out that "China" has been called by Altaic names for centuries.   Of course, it's perfectly consistent with other examples (Khitai, and Qin = Sin = China itself) that the indigenous name of the Northern Wei (Tuoba / Tabghach) would be used across Eurasia to refer to states in north China even centuries after the Wei state was gone.

Would Tabghach perhaps reflect an Altaic pronunciation from which Tuoba was a transcription into Chinese? Or might Tabghach be an Altaicization of Tuoba (as one of these theories suggests)–but then where did Tuoba come from in the first place?

The issues that Jim Millward has raised are of utmost significance for coming to terms with Xi Jinping's pseudo-ethno-etymological maunderings.  They may seem plausible for Han / PRC patriots and nationalists, but their ramifications for genuine historical and linguistic research are worrisomely disturbing.


  1. martin schwartz said,

    May 16, 2022 @ 5:09 pm

    All I'm competent to say is that as to the paragraph "Paul Pelliot…",
    the (late) Middle Persian word for 'lion' is not sēr but šēr;
    the earlier Middle Persian form is šaGr (my G = gamma, voiced velar fricative). The other Middle Iranian languages have š- in cognate forms except for Khwarezmian, which has sVrG. What consequence
    the correct š- has for Pelliot's suggestion I cannot say, but it seems relevant.
    Martin Schwartz

  2. Mehmet Oguz Derin said,

    May 16, 2022 @ 5:36 pm

    I am not done with the complete legwork yet, but there is (at least) one hypothetical element in Turkic languages that might help with the linking here. It might be the case that the original pluralizer of Turkic was rather a simpler consonant (making the common +lAr a compound rather than an atom). Though, by the time of writing (Old Turkic script), this one source consonant's realization might have had entropy towards stabilizing to (listing attested ones) -t, -d, -s, -r, or the most common single consonantal representation -z. It might have been recorded or stabilized as -c (or -ch, depending on how you spell this consonant) situationally. Hence, the "Tabga" or the "Tʰak-bɛ" form might contain more important hints than the "Tabgac" or the "Tʰak-bɛt" form.

  3. Victor Mair said,

    May 17, 2022 @ 11:59 am

    From Peter Golden:

    Many thanks for this excellent post. I have a long-standing interest in the Xianbei question. Xi Jinping’s comments are a wonderful exercise in sophistry (with a hard edge).

    I have a new article ("The Kaepiči [Каепичи]") that appeared in the Festschrift marking András Róna-Tas’s 90th birthday. I cannot yet post it publicly (in Academia.edu, Research Gate, etc.) because it is so recent. Once you get past the initial pages that focus on the Kaepiči in Rus’, the article touches on their origins in the Serbi-Mongolic world.

    VHM: I will make a separate post announcing the publication of the stupendous volume in honor of András Róna-Tas, which has something for practically all readers of Language Log.

  4. Kingfisher said,

    May 17, 2022 @ 5:51 pm

    I had thought that Tabghach being the initial name of this clan of the Xianbei, of which Tuoba was a Sinicized form, was the standard theory behind the name.

    What exactly is the nefarious implication? That Tabghach / 桃花石 is being coopted into being an ethnically Han term or repurposed to mean a Han state? I cannot see that here, nor do I agree with Millward's initial impression in assuming that the speech's use of 桃花石 is meant to be parsed as 唐家. It seems to me that 桃花石 was chosen for the speech precisely because it is a generic term that both roughly means "China" while not referring to a specific state, embracing the full set of states then existing in the Chinese milieu regardless of their ethnic differences. In the relevant sentence:


    "Even when [control of the Chinese realm] was split between [the varying ethnic states of] Song [Han], Liao [Khitan], Xia [Tangut], and Jin [Jurchen], all of them were still referred to as 桃花石 [by foreigners, who did not regard any of them as not being Chinese]."

    In that regard, one could just as well substitute 桃花石 for "Cathay" without changing the meaning. This passage of the speech seems to be about affirming each people's claim to represent and lead China, not Han nationalism. To the extent there is cooption, it would be in claiming that all of these states did indeed perceive themselves (and were perceived by others) as being Chinese, and at least for the Tuoba this certainly became the case; the dynastic histories are not lacking in examples (and certainly they may be filled with lies, but that's not something one can pin on the PRC). That appears to be the sentiment behind "each ethnic group having written their part in our long history".

    (If anything, the problem with the passage seems to be its insincerity, considering the harsh assimilationist policies which the PRC has been carrying out in Xinjiang, Tibet, Mongolia, and elsewhere. One could easily imagine reworking the examples in the passage to make the same message of "E Pluribus Unum" apply to the West and its many peoples now coming together under the EU; it is not a sentiment easily associated with the actual actions of the PRC.)

  5. Hiroshi Kumamoto said,

    May 17, 2022 @ 9:12 pm

    The passage by Victor referred to by Professor Schwartz seems to be based on the remarks by Hoong Teik Toh, The -yu Ending in Xiongnu, Xianbei, and Gaoju Onomastica (Sino-Platonic Papers 146, 2005), p. 10:

    "Paul Pelliot proposed *Särbi / *Serbi (1921: 331, 1928-29: 142) based on the observation that the Chinese used 師 to transcribe Persian šēr "lion" (cf. Laufer: 4) and on the assumption that the tribal name Shiwei 室韋 *sek ui ~ 失韋 *sit ui might have been a later form of Xianbei."

    Going back to Pelliot's articles mentioned there (both available in PDF from https://altaica.ru/personalia/e_pelliot.php ):

    "Note sur les Tou-yu-houen et les Sou-p'i", TP 1921, 323-331
    "L'édition collective des oeuvres de Wang Kouo-wei" TP 1928-29, 113-182

    we notice (see esp. p. 142 fn. 6 of the latter) that Pelliot doesn't say what Victor writes there (no Middle Persian sēr). The second half of Victor's remark on the -pi part is not found in Pelliot, but comes from Toh's paper (p. 12). Incidentally the initials of 師 and 室 are similar but not identical; so Pelliot's suggestion is not quite happy unless some dialectal / later forms are to be assumed. I don't see any justification for Toh's "室 *sek", which is to be reconstructed with a dental final -t just like 失.

  6. Victor Mair said,

    May 18, 2022 @ 6:56 am

    @Hiroshi Kumamoto

    Thanks for clarifying the quotations ascribed to Paul Pelliot. You attribute them to "Victor", but they are from this source (as marked at the end of that indented section), not from me.

  7. James Millward said,

    May 18, 2022 @ 9:41 am

    Thanks to everyone for entertaining this question, and for the comments, and especially Victor for writing this out so clearly.
    I'm embarrassed he conveyed my admission about not knowing how to look up middle sinitic pronunciations, etc.

    In response to Kingfisher: The suggested etymologies to 唐家 and 大漢 were from this: https://zh.m.wikipedia.org/zh-hans/%E6%A1%83%E8%8A%B1%E7%9F%B3

    and Victor's comments confirm my suspicions. Yes, I am speculating about Xi or Xi's speechwriter having "Tang jia" in the back of their mind when writing 桃花石. But they could have used the simple transcription, the much better-known term "Tuoba." "Peach blossom stone" by contrast has meaning in Chinese (it is not just characters used for their phonetic value) and sounds very romantic in MSM; it is of course also richly sinitic, both in reference to "stone" and to "taohua", with echoes of Peach Blossom Spring, a legendary utopian hidden kingdom. It is only a guess, but my guess, based on what I know about ethnic policy in the PRC these days, is that it was those Chinese symbols the speech was going for.

    Kingfisher also suggests correctly that the overall point of the line is abundantly inclusive: all of these states, including the Song but also non-Han Jin, Liao and even Xixia, were equally known as "China" (that is, as Tabghach, which meant "China"). That is in keeping with his overall argument: all minzu that play a role in Chinese history (he doesn't say 56 minzu) are part of Chinese history, and always have been. This full-blown inclusionism is subtly but powerfully different from previous PRC approaches, such as Fei Xiaotong's 多元一體格局 that the speech references with certain keywords at the start. In a nutshell, the difference is this: Fei (and past PRC) had the various minzu coming together like a snowball over time around a Han core, and emerging in modern times as "Zhonghua," becoming self-conscious as Zhonghua in the late-19th and early 20th c. Xi and current theory keep some of that terminology, but Xi keeps pushing the moment of Zhonghua amalgamation back further in time, while keeping it all vague. But here, it's clear, everything was Zhonghua with the Qin unification or even before. And all peoples after that were Zhonghua all along. This is in keeping with arguments that the Uyghurs aren't Turkic, because they are Zhonghua–a point made in an editorial by the mayor of Urumchi lately. Some careerist linguists have been trying to argue that Uyghur is a Sinic, not a Turkic language. In short, current doctrine increasingly says that China was China avant la lettre–well before–and expects that point to be echoed in Chinese academic work. And this is having real consequences on history, archaeology and, I wouldn't be surprised, linguistics. It is doing ethnographically what has been done territorially: taking the current borders of the PRC and saying anything that ever happened in that territory is Chinese history. Another implication is the erasure of conflict: Han were Zhonghua, but so were Xiongnu (and Rong, Di, Hu, etc.). Qing conquest of Xinjiang was reunification. "China" has always been peaceful, has never invaded anyone, because everyone and everywhere, including territorial acquisitions by the PRC, are retroactively by definition Zhonghua peoples and places. It's all unifying Tianxia, the goal of all good Zhonghua rulers.

    Sorry for this long off-topic comment–but that's how I'm reading the speech, and why Xi's peculiar reference to the Tabghach, by the term Taohuashi, interests me.

