Tocharian C: its discovery and implications

[This is a guest post by Douglas Q. Adams]

For over a hundred years now linguists have known of a small Indo-European family consisting of two closely related languages, Tocharian A and Tocharian B, in the Tarim Basin of eastern Central Asia (Chinese Xinjiang). Tocharian B speakers occupied the northern edge of the Tarim Basin, north of the Tarim River, from its origin at the confluence of the Kashgar and Yarkand rivers eastward to about the halfway point to the Tarim’s disappearance into Lop Nor. Politically, Tocharian B speakers were certainly the major constituent of the population of the kingdom of Kucha, and natively they called the language (in its English form) Kuchean. To the east-north-east, in the Karashahr Basin, were speakers of Tocharian A, centered around Yanqi (Uighur Karashahr, Sanskrit Agni). On the basis of the Sanskrit name this language is sometimes referred to as Agnean, though we do not have any direct or conclusive evidence as to what the speakers themselves called it. To the east-south-east of Kucha, along the lower Tarim, was the historic kingdom of Kroraina (Chinese Loulan < Han Chinese *glu-glân). The administrative language of Loulan was Gandhari Prakrit, obviously imported into the Tarim Basin along with Buddhism from northwestern India. In documents of the Loulan variety of Gandhari Prakrit are non-Gandhari words that have been attributed to the native language of the area. Some of those non-Gandhari words look like Tocharian (e.g., kilme ‘region’ beside TchB kälymiye ‘direction’) and it has seemed a reasonable hypothesis that the native language of Kroraina/Loulan was another Tocharian language, “Tocharian C.” (That the native language of Loulan was Tocharian was first suggested by Thomas Burrow in his The Language of the Kharoṣṭhī Documents from Chinese Turkestan, 1937.) This is a reasonable hypothesis, for which the evidence is admittedly meager, and many have been (reasonably) dubious or unconvinced.

However, in December 2018 Hempen Verlag of Bremen published Klaus T. Schmidt, Nachgelassene Schriften, edited by Stefan Zimmer. One of the two Nachlass documents was an examination of some ten heretofore ignored texts written in the Kharoṣṭhī alphabet, clearly associated with Loulan, in an obviously Tocharian language that is neither Tocharian A nor Tocharian B. Relying on a single, clearly inadequate (in my opinion) phonological feature, Schmidt associates this new language, “Lolanisch” in his terminology, more closely with Tocharian B than with Tocharian A. He groups the medial cluster for the Tocharian C word ‘ox’ (okwson-) with that of Tocharian B okso against Tocharian A ops-. In reality the -kws- cluster of Tocharian C is of Indo-European date and both Tocharian B and A show independent developments thereof. While his argument will not bear the weight he puts on it, his overall conclusion is just as clearly correct. The “secondary” nominal cases in Tocharian C have shapes that are transparently related to those of Tocharian B rather than A when those two languages differ (e.g., the Tocharian C ablative in –maṃ, the exact match for Tocharian B’s –meṃ, and not Tocharian A’s –Vṣ). Likewise, the third person singular of the present tense is marked by –ṃ, just as in Tocharian B, and not as in Tocharian A with its –ṣ. One can at least imagine the possibility of a continuum of Tocharian dialects along the north side of the Tarim River which developed into two standard, written languages, one around Kucha, the other around Loulan/Kroraina. Tocharian A would have been closely related, but outside that continuum.

This new data firmly establishes the existence of a Tocharian language in the Lop Nor Basin. A rather similar hypothesis, that there was a Tocharian-speaking population in the Gansu Corridor, known to the Chinese as the Yuezhi, is hardly proved by this new data, but it is rendered a bit more plausible in that now we can imagine an unbroken chain of Tocharian languages from the upper Tarim into the Gansu Corridor. The Yuezhi, of course, driven from their home by the Xiongnu in the second century BC, migrated to western Central Asia where, ultimately, they were known to the classical world as the Tókharoi. The latter’s name was extended by early investigators (particularly Friedrich W. K. Müller in 1907) to the newly discovered languages of the Tarim Basin (A and B) under the mistaken idea that these peoples represented an eastward reflux of the Tókharoi. This reasoning was clearly wrong, but, if the Yuezhi should happen to have spoken a variety of Tocharian, the name may actually have some historical justification. The classical Tókharoi are now known to have spoken an Iranian language, but it’s quite possible that the incoming Yuezhi (whatever their original language) came to speak the language of the earlier inhabitants of their new home. (Compare the French, who today speak a Romance language but whose [partial] ancestors, the Franks, were speakers of Germanic, or the Bulgarians, who speak a Slavic language but whose [partial] ancestors, the Bulgars, spoke a variety of Turkic.) Further information and discussion, focusing on the linguistic data and issues, will appear in my review of the book to be published in the Journal of Indo-European Studies.



  1. Victor Mair said,

    April 3, 2019 @ 5:51 pm

    From Douglas Q. Adams:

    After I sent my little summary piece off to you, it occurred to me that there's another semi-implication to the establishment of Tocharian C in the greater Lop Nor region. It would strengthen the case (certainly not prove it, of course) for thinking the mummies may have been pre-Proto-, Proto- or early post-Proto-Tocharian speakers. The mummies would have been there at the right time, and now the right place, to have been attested Tocharian ancestors.

  2. Victor Mair said,

    April 3, 2019 @ 5:52 pm

    I have always felt that the earliest wave of Tarim mummy people were pre-Proto-, Proto-, or early post-Proto-Tocharian speakers. I spelt those ideas out in various places, most notably in the "Introduction" ("Priorities") and "Conclusion" ("Die Sprachamöbe: An archeolinguistic parable") in The Bronze Age and Early Iron Age Peoples of Eastern Central Asia, 2 vols. (Washington, D.C.: The Institute for the Study of Man; Philadelphia: The University of Pennsylvania Museum, 1998). Not only that, the latest archeological findings show that the earliest movement of Bronze Age peoples into the Tarim Basin came from the northeast, around the eastern edges of the Tängri Tagh / Tianshan / Heavenly Mountains, and surveys done within the last ten years or so show that there are hundreds of Bronze Age sites precisely in the area around Lop Nor.

  3. Chris Button said,

    April 3, 2019 @ 9:42 pm

    One can at least imagine the possibility of a continuum of Tocharian dialects along the north side of the Tarim River which developed into two standard, written languages, one around Kucha, the other around Loulan/Kroraina. Tocharian A would have been closely related, but outside that continuum.

    Does that also support the notion that Tocharian A (or "East Tocharian") might have been a liturgical language – i.e. separate from Tocharian B and Tocharian C as variant forms of a common vernacular?

