The sound and sense of Tocharian


Readers of Language Log will certainly be aware of Tocharian, but when I began my international research project on the Tarim Basin mummies in 1991, only a tiny handful of esoteric researchers had ever heard of the Tocharians and their language.  They had gone extinct more than a millennium ago and remained unknown until fragmentary manuscripts were discovered in the early part of the 20th century.  Those manuscripts were deciphered in the first decade of that century by Sieg und Siegling (I always love the sound of their surnames linked together by "und"), two German Indologists / philologists: Emil Sieg (1866-1951) and Wilhelm Siegling (1880-1946).

It wasn't long after the decipherment of Tocharian by Sieg und Siegling that historical linguists began to realize the monumental importance of this hitherto completely unknown language.  First of all, it is the second oldest — after Hittite — Indo-European language to branch off from PIE.  Second, even though its historical seat was on the back doorstep of Sinitic and it loaned many significant words (e.g., "honey", "lion") to the latter, it is a centum (Hellenic, Celtic, Italic and Germanic) language lying to the east of the satem (Indo-Iranian and Balto-Slavic) IE languages.  (PROVISO:  some sophists will undoubtedly argue that the centum-satem split in Indo-European is meaningless; it has happened before on Language Log and elsewhere, but I think it does matter for the history of IE languages and the people who spoke them.)  Third, Tocharian has grammatical features that resemble Italic, Celtic, and Germanic (i.e., northwest European languages) more than they do the other branches of IE.  (STIPULATION:  certain casuists will surely argue that such differences are meaningless, but I believe they are crucial for comprehending the nature of the spread of IE in time and space.)  Etc.

Because their physical, textual, and cultural remains were indisputably found in the Tarim Basin, the Tocharians naturally became a primary focus of my investigations in Eastern Central Asia during the more than two decades from the nineties through 2012.

As I mentioned in another recent post:

…it was my great, good fortune to work together for years with J.P. Mallory, who is both a dirt archeologist and a solid historical linguist, on the mummies of the Tarim Basin.  In his lifelong dedication to solving the "Tocharian Problem", Mallory has always focused his attention both on the archeological and the linguistic aspects of the conundrum.

As Jim said in an e-mail that he wrote to me on Friday, "I'm an archaeologist and would prefer to die with a trowel in my hand."  Since he is also a great historical linguist, I take this ringing declaration as confirmation of my cherished belief that archeologists and linguists need to work closely together in order to make sense of the data concerning the origins and evolution of human cultures.

Another great, good fortune of mine is being at the same university as Donald Ringe.  I took his Tocharian class about twenty years ago.  There were about twenty participants in that class.  The course was a real mind-bender, but it produced several students who went on to become outstanding Tocharianists in their own right, one such being Ron Kim.  Since I had forgotten about three-quarters of what I had learned the first time, I took it again this semester as a refresher.  This time around there are only about seven or eight students in the class, but I predict that several of them will go on to become distinguished practitioners of the noble art of philology, one such being Diana Shuheng Zhang, who has superlative Chinese skills, e.g., "Ancient Chinese mottos" (4/5/20), "Bear talk" (11/15/19), etc.  And here's a little sample of what Diana can already do with Tocharian:  "Tocharian love poem" (4/1/20).  Incidentally, Diana is also advanced in Sanskrit and knows Prakrit, Gandhari, and so on (no point in listing all of her other languages here).

To tell the truth, though, what really prompted me to write this post is that I happened upon these recordings of Tocharian on a site that also offers recordings of scores of other languages, some of them quite ancient and obscure (e.g., Hittite, Old Hittite, Old Egyptian, Old Japanese, Avestan, Khotanese, Proto-Indo-European, Gothic, Karachay-Balkar, Emilian, Polabian, Aromanian, Melpa, and so forth).

"The Sound of the Tocharian A Language (Excerpt from the Maitreyasamiti-nāṭaka)"

"The Sound of the Tokharian A Language (Punyavanta Jataka) — Agnean"

"The Sound of the Tocharian B Language (Excerpt from the 'Last Supper of the Buddha)"

"The Sound of the Common Tokharian Language (A Short Story) — Proto-Tocharian, 1000 BC"

Some friends who have heard these recordings got the impression that the persons who are making them might be native speakers of Russian or some other Slavic language.  What do you think?

Selected readings

1. the title

2. the language, the people, and their history

3. archeology and language

The origin of the Tocharians and their relationship to the Yuezhi (月氏) have been debated for more than a century, since the discovery of the Tocharian language. This debate has led to progress on both the scope and depth of our knowledge about the origin of the Indo-European language family and of the Indo-Europeans. Archaeological evidence supporting these theories, however, has until now sadly been lacking.



10 Comments

  1. David Marjanović said,

    May 4, 2020 @ 8:06 am

    "The Sound of the Tocharian A Language (Excerpt from the Maitreyasamiti-nāṭaka)"

    The accent of the introduction is not Russian. It could be Beng[ɒː]li.

    The accent used to read the text could also be Bengali (with additions necessary for Tocharian, like ly – a sound Russian happens to have – and ä), but its intonation is so consistent that I think we're listening to a computer.

    [Tocharian] loaned many significant words (e.g., "honey", "lion") to [Sinitic]

    I'm curious why you think "many". I mean, I appreciate how difficult the problem is (involving the reconstruction of various Pre-Tocharian stages and the pronunciation of various northwestern Sinitic varieties at various times…), so it is definitely likely that some have not yet been identified. But the lists of Tocharian loanwords in Sinitic that have been proposed in the last 25 years are rather short; "honey" is the only one everybody agrees on.

    centum (Hellenic, Celtic, Italic and Germanic)

    And Hittite + Palaic within Anatolian.

    PROVISO: some sophists will undoubtedly argue that the centum-satem split in Indo-European is meaningless; it has happened before on Language Log and elsewhere, but I think it does matter for the history of IE languages and the people who spoke them.

    Do you think Tocharian had early intense contact with another centum branch?

  2. Victor Mair said,

    May 4, 2020 @ 8:37 am

    How many things are there in the world that "everybody agrees on"? That the earth is round? The pronunciation of "spaghetti" by young children?

  3. Kenny Easwaran said,

    May 4, 2020 @ 10:25 am

    Is there a good place we can read about the relationships between Tocharian and the western Indo-European languages, to understand more which of these relationships are shared innovations that indicate sustained contact, rather than preservations of the earlier state that don't indicate any particular affinity?

  4. Andreas Johansson said,

    May 4, 2020 @ 1:29 pm

    If I understand correctly, the centumness of Hittite and Palaic must be convergent with that of the western branches, because Luwian and therefore proto-Anatolian retain a three-way distinction, but that of Tocharian is at least potentially a shared innovation with Germanic etc.

  5. M. Paul Shore said,

    May 4, 2020 @ 7:20 pm

    Glad to know that J. P. Mallory–whose Encyclopedia of Indo-European Culture, which he co-edited with D. Q. Adams, is a work I've perused with great fascination–has no intentions of throwing in the trowel. I suppose if he does die with a trowel in his hand, "Nobody Knows the Trowel I've Seen" would be an appropriate selection for the obsequies.

    But seriously . . .

    I thought it might be worth mentioning that I first heard of the Tocharians and their languages decades ago from Calvert Watkins' articles and associated materials on Indo-European in the unabridged American Heritage Dictionary (which debuted in 1969). I've heard that many current Indo-Europeanists' interest in their field originated with those same articles and materials; and undoubtedly there are thousands more individuals, maybe tens or hundreds of thousands, who've been made at least minimally aware of the Tocharians through the AHD even though they have no professional involvement in linguistics or archeology. So that'd be an example of knowledge of the Tocharians in recent decades that, thankfully, extends beyond just a handful of esoteric researchers.

  6. Victor Mair said,

    May 4, 2020 @ 8:49 pm

    But seriously . . .

    During the 90s, I gave scores of lectures on the Tarim mummies all over North America, Europe, and beyond. When I asked the audiences, ranging from around fifty to two or three hundred, whether anyone knew who the Tocharians were, it was extremely rare for anyone to have heard of them. By the end of that decade, a few people came to me armed with questions about the Tocharians!

  7. Chris Button said,

    May 4, 2020 @ 9:28 pm

    @ David Marjanović

    "honey" is the only one everybody agrees on.

    I disagree. Only recently here on LLog we discussed Guillaume Jacques' attempt to discredit the Tocharian origin there too, although he did later recant that.

    You mention the difficulties in reconstructing Old Chinese, but it's more fundamental than that.

    Well-established words of foreign origin (e.g., Tocharian "lion", "rug", "chariot", etc.) should not be automatically dismissed as soon as they fall foul of some new approach to Old Chinese reconstruction. That pertains to the numerous loanwords from Mon-Khmer, Semitic and other Indo-European languages too.

    The cases of "horse" and "mage", discussed only a few posts back ( https://languagelog.ldc.upenn.edu/nll/?p=46914 ), are classic examples. If a reconstruction of Old Chinese suddenly makes such archaeologically supported comparisons look less compelling (i.e., it's unlikely just coincidence that these two words look the same), perhaps the problem is not with the comparison but rather with the newfangled approach to Old Chinese reconstruction.

    There's usually good internal evidence that a word is a loan in Old Chinese. Something funny is usually going on–that may be in its phonology, orthographic variability/confusion, or simply its existence as an etymological isolate. It's heartening to reconstruct a form like "cassia" and then to have a convincing loan naturally surface without too much special pleading.

    @ Victor Mair

    Would now be a good time to request a post on 母猴?

  8. Victor Mair said,

    May 5, 2020 @ 7:00 am

    Bless you, Chris Button, for your voice of sanity and reason.

    Yes, I would certainly love to do a post on 母猴 (沐猴﹑馬猴﹑獼猴, etc. [?]). Lord knows it's many years overdue, but I've got to finish grading and take care of other urgent matters before embarking on some big summer projects. Maybe in the brief space between the end of this semester and the start of the summer would be a good time to dig up my old, scattered notes on this subject. In any event, thanks for the periodic reminder. It is indeed a very interesting question, one that fully deserves a Language Log post.

  9. David Marjanović said,

    May 6, 2020 @ 5:00 am

    If I understand correctly, the centumness of Hittite and Palaic must be convergent with that of the western branches, because Luwian and therefore proto-Anatolian retain a three-way distinction, but that of Tocharian is at least potentially a shared innovation with Germanic etc.

    Yes, except Germanic etc. likewise share innovations with Indo-Iranian and Balto-Slavic that Tocharian lacks. One example off the top of my head is high productivity of thematic verbs (they're marginal in Tocharian and absent in Anatolian).

    There's also internal evidence that "West IE" became kentum separately from Hellenic. In the former, the outcome of the consonant cluster *kʲw has completely merged with the outcome of *kʷ (i.e. still [kʷ] in historical times). In the latter, at least between vowels, the consonant cluster kept its total length and first became /kʷː/, as seen in hippos. That looks like Hellenic already had phonemic consonant length when it underwent the kentum merger, but "West IE" did not.

    Well-established words of foreign origin (e.g., Tocharian "lion", "rug", "chariot", etc.) should not be automatically dismissed as soon as they fall foul of some new approach to Old Chinese reconstruction. […]

    The cases of "horse" and "mage", discussed only a few posts back ( https://languagelog.ldc.upenn.edu/nll/?p=46914 ), are classic examples. If a reconstruction of Old Chinese suddenly makes such archaeologically supported comparisons look less compelling (i.e., it's unlikely just coincidence that these two words look the same), perhaps the problem is not with the comparison but rather with the newfangled approach to Old Chinese reconstruction.

    Thank you! That's the explanation I was waiting for in the thread you link to.

  10. David Marjanović said,

    May 6, 2020 @ 5:02 am

    Oops, there should be an asterisk on the Proto-Hellenic /kʷː/.
