Language Log

An early fourth century AD historical puzzle involving a Caucasian people in North China

January 25, 2019 @ 6:52 pm · Filed by Victor Mair under Historical linguistics, Language and history, Transcription, Translation

[This is a guest post by Chau Wu]

There is a long-standing puzzle that has attracted historical linguists’ interest. This is a single sentence of 10 characters in two clauses: “秀支替戾岡, 僕谷劬禿當” (xiù zhī tì lì gāng, pú gŭ qú tū dāng). The sentence does not make sense in any of the Sinitic topolects. Evidently it records a foreign language, with the Sinographs serving as phonetic transcriptions. Indeed, the source document which gives this mysterious sentence clearly indicates this is in Jié 羯, a non-Sinitic language that showed up in China during the chaotic period known as the Sixteen Kingdoms (304-439 CE) marked by uprisings of 五胡 wŭhú ‘Five Barbarians’ (Xiōngnú 匈奴, Jié 羯, Xiānbēi 鮮卑, Dī 氐, and Qiāng 羌) against the Jìn 晉 dynasty.

Early in the Sixteen Kingdoms there came into prominence a kingdom called Hòu Zhào 後趙 (Later Zhao [319-351]) founded by a king named Shí Lè 石勒 (274-333). In 328 he found out that an army of another kingdom, Qián Zhào 前趙 (304-329), led by king Liú Yào 劉曜 (r. 318-329, d. 329) was about to invade Luòyáng 洛陽, the old capital of China. He went to consult a Kuchean* Buddhist monk, Fótúdèng 佛圖澄 (traditional reading Fótúchéng; ca. 232-348; Skt. Buddhacinga [?]), who could speak the Jié language. As recorded in the Book of Jìn (Jìnshū 晉書) in the section describing his biography, the monk made the 10-character prognostication given above.

[*VHM: Kucha was the homeland of the speakers of Tocharian B or Kuśiññe, a name that corresponds to the Kushan of Indic scripts of late antiquity. See also Tocharian people, Tocharian languages, and Kushan.]

The text in the Book of Jìn also gives a translation and summary of the sentence. With the translation known, we can now parse the sentence appropriately. The following shows the parsed reading with explanation as given by the Book of Jìn. I provide a rough English translation of the explanation in parentheses. The Sinographs are read in MSM (in Pinyin).

“xiùzhī, jūn yĕ 秀支, 軍也” (xiùzhī means ‘army/troops’);

“tìlìgāng, chū yĕ 替戾岡, 出也” (tìlìgāng means ‘to go/go out/advance’);

“púgŭ, Liú Yào Hú wèi yĕ 僕谷, 劉曜胡位也” (púgŭ refers to Liú Yào’s foreign official position, that is to say, púgŭ stands for Liú Yào);

“qútūdāng, zhuō yĕ 劬禿當, 捉也” (qútūdāng means ‘to seize/capture’);

“Cĭ yán jūn chū zhuō dé Yào yĕ 此言軍出捉得曜也” (this prognostication says that, when the troops are sent out, they can capture [Liú] Yào).

Since the Book of Jìn has already provided the answer key to the puzzle, one wonders, what puzzlement is there? Because the whole sentence appears to be in a foreign language, the question centers on what kind of language it was. The Jìnshū says it is in the Jié language. So the crux of the question is: to which known language family did Jié belong? If we can figure out the language family, then perhaps we may get some inkling about who the Jié were and where they may have come from.

Wikipedia (Jie people) lists five eminent scholars in Central Asian studies, Kurakichi Shiratori 白鳥庫吉, Gustaf J. Ramstedt, Louis Bazin, Annemarie von Gabain, and I. N. Shervashidze, who interpreted the sentence as Turkic. Louis Bazin’s interpretation is quoted by Nicholas Ostler in his Empires of the Word: A Language History of the World (p. 139). However, Edwin Pulleyblank remarked that the Turkic interpretations could not be considered very successful because they were at variance with the phonetic values of the Chinese text and with the Chinese translation. Instead, he suggested that the sentence might be related to Yeniseian languages. It prompted Alexander Vovin to propose an interpretation based on Southern Yeniseian.

The Book of Jìn describes the Jié with physical characteristics that today we would attribute to Caucasians: deep eye sockets (深目 shēn mù), high noses (高鼻 gāo bí), and bushy beards (多鬚 duō xū), suggesting that this ethnic group might have come from Central Asia or even Europe. Since I have been studying sound correspondences between ancient European and modern Taiwanese lexicons (Sino-Platonic Papers, No. 262, 2016), this “riddle, wrapped in a mystery, inside an enigma” has caught my attention. I thought that I should try my hand too, but I took approaches different from the previous researchers in two respects. Firstly, I used Modern Taiwanese to read the sinographs, instead of reconstructed Old Sinitic. Secondly, I tried to match them to ancient European languages instead of Central Asian.

In trying to solve the puzzle, it goes without saying that, if one word is found to match one language, all the remaining words must also match the same language, with the possible exception of 僕谷 púgŭ, which is said to be a Hu official title (胡位也).

Abbreviations: OE, Old English; E, English; ON, Old Norse; Gmc, Germanic; L, Latin; Gk, Classical Greek; Tw, Taiwanese (in Pē-ōe-jī); MSM, Modern Standard Mandarin (in Pinyin); PSC, pattern of sound correspondence; PT, phonetic transcription.

(1) 秀支 (Tw siù-ki), PT for ‘army’

The corresponding word is ON fylki ‘district, county, shire; battalion, host (in battle)’, or OE fylc(e) ‘a company, troop, tribe, country, province’. ON fylki has a related verb ON fylkja ‘to draw up in battle formation’ with its reflex in Swedish fylka (same meaning). De Vries reconstructs the Gmc form as *gafulkja whereas Orel has *fulkjan.

The sound correspondence between ON fylki / OE fylc(e) and Tw siù-ki 秀支 is analyzed as follows.

(1) The second segment –ki, being the same in both languages, needs no explanation.

(2) The correspondence of the first segment between fyl– and siù is due to two processes. The first one, that of the medial -l changing to -u, is fairly common across languages, as we see Latin/Greek -l changing into French -u. For example, L palma > F paume ‘palm’; L salsa > F sauce (> E sauce); L falsus > F faux ‘false’; Gk psalmós > F psaume ‘psalm’. For ON -l changing to Tw -u, see examples in Wu (p. 88).

The second process is less obvious. This was due to a lack of the /f/ phoneme in Old Sinitic (and Taiwanese), known in Chinese historical phonology as 古無輕唇音, thereby foreign /f/ was frequently turned into /h/. This /h/ was not so stable, and often got assibilated to become /s/. Thus, the mechanism of the process can be viewed as a chain of reactions: f- > *h- > *hs- > s-. Because this is a rather unusual sound change, a few examples showing the change from European f- to Tw s- are given below.

ON fyrstr ‘first’ > Tw siú 首 as in siú–sian 首先 ‘first, foremost’

OE fann ‘fan’ > Tw siàn 扇 ‘fan’

L fanaticus ‘frenzied’ > *fan- > Tw siàn 煽 (verb) ‘to instigate someone(s) into frenzies’

OE fiþere ‘wing’ > *fith– > Tw si̍t 翼 ‘wing’

Proto-Gmc *fidwōr ‘four’ > *fid– > Tw sì 四 ‘four’ (the third tone reflects loss of -d)

OE fǣhþ(u), fœþh (> E feud) > Tw siû 仇 ‘feud, enmity’

ON fyrnd ‘decay, dilapidation’ > Tw sún 損 as in sún-hāi 損壞 ‘damage, decay’

ON fá ‘to grasp with the hands, get hold of’ > Tw sa (same meaning; no sinograph)

(2) 替戾岡 (Tw thè-lē-kong), PT for ‘to go out’

The corresponding word is OE dragan (intransitive verb) ‘to draw oneself, go’ or ON draga (intransitive verb) ‘to move, draw’. The first two segments 替戾 thè–lē– came about because OE dragan / ON draga has an initial cluster of dr-. In the Old Chinese period, initial clusters existed, but by the time of the Sixteen Kingdoms, clusters may have disappeared. Therefore, the Germanic dra– underwent epenthesis to become *dara-, eventually becoming Tw thè–lē– for 替戾. The same process is also seen in the loan of E ‘glass’ to Japanese as garasu (ガラス) and in the Mycenaean Greek ‘tripod’ (Classical Gk τρίπους) written in the syllabic Linear B as tiripo.

As regards the third segment 岡, it is pronounced in Tw as kong, which reflects an original *kang (same as MSM gāng). The etymological origin for OE dragan and ON draga has been reconstructed as Gmc *dragana. It is a strong verb of Class 6. From the reconstructed inflection pattern, the most likely form to account for 替戾岡 is the indicative 3rd person plural form, *dragandi. Thus, Gmc *dragandi > *dragand– > *daragang > 替戾岡 (Tw thè–lē-kong).
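The epenthesis step invoked above (dra- > *dara-, and Linear B tiripo for tri-) can be stated mechanically. The following is only a minimal sketch of copy-vowel epenthesis, under the simplifying assumptions that words are written in plain Latin letters and that the inserted vowel echoes the first vowel after the cluster. The function name `break_initial_cluster` is hypothetical and is not part of the analysis in this post.

```python
VOWELS = set("aeiou")  # assumption: plain five-vowel inventory, no length marks


def break_initial_cluster(word: str) -> str:
    """Split a two-consonant initial cluster by inserting a copy of the
    first following vowel between the two consonants."""
    if len(word) < 3:
        return word
    first, second = word[0], word[1]
    # If either of the first two letters is a vowel, there is no initial cluster.
    if first in VOWELS or second in VOWELS:
        return word
    # Find the first vowel after the cluster and echo it inside the cluster.
    for ch in word[2:]:
        if ch in VOWELS:
            return first + ch + word[1:]
    return word  # no vowel found; leave the word unchanged


print(break_initial_cluster("dra"))    # dara
print(break_initial_cluster("tripo"))  # tiripo
```

The sketch covers only the cluster-breaking step. The later shifts proposed in the text (for example *dara- to thè–lē–) are separate claims that it does not model.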

(3) 僕谷 (Tw po̍k-kok), PT for the title of an official rank in the Hu government

Since it is clearly stated in the Book of Jìn that the monk Fótúdèng used the title to refer to Liú Yào 劉曜, who was a Xiongnu, not a Jié, obviously this title need not be sought in Germanic. However, I surmise that it may be related to L Prōcūrātor ‘one who is in charge, an administrator’, appropriate for a king. The derivation involves gemination of the medial c, so that Prōcūrātor > Prōcūr– > *Prōc–cūr– > po̍k–kok 僕谷 with a sound change of -ur > -ok.

(4) 劬 (Tw kû, khu), PT for ‘prison’

The corresponding word is OE clūs ‘an enclosure, prison’. Following the loss of the -s coda according to PSC-8 (Wu, pp. 93-100), we obtain *clū-, which upon dimidiation of the cluster, gives rise to 劬 kû / khu. Interestingly, through alternative splicing of the cluster, OE clūs gives rise to the doublets Tw ku 拘 and liû 留, which then recombine to form the compound Tw ku-liû 拘留 ‘detention’. Sino-Japanese 拘留所 kōryūjo means ‘detention room’.

(5) 禿 (Tw thut), PT for ‘tower’, here for ‘prison tower’

OE has three words for ‘tower’: (1) the native word is stípel; (2) torr is a loan from L turris (cf. Sp, It torre ‘tower’); and (3) tūr is a loan from either Norman French or Latin directly. The most likely candidate for 禿 is OE tūr. Normally, -ur is converted to -ok as has been shown above for –cur > -kok 谷. Indeed, the Sino-Japanese reading of 禿 is toku (tūr > *tok > toku). However, Tw has a second way of conversion for -ur to become -ut so that OE tūr > Tw thut 禿.

(6) 當 (Tw tong), PT for ‘take, seize’

The corresponding words are OE tacan and ON taka. ON taka means ‘to take, catch, seize; to touch’. OE tacan, with the earliest meaning of ‘to touch’ but later ‘to take, catch, seize’, is said to be a late borrowing (c. 1100) from ON taka. However, the existence of OE oftacan ‘overtake’ may point to the native currency of tacan. The cognate in Gothic is tekan ‘to touch’. Their etymological origin is Proto-Gmc *takan. When this word came to Asia, the -k of the stem *tak– was changed to -n which eventually became -ng. Thus, *tak– > *tan– > *tang. In Taiwanese *tang is transformed into tong for 當 (cf. MSM dāng).

Summary. We can now read the puzzle in OE/ON as follows, suggesting that the Jié spoke a language of the Northwest Germanic group.

“秀支替戾岡, 僕谷劬禿當” is read “Fylki dragan, Prōcūrātor clūs tūr takan.”

Epilogue. While working on the puzzle, I came to a bottleneck at 禿 which I suspected might be related to “tower”. My intuition was that a tower might serve as a prison. Then I thought of the Tower of London which was indeed a prison. Although not 100% sure, I wrote an email to Professor Mair, declaring that I had solved the puzzle. The next day, while checking up on something else in Carl Buck’s A Dictionary of Selected Synonyms in the Principal Indo-European Languages, I came upon a statement (p. 1452b), “The association between ‘tower’ and ‘prison’ was once widespread (cf. esp. the Tower of London).” Seeing this, I rested my case.

==========

Update (2/11/19):

Someone noticed a post earlier today on China's Weibo 微博 (microblog) by a personality named Dong Lishi 动历史 who translated and introduced the article posted on Language Log that contemplated the Jie 羯 question. The post is quite popular on Weibo 微博 and has gained 100 plus retweets. I thought you might be interested.

From Sanping Chen (2/11/19):

This has always been an intriguing puzzle, with many "solutions" that can be neither proven nor disproved.

But I take issue with the prevailing interpretation of 胡位 as "foreign rank/position" — it sounds like a neomodern understanding. After all, at the time Liu Yao had been enthroned as emperor/khan/kaiser for ten years. To me, the character 位, if not mis-written, can only represent some sort of zodiac sign here. It is also worth noting that grammatically the second half of the prophecy has an object+verb structure, which allows both Iranian and Altaic solutions.

Update (2/18/19):

Here's an interesting research project in Leipzig on the Buddhist wall paintings from Kucha.

Note the clear recognition that Kucha was a Tocharian kingdom.


141 Comments

Dick Margulis said,

January 25, 2019 @ 7:12 pm

Several places in the post, something is missing in the text, presumably glyphs that were copied and pasted but did not survive the conversion to html (perhaps they're in a script that isn't supported by Unicode). Or maybe this is a local phenomenon in my browser, in which case, never mind. Just noting it in case it's fixable in the post. This is all Greek to me. Well, not Greek Greek, but you know what I mean.
Victor Mair said,

January 25, 2019 @ 7:55 pm

@Dick Margulis:

I'm sorry that it's not showing well for you, but everything is visible in my browser (Firefox).
Philip Taylor said,

January 25, 2019 @ 8:08 pm

In Seamonkey 2.49.4 (same stable as Firefox), Para.~2 is defective : "Early in the Sixteen Kingdoms there came into prominence a kingdom called Hòu Zhào 後趙 (Later Zhao [319-351]) founded by a king named (274-333)."

"A king named" followed by no name.
Paul F said,

January 25, 2019 @ 8:10 pm

There does appear to be missing text, both Chrome and Firefox, e.g. 'Early in the Sixteen Kingdoms there came into prominence a kingdom called Hòu Zhào 後趙 (Later Zhao [319-351]) founded by a king named (274-333)'

What was the king's name?
Narmitaj said,

January 25, 2019 @ 8:12 pm

I get the same as Dick Margulis… also Firefox, though on a Mac if that makes a difference.

For instance, "Indeed, the source document which gives this mysterious sentence clearly indicates this is in , a non-Sinitic language that showed up in China", where presumably " , " is not actually the name of the non-Sinitic language.
Victor Mair said,

January 25, 2019 @ 9:01 pm

Thanks, everyone, for reading so carefully. For some reason, the items that dropped out were all names to which I had added hyperlinks. I hope that everything is working properly now.
Chris Button said,

January 25, 2019 @ 11:14 pm

Vovin's suggestion that Jié 羯 (Pulleyblank's EMC kɨat) ultimately means "man, human being" and is associated with the Yeniseian "Ket" people is compelling and supportive of Pulleyblank's original hunch for a Yeniseian association. In his analysis of the text, Vovin seems to struggle with 秀支 for "army", although I wonder if there is perhaps more to his very speculative association of the second syllable 支 with the Yeniseian "Arin" word kel "army" than he is prepared to admit (rather than his suggestion that it is probably a loanword or internal innovation)? The Old Chinese *-l coda was after all long gone by that stage.
AntC said,

January 26, 2019 @ 12:11 am

Thank you Chau Wu for an intriguing piece of sleuthing.

I used Modern Taiwanese to read the sinographs, instead of reconstructed Old Sinitic.

If not using Old Sinitic, is there any strong reason to use a Modern Taiwanese reading (presumably you mean Southern Min/Hokkien) as opposed to (say) Modern Cantonese or Hakka or any other Han topolect? Indeed would correlating sound changes between different topolects give a stronger 'fix'? Which modern topolect would be closest to Shí Lè's? Or at least to that of his scribes?

With only ten characters to go at (and one of them probably an official title), there's a significant risk that vaguely-related sounds/vaguely-related meanings could just be random chance. (There's a statistical paper by Prof Liberman showing that under suitably vague criteria of 'similarity', almost any few vocab items in some language could be found similar to some other language. And I'm aware of the cautionary example of poor Edo Nyland(sp?).)
Victor Mair said,

January 26, 2019 @ 1:03 am

Ket people– history, language, culture (with lots of photos [click to enlarge]):

https://en.wikipedia.org/wiki/Ket_people

Yeniseian languages — including Ket:

https://en.wikipedia.org/wiki/Yeniseian_languages

The Kets live along the Yenisei River, which flows into the Arctic Ocean from Central Siberia.

The Jie people lived in North China.

https://en.wikipedia.org/wiki/Jie_people

Note that early Chinese sources state that the Jie originated among the Yuezhi:

https://en.wikipedia.org/wiki/Yuezhi

Chinese sources describe the Yuezhi as having deepset eyes and "profuse beards and whiskers" — like the Jie. Furthermore, the Yuezhi are often identified with Tókharioi (Greek Τοχάριοι; Sanskrit Tukhāra) and Kushanas (see the discussion of Kucha and the Kuchean language [Tocharian B or Kuśiññe] in the o.p.). Note that Fótúdèng 佛圖澄 (traditional reading Fótúchéng), the Buddhist monk who spoke the Jie phrases being discussed here, was from Kucha.
cameron said,

January 26, 2019 @ 2:23 am

This methodology of reading the ancient characters as if they were modern Taiwanese is veering perilously close to what Leibniz called "goropism".

Wasn't there another guy who claimed that all the Etruscan inscriptions could be easily read if you assumed they represented modern Ukrainian?
B.Ma said,

January 26, 2019 @ 2:45 am

This may be a silly question but I am confused by the "examples showing the change from European f- to Tw s-" – is this saying that those Taiwanese words were derived from Old Norse?
NV said,

January 26, 2019 @ 5:39 am

“Fylki dragan, Prōcūrātor clūs tūr takan.” seems ungrammatical – did OE or ON allow combining three nouns without putting any of them into genitive?
BobW said,

January 26, 2019 @ 7:22 am

My first thought on reading the headline was the captured legion of Crassus, but that was far too early.
AG said,

January 26, 2019 @ 7:42 am

Same question as B.Ma – I assume I'm missing something but if not, seems a bit bold to me. There's no theorizing about … like, Gothic, or Tocharian, or any other intermediary steps in there, just … Iceland and Taiwan, close linguistic siblings?
Victor Mair said,

January 26, 2019 @ 9:14 am

The Min group of Sinitic languages, to which Taiwanese belongs, is generally held to be the most conservative, if not archaic, of all the Sinitic topolects.

Old Sinitic reconstructions are so vexed, contested, and unreliable that some of the most respected historical linguists, such as Jerry Norman, W. South Coblin, and David Prager Branner, no longer subscribe to any of them nor engage in them themselves.

Overall, Tocharian (after Hittite the second oldest Indo-European language), although historically it is the easternmost, typologically more nearly resembles northwest European languages than it does Balto-Slavic or Indo-Iranian. For how this seemingly improbable situation could have come about, see the following works that take into consideration linguistics, archeology, genetics, and other relevant disciplines:

Victor H. Mair, ed., The Bronze Age and Early Iron Age Peoples of Eastern Central Asia (Washington, D.C.: Institute for the Study of Man Inc. in collaboration with the University of Pennsylvania Museum Publications, 1998). 2 vols.

J. P. Mallory and Victor H. Mair, The Tarim Mummies: Ancient China and the Mystery of the Earliest Peoples from the West (London: Thames & Hudson, 2000).

"Early Indo-Europeans in Xinjiang" (11/19/08). http://languagelog.ldc.upenn.edu/nll/?p=843

Victor H. Mair, ed., Contact and Exchange in the Ancient World (Honolulu: University of Hawai'i Press, 2006).

Victor H. Mair, "Language and Script: Biology, Archaeology, and (Pre)History," International Review of Chinese Linguistics, 1.1 (1996), 31a-41b.

J. P. Mallory, "The Problem of Tocharian Origins: An Archaeological Perspective", Sino-Platonic Papers, 259 (Nov. 2015), 63 pages (free pdf) http://sino-platonic.org/complete/spp259_tocharian_origins.pdf

David Anthony, The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World (Princeton: Princeton University Press, 2007).

Pita Kelekna, The Horse in Human History (Cambridge: Cambridge University Press, 2009).

This long, learned, rigorous comment by Axel Schuessler on the Tocharian word for "honey" that was borrowed into Old Sinitic: http://languagelog.ldc.upenn.edu/nll/?p=40516#comment-1556739.

etc.

Plus the long and continuing series of essays about Old Sinitic reconstructions to which this post belongs. See "Readings" at the conclusion of this post: "Galactic glimmers: of milk and Old Sinitic reconstructions" (1/8/19) http://languagelog.ldc.upenn.edu/nll/?p=41346
David Marjanović said,

January 26, 2019 @ 3:56 pm

Phew! Where to start.

Perhaps not even with the OP, but with Prof. Mair's latest comment, because I can do that faster.

The Min group of Sinitic languages, to which Taiwanese belongs, is generally held to be the most conservative, if not archaic, of all the Sinitic topolects.

The Min group indeed lacks innovations that all the rest of the modern Sinitic languages, and Middle Sinitic, have. Therefore, there is a consensus that Min is not descended from Middle Sinitic, unlike all or almost all other Sinitic languages; instead, Middle Sinitic and Proto-Min had a common ancestor, which appears to be Late Han Sinitic. (What people generally mean by "Old Sinitic"/"Old Chinese" when they don't explain further is an even older stage, or several, corresponding to Qin, Zhou and even Shang times.)

This does not, of course, mean that the Min group doesn't have substantial innovations of its own! It does.

Just off the top of my head, all Min languages not only have tones, but some of them have the most complex tone systems of any Sinitic languages – but Old and, on the phonemic level, even Early Middle Sinitic lacked tones altogether. In the process of developing tones (like many other languages around them, Sinitic and otherwise, roughly around the same time), they lost a bunch of consonant distinctions, some of which are preserved in some other Sinitic languages today.

Perhaps most dramatically, Min has changed all initial nasal consonants into voiced plosives! These are not homologous with the voiced plosives preserved in Wu and Old Xiang from Middle Sinitic.

Old Sinitic reconstructions are so vexed, contested, and unreliable that some of the most respected historical linguists, such as Jerry Norman, W. South Coblin, and David Prager Branner, no longer subscribe to any of them nor engage in them themselves.

To the best of my knowledge, this is a gross exaggeration. The reconstructions in use today (basically those made in the 21st century) differ mostly in their means of notation; there are genuine disagreements concerning a number of individual words, but for the most part the latest reconstructions agree on which distinctions existed, which distinctions did not exist, and which words belonged to which side of each distinction. Importantly, the latest reconstructions differ from the older ones (of the 1950s through 70s) in mostly the same ways, and usually agree with each other against the older ones. Unsurprisingly, the remaining genuine disagreements between the reconstructions in current use concern hard problems, like poorly attested words, uncertainties whether words from extant languages (Sinitic or not) are descended from a particular Old Sinitic word, and questions of the exact phonetic nature of certain distinctions (e.g. whether Type A syllables had distinctively long vowels, pharyngealized initial consonants, or distinctively short vowels). It is really no surprise that the people you cite don't consider it worth their time to figure these details out.

But Old Sinitic is hardly relevant to a 4th-century text anyway! The 4th century is much closer to the age of Early Middle Sinitic, whose sound system – thought to represent the literary varieties of the preceding Northern and Southern Dynasties period (5th & 6th c.) – is documented in the 切韻 Qièyùn of 601 and the later rhyme tables. While these sources cannot be "just read", they are clear enough that the disagreements between the current reconstructions are really minimal. I dare say the reconstructions compared in the "Middle Chinese finals" article of Wikipedia are mutually intelligible with next to no further effort.

Overall, Tocharian (after Hittite the second oldest Indo-European language), although historically it is the easternmost, typologically more nearly resembles northwest European languages than it does Balto-Slavic or Indo-Iranian.

In what ways? The only one I can come up with is the kentum merger. The western branches share several innovations with Balto-Slavic and Indo-Iranian that are absent in Tocharian, like the great expansion of the class of thematic verbs or the meaning shift of */jebʰ/- from "enter" (which is its meaning in Tocharian) to a euphemism and then a swearword (which has a sacred use in one Sanskrit ritual). That is what is meant by "second oldest".

For how this seemingly improbable situation could have come about, see the following works that take into consideration linguistics, archeology, genetics, and other relevant disciplines:

Of these, only Mallory's paper is recent enough to cite any of the large-scale genetic studies that have revolutionized our understanding of the early speakers of Indo-European. It cites only one (Allentoft et al. 2015).

(It is a very valuable paper for other reasons!)

Next part: Yeniseian.
David Marjanović said,

January 26, 2019 @ 4:30 pm

From the OP:

Wikipedia (Jie people) lists five eminent scholars in Central Asian studies, […], who interpreted the sentence as Turkic. […] However, Edwin Pulleyblank remarked that the Turkic interpretations could not be considered very successful because they were at variance with the phonetic values of the Chinese text and with the Chinese translation. Instead, he suggested that the sentence might be related to Yeniseian languages. It prompted Alexander Vovin to propose an interpretation based on Southern Yeniseian.

And then suddenly the topic changes. It shouldn't.

Here is the latest paper on this Jié (Middle Sinitic: */kjet/) sentence. It dates from 2016 and is by Vovin, Edward Vajda (a specialist on Yeniseian) and Étienne de la Vaissière. On its 20 pages it replies to an attempt from 2015 to read the sentence as Turkic. In great detail it shows why Turkic cannot be bent so as to fit the sentence, which, incidentally, is quoted with all its context in the original and in two translations. Then it shows, likewise in great detail, why Southeastern Yeniseian (documented as a few 18th-century wordlists from the extinct Pumpokol language) is a really good fit for the pronunciation (in Early Middle Chinese), the gloss and the translation given in the original text.

At the end, it mentions two interesting things.

First, "person" is /keʔt/ in Ket, and recorded as kit in Pumpokol; there's no reason to think the /ʔ/ was absent in Pumpokol, which was only documented by explorers and fur traders from Russia and western Europe. How would [kiʔt], or [kiʔə̆t], sound to an Early Middle Sinitic ear? Probably like [kiet] or something; if so, the closest approximation in EMS itself would be /kjet/, writable as 羯.

Second, Yeniseian river names occur over a very, very large area in Siberia. Those which show a sound change otherwise documented only in Pumpokol occur in northern Mongolia. In other words, the geographic obstacle has gone out the window.

The geographic obstacle to placing an ungrammatical – barely comprehensible – sequence of Old English verbs in the infinitive and nouns in the nominative in reach of China, however, remains.

I'll comment on the sound changes assumed in the OP tomorrow.
Chris Button said,

January 26, 2019 @ 4:39 pm

Regarding 秀支 "army", I wonder if Vovin and others might have more success looking for Yeniseian words meaning "camp, tent, hut" etc.? Such a semantic shift to "army" is common cross linguistically: compare English host (from whence "hostel" etc.) in its archaic sense of "army", or Chinese 呂 originally depicting a series of tents in an encampment (no not a "spinal column"!) and clearly related to the words represented by characters like 旅 and 廬 etc.
Chris Button said,

January 26, 2019 @ 5:19 pm

@ David Marjanović

Prof. Mair is correct in his comments about Old Chinese. Furthermore, a cursory glance at the critical reviews of Baxter & Sagart's recent book shows that the field is actually becoming even more divided. If you are just interested in "distinctions" for which any algebraic signs will suffice, you may as well just cite Karlgren. Speaking of which, his "Compendium of Phonetics in Ancient and Archaic Chinese" still remains the best basic introduction to the field in English in my opinion and I would very much recommend it.

As for Jié 羯 / "Ket" and the sense of "person", I mentioned that in my earlier post. Personally I think Vovin, following Pulleyblank's hunch, is thinking along the right lines, but his argument is hardly definitive.
Victor Mair said,

January 26, 2019 @ 10:39 pm

"Phew!"

What a strangely uncivil way to begin a comment on Language Log.

========

interj.
Used to express relief, fatigue, surprise, or disgust.

American Heritage Dictionary of the English Language

========

More on this tomorrow, after the current series of comments by the person who uttered that odd expression is over.
Jonathan Smith said,

January 26, 2019 @ 11:05 pm

Wu (2016: 67) under his proposed "dr- > h-" has ON draga ‘to launch (a ship)’ > Tw hā 下, but above the author says, with respect to the same ON draga, that given its onset cluster, "dra- underwent epenthesis to become dara-, eventually becoming Tw thè-lē 替戾." Also, on the above account the relevant proto-Germanic form *dragana gives Tw thè–lē-kong 替戾岡, whereas Wu (2016: 186) has Tw kong 岡 from ON borg ‘hill’, "with b- > k-". And pointing all this out is of very dubious value to begin with as by so doing one gives short shrift to countless other confusions; certainly there is no reason for David Marjanović to bother to write about "sound changes assumed in the OP." I am sorry to say that the issues here go not to science but psychology :( That is not an essay I want to write; very briefly, my impression is that the author considers heterodoxy to be a virtue in and of itself — but as a matter of fact, "to live outside the law you must be honest."
Victor Mair said,

January 27, 2019 @ 5:25 am

Compare the question with which AntC begins his comment:

"Thank you Chau Wu for an intriguing piece of sleuthing."

He then goes on to ask a probing series of questions and politely express reservations, which is all well and good, a civil way to proceed, one that offers prospects for a productive discussion.
Victor Mair said,

January 27, 2019 @ 5:41 am

And compare the latest comment by Jonathan Smith, who makes incisive, substantiated remarks without resorting to empty sarcasm and sardonicism.

More to come after the final promised parade of pomposity "on the sound changes assumed in the OP", which, in truth, after Jonathan Smith's comment, are no longer necessary.
Philip Taylor said,

January 27, 2019 @ 6:33 am

Personally speaking (and I do appreciate that views will differ), I would sooner read a response that starts with the interjection "Phew!" than an article that includes the word "ratf***ing" (without the asterisks) in its title. But each to his own, I suppose.
David Marjanović said,

January 27, 2019 @ 6:52 am

interj.
Used to express relief, fatigue, surprise, or disgust.

I didn't know the fourth sense existed. Please be assured that I don't mean it. I meant to express the simple fact that there's so much material in the post, and so much I want to say about it, that I'm going to be fatigued in the end!

Compare the question with which AntC begins his comment:

"Thank you Chau Wu for an intriguing piece of sleuthing."

If I wrote that, it would be dishonest – I find it neither intriguing nor indeed a piece of sleuthing, as I hope to demonstrate later.

And compare the latest comment by Jonathan Smith, who makes incisive, substantiated remarks without resorting to empty sarcasm and sardonicism.

…I'm sorry, I honestly have no idea where I came across as using sarcasm or sardonicism. I mean what I say, and I don't mean what I don't say. Please point out where you feel otherwise, so I can avoid being misunderstood in the future!

More to come after the final promised parade of pomposity "on the sound changes assumed in the OP", which, in truth, after Jonathan Smith's comment, are no longer necessary.

I hope to write on historical linguistics, because – to my surprise – nobody else has yet seen fit to do it, not on pomposity. If I wanted to inflate myself, don't you think I'd choose a different venue?

Jonathan Smith points out that the sound changes/substitutions proposed in Wu (2016) differ from the ones proposed here. There's nothing wrong with changing one's opinion between 2016 and 2019, so I'll take the OP on its own merits.

Prof. Mair is correct in his comments about Old Chinese. Furthermore, a cursory glance at the critical reviews of Baxter & Sagart's recent book shows that the field is actually becoming even more divided.

Even your reconstruction doesn't strike me as so different from the other recent ones. The discussion has of course become more intense; that is normal as fewer and fewer people believe we can never know anything substantial about the subject, the data have begun to get better but still don't add up to a single coherent picture in many details, and the areas of disagreement become narrower and better defined. This has happened in many other fields of science.

On Karlgren's reconstruction and algebraic distinctions, I'd like to point out that his idea that Type B syllables differed from Type A in having a medial [j] appears to have been abandoned completely, for a number of very good reasons. These reasons may not allow us to decide between the remaining options, but their discovery still represents progress in narrowing the field of (known and as yet unknown) options down.
AntC said,

January 27, 2019 @ 8:14 am

I'm as confused as B.Ma and AG (and I apologise if Prof Mair's previous posts or papers have explained these etymologies and sound changes). "examples showing the change from European f- to Tw s-" – is this saying that those Taiwanese words were derived from Old Norse?

The hypothesis is that the Buddhist monk was speaking some IE-origin language, and we're guessing at the sound of it from how it was written using Early Middle Sinitic characters (and their then sound values). There's no reason to think the words (the sound of them) also entered the Sinitic lexicon with those meanings — ? or is there? That's why the sentence makes no sense in any Sinitic topolects: it's just a transcription of sounds.

Then what us non-Sinitic scholars are lacking is the meanings of those ten characters. If we had that, it would be clear they're nothing to do with troops advancing/capturing/imprisoning.

But Chau Wu argues from words that have entered Taiwanese from IE languages(?) Did all those f- corr s- words enter Sinitic languages at the same time/were they subject to the same series of sound pattern changes?

Chau Wu seems to be saying the comparative reference words have some common origin, with consistent sound changes that resulted in N.W. European f- corresponding to Tw s-. And (I surmise) that because Taiwanese and ON are relatively conservative within their respective language families, those words have undergone least change since their common origin. Well, OK languages on the periphery of a continent might be more conservative, but there seems an awful long way from S.E. China to N.W. Europe, and huge opportunity for other intrusions on sound patterns.
Chris Button said,

January 27, 2019 @ 10:09 am

@ David Marjanović

Upon what comprehensive lexicon of Old Chinese word families (comparable to those for Proto Indo European) are you basing such an assertion? Such a resource would certainly help me in the compilation of my dictionary! Surely you're not solely relying on Karlgren's "Word Families in Chinese" (1933) or Todo's 漢字語源辞典 (1965)!? Wang Li's 同源字典 (1982) and Schuessler's "ABC Etymological Dictionary of Old Chinese" (2006) are more recent but only compare a handful of items at a time (and, while valuable contributions, are of course not without their own problems).
Victor Mair said,

January 27, 2019 @ 3:07 pm

This is Language Log, so naturally we have to focus on language, but to solve the puzzle posed in this post, one has to take into account history and culture as well. That means seriously considering what sort of person the monk Fótúdèng 佛圖澄 (traditional reading Fótúchéng; ca. 232-348; Skt. Buddhacinga [?]) was. Remember, he's the one who spoke the Jie words under discussion. Don't forget that he came from Kucha, the medieval homeland of Tocharian B speakers, which is on the northern rim of the Tarim Basin (N.B.) and had studied in Kashmir. His background was similar to that of Kumārajīva (344-413), the greatest translator of Buddhist texts into Chinese, whose mother was a Kuchan (i.e., Tocharian) princess and whose father was from Kashmir.

https://en.wikipedia.org/wiki/Kum%C4%81raj%C4%ABva

And what sort of person was Fotudeng? He was a thaumaturge, yet a Buddhist. You can read a translation of his lengthy official biography in the Jin Shu (History of the Jin, chapter 95 = On the Practitioners of Occult Arts) here:

http://the-scholars.com/viewtopic.php?t=24024

Judging from all the materials we have about Fotudeng, he was unlikely to have been Tungusic, Turkic, or Siberian, but was evidently a Tocharian.

And who were the Jie? What did they look like? When they were massacred by the hundreds of thousands in 350-52, they were easy for their attackers to identify because of their conspicuous high noses and bushy beards (like the Yuezhi [see my comment above*]), which made them stand out from the Han populace. This information comes from the official Jin history, 107. See Otto J. Maenchen-Helfen, The World of the Huns: Studies in Their History and Culture (University of California Press, 1973), p. 372.

* http://languagelog.ldc.upenn.edu/nll/?p=41519#comment-1559523

It's interesting that Pulleyblank, who suggested that the Jie were somehow connected to the Yeniseian Kets, also thought, by a convoluted explanation that I cannot follow, that the Jie were Tocharians. See his "The Consonantal System of Old Chinese," Asia Major (1963), 247-249 (of 205-265). I myself think that, through their links to the Yuezhi, the Jie were connected to the Tocharians (see the cited comment), so it would make sense that Fotudeng could speak their language.

For all of these reasons, I commend Chau Wu for his efforts to make a breakthrough in explaining the Jie sentence through a totally new path that takes into account all of the various types of information available to us — historical, cultural, ethnographic, linguistic, etc. — since all of the previous investigators, dating back nearly 120 years, fail on one or more of these counts.
Leo F said,

January 27, 2019 @ 4:28 pm

Have any Tocharian specialists commented on whether the sounds represented by 秀支替戾岡, 僕谷劬禿當 in Middle Sinitic could plausibly – or even just conceivably – mean something like "when the troops are sent out, they can capture [Liú] Yào" when uttered by someone speaking Tocharian A or B?
David Marjanović said,

January 27, 2019 @ 4:52 pm

Upon what comprehensive lexicon of Old Chinese word families (comparable to those for Proto Indo European) are you basing such an assertion?

Which assertion?

=============

All questions in this comment are honest, none are rhetorical.

=============

Early in the Sixteen Kingdoms there came into prominence a kingdom called Hòu Zhào 後趙 (Later Zhao [319-351]) founded by a king named Shí Lè 石 (274-333).

I apologize for misreporting part of the Vovin/Vajda/de la Vaissière paper I linked to above. On p. 126, it says the word for "person" is recorded in Pumpokol as ket, not kit, which happens to be how the Pumpokol word for "stone" is recorded.

The words were similar ever since Proto-Yeniseian, where */keʔt/ is reconstructed for "person" and */qeʔt/ or */qeʔs/ for "stone". My claim that there's no reason to suppose the */ʔ/ was absent from Pumpokol is correctly reported from the paper.

In footnote 5 on the same page, the paper points out that 石 shí means "stone". Maybe somebody mixed up the ethnonym with the word for "stone", misinterpreted the ethnonym as "stone" and then used that as the king's surname… or, of course, it could just be a coincidence.

a Kuchean* Buddhist monk, Fótúdèng 佛圖澄 (traditional reading Fótúchéng; ca. 232-348; Skt. Buddhacinga [?])

I'm curious: why is he thought to be Kuchean? Does it say so earlier in the Jìn Shū?

The Book of Jìn describes the Jié with such physical characteristics that today we would attribute to Caucasians: deep eye sockets (深目 shēn mù), high noses (高鼻 gāo bí), and bushy beards (多鬚 duō xū), suggesting that this ethnic group might have come from Central Asia or even Europe.

The Wikipedia article "Ket people" has a few photos. Some of the people in the photos show deep eye sockets and/or tall noses and/or sizable, bushy beards, despite not looking European.

Others, however, would not at all look out of place in China or Mongolia.

The sound correspondence between ON fylki / OE fylc(e) and Tw siù-ki 秀支 is analyzed as follows.

(1) The second segment –ki, being the same in both languages, needs no explanation.

I'm not sure about that. The Tw k is unaspirated, but the Norse k is quite noticeably aspirated, enough so that I'd expect a Tw kh.

(None of that changes if we substitute EMC for Tw. The vowel does, though; 支 is reconstructed with -e or -ie in the four reconstructions of EMC or similar stages compared in Chart 1 of Vovin et al., and in Schuessler's reconstruction of Late Hàn Chinese [also in that chart].)

Admittedly, this problem probably disappears if we start from stages of Germanic noticeably earlier than ON, as we must for a 4th-century text. But that ruins the first vowel, as I explain below.

(2) The correspondence of the first segment between fyl– and siù is due to two processes. The first one, that of the medial -l changing to -u, is fairly common across languages […]

I agree that there's no problem here.

The second process is less obvious. This was due to a lack of the /f/ phoneme in Old Sinitic (and Taiwanese), known in Chinese historical phonology as 古無輕唇音, thereby foreign /f/ was frequently turned into /h/.

No objections. I do wonder how the Sogdian /f/ is transcribed in contemporary Chinese sources, however.

This /h/ was not so stable, and often got assibilated to become /s/. Thus, the mechanism of the process can be viewed as a chain of reactions: f- > *h- > *hs- > s-. Because this is a rather unusual sound change, a few examples showing the change from European f- to Tw s- are given below.

Now, [hi] > [çi] > [ɕi] is actually quite usual, and Taiwanese si- is in fact [ɕ] according to Wikipedia. The vowel does not pose an obstacle if we start from Germanic languages of the 6th century or later: [fy] > [çiu] > [ɕu] looks reasonable.

But the sentence dates from the 4th century. That was before umlaut happened in North or in West Germanic. Instead of [fylki], we'd need to start from [fulkija]. And why would [h] become [ɕ] in front of [u]?
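
The conditioning problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-rule set below (f > h, then h > ɕ before a front vowel) is a hypothetical simplification, not the chain proposed in the post, and the forms are plain strings rather than a serious phonological representation.

```python
import re

# Hypothetical, simplified rule set (an assumption for illustration only).
# Each rule is (regex pattern, replacement), applied once, in order.
RULES = [
    (r"f", "h"),          # substitution for a missing /f/
    (r"h(?=[iy])", "ɕ"),  # palatalization, only before a front vowel
]


def apply_rules(form: str) -> str:
    """Apply each rewrite rule in order to a phonetic string."""
    for pattern, replacement in RULES:
        form = re.sub(pattern, replacement, form)
    return form


if __name__ == "__main__":
    # Post-umlaut form vs. the pre-umlaut form a 4th-century source needs.
    for source in ("fylki", "fulkija"):
        print(source, "->", apply_rules(source))
```

With a front vowel after the initial consonant, the palatalization rule has an environment to apply in; with [u], it does not, so the pre-umlaut form keeps its [h] unless some further, unconditioned change is assumed.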

Proto-Gmc *fidwōr ‘four’ > *fid– > Tw sì 四 ‘four’ (the third tone reflects loss of -d)

So many questions.
– Why would a word for "four" be borrowed, when higher numerals were not?
– The 去 tone reflects loss of -h, which is preserved in one topolect today (that of 孝義/孝义 Xiàoyì in Shānxī), is heavily implied in MC-era descriptions of the tones, makes phonetic sense of the falling contour of the tone, and has a parallel in Vietnamese exact enough not to have been doubted since the 1950s when André Haudricourt pointed it out. The only known source, so far, for this -h is earlier -s, which not only has the same parallel, but is the only way to make sense of some transcriptions, e.g. Tsushima (i.e. /tusima/) being spelled 對馬 in a Chinese source of the 3rd century, and from loanwords into Korean (e.g. 篦 > Middle Korean /pis/, 芥 > MK /kas/, 蓋 > MK also /kas/). The most compact source for this is this summary.
– Proto-Germanic /d/ in positions like these wasn't [d], it was [ð]. If that was borrowed into any kind of Sinitic, in syllable-final position in front of [w], I would expect it to just disappear. In fact, it disappeared even on the way to Proto-West-Germanic, where [fiðwoːr] became [fiwːur]! (As inferred from OE feower, Old High German fiur and so on.)
– Why suddenly Proto-Germanic instead of OE or ON? Were intense contacts between Germanic and Sinitic ongoing for centuries (without anybody on the Sinitic side ever writing about them…)?
– What is "four" in Tibetan?

ON fyrnd ‘decay, dilapidation’ > Tw sún 損 as in sún-hāi 損壞 ‘damage, decay’

Why does ON fy- correspond once to siu-, once to su-?

ON fá ‘to grasp with the hands, get hold of’ > Tw sa (same meaning; no sinograph)

Why does fan- give sian, but fá give sa and not sia?

I should mention that the shortness of the ON word is a very late development in the prehistory of ON. It results from the regular loss of the infinitive ending -/n/ and the regular loss of /x/ between vowels. In the contemporary Germanic languages, and in Gothic (several centuries earlier), the word was /faːxan/ and not just /faː/.

(2) 替戾岡 (Tw thè-lē-kong), PT for ‘to go out’

The corresponding word is OE dragan (intransitive verb) ‘to draw oneself, go’ or ON draga (intransitive verb) ‘to move, draw’. The first two segments 替戾 thè–lē– came about because OE dragan / ON draga has an initial cluster of dr-. In the Old Chinese period, initial clusters existed, but by the time of the Sixteen Kingdoms, clusters may have disappeared. Therefore, the Germanic dra– underwent epenthesis to become *dara-, eventually becoming Tw thè–lē– for 替戾. The same process is also seen in the loan of E ‘glass’ to Japanese as garasu (ガラス) and in the Mycenaean Greek ‘tripod’ (Classical Gk τρίπους) written in the syllabic Linear B as tiripo.

Epenthesis is indeed a likely option for how a borrowed word-initial cluster would have been represented in Middle Sinitic. Linear B ti-ri-po is a bad example, though, because there it's just the writing system, not the language, that didn't allow consonant clusters.

But where does the aspiration come from? And why -e- and not -a-?

From the reconstructed inflection pattern, the most likely form to account for 替戾岡 is the indicative 3rd person plural form, *dragandi. Thus, Gmc *dragandi > *dragand– > *daragang > 替戾岡 (Tw thè–lē-kong).

Why would -[nd] end up as [ŋ] and not simply as [n], though? (I presume that would give Tw koⁿ?)

I should mention that the Yeniseian interpretation does predict [ŋ]: |t-il-ek-aŋ| ("out"-past tense-"go"-3rd person animate plural subject) "(they) went out". The vowels are much better fits, too.

(3) 僕谷 (Tw po̍k-kok), PT for the title of an official rank in the Hu government

Since it is clearly stated in the Book of Jìn that the monk Fótúdèng used the title to refer to Liú Yào 劉曜, who was a Xiongnu, not a Jié, obviously this title need not be sought in Germanic. However, I surmise that it may be related to L Prōcūrātor ‘one who is in charge, an administrator’, appropriate for a king.

Prōcūrātor is rather an insult for a king. It was the title of fiscal officers, some of whom were eventually appointed governors of certain small provinces.

The derivation involves gemination of the medial c, so that Prōcūrātor > Prōcūr– > *Prōc–cūr–

Why would that happen? After a long vowel no less?

> po̍k–kok 僕谷 with a sound change of -ur > -ok.

That is an extremely strange sound change. I'd like to see more examples of it.

But before that, I have two more questions: Why was the -ātor part just dropped, even though -rā- is the stressed syllable of the word? And why was the /r/ not dropped along with the /aː/, but assigned to the previous syllable – which, moreover, already had a long vowel?

Finally, I note that po̍k and kok have two different tones. This lines up with the fact that 僕 began with a voiced [b] in Middle Sinitic, as it still does in Japanese (boku). Why would a Latin [p] be borrowed (whether through Germanic or not) as [b] in Sinitic?

(4) 劬 (Tw kû, khu), PT for ‘prison’
[…]
(5) 禿 (Tw thut), PT for ‘tower’, here for ‘prison tower’
[…]
(6) 當 (Tw tong), PT for ‘take, seize’

…So these three syllables put together would mean "prison tower take"? That does not agree with the translation given in the text itself, which has no prison and no tower, just a verb (得): as you say, "when the troops are sent out, they can capture [Liú] Yào".

It would be hard to make sense of, too: how exactly would riding steppe warriors keep a prison, let alone a tower?

The Yeniseian interpretation has "take", possibly "take on foot"; close enough to "seize", "catch". Specifically, all of 劬禿當 is accounted for as |got-o-kt-aŋ| ("foot"?-3rd person masculine singular object-"take"-3rd person animate plural subject) "they take him (on foot?)" – "they catch him" or "they will catch him".

The cognate in Gothic is tekan ‘to touch’. Their etymological origin is Proto-Gmc *takan.

Not that it matters, but that's not possible; it can only be PGmc *tēkaną. The ON and OE a was long, as was every Gothic e; Northwest Germanic /aː/ and East Germanic /eː/ come from Proto-Germanic /eː/ (and Proto-Indo-European /eː/ and /eh/).

Summary. We can now read the puzzle in OE/ON as follows, suggesting that the Jié spoke a language of the Northwest Germanic group.

“秀支替戾岡, 僕谷劬禿當” is read “Fylki dragan, Prōcūrātor clūs tūr takan.”

But that is not a sentence. It's just a string of words. "Army, to draw, fiscal official, enclosure, tower, to take."

OK, let's replace every -an by -and to get the 3rd person plural in the present tense. But this plural disagrees with the singular army.

If clūs tūr is supposed to be a compound noun meaning "prison" (or "dungeon"), I would expect it to have a connecting vowel, therefore a whole extra syllable. But I don't know what the stem of clūs really is, so maybe not. But where is the "into"? In "they take the procurator into the tower", some kind of "into" is needed. Between 谷 and 劬 I can find no hint of /in/.

By the way, tūr with its long vowel is definitely a French loan (tour) in Old English.

I'm not sure if any of these nouns would have a distinct accusative form (prōcūrātor does, prōcūrātorem with a whole extra syllable – in Latin, perhaps not if borrowed into Germanic). But if so, all nouns except "army" should be in the accusative.

Summary. I prefer the attempt to explain the sentence as Yeniseian. While that leaves "army" and the title unexplained, it fits the phonetic transcriptions, the glosses and the translation very well, causes no geographic problems (Pumpokol-like place names occur in northern Mongolia), and probably causes no physiognomic problems (see photos on Wikipedia). The attempt presented here to explain the sentence as a string of North and West Germanic nouns and present-tense verbs plus a Latin noun definitely causes no physiognomic problems, but creates a huge geographic problem and a few anachronisms, fits the glosses and the translation quite poorly, and fits the phonetic transcription only if very odd sound substitutions and truncations are assumed, some of which, moreover, operate only some of the time, even though no conditioning factors are proposed or suggest themselves.
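
For readers who want the two Yeniseian segmentations side by side, here is a minimal sketch that stores them as data and prints them as interlinear glosses. The morphemes and glosses are copied from the comment above; the dictionary layout and the function names are hypothetical conveniences, not anything from Vovin et al.

```python
# Morpheme segmentations and glosses as given in the comment above.
# The data structure and helper names are illustrative assumptions.
ANALYSES = {
    "替戾岡": [
        ("t", "out"),
        ("il", "past tense"),
        ("ek", "go"),
        ("aŋ", "3rd person animate plural subject"),
    ],
    "劬禿當": [
        ("got", "foot?"),
        ("o", "3rd person masculine singular object"),
        ("kt", "take"),
        ("aŋ", "3rd person animate plural subject"),
    ],
}


def underlying_form(morphemes):
    """Join the morphemes into the |a-b-c| notation used in the comment."""
    return "|" + "-".join(m for m, _ in morphemes) + "|"


def gloss_line(morphemes):
    """Join the glosses in the same order as the morphemes."""
    return " - ".join(g for _, g in morphemes)


if __name__ == "__main__":
    for characters, morphemes in ANALYSES.items():
        print(characters, underlying_form(morphemes))
        print("   ", gloss_line(morphemes))
```

Laid out this way, it is easy to see that both verb forms end in the same subject suffix, which is what yields the final [ŋ] in each half of the couplet.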
David Marjanović said,

January 27, 2019 @ 5:03 pm

I'm sorry! I forgot to refresh the page before submitting my comment!

And what sort of person was Fotudeng? He was a thaumaturge, yet a Buddhist. You can read a translation of his lengthy official biography in the Jin Shu (History of the Jin, chapter 95 = On the Practitioners of Occult Arts) here:

http://the-scholars.com/viewtopic.php?t=24024

Judging from all the materials we have about Fotudeng, he was unlikely to have been Tungusic, Turkic, or Siberian, but was evidently a Tocharian.

…but the first sentence of this translation says he "was from India"?

I myself think that, through their links to the Yuezhi, the Jie were connected to the Tocharians (see the cited comment), so it would make sense that Fotudeng could speak their language.

For all of these reasons, I commend Chau Wu for his efforts to make a breakthrough in explaining the Jie sentence through a totally new path […]

I don't understand: there's nothing Tocharian about Northwest Germanic (or Latin)?

I would actually like to know if a Tocharian interpretation of the Jié sentence is possible. But it seems unlikely to me, because the sentence contains two words ending in [ŋ], a sound that occurred in the attested Tocharian languages exclusively before [k], never at the ends of words. The occurrence of [b] and [g], absent from attested Tocharian (though who knows when they were lost), would also need to be explained.
Chris Button said,

January 27, 2019 @ 5:08 pm

Some later Pulleyblank thoughts from 1999 may be found here (see in particular p.49-51):

https://www.podgorski.com/main/assets/documents/Pulleyblank_1999.pdf
Chris Button said,

January 27, 2019 @ 5:20 pm

@ David Marjanović

You imply that there is significant agreement on how to reconstruct OC. If so, why is there no significant consensus on what words are even related to one another? People can neither agree on the onsets nor the rhymes, not to mention the prefixes which are just getting way out of hand now! Where is this supposed "concordance" (pun intended) to which you seem to have access? Of course, the situation will all change once my dictionary is finished :)
Chris Button said,

January 27, 2019 @ 5:32 pm

Maybe somebody mixed up the ethnonym with the word for "stone",…

"Stone" is unrelated. You should read Vovin a little more closely. Pulleyblank couldn't have known.
Chris Button said,

January 27, 2019 @ 6:14 pm

Actually I should say "most likely unrelated" but apparently still theoretically possible. Needless to say, Vovin's "person, man" etymology seems to work phonologically without any theoretical requirements for fortition in certain Yeniseian reflexes and is far more typologically likely in terms of the origin of ethnonyms.
Chris Button said,

January 27, 2019 @ 6:36 pm

Sorry I mean fortitions in Xiongnu, not Yeniseian! I need to stop posting now
liuyao said,

January 27, 2019 @ 6:42 pm

Minor points: 石勒’s second character is still missing; I’d say 澄 still has the reading chéng as the primary reading, especially as part of personal name.
Victor Mair said,

January 27, 2019 @ 8:10 pm

Fotudeng's lay surname was Bó 帛, which was the surname of the Tocharian royalty at Kucha.

Fotudeng 3,410 ghits.

Fotucheng 762 ghits
AntC said,

January 27, 2019 @ 9:45 pm

Judging from all the materials we have about Fotudeng, he … was evidently a Tocharian.

Thank you Victor. Then the obvious next step would be to correlate the guessed-at sound of those 10 characters with what we know about the sound pattern of (proto-?)Tocharian (B?).

Is there any evidence Tocharian sounded like early NW Germanic with bits of Latin thrown in? wp says "Proto-Tocharian shows radical changes in its vowels from Proto-Indo-European (PIE)."

How did (ancestors of) Tocharian speakers, and their language, get from some common place of origin (where?) to both Kucha and to NW Europe? And do so relatively quickly, and without their language getting corrupted in exchange with other languages along the way. We must, as you say, take history/culture/archaeology evidence into account. Is there any?

Specifically, it's easy to see how Latin could influence NW Germanic. How did Latin travel eastward to Kucha? (And check etymonline's entry for tower: "possibly from a pre-Indo-European Mediterranean language. " )
Victor Mair said,

January 27, 2019 @ 10:54 pm

@AntC

Thank you for making intelligent comments and asking smart questions. There are people out there who can respond positively and productively.
Victor Mair said,

January 28, 2019 @ 12:10 am

"goropism"

That's hardly what Chau Wu had in mind.
David Marjanović said,

January 28, 2019 @ 4:27 am

"Stone" is unrelated.

"Stone" and "person" are unrelated within Yeniseian. But, following the hint in Vovin et al., I speculated that someone not fluent in the Jié language confused these similar words, thought the ethnonym meant "stone", and then proceeded to use that as Lè's surname. Again, just speculation.

You imply that there is significant agreement on how to reconstruct OC. If so, why is there no significant consensus on what words are even related to one another?

We're probably looking at this in "glass half full/half empty" ways. I therefore retract my claim as largely meaningless and await both your dictionary and Guillaume Jacques' eventually forthcoming reconstruction. :-)

Fotudeng's lay surname was Bó 帛, which was the surname of the Tocharian royalty at Kucha.

Thank you! That's certainly evidence.
AntC said,

January 28, 2019 @ 6:36 am

Sino-Platonic Papers 7 1988 also abstract 143 (July 2004) has a section 'Proximity of Chinese to Germanic' pp32 ff http://sino-platonic.org/complete/spp007_old_chinese.pdf

"Among Indo-European dialects, Germanic languages seems to have been mostly akin to Old Chinese …"

Perhaps for Sinologists this is old hat, but I'm goggling.

Prof Mair: is this still considered applicable research, or is it superseded?

David M: the paper includes substantial vocab lists of IE cognates with Old Chinese. And these are basic/domestic items, mostly single-syllable in Germanic. The claim seems to be that polysyllabic inflection arrived late in (Southern) Indo-European, absorbed from Middle Eastern/Altaic contact, whereas Germanic was more conservative.

So the evidence is more voluminous than a ten-character transcription. Nevertheless, I see no Tocharian connection, let alone a chain of transmission.
Chris Button said,

January 28, 2019 @ 8:47 am

Regarding "stone", I find Pulleyblank's suggestion regarding 康 in 康居 and a supposed Tocharian A *kāŋk- "stone" interesting. I would also love for a Tocharian specialist to clarify how the /ŋk/ ~ /ŋ/ variation works in Tocharian if /ŋ/ is indeed a separate phoneme (i.e. if /ŋk/ can be distinguished from /nk/).
Victor Mair said,

January 28, 2019 @ 9:51 am

@AntC

Once again, AntC, thank you for your valuable input. You are constructive, not cynical, yet without losing your critical judgement. You have demonstrated that repeatedly, just in the comments to this post, but also in your comments to many other posts, for which I am grateful.

Because of longstanding prejudices against long-distance transmission within the field, Chang Tsung-tung's 1988 article has never been given a serious, fair hearing. Nonetheless, it has had an impact, since it was inspirational, if not foundational, for the work of Arne Østmoe (A Germanic-Tai Linguistic Puzzle, SPP 64 [Jan., 1995], 81, 6 pages) and Chau Wu's current investigations.

Once more, we have to take into account realities such as the following:

=====

The existence of Bronze Age socketed (and sometimes looped) celts (axes) and other specific types of bronze weapons all the way across the steppes.

The Tocharian words for "honey" and "lion" in Sinitic already by the Han Dynasty (and there are many other borrowings of IE words in Old Sinitic).

The existence of certain types of beads from Europe to East Asia, including its southern periphery, during the 1st millennium BC. Within the next year or two, I will publish in SPP a large monograph on these beads, together with numerous photographs and scientific analyses, that will prove the existence of long distance transmission already by the first half of the 1st millennium BC and before.

See Andrew Sherratt's remarkable posthumous "The Trans-Eurasian Exchange: The Prehistory of Chinese Relations with the West," in Victor H. Mair, ed., Contact and Exchange in the Ancient World (Honolulu: University of Hawaii Press, 2006), pp. 30-61, which documents the existence of a corridor for traffic and trade from the early Bronze Age, starting already in the third millennium if not before.

Cf. Barry Cunliffe, By Steppe, Desert, and Ocean: The Birth of Eurasia (Oxford: Oxford University Press, 2015).

The Tarim Basin mummies, dating back to the early 2nd millennium BC, with all of their technology and culture, which plugged a huge gap in the archeological record of west-east cultural transmission. See the books by J. P. Mallory & Victor H. Mair and by Elizabeth Wayland Barber.

=====

Slowly, but ever so surely, a new picture of the development of Eurasian civilization during the Bronze Age and Early Iron Age is emerging. Language is inevitably an important part of that picture.
Victor Mair said,

January 28, 2019 @ 9:56 am

The centum-satem split is not meaningless, and no amount of scholastic obfuscation can negate its fundamental significance.
Chris Button said,

January 28, 2019 @ 2:07 pm

Regarding "stone", I find Pulleyblank's suggestion regarding 康 in 康居 and a supposed Tocharian A *kāŋk- "stone" interesting. I would also love for a Tocharian specialist to clarify how the /ŋk/ ~ /ŋ/ variation works in Tocharian if /ŋ/ is indeed a separate phoneme (i.e. if /ŋk/ can be distinguished from /nk/).

Actually if we can go with Tocharian *kāŋka- then presumably Pulleyblank is bringing in the second syllable 居 too.
Victor Mair said,

January 28, 2019 @ 3:41 pm

From Chau Wu:

@ Victor Mair. “Fotudeng's lay surname was Bó 帛, which was the surname of the Tocharian royalty at Kucha.”

Regarding the surname of the Tocharian royalty at Kucha, it is interesting to note that there has been discussion about what Bó actually meant. Here I quote from a book entitled, Kutscha und seine Beziehungen zu China von 2. Jh. bis zum 6 Jh. n. Chr., by Liu Mau-Tsai (劉茂才), in Asiatische Forschungen, Band 27. Otto Harrassowitz, Wiesbaden (Germany), 1969. (For clarity I have inserted Sinographs in square brackets.)

“Der Familienname des Königshauses von Kutscha wurde erstmalig im Jahre 78 n.Chr. mit Po wiedergegeben….

“Was bedeutet nun der Name Po? Geschrieben wurde er mit Po = weiß [白] oder Po = Seide [帛]. Die Überlieferung gibt keine Auskunft über die Herkunft dieses Namens. Ich vermute nun, daß dieser Familienname die Bedeutung „weiß“ hat, denn schon Bailey (Anm. 255) vermutet, daß der Staatsname Kutsi „weiß“ bedeute. Dazu liegt in Kutscha der berühmte „Weiße Berg“ Po-schan [白山]; und eben dieser Name Po-schan [白山] war schon im 3 Jh. auch der Name eines Königs von Kutscha (Anm. 409)!”

The take-home message: the author Liu Mau-Tsai (劉茂才, a Chinese who studied in Germany) is of the opinion that the surname Bó of the Tocharian royalty at Kucha meant ‘white’ 白.
Victor Mair said,

January 28, 2019 @ 3:45 pm

From Douglas Adams:

Here are some thoughts on the question: are the Jié speakers of Tocharian B or a closely related Tocharian language?

(1) Fotudeng was by all accounts a Kuchean (and member of [a cadet branch?] of the royal family). He would have been a native speaker of Tocharian B, but probably also competent in Chinese and Sanskrit and possibly in other languages, including that of the Jie.

(2) His prediction, all are agreed, was not in Chinese. It has been wondered if it might be Tocharian B “spelled” with Chinese characters.

(3) In testing that hypothesis, I take the linguistically most secure part of the tradition, the Chinese translation, and ask myself how would I (taking the role of a Tocharian B speaker) translate the Chinese into Tocharian B.

(4) The TchB word for ‘army’ is retke. The Middle Chinese [si̯u-ci̯e] looks like a transcription, if the transcribed language was TchB, of śuke or *śuce. The first means ‘taste, sap, liquid, juice’; the second is not attested.

(5) ‘To go out’ would be some form of länt-. In the protasis of a (“real”) conditional sentence we would expect a subjunctive. In very archaic Tocharian B, which the (known) dating would demand, we should have something like *lä(n)tn/ñäṃ (<ṃ> = final /n/, realized for some speakers as nasalization of the preceding vowel?). Of the Middle Chinese [tʰei-let/lei-kɑŋ] the first character [tʰei] might represent the TchB pronoun te which might have had an archaic meaning of ‘then.’ The [let/lei-kɑŋ] may have the requisite l and t and the word ends in a nasal, but what would the k be?

(6) [bok/buk-kuk/yok] is, on anyone’s account, not Tocharian B.

(7) ‘Captured’ might be eṅk- ‘take’ (no word meaning specifically ‘capture’ is known). Apart from the final nasal, which would be the third person singular marker, [ɡi̯u̯o-tʰuk-tɑŋ] shows no resemblance.

(8) Aside from the -ng ending both halves of the couplet, I don’t see anything that really even suggests Tocharian B. The caveat here is that we certainly are not informed about much of the TchB vocabulary; there might be a special word for ‘capture’ for instance, though [ɡi̯u̯o-tʰuk-tɑŋ] does not have the phonological structure that matches anything we know about Tocharian verbs (except the -ng of course). Certain we are (as Yoda might have put it), however, that the word for ‘army’ was retke. [si̯u-ci̯e] is not retke.

(9) In sum, with perhaps the exception of the -ng’s, I see no positive evidence of Tocharian B (though, on other grounds I wish I did).

On another, but related, topic, if Jie is a Yenisean language (as some propose), why are the Jié people, murdered in the hundreds of thousands, described physically as so unlike speakers of any other Ket language? Language and race of course are not the same, but the stark physical differences between the very populous Jié and all other known Yeniseans deserve a mention/explanation I have not seen in the discussion chain.
Michael Watts said,

January 28, 2019 @ 4:55 pm

Why do Ethiopians speak a Semitic language despite being radically physically unlike typical Semitic speakers?
Eidolon said,

January 28, 2019 @ 6:40 pm

The question would be easier to answer if we knew the language of the Xiongnu, since it is well-established that the Jie referred to in Chinese records were subjects of the Xiongnu, regardless of their original ethno-linguistic identity. Shi Le, the founder and ruler of the Jie state in 4th century China, was initially a general serving the Xiongnu rulers of Han Zhao, and the Jie tribe to which he belonged lived in then Bingzhou, which was likewise the seat of Xiongnu power in the 3rd century, with a supposed population of 400,000 Xiongnu tribal immigrants.

Given these facts, it would be relatively straightforward to posit that the Jie – or at least those commanded by Shi Le – could have been speaking a common Xiongnu language, which Vovin might have had in mind since he is a supporter of the theory that the Xiongnu spoke Yeniseian. Their physical characteristics would then simply be a testament to the diversity of the Xiongnu empire.
Victor Mair said,

January 28, 2019 @ 7:00 pm

The Xiongnu were a confederation. It is likely that they had speakers of different languages and ethnicities under their command and control. The late Elling Eide worked for decades on the question of the language, culture, and people of the Xiongnu, and he had assembled a considerable amount of evidence that there were Iranian elements within the confederation. So far as I know, he never had a chance to publish any of this material, but some of his notes on the subject may be in the Elling Eide Center in Sarasota.

http://www.ellingoeide.org/
Victor Mair said,

January 28, 2019 @ 7:05 pm

Aside from the superficial similarity of their name, there's little reason to connect the Yeniseian Ket with the Jié 羯 who were slaughtered by the hundreds of thousands (identified by their big noses and bushy beards) in North China in the mid-4th century AD. They are in the wrong place at the wrong time, and they have the wrong appearance.

Jié 羯

Middle Chinese: /kɨɐt̚/

Old Chinese:

(Zhengzhang): /*kad/

I'm working on another hypothesis for who the Jié 羯 might have been and will report back after I flesh it out a bit. (Following Doug Adams' analysis, it seems that they were not Tocharian B speakers: http://languagelog.ldc.upenn.edu/nll/?p=41519#comment-1559613 )
David Marjanović said,

January 28, 2019 @ 7:09 pm

Thanks for the additional reading; I'll try to get to it as soon as possible.

I would also love for a Tocharian specialist to clarify how the /ŋk/ ~ /ŋ/ variation works in Tocharian if /ŋ/ is indeed a separate phoneme (i.e. if /ŋk/ can be distinguished from /nk/).

Unlike Douglas Adams, I'm certainly no Tocharian specialist. However, Tocharian was written in a script straight from India, which had a dedicated letter for [ŋ] (transcribed ṅ following Indological tradition, as in the mentioned eṅk- ‘take’). Wikipedia says this ṅ exclusively occurred in front of k and nowhere else in both languages. Spellings with nk do not seem to occur. In other words, [ŋ] existed but was an allophone.

That said, I have to retract my claim that words whose last syllables are transcribed with Chinese characters whose pronunciations end in [ŋ] therefore cannot be Tocharian: as Prof. Adams implies, this might merely indicate nasal vowels.

the stark physical differences between the very populous Jié and all other known Yeniseans deserve a mention/explanation I have not seen in the discussion chain

I presented one: all three of the mentioned features occur in some of the Ket people in the photos at the end of this article.

Alternatively or additionally, of course, there might have been any number of people of Iranian and/or Tocharian descent among the Jié people, as suggested immediately above the present comment. I certainly don't expect riding steppe warriors to be inbred!
David Marjanović said,

January 28, 2019 @ 7:23 pm

Oh, I forgot:

The centum-satem split is not meaningless, and no amount of scholastic obfuscation can negate its fundamental significance.

The centum-satem distinction has not been considered fundamental in the historical linguistics of IE in about fifty years. Rather, the centum languages have undergone one sound change, the depalatalization of palatalized velars (causing a phonemic merger with the existing plain velars), which must have happened at least four times independently rather than just once: in Hittite (not the rest of Anatolian), in Tocharian, in Greek and in "West IE" (Germanic + Italo-Celtic, if indeed there is such a grouping that excludes Albanian, which lacks the merger).

The satem languages have undergone two sound changes: the palatalized velars became coronal affricates (and then often fricatives), which is a purely phonetic change that did not (on its own) change the number of phonemes, and the unrounding of the labialized velars (causing a phonemic merger with the existing plain velars). This has happened in Balto-Slavic and Indo-Iranian – possibly a single time for both, if they are each other's closest relatives. The first change has also happened in Luwian, Albanian and Armenian; but the second has not happened in Luwian, and only in some environments in Albanian and Armenian.

They are in the wrong place at the wrong time

What do you think of the Pumpokol-type river names (i.e. Yeniseian with a sound shift that is documented only in Pumpokol) in northern Mongolia?
David Marjanović said,

January 28, 2019 @ 7:30 pm

Clarifications:
– even the existence of Italo-Celtic is controversial, so the centum shift may have happened up to six times independently;
– by calling the shift of the palatalized velars to affricates/fricatives "purely phonetic", I do not mean to imply that it is therefore potentially less diagnostic than a phonemic merger.
Zeppelin said,

January 28, 2019 @ 7:50 pm

Not an issue with the historical linguistics of this guest post, but it's about words, at least:
The idea of a "Caucasian race" is 18th-century pseudoscience, and it baffles me that the term has stuck around this long in English. As far as I can tell it's just used as a more "objective"-sounding synonym for "white" these days?
Not only does it perpetuate pseudoscience and erase actual Caucasians (as in, people from the Caucasus, many of whom don't qualify as "white") — it's also just confusing when you're talking about language and ethnicity, since there is a Caucasian language family/sprachbund.
Victor Mair said,

January 28, 2019 @ 8:10 pm

Look at the pictures of the Ket people again.

http://languagelog.ldc.upenn.edu/nll/?p=41519#comment-1559523

Citing:

https://en.wikipedia.org/wiki/Ket_people

I don't think these folks would be singled out for slaughter in the hundreds of thousands because of their big, high noses and bushy facial hair.
Victor Mair said,

January 28, 2019 @ 8:12 pm

Can anyone suggest an innocuous term for people with big, high noses, deepset eyes, a profusion of facial hair, thin lips, etc.?
AG said,

January 28, 2019 @ 11:03 pm

"Neanderthal-adjacent"?

kidding, of course! sort of!
Chris Button said,

January 29, 2019 @ 12:03 am

@ Victor Mair

Regarding the possible Kettish connection, I would really appreciate hearing your thoughts on Pulleyblank's comments below (copied and pasted from his 1999 article cited above):

Though I did not explore the possible historical implications of a linguistic connection between the Xiongnu and the Yeniseians known from more recent times, my assumption was that the Xiongnu must have been a southern extension of these northern forest dwellers who occupied the intervening space between the Yenisei and the Chinese frontier and between the Indo-Europeans to the west and the proto-Altaic speakers to the east in the pre-Qin period. If, as I now suspect, the Xiongnu were in fact natives of the Ordos who borrowed their horse-riding culture from farther north at a time when the outer steppe was divided between Indo-Europeans to the west and Proto-Mongols to the east, this explanation of a Kettish connection would be ruled out. The other possibility is that the Yeniseians represent a fragment of the Xiongnu who moved north after the fall of the Xiongnu empire in the second century. At the 34th International Congress of Asian and North African Studies in Hong Kong, August 1993, Professor Valery Jajlenko, of the Institute of Universal History, Moscow, read a paper entitled, ‘The Kets in Ancient Central Asia’, in which on the basis of hydronyms in Western Turkistan he suggested that this was the original home of the Kettish peoples before they were displaced and forced to move north by the coming of the Iranians. An equally plausible explanation for the existence of such river names would be that they are a legacy of the occupation of that region by the Hephthalites or White Huns, who, as we have seen, were a coalition of Huns (Xiongnu) and Avars (Wuhuan). This would strengthen the possibility of a connection between the language of the Xiongnu and Ket.
AntC said,

January 29, 2019 @ 2:06 am

@Zeppelin Not an issue with the historical linguistics of this guest post, but it's about words, …

I feel your pain. I'm struggling to keep up with all these groupings of peoples, and their multiple names, and their multiple pronunciations (in different languages) of what might be the same name. Not least of the problems is that the Bronze-age/Iron-age/early CE centuries peoples living in these areas of modern-day Xinjiang are not the people living there today.

So for example the wp article on the Tarim basin mummies describes them as 'Caucasoid', but the link takes you immediately to an article 'Caucasian race', which also uses 'Europid'. And yes, none of this denotes the peoples living in the Caucasus mountains today.

So from the "deepset eyes, high/large noses, bushy beards" descriptions: these are not ethnic Han nor ethnic Mongol nor Tibetans. (I've probably just committed another bunch of howlers.) OTOH neither do they sound much like modern-day speakers of Germanic languages. Except that Chinese historical records talk of red or blond hair. These are Vikings?!? Before they got far enough West to become seafarers?!?

To add further confusion: David M has been referencing Vajda's analysis of our 10 characters, suggesting the language is Proto-Yeniseian. I see Vajda also promulgates a Dené–Yeniseian hypothesis, which would require Yeniseian ancestors passing through NE Asia early enough to cross the Bering Straits land bridge; and yet also settling stably in Northern Siberia with traces up to the present day, and with linguistic links to Caucasian-family languages (not necessarily languages spoken today in the Caucasus).

I absolutely love all this 'big theory' stuff. At the same time I can't see it as different in kind from foundation myths: it's all too long ago/too far away for there ever to be definitive proof or disproof. To say "goropism" is unkind and un-called-for. But I do find it at least an extraordinary coincidence that the language family which provides the world's most widely-spoken language (and which in traditional philological studies has been regarded as something of a backwater in the top-left corner of Eurasia) should also have provided substantial(?) vocab to the pre-eminent East Asian language group — is the claim.

I hope Prof Mair does get to publish his evidence for some sort of Bronze-Age silk road trading East-West across the northern littoral of Eurasia. Trans-Caucasus? It's still a long stretch to explain transmission of IE, specifically Germanic.

Disclaimer: I'm feeling embarrassed by Prof Mair's kind words. I have absolutely no expertise in any of this stuff. All I'm doing is applying critical thinking and 'common sense'.
ohwilleke said,

January 29, 2019 @ 2:36 am

In terms of material culture, the Tocharians were much more similar to the Italic and the Celtic peoples (particularly the Celtic) than they were to the Germanic. For example, the Tocharians had rudimentary tartans woven in an early Celtic style and some wore hats that are dead ringers for the stereotypical Halloween witches' hats which enter the European tradition via Italy. So, if I were looking for sound correspondences between an IE language and a presumed Tocharian one about which we might not have sufficient data to be clear on pronunciation, my first instinct would be to look to Celtic parallels or proto-Italic parallels rather than to Old Norse or Old English.

If I recall correctly, Tocharian men also tended to have Y-DNA haplogroup R1b, which tends to be associated with historically Celtic regions of Europe, rather than Y-DNA haplogroups R1a and I2, which tend to be associated with historically Germanic regions of Europe, which again tends to support looking for Celtic corollaries with Tocharian rather than Germanic ones.

So, Germanic meaning matches really have to be particularly strong to be convincing since other aspects of context would strongly disfavor this connection. Indeed, at the time that there might have been a Tocharian-Germanic sharing of culture (i.e. ca. 2000-1500 BCE), it is quite probable that Old Norse or Proto-Germanic didn't even exist yet even at its hypothetical source in the vicinity of Denmark.

Of course, better still would be to compare Tocharian correspondences to the extent possible, since it is attested in writing, and since the written Tocharian texts that we do have date from fairly close to the 4th century CE, because the Tocharians were literate only in the last few centuries of their civilization, rather than the distant past ca. the 20th century BCE when any connections between Europe and the Tocharians would have dated from.

I am also not nearly so confident as VM that the physical description of these people so clearly prefers the Tocharians over the Ket, particularly by the 4th century CE after two millennia of even slight admixture with neighboring East Asian populations. While the early Tocharians were pretty much straight up West Eurasians (as we know from their mummies), by the 4th century CE, they would almost certainly all have had a significant minority of East Asian admixture, which would tend to make them and the Ket more similar to each other in appearance.

This isn't to say that Yeniseian is a particularly attractive choice either. The Ket people are called Paleo-Siberian for a reason. While there were still some populations around in the 19th century and there are still a few thousand alive today, they were already very much on the losing side of history by the 4th century in favor of populations like the Uralic and Altaic peoples that pretty much overran them, and they were predominantly throughout this time to the present hunter-gatherers or marginal sheep/goat herders who were rather insular and didn't absorb much outsider culture, not a conquering steppe cavalry people or an urbanized people on the cosmopolitan Silk Road from the Tocharian tradition or the later Indo-Iranians. The geography and timing are much better for the Ket than they are for Germanic speakers, but they still aren't a good fit for the part of an impressive army involved in international intrigue and war involving foreign potentates.

On the other hand, the Ket would have been neighbors of the Tocharians early on when the Ket were more prosperous ca. 2000 BCE, and the Tocharians were in a formative stage so the possibility that Tocharians had borrowed some Ket words that aren't attested in surviving Tocharian works (many of which are Buddhist religious texts that might not have addressed some of the relevant lexical content) isn't preposterous either.
David Marjanović said,

January 29, 2019 @ 6:17 am

ohwilleke, are you aware that the website your name links to is not a blog on blogspot.com, despite its URL, but an ad for cameras in Malay or Indonesian? Just asking because I'm confused.

I don't think these folks would be singled out for slaughter in the hundreds of thousands because of their big, high noses and bushy facial hair.

Were they singled out, out of a larger population? Or were whole armies or settlements mowed down, and Chinese observers just found the presence of these traits among the Jié noteworthy?

Something like the modern gradient across northern and central Asia, where people very gradually look more and more European the farther west you get and more and more East Asian the farther east you get, seems to have been in place since the Bronze Age. The whole gradient seems to have been farther east at that time (because, to oversimplify, the Tocharian migration had happened but the Turkic expansion had not), but a gradient of alleles seen in the northern half of Europe and in eastern Asia around the same time has been published in a series of papers on the genetics of people buried by the Afanasievo culture, its neighbors and their successors. – And, as ohwilleke points out, the 4th century was long after the Bronze Age.

To add further confusion: David M has been referencing Vajda's analysis of our 10 characters, suggesting the language is Proto-Yeniseian.

The analysis is at least by Vovin and Vajda (I don't know what de la Vaissière's contributions to that paper were), and the language is not supposed to be the considerably older Proto-Yeniseian, but something on the same branch as Pumpokol (i.e. sharing the latter's distinctive sound change(s)).

To say "goropism" is unkind and un-called-for.

More importantly, it is also inaccurate. Goropius Becanus claimed that his Flemish dialect was the ancestral language of mankind, unchanged since Adam & Eve. Chau Wu does not claim Taiwanese is anything like that; in this very post, he says Tw -ong comes from earlier -ang, while Becanus admitted nothing of the sort.

What they have in common are methods. Becanus, of course, went by random similarities between words from different languages. Wu tries to identify regular sound changes – though they often come out irregular in practice and account only for half of a word.

The Ket people are called Paleo-Siberian for a reason. While there were still some populations around in the 19th century and there are still a few thousand alive today, they were already very much on the losing side of history by the 4th century in favor of populations like the Uralic and Altaic peoples that pretty much overran them, and they were predominantly throughout this time to the present hunter-gatherers or marginal sheep/goat herders who were rather insular and didn't absorb much outsider culture, not a conquering steppe cavalry people or an urbanized people on the cosmopolitan Silk Road from the Tocharian tradition or the later Indo-Iranians.

In the 19th and 18th centuries, yes. But in the 4th? We don't know that. We have next to no clue, most notably, what the Xiongnu spoke when they lived around Mongolia, or what the Huns spoke when they arrived in central Europe (some of their personal names make sense in Iranian, others in Turkic, others don't). There's no evidence against some of the people belonging to these no doubt heterogeneous alliances speaking Yeniseian…

…and here's a paper by Vovin arguing – very carefully and in great detail as usual – that the titles [qɑn], [qɑtʊn], [qɑʁɑn] and [tɑrqɑn]*, which are all over the attested Turkic and Mongolic languages, and in Chinese sources on the Ruanruan, Xianbei and, as it turns out, Xiongnu, but have no etymology in either Turkic or Mongolic, make sense in Yeniseian. Maybe, then, the speakers of one little branch of Yeniseian learned to ride and produced a short-lived steppe empire – much like how the Jurchen/Manchu are the speakers of one little branch of Tungusic that learned to ride and produced a short-lived steppe empire –, while the titles of their rulers lived on for much longer.

* I'm deliberately avoiding a phonemic analysis here.
Victor Mair said,

January 29, 2019 @ 6:44 am

Ran Min's armies were specifically ordered to track down and kill all people who had those physical characteristics.

Not only is this recorded in official history (Jin shu), it is corroborated in a vivid way from a contemporaneous account of Jie seeking to escape in this earliest collection of Guanyin (Avalokiteśvara) miracle tales: Guānshìyīn yìngyàn jì sānzhǒng 觀世音應驗記三種.
shubert said,

January 29, 2019 @ 8:15 am

"For all of these reasons, I commend Chau Wu for his efforts to make a breakthrough in explaining the Jie sentence through a totally new path that takes into account all of the various types of information available to us — historical, cultural, ethnographic, linguistic, etc." I agree with this remark.
ohwilleke said,

January 29, 2019 @ 10:23 am

"ohwilleke, are you aware that the website your name links to is not a blog on blogspot.com, despite its URL, but an ad for cameras in Malay or Indonesian? Just asking because I'm confused."

Must have been a data entry error on my part. My blog is http://dispatchesatturtleisland.blogspot.com
ohwilleke said,

January 29, 2019 @ 10:24 am

Damn cyber squatter. actually http://dispatchesfromturtleisland.blogspot.com/
Andreas Johansson said,

January 29, 2019 @ 11:18 am

AntC wrote:
These are Vikings?!? Before they got far enough West to become seafarers?!?

Runic inscriptions attest the presence of North Germanic languages in Scandinavia well before the 4th century.
Chris Button said,

January 29, 2019 @ 11:42 am

In his analysis of the text, Vovin seems to struggle with 秀支 for "army", although I wonder if there is perhaps more to his very speculative association of the second syllable 支 with the Yeniseian "Arin" word kel "army" than he is prepared to admit (rather than his suggestion that it is probably a loanword or internal innovation)? The Old Chinese *-l coda was after all long gone by that stage…

… Regarding 秀支 "army", I wonder if Vovin and others might have more success looking for Yeniseian words meaning "camp, tent, hut" etc.? Such a semantic shift to "army" is common cross-linguistically: compare English host (from whence "hostel" etc.) in its archaic sense of "army", or Chinese 呂 originally depicting a series of tents in an encampment (no not a "spinal column"!) and clearly related to the words represented by characters like 旅 and 廬 etc.

From Prof. Vovin in response:

Your proposal is very good, unfortunately it is difficult to substantiate it with data. "Birch bark tent" is PY *quɁs, one might want to see a metathesis either in xiong-nu siuke or in the PY form, but such explanations are imho inherently dangerous. It is difficult to reconstruct PY 'camp', the attested forms are akel, agel, ajel. We still have the problem with the first syllable here. Incidentally, I trust you misunderstood me on Arin -l in kel: you are certainly right that in the 4th c. AD there was no coda *-l in Chinese, but the rendering of Arin -l as Chinese -0 or -j is not impossible. Again, the word is isolated in Arin, and this is no good. But ultimately, words are words, they come and go. Morphology of the couplet is unmistakably Yeniseian. Btw, there was a significant progress in its understanding since 2000, you might want to see Vajda, la Vaissiere and my joint article in Journal Asiatique 304.1 (2016) on that matter — it is downloadable from Academia.edu.

I followed up with the following questions:

– I wonder if Arin "kel" (army) could then be related to "akel / agel / ajel" (camp)?

– If so, is it possible that the 秀 part is representing a similar kind of initial component to the "a-" in the latter forms that distinguishes the sense of "army" from "camp"?

– Are there any words or prefixes that might be able to account for that?

From Prof. Vovin (responding in order):

– Ket has 1ps āb 'my', but it invites more questions than answers. There is 1ps agreement prefix a- in Arin, but 1ps singular poorly fits in the given context (it is not Shi Le, but Futo Cheng who pronounces the said couplet)

– Not that I am aware of.

– For a-, yes (see above), for su-, I do not know.
Philip Taylor said,

January 29, 2019 @ 1:36 pm

Intrigued by what had previously been concealed by ohwilleke's cyber-squatter, I visited the new and as-yet-unsquatted URL and was delighted to find the following : "This is a republication of a post at the Physics Forums with some minor edits". Now, in British English, we have a long-standing joke that what to a scientist is clearly "un-ionisation" is to a shop-floor worker "union-isation" and ohwilleke's "republication" has exactly the same possibilities. I think that he means "re-publication" but my first impression was most definitely that he was speaking of "republic-ation".
David Marjanović said,

January 29, 2019 @ 5:29 pm

Thanks, ohwilleke! I'll drop by on occasion. :-)

Ran Min's armies were specifically ordered to track down and kill all people who had those physical characteristics.

Thank you. Is the text specific enough to tell if it was people who had all of these characteristics, or was any one enough to doom them?

Btw, there was a significant progress in its understanding since 2000, you might want to see Vajda, la Vaissiere and my joint article in Journal Asiatique 304.1 (2016) on that matter — it is downloadable from Academia.edu.

Yep, that's the one I've linked to.

X支 army–arm–>limb 肢
related as well.

If you mean that the English words army and arm are related – English has two arm words, and they aren't related to each other. They're similar only by coincidence. Army is related to the one that means "weapon", but not to the one that means "body part".

Army is straight from French armée, which literally means "armed one [feminine]", "arm" as in "weapon", not as in "octopus".

Arm meaning "weapon" is straight from French arme meaning the same, and that is inherited from Latin arma (a plural without a singular, referring to weapons and generally the equipment of a soldier).

Arm for a body part is inherited in English, not borrowed. It's all over the Germanic languages. (…other than those that have lost it. The southern German dialects, like mine, have lost the word and the concept and just say "hand". But I digress.) The French for that is bras.
Victor Mair said,

January 29, 2019 @ 5:33 pm

The texts make it sound as though you had to have at least two of those characteristics.
Chris Button said,

January 29, 2019 @ 10:14 pm

X支 army–arm–>limb 肢
related as well.

I'm trying to figure out if this is serious or tongue-in-cheek, but it is brilliant nonetheless! From a tongue-in-cheek perspective, I assume you mean that since 支 and 肢 represent related words in Chinese via notions of "branch", "limb" etc. then logically we can relate the word for "army" too via its ultimate etymological association with "arm" (as in a "branch/limb") when written as 肢, except of course for the fact that it is but a transcription so doesn't really mean that at all in Chinese! Is pun the right word for this? From a more serious perspective, while "arm" and "army" do ultimately come from the same Proto-Indo-European root, it is via a very roundabout route unlike the typologically common notion of a "gathering (place)" being used to represent an "army (encampment)".
Victor Mair said,

January 30, 2019 @ 1:08 am

From Chau Wu:

@ NV: “Fylki dragan, Prōcūrātor clūs tūr takan.” seems ungrammatical – did OE or ON allow combining three nouns without putting any of them into genitive?

No. Let me try to explain the grammatical aspect of my interpretation.

In the first clause “Fylki dragan,” the subject is fylki (plural) ‘battalions, hosts (in battle)’, and the verb is Gmc *dragandi ‘(they) go’, which is the present indicative 3rd person plural form of Gmc *dragana ‘to draw, go’. The subject fylki, a neuter noun, is the same for both singular and plural nominative. Translated into English, it reads “Armies go (out).”

The ON word fylki has two meanings: ‘district, county, shire’ and ‘battalion, host (in battle)’. I reported only the second meaning for interpretation. Actually, the two meanings are, so to speak, two sides of the same coin. This aspect of dual meaning is no small matter; in fact, it depicts a peculiarity in the ancient Germanic society. Interestingly, the duality also shows up in Sinitic. I will discuss this intriguing aspect in my next comment. Meanwhile, I would love to see if the duality also shows up in Yeniseian.

In the first clause, the verb (OE dragan / ON draga) is in the second position, which is characteristic of Germanic syntax. Modern German adheres to this rule very strictly.

In contrast, the second clause has the verb “takan” at the end, not in the second position. Why? Does it suggest that the second clause is a dependent clause so that, like in Modern German, the verb is placed at the end? If so, where is the conjunction such as “and” or “so that” or something like it?

To understand why the conjunction is missing, we need to visualize that this is a divination poem. In the Book of Jìn, the 10-character poem is preceded by five characters, 相輪鈴音云 xiānglún líng yīn yún ‘so says the bell sound of the prayer wheel’. The scenario goes like this: the monk Fotucheng, after receiving Shí Lè’s request, spins his prayer wheel and listens to the ringing sounds of the bell attached to the end of the wheel chain. From the bell sounds he discerns the subtle meaning and makes his divination. The prognostication is in the form of a poem in a couplet of five characters each; and it rhymes probably in *-an(d) [now read as 岡 gāng (Tw kong) and 當 dāng (Tw tong)]. Because it is a poem, it sets restrictions in length (5 characters) and rhythmic cadence (a topic too large to go into here; let’s say it is somewhat similar to iambic). Therefore, the conjunction is omitted. In addition, the expected preposition before the words for ‘prison-tower’ (something like 入 ‘into’) is also missing. May we call it “poetic license”?

Now with this understanding, we can reconstruct the poem in English with words filled in for clarity: “Armies go out, (and) *Púgŭ (into) prison-tower (you)-take.”

Note that grammatically *Púgŭ is in the accusative case. But since Sinitic is non-inflectional, we would not expect to find any change in form.
Victor Mair said,

January 30, 2019 @ 1:10 am

From Chau Wu:

On February 19 of this year we will have the first full moon after the Lunar New Year. This day is celebrated in China and Taiwan as (MSM) Yuánxiāojié 元宵節 or (Tw) Siōng-goân-chiat 上元節, both meaning “the festival of the first full moon”. To celebrate this auspicious day, people eat glutinous rice balls in clear soup, display colorful, elaborate lanterns (hence, the name Lantern Festival in English) – some have human figures with automatic and realistic movements (the equivalent of Christmas decorations), and hold competitions of riddle solving in temples. The riddles are written on paper strips and hung on a beam, where they attract large crowds. When the riddles are solved, they are taken down, and rewards are handed out to those who have solved them, to the applause of the people gathered there. When I was a kid, my parents would take me to the temple for the festivities and to watch the intellectual competitions. (As an aside, in the year I was born, my birthday happened to fall on the festival day.) All in all it was a lot of fun.

The riddle we are discussing here has existed for almost 1,700 years. I have offered my Germanic interpretation as an alternative to others that have been proffered by eminent scholars in Turkic or Yeniseian. If I am wrong, that is fine. The riddle can wait another hundred years. Let’s continue working on it but do it in the spirit of fun.
AntC said,

January 30, 2019 @ 2:11 am

Thank you Chau Wu, fun it is!

@David M The analysis is at least by Vovin and Vajda …, and the language is not supposed to be the considerably older Proto-Yeniseian, but something on the same branch as Pumpokol (i.e. sharing the latter's distinctive sound change(s)).

Then I apologise and withdraw the "Proto-". You say the Pumpokol is attested from C18th wordlists. Then are you asking me to believe Pumpokol's sound pattern didn't change at all since C4th?

More importantly, it ["goropism"] is also inaccurate. … Chau Wu does not claim Taiwanese is anything like that;

I make no presumption as to the mother tongue(s) of Chau Wu nor indeed of Chang Tsung-tung. (They both seem fully at home in academic English and in German, in addition to whatever Chinese topolect(s).) I would have thought it pretty obvious from the rest of my paragraph that this 'language of the Garden of Eden' that I was being careful to avoid attributing, was early Germanic, specifically after the operation of Grimm's/Verner's laws.

So the fun I was having was: who would expect the grubby language we're all speaking here to be descended from that of the (Sinitic) Garden of Eden?
Yeli Renrong said,

January 30, 2019 @ 3:20 am

The western branches share several innovations with Balto-Slavic and Indo-Iranian that are absent in Tocharian, like the great expansion of the class of thematic verbs or the meaning shift of */jebʰ/- from "enter" (which is its meaning in Tocharian) to a euphemism and then a swearword (which has a sacred use in one Sanskrit ritual).

Is the meaning of *yebh- such strong evidence?

It's possible for words to retain ordinary and euphemistic or profane meanings side by side, and even for the latter type of meaning to later lose its profane connotations or drop off entirely. For the first case, see English blow and (increasingly disambiguated by spelling) come, and various slang terms of the *yebh- type such as bang, smash, tap, and plow; for the second, see Upper Sorbian najebać 'despite' and English prick, suck, stones, etc.

It's also possible for dialects of the same language to undergo semantic divergence in certain words while remaining dialects of the same language. And in a highly esoterogenic process such as the development of euphemisms or slang terms, we should expect this to some degree. I've gotten a vague idea from somewhere that this is common in Latin American Spanish, but I don't know the details.

The semantic shift shown in *yebh- is of course one-way, but there must have been a time in the development of PIE during which *yebh- meant both 'enter' and… 'bang'. And there must have also been a time in the development of PIE during which it was a group of mutually intelligible but distinct dialects, across which innovations could have jumped.

(And then you have, or so I'm told, French foutre '(mis)place'! Maybe one can copulate into a building after all…)

The first of these cases seems the most likely, especially given Cheung's proposed identification (referenced in Malzahn's The second one to branch off? The Tocharian lexicon revisited) of a verbal reflex of *yebh- 'enter' in Iranian, "*yam(b/p) 'to move, wander, rove, crawl?'", as well as the various Greek nominals with proposed derivations from this root. PIE could well have had polysemy in the meaning of *yebh-, which was resolved differently in the various branches – with the euphemistic meaning taking hold in Greek, Slavic, and Indic, the entirely innocent one in Tocharian, and both in Iranian.
Eidolon said,

January 30, 2019 @ 4:21 am

In the spirit of fun, the easiest way to test Arne Østmoe's hypothesis of the "Germanic-Tai Puzzle" would be to do a genetics study on the Tai population. Since his theory relies on a special and direct influence from Germanic to Tai-Kradai – which he attributes to Germanic immigration into the proto-Tai-Kradai people – while rejecting the more obvious Indic and indirect routes, it is actually falsifiable, since we'd then expect to find remnants of reasonably dated Germanic DNA in the Tai-Kradai population, and can also reliably identify the former, courtesy of ancient European DNA studies. While genetics are not language, in this case I believe they are sufficient, since there is no way for Germanic to have directly passed such a substantial vocabulary set to proto-Tai-Kradai without any population contact.

The Sinitic case is, unfortunately, much more difficult, since Chang Tsung-tung's claim is monumentally broader and can encompass both direct and indirect IE influences. I suppose we will have to wait, as with the Jie puzzle, for better methods and more holistic evidence, since Chang's linguistic claims seem to have been generally disregarded by the present community.
AntC said,

January 30, 2019 @ 7:07 am

You [David M] say the Pumpokol is attested from C18th wordlists. Then are you asking me to believe Pumpokol's sound pattern didn't change at all since C4th?

I am struggling through Vovin, Vajda, Vaissiere. I am completely failing to find any explanation for how they reconstructed C4th Pumpokol. (I'm finding their English more than a little non-idiomatic. They're describing Pumpokol in simple declarative present, without all the 'if's and 'maybe's and '*-' prefixes usual in talking about dead and poorly-attested languages. A little voice inside my head is saying 'these guys are just making the evidence fit their argument, as much as they criticise the Turkic guys for doing.') Example drawn more or less at random:

"Pumpokol innovated its new subject agreement suffix position by extending the usage sphere of predicate agreement suffixes inherited from Proto-Yeniseian."

And yet the evidence is "the only recorded Pumpokol finite verb paradigm" — which is C18th. Then where are they getting the diachronic "innovated" claim from? And when did this innovation happen? More to the point, how do they know when? Or are they conveniently fitting in the innovation to coincide with C4th?

I do find this on wikipedia:
"Though very different from the Indo-European languages, Pumpokol (and to the lesser extent Imbat dialect of Ket language) had some remarkable similarities with proto-Germanic and certain other non-Yeniseian languages, " (there's a small list of examples).
goropism alert: the references from that list are to a German (?published in Russian) and to a C19th Prussian.

Then for the words giving the translation of our 10 characters, how many of the (alleged) Pumpokol words are 'remarkably similar' to similar-meaning Germanic words?

Then Chau Wu could be on to something; and that could be not inconsistent with V,V&V.

And we've just wrapped the riddle inside another mystery: how are Germanic-sounding words appearing in Pumpokol? (Or perhaps I mean: how are Pumpokol-sounding words appearing in Germanic?)
Victor Mair said,

January 30, 2019 @ 9:46 pm

From an anonymous Tocharian specialist:

I agree with everything Doug Adams has said. I also find it really unlikely that Yeniseians should have been described as "standard western barbarians", because, as noted, they simply don't look like that. But the same problem holds if they were Turks. Are these ever described in this way, i.e. the Tujue, for instance? I also agree with Doug Adams that Jie cannot have anything to do with Kucha. If Fotudeng is from India, he was probably simply good at languages. Probably the only people with impressive beards were Iranians, of which there must have been plenty.

VHM: Within a day or two, I will post my further thoughts, data, and hypotheses on this conundrum that complement Chau Wu's o.p.
Chris Button said,

January 30, 2019 @ 11:52 pm

@ Chau Wu & David Marjanović

This was due to a lack of the /f/ phoneme in Old Sinitic (and Taiwanese)…

I do wonder how the Sogdian /f/ is transcribed in contemporary Chinese sources, however.

Sogdian "fu-" looks to have been treated as a bilabial plosive:

http://www.iranicaonline.org/articles/personal-names-sogdian-1-in-chinese-sources

Thus, the mechanism of the process can be viewed as a chain of reactions: f- > *h- > *hs- > s-. Because this is a rather unusual sound change…

Instead of [fylki], we'd need to start from [fulkija]

While it isn’t /f/ as such, Sanskrit “v” /ʋ/ in “vai-" or "va(r)-” in words like vaiśālī and vairocana or varuṇa and vaśavartin tends to be attested in Chinese transcriptions with characters like 墮(隳) EMC xjwiă or 和 EMC ɣwa respectively. To get confusion with a sibilant you would indeed need “vi-” with the expected high front vowel which (outside of it generally being represented with 維 EMC wi) may be represented with something like 術 EMC ʑwit as in vidyā which in turn is also used for “-ṣi(t)-” in tuṣita.
David Marjanović said,

January 31, 2019 @ 7:22 am

The texts make it sound as though you had to have at least two of those characteristics.

Good. Probably, then, the Jié army had rather heterogeneous ancestry, I'm guessing involving "Iranians, of which there must have been plenty" – or indeed Tocharians. Most likely they also spoke more than one language. But the one the attested sentence is in makes perfect sense as Yeniseian and not as any kind of IE.

To understand why the conjunction is missing, we need to visualize that this is a divination poem.

I have no problem with the absence of a conjunction, nor with the word order. Verbs were routinely put in the last place in all kinds of clauses in all the older Germanic languages; verb-second order may originally have been used for some kind of emphasis, and the sorting of the three word orders (verb-first, -second, -last) for three clause types (question, independent, dependent) can be watched during Old High German for example.

What I have a problem with is the absence of a preposition – as you point out: "something like 入 ‘into’". Even with all the poetic license in the world, "tower take" in any kind of Germanic could only mean "they (will) take the tower", not "they will take the *bok-kok into the tower".

Is the meaning of *yebh- such strong evidence?

Hi! Nice to meet you in an accessible venue! :-) No, it's not terribly strong evidence. But it is evidence. For each piece of weak evidence we discount, we need to create one more extra hypothesis – I'm making a parsimony argument.

(And then you have, or so I'm told, French foutre '(mis)place'! Maybe one can copulate into a building after all…)

That one seems to be: je l'ai foutu "I fucked it up" (destroyed it) > "I lost it" (I can't use it anymore, it's as good as destroyed) > "I put it somewhere" (and perhaps forgot where).

a verbal reflex of *yebh- 'enter' in Iranian, "*yam(b/p) 'to move, wander, rove, crawl?'", as well as the various Greek nominals with proposed derivations from this root.

The Iranian one is a nasal-infix derivative, so it might have been formed before Tocharian split off, and then its meaning diverged from the base verb. What are the Greek nominals other than the, shall we say, fertility-associated west wind (zéphyros, zeph- from *yebh-)?

I am struggling through Vovin, Vajda, Vaissiere. I am completely failing to find any explanation for how they reconstructed C4th Pumpokol.

They didn't quite do that. (And of course it's strictly speaking impossible, because there's little evidence to suggest absolute dates for the changes that must have happened between Proto-Yeniseian and attested Pumpokol.) They (or Vovin alone back in 2000) apparently first noticed that the sentence made more sense in modern Ket than in Turkic, then found that the discrepancies from Ket fit a Pumpokol-specific sound change (not to mention the absence of Ket-specific or Ket+Yugh-specific ones). Later in this paper, they compare the verb paradigms of several Yeniseian languages and the one incomplete verb paradigm we have from Pumpokol to show that everything seems to fit together.

I'm finding their English more than a little non-idiomatic.

That's Vovin for you. The slightly random use of articles in particular is caused by Russian.

A little voice inside my head is saying 'these guys are just making the evidence fit their argument, as much as they criticise the Turkic guys for doing.' Example drawn more or less at random:

"Pumpokol innovated its new subject agreement suffix position by extending the usage sphere of predicate agreement suffixes inherited from Proto-Yeniseian."

And yet the evidence is "the only recorded Pumpokol finite verb paradigm" — which is C18th. Then where are they getting the diachronic "innovated" claim from?

By comparing it to the reconstruction of Proto-Yeniseian, to which Vajda has contributed. Sure, the declarative claim describes a hypothesis, but it is the most parsimonious hypothesis available.

Also, note the parallels to other Yeniseian languages that are briefly alluded to in the paper.

And when did this innovation happen? More to the point, how do they know when? Or are they conveniently fitting in the innovation to coincide with C4th?

Well, it must have happened at some point; if it happened before the C4th, the sentence makes sense. They don't know when, and they don't need to know when to propose the hypothesis.

I do find this on wikipedia:
"Though very different from the Indo-European languages, Pumpokol (and to the lesser extent Imbat dialect of Ket language) had some remarkable similarities with proto-Germanic and certain other non-Yeniseian languages, " (there's a small list of examples).
goropism alert: the references from that list are to a German (?published in Russian) and to a C19th Prussian.

…Oh. That is Wikipedia at its worst: somebody put Original Research of the worst pseudoscientific kind on Wikipedia, and it's still there because nobody has seen it. Note the complete lack of comments on the talk page.

I doubt there's goropism going on, though: the references are simply to the places where the Pumpokol words are documented.

Right at the beginning of the list, you'll find ab "father" and am "mother". These are papa and mama, no more characteristic of Germanic than of Sinitic or almost anything else.

Then there's a comparison of another very short word meaning "forest" with oak. A weird semantic shift either way, and the -sy part of "oak, tree" remains completely unaccounted for. That's not etymology, that's the Scunthorpe problem.

Then, chasy is compared specifically to Old English, the only Germanic language with the [tʃ] required for the comparison – even Old Frisian has [k].

Even within IE, ivy has nothing to do with figs.

The -gh in high is real: German hoch. It's not there in Pumpokol.

Weird sound changes would need to be assumed for "height" and "horse".

The "love" word is so long it looks like a pretty complex verb, in which the root would never be at the very beginning.

"Measure" has a similar root all over IE, but the specific similarity with OHG is purely graphic: z was just the laminal /s/ as opposed to the apical one, and it was never voiced.

And we've just wrapped the riddle inside another mystery: how are Germanic-sounding words appearing in Pumpokol?

Apart from am and ab, which are nearly global, they're just chance coincidences. There are more impressive ones out there: start here.

(Or perhaps I mean: how are Pumpokol-sounding words appearing in Germanic?)

Well, the origin of house (isolated in Northwest Germanic, IIRC) is unclear, and one of the very tentative suggestions is that Yeniseian-speaking Huns brought it in (PY */quʔs/, IIRC)…
Victor Mair said,

January 31, 2019 @ 8:43 am

From Chau Wu:

People and army in Germanic society

I would like to bring up a subject that I briefly alluded to in the last comment, that is, the duality of meanings of a group of words. To refresh your memory, ON fylki has two meanings: (1) ‘district, county, shire’ and (2) ‘battalion, host (in battle)’. This is the word that I propose was transcribed as 秀支 in the puzzle.

This ON word belongs to a group of words that show similar dual meanings. The best example of this group is the familiar English word “folk.” All Germanic languages (Gothic data is lacking) have this word: OE folc = OS folc = OHG folc = OFris. folk, all meaning ‘people, race; men’, whereas ON fólk means ‘people; army, detachment’ (their etymon is Gmc. *folkam). The Oxford Dictionary of English Etymology (p. 367) has this to say about Gmc. *folkam, “the original meaning of which is perhaps best preserved in ON.” That is to say, the original *folkam carries two meanings: (1) ‘people’ and (2) ‘army’, and the ON fólk best preserves both meanings, whereas its cognates in other Germanic languages have lost the second meaning ‘army’.

So, now we return to ON fylki, which is a derivative of fólk. The people grouped in a unit could reasonably be assumed to become a political district, so that the meaning evolved to carry the first meaning of ‘district, county, shire’ for fylki. In Viking Age Norway, the land was divided into fylki, each of them ruled by a fylkir. As regards the second meaning, there is a related ON word, fylking, whose meaning of ‘battle array; host, legion’ echoes the connotation ‘battalion, host’ of fylki. A compound word fylkingar-armr means ‘wing of an army’.

The second group is represented by OHG heri with cognates in all branches in Gmc; it also carries the dual meanings: (1) ‘army’ and (2) ‘a people in the ethnic sense or as the body of those responsible for political and legal decision making.’ The double function of heri is a reflection of Germanic society as already described by the first century historian Tacitus. In his book, Germania, section 7, he writes, “It is a particular incitement to valor that their squadrons and wedges are not formed at random or by chance mustering but are composed of families and kinship groups. They have their nearest and dearest close by, as well, so that they can hear the shrieks of their women and the crying of their children.” (Translated by A.R. Birley, Oxford World’s Classics).

It should be noted that this duality of meanings is also found in Greek phŷlon, which connotes (1) ‘a race of people, tribe’ on the one hand, and (2) often in the plural, ‘hosts, swarms of men’ on the other hand. A related word phýlopis meaning ‘the battle cry, din of battle, battle’ dates back to Homer.

For this interesting aspect of Germanic society, please refer to a chapter specifically devoted to “People and Army” by D.H. Green in his book, Language and History in the Early Germanic World (Cambridge University Press, 1998), Chapter 5, pp. 84 – 101.

The significance of this aspect of Germanic society is that there are Sinitic words that echo this duality. For example, 部 (Tw pō· / MSM bù) is used in two situations: (1) 部落 (Tw pō·-lo̍k) means ‘tribe’ and (2) 部隊 (Tw pō·-tūi) means ‘army troops’. I surmise that 部 (Tw pō·) is related to Germanic folk through the sound change of f- > p- (reminder: Old Sinitic lacked the |f|* sound, hence it is substituted by p- here).

[*VHM: I'm using this makeshift notation since the comments section of WordPress does not support the opening and closing arrowhead brackets in Chau Wu's note.]

The third group: ON fyrðar (m. pl.) means ‘men; warrior’ and its OE cognate fyrd (also spelt fyrð) means ‘an army, troops’. Corresponding to fyrðar / fyrd is a pair of Sinitic words: (1) 師 (Tw su), which besides the usual meaning of ‘teacher’ also means ‘military troops’ (quotable from Zuŏzhuàn 左傳 ‘Tradition of Zuŏ’); and (2) 庶 (Tw sù) as is used in the compounds 庶民 (Tw sù-bín), 庶人 (Tw sù-lín), 黎庶 (Tw lê-sù), all three connoting ‘common people’. Derivation of 師 / 庶 from ON fyrðar / OE fyrd is via f- > *h- > s- as is demonstrated in the o.p.

I would like to know whether this kind of correspondence in dual meanings exists in Yeniseian. If it does, it would be wonderful because then we can connect the dots: NW Germanic – Yeniseian – 羯 – Sinitic, with the assumption that there are Germanic loans to Yeniseian.
Victor Mair said,

January 31, 2019 @ 10:09 am

For comparison with Chau Wu's Taiwanese transcription of the Jie puzzle, I offer this note by Diana Shuheng Zhang, who is adept at Sinitic reconstructions from different periods and places:

=====
秀sù 支cie 替thè 戾lùs 岡kang, 僕bok 谷kok 劬kuo 禿thok 當tang

Is this what the Buddhist ritual bell sounded like in The History of Jin 晉書, by which Fotudeng 佛圖澄, claiming that it is in the Kjet 羯 language, admonished Shi Le 石勒 to catch Liu Yao 劉曜 the Bokkok? I believe this (above) is probably what it would have sounded like in their time, the early-to-mid-4th century, in the North where they were active.

=====

Note that here Diana is using easily typable reconstructions, though she can also employ a blizzard of difficult diacriticals when called upon to do so.

At this point in the debate, I wish to make these observations:

1. The main thing going for the Ket (Yeniseian) hypothesis is the superficial similarity of their name to the Middle Sinitic reconstruction of Jié 羯. Otherwise, as has been remarked above, they were the wrong people at the wrong time in the wrong place with the wrong face.

2. Fotudeng was almost certainly a Tocharian.

3. Despite obdurate denials by one hyper-loquacious participant in this debate, Tocharian is more closely related to northwest IE languages than it is to Indo-Iranian or Balto-Slavic or, for that matter, Turkic or some imagined 4th-century Yeniseian language.

4. Considering the bulk of the evidence presented in the o.p. and the comments, I would like to propose that it would be more plausible to compare the Jie with Celts and Goths than with the Kets.

5. The ovicaprid semantic classifier (radical) of 羯 (lit., "castrated buck ovicaprid; wether") indicates a close relationship with nomadism or agro-pastoralism involving herding of sheep and / or goats.

6. This was precisely the time of the Völkerwanderung (300 AD-400 AD).

7. It would not be surprising if the Tocharians had some sort of association with the Celts and / or the Goths.

8. I have been in the caves at Kizil and other sites around Kucha and seen the magnificent portraits of the sword-bearing Tocharian dignitaries who — from their weaponry, blonde and red hair, fair skin, facial features, etc. — appear like nothing so much as European knights. However, in keeping with their location and intercultural connections, some of them sport a tilaka (Sanskrit तिलक) and their garb has Sasanian ornamentation, as I've pointed out in several documentaries and various publications.
Chris Button said,

January 31, 2019 @ 4:38 pm

支cie

So with the palatal stop /c/ here, are we leaning more toward the earlier velar /kʲ/ or later alveolar /tʲ/ side of the equation? The Later Han Sanskrit transcriptions seem to support both, but they are at least over a century earlier.
David Marjanović said,

January 31, 2019 @ 6:17 pm

1. The main thing going for the Ket (Yeniseian) hypothesis is the superficial similarity of their name to the Middle Sinitic reconstruction of Jié 羯. Otherwise, as has been remarked above, they were the wrong people at the wrong time in the wrong place with the wrong face.

Prof. Mair, the thing going for it is the language. You are in contact with Prof. Vovin, have quoted here from an e-mail he wrote to you about it, and have not said one word against the paper he wrote with Vajda (another Yeniseianist) and de la Vaissière. I linked to the paper on academia.edu where it can be read without any restrictions such as paywalls.

3. Despite obdurate denials by one hyper-loquacious participant in this debate, Tocharian is more closely related to northwest IE languages than it is to Indo-Iranian or Balto-Slavic

You keep saying so. And you keep not providing any evidence against the current consensus among IEists that all surviving IE branches are more closely related to each other than to Tocharian. The consensus is not very strong, so perhaps you'll find some. You are in contact with Douglas Adams and Donald Ringe – you don't need to take my word for such things!

Why do I write such long comments? Precisely because I take what you post (by yourself, or as guest posts by others) seriously. If I gave it shorter shrift, I wouldn't give your thoughts and your knowledge (or those that you relay) the respect they deserve.

or some imagined 4th-century Yeniseian language.

As I just said above…

6. This was precisely the time of the Völkerwanderung (300 AD-400 AD).

That only got really heated when the Huns showed up on the European scene in 375.
David Marjanović said,

January 31, 2019 @ 6:18 pm

I wouldn't give

That should be "I wouldn't be giving".
Victor Mair said,

January 31, 2019 @ 8:35 pm

From Chau Wu:

@B.Ma and @AG

B.Ma said, “This may be a silly question but I am confused by the “examples showing the change from European f- to Tw s-“ – is this saying that those Taiwanese words were derived from Old Norse?”

You asked a great question, and be assured, no questions are silly. The short answer is no, but I should explain the background to make my statements in the o.p. clearer. For a brief introduction to Taiwanese (Tw), I would refer you to my publication, Sino-Platonic Papers, Issue 262, 2016 (free download). It is admittedly just a cursory introduction, but it is easily accessible.

Tw is in a branch of Sinitic (called Min) that split off from the mainstream fairly early and migrated southward in several waves. They finally settled in the Min area. Tradition has it that the time of the split is probably from Later Han 後漢 to the Yuánwèi 元魏 (Northern / Tabgatch / Tuoba Wei; 386-535). That the river by Quánzhōu 泉州 is called Jìnjiāng 晉江 is no accident because it was named in commemoration of the Jìn 晉 in one of the migration waves. The language they carried there became the substratum (the vernacular 口語音) of modern Southern Min (SM) and Tw. Later in the Tang dynasty, the Tang court sent government officials and garrison forces to administer and guard the area. The language of the Tang court became the superstratum (the literary 讀書音). As a result of the early parting, the overlap between Tw and other Sinitic in general is about 70%, with the remaining 30% being “peculiar” to SM and Tw with no Sinographs for them (called 有音無字 “graph-less phones”).

Of all the Sinitic topolects, SM/Tw has the most extensive bilayer co-existing side by side. This is somewhat similar to the bilayer of 吳音 (go-on) and 漢音 (kan-on) in Japanese. There is another distinction about SM/Tw. As Professor Mair has mentioned, SM/Tw is the most conservative among all the Sinitic topolects. What he means is that in the substratum, SM/Tw preserves archaic features, pronunciations, usages, and odd morphosyllables not seen in the general Sinitic. In addition, the same Sinograph can be pronounced differently, depending on whether the context is superstratum or substratum.

Like all other languages, Sinitic has absorbed foreign loans continuously throughout its long history. The loans include Greek, Latin, Germanic, Indic (primarily via Buddhism), areal words of northern nomads, and in modern times, Sino-Japanese (by back reflux) and modern European.

Now I shall answer your question. Germanic (especially NW Germanic) also came to Asia. Some of the vocabulary was absorbed and then internalized by Sinitic, and the loans evolved according to the phonological rules of Sinitic. They eventually appear in modern SM/Tw and other Sinitic topolects. In the meantime, Germanic tongues underwent their own changes and diversifications to give rise to various branches. It happens that Old Norse, written as Old Icelandic in poetry and sagas, has been the best preserved and richest written language of the Medieval Age. Modern Icelanders can read Old Icelandic without much difficulty. Like Icelandic, SM/Tw has a long history of isolation, thereby preserving archaic features most abundantly. Luckily for me, I was born a Taiwanese and my parents passed on this archaic yet modern language to me. Having found the REGULAR patterns of sound correspondence between ON and Tw has been the greatest joy of my life (please see the 9 patterns of sound correspondence on pp. 43 – 104 in my paper). I would like to conclude by citing just a couple of examples (not from the paper) to illustrate my points.

(1) The word for ‘soap’ is 肥皂 féizào in MSM; in contrast, it is sap-bûn (a variant form is sat-bûn) in Tw, and it is a graph-less word. Soap was a Germanic invention; it was unknown to the Greeks and Romans of the classical period. Pliny the Elder in his Historia Naturalis (28.191) mentioned that it was a Gallic invention but he was mistaken. The Gmc word has been reconstructed as *saip(i)ōn. It had been borrowed into various Romance languages before the mainstream Late Latin took it up. Tw sap-bûn is closest to Rumanian săpun.

(2) The word for the verb ‘to drink’ is 喝 hē in MSM. Tw does not use this word. Please see the map of Chinese dialectal equivalents for 喝 ‘to drink’ in Wiktionary (https://en.wiktionary.org/wiki/Template:zh-dial-map/%E5%96%9D). Tw has two words for ‘to drink’: (a) Tw ím is the literary reading of 飲; it belongs to the superstratum. It is in my opinion a loan from Latin imbibere (> E imbibe). The derivation goes like this: L. imbibere ‘to imbibe’ > *im- > ím 飲 ‘to drink’. (b) The second one, Tw lim, is a vernacular word for daily usage and belongs to the substratum, hence it is in the category of graph-less phones. Recently the Ministry of Education of Taiwan has been creating graphs for those “graph-less” words. Thus, a new character 啉 has been created for Tw lim. Tw lim can be derived directly from either Latin lībāre or Greek leíbein (λείβειν) through homorganic nasalization (-b > -m), thus (using the Latin word as our starting point), lībāre > *līb- > lim. The Latin/Greek word means ‘to make an offering of a drink to the gods in sacrificial rituals’. Homer uses this word in his epics. You cannot be more reverential than this when you use this word. It has given me immense pleasure and pride realizing that I have been using one of the most refined words from the Greco-Roman world.
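
The derivation lībāre > *līb- > lim given above can be replayed mechanically as an ordered list of rewrite steps. What follows is only a minimal Python sketch of that bookkeeping: the helper names (derive, drop_suffix, final_b_to_m) and the step list are illustrative assumptions, and the vowel-shortening step is inferred from the written forms rather than stated in the text. Running it shows that the steps are internally consistent; it says nothing about whether the proposed loan is historically correct.

```python
# Hypothetical helper: replay a derivation as an ordered list of rewrite steps
# and keep every intermediate form so the chain can be inspected.
def derive(start, steps):
    forms = [start]
    for _label, rewrite in steps:
        forms.append(rewrite(forms[-1]))
    return forms


def drop_suffix(suffix):
    """Return a rewrite that removes `suffix` when the word ends with it."""
    return lambda w: w[: -len(suffix)] if w.endswith(suffix) else w


def final_b_to_m(w):
    """Final -b > -m, the homorganic nasalization named in the comment."""
    return w[:-1] + "m" if w.endswith("b") else w


# Steps as read off the comment; the middle step is an assumption
# (the text writes *līb- and then lim without naming the shortening).
STEPS = [
    ("drop the Latin infinitive ending", drop_suffix("āre")),
    ("shorten ī to i (assumed)", lambda w: w.replace("ī", "i")),
    ("final -b > -m", final_b_to_m),
]

chain = derive("lībāre", STEPS)
print(" > ".join(chain))
```

Writing the rules out this way makes it easy to test them against other word pairs: a rule set that only ever fits the one example it was built for is weaker evidence than one that applies regularly.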
AntC said,

January 31, 2019 @ 10:00 pm

Thank you again Chau Wu. I'm afraid to say that as one mystery is explained, it is replaced by a different mystery.

You're explaining how SM/Tw preserved an older pronunciation of Germanic loanwords, even as the pronunciation in Germanic evolved (slowly in ON) and slowly in SM/Tw. OK. That is a strong, specific, measurable, time-oriented and above all scientifically testable hypothesis. (It's also an amazing surprise.)

But now you take up the example of 'to drink' (that's a Germanic substrate word?). Then I would expect you to be linking 'drink' to some word in SM/Tw. Instead you link a Latin word to the MSM and a Greek word to the SM/Tw. Those might be plausible linkages, but they have no bearing on your claims for Germanic linkages. When was the contact time from Homer's Greek to Early Sinitic? Did the word come via Sanskrit or Indo-Iranian?

I'd expect: there to be a Germanic word that is counterpart of Greek/IE leibein (I have a vague feeling I've come across that in German, but I'm too rusty); for you to cite its (reconstructed?) pronunciation in Germanic circa C4th (certainly after the operation of Grimm's law); for you to apply the expected sound changes in Sinitic from that time up to today's SM/Tw pronunciation. I'm plain not seeing how Germanic 'drink' is a cognate of Tw 'lim'. I apologise if I'm being dense.

To be clear: I'm not disputing that Tw 'lim' might be cognate with Greek 'leibein'. I share your pleasure and pride at being able to share the libations of Homer's seafarers. I do not see that claim as evidence for your main claim for Germanic loans into Sinitic. And specifically the bridge point being C4th Tocharian. To justify that main claim wrt 'drink', you need to get a Germanic phase into the history of sound changes.

Otherwise, your claims weaken to: a bunch of IE cognates from left, right, and centre sound-alike vaguely sense-alike words in SM/Tw. And throughout this topic we're seeing it's something of an occupational hazard for linguists (or would-be linguists) to hear sound-alikes. Of course if you widen the catchment from Germanic to also Latin and Greek, you're going to come across more apparent sound-alikes.
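
The point about widening the catchment can be made concrete with a rough calculation. The sketch below is only an illustration under stated assumptions: the per-word chance of an accidental sound-and-sense match (`p_single`) and the candidate counts are hypothetical numbers, not measured values, and the candidates are treated as independent.

```python
def p_chance_match(p_single: float, n_candidates: int) -> float:
    """Probability that at least one of n independent candidate words
    resembles the target word purely by chance.

    p_single is a hypothetical per-candidate chance of an accidental
    sound-alike/sense-alike; it is an assumption, not an empirical figure.
    """
    return 1.0 - (1.0 - p_single) ** n_candidates


# Hypothetical scenario: 5 near-synonyms searched in one source language,
# versus 5 near-synonyms in each of three (Germanic, Latin, Greek).
one_language = p_chance_match(0.01, 5)
three_languages = p_chance_match(0.01, 15)
print(f"one language:    {one_language:.3f}")
print(f"three languages: {three_languages:.3f}")
```

Whatever value is assumed for `p_single`, the chance of finding at least one accidental match grows with every language and every near-synonym added to the search, which is why a hypothesis restricted to one source language at one period is the easier one to test.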

Prof Mair: much as I regret that you and David M seem to have started these threads on a 'wrong footing', I do value David's insisting on rigorous justifications. I can see from the (now apparently) bogus claims on the wikipedia Pumpokol article, just how easy it is to hear sound-alikes. I'll respond more specifically to your latest very helpful summary, thank you.
AntC said,

January 31, 2019 @ 10:06 pm

Correction: Chau Wu is linking Latin for drink to the Tw literary word/superstrate. Not to MSM, whose word 'hē' appears not to be a cognate.
AntC said,

January 31, 2019 @ 10:36 pm

BTW the 'im' of 'imbibe' is just prefix 'in' (with assimilation). The root that carries the meaning of drink is 'bib-' (cp English bibulous), cognate with IE 'pot-' (cp English potable).

leíbein to lībāre (cp English libation) might be cognate with 'lim', I suppose. But that comes to English from Latin, perhaps PIE *lehi-. I continue to see no Germanic.

Chau Wu: I think you're doing your claims a disservice with reckless (dare I say fanciful) sound-alikes.
Chris Button said,

January 31, 2019 @ 10:50 pm

@ David Marjanović

The centum-satem distinction has not been considered fundamental in the historical linguistics of IE in about fifty years.

Respectfully (and bear in mind I am saying this as a specialist in Proto-Sino-Tibetan rather than Proto-Indo-European), I think you might be slightly misunderstanding the situation. The notion of a genetic split along centum-satem lines was overturned by the discovery of none other than Tocharian itself, so it would be weird if Prof. Mair were challenging this (which I don't believe he is). If I am understanding his position correctly, Prof. Mair is referring to the resultant treatment of the centum-satem distinction as an "areal phenomenon" (i.e. something shared by languages in geographical proximity) that still represents a fundamental distinction albeit in a very different manner. If you need any evidence of how "fundamental" this areal distinction remains, you might want to read about the controversial and hotly debated "Bangani Enigma" that I don't believe was ever fully resolved.
AntC said,

February 1, 2019 @ 1:09 am

Thank you Prof Mair for your summary. To me, whilst it makes a number of assertions that I don't doubt, few of them amount to evidence either way.

2. Fotudeng was almost certainly a Tocharian.

Yes I don't doubt he spoke Tocharian. I don't doubt Tocharian is in the IE family, or that those bristling warriors could be Celts or Goths. I also don't doubt that Fotudeng, as a monk/missionary, traveler and all-round wise man could operate in many languages.

At the time of the action, he'd been in China some 20 years, and in the orbit of Shi Le for a decade. Given the history and mobility of both men, I'm surprised there wasn't a language they had in common whereby they could speak directly. Even if not, surely Fotudeng could find a language in common with the scribes/courtiers?

So I think it's wrong to focus on the speaker; let's also consider his audience.

It seems to me we're making a big assumption that Fotudeng wanted to be exactly understood. He was a smart operator promoting a pacific creed in turbulent times, with fearsome warriors. Then if I were him, I'd speak obliquely in the style of "a mighty empire will fall". "Troops will go out; there will be captures; there will be a great leader; there will be prisons; bad things could happen [Trump, attr]."

I'd speak in a mystic language. I'd use obscure words (of the sort that wouldn't get well-recorded in vocab lists). I'd use obfuscated syntax, and especially avoid prepositions that would betray who would be inside the prison. Plausible deniability: in case the 'wrong' side prevailed, blame it on the scribes/a bad translation.

We don't have secure sound values for the 10 characters in the transcription. We don't have comprehensive word lists that cover all of the semantic spheres of the (alleged) translation. We don't have well-attested reconstructions of how those words could have sounded at the time in the candidate languages.

I'm not seeing enough evidence to move me from the default hypothesis: we'll never know.
AG said,

February 1, 2019 @ 2:53 am

wait – can someone look into the "soap in Taiwanese" thing? Why isn't sap-bun simply just, for example, colonial-era Portuguese or French rather than prehistoric Germanic? Do we have proof that was the Taiwanese word for soap before the colonial era? I am very curious about this example.
AntC said,

February 1, 2019 @ 4:52 am

5. The ovicaprid semantic classifier (radical) of 羯 (lit., "castrated buck ovicaprid; wether") indicates a close relationship with nomadism or agro-pastoralism involving herding of sheep and / or goats.

I'm not seeing how this intel helps. The place seems to be mostly desert. Wouldn't any peoples living there subsist by following the herds following the rains avoiding the snows? Or do we know that the agro-pastoralism of Tocharians was different to Yeniseians, different to Xiōngnú, different to Turkics, different to Indo-Iranians?

Would the Chinese scribes have used a different radical if the Jié had been herders of reindeer or of horses? Or is this the usual Chinese cultural imperialism stigmatising nomads = barbarians (i.e. don't live in cities). And the worst insult: they keep goats.
AntC said,

February 1, 2019 @ 5:39 am

@Chau Wu Tw sap-bûn is closest to Rumanian săpun.

Rumanian isn't a Germanic language. As AG points out, the Tw pronunciation is also close to Portuguese sabão, and recently Prof Mair covered Tw 'pang' from Portuguese pão. (And very tasty bread it was too, in Kaohsiung.)

Again the evidence we need is a chain from Proto-Germanic *saipon (or C4th), through the sound pattern changes to modern Tw. If the Tw is a 'graph-less word', we seem to be stumped. Or does an Early Middle Chinese reading of 肥皂 put us in the right place?

It's a bit hard to tell from Prof Mair's photos whether those Tocharian warriors used soap.

Wikipedia says "True soap, made of animal fat, did not appear in China until the modern era." I'm not sure when that means, but the reference's graphic descriptions of (lack of) personal hygiene in the Tang Dynasty are quite revolting. So did nobody soap for a millennium after the Germanics got ousted?
AG said,

February 1, 2019 @ 8:36 am

I found this helpful list of languages where the word for "soap" is basically the same. Either the Goths visited nearly every country on Earth, handing out hotel soaps and exhorting everyone to lather up, after which everyone ignored the hygiene advice but remembered the word, or the word spread a lot later.

https://www.flickr.com/photos/macadamer/30227445882/
Victor Mair said,

February 1, 2019 @ 9:08 am

I've been to the area around Kucha (homeland of Tocharian B speakers) many times, and they follow a special type of nomadism / agro-pastoralism there (northwest rim of the Tarim Basin). Namely, they winter in the huge oases at the bottom of the foothills 'neath the Tängri Tagh (Tianshan; Heavenly Mountains) and summer with their herds in the lush valleys nestled in the high mountains.

Both north and south of the edges of the Taklamakan Desert, there are enormous oases, some of which have a hundred thousand or more inhabitants. Kucha is one of these. The oases around Kizil, although up closer to the mountains than Kucha, are also of considerable size. There is a monument to Kumārajīva, the great Tocharian translator of Buddhist texts into Chinese, at Kizil, and this is also where I saw the caves of the Tocharian knights, both of which I talked about in previous posts.

We know from the gigantic cemeteries on the gravelly terraces nearby that the scope of these oasis communities was large already in the period from the Late Bronze Age and Early Iron Age to the period corresponding to the Han through Tang.

Already by the 2nd c. AD, Kucha had a population of over 80,000 people (81,317, according to the official Han History / Han shu, completed in 111 AD).
Chau Wu said,

February 1, 2019 @ 10:07 am

@AG. Thank you for finding and sharing the Flickr page. That’s great. I have three comments.

(1) The “Taiwanese sabun” included in the Austronesian group probably refers to that of the Native Taiwanese, not the Southern Min speaking people; the latter have sap-bûn for ‘soap’.

(2) The Austronesian group on the list is confined to those in Southeast Asia. I would like to know what the Maori in New Zealand or Native Hawaiians call it. The reason I am curious about it is that in Southeast Asia Southern Min speakers are spread far and wide, and they exert significant influence in commerce.

(3) Don’t the Japanese use soap? They are not included in the Asian group. I know they care about cleanliness to a fault. Is it because they call soap sekken so that it is not listed? Similarly, MSM feizhao is not listed either.
Chau Wu said,

February 1, 2019 @ 11:09 am

@AntC. I apologize for giving you the impression that I meant the Taiwanese lexicon is Germanic. I re-read my comment to B.Ma and AG, and found that I was not careful in phrasing my sentences. B.Ma and AG asked whether “those Taiwanese words were derived from Old Norse?” I first gave a short answer “no”, and then went on to a brief introduction to the history of Taiwanese. In it I did say:

“Like all other languages, Sinitic absorbs foreign loans continuously throughout its long history. The loans include Greek, Latin, Germanic, Indic (primarily via Buddhism), areal words of northern nomads, and in modern times, Sino-Japanese (by back reflux) and modern European.”

Southern Min/Taiwanese, like Sinitic in general, received loans from various sources. Then, intending to answer B.Ma and AG’s question regarding the relationship between ON and Tw, I said:

“Having found the REGULAR patterns of sound correspondence between ON and Tw has been the greatest joy of my life…”

That is a clear case of misstatement. I should have said “between ancient European languages and Tw…” I hope this will remove the problems you have with the Tw words for ‘soap’ and ‘lim’.

I apologize for my lack of precision in phrasing the sentence that led you to a wrong track.
Chris Button said,

February 1, 2019 @ 11:37 am

@ Victor Mair & AntC

The ovicaprid semantic classifier (radical) of 羯 (lit., "castrated buck ovicaprid; wether") indicates a close relationship with nomadism or agro-pastoralism involving herding of sheep and / or goats.

Would the Chinese scribes have used a different radical if the Jié had been herders of reindeer or of horses? Or is this the usual Chinese cultural imperialism stigmatising nomads = barbarians (i.e. don't live in cities). And the worst insult: they keep goats.

An interesting comparison here is the name "Qiang" 羌. Pulleyblank quite rightly casts doubt on later perceptions that this meant they were sheep-herders since 羊 "sheep" is clearly playing a phonetic role (as it is also in 姜) and they were originally a hostile enemy, rather than peaceful shepherds, as attested all the way back in the oracle-bone inscriptions. Although in the case of 羯, the 羊 component is not playing a phonetic role, it might not be wise to take it literally as implying "sheep herders" here either since the logical extension would then be to tie the "Mo" 㹮 specifically to 犭(犬) "dogs" etc. etc. There might, however, be something in the general connection of animals of whatever kind with "barbarians"…
AG said,

February 1, 2019 @ 12:20 pm

@ Chau Wu – Sorry for any confusion; that is a list only of words that sound like "sapun", not a list of the words for soap in every language.

My point in linking to it was as an indirect way of wondering if, when the same word for a relatively modern invention shows up in so many languages, like "sapun" or, let's say, "taxi" or "ketchup" or "wifi", shouldn't it probably be disqualified from being used as an example of some kind of ancient linguistic exchange?

To say it another way: all the languages along the Silk Road might share the term "wifi", but that doesn't mean that either wifi or the term for it spread along the ancient Silk Road, right? (I'm not a linguist & hope the pros will forgive the extremely non-technical way I'm expressing my questions here).
Victor Mair said,

February 1, 2019 @ 1:27 pm

From "Year of the ovicaprid" (2/15/15):

Considering that domesticated, herded ovicaprids were brought to East Asia before the Bronze Age by pastoralists from the steppes to the northwest, it is remarkable how the semantic classifier yáng 羊 has entered into many important characters having positive, felicitous meaning in Chinese civilization:

yì 義 ("justice, righteousness")

shàn 善 ("good[ness]")

xiáng 祥 ("felicitous; auspicious")

yǎng 養 ("raise; nourish; nurture; rear; take care of in old age")

měi 美 ("beauty")

xiū 羞 ("[sense of] shame")

yǒu 羑 ("to guide to goodness / right / reason")

xiàn 羨 ("admire; be fond of")

xiān 鮮 ("fresh; delicious food; delicacy; good and kind — an obvious merging of ovicaprid and piscine qualities")

qún 群 ("group; community")

For a fuller treatment of this phenomenon, see "Lamb of Goodness, Goat of Justice" (pp. 86-93) in Victor H. Mair, "Religious Formations and Intercultural Contacts in Early China," in Volkhard Krech and Marion Steinicke, ed., Dynamics in the History of Religions between Asia and Europe: Encounters, Notions, and Comparative Perspectives (Dynamics in the History of Religion, 1 [Ruhr-Universität Bochum]) (Leiden: Brill, 2011), pp. 85-110. (available on Google Books)

But ovicaprids are only one part of a large package of cultural attributes and technologies that entered East Asia along the same vector from the northwest. Tattoo is another, and it ultimately contributed in a fundamental way to the development of writing:

"Tattoos as a means of communication" (9/1/12)
Chris Button said,

February 1, 2019 @ 2:37 pm

It might be worth noting how the "beautiful" plumage in early inscriptional forms of 美 above the 大 graphically converged with 羊 (presumably facilitated by the association of sheep with wooliness). I wonder to what extent this contributed to the development of the later senses of 羊?
AntC said,

February 1, 2019 @ 7:12 pm

Maori in New Zealand have hopi for soap, which is transparently a loan from the English, not from a Romance language. (The trailing -i gets added because Te Reo is a syllabic language/no final consonants.) https://maoridictionary.co.nz/search?idiom=&phrase=&proverb=&loan=&histLoanWords=&keywords=soap

Also at that link you'll see uku, white clay used for washing/bleaching.

As AG says (and it's an entirely sensible point): there might be lots of cognates in Tw from all sorts of IE languages, and they might be endemic to Tw not the mainland, because of Taiwan's particular colonial history. (For example if sap-bûn arrived from Portuguese, it maybe wouldn't appear in SM/Hokkien in Fujian — they'd use the MSM word. Pang arrived more recently from Portuguese via Japanese, so I also would expect not in Fujian.)

Classically (for scientific methodology) if you're trying to prove 'all swans are white', then finding black crows does not count as evidence, however much you get a thrill from studying crows. (If Tocharian/Germanic loans came into Early Middle Chinese, and their pronunciation recognisably survived into SM/Tw, then finding IE loans in Tw that arrived via later Indo-Iranian or modern European contact is not evidence.)

@AG "that is a list only of words that sound like 'sapun'". Ah thanks, that explains why the German isn't there, nor the English. Then I'm curious how some other Germanic languages are there. Tentatively: Tocharian split off from Germanic before the invention of soap, so neither the stuff nor the word made it to the steppes. So Tw got the word via some other route (probably a Romance language). And the Chinese went a thousand years unsoaped.
AntC said,

February 1, 2019 @ 7:53 pm

Professor Mair, you are getting close to a tactic of blowing smoke, by just mentioning any old cultural artefact coming into Chinese. Now we have tattooing. (How did you manage to write a whole article about that without mentioning that the English word is from Samoan, and the long history of tattooing throughout Polynesia, from Austronesian origins, via Taiwan amongst other places).

Tattooing dates to at least 3,000 BC (says wikipedia), found in both Egyptian mummies and a body preserved in a glacier in the Alps. " It was one of the early technologies developed by the Proto-Austronesians in Taiwan and coastal South China prior to at least 1500 BCE, before the Austronesian expansion into the islands of the Indo-Pacific."

Note: South China, and already by the time of the Tarim mummies. Then whether or not tattooing " entered East Asia along the same vector from the northwest.", it certainly came to the region by other routes.

So this is another example of non-evidence bearing on the Tocharian question.
Victor Mair said,

February 1, 2019 @ 8:23 pm

AntC:

Now you are beginning to lose your sense of perspective and code of etiquette.

You need to go back and read that post on tattoo much more carefully, especially for the historical context in which I discussed wén 文.

And I did mention the tattoos on Ötzi the Iceman who died very close to where my father pastured his animals in the Alps as a young boy. Elsewhere I have stressed the importance of Maori tattoo, but that does not negate the significance of the transfer of tattoo across Eurasia through time and space.

Finally, although the Tocharians are crucial to the solution of the 4th-century AD puzzle that is the focus of our long conversation, they are not the be-all and end-all of the problem. To make a breakthrough in understanding the dynamics of East-West cultural interactions, we have to look at a much bigger picture.
AntC said,

February 1, 2019 @ 10:05 pm

@Chau Wu … Maori in New Zealand or Native Hawaiians … The reason I am curious about it is that in Southeast Asia Southern Min speakers are spread far and wide, and they exert significant influences in commerce.

Comments like that do nothing to dispel the smell of goropism and chauvinism that has never really left this thread. Right there in Taiwan you are hugely lucky to have preserved the aboriginal Austronesian languages. I'd love to see 'big theory' work about how those languages spread further South/East in Asia, and throughout Polynesia. Despite all the tattooing, they never thought to invent writing. So Prof Mair's theory needs an explanation for that. (Maori 'moko' facial and body designs are under protocols that relate your whakapapa: genealogy/ancestry/tribal affiliations. So there is semiotics of a sort.)

Polynesian seafaring, navigation and exploring is an amazing story: thousands of miles out of sight of any land. Not only did they find tiny islands, they also navigated back home, accurately to equally tiny islands. There's some evidence they got as far as South America and brought back sweet potato and chickens. Compared to which, Southern Min coast-hugging trading is really pretty feeble. (Arguably, Egyptian/Homeric era or even Viking seafaring isn't much more than coast-hugging either.)

I'm going to discount Gavin Menzies' pseudohistory about Admiral Zheng He reaching New Zealand/Australia; anyway by then Polynesian exploration was complete. (They didn't get westwards from NZ to Australia: Antarctic winds blow the wrong way.)

No Chinese got to NZ until the British brought them — C19th gold-prospectors, mostly Cantonese rather than SM; and they were treated with appalling racism, so I doubt any linguistic influence.
Chau Wu said,

February 1, 2019 @ 11:52 pm

@AntC: “BTW the ‘im’ of ‘imbibe’ is just prefix ‘in’ (with assimilation). The root that carries the meaning of drink is ‘bib-‘ (cp English bibulous), cognate with IE ‘pot-‘ (cp English potable).”

You touched on an important topic in lexical loan processes involving transfer of multisyllabic words to a monosyllabic language (Sinitic), in particular, the prefix of a word. The question is whether a prefix, by nature a bound morpheme, can be used as an independent morpheme in Sinitic, in this case, the ‘im’ of imbibere to become a free morpheme as 飲 (Tw ím) and carry the meaning of ‘to drink’.

First let’s consider the existence of words in English which are originally prefixes: pro / (pl.) pros from professional; sub ‘a submarine; a submarine sandwich; a substitute’ and the verb ‘sub’ (subbed, subbing); hype (hyped, hyping); Super!

In English, the freeing of a supposedly bound morpheme is due to the principle of least effort (Ref. 1), that is, if I can get people to understand that a ‘pro’ means a ‘professional’, then by displacement, the “fessional” can be omitted.

Now when a multisyllabic word like Latin imbibere is loaned to Sinitic (a monosyllabic system), in addition to the principle of least effort, extra pressure is exerted on the whole word to break up and to retain just what is enough for communication. It is usually the first syllable that is retained. This phenomenon is regular enough that it can be called the rule of first syllable. I would refer you to my paper (Sino-Platonic Papers, Issue 262, 2016, pp. 24-25) where it is called Operational Rule-1.

I have presented eight examples of Latin im- (prefix) > Tw im in the same paper as Pattern of Sound Correspondence-7 (pp. 91-92). One would ask, wouldn’t it create a lot of homophones in Sinitic? Yes, but Sinitic has found ways to solve the problem.

Ref. 1. Hans Frede Nielsen, “The Continental Backgrounds of English and its Insular Development until 1154”. Odense University Press, 1998. Pages 15-16. (In the section on the principle of least effort, the author applies it to a number of phonological phenomena, but not to freeing of prefixes discussed here.)
Chau Wu said,

February 1, 2019 @ 11:57 pm

My first submission got mutilated. Second try.

@AntC: “BTW the ‘im’ of ‘imbibe’ is just prefix ‘in’ (with assimilation). The root that carries the meaning of drink is ‘bib-‘ (cp English bibulous), cognate with IE ‘pot-‘ (cp English potable).”

You touched on an important topic in lexical loan processes involving transfer of multisyllabic words to a monosyllabic language (Sinitic), in particular, the prefix of a word. The question is whether a prefix, by nature a bound morpheme, can be used as an independent morpheme in Sinitic, in this case, the ‘im’ of imbibere to become a free morpheme as 飲 (Tw ím) and carry the meaning of ‘to drink’.

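First let’s consider the existence of words in English which are originally prefixes: pro / (pl.) pros from professional; sub ‘a submarine; a submarine sandwich; a substitute’ and the verb ‘sub’ (subbed, subbing); hype (hyped, hyping); Super!

In English, the freeing of a supposedly bound morpheme is due to the principle of least effort (Ref. 1), that is, if I can get people to understand that a ‘pro’ means a ‘professional’, then by displacement, the “fessional” can be omitted.

Now when a multisyllabic word like Latin imbibere is loaned to Sinitic (a monosyllabic system), in addition to the principle of least effort, extra pressure is exerted on the whole word to break up and to retain just what is enough for communication. It is usually the first syllable that is retained. This phenomenon is regular enough that it can be called the rule of first syllable. I would refer you to my paper (Sino-Platonic Papers, Issue 262, 2016, pp. 24-25) where it is called Operational Rule-1.

I have presented eight examples of Latin im- (prefix) > Tw im in the same paper as Pattern of Sound Correspondence-7 (pp. 91-92). One would ask, wouldn’t it create a lot of homophones in Sinitic? Yes, but Sinitic has found ways to solve the problem.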

Ref. 1. Hans Frede Nielsen, “The Continental Backgrounds of English and its Insular Development until 1154”. Odense University Press, 1998. Pages 15-16. (In the section on the principle of least effort, the author applies it to a number of phonological phenomena, but not to freeing of prefixes discussed here.)
AntC said,

February 2, 2019 @ 12:23 am

the smell of goropism and chauvinism

I'd better explain that comment, before I'm accused of more breaches. It seems to me:

Experts on Turkic languages will hear our 10 characters as Turkic.

Experts on Yeniseian languages will hear our 10 characters as Yeniseian.

Speakers of SM/Tw will hear in the sounds a resonance of archaic/proto-Taiwanese.

Experts on early Germanic will hear early Germanic. If they're furthermore experts on other branches of IE, for the bits that don't sound-alike Germanic, they might sound-alike Latin/Greek.

But all linguists are experts on branches of Germanic/IE, because they need a command of English professionally. (I might have to make an exception for Vovin.)

English has a huge vocabulary, greatly increasing the likelihood of sound-alikes/sense-alikes with other languages. Not only does English have a Germanic substrate, it has also absorbed words from all stages of other branches of IE.

I admire Chang Tsung-tung's paper, because that in effect excludes IE-in-general sound-alikes, and narrows the hypothesis to well-attested Germanic roots at a point in time after it branched from IE. (Specifically, after the operation of Grimm's/Verner's laws, and before Germanic started changing (both in vocab and morphology) from contact with the Romance branch.)

To tackle our riddle, it seems to me we need an expert who is equally balanced in Yeniseian and Tocharian. (Those two being the strongest candidates so far.) Who we must exclude is anybody expert in IE-in-general or even Germanic-in-general, because they can't help themselves hearing echoes of Germanic.

Anybody who is a pro-Tocharian chauvinist must also be excluded: nobody on this thread is disputing that Tocharians were in the right place at the right time; nobody is disputing that Tocharian is IE (and generally Germanic/Goth); nobody is disputing that Fotudeng spoke Tocharian. Providing more and more evidence that Tocharians were in the right place at the right time/what I tell you three times is true, is demonstrating to me only chauvinism.

Is anybody denying that there were other peoples/other languages in the right place at the right time? Is anybody denying that Fotudeng could have spoken some of those languages? For that matter: is anybody denying those other peoples subsisted by nomadism/herding animals, probably including sheep/goats? It would be great to bring to bear counter-evidence like that.

To make a breakthrough in understanding the dynamics of East-West cultural interactions, we have to look at a much bigger picture.

is another example of what I'm calling true-but-not-evidence. That is a different topic for a different thread. The case for East-West cultural interactions doesn't rely on what language Fotudeng was speaking. The etiquette I am observing in this thread is "sticking to the subject"; and weighing evidence against a testable hypothesis.

David M has presented what seems to me strong internal evidence that our 10 characters do not sound sufficiently Germanic, neither are the proposed sound-alikes sufficiently sense-alike. I have not seen any comparably strong rejection of the Yeniseian/early Pumpokol hypothesis; but then neither am I in a position to assess the strength of that hypothesis.
Victor Mair said,

February 2, 2019 @ 12:47 am

People who are going on at great length against supposed supporters of a Tocharian hypothesis for the 10 character puzzle should stop beating a dead horse and go back and read the o.p. and the comments. After the observations of Doug Adams (one of our leading Tocharianists, especially for Tocharian B), how could anyone adhere to a Tocharian hypothesis?
AntC said,

February 2, 2019 @ 2:09 am

at great length against supposed supporters of a Tocharian hypothesis for the 10 character puzzle should stop beating a dead horse and go back and read the o.p. and the comments. After the observations of Doug Adams … how could anyone adhere to a Tocharian hypothesis?

Em … ?

Prof Mair, with respect, after the observations of Doug Adams, there's somebody been posting a great deal of information about Tocharians, and Kucha. And that somebody is you. Most recently "the Tocharians are crucial to the solution of the 4th-century AD puzzle that is the focus of our long conversation".

A couple of days after DA's observations, you wrote " Within a day or two, I will post my further thoughts, data, and hypotheses on this conundrum that complement Chau Wu's o.p." That was a couple of days ago. I'd been assuming that the material you've been posting constitutes those thoughts.

On reading back through the posts, perhaps the material is not explicitly in support of a Tocharian hypothesis. But I'm not seeing why else it's appearing. That Fotudeng was Tocharian seems to me rather incidental than "crucial" — that is if our 10 character puzzle is not in Tocharian.

Clearly I am not following. Then I apologise for posting distracting and irrelevant commentary.
AntC said,

February 2, 2019 @ 6:38 am

@Chau Wu "transfer of multisyllabic words to a monosyllabic language (Sinitic), in particular, the prefix of a word."

In light of Prof Mair's recent "stop beating a dead horse", I think I'm going to give up on this side-thread. You could perhaps seek guidance from fellow Sinologists and philologists as to whether what you outline is a plausible loanword process. I am no expert at all. Some brief thoughts — ah, it's turned out not brief:

I particularly like the way Chang Tsung-tung keeps to single-syllable Germanic roots corresponding to single-syllable Sinitic words. After firmly establishing the sound-pattern correspondences and time of contact then perhaps cast the net wider.

Your o.p. was about (mostly) Germanic words corresponding to a EMC sound pattern somewhat-reconstructed by reference to SM/Tw as exhibiting conservative sound changes; and very stimulating it was. I am just not going to enter into further correspondence about random IE words allegedly entering SM/Tw. In particular, for graph-less phones how could we reconstruct their time or place of entering, or their sound changes since entering? You're also talking about loans from Romance languages which AFAICT have no close Germanic equivalents. This is way off the original topic. Perhaps my questions below are answered in your papers, but I'd far rather put my time into Chang Tsung-tung's paper (which is hard going for me, but very rewarding).

Just because a word appears in SM/Tw but not in MSM does not necessarily mean you have to start looking for loans. There's plenty of everyday words that appear in single languages with no apparent cognates, even in closely-related languages. ('Dog' in English, for example.) Specifically, if SM/Tw is considered more conservative, it could be an EMC word that died out in Northern topolects or got replaced there by a Manchu/Mongolian word.

Even if the SM/Tw word is sound-alike/sense-alike with some IE word, that doesn't on its own constitute much evidence of a loan — particularly if you cast the net wide to any IE language at all, at any time of contact with Indo-, Irano- or Southern or Northern Euro- interactions. I plain do not believe Tw has a loan from Rumanian, particularly when there's well-recorded Portuguese contact: you make yourself ridiculous.

If a word appears in Tw but not mainland SM, there's many more possible explanations, given the very particular colonial history of Taiwan.

If you're going to allow snipping off syllables, including taking prefix but not the stem, you've lowered the bar for testable hypotheses to pretty much ground level. Why take the one syllable not two of 'imbibe'? Why take the first, which is unstressed? Why take both syllables to sap-bûn, not just the first?

a multisyllabic word like Latin imbibere is loaned to Sinitic

Is it? When? Where? Did the Romans or Byzantines come to Southern China? If they'd come to Taiwan, they wouldn't have found any Sinitic speakers. Was it Marco Polo? He went Silk Route to Northern China. Did it come via Portuguese? But there were no Sinitic-speaking communities in Taiwan at that time. Was it when the Dutch brought indentured labour (Han from Fujian) to Taiwan? But the Dutch use a cognate of 'drink'/Germanic, not 'imbibere'/Romance.

Why (from your o.p.) do you think Germanic 'four' came into Chinese, but not any other number? (And in that case it's every topolect, and it's pronounced pretty much the same everywhere/nothing different conservatively preserved in SM/Tw.) At the time these im- prefix Latin words allegedly entered Sinitic, why didn't 'quatuor'/'quattro' (of course only its first syllable)? If you're going to say: there was already a word for four; then I'd be pretty confident early Sinitic had a word for four long before it encountered early Germanic: how old is character 四? (I half-remember Tw has a word for two or for twice/double, in addition to 'Èr', and that doesn't appear in MSM. Now that would be interesting to look into.)
Chris Button said,

February 2, 2019 @ 5:56 pm

Ok – time for some random speculation on my part…

There is apparently a Gaulish word *slugi "army" attested in a tribal name (cognate with Irish slua from Old Irish *slóg "army"). The time period of the transcription 秀支 seems possibly a little too late for certain key OC features to be retained, but xiesheng derivatives like 誘 EMC juw' (from OC *lə̀wʔ) suggest that 秀 EMC suwʰ should be reconstructed as something like OC *sɬə̀ws. As for 支 EMC tɕiă (the /ă/ is Pulleyblank's pharyngeal off-glide distinguishing it from 脂 EMC tɕi with which it had merged by the time of LMC), we can still reasonably assume it was perhaps being articulated more in the velar than alveolar region as its OC form *kàj would suggest. This gives us 秀支 EMC suwʰ tɕiă from OC *sɬə̀ws kàj. Something in between the OC and EMC forms, possibly with intervocalic voicing between the two syllables, leaves us with something plausibly close to Gaulish *slugi "army".
bgermain said,

February 2, 2019 @ 6:54 pm

Backtracking, "Phew!" is that rare sort of word, incomplete without punctuation. Yet it is complete itself and would always be capitalized.

It is part of a satisfying group that are mainly exclamations, showing onomatopoeia related to physical reactions, such as "whew," "wow," and "whee!" as well as the nonverbal comment by a low whistle, to express that something is surprising or impressive.

"Phew!" is also one of an interesting group, either exclamations or natural noises made by part of the face, that have a final u-w-v-f or f-equivalent: "Ow!," "Ugh," Latin "heu!," Italian "Uf!," the noise of revulsion that made it into print in recent years as "Eww," plus huff, puff, laugh, sniff, and cough.

I've never related "phew" to disgust, as in "ugh," "eww," archaic "faugh" or the juvenile "peeee-ew!" for "what's that smell?" To me it has always been a close cognate to "Whew!," the hard exhale of being out of breath (fatigue), or an expression of something being surprising or impressive (what a lot!), or of relief (that was close!). What I love is that "phew" appears, to me at any rate, to represent the breath and relate to a Greek root. Just as the six-year-old's "pitooey!" makes the spitty "ptuein" sound, as will a modern, secular yet cautious Jew who doesn't knock on wood but does turn aside and say "tu!tu!tu!" to avert misfortune, so the impulsive breath-noise "phew" sounds similar to "pneuma" or "pleumon." And these, it is said, come down from "pleo," I float, fly, swim, sail, which was used in idioms to express abundance or quantity, similar to "swimming in cash." Phew!
Chris Button said,

February 2, 2019 @ 6:59 pm

I might add that this supports Professor Mair's earlier suggestion of a Celtic connection; the form itself seems to have no known cognates outside of Celtic and Balto-Slavic. Apparently the Irish reflex was borrowed as "slew" in English.
Victor Mair said,

February 2, 2019 @ 7:18 pm

@bgermain

In your penultimate paragraph, you say "phew" belongs with a group of exclamations that indicate revulsion, yet in the final paragraph you express the opinion that you do not associate "phew" with feelings of disgust.

Do you make a sharp distinction between revulsion and disgust?
Victor Mair said,

February 2, 2019 @ 7:20 pm

@Chris Button:

Thanks very much for this excellent suggestion of Celtic words for "army".
Victor Mair said,

February 2, 2019 @ 7:35 pm

We've now moved from beating a dead horse to flailing a straw horse.

I did lay out my new hypothesis, and it included an explanation of what all this has to do with the Tocharians, even though the ten character puzzle itself does not seem to be in Tocharian. Better to read through the o.p. and all the comments closely and carefully before flogging a dead, straw horse again.

Chris Button did keep in mind what I said about Celtic and Gothic, so I was very pleased by his latest comments.

Anyway, I was also happy that someone who is highly skeptical and extremely critical would say the following:

=====

I admire Chang Tsung-tung's paper, because that in effect excludes IE-in-general sound-alikes, and narrows the hypothesis to well-attested Germanic roots at a point in time after it branched from IE. (Specifically, after the operation of Grimm's/Verner's laws, and before Germanic started changing (both in vocab and morphology) from contact with the Romance branch.)

—–

I particularly like the way Chang Tsung-tung keeps to single-syllable Germanic roots corresponding to single-syllable Sinitic words. After firmly establishing the sound-pattern correspondences and time of contact then perhaps cast the net wider.

—–

…Chang Tsung-tung's paper (which is hard going for me, but very rewarding).

=====

Isn't it interesting that both Chau Wu and Tsung-tung Chang have come to similar conclusions about interactions between Germanic and Sinitic, though they did so employing completely different methodologies? Maybe there's some deeper, underlying truth to what they're saying.

It's also intriguing to me that both Chau Wu and Tsung-tung Chang are Taiwanese. The first is a pharmacological scientist with a doctorate in that field, and the second had two doctorates, the first in economics and the second in Sinology (initially emphasizing research on oracle bone inscriptions). Perhaps their deep familiarity with Taiwanese permitted them to make breakthroughs in the conceptualization of the development of Sinitic that others who relied only on theoretical reconstructions could not achieve.

Diana Shuheng Zhang fits into all this puzzle solving because of her thorough training in historical phonology and exceptional brilliance which permit her to fine tune her reconstructions beyond geographically and temporally obscure Old Sinitic and Middle Sinitic. Her outstanding philological skills also enable her to find and comprehend a vast array of textual sources that have a bearing on the subject under discussion.

Although some unnecessarily barbed remarks have been launched during this long debate, I'm sure that Chau Wu appreciates the serious attention that has been paid to his hypothesis and will refine his approach accordingly. At the very least, we now have a totally new hypothesis concerning the 10 character Jie transcription that is on the table, some aspects of which will undoubtedly prove fruitful for continuing research on this particular puzzle, and with profound implications for investigations concerning related issues having to do with Trans-Eurasian Exchange (with a nod of gratitude to Andrew Sherratt), for which see above:

http://languagelog.ldc.upenn.edu/nll/?p=41519#comment-1559598

and here:

https://www.researchgate.net/publication/285033177_The_Trans-Eurasian_exchange_The_prehistory_of_Chinese_relations_with_the_West

Sometimes if one wants to see and understand small details before one's eyes, one also needs a broader vision that will put them in proper perspective.
AntC said,

February 3, 2019 @ 12:49 am

I was very hesitant to step into this again, but since Prof Mair has quoted from me at length, I wish to draw a sharp disagreement with what he says. As a disclaimer, I should say again that I am no professional linguist or philologist; just a keen amateur who laps up Prof Mair's posts on Sinitic language history. When I said:

I admire Chang Tsung-tung's paper, because that in effect excludes IE-in-general sound-alikes,

that was because Chau Wu does exactly the opposite and considers IE-in-general sound-alikes. I raised that point in particular because of the claims about Latin 'imbibere', which AFAICT has no corresponding Germanic word. In fact Chau Wu's occasional dipping into Latin or Greek seemed to me so much in need of explanation that contra my protestations earlier, I couldn't help myself examining SPP No 262.

And there I was appalled. Chau Wu draws IE words hither-thither from Germanic, Latin, Greek (but in that case why not Indo- or Iranian?). He justifies the after-borrowing historical sound changes in Sinitic sometimes by reference to Japanese, sometimes to other Chinese topolects, sometimes to other sound-alikes or segment-alikes in SM/Tw: but why those words and not to all sound-alikes/segment-alikes in those comparative languages? And it would seem to need referring to all sorts of time periods in the evolution of topolects.

The bulk of the source words are indeed drawn from Germanic, but again it's hither-thither from ON, OHG, OE, reconstructed PGmc, …

Is this legitimate 'Comparative Method' ? Not that I've ever seen before, but what do I know? Perhaps a philologist could comment. In fact I hope the editor-in-chief of the SPP series could comment.

Isn't it interesting that both Chau Wu and Tsung-tung Chang have come to similar conclusions about interactions between Germanic and Sinitic, though they did so employing completely different methodologies? …

No I don't see them as having come to similar conclusions. Chau Wu is making claims about a rag-bag of IE interactions with Sinitic. In contrast Tsung-tung Chang very much 'sticks to the knitting' of a very narrowly time-defined circle of Germanic words. And by choosing only single-syllable words, and hypothesising they entered at a time before Sinitic developed the tonal system, that dovetails very nicely into explaining tones as arising from morphing the syllable-final consonants from Germanic. In comparison, Chau Wu's very diverse explanations for tones and nasalisation of SM/Tw again exhibit a scatter-gun and frankly ad-hoc range of morphing that shouts to me: this is choosing just any means to justify the ends.

Maybe there's some deeper, underlying truth to what they're saying.

I think there may well be a deeper truth to what Tsung-tung Chang is saying. Then I have to say that in my judgment Chau Wu is attracting ridicule to an hypothesis that merits closer examination.

This comment is already long enough, but strong claims (on my part) need strong justification, so let me take just one other example. I found it odd Chau Wu's o.p. draws on a sound pattern change from Proto-Gmc *fidwōr ‘four’ — this is to justify a morph from Gmc leading f- to *h- to *hs- to s- in Sinitic. As David M asked, why four and not other number-words?

Furthermore, four in Germanic is something of an anomaly within IE, so a strong indicator of Gmc and precisely not IE-in-general. From resources I looked at, there's no good explanation how PIE *kwetwer- got to Proto-Gmc *fidwōr; a vague "Watkins explains the -f- as being from the following number (Modern English five)".

So let me first answer David M's question, by reference to SPP No. 262 section CV-4 NUMERALS: other Chinese number words are claimed to come from Germanic. In fact every Chinese number word comes from Germanic, allegedly. Of course a bewildering cherry-picking of OS, OHG, ON, OE, reconstructions, inferences of 'missing' consonants by comparison to Latin/Greek (because annoyingly those consonants don't appear in PGmc, but are needed to justify the Sinitic derivation). And of course only the first syllable from polysyllabic number words (such as *fidwōr), with ad-hoc reasons as to whether any of the intersyllabic consonants got included into tones/nasalisation.

OK, so at least Proto-Germanic *fidwōr and *fimfe five both start f-, so they would morph in parallel in Sinitic, right? Wrong: five ends up in Tw not starting s-, but as gō (MSM Wǔ 五).

As a sanity check on all this, can I ask: is it plausible that number words arrived holus-bolus into Sinitic from Germanic? Or that in general a whole number-word system should be borrowed into an established language? My understanding is that the continuity of number-words is a mark of the continuity of a language. Then must the explanation be that Sinitic has a Germanic substrate from the beginning? I assume Sinitic had number words and a sophisticated numbering system 'for ever'. (To run an empire or even a small warring kingdom you need to count stuff.) What are the earliest attestations of pronunciations and characters for numbers? Is there evidence of (say) some invader overthrowing an older number system and bringing in their own, precisely because they needed to count their conquests? Or do we need to push back this time of Germanic contact to even before Sinitic writing, indeed sometime before even the Tarim basin mummies?

[Chau Wu] is a pharmacological scientist with a doctorate in that field, and has drawn on that background for the "completely different methodolog[y]", namely "multiple sequence alignment analysis (MSAA) commonly employed in studies of the structure–function relationship of cellular proteins"

It strikes me that, as scientific good practice, an innovative methodology should be validated against well-understood data before being deployed in speculative reconstructions. An excellent test-bed would be to see if MSAA approaches coincide with the painstakingly arrived-at conclusions from IE philology. I see no evidence that was tried. I don't even see that the MSAA algorithms were 'trained' against any well-understood data. As it is, the transplanting of a methodological approach reminds me strongly of geneticists blundering into language typology using genome evolution models. We all (who've read Prof Liberman's posts on the topic) know how that went. Junk science.
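
To make the validation idea concrete, here is a minimal sketch of what such a check might look like. It is not the MSAA procedure used in the SPP papers, whose details are not given in this thread; it is a plain pairwise global alignment (Needleman-Wunsch) with assumed scores of +1 for a match and -1 for a mismatch or gap. The function names `alignment_score` and `separates`, and the toy strings in the usage note, are hypothetical placeholders rather than real linguistic data.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score of two segment sequences.

    The scoring values are illustrative assumptions, not a phonologically
    motivated scheme.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per segment.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align the two segments
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


def separates(related_pairs, unrelated_pairs):
    """True if every pair labeled related outscores every pair labeled unrelated.

    A method passes this (very weak) check only if it ranks pairs of known
    status the way established philology does.
    """
    lowest_related = min(alignment_score(a, b) for a, b in related_pairs)
    highest_unrelated = max(alignment_score(a, b) for a, b in unrelated_pairs)
    return lowest_related > highest_unrelated
```

With made-up strings, `separates([("kata", "kada")], [("kata", "morin")])` returns `True`. A real test would replace those with word pairs whose relationship is already settled by the comparative method, and would need a scoring scheme based on regular sound correspondences rather than surface identity, since true cognates often look quite unlike each other.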
AntC said,

February 3, 2019 @ 4:42 am

the juvenile "peeee-ew!" for "what's that smell?" To me it has always been a close cognate to "Whew!," the hard exhale of being out of breath (fatigue), or an expression of something being surprising or impressive (what a lot!), or of relief (that was close!)

And then there's Deputy Dawg.

It would appear that in my idiolect there are two words spelled "Phew!", with markedly different intonation patterns.

'Phew! that tofu is stinky' with rising-falling intonation and a long drawn-out -oooo.
That's the revulsion/disgust sense. That same sense I could also pronounce/spell "Phew-ee!".

'Phew! that hill was a steep climb'; or 'Phew! what a scorcher!' with falling intonation/short trailing aspiration. I hypothesise the short aspiration is from shortness of breath after exertion.

I've tried putting the 'wrong' pronunciation into those sentence frames, and it doesn't work for me semantically.

Then I know from David M's many posts here and on other language-oriented blogs, there would indeed be great mental exertion behind his posts. (I say "behind", because often the post is the tip of the iceberg, and the stupid/unlearned like me have to draw out the detailed reasoning with follow-up questions to expose what was 'blimmin obvious' to David.) And so it indeed proved after David's "Phew!" comment, with several long critiques and explanations.

Another case in point up-thread would be the wikipedia article on Pumpokol. It never occurred to me that somebody would go to the bother of creating an article on such a recondite subject, with comprehensive citations, purely to see on the internet some theory they yanked out of thin air. Particularly when it would have been much easier to translate the Russian language article on the subject (which includes no such 'theory'). But David M immediately saw what was going on, and could refute the 'evidence' in detail.
Philip Taylor said,

February 3, 2019 @ 7:26 am

VHM ("In your penultimate paragraph, you say "phew" belongs with a group of exclamations that indicate revulsion"). That is not how I read it. The original reads: "Phew!" is also one of an interesting group, either exclamations or natural noises made by part of the face, that have a final u-w-v-f or f-equivalent: "Ow!," "Ugh," Latin "heu!," Italian "Uf!," the noise of revulsion that made it into print in recent years as "Eww," plus huff, puff, laugh, sniff, and cough.

My reading of this is that only "Eww" (the noise of revulsion that made it into print in recent years as "Eww") is associated with revulsion, all the others (as well as "Eww") being simply either exclamations or natural noises made by part of the face, that have a final u-w-v-f or f-equivalent.
Victor Mair said,

February 3, 2019 @ 8:10 am

I've consulted numerous dictionaries, and most of them say that "phew" can convey disgust. That's certainly how it's used in the part of Ohio where I came from.

E.g.:

Merriam-Webster

Definition of phew. 1 —used to express relief or fatigue. 2 —used to express disgust at or as if at an unpleasant odor.

Wiktionary

Used to show relief, fatigue, or surprise.

Phew, that took a long time to define!

Used to show disgust.

Phew, it stinks in here!

Oxford

Expressing a strong reaction of relief, or of disgust at a smell.
‘phew, what a year!’

Etc., etc.
Victor Mair said,

February 3, 2019 @ 8:12 am

Attitude

My college basketball coach, Doggie Julian, had an unusual argot. He would say things like:

"All right, you guys, move out of Missouri".

"You're a bunch of fut cows."

"Stay off the cundiments".

We didn't always fully understand everything Doggie said, but there's one word he used that we knew exactly what he meant: "AT-TEE-TYOOD".

If you wanted to accomplish anything, succeed in anything, you had to have the right attitude.

It's the same way with scholarly endeavors: if you want to make progress, you need to have the right attitude toward yourself and toward others.

A gold prospector has to dig through a lot of dirt and stone to find the nuggets of that precious metal that he is after. The miner has to sift through layers of earth to discover those shining gems of great worth that he is seeking.

There are wonderful treasures in the materials that Chau Wu is presenting to us. We don't know how this is all going to play out with regard to Celtic, Gothic, Tocharian, Balto-Slavic, Indo-Iranian, etc., but you can be sure that I will continue to pay close, respectful attention to Chau Wu's linguistic and ethnographic excavations.

As Doggie might say, "Listen, you guys, don't throw the jewels out with the sluice water!"
AntC said,

February 3, 2019 @ 9:19 am

I did lay out my new hypothesis, and it included an explanation of what all this has to do with the Tocharians, even though the ten character puzzle itself does not seem to be in Tocharian. Better to read through the o.p. and all the comments closely and carefully

7. It would not be surprising if the Tocharians had some sort of association with the Celts and / or the Goths.

Ok. That would explain why Fotudeng was in the area. That is, if the Jié were Celts/Goths.

The first word of our 10-character riddle leaves us with something plausibly close to Gaulish.

Hmm. Matching one word out of 10 is only a suggestive starter. Is it legitimate to lump together Gaulish/Celts/Goths?

the [first word's putative] form itself seems to have no known cognates outside of Celtic and Balto-Slavic.

Then not Gothic? Not Germanic? Gaulish/Celtic is not Germanic(?) Then whence the Germanic loans into Sinitic? (If that's what they are.) All of the putative sound correspondences in the o.p. are irrelevant, as is the paper from Chang Tsung-tung.

If I count up the evidence here: we have the physical appearance of the Jié, which seems more (northern) European than northern Asiatic or Indo/Iranian. Is there any DNA evidence from gravesites in the region? (Other than Tocharian sites.)

And we have one word in an IE European language. (But at the time rather more mid/southern European.)
Victor Mair said,

February 3, 2019 @ 9:37 am

We'll keep digging. This is just a start — a new beginning.
Victor Mair said,

February 3, 2019 @ 11:52 am

In my hometown, "phew" has both main usages, viz., to express relief or revulsion, but the latter is primary, since "pee-ew / pee-yew / peeyew / P U / P.U. / peeyoo", which is very common when reacting to foul smells, is considered a drawn-out version of "phew":

Pronunciation

IPA: /piː(j)uː/

Interjection

pee-ew

An exclamation of disgust

1947, Leslie Waller, Show Me the Way, page 50:

"Pee-ew," he exclaimed. "She fouled up the whole shirt with that perfume of hers. I can smell it all over."

1997, Lavyrle Spencer, Small Town Girl:

“Pee-ew, girl do you stink!” Renee said. “Go take those boots off!”

2007, Carolyn Keene, The Halloween Hoax:

"Pee-ew!" Nancy said, squeezing her nose.

2008, Kristin Hannah, Firefly Lane, page 44:

The sudden quiet brought Sean and the dog running into the room, tripping over each other. "Katie looks like a skunk," Sean said. "Pee-ew."

2009, Janna Cawrse Esarey, The Motion of the Ocean, page 163:

…when I get a whiff of precisely where this scent is coming from: my sweaty (and slightly hairy) armpit. Pee-ew.

2010, Lissa Rankin, What's Up Down There?, page 318:

After sniffing her toilet bowl (hey—anything in the name of science!), I said, "Pee-ew!"

2011, Jodi Lundgren, Leap, page 73:

"Pee-ew, smells like moth balls in here," Sasha said over her shoulder.

https://en.wiktionary.org/wiki/pee-ew
Alexander VOVIN said,

February 3, 2019 @ 3:03 pm

With all due respect to everybody, let me make several remarks.
1. While I value history, anthropology and archeology, they cannot supplant historical linguistics in determining genetic affiliations of languages. As one of the greatest historical linguists of our time, Lyle Campbell, has always said: "Languages should speak for themselves" :-).
I fail to see how deep eye sockets, big noses, and bushy beards of Jie could possibly tell us anything at all about their language and its genetic affiliation. Incidentally, these physical anthropological characteristics are shared by Anatolian Turks, who happen to speak a language belonging to the same linguistic family as Tuvan, whose speakers lack all these features.
2. Once we allow the irregularity in correspondences and unaccounted for segments, it becomes possible to prove anything, like Eskimo-Koisan, or whatever one wants.
3. Different languages and language families develop at different speeds and exhibit different degrees of language change and restructuring. Moreover, this speed is not constant even in the history of an individual language. Some languages can go berserk for a short time, e.g. proto-Korean and likely Old Korean as well used to have just one series of stops p, t, c, k, but within just 200 or 300 years two new series, aspirated and tense (so now we have p, ph, pp), appeared, and various vowel and consonant losses took place as well. To illustrate the point, Modern Korean has tti 'belt' < Middle Korean stïy, but ultimately in Old Korean the word was sitïli. On the other hand, different languages may exist in the "refrigerated" state for a long period of time. Modern Finnish is not that different from proto-Balto-Finnic, and as far as we can judge the time span involved is the same as, if not greater than, that between Jie and Pumpokol. Finally, paradigms normally change much more slowly than a lexicon.
4. Any genetic relationship hypothesis is falsifiable by the demonstration of either a) that the data it is built upon are corrupted or false; or b) that it does not stand up to scrutiny from the viewpoint of the comparative method (i.e., correspondences are irregular and there are unaccounted-for segments), or by both a) and b). It cannot be falsified by slogans like "made out of thin air" without any evidence to the contrary presented.
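
Point 2 can be illustrated with a small simulation. The sketch below is a toy model under stated assumptions, not a description of anyone's actual method: words are random consonant-vowel strings, the sound classes in `CLASSES` are invented for the example, and "loose" matching means comparing only the first consonant, by class, against any of several unrelated source lists. All names (`random_word`, `loose_match`, `chance_rate`, `simulate`) are hypothetical.

```python
import random

CONSONANTS = "ptkbdgmnszlrh"
VOWELS = "aeiou"

# Assumed coarse classes: sounds in the same group count as "the same"
# once irregular correspondences are tolerated.
CLASSES = {
    c: i
    for i, group in enumerate(["pb", "td", "kg", "mn", "sz", "lr", "h"])
    for c in group
}


def random_word(rng, syllables=2):
    """A random CVCV word with no relation to any other word."""
    return "".join(
        rng.choice(CONSONANTS) + rng.choice(VOWELS) for _ in range(syllables)
    )


def loose_match(a, b):
    """Match on the class of the first consonant only; the rest is ignored."""
    return CLASSES[a[0]] == CLASSES[b[0]]


def chance_rate(n_source_langs):
    """Analytic probability that at least one source list yields a loose match."""
    sizes = {}
    for cls in CLASSES.values():
        sizes[cls] = sizes.get(cls, 0) + 1
    p_single = sum((n / len(CONSONANTS)) ** 2 for n in sizes.values())
    return 1 - (1 - p_single) ** n_source_langs


def simulate(n_meanings=100, n_source_langs=5, seed=0):
    """Count strict and loose 'cognates' between entirely random word lists."""
    rng = random.Random(seed)
    target = [random_word(rng) for _ in range(n_meanings)]
    sources = [
        [random_word(rng) for _ in range(n_meanings)]
        for _ in range(n_source_langs)
    ]
    # Strict: the whole word must be identical in one fixed source list.
    strict = sum(1 for i, w in enumerate(target) if sources[0][i] == w)
    # Loose: any source list will do, and only the first consonant class counts.
    loose = sum(
        1
        for i, w in enumerate(target)
        if any(loose_match(w, src[i]) for src in sources)
    )
    return strict, loose
```

Under these assumptions a single loose comparison succeeds by chance with probability 25/169, about 15%, and with five source lists to choose from the rate rises to about 55%, although the lists share nothing by construction. The strict criterion, by contrast, almost never fires on random data, which is why regular correspondences with every segment accounted for carry evidential weight and loose look-alikes do not.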
Victor Mair said,

February 3, 2019 @ 3:56 pm

Thanks for joining in the conversation, Sasha.

"I fail to see how deep eye sockets, big noses, and bushy beards of Jie could possibly tell us anything at all about their language…".

Well, in combination with evidence from history, archeology, genetics, culture, anthropology, ethnology, philology, and many other fields, they do tell us a lot about identity, movements, and nature of various peoples. Clearly, though, when we're studying language, we should place primary emphasis on language itself, and that is why, in confronting the tantalizing ten character conundrum that is the focus of this post, we are trying to make sense of it linguistically, but in a way that does not clash with other types of evidence.

As for Lyle Campbell, I lost a lot of respect for him and his values when he declared that Joseph Greenberg's American language classification "should be shouted down". No matter what anyone may think of Greenberg's methods, the fact remains that he made enormous contributions to linguistics, including fundamental findings about African languages. Incidentally, when I was working in the Stanford library years ago, I was astonished to come upon Greenberg's carrel in a very public part of the building — first floor near the checkout counter and front door. He had mountains of papers and note cards on his desk, all the massive amount of handwritten data that he relied on to formulate his comparisons. Every day, hundreds of people walked by his desk. He must have been a trusting man, not one who deserved to be "shouted down" for his painstaking linguistic labors.
AG said,

February 3, 2019 @ 7:01 pm

This might be a very naive question, but since Kumarajiva etc. have come up – have all 10 characters been checked to see if they were ever used in Chinese translations of Indian Buddhist terms, names, etc.? Wouldn't those comparisons give an almost exactly contemporary picture of how Chinese dealt with Indo-European multisyllabic words?
AntC said,

February 3, 2019 @ 7:10 pm

2. Once we allow the irregularity in correspondences and unaccounted for segments, it becomes possible to prove anything,

Thank you Professor Vovin. Very wise counsel.

The "made out of thin air" comment was from me. To avoid any misunderstanding: I was speaking not about anybody's contributions on this thread, but about an article on wikipedia (on Pumpokol, as it happens). I think everybody here has been meticulous/painstaking and methodical in their derivations, and in providing grounds for their suggestions.

I have been critiquing some of those derivations as failing on exactly your point 2.

"Languages should speak for themselves" is a great ideal. Unfortunately, the languages we're considering here went mute long ago; so what 'speaking' we have is unreliable and open to wildly different interpretation. That would apply also for your (with co-authors) Yeniseian-family hypothesis.
AntC said,

February 4, 2019 @ 4:59 am

We'll keep digging. This is just a start …

the form itself [of "a Gaulish word *slugi 'army'"] seems to have no known cognates outside of Celtic and Balto-Slavic.

Wait (1). There is (or likely to be) a Balto-Slavic cognate of *slugi ? I know our time period is rather before the great expansion of Slavic, but where and when were the Balts? Were they (for example) in higher latitudes/on the borders of Europe?

I'm thinking of the deep-set eyes and bushy beards business; and remembering the Bogatyrs of the Great Gate of Kiev/Pictures at an Exhibition. Granted Mussorgsky/Hartmann were much later; the epic poems (Bylinas) date back to ~800 AD; the myths must have come from somewhere earlier. The typical depictions of Bogatyrs make them dead ringers for Prof Mair's sword-bearing dignitaries.

7. It would not be surprising if the Tocharians had some sort of association with the … Goths.

Wait (2). The Goths were in higher latitude Europe at the right time (and had been expanding from there since ~750 BC). This was before the expansion of the Slavs from the south/east, so they would have reasonable access along the Eurasian littoral(?).

They also fit the deep-set eyes business. (The dark eye make-up, and garish lipstick might have come later ;-)

An advantage here is that the Gothic language is well-attested. Plenty of vocab. Does that improve our chances of finding sense-alikes for the translation of the 10 characters?

Can we particularise Chang Tsung-tung's approach, to take Gothic stems? (One problem I see is that Gothic seems to have rather a lot of inflection/polysyllabism.)

This seems a more closely testable hypothesis. OTOH it might be like the guy searching for his lost keys under the street lamp because it's easier to see.
AntC said,

February 4, 2019 @ 5:24 am

The reconstructed Proto-Slavic language features several apparent borrowed words from East Germanic (presumably Gothic), such as xlěbъ, "bread", vs. Gothic hlaifs. [wikipedia on Gothic language — or should I take that claim with a pinch of salt, as per my unfortunate Pumpokol experience?].

I remember Хлеб-соль/ bread and salt from an ice-cream advert at the Sun and Moon lake in Taiwan: Prof Mair posted my photo and a discussion on the Log.

Proto-Slavic dated to 5th to 9th centuries AD. The Goths were goneburgers after about 800.

A bunch of dratted linguists think "it is time to abandon Iordanes' classic view that the Goths came from Scandinavia." Rather: from close to present-day Austria. Gnaa! Can't we get anything nailed down? Perhaps David M could advise on the current consensus.
Victor Mair said,

February 4, 2019 @ 7:24 am

When Ran Min ordered his soldiers to go out and massacre all those with big noses and bushy beards, i.e., overwhelmingly the Jie, the probability that people with those physical characteristics would have spoken Jie language would have been greater than for those who did not possess those physical characteristics.
AntC said,

February 4, 2019 @ 4:29 pm

2. Once we allow the irregularity in correspondences and unaccounted for segments, it becomes possible to prove anything, like Eskimo-Koisan, or whatever one wants.

Be careful what you wish for; and almost.

Igbo-Eskimo I can do, via Basque — which is nearly identical to Igbo; as would be obvious if not for the conspiracy of linguists trying to pull the wool over our eyes with the 'comparative method'.

The Eskimoan comes from Basque being the pidgin for fur traders/whalers in the North Atlantic.

Chomsky says the Language Faculty is innate, therefore the world must have started off speaking one language, in the heart of Africa (south-east Nigeria).

geneticists blundering into language typology using genome evolution models.

Yep, of course that Atkinson et al research gets cited. An absolute gift to crackpots.

A word of caution for readers before you follow those links. Prepare a bolster to put between head and wall, for banging against. Prepare a nice chewable carpet. And ration yourself to small doses with time in between to go lie in a darkened room.
DYHH said,

February 4, 2019 @ 7:17 pm

It's also intriguing to me that both Chau Wu and Tsung-tung Chang are Taiwanese… Perhaps their deep familiarity with Taiwanese permitted them to make breakthroughs in the conceptualization of the development of Sinitic that others who relied only on theoretical reconstructions could not achieve.

I was reluctant to join this conversation because of the dismissive way in which skeptics of Chau Wu's work were treated in the comments, but I would like to address this anyway. Familiarity with Taiwanese (and not, say, other Min dialects or other southern Sinitic languages) can also lead to certain pitfalls. As a Taiwanese-American whose parents speak Taiwanese and Hakka natively, I was interested to see what sound correspondences I would find in the SPP articles mentioned above. I was disappointed. Just as an example: Chau Wu states OE gōs > Tw gô 鵝 (l.) and ON gás > Tw giâ 鵝 (v.). However, Southern Min [g-] comes from a velar nasal initial [ng-] which remains intact in other dialects of Min (and Hakka and Cantonese). I also saw little or no attempt to match voiced/voiceless/aspirated initial consonants (understandable, given the writer's knowledge of Taiwanese and Mandarin, which have lost initial voicing on MC obstruents). AntC has already pointed out the severe problems with allowing irregular sound changes from whatever IE languages come to mind, ignoring mismatches in space/time. On the Sinitic topolect side, it's also quite disappointing to see all other Min topolects overlooked, to say nothing of Wu, Cantonese, and Hakka, which (if Old Chinese reconstructions are truly 100% untrustworthy and to be thrown out) are more useful in determining initial voicing (Wu) and final consonants (Cantonese and Hakka in some cases).

All of this is information Prof. Mair, with his wealth of knowledge and years of experience in Sinology, surely knows, and far better than I do. Claiming Taiwanese (but, oddly, not other Min topolects?) is "the most conservative" is incorrect; although proto-Min diverged from all other Sinitic varieties before MC and thus provides additional information on how Old Chinese may have sounded, the modern Min topolects have undergone many sound changes which obscure these older distinctions unless they (as in all of them together, not just one variety of southern Min) are carefully examined. Reading characters in modern Taiwanese to find superficial matches in any of Germanic, Greek, or Latin, while ignoring existing Sinitic cognates with well-documented, regular sound correspondences (because they sound less similar to the IE root? or because the author is unaware of them?), is simply not a valid methodology. I'd normally be delighted to see my mother's native tongue being discussed on LL, but it's sad to see the classic "my topolect is more conservative than Mandarin/other Sinitic branches" elevated into a complete disregard for the information available in other southern Sinitic topolects, in a semi-serious study no less.
Chris Button said,

February 5, 2019 @ 12:30 am

@ AntC

Furthermore, four in Germanic is something of an anomaly within IE, so a strong indicator of Gmc and precisely not IE-in-general. From resources I looked at, there's no good explanation how PIE *kwetwer- got to Proto-Gmc *fidwōr; a vague "Watkins explains the -f- as being from the following number (Modern English five)".

Actually Watkins' proposal is not as strange as it seems. Numerals do tend to contaminate each other in "runs" which can also leave weird relics. In fact, 四 is a great example since it has two Middle Chinese reflexes of siʰ and sit. The latter is often seen as "irregular", but it is phonologically entirely reasonable and makes perfect sense from a broader Tibeto-Burman perspective where the hardening of *s > t may be found in the numbers "two" and "seven" although Chinese only attests the hardening in the latter without variation.
AntC said,

February 5, 2019 @ 1:33 am

Numerals do tend to contaminate each other in "runs" which can also leave weird relics.

Thanks Chris. I wasn't trying to raise questions about PGmc/PIE, I was taking that as secure reconstruction.

The bigger question was: if EMC borrowed from Germanic the "run" *fedwores-*fimfe (and ref recent discussion, the Gothic has them both starting not only with the same consonant, but the same vowel following); and all branches of Germanic have retained at least leading f- for both; then how did they end up so very different in SM/Tw? (Or indeed in MSM.) And remember Chau Wu was drawing on 'four' as one of many examples to show leading f- morphed to s-. (That was f- from Latin, as well as from Germanic.)

In the paper, the chains of derivations for the two number words are given without explanation or reference to any other patterns. Both morph f- to *h- or *hs-, but then four goes to s- whereas five goes to g- (literary/superstrate ng-).

The justifications wrt other derivations are along the lines: SM/Tw "doesn't do that". That is, doesn't allow some particular sequence of phonemes. (OK, I get that: a syllabic language with open syllables wouldn't cope well with trailing consonants, let alone consonant clusters. Except: 1, Prof Mair is always telling us about Chinese words that are bisyllabic; 2, it's generally thought that early Chinese did have final consonants. So now Chau Wu seems committed to a hypothesis about when number words came into Chinese — i.e. after the tonal system was in place, which seems to me very late, and for example too late for our blasted puzzle.)

And the bigger bigger question (sorry I'm so slow to get there: this is very unfamiliar territory for me): if these words are getting borrowed into a language with a very different sound pattern, which is so conservative that sound-alike segments get bent willy-nilly; then wouldn't that language already have number-words? IOW it would seem to reject a hypothesis that Chinese has an IE substrate.

That's why I asked if there are examples of (mature – with cities and administration) languages borrowing a complete set of number words. That is very different from a few number words seeming "irregular".

I also asked if character 四 (or its recognisable predecessors) has always been used for four; I might also ask if four is always reconstructed to start s-. I'm bending over backwards to accommodate Chau Wu's claim it started f- (and morphed through *h- to *hs-).

Again disclaimer: IANAL I Am Not a Linguist. I'm only watching and learning the comparative method.
Alexander Vovin said,

February 7, 2019 @ 8:33 pm

Lyle Campbell and myself were colleagues at the U of Hawai'i for several years. It is difficult to imagine that 'to shout' comes from someone as soft-mannered as him. Occasionally, we all might use a wrong word at a wrong place and time. Had he said 'criticize', no questions would have been asked. Although I must admit that I find Greenberg's "Amerind" theory untenable for many reasons, and incidentally, I believe that at least some parts of his African classification came under heavy fire recently, although I am not a specialist in either American or African languages.

Leaving the problem of human origins aside for a moment, I have always thought that a monogenesis of the human language is an act of faith: it cannot be proven scientifically, and the same applies to a polygenesis. While we might not have a clear picture of dead languages' phonetics, with some limitations, we do have a rather clear idea of their phonology, and it is the phonology, not the phonetics, that is relevant for the language comparison. It is in this sense that the dead languages "speak".
