Language Log

Penglin Wang’s response to David Marjanović’s comments

February 21, 2019 @ 8:26 am · Filed by Victor Mair under Borrowing, Phonetics and phonology, Transcription

(The following is a guest post by Penglin Wang.)

Thanks to Professor Victor Mair’s organization of a series of informative postings, which share expertise in areas in which I do not often get a chance to participate, I was happy to contribute material with which I am familiar. As I have a heavy teaching load of 13-15 hours per week plus other inevitable undertakings in the fall and winter quarters, I have no choice but to refrain from allocating time to extracurricular activities. By taking advantage of this relatively long weekend I went through the previous discussions and found my posting about the diffusion of the Germanic word for ‘hart’ in Tungusic and Mongolic ("Of reindeer and Old Sinitic reconstructions" [12/23/18]) commented on by David Marjanović (DM) and mentioned by some other esteemed colleagues. I wish to thank those of you who opined about my posting. In response to David Marjanović I have drafted the following notes.

People moved around and encountered varying occasions to hear other people speaking. It was unlikely that people obtained foreign words through mouth-to-mouth teaching and imitation. Thus, discrepancies and inconsistencies have been popping up from time to time in interlanguage phonology. Numerous instances of crosslinguistic borrowings do not fit into what DM argues for. English terylene was introduced into Chinese as dilun (滌綸) even though the Chinese speakers have no difficulty pronouncing it as terilin (特日林) to get it much closer to English terylene. Can anyone deny the connection between these two words by asking why tery of English terylene became represented as di of Chinese dilun? Hardly! Likewise, English ginseng was ultimately from Mandarin Chinese renshen (人參, Hokkien yinsim, Cantonese yansam). There is every reason to believe that English speakers can successfully articulate Hokkien yinsim, Cantonese yansam, or Mandarin renshen, but they came up with ginseng. This was the way it was in the domain of interlanguage phonology. I do not know if anyone could answer the question: Why would this happen? For one thing, we do not know who was the first to have pronounced the word ginseng and introduced it into English. Hence, we have no way to investigate why he or she did so.

DM: Why would this happen? Manchu doesn’t have a problem with /rd/, does it?

PW (Penglin Wang): The Manchu and Mongolian phonotactics of the postvocalic liquid r/l can in no way preclude the liquid from being rendered as a nasal, especially before an obstruent consonant. In other words, such combinations as /rd/ cannot be perpetuated. Both liquid and nasal are sonorant consonants, facilitating the direction of phonetic change from the former to the latter. Parallel to the postvocalic liquid is the syllable pattern containing postvocalic nasal in both languages. Assume that the Germanic word for ‘hart’ was in the process of diffusion in Inner Asia, and then the local speakers heard it as if it was pronounced kand-/hand-.

DM: Why would [h] be represented by [qʰ], instead of simply by [h], which Middle Mongolian also had to offer? In most of today’s Mongolic languages, this [h] has disappeared, and [qʰ] has become [χ] or even [h], but that was not yet the case in the times of Genghis Khan, who was not a [χɑːn] yet, but a [ˈqʰɑɦɑn].

PW: Instead of the sound [qʰ] (sic), the grapheme q representing a velar stop in Written Mongolian (WMo) was indeed pronounced h in many words recorded in the Sino-Mongolian glossaries belonging to the Middle Mongolian (MMo) period, which covered the times of Chinggis Khan. In the case of WMo qağan, it manifested itself either as hahan (哈罕) or ha’an (哈案) or he’an (合安) in the Sino-Mongolian glossaries and, moreover, was transliterated as hehan (^中合罕) or han (^中罕) in Menggu mishi (蒙古秘史, The Secret History of the Mongols). The MMo h– developed from the attested grapheme q mostly, if not all, remains in all the Mongolic languages spoken today. On the other hand, the MMo h– derived from the *k or *q has its reflections in the Western Mongolic languages (Monguor, Baoan, Dongxiang, Eastern Yugur, and Kangjia) and Dagur as an Eastern Mongolic language.

Corresponding to WMo qandağai and Manchu kandagan are Dagur handəğ and Buriat handagai bearing the word-initial h-. Crosslinguistic phonology in the process of vocabulary diffusion from language to language exhibits irregular phonetic realizations, which cannot be construed in terms of whatever ideal correspondences. In my framework, I have tried to explore and discover a couple of sound correspondences running in a number of words in Altaic and beyond, including n–l correspondence (WMo nağur ‘lake’ – Latin lacus ‘pond, pool, lake,’ OE lacu ‘pool, pond,’ lagu ‘water’), rhotacism or lambdacism in the postvocalic position (Latin lacus – WMo nağur), and development from a postvocalic liquid to a nasal (WMo hülde– ‘to chase, hunt’ – OE huntian ‘chase game’, English hunt) (Wang 1992, 2000, 2001, 2015).

DM: Why would eo – which was just what it looks like: [ɛɔ̯] or thereabouts – be represented by [ɑ]? Why not by [ə] or [i] or [o], which Middle Mongolian also had to offer?

DM: And if we take your proposal less literally, so as to lessen the severity of 1) and 2), fresh new problems arise. The Proto-West-Germanic ancestor of heort was *herut, which looks even less like *qarta-. You can still see this from the fact that Old English eo comes from *e…u. In High German, BTW, this same sequence first became i…u, which is indeed found in the Old High German form hiruz (where z stands for /s/, the regular outcome of word-final */t/ after a vowel). Later in the history of German, the u dropped out as usual, and /rs/ became /rʃ/ by another regular change, giving the modern form Hirsch.

PW: Whatever happened to Germanic, the net result is that English hart existed in the past and exists in modern times. It is impossible for me to tell when the Manchu kandagan and WMo qandağai came about, since I am unable to locate them in the available material dating back to the middle stage of Manchu and Mongolian. But the ethnonyms Qarta’at (-t as a plural suffix) and Qardakidai were attested in Menggu mishi. Certain sound changes may converge in different languages. To the best of my knowledge, since Manchu and Mongolian are unlikely to have had the diphthong eo occurring in OE heort ~ heorot, their speakers felt it convenient to shift to a. As an alternative scenario, we cannot rule out the possibility that the Manchu and Mongolian words were adopted from the form hart when it existed somewhere in Inner Asia, perhaps in Tokharian or in the speech of the Yellow-Headed Shiwei (黃頭室韋) who had been a sizeable ethnic group in Manchuria during the seventh-twelfth centuries (Wang 2018:187-199).

To conclude, trying to investigate any linguistic connections resulting from vocabulary diffusion in language contact situations based solely on sound correspondences and geography can lead us astray. A connection could take place at ancient times when demographic and ethnographic information on who was who was uncertain. Human legs were not entirely controlled by the law of universal gravitation and instead were extended by pack animals or animal-driven vehicles that were able to take them elsewhere, leaving their ‘footprints’ for us to scrutinize. If William Jones had been bound by the geographic distance between India and Europe, he would not have been able to find out systematic similarities for his construction of the Indo-European family. A subsistence pattern is also a factor to consider. Inner Asia was characterized by nomadism, often with no permanent settlement, which allowed herders to move well beyond their previous places.

References

Wang, Penglin. 1992. On the origin of the Middle Mongolian initial h- and the motivation for its loss. Archív Orientální 60.4:389-408.

Wang, Penglin. 2000. Lexical connections between Germanic and Mongolic. Interdisciplinary Journal for Germanic Linguistics and Semiotic Analysis 5.1:71-91.

Wang, Penglin. 2001. The correspondence between Old English l and Mongolic n. Altaic Affinities, edited by David B. Honey and David C. Wright, 209-224. Bloomington: Indiana University Research Institute for Inner Asian Studies.

Wang, Penglin. 2015. Number Conception and Application. New York: The Nova Science Publishers.

Wang, Penglin. 2018. Linguistic Mysteries of Ethnonyms in Inner Asia. Lanham: Lexington Books.

70 Comments

Victor Mair said,

February 21, 2019 @ 9:55 am

Speaking of nomadism, we need to think about the concept of "Wanderwort" in all of this. How did they happen in reality?

=====

A Wanderwort (German: [ˈvandɐˌvɔʁt], 'wandering word', plural Wanderwörter; capitalized like all German nouns) is a word that has spread as a loanword among numerous languages and cultures, especially those that are far away from one another [VHM: emphasis added], usually in connection with trade. As such, Wanderwörter are a curiosity in historical linguistics and sociolinguistics within a wider study of language contact. At a sufficient time depth, it can be very difficult to establish in which language or language family it originated and in which it was borrowed.

Examples

Typical examples of Wanderwörter are sugar, ginger, copper, silver, cumin, mint, and wine, some of which can be traced back to Bronze Age trade.

Tea, with its maritime variant tea and Eurasian continental variant chai (both variants have entered English), is an example whose spread occurred relatively late in human history and is therefore fairly well understood: tea is from Hokkien, specifically Amoy, from the Fujianese port of Xiamen, hence maritime, while cha (whence chai) is used in Cantonese and Mandarin. See etymology of tea for further details.

Farang, a term derived from the ethnonym Frank through Arabic and Persian, refers to (typically white, European) foreigners. From the above two languages, the word has been loaned into many languages spoken on or near the Indian Ocean, including Hindi, Thai, and Amharic, among others.

Another example is orange, which originated in a Dravidian language (likely Telugu or Malayalam), and whose likely path to English included, in order, Sanskrit, Persian, possibly Armenian, Arabic, Late Latin, Italian, and Old French. (See Orange (word) § Etymology for further details.)

The word arslan (lion), of Turkic origin, has variants that are now widely distributed from Hungarian and Manchu to Persian, although they serve merely as personal names in some languages; it is used as Aslan in the English novel series The Chronicles of Narnia.

Some ancient loanwords are connected with the spread of writing systems; an example would be Sumerian musar 'written name, inscription', Akkadian musarum 'document, seal', apparently loaned to Proto-Indo-Iranian *mudra- 'seal' (Middle Persian muhr, Sanskrit mudrā). Some even older (late Neolithic) Wanderwörter have been suggested, e.g. Sumerian balag, Akkadian pilakku-, or Proto-Indo-European pelek'u- 'axe'. However, Akkadian pilakku- really means 'spindle', and Sumerian balag is properly transcribed balaĝ (ĝ stands for [ŋ]), meaning 'a large drum or harp', and was borrowed into Akkadian as balangu-.

=====

Source

See also "Some Wanderwörter in Indo-European languages" (1/16/09) and the many other posts on this subject cited therein.

Also here:

"The Linguistic Diversity of Aboriginal Europe" (1/6/09)

"Don Ringe ties up some loose ends" (2/20/09)
PeterL said,

February 21, 2019 @ 11:50 am

Isn't "ginseng" from the Korean pronunciation, which would be a borrowing from Middle Chinese? (In Japanese it's "nin-jin" (にんじん), presumably from a different Middle Chinese topolect; the same word can also mean "carrot".)

[This is similar to the English pronunciation of "Paris" or "Calais" despite being quite able to pronounce "pa-ree" or "ca-lay" … it's French that has changed while English has maintained the older pronunciation.]
AntC said,

February 21, 2019 @ 3:48 pm

'Ginseng' taken from Mandarin/Beijing dialect, according to sources I could find. First attested use in English 1654. I've always imagined it's confused with ginger — which is a much older word/Indo-Aryan. I'm not seeing how 'ginseng' would come via Korean(?)

[This is similar to the English pronunciation of "Paris" or "Calais" despite being quite able to pronounce "pa-ree" or "ca-lay" … it's French that has changed while English has maintained the older pronunciation.]

This (now "older") pronouncer of Br.English has never said anything other than 'ca-lay', nor heard anybody pronounce an /s/ at the end. (Yes I do sound the /s/ at the end of "Paris".)

I'm pointing those out not to be pernickety but in response to:

trying to investigate any linguistic connections resulting from vocabulary diffusion in language contact situations based solely on sound correspondences and geography can lead us astray. [PW]

At a sufficient time depth, it can be very difficult to establish in which language or language family it [a Wanderwort] originated and in which it was borrowed. [VHM]

Thank you, Penglin Wang and Prof Mair. If we don't have evidence of all of regular sound correspondences and geographic contact of speakers and transmission of a word with its referent, the null hypothesis must be that speakers just made up the word (we know that happens) or took it from a language now lost; and that the sound-alike/sense-alike is just coincidence. 'Dog' in English is a mystery. We can be reasonably certain it's not cognate with 'dog' (with the same meaning) in that Australian language.

Yes "A connection could take place at ancient times when demographic and ethnographic information on who was who was uncertain." Then what's holding anybody back from saying all languages/all words are derived from some single 'language of Adam'/proto-world?

The point is "William Jones … [was] … able to find out systematic similarities for his construction of the Indo-European family." Systematic based on overwhelming numbers of reconstructions.

Increasingly, the reconstructions Prof Mair has been posting on behalf of others are of single words needing explanations in terms of sound changes that are phonologically plausible but not systematically correlated with overwhelming numbers of parallels. Whilst one word might demonstrate an unusual sound change at one step, it's a stretch that that same word should undergo a whole series of unusual/idiosyncratic changes, especially when those intermediate steps are not well-attested.

To this non-philologist looking on, it is so lacking in systematics, I can't see it as different from interpreting ink blots. There seems an ever-present danger of hearing a sound-alike/sense-alike where there is none.
Victor Mair said,

February 21, 2019 @ 4:38 pm

Not when it is backed up by archeological, anthropological, art historical, and historical evidence, of which there is much more to come.
Michael Watts said,

February 21, 2019 @ 5:21 pm

I never saw "ginseng" as an implausible derivation from Mandarin 人参 renshen. The "r" indicates, in the modern day, a sound anywhere along the continuum between an American [ɹ] and a [ʒ]. For reasons that aren't clear to me, many English speakers seem to be uncomfortable beginning words with /ʒ/, preferring /dʒ/.

I have no idea what version of the Chinese word was borrowed to become the English word, but every part of renshen corresponds to a part of "ginseng" in a fairly obvious way. (etymonline claims that "ginseng" derives "from Chinese jen-shen", which would be the Wade-Giles spelling of the Mandarin word.)
Penglin Wang said,

February 21, 2019 @ 5:36 pm

What I have been advocating is not based on an isolated phenomenon. Concerning the n-l correspondence, I presented a number of supporting examples out of Mongolic and Germanic in my 1992 article, some of which are as follows:

WMo nağad- ‘to play’
OE lācan ‘to play’

WMo nüken, Dagur nuğ, Monguor noko ‘hole’
OE luh ‘loch, pond’, OHG luh, German loch ‘hole’

WMo nabci (*nab-ci) ‘leaf’
OE lēaf, ME lefe ‘leaf’. German laub ‘leaf’ (< PIE base *leubh-)

WMo nigen ‘one, the same’, MMo niken ‘one’
OE -līc ‘like, alike, similar’. Old Frisian līke ‘like’

MMo hinege-, Dagur hinəd- ‘to laugh, smile’
OE hliehhan ‘to laugh’. Latin hilarō ‘to cheer, gladden’

WMo nilbu-sun (l < *s, lambdacism) ‘spittle’
OE lyswen ‘purulent, pus’. Tokharian AB leśp ‘phlegm’

WMo nil-ka (l < *t, lambdacism) ‘small.’ Dagur nialk ‘the youngest (son, etc.)’
OE līt ‘little’

I found one typo in my posting: “WMo hülde- ‘to chase, hunt” should be “MMo hülde- ‘to chase”.

In addition, the Mandarin Chinese r pronounces [ʐ].
Philip Taylor said,

February 21, 2019 @ 5:39 pm

For me, too, the relevance of "Calais" was completely unclear. Michael Watts' explanation of 人参 (rénshēn) -> "ginseng", on the other hand, makes perfect sense to me.
Scott P. said,

February 21, 2019 @ 5:55 pm

"This is similar to the English pronunciation of "Paris" or "Calais" despite being quite able to pronounce "pa-ree" or "ca-lay""

Wait — do the English not pronounce Calais as "Ca-lay"? That's how Americans pronounce it.
Andrew (not the same one) said,

February 21, 2019 @ 6:43 pm

The English also say 'Ca-lay'.

I believe that at one time – including the time when Calais belonged to England – they did not. Indeed, I've heard that it was sometimes written 'Callice'. But they do now.
Jerry Friedman said,

February 21, 2019 @ 7:00 pm

Since I'm sure everyone has been waiting for the OED's etymology for "ginseng", here you go. Though even if it makes the English form explicable (as it seems to me), there may still be inexplicable borrowings.

Origin: Of multiple origins. Partly a borrowing from Latin. Partly a borrowing from French. Etymons: Latin ginsem; French ginseng.
Etymology: Partly (i) < post-classical Latin ginsem, ginseng (both uninflected: see note),

and partly (ii) < French ginseng (see note),

both < Chinese (Hokkien) jîn-sim and its Mandarin equivalent rénshēn (with a 17th-cent. pronunciation) < rén person (probably on account of the forked shape of the root, resembling the legs of a person and also the Chinese character for this word) + -shēn, denoting ginseng and related plants.
Form history.

The uninflected post-classical Latin form ginsem (1654 in the passage translated in quot. 1654) probably reflects either the Chinese word directly (as transcribed by the Italian author) or Portuguese jinsém , †ginsem (although this is first attested slightly later: 1657 as ginse ; probably directly < Chinese); compare the early English form ginsem and also the French forms †ginsem (1667 in a translation from Portuguese) and †gins (apparently showing reinterpretation of the letters -em as a Latin case ending; 1654 in a translation of the Latin passage also translated in quot. 1654 at sense 1a).

French ginseng (1663 in the work reviewed in quot. 1666) and uninflected post-classical Latin ginseng (1674 in the work reviewed in quot. 1677 at sense 1b) apparently reflect a separate set of borrowings < the same Chinese word (which is cited as ginseng in a Latin context in 1668). Compare Italian ginseng (1671), Dutch ginseng (1670), and German Ginseng (1689 or earlier). Forms of this type may have arisen from misinterpretation of the final m of the Chinese word, since m was also used to write Chinese /ŋ/ in some early European texts (although not by Martini), probably reflecting Portuguese phonology.

Forms with -ng in the first syllable (e.g. gengzeng, gingseng) show assimilation to the final consonant.

In the β. forms [which have a or o in the second syllable —JF] probably influenced by forms of the word in other varieties of Chinese; perhaps compare (Cantonese) yàhn sām .

The forms gniseng and guiseng apparently originated as typographical errors which were subsequently copied into other texts.
AntC said,

February 21, 2019 @ 10:05 pm

I presented a number of supporting examples out of Mongolic and Germanic in my 1992 article …

I'm sorry that just won't do. We've already seen similar so-called 'support' in earlier threads, that turned out to be bogus derivations.

Why do you cherry-pick which Germanic source language? I see OE, ME, Frisian, OHG. Did all those peoples visit/settle in NW China or trade with China/Mongolia? When/how? How did a ME word get there? If it wasn't ME people themselves, wouldn't it be early Germanic traders? Then how would they have pronounced the word?

I also see Latin and Tokharian. If the trade was via the Tokharians, why don't all the loans show Tokharian phonology? Previous threads have shown Tokharian was very different to Germanic.

Perhaps you think it strengthens your evidence by throwing in stray source languages? I think it does the exact opposite: if you're alleging a Tokharian influence, then include the Tokharian for every word so we can see if there's a supporting pattern. If we have the OHG or Proto-Germanic reconstruction, show that for every word. If you're not giving the Tokharian or P-Gmc, I'd be suspicious you're suppressing them because they contradict the pattern you're alleging.

I admire Tsung-tung Chang's approach because he doesn't cherry-pick. He takes Germanic vocab at a specific point in time (from Pokorny's word lists) to show systematic sound pattern changes to a best-guess of EM Chinese at a similar time. If the 'game' allows picking from any branch of Germanic (or even Latin), and then layering on sound changes such as (l < *s, lambdacism), of course you increase your chances of finding sound-alike/sense-alike. You're merely increasing the likelihood of coincidences.
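The point about widening the pool of candidate source forms can be put in rough numbers. The following is only a minimal sketch under stated assumptions: the per-candidate chance of an accidental sound-alike/sense-alike (2%) is a hypothetical figure chosen for illustration, and candidates are treated as independent, which real cognate sets are not. Nothing here is taken from the papers under discussion.

```python
def p_at_least_one_match(p_single: float, n_candidates: int) -> float:
    """Probability that at least one of n independent candidates
    resembles the target purely by chance: 1 - (1 - p)^n."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_candidates < 0:
        raise ValueError("n_candidates must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_candidates


if __name__ == "__main__":
    # Hypothetical: each candidate form (one language, one permitted
    # sound change, one loosened gloss) has a 2% chance of looking
    # like a match by accident.
    P_SINGLE = 0.02
    for n in (1, 5, 20, 50):
        print(f"{n:>3} candidates: {p_at_least_one_match(P_SINGLE, n):.3f}")
```

Under these assumptions the chance of at least one accidental match grows steadily with every extra language or permitted sound change added to the search, which is why a fixed source list at a fixed date is the more demanding test.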

WMo nağad- ‘to play’
OE lācan ‘to play’ (and for all the examples similarly …)

English dialectal or colloquial lake/laik "to play, frolic, make sport" (c. 1300, from Old Norse leika "to play," from PIE *leig- (3) "to leap") [etymonline]

Germanic 'leap' Proto-Germanic *hlaupanan "of uncertain origin, with no known cognates beyond Germanic". So did 'leap' get into WMo? With that leading hl- it should follow your pattern for 'laugh'. What became of its leading /l/? Is the Tokharian for 'leap' cognate with PIE or Germanic?

These n-l correspondences only count as "support" if you have an attested (or reconstructed by way of overwhelming systematic parallels) early Germanic form, and the early WMo form (ditto) at the same date, and an explanation for how the Germanic form morphed into the early WMo from the sound patterns at the time. I hope there are a great many more examples of n-l correspondences, with a variety of phonetic contexts. (Tsung-tung Chang's lists are some 1500 words.)

Do you have evidence of borrowing as specific as a translation from Latin 1654, as we do for ginseng? If not, why do you think that some set of n-l correspondences are more certain, and therefore provide 'support'? Isn't every correspondence equally dubious? Does every /n/ morph to a /l/? Do you have an explanation for why some didn't? Or do we suppose those source words didn't make it from Germanic? Why not?
Chris Button said,

February 21, 2019 @ 10:37 pm

@ AntC

Increasingly, the reconstructions Prof Mair has been posting on behalf of others are of single words needing explanations in terms of sound changes that are phonologically plausible but not systematically correlated with overwhelming numbers of parallels.

At first blush that would seem logical providing there are multiple loans from one language into another that may be compared, but it ignores a few crucial considerations. Firstly, the same loan may surface differently at different time depths since language is constantly changing (e.g. French doublets in English, or multiple Sino-Japanese readings of individual Japanese kanji). Furthermore, two allophones representing a single phoneme in the donor language may be interpreted as two different phonemes in the language receiving the loan. This can be further exacerbated by different dialects and hence it is the possible surface phonetic realizations that need to be fully considered rather just the basic "phonological plausibility".
Victor Mair said,

February 22, 2019 @ 12:32 am

Please, no bandwagoning. Don't just say things like "I agree with so-and-so" and "I disagree with so-and-so". That really doesn't help us understand better what is being discussed.

And, from time to time, refresh your memory about our Comments policy which is listed in the banner at the top of our homepage.
Victor Mair said,

February 22, 2019 @ 12:36 am

It is possible that only one word or two words or a few words or maybe a dozen words might be borrowed from one language into another language. It is not necessarily the case that instances of borrowing from one language to another number in the hundreds.
Michael Watts said,

February 22, 2019 @ 12:39 am

In addition, the Mandarin Chinese r pronounces [ʐ].

It may represent a phonemic /ʐ/. Pronunciation varies.

I'll note that as an American English speaker, I do not distinguish [ʐ] from [ʒ]. I do distinguish [ʒ] from [ɹ], as those are quite different phonemes in American English. And if you listen to Chinese people talk, it's very clear that Mandarin "r" may be realized as either.
AntC said,

February 22, 2019 @ 12:53 am

Thanks @Chris B, I don't think I'm disagreeing with your explanations. I am asking why there's one such explanation for one word but a different such explanation for a different word.

the same loan may surface differently at different time depths since language is constantly changing

Sure. And for English borrowings from Romance sources we have an extended period of contact, the Normans, the Crusades, the churchmen/Medieval Latin, the Renaissance/Italian and loans from Arabic, etc. We also have well-established evidence and timings of sound pattern changes in all the applicable languages.

This post is (apparently) claiming loans from different time depths OHG, OE, ME. Then do we have extended contact with WMo/MMo throughout the applicable period? Can we correlate different borrowings from different time depths with sound changes in Germanic and Mo? Does that thereby explain n-l correspondence at some time depths but something else-l at other time depths? I see no evidence presented. I see only cherry-picking, little better than anecdote. I don't think the rules of the 'game' allow hand-waving like: it happened for Romance-to-English, so let's suppose it happened for Germanic-to-certain East Asian languages, but somehow the borrowing leapfrogged over all sorts of other languages that have been in the region at the given time depth.

two allophones representing a single phoneme in the donor language may be interpreted as two different phonemes in the language receiving the loan.

Bzzzzt. "may be interpreted" No, not without explanation. An explanation like Verner's Law as an adjustment on top of Grimm's, for example.

Did some /l/s go to /n/ whereas others go to (whatever) by flipping a coin? If there's alleged to be a pattern, then the explanation needs to say what the pattern is: within what contexts does this rule apply; within what contexts does some other.

If the choice of 'pattern' is at the whim of the reconstructer (as seemed to be the case with Chau Wu), then we don't have a testable hypothesis (in the Popperian sense). Then an alleged correspondence isn't proof, nor is its failure disproof; it has no explanatory power.

the possible surface phonetic realizations that need to be fully considered rather just the basic "phonological plausibility".

Sure. "phonological plausibility" is a minimal sanity check. On its own for one word not evidence at all. It has been sufficient to reject that the 10-character puzzle was Tokharian, for example.

I'm very aware all the objections I'm raising could be (probably were) raised against William Jones/PIE reconstruction. Then gathering the overwhelming/irrefutable volume of evidence was an act of superb scholarship, and eventually so scientifically convincing that Darwin could borrow the principle into explaining the Origins of the Species. And we've subsequently found archaeological and DNA evidence to track the migrations of the IE languages. (Just as post-Darwin we found gene theory, plate tectonics, DNA, carbon dating explanations for Origins.)

But we can't just say: if it happened for PIE then it could have happened for Germanic to Mongolian/Chinese (or Chinese to Celtic) without providing equally strong evidence. So far what I've seen is only suggestive of a hypothesis. At first it was fun making hypotheses; but the fun rather went out of it when I applied critical thinking to Chau Wu's claims, and he started dragging in all sorts of words from all sorts of languages. Again I contrast that Tsung-tung Chang is making very specific, time-depthed derivations with large volumes of word lists supporting the patterns.
B.Ma said,

February 22, 2019 @ 2:37 am

Pinyin r is used to transliterate a /dʒ/ or /ʒ/ sound in 日内瓦 and 尼日利亚.

Korean 인삼 sounds nothing like English "ginseng".
Levantine said,

February 22, 2019 @ 4:01 am

Regarding Calais, the older English pronunciation is still used in performances of Shakespeare. Indeed, “Callice” is the usual form in the First Folio.
Chris Button said,

February 22, 2019 @ 6:35 am

@AntC

Just to be clear, my comment was not intended to be supportive (or not) of any of the proposals for loans made here or on other LLog threads. Rather, it was just intended as a general comment about the complexity of the issue.
AntC said,

February 22, 2019 @ 7:13 am

Indo-European Loanwords in Altaic, Penglin Wang

Sino-Platonic Papers No. 65, 1995

"Since for the time being I am not in a position to determine which word is native to Altaic or native to Old English, it is safe to posit that Old English (or a related Indo-European language) was a lender."

Here's a method: If the OE has cognates traceable back to Proto-Germanic and PIE, it's reasonable to expect the word is 'native' to OE. (Whatever 'native' is supposed to mean here.) Perhaps the editor-in-chief of the SPP series could have advised you there. Being 'native' to OE or IE is no sort of evidence that it was lent to Altaic.

I am not following why this paper should highlight OE as a lender into Altaic. Aren't there likelier sources of lending from Germanic languages closer to Altaic lands? And indeed as soon as an OE cognate is difficult to find, the paper casts around for any old Germanic/Gothic/Norse, or Latin or Greek stem. We've been here before. And then there's

"It was likely that many Old English and Tokharian words penetrated into the Altaic languages including Xiongnu. "

And when/where was this wondrous sprachbund that spoke a mish-mash of Tocharian/OE/Germanic?

There is no hypothesis about different time depths of contact with different IE branches. The methodology seems to be: pick an OE word; find some Tocharian sound-alike; make up some fairy story about how the two words must be sense-alike:

OE. an 'one', anga 'only, sole'. ToA anu 'cessation, rest'
Presumably, ToA anu once meant 'holiday, the New Year's Day'.

[top of page 3. apologies that copypaste has mangled the diacritics on the ToA form.]

Did the Tocharians take a holiday on the first day of the year/first day of some month? Anyway if we stir around the meanings a bit, we can approximate Year and/or New Year and/or celebrate and/or beginning and/or 'the most'[sic] in a variety of Altaic languages. (I'm not denying those might be cognates from proto-Altaic, if there's such a thing.)

I noticed 'holm', because it's under discussion in another place

OE. holm 'sea, water', ON holmr 'island' (as in Stockholm)

In no Germanic language does 'holm' mean sea/water. They all have island (in a river) or raised land in a river/marsh. It's not just that the alleged OE sense is wrong, it's that it's easily verified. Nevertheless all the Altaic senses have to do with water/river/bathing (they could be cognates within Altaic). This looks like doctoring the OE evidence to fit the alleged lending.

we may be tempted to speculate whether the river names Humber (OE Humbre)
and Elbe have any connection with Jurchen humpa and Manchu elbiSe- respectively.

I suppose if 'holm' did mean water one might speculate. Humber: "Most European hydronyms are Celtic in origin and numerous Celtic or Pre-Celtic derivations for Humber have been suggested. The Celtic root *com-*bero (coming together), …" [wikipedia]. And indeed the Humber is the coming together of many rivers draining North and West Yorkshire, and Nottinghamshire. Elbe: "First attested in Latin as Albis, the name Elbe means "river" or "river-bed" and is nothing more than the High German version of a word (albiz) found elsewhere in Germanic; …". So it's not cognate with 'holm'. If Elbe is cognate with the Manchu for river, we're lacking an explanation how the Altaic words are related, because the rest of them are (vaguely) more sound-alike to 'holm'.

I could go on and on with such examples. There's no hypothesis about regularities in sound patterns. Just long lists of vaguely sound-alike/vaguely sense-alike words, and suppositions as to how the senses could have spread.

Professor Mair: if in drawing attention to this (and I've been as factual and dispassionate as I could manage) I am transgressing the Comments policy, I can only say: my critique is taking it we're trying to do science/research here. Please explain, as editor-in-chief of SPP what sort of science this is. I could discount this paper as an early foray into a complex subject. But in the predecessor post to this thread (on 'hart' amongst cervids/reindeer), Penglin Wang continues to draw uncritically on OE cognates for Altaic.
David Marjanović said,

February 22, 2019 @ 7:20 am

Whoa! I'm famous!

I'm glad that my comments have resulted in such a substantive response. I don't have time today, but promise to engage with it tomorrow. In the meantime, please continue – don't wait for me!
Victor Mair said,

February 22, 2019 @ 8:49 am

Just because a word is borrowed from language X into language Y at a certain time depth doesn't mean that another word (or words) from language X could not be borrowed into language Y at another time depth(s).

Trying to regularize or systematize borrowing is always going to be a messy business because, among other reasons, there is the problem of topolects and dialects. Within a given donor language there are significant differences according to place and time, and the same goes for recipient languages.

My own preference is to deal with individual words for which there is substantial non-linguistic supporting evidence of the type I have mentioned above and in other posts. But that's just my own modus operandi. There's nothing to prevent others from wanting to systematize and regularize the data; indeed that's a natural tendency for scientists and historians, to try to make comprehensive sense out of a mass of data. I'll be writing a separate post about that within a week or so. For the time being I'll simply say that, in assessing the systematic treatment of materials gathered by other scholars, let us not throw the baby out with the bath water (the gold nuggets out with the wash water). I treasure the gems that are discovered, no matter how or whether they fit into a system.
Penglin Wang said,

February 22, 2019 @ 12:15 pm

Much of what AntC has been asking has been discussed in my previous work and will be discussed in my ongoing research. Given my hectic daily teaching agenda, I have no time to go through every posting and every question. By the way, could we expect all the participants in this open forum to use their full names? I always use my full name to publish my opinions in order to show my sense of responsibility for what I have been saying.

One correction: in my posting of February 21, 2019 @ 5:36 pm, "in my 1992 article" should read "in my 2001 article." Sorry for this glitch.
Andre Mayer said,

February 22, 2019 @ 4:08 pm

In Calais, Maine, USA the "s" is pronounced. I'd also note that terylene is called dacron in the US – closer to "dilun"?
AntC said,

February 22, 2019 @ 7:26 pm

@Penglin Wang, no problem with you knowing my name and private email address: I'm happy for Prof Mair to pass that on to you.

This is a blog, where it's usual for folk to go under a persona. Furthermore it's a public forum, so I'm not going to post my email address for spambots to harvest.

I am not an academic or professional linguist of any water, as I've made clear in comments on many of Prof Mair's recent threads. Whatever you expect by "show my sense of responsibility for what I have been saying.", I think the ideas and critiques stand or fall by the evidence presented, we don't need to get personal. I drew my evidence from readily-available online resources, which I cited. Perhaps they were not so readily available in 1995, but I presume any academic institution could provide resources for etymologies of IE/Germanic.

I appreciate that most posters here are busy academics, with little time to work through detailed derivations, let alone adumbrate detailed evidence and comparisons for rank amateurs like me who question them. Then it behoves Prof Mair as editor-in-chief of SPP, before publication, to arrange adequate peer review, and to quality control postings he makes on others' behalf. At least to 'sanity check' against his huge knowledge of Sinitic languages.

Then I find Prof Mair's continued support for Chau Wu's work quite indefensible, to the extent it is discrediting the whole SPP series. As well as the recent thread with many critical comments from academics, I've just discovered there was a whole previous round with even more critical comments: "methodology looks dubious", "Wu's claims … well I think "folk etymology" means a different thing, so maybe crank etymology.", "I would like to see Wu's work receive a careful review by other scholars. … would not be well accepted by the larger linguistic community.", "The University of Pennsylvania is home to one of the world's leading experts on Indo-European historical linguistics. Was Don Ringe unavailable for comment?", "it is probably worth stressing just how different Pulleyblank’s approach is from that of Chau Wu.", "pseudoscience".

It is admirable for Prof Mair to encourage linguistics of early Chinese and N/E Asian languages. It is admirable that he's a loyal colleague. I do not find it admirable to support and publish work with such dubious methodology. Ultimately it risks discrediting the whole SPP series. This is not like the early proponents of an IE family standing out against orthodoxy; not like Galileo standing out against the Church. It is more like supporting 'inventors' of perpetual motion machines.

I am trying to remain open to the idea there was language contact/borrowing between IE and early Chinese. It is a stimulating hypothesis. Chau Wu's work is so flawed as to draw ridicule to the hypothesis. Penglin Wang's paper seems to suffer many of the same flaws.
AntC said,

February 22, 2019 @ 7:33 pm

terylene is called dacron in the US – closer to "dilun"?

Thank you @Andre. Yes plausible. But we need an audit trail of when/how the word came into Chinese. That shows the dangers of assuming sense-alike must be sound-alike. Etymology must be meticulous.
AntC said,

February 22, 2019 @ 11:27 pm

Just because a word is borrowed from language X into language Y at a certain time depth doesn't mean that another word (or words) from language X could not be borrowed into language Y at another time depth(s).

Indeed. What time depth(s) do you have in mind for IE/Altaic (Manchu/Mongolian) diffusion?

Previously you've said "what must have occurred is extensive borrowing that started no later than the third millennium BC, when cultural exchange was occurring across the Eurasian steppe." And you've adumbrated archaeological evidence of cultural exchange, at a rather greater timedepth than Penglin Wang seems to be suggesting. Your "must have occurred" appeal is something of a red flag to me. It's in need of substantiation.

Then I am having great difficulty understanding why Penglin Wang is drawing borrowings mostly from OE. We have to suppose cultural exchange was occurring across the Eurasian steppe to as far west as Frisia/Western Europe (or at least northern Germany). Weren't there rather a lot of speakers of other languages 'in the way' by the time of OE? (Balto-Slavic for example.) Why were the Anglo-Saxons heading West if they had cultural contacts to the East? (So I'm having trouble aligning an OE timedepth to a Middle Mongolian timedepth.)

Within a given donor language there are significant differences according to place and time, and the same goes for recipient languages.

Indeed. Then a methodical approach would be to take a word at the same time depth in both the donor and recipient. How is it legitimate to refer to all of OE, ON, Tocharian, Gothic, with no time-depth segregation? Particularly Tocharian, given its time-depth is older; given that it split from IE at some great time depth; and given that its phonology is markedly different to Germanic.

There is something Prof Mair fails to point out: to robustly show that some word was borrowed specifically from OE to MMo (as is the claim for heort), you'd need to show the word/cognate did not exist previously in the recipient language. In "Inner Asia … for millennia …Rock artists took interest in representing animals including deer and elk" says Penglin Wang in the thread on Reindeer. So presumably there was already a word for deer/cervids. What was it? Why could it not be the source for the MMo or Manchu words for deer?

We could suppose an earlier time of borrowing, from an earlier form of Germanic. But then Penglin Wang's hypothesis falls apart: earlier Germanic forms are not sound-alike to the MMo/Manchu form. We could suppose an older form in MMo/Manchu. Has that been reconstructed? Was it sound-alike?

in assessing the systematic treatment of materials gathered by other scholars, let us not throw the baby out with the bath water (the gold nuggets out with the wash water). I treasure the gems that are discovered, no matter how or whether they fit into a system.

Tsung-tung Chang's treatment I find systematic; he is taking a rather older time depth. I see here a treatment that might represent earnest endeavour, but I can't see as systematic. I see not 'gold nuggets' but fool's gold; nothing convincingly 'discovered'; not gems but paste diamonds. Then scholarship demands that we should throw that out.
Philip Taylor said,

February 23, 2019 @ 4:58 am

AntC ("[This is] not like Galileo standing out against the Church. It is more like supporting 'inventors' of perpetual motion machines.") — At the time that Galileo challenged the terracentric beliefs of the Church (and of the vast majority of mankind, at that time), his willingness to do so was very much on a par with that of someone supporting the claims of 'inventors' of 'perpetual motion machines' today. Heterodoxy is surely always to be welcomed — if we all choose to simply accept the received orthodoxy, what hope can we have of ever making real breakthroughs rather than merely making incremental progress ? Whilst I do not have the background to judge for myself whether Chau Wu's or Penglin Wang's theories are viable hypotheses or snake oil, I am delighted that they can be presented here so that those better informed than I are able to discuss them and disinterestedly assess them on their merits.
Victor Mair said,

February 23, 2019 @ 8:27 am

The letter copied below comes from a colleague (fellow scholar) in China. It is pertinent to our present debate in a number of striking ways.

=====

Dear Prof. Mair,

Happy Lunar New Year to you.

Recently I pay attention to the wanderwords of sorcerer/wizard in Inner Eurasia; and your striking article on the etymological connection between Chinese 巫 and Old Persian magush would enlighten me in a deeper scale through some excerpts cited in another articles. Since most of the international online resources have been barred in Chinese mainland [VHM: emphasis added], I haven't read the whole article of yours. Would you please share it with me by a PDF version? Thank you very much.

Best wishes.

XXXXX
YYYYY, China

=====

Aside from his sincere interest in Wanderwörter, what this scholar says also has implications for the free flow of information in society.

This is prima facie evidence for the harsh restrictions on the internet that are in place in China, the severity of which is constantly increasing.

As I have stated repeatedly, a country cannot remain vital and competitive when its citizens have only starkly limited access to information that is freely available to the citizens of other countries. In the present case, I feel a particular poignancy to the plight of fellow scholars in China vis-à-vis those in the free world. All who pursue knowledge surely can sympathize with the predicament of those in totalitarian dictatorships like China where communication is tightly controlled by the ruling elite with the conscious intention of keeping the populace in a benighted state of mind.

Every day, I receive many similar indications of the dysfunctionality of the internet and thought control in China. This one hit me particularly hard because of who I am in terms of my own profession.

I feel so sorry for the scholars (scientists and humanists alike) and common citizens in China who do not have full access to the wondrous riches of the Internet.
Jerry Friedman said,

February 23, 2019 @ 11:02 am

Does anyone ever try to set up a baseline or control for studies of this type by looking for apparent borrowings between languages known not to have been in contact? How many plausible-looking borrowings would there be between Algonquian languages and Albanian or between Maori and Mandinka? Or am I just revealing how naive I am about linguistics and linguists?

(Okay, Google Translate says the Albanian for "raccoon" is "rakun".)
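For what it's worth, one way such a control could be sketched is a shuffle test: count the sound-alike pairs in two meaning-aligned word lists, then re-pair the meanings at random many times and see how often chance alone does just as well. Everything below is a hypothetical illustration, not anything from the studies under discussion; the function names and the deliberately lax `same_initial` test are assumptions made up for the sketch.

```python
import random

def count_matches(words_a, words_b, similar):
    """Count meaning-aligned pairs (same index = same gloss) judged sound-alike."""
    return sum(1 for a, b in zip(words_a, words_b) if similar(a, b))

def shuffled_baseline(words_a, words_b, similar, trials=1000, seed=0):
    """Match counts after randomly re-pairing meanings: the chance baseline."""
    rng = random.Random(seed)          # seeded so the control is repeatable
    pool = list(words_b)               # copy, so the caller's list is untouched
    counts = []
    for _ in range(trials):
        rng.shuffle(pool)
        counts.append(count_matches(words_a, pool, similar))
    return counts

def same_initial(a, b):
    """Hypothetical, deliberately lax similarity test: same first letter."""
    return a[:1] == b[:1]

def share_at_or_above(observed, baseline):
    """Fraction of shuffled runs that match at least as often as the real pairing."""
    return sum(1 for c in baseline if c >= observed) / len(baseline)
```

If the real pairing scores no better than most of the shuffled ones, the "correspondences" are what the similarity test produces by chance, whatever the languages.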
Rodger C said,

February 23, 2019 @ 11:48 am

between Algonquian languages and Albanian

I've seen books that traced Algonkian to "dialectal [!] Norwegian" and Etruscan to Albanian. I wonder if these were in the back of Jerry Friedman's mind when he wrote this. At any rate, when I look at Chau Wu's work, I'm immediately reminded of them. A native speaker of an obscure language thinks that it gives him/her privileged access to another language that s/he knows only from dictionaries.
Michael Watts said,

February 23, 2019 @ 1:02 pm

Does anyone ever try to set up a baseline or control for studies of this type by looking for apparent borrowings between languages known not to have been in contact? How many plausible-looking borrowings would there be between Algonquian languages and Albanian or between Maori and Mandinka?

There are tons of plausible-looking borrowings between any two languages. Mark Rosenfelder wrote about this years ago in response to someone else's kooky claims about deriving the ur-language.

https://www.zompist.com/proto.html – mostly making fun of the research

http://zompist.com/chance.htm – mathematical analysis of how easy it is to find chance correspondences

There's all kinds of fun stuff in there:

Similar words with similar meanings do not prove that languages are related. They might point to a relationship– but they might also be due to borrowing ('gung ho' really is from Chinese); they might be due to universal processes like babytalk or onomatopoeia; and above all they may just be chance.

This seems to be hard for some people to accept. Just look at ren and runa, or gaijin and goyim, they seem to think– how could that possibly be due to chance?

These people should be treated with respect. They are the people who made Las Vegas what it is today.

There are a number of obvious problems with this list. [The list is of purported cognates between Quechua and various afroasiatic languages.]

– The compiler (whose name I omit out of charity) knows nothing about the comparative method; no regular correspondences are presented.

– It's quite naive to compare individual Semitic languages with modern Cuzqueño dialect. On the Semitic side proto-Semitic or proto-Afro-Asiatic should be used; and on the Quechua side, reconstructed proto-Quechua. We also know some words in an even earlier form; for instance qocha is related to Aymara qota– which looks even less like the proposed cognate gubshu.

However, my only concern here is to answer the compiler's question: "Is it a mere coincidence that there are so many correspondances between these languages?"

[Rosenfelder calculates that] given his phonetic and semantic laxness, the comparer is ordinarily going to find a random match for almost every Quechua word.

Is it a mere coincidence that there are so many correspondances between these languages? No, it isn't; what would be really surprising would be if there weren't a thousand more of the same quality.

(much text omitted from within the quoted section; emphasis original)

Don't the probabilities become meaningful once you look at hundreds of words, or at many language families? Well, no. A bad methodology doesn't become more respectable just by repeating it. My Quechua/Chinese bogus cognates do not merit additional respect when I add to them a few more bogus cognates from Greek, Spanish, or French.
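A rough back-of-envelope version of that argument can be sketched as follows. This is not Rosenfelder's own calculation, only a simplified model under stated assumptions: each comparison is treated as independent, `p_phonetic` is the (hypothetical) chance that two random words pass the comparer's sound-alike test, and `candidates` is the number of words in the other language that the comparer is willing to accept as "similar enough" in meaning.

```python
def p_word_has_match(p_phonetic, candidates):
    """Chance that at least one of `candidates` independent comparisons matches."""
    return 1.0 - (1.0 - p_phonetic) ** candidates

def expected_chance_matches(lexicon_size, p_phonetic, candidates):
    """Expected number of words in one lexicon that find a purely random match."""
    return lexicon_size * p_word_has_match(p_phonetic, candidates)

# Assumed, illustrative settings only: a strict comparer versus a lax one.
strict = expected_chance_matches(lexicon_size=1000, p_phonetic=0.01, candidates=1)
lax = expected_chance_matches(lexicon_size=1000, p_phonetic=0.01, candidates=20)
```

The point of the sketch is the direction of the effect: loosening either the phonetic test or the semantic leeway raises the expected number of chance matches, so a long list of look-alikes is not by itself evidence.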
Victor Mair said,

February 23, 2019 @ 1:21 pm

A duplicate of the preceding comment was also submitted by Outis.
Michael Watts said,

February 23, 2019 @ 1:48 pm

Οὖτις

My comments seem to disappear from VHM's posts with some frequency. It's already happened in this very thread.

So I assumed it was happening again.
Victor Mair said,

February 23, 2019 @ 4:29 pm

Wrong assumptions.
David Marjanović said,

February 23, 2019 @ 4:52 pm

People moved around and encountered varying occasions to hear other people speaking. It was unlikely that people obtained foreign words through mouth-to-mouth teaching and imitation.

It is, at least, textbook wisdom that loanwords do not generally enter languages when one person hears or mishears one word from another language once. Rather, words are generally borrowed by people who speak the donor language well and understand its sound system.

Thus, discrepancies and inconsistencies have been popping up from time to time in interlanguage phonology. Numerous instances of crosslinguistic borrowings do not fit into what DM argues for. English terylene was introduced into Chinese as dilun (滌綸) even though the Chinese speakers have no difficulty pronouncing it as terilin (特日林) to get it much closer to English terylene.

This looks a lot like the word was not directly borrowed from English into Mandarin, but went through some intermediate or several. Southern Sinitic topolects generally lack the northern Mandarin r sound, and instead resort to /l/ to approximate foreign [r] or [ɹ]*; that would neatly explain why the -ryl- part, somewhere around [ɹəl] ~ [ɹːl] ~ [ɹl̩] in English, is represented by a simple [l].

* Hence the Western stereotype that "the Chinese pronounce R as L", which has no basis in northern/Standard Mandarin.

It is well known that many Western words entered Mandarin through Cantonese in the following way: Cantonese speakers heard the words, used them in speaking Cantonese, and used them in writing in characters; Mandarin speakers then read the characters in Mandarin pronunciation.

So, how is 滌綸 pronounced in Cantonese or Hokkien? (Not a rhetorical question. I have no idea, and no easy way of finding out on my own.)

…Or of course dilun isn't from terylene at all, but from dacron as proposed above. But this, too, calls for a path through another Sinitic language in order to explain why -cron isn't represented by -kun or -keran or something.

Likewise, English ginseng was ultimately from Mandarin Chinese renshen (人參, Hokkien yinsim, Cantonese yansam).

Yes, ultimately. The g- makes sense as a French or Portuguese spelling for [ʒ], which is the closest approximation these languages have to the Mandarin r*. The -ng makes sense as a reinterpretation of a nasal vowel; I know some Mandarin varieties have turned -n into vowel nasality… and Portuguese spells its word-final nasal vowels with -m.

A nasal vowel in the other syllable could explain the i; confusion with ginger, proposed above, would probably work even better.

* In the standard, it lies between the retroflex approximant [ɻ] and the retroflex fricative [ʐ]. Or, rather, its average realization lies there, while its range overlaps both.

The Manchu and Mongolian phonotactics of the postvocalic liquid r/l can in no way preclude the liquid from being rendered as a nasal, especially before an obstruent consonant. In other words, such combinations as /rd/ cannot be perpetuated.

I don't understand how the second sentence follows from the first, or how the second sentence is compatible with the existence of the cluster /rd/ in both Manchu and Mongolian.

Both liquid and nasal are sonorant consonants, facilitating the direction of phonetic change from the former to the latter.

That a sound change is possible does not mean it happened in this particular instance.

Yes, of course languages are known that have shifted /rd/ to /nd/, or even syllable-final /r/ to /n/ more generally. But such things are not known in the history of Mongolic or Tungusic, either as sound changes within the language or as sound substitutions applied to loanwords. What reason is there to think that this one word was an exception?

Parallel to the postvocalic liquid is the syllable pattern containing postvocalic nasal in both languages.

That's an indication that the Mongolian and the Manchu forms are related in some way or other. Nothing more.

Assume that Germanic word for ‘hart’ was in the process of diffusion in Inner Asia, and then the local speakers heard it as if it was pronounced kand-/hand-.

But why would I make such an assumption, when these languages had (and still have) a robust distinction between /nd/ and /rd/?

Instead of the sound [qʰ] (sic), the grapheme q representing a velar stop in Written Mongolian (WMo) was indeed pronounced h in many words recorded in the Sino-Mongolian glossaries belonging to the Middle Mongolian (MMo) period, which covered the times of Chinggis Khan. […] Corresponding to WMo qandağai and Manchu kandagan are Dagur handəğ and Buriat handagai bearing the word-initial h-.

Ah, but h means different things.

In Modern Standard Mandarin, h is [x]. Farther south, it is [h]. There is yet another accent, I don't know where it's spoken, that consistently uses [χ] instead. Elsewhere, [x] > [χ] > [h] seems to be a one-way trip, so, if there's no evidence to the contrary (is there?), I assume [x] is the oldest of these.

Most languages that have any one of these sounds either have only the one, or use two or three of them as allophones of a single phoneme. There are not many languages that distinguish two. Buriat, however, is one of them: it has a /x ~ χ/ from k ~ q, and a /h/ from s. So I got curious what "handagai" really is.

хандагай gives me 25,900 Google hits, of which the second through fourth explain that it means "moose".

һандагай gives me four Google hits. At least one of them is actually in Buriat, but һандагай occurs only once on that page (where хандагай does not occur at all), so it could easily be some kind of typo.

Maybe, then, [qʰ] > [χ] had already happened when the Secret History was written down. Or maybe there was even an intermediate [qχ] phase, and whoever wrote the Secret History found /x/ a better Early Mandarin approximation than /kʰ/…

On the Germanic side, Tacitus used ch- for western and h- for eastern names; ch was also used much later for Frankish words, but in a Latin/Romance context where h was expected to be silent, so that may not mean much. The Gothic h may have been [h] or [χ] (or of course both as allophones), but not [x] – tell me if you want me to explain why. From all other Germanic languages there's no evidence for anything but word-initial [h].

development from a postvocalic liquid to a nasal (WMo hülde- ‘to chase, hunt’ – OE huntian ‘chase game’, English hunt)

Are you saying OE /rt/ became Mo /nd/, but Mo /ld/ became OE /nt/?!?

Whatever happened to Germanic, the net result is that English hart existed in the past and exists at modern times.

The etymon existed. Its /a/ dates from a sound shift in Late Middle English – 15th century! – that did not spare such loans as clerk or sergeant. Without this /a/, your hypothesis has a serious problem as far as I can see.

To the best of my knowledge, since Manchu and Mongolian unlikely had the diphthong eo occurring in OE heort ~ heorot, their speakers felt it convenient to shift to a.

This is a mere restatement of your position, not an answer to my question, which I therefore have to repeat here: "Why would […] [ɛɔ̯] or thereabouts […] be represented by [ɑ]? Why not by [ə] or [i] or [o], which Middle Mongolian also had to offer?" All three of the latter are quite a bit more similar to [ɛɔ̯] than [ɑ] is.

As an alternative scenario, we cannot rule out the possibility whereby the Manchu and Mongolian words were adopted from the form hart when it existed somewhere in Inner Asia,

We can, frankly, rule out that the form hart existed anywhere in Inner Asia.

perhaps in Tokharian

Tokharian didn't have any kind of [h] or [x] or suchlike.

or in the speech of the Yellow-Headed Shiwei (黃頭室韋) who had been a sizeable ethnic group in Manchuria during the seventh-twelfth centuries (Wang 2018:187-199).

Of course I don't have access to your book; but in the early 8th century at least, there were Sogdians in Liáoníng. Could these be the yellow-headed people? Sogdian is an Iranian language, not a Germanic or Tokharian one.

The Wikipedia article on the Shiwei ascribes various Turkic and Mongolic origins to different Shiwei tribes, and adds: "The Huangtou ('yellow-head') Shiwei may have been named so because of a high incidence of blondness within their tribe, but it is not certain. However, blondness still occurs regularly in the region today."

============================

This comment is long enough, and contains a link; I'll need to post at least two more links, so I'll interrupt myself here.
David Marjanović said,

February 23, 2019 @ 5:54 pm

A Wanderwort […] is a word that has spread as a loanword among numerous languages and cultures, especially those that are faraway from one another [VHM: emphasis added], usually in connection with trade. As such, Wanderwörter are a curiosity in historical linguistics and sociolinguistics within a wider study of language contact. At a sufficient time depth, it can be very difficult to establish in which language or language family it originated and in which it was borrowed.

This omits an important part: Wanderwörter wander through chains of languages, each adding sound changes and reinterpretations. Therefore, when a word wanders from language A through B, then C and then D to E, E ends up with a form that is very different from what it would have if it had borrowed from A directly. The presented examples farang and orange make this particularly clear. Tea also counts: one European language got it directly from Amoy Hokkien, and then the rest of western and central Europe got it directly or indirectly from that one European language.

'Dog' in English is a mystery. We can be reasonably certain it's not cognate with 'dog' (with the same meaning) in that Australian language.

Specifically, dog in English is one of those n-stem nicknames; what's not clear is what it was formed from, but the two proposals I know are here, one in the paper linked to from the end of the OP, one in the first comment.

Conversely, dog in Mbabaram is the regular outcome of gudaga, attested as such in related languages if I'm remembering that right. Mbabaram has undergone a thoroughly bizarre but fully regular process of dropping the first syllable of every word that has more than one.

For reasons that aren't clear to me, many English speakers seem to be uncomfortable beginning words with /ʒ/, preferring /dʒ/.

That's easy to explain: other than words of obviously French origin – obvious to people without higher education, that is –, English simply doesn't have words that begin with /ʒ/. The medieval loans from French came in before (non-Walloon…) French had changed from [dʒ] to [ʒ].

MMo hinege-, Dagur hinəd- ‘to laugh, smile’
OE hliehhan ‘to laugh’. Latin hilarō ‘to cheer, gladden’

The OE and the Latin word are neither cognate with each other, nor are they related by borrowing. This is really blindingly obvious without even taking a look at Wiktionary. Tell me if you'd like me to elaborate.

Otherwise, please pick one: which one of the two would be related to the Mongolic forms?

Previous threads have shown Tokharian was very different to Germanic.

Concerning that, I forgot to mention something that seems very important:

Throughout IE there's a widespread root *mer-. Throughout the crown-group, it means "die". (In English, it's hidden in murder.) In Anatolian, it means "disappear". This is interpreted as "disappear" becoming a euphemism for "die", eventually replacing the original "die" root. I forgot if *mer- is attested in Tocharian, but the usual "die" root in Tocharian is *wel-, which is also attested with this meaning in Anatolian.

Is the Tokharian for 'leap' cognate with PIE or Germanic?

Broaden your search to "run", because that's what laufen means in German. …Except for a large part of Germany, where it has moved on to "walk"!

two allophones representing a single phoneme in the donor language may be interpreted as two different phonemes in the language receiving the loan.

Bzzzzt. "may be interpreted" No, not without explanation. An explanation like Verner's Law as an adjustment on top of Grimm's, for example.

Did some /l/s go to /n/ whereas others go to (whatever) by flipping a coin? If there's alleged to be a pattern, then the explanation needs to say what the pattern is: within what contexts does this rule apply; within what contexts does some other.

That's what "allophones" is about: two (or more) sounds are allophones of the same phoneme if the choice between them is explained by the phonetic context.

Of course, if allophones are postulated for an extinct or reconstructed language, it is necessary to present the evidence that allows us to test that hypothesis.
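As a minimal sketch of what "explained by the phonetic context" means in practice, one first check is complementary distribution: two sounds that never turn up in the same context. The data and function names below are hypothetical, and the context labels are a toy simplification; passing this check is only a necessary first step, not proof that two sounds are allophones.

```python
def contexts_of(observations, sound):
    """All contexts in which `sound` is attested."""
    return {ctx for s, ctx in observations if s == sound}

def complementary(observations, sound_a, sound_b):
    """True if both sounds are attested and never occur in the same context."""
    ctx_a = contexts_of(observations, sound_a)
    ctx_b = contexts_of(observations, sound_b)
    return bool(ctx_a) and bool(ctx_b) and ctx_a.isdisjoint(ctx_b)

# Toy observations as (sound, context) pairs; invented for illustration.
observations = [
    ("ç", "after_front_vowel"),
    ("x", "after_back_vowel"),
    ("k", "after_front_vowel"),
    ("k", "after_back_vowel"),
]
```

Here `complementary(observations, "ç", "x")` holds, while `"ç"` and `"k"` share a context and so fail the check. A claim that a recipient language split one donor phoneme into two would need this kind of statement of contexts, backed by attested forms.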

(Okay, Google Translate says the Albanian for "raccoon" is "rakun".)

That's a borrowing from English – raccoons aren't native to Europe.

One more thing to follow.
David Marjanović said,

February 23, 2019 @ 6:10 pm

Does anyone ever try to set up a baseline or control for studies of this type by looking for apparent borrowings between languages known not to have been in contact? How many plausible-looking borrowings would there be between Algonquian languages and Albanian or between Maori and Mandinka?

Rosenfelder did it in a very mocking way that seems easy to dismiss. But when it's done seriously, the results are pretty scary. Here is an Indo-Europeanist arguing that the "Proto-Eurasiatic" reconstructions in the left column are bogus by trying to find Quechua matches for them, complete with a few non-trivial regular sound correspondences. In the comments you'll find me saying he must be cheating, and it turns out he's not. Then the mathematics break out. :-)

Somewhere out there there's a list of identical words with identical meanings in different languages that are identical only by chance. Dog in English and Mbabaram is one. /bæd/ in English and Farsi is another. I forgot which language in New Guinea it is where /buːb/ means exactly what you think it means. And there are more…
AntC said,

February 23, 2019 @ 8:24 pm

Thank you @Michael W and @David M. Also thank you to Zompist and Rosenfelder for the hard work they've put in to explaining so clearly, doing the math, and finding bogus sound-alike/sense-alikes. Those links are the material I knew was around but couldn't find previously.

Why is it so hard for even highly intelligent people to convince themselves that random matches will be few? asks Zompist. Earlier he referred to "those with no linguistic training" but that's not who we're talking about here.

If we replace 'highly intelligent people' by 'Historical Linguists', I'd like to ask:

Are Linguistics undergraduates (at least at the point where they specialise to Historical Ling) not exposed to this sobering material? How about Historical Linguists who've transferred from other disciplines, perhaps as mature students returning to academe as a career change? Might they 'fall through the cracks' as it were?

How do we explain the 'lumpism' of a Historical Linguist like Greenberg: was he just not aware of the dangers of false positives?

I am not belittling or trying to take away from the huge effort of close study and scholarship represented by these claimed word comparisons. They've started from an invalid presupposition, seems to be the problem. I'm reminded of the case of William Shanks, who was a diligent and accomplished amateur mathematician who published much useful work. Sadly he's famous (amongst a certain cognoscenti) today for only one thing: in his calculation of the decimal expansion of pi to 707 places, he made a mistake at digit 527. The calculation needed such enormous effort that nobody checked his work until the arrival of mechanical calculators, some 70 years later.
AntC said,

February 23, 2019 @ 9:11 pm

thank you to Zompist and Rosenfelder

Errk, those two are the same. I meant to them and Piotr Gasiorowski/langevo.

If the William Shanks comment isn't clear: going wrong at digit 527 means every digit after is wrong. In the same way that if you make a mis-step at a derivation, every derivation that relies on that sound change is wrong.
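The same point as a small sketch, assuming derivations can be written down as a dependency table (which step relies on which earlier steps). The step names below are hypothetical placeholders, not taken from any of the papers under discussion.

```python
from collections import deque


def invalidated_by(faulty, depends_on):
    """Return every step that relies, directly or indirectly, on `faulty`.

    `depends_on` maps each step to the list of steps it builds on.
    """
    # Invert the table: for each step, which steps build on it?
    dependents = {}
    for step, prerequisites in depends_on.items():
        for pre in prerequisites:
            dependents.setdefault(pre, []).append(step)

    tainted, queue = set(), deque([faulty])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, []):
            if nxt not in tainted:
                tainted.add(nxt)
                queue.append(nxt)
    return tainted


# Hypothetical chain of derivations.
chain = {
    "etymology_A": ["sound_change_1"],
    "etymology_B": ["sound_change_1", "sound_change_2"],
    "etymology_C": ["sound_change_2"],
    "family_tree": ["etymology_A"],
}
print(sorted(invalidated_by("sound_change_1", chain)))
```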
AntC said,

February 24, 2019 @ 12:03 am

@Philip T Heterodoxy is surely always to be welcomed — if we all choose to simply accept the received orthodoxy, what hope can we have of ever making real breakthroughs rather than merely making incremental progress ? … I am delighted that they[hypotheses] can be presented here so that those better informed than I are able to discuss them and disinterestedly assess them on their merits.

It's a matter of economics of time input from "those better informed". I notice that in the 2016 round assessing Chau Wu's paper there was significant input from over a dozen busy academics, as well as from interested amateurs.

On this thread we are down to a handful of academics. I know many of the 'absent' academics are still active assessing interesting ideas on other blogs. Like you, I hugely value their expertise; I would like to keep them contributing. I hypothesise: they will go to the effort once to read a paper or long blog posting, and provide a detailed critique; they might go to the effort twice, to skim and provide a briefer comment. At the third, they will regard it as a waste of their time/that Prof Mair is not learning to sanity check before posting, and they will simply stop even looking, let alone providing feedback. I see comments on other blogs that LLog is not what it was, that Prof Mair's topics are no longer of the quality they used to be (and everybody soooo much misses Geoff Pullum).

In the words of the poet "Had we but world enough, and time, …", or rather if 'those better informed' had time, it would maybe be a delight to examine all sorts of hypotheses. I for one am already far less than delighted with Prof Mair's uncritical promotion of junk science. I'm disinclined to look into further examples. And that would be a shame, because the reindeer thread before Christmas was both topical and entertaining and stimulating.

To put it more pithily: Prof Mair has 'cried wolf' too many times already.
Nelson Goering said,

February 24, 2019 @ 5:15 am

Just a couple of quick comments. Firstly:

'If William Jones were bound by geography of remote distance between India and Europe, he would not have been able to find out systematic similarities for his construction of the Indo-European family.'

The wider point (that 'geographical plausibility' is only a secondary consideration) is a fair one, but it's worth pointing out that Jones himself never actually did any proper reconstruction of IE. He merely noted that there were a lot of similarities (never saying what they were in detail!) and speculated that this could maybe be explained by his 'common source'. This was a hypothesis, but nothing more: no data, no argument, no testing, no proof. It was only in the 19th century that the actual reconstruction of PIE was undertaken, and Jones's hypothesis was shown to be correct (as supported by a really overwhelming amount of data).

There are geographical quirks even in more plausible contexts. The word _path_ is a notorious example: it's limited to West Germanic (OE pæð, OFris. path, OHG phad), but corresponds very nearly precisely to the Iranian word for 'path' in both form and meaning: Avestan (gen.) paθō, Old Persian paþī- (reformed as a feminine ī-stem, with the paþ- grade generalized). Even if some contact between Germanic and Iranian speakers is not out of the question for the early West Germanic period (the Sarmatians, Alans, etc.), this is hardly what we'd usually think of as a prime borrowing context. (The semantics are, I think, middling: it's perhaps not as good as *strātō < Latin strāta, where there was a distinctive technological object, the Roman road, to go with the name; but it's at least not in a semantic sphere where borrowing would seem exceptionally implausible.)

As so often, the whole thing comes down to linguistic plausibility. If we can convince ourselves that this link is linguistically justified, we can imagine the sociolinguistics working out well enough. If we decide a single, isolated correspondence of this sort isn't meaningful (not an unfair position to take), then we can abandon the whole idea without much loss.

'In no Germanic language does 'holm' mean sea/water.'

While I generally share AntC's view that the links proposed here are highly dubious, this specific point is not true: in Old English (and OE _alone_ in Germanic) _holm_ does in fact (nearly exclusively) mean 'sea' — or, as the Dictionary of Old English puts it, 'ocean, sea, water; wave'.
David Marjanović said,

February 24, 2019 @ 6:21 am

How do we explain the 'lumpism' of a Historical Linguist like Greenberg: was he just not aware of the dangers of false positives?

Greenberg figured that the comparative method compares languages that are already thought to be probably closely enough related. So how do we decide which ones we think are closely enough related? He developed his "mass comparison" or "multilateral comparison"* as a quick-and-dirty method for this prerequisite step.

One problem was that the data he had for American languages were chock full of errors of fact. No method is immune to the law of "garbage in, garbage out".

The other problem probably was indeed that he underestimated the probability of chance resemblances. Outside of population genetics and certain branches of physics, even most scientists (myself included) are not really familiar/comfortable with statistics.

* That name is a reference to the bizarre practice of comparing only two languages at a time, which was apparently widespread among Americanists when Greenberg did his work.

==================

Anyway, I'll try to read Chang's paper next.
Victor Mair said,

February 24, 2019 @ 7:56 am

There's a lot of facile, poorly documented reductio ad absurdum going on around here, not to mention plenty of overly confident ex cathedra pronouncements, and oh so many non sequiturs, plus a profusion of denunciations. At times the small group of participants are self-congratulatory, at times mutually adulatory. More recently, ad hominem attacks have begun. It was exactly this sort of ganging up and vicious comments by a handful of mean-spirited persons that drove GP and other colleagues away from Language Log.
AntC said,

February 24, 2019 @ 8:43 am

in Old English (and OE _alone_ in Germanic) _holm_ does in fact (nearly exclusively) mean 'sea' — or, as the Dictionary of Old English puts it, 'ocean, sea, water; wave'.

Heh, heh thank you @Nelson. I stand corrected. It just goes to show how meticulous you have to be. Most dictionaries (of Modern English) I checked had the island sense. I also thought of some of the English placenames with holm, which are generally well inland. But gnaa there's a cognate 'holme' meaning holly (because it grows on hills not swamps?). As in Holmfirth, old spelling 'Holmefirth', which I know well in West Yorkshire, just about as far from sea/ocean as you can get in Northern England.

So I'm generally checking my etymologies in Etymonline. For 'holm' it says both

"small island in a river; river meadow," late Old English, … and
Cognate Old English holm (only attested in poetic language) meant "sea, ocean, wave."

Of course any source might be wrong/out of date.

But to Penglin Wang's hypothesis: this would have to mean that specifically a troubadour band of early Old English poets traveled to the Altai mountains (not in late OE; no sort of other Germanics); spoke to no-one they met en route who might have corrupted their idiolect; failed to notice the entire absence of ocean/sea in the mountains; shouted 'holm' in a crowded theatre, somehow miming ocean/waves at the same time; then disappeared from history.

The (early/proto-)Mongolic speakers must have taken that miming to be swimming in the only water they knew: a river.

Now for both 'heort'/hart and 'holm'/water/river, PW's hypothesis requires that the lending language was very specifically OE not any earlier Germanic, not some form of Germanic from east of the Anglo-Saxons — otherwise his sound correspondences don't work.

You can see I've now reached a stage of hysterical incredulity about this OE-to-Altaic hypothesis.

@David M I'll try to read Chang's paper next.

I've been back-pedaling on that in light of recent discussions. Chang's at least passes the initial sanity checks that both PW and CW fail so abjectly. But I meant to re-read it myself with a view to what sort of phonetic and semantic contortions it undertakes. IIRC the sense-alikes used much stricter criteria.

Before he gets to the wordlists, he's very careful to lay out his methodology, and justify it in terms of a sociolinguistics context, and explain his sources (rhyme tables) for guessing at the EMC pronunciations. Those sections of his paper are a relatively quick read. They already might tell you whether it's worth going on to the wordlists, which I found horrible: typewriter font with hand-written diacritics and phonological adornments.
Shubert said,

February 24, 2019 @ 10:32 am

"abjectly" is derogative, even improper to kindergarteners…
David Marjanović said,

February 24, 2019 @ 3:03 pm

The (early/proto-)Mongolic speakers must have taken that miming to be swimming in the only water they knew: a river.

Or a lake, up to and including Lake Baikal…

the wordlists, which I found horrible: typewriter font with hand-written diacritics and phonological adornments.

What has layout got to do with anything? Plenty of scientific journals throughout the 20th century couldn't afford professional layouting and published typewritten papers like that.

=================

Anyway, promptly after I said I was going to, I downloaded Chang's paper and opened it, intending to just take a short look. The paper is much shorter than I expected, so I read the whole thing.

It's an unexpectedly wild ride.

Chang's main claim seems to be that Sinitic is a branch of IE – or perhaps something completely different with a thick IE superstrate; it's not clear if Chang understood the difference between these. The existence of a Sino-Tibetan family is pooh-poohed in a sentence or two, for reasons of which some were defensible in 1988, while others are just misunderstandings. I'll get to them in the next comment.

Chang started out trying to do it right: instead of comparing selected languages of the last 1000 years, he tried to compare Old Sinitic and Proto-Indo-European. Unfortunately, it was 1988, so the only comprehensive PIE reconstruction available was Pokorny's (1959) Indogermanisches Etymologisches Wörterbuch (IEW), and the only comprehensive OS available was Karlgren's (1957, with slight modifications by Li Fang-kuei in the 1970s).

To his credit, Chang tried to improve upon the OS reconstruction. He noticed several issues (like the "divisions" of Middle Sinitic, and the origin of tones) that he proposed solutions for. The work of the last 31 years has found much better solutions for these problems.

The IEW, unfortunately, was to some extent already outdated when it was published: it took almost no evidence from Anatolian and Tocharian into account, and it did not accept "laryngeals". Moreover, it systematically erred on the side of inclusion: whenever Pokorny thought there was any chance at all that two words could be related, even if that required assuming all sorts of irregular developments, he wrote that in just in case. As a result, about half of the IEW is not taken seriously by IEists today. (Prof. Mair, I'm deliberately not including a source for this claim because you can get a first-hand assessment of this by asking several of your colleagues instead of having to rely on the one or two throwaway remarks I could link to.) Again, none of this is Chang's fault; the first one-stop shop after the IEW, the LIV (Lexikon der indogermanischen Verben), only came out in 1998, its second edition (known as LIV²) in 2001, and as its name says it covers just the verbs – when continued funding for the planned noun dictionary was denied, a reduced version was published in 2008, and the particles and pronominal stems had to wait for 2014. The only way Chang could have done better than the IEW would have been to burrow through an enormous stack of papers, many of which were hard to get.

The IEW even includes roots that are only attested in one IE branch if they aren't known to be loans from elsewhere. (The LIV does the same, though with stricter criteria for what looks like a plausibly IE root.)

While Chang, understandably, took the reconstructions in the IEW as given and did not discuss them at all, he has a doctorate in Sinology and tried, in the paper, to improve the reconstruction of Old Sinitic. On p. 26–29, he discussed the information in the Middle Sinitic rhyme books and Karlgren's interpretation of them, among others, before concluding on p. 29:

"According to my reconstruction Early Middle Chinese has the following seven vowels:
i u e ə o a å
All vowels except are not autonomous but must occur in combination with other vowels or finals, as is reflected in the absence of the simplex in the rhyme groups 29, 9, and 11, whereas simple u must be supported by ə in Grade 37 I."

That is not a remotely plausible vowel system for a human language. Even English, which really overdoes the diphthongs, has a few monophthongs that are allowed to end a syllable.

Then it goes on:
"Old Chinese has the same seven basic vowels. […] /a/ is the neutral vowel which can interchange with all other vowels. The high vowels i,
e, u occur frequently in company with ə. The autonomous /a/ in Old
Chinese became mostly /əi/, /iə/, /ei/ etc. in Middle Chinese"

That's all the explanation we get on why there are three different outcomes "etc.", or what "interchange" could mean.

On such a "basis", Old Sinitic was discussed. I quote from p. 8:

"Karlgren (1923, pp. [sic] 28) already came to the conclusion that the Middle Chinese words of rising and vanishing tones ending in -i or -u must originally have had a final consonant -g or -d (occasionally also -b), but he did not go so far as to ascribe -g to all words with rising tone. In Grammata Serica (1940) he introduced further -r (p. 25) and -g (p. 34 and 39) for the two groups of words with level tone. […] Pulleyblank (1962 and 1983) […] With numerous examples he succeeded in confirming that Middle Chinese words of vanishing tone had a dental final (1962, p. 215) , and those of rising tone a velar final (p. 225) in the early centuries A.C. [sic] These correspondences which have been partially attested by the rhyming of Old Chinese poetry, can now also be proved by Indo-European synonymous stems. Thus the following equations may be posited:"

And the correspondences then proposed look largely plausible at face value. Problem is, these word-final *-b, *-d, *-g are an error. Haudricourt solved pretty much the whole problem of East Asian tonogenesis back in 1954: rising tone comes from a final glottal stop, which is preserved in some Wu and Min topolects today (but not Chang's native Taiwanese); falling ("departing", "vanishing") tone comes from -h, which is preserved in at least one Jin topolect and is apparently universally agreed to have come from an earlier -s; level tone is the default that comes from the absence of final -p, -t, -k, -s/-h or -ʔ. Not only is all this paralleled in the history of Vietnamese and related languages (not to mention the similar tonogeneses, plural, of Athabaskan), it also makes phonetic sense, while [b d g] turning into different tones in irregular ways does not.
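The mapping just described is regular enough to write down as a lookup. The following is only a sketch of that summary, assuming a toy romanization in which the coda is the last character of the syllable; the example syllables are hypothetical, and "entering" is the traditional name for the category of syllables ending in -p, -t, -k, which the summary above does not name.

```python
def tone_category(syllable):
    """Assign a Middle Sinitic tone category from the Old Sinitic coda.

    Assumes a toy transcription where the final character is the coda.
    """
    if syllable.endswith("ʔ"):
        return "rising"  # from a final glottal stop
    if syllable.endswith(("s", "h")):
        return "departing"  # from -s, later -h; also covers -ps, -ts, -ks
    if syllable.endswith(("p", "t", "k")):
        return "entering"  # stop-final syllables (traditional label)
    return "level"  # the default: none of the above codas


# Hypothetical syllables, for illustration only.
for syl in ["kaʔ", "kas", "kats", "kat", "kan", "ka"]:
    print(syl, tone_category(syl))
```

Every input gets exactly one outcome from its coda alone, which is what makes the account regular; nothing comparable can be written for [b d g] turning into different tones word by word.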

Some of the reasons for which -b, -d, -g were reconstructed involve rhyme; ever since Haudricourt, these have been explained as -ps, -ts, -ks and their regular developments.

The fun part is that Pulleyblank (1962) already acknowledged all this, and provided further evidence for OS -s from transcriptions of foreign words in OS. Chang (1988), as quoted above, cited that work, and… the "velar final" quoted above is [ʔ], which is not a velar consonant, yet Chang apparently took it as corroborating his/Li's/Karlgren's "-g"! Likewise, "a dental final" in the quote refers specifically to [s], not "-d".

A useful review of non-Min Sinitic tonogenesis published in English before 1988 is Mei (1970). Chang (1988) did not cite it.

On p. 25, Chang insisted that -g was lost at different times in different words. This was not accompanied by any proposed conditioning factors or even an acknowledgment that that was a problem. I would like to see a lot more discussion and data before accepting an irregular sound shift.

Not mentioned either is that in Karlgren's OS reconstruction, there are almost no open syllables! No attested language is like that. Even Klingon looks more like a human language than that.

P. 31: "Probably voiceless aspirates ph, th, kh, tsh were still absent in Old Chinese to be developed later from p, lh, h, ts." How? As an unconditioned split that makes no phonetic sense? We aren't told – no context or explanation is given.

The conclusions begin on p. 32; I'll address them in the next comment.
Jerry Friedman said,

February 24, 2019 @ 3:24 pm

Rodger C: I haven't seen those books you mentioned. I was just wondering how one goes about establishing that the number of resemblances between languages is so much greater than you'd expect that common ancestry or borrowing is probably established.

Michael Watts and David Marjanović: Thanks for the amusing examples!

David Marjanović: Sorry I was unclear about rakun. I proposed Albanian and Algonquian languages as a euphonious example of languages that were never in contact, and then it occurred to me that there could be words of ultimate Algonquian origin in Albanian. I looked for the most obvious one, and behold, there it was.
David Marjanović said,

February 24, 2019 @ 5:43 pm

Chang (1988: 32):

"Our knowledge of regular phonetic correspondences between Old Chinese and Indo-European opens immense possibilities for lexical comparison."

"Regular phonetic correspondences" is rich, coming right after the unsorted mess of vowel (including diphthong) correspondences on p. 30, where every Pokorny-PIE vowel (all 16 of them, irrespective of length) can correspond to every vaguely similar OS vowel – up to five different ones, with no hint of conditioning factors given – and vice versa.
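A back-of-the-envelope sketch of why that laxity matters, using only the two figures mentioned here (seven OS vowels in Chang's system, up to five acceptable OS matches per PIE vowel). It assumes, purely for illustration, that all seven vowels are equally likely in a random word; the function name is hypothetical.

```python
from fractions import Fraction


def chance_vowel_match(allowed, inventory):
    """Chance that a random vowel counts as a 'match', assuming a uniform inventory."""
    return Fraction(allowed, inventory)


strict = chance_vowel_match(1, 7)  # one-to-one correspondence
loose = chance_vowel_match(5, 7)   # up to five acceptable outcomes

print(strict, loose, loose / strict)
# For a comparison involving two vowel slots the effect compounds.
print((loose / strict) ** 2)
```

Under these assumptions a single vowel slot becomes five times easier to "match" by accident, and two slots twenty-five times easier, before any leeway in consonants or meanings is counted.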

Laxity in the correspondences of initial consonants is explained away by reference to the fact that "in Old Chinese the alternation of initials in voicing was a conventional means of creating new words from one basic form." Indeed, related words (e.g. transitive and intransitive verbs) often differ in Middle Sinitic in that one has a voiceless and the other a voiced initial consonant; often they are even spelled with the same character. However, this never meant "oh, we need a new word, so let's take an old one and voice its initial". Current reconstructions of OS propose specific prefixes with specific meanings; these prefixes disappeared after the root-initial consonant had assimilated to them. Baxter & Sagart (2014) reconstructed two, *m- and a nasal consonant that always assimilated in place of articulation to the root-initial consonant; both of them would have voiced originally voiceless root initials.

"In the last four years I have traced out about 1500 cognate words which would constitute roughly two thirds of the basic vocabulary in Old Chinese. The common words are to be found in all spheres of life including kinship, animals, plants, hydrography, landscape, parts of the body, actions, emotional expressions, politics and religion, and even function words such as pronouns and prepositions, as partly shown in the lists of this paper."

This wide spread of semantic fields should mean that Chinese is Indo-European; the assumption of a sub-, ad- or superstrate would have real trouble explaining it, unless maybe if intense contact lasted for millennia (there's an interesting paper on "unlimited borrowing" from several stages of Sinitic into Bái, which now has "borrowings all the way down").

"Among Indo-European dialects, Germanic languages seems to have been mostly akin to Old Chinese in consideration of the following points:
a. Among Indo-European dialects, Germanic preserved the largest number of cognate words also to be found in Chinese."

Fair enough, except see below.

"b. Germanic and Chinese belong to the group of so-called centum languages, in which all Proto-Indo-European velars remain velars (with only a few exceptional variants in Chinese, cf.
p. 18, 449; p. 18, 449; p. 20, 644)."

This would have been a great opportunity to declare the supposed satəm forms Iranian loans. Alas.

"c. The initial /h/ in Germanic corresponds mostly to /h/ and /ɦ/
in Old Chinese. Though Germanic /h/ has hitherto been interpreted as a shift from Indo-European /k/, it must have existed already in Proto-Indo-European, since interrogatives both in Germanic and Chinese have laryngeal initials (cf. p. 6, 645; p. 20, 644, 647, 648)."

Lolwut?

In just one sentence, without any further explanations or any further data, Chang (1988) denied Grimm's law. I'm sorry, that's like denying that whales are mammals, and responding to questions by nothing more than "they just kinda look like tuna, it's obvious".

"d. In comparison with Sanskrit, Greek and Latin, Chinese and
northern Germanic languages are poor in grammatical categories
such as case, gender, number, tense, mood etc. I would surmise generally that the daily speech of Germanic Peoples [sic] might
have had a much simpler grammar than that suggested by the earliest
Germanic literature which consists without exceptions of biblical translations from Greek or Latin."

Clearly, Chang had never looked at comparative Germanic grammar. First, it is not true that the earliest Germanic literature "consists without exceptions" of Bible translations, even though it's almost true of some, in particular Gothic. Second, nobody would invent a system of morphology just to translate a text. Third, the morphology of the attested Germanic languages is cognate between the languages, so that it is not particularly hard to reconstruct the morphology of Proto-Germanic; fourth, it is cognate with IE morphology more broadly, showing that Proto-Germanic inherited it. Fifth, with recent "northern" exceptions, Germanic languages have three genders just like Sanskrit, Greek and Latin.

Still in the same paragraph, extending to p. 33:
"German proverbs and idioms are formulated without indications of case, gender and number, like 'mit Kind und Kegel', 'schwarz auf weiß', 'alt und jung'."

This implies that these idioms are otherwise ungrammatical exceptions within German. For someone who mentioned his "command of […] German" on p. 4, that's a strange thing to say. The adjectives in the last two examples are all predicative; in German, predicative adjectives behave like adverbs and unlike attributive adjectives precisely in being zero-marked. The nouns in the first example happen to be identical in the dative (which is triggered by mit) and the nominative… nowadays, that is; in earlier times (in some styles up to the mid-20th century), Kind did take the dative ending -e, and mit Kinde und Kegel has 126 Google hits. The singular is interesting, but perhaps "every" has dropped out somewhere, or the pattern started with uncountable nouns.

"Moreover, when the Franks settled in France as conquerors, the complex declination system of Vulgar Latin collapsed and Old French emerged without case and number. This historical fact may suggest that the Germans originally spoke a language without declinations."

Funnily enough, Old French and Old Occitan were the only attested Romance languages that kept the Latin distinction between the nominative and the other cases for nouns. The slow merger of these other cases into a single "oblique case" can be traced through the history of Vulgar Latin long before the Franks moved in.

"With Old Chinese as evidence, we may conclude that the Germanic
group of Indo-European was conservative in its phonetical and
grammatical developments because of its peripheral northern location, far from the early high civilizations in the Near East where Hamitic and Semitic were spoken."

With the rest of IE as evidence, we can rather conclude the opposite. There are features where Germanic is indeed conservative, but not many. Innovations happen everywhere; they don't necessarily come from "high civilizations". And there's nothing Afro-Asiatic in the grammar or phonetics of the other IE branches.

"Hamitic" ( = all of Afro-Asiatic except Semitic) was already an anachronism in 1988… and there are no such languages attested in the Middle East anyway. (Sumerian is not Afro-Asiatic, not that it matters.)

"On the other hand, the complicated conjugation system in Greek, Latin and Southern Germanic might have emerged later under the influence of a rich modal and temporal system of Altaic tribes,"

Then why are all Altaic conjugation systems so different? Only Turkic and some Tungusic languages even have person/number agreement, and when it is present, it is transparently agglutinative, not fusional like in IE.

And where are the Altaic words in IE!?! If contact was so intense that even inflectional morphology was sort of calqued, why isn't there a huge heap of loanwords?!?

"with whom Indo-Europeans had coexisted for thousands of years in Central Asia and in whose company they emigrated into Europe."

That's just made up. There's no evidence for any of this in the paper or anywhere else. That must be why I've not even encountered the claim anywhere else.

"As to the relationship of Chinese to Tibetan, this is a dead-end branch of comparative linguistics where some 'Sino-Tibetanists' have devoted their whole life in vain attempts to prove the prevailing hypothesis of a Sino-Tibetan language family."

This may be a stark illustration of how times have changed. I'm not sure there are any doubters of ST left.

"Recently, Colbin (1986) published a list in which he has collocated 489 Sino-Tibetan roots mainly suggested by Paul K. Benedict, Nicholas C. Bodman, Axel Schüssler [sic] and others (see Introduction p. 8). Unfortunately, "Sino-Tibetanists" allow themselves too great freedom when doing phonetic and semantic comparison."

It is in fact true that the Proto-ST "reconstructions" we have, with one exception, have all been done by, more or less, Greenberg's "multilateral comparison", because not many regular sound correspondences between the ST branches have yet been identified. That's because ST, syn- and diachronically, remains starkly underresearched. The 400-odd languages, not all of which have yet been documented in sufficient detail, form over 40 uncontroversial branches; most of the proto-languages of these branches have not been reconstructed yet, and the relations between the branches remain very uncertain.

The mentioned exception is the PST reconstruction by Peiros & Starostin (1996). It uses the good old comparative method… but on only five languages, so clearly a lot of important data is missing. It should prove very interesting to add Proto-Qiangic and Proto-Kiranti to it once those are reconstructed…

"Moreover, a large number of words are claimed to be common Sino-Tibetan, though they are not to be found in Tibetan vocabulary at all (for instance the word cow, cf. Coblin p. 52, cattle/ox). Thus only about a third of the words listed by Coblin may be accepted as common Sino-Tibetan."

…That would make sense if Tibetan were considered the sister-group to the entire rest of the family. And that has never been proposed to this day.

"It is unlikely that there had ever existed a 'Sino-Tibetan' as a common mother language of Chinese and Tibetan, since:
a. Tibetan is syntactically an agglutinative language like Mongolian and Japanese. It uses case suffixes and has neither prepositions nor conjunctions at the head of sentences as is the case in Chinese and in Indo-European."

The way in which Tibetan is agglutinating is very much unlike Mongolian or Japanese. It is, however, quite similar to what is nowadays reconstructed for Old Sinitic, especially by Baxter & Sagart (2014), who deliberately did not take any evidence from outside of Sinitic into account. Several pre- and suffixes, and their status as such, are clearly cognate.

Sinitic doesn't have pre- or postpositions. At first glance it seems to have both – but at second glance, the "prepositions" turn out to be verbs, while the "postpositions" turn out to be nouns.

P. 34:
"b. Though Tibetan word stems are mostly monosyllabic as in Chinese and Indo-European, they are rich in initial consonant clusters like Polish and poor in vowel clusters as opposite to those of
Middle Chinese and Germanic."

…but very much like Old Sinitic. Large vowel systems are a fairly recent occurrence in Germanic (lacking in Gothic), and the enormous di- and triphthong inventory Chang interpreted into Middle Sinitic has not been picked up by any other publication I'm aware of.

"Among the words common for Chinese and Tibetan, there are many Indo-European stems. In comparison with Old Chinese, however, the Tibetan words are lacking final stops and therefore rather akin to those of Tocharian."

This refers to Karlgren's *-b, *-d, *-g for OS.

"c. It is not deniable that there is a small stock of Sino-Tibetan
common vocabulary which is absent in Indo-European. But we must
investigate whether such Tibetan words are borrowings from Burmese or from Old Chinese."

This stock is not only large, it contains so much basic vocabulary that borrowing is an unlikely explanation – and that's before we get to the shared morphology.

"d. In the T'ang period, when China and Tibet established the first
diplomatic relations, nobody ever noticed any common vocabulary
or grammer [sic] of the two languages."

Nobody noticed Indo-European either before Jones, and he was comparing languages from 2000 to 3000 years before his time, not contemporary French to contemporary Bengali.

"In the final analysis, I would surmise that Tibetan may have emerged as a mixed language with an aboriginal and Proto-Indo-European substratum and an Altaic superstratum."

So, the Altaic superstratum imposed the abstract idea of agglutination, but no other part of the grammar (Tibetan is ergative/absolutive, for crying out loud, and Altaic is a prefix-free zone), and the PIE substratum left behind words scattered all over the vocabulary, rather than the typical fields for substratum words (local fauna & flora, local geography…)? I would not call that an "analysis".

P. 35 sketches a historical scenario. Chang points out that the mythical Yellow Emperor (whose "founding of the Chinese Empire" he seems to take for granted) supposedly conquered the place, had a name meaning "wagon shaft" (…"thill"?), "ordered roads to be built, and was perpetually on the move with treks of carriages. At night he slept in a barricade of wagons. He had no interest in walled towns, so only one city was built […]. All of this indicates his origin from a stock-breeding tribe in Inner Mongolia." This is where it gets interesting! "With introduction of horse- or oxen-pulled wagons, transport and traffic in Northern China was revolutionized. Only on this new technical basis did the founding of a state with central government become feasible and functional." And finally, Chang says, "yellow" might mean "blond".

I doubt that last part. But, sure, it is entirely possible that wagons, chariots, sheep, horses, perhaps cattle, and other Chalcolithic innovations were introduced to China by people who spoke Pre-Tocharian and have been mythologized as certain aspects of the Yellow Emperor.

It's just that Germanic has nothing to do with this.

At the end of that page and the beginning of the next, Chang pointed out that the Yellow Emperor did a lot of rectification of names, and interpreted that as founding "the Chinese language" by introducing IE words. But surely it would be automatically said about a mythical culture hero that he was a good Confucian ruler and rectified hundreds of names?

"The rule of Huang-ti [Huángdì = Yellow Emperor] is traditionally dated back to the 27th century B.C. Subtracting 200 or 300 years as hyperbolic predating, we may assume that the founding of the first Chinese empire took place at the latest at about 2400 B.C."

…Right.

P. 39:
"Parallel to the emergence of the Chinese Empire and the Chinese language in East Asia, there were also invasions of Indo-European warriors to the A[e]gean and Adriatic area, to Syro-Palestina and even to Egypt around 2500-2200 B.C. (cf. Gimbutas 1970, pp. 191)."

The Sea Peoples? That's after 1200 BC, and they didn't invade the Aegean and the Adriatic, they may have come from there.
David Marjanović said,

February 24, 2019 @ 5:51 pm

Oops, I forgot to write almost any conclusions of my own.

As I've said before, I expect further research to turn up further Tocharian and, for later stages, Iranian loanwords in Old Sinitic. Germanic, not so much.

I was just wondering how one goes about establishing that the number of resemblances between languages is so much greater than you'd expect that common ancestry or borrowing is probably established.

You don't, really. You take the resemblances you find, see if you can work out a plausible system of sound correspondences which are each backed by a non-negligible number of examples, and "apportion the strength of your belief to the strength of the evidence".

Sorry I was unclear about rakun. I proposed Albanian and Algonquian languages as a euphonious example of languages that were never in contact, and then it occurred to me that there could be words of ultimate Algonquian origin in Albanian. I looked for the most obvious one, and behold, there it was.

Oh! I thought you were looking for coincidental similarities.
AntC said,

February 24, 2019 @ 7:12 pm

Thank you @David M for your diligence and patience. So we seem to be at a dead end wrt IE linguistic interchange with Old Chinese.

I too re-read last night, and particularly noted the legends/myths about the Yellow Emperor's nomadic/Inner Mongolia(?) roots. When you say "Pre-Tocharian" you presumably mean not at all IE?

Now I'm somewhat surprised in the other direction. If there was trans-Eurasian trading contact across the Silk Route for thousands of years, and Indo-Europeans at the Western end of that, and even in the middle (Tocharians and Indo-Iranians), why are there no/few loanwords identified so far?

The trade would exchange spices and silks from the East with what? from the West. Would the goods bring their words with them? To identify only the Tocharian for honey is a pretty poor showing.

AC: the wordlists, which I found horrible: typewriter font with hand-written diacritics and phonological adornments.

What has layout got to do with anything?

Nothing at all. The "horrible" was an aesthetic/legibility comment, meaning that if you were being so diligent as to work through the wordlists in detail, it would be a struggle to follow the notation.

Plenty of scientific journals throughout the 20th century couldn't afford professional layouting and published typewritten papers like that.

Sure. 20/30 years ago I would have been in better practice. Some of the foundational papers in my profession are like that, but most have been re-typeset.
AntC said,

February 24, 2019 @ 7:32 pm

Chang "In the last four years I have traced out about 1500 cognate words which would constitute roughly two thirds of the basic vocabulary in Old Chinese. The common words are to be found in all spheres of life including kinship, animals, plants, hydrography, landscape, parts of the body, actions, emotional expressions, politics and religion, and even function words such as pronouns and prepositions, as partly shown in the lists of this paper."

DM: This wide spread of semantic fields should mean that Chinese is Indo-European;

With respect, I didn't find Chang's claim prima facie implausible (well, apart from the 'function words').

We have the example of English with a huge borrowing from Romance. That hasn't meant that English is Romance. And the story for English with Norman overlords parallels the legends for the Yellow Emperor — that is, if he had been IE/Germanic.

The difference with English is that it has maintained both the Anglo-Saxon vocab and the Romance. But we only know that because we have enough recorded to reliably capture OE (and its Germanic stock) before the Normans. How would (say) late Middle-Ages English have looked if we didn't have that knowledge?

Do we have enough evidence from Old Chinese to reconstruct if it has a substrate and superstrate? Or did the Yellow Emperor just adapt and speak like the locals — as did the Vikings settling in Northern France?
Jay said,

February 24, 2019 @ 9:47 pm

AntC: are there other blogs you can recommend that the critics you refer to in your post of 12:03 pm read or post to? I'm not in the linguistics field, so I have no idea what's popular or respected or well-read.
David Marjanović said,

February 25, 2019 @ 4:26 am

When you say "Pre-Tocharian" you presumably mean not at all IE?

I do mean IE, something ancestral to attested Tocharian, just thousands of years older than that.

I mean, I have no way of knowing if the Chalcolithic pastoralism package was transmitted to China through an intermediary, carried by people who did not speak IE. That would, however, be one additional assumption that I don't need to make.

We have the example of English with a huge borrowing from Romance.

I can only think of one body part with a Romance name in English right now, though: stomach.

The difference with English is that it has maintained both the Anglo-Saxon vocab and the Romance.

And the Anglo-Saxon grammar (what was left of it after the Vikings were done with it)! Celtic (substrate) influence on English grammar has been proposed, and a few aspects seem mainstream these days; Romance (superstrate) influence has hardly ever been proposed, and does not seem to be a widely accepted hypothesis.

Do we have enough evidence from Old Chinese to reconstruct if it has a substrate and superstrate?

We would need a lot more comparative Sino-Tibetan than we have today. However, ST does have a stock of basic vocabulary, and it also has a stock of derivational affixes that were, as we now know but didn't in 1988, still present in OC (and have left traces like voice alternations or tone alternations in MC and often later topolects). On the phonological side, even the Type A/Type B distinction of OC syllables lines up with vowel uvularization ~ velarization in Qiangic languages.
David Marjanović said,

February 25, 2019 @ 4:30 am

(…That is, Type A lines up with syllables whose vowels are uvularized in some extant Qiangic languages, velarized in others, and, well, Type A in Tangut; Type B lines up with syllables that lack this feature.)
AntC said,

February 25, 2019 @ 6:04 am

I mean, I have no way of knowing if the Chalcolithic pastoralism package was transmitted to China through an intermediary, carried by people who did not speak IE. That would, however, be one additional assumption that I don't need to make.

Hmm? You would appear to be making an equally strong assumption the other way: that the only pastoralists throughout (Northern) Eurasia were IE speakers. Are you claiming that prior to the IE expansion out of Anatolia (or Yamna or wherever we start them from), the steppes were completely uninhabited? That is, apart from the pre-Sino-Tibetans and the pre-Mongolic. How does that fit with the preponderance of Ket/Yeniseian toponyms in Siberia?

I can only think of one body part with a Romance name in English right now, though: stomach.

;-) tastebuds?

Only a hungry Phonologist would overlook: palate, uvula, velum, pharynx, cranium, dental, labial, ocular; and if I go below the neck: osteo-, gastro-, ….

Ok those are medical/scientific terms, and quite a few are via Romance from Greek. There are earthier terms that duplicate them. That exactly reflects the prestige given to Romance words over Anglo-Saxon, for many centuries.

I'm not claiming any influence other than a large amount of vocab. But then I don't think Chang was; although I agree with your critique that he rather hedged his bets. Anyway that's another straw horse we don't need to carry on beating.
Eidolon said,

February 25, 2019 @ 9:04 pm

The standards for claiming genetic relationships are much higher than those for claiming loans. I wonder whether Chang would've fared better, had he not made sweeping generalizations about Sino-Tibetan, Indo-European, and Altaic that in hindsight, aged poorly. I surmise his work would've been better received as a "mass comparison" starting point for investigating Eurasian language contact. I feel the same way about Greenberg. There's value to compiling long lists of lookalikes; the problem starts when you begin to draw radical conclusions from them without considering the many limitations of the method.
Jonathan Smith said,

February 25, 2019 @ 10:28 pm

Re: IE>OC loans, as he's noted, Prof. Mair is inclined to look for babies no matter how mingin the bathwater, so yeah… mingin bathwater is mingin. How to deal with the occasional baby (i.e., an interesting bit of data rescued from the ming-deluge) might be worth debating.

@ David Marjanović "On the phonological side, even the Type A/Type B distinction of OC syllables lines up with vowel uvularization ~ velarization in Qiangic languages."

Again… this is manifestly false (or vacuous, depending on what you mean by "lines up"). Can you explain why this particular Hot Take, as you seem generally to avoid them in principle? Did the author of the paper make such a statement? Re-read?
AntC said,

February 26, 2019 @ 2:42 am

The standards for claiming genetic relationships are much higher than those for claiming loans. I wonder whether Chang would've fared better, had he not made sweeping generalizations about Sino-Tibetan, Indo-European, and Altaic that in hindsight, aged poorly.

Small correction: I think Chang made no claims about Altaic, but only Sinitic(-Tibetan) and IE. The Altaic are PW's claims.

No: I think the parts of David M's analysis dealing purely with the phonetics/phonology stand irrespective of the sociolinguistic/ethnologic "sweeping generalisations". In particular: Chang's vowel system for OC not plausible "for any human language"[DM] — something the editor-in-chief/peer review should have sanity-checked[AC]; his (in effect) denial of Grimm's Law — ditto[AC]; and the claim that early/Proto-Germanic did not have inflectional morphology, so the "mass comparison" need look only at stems.

I think that last point needs some teasing out — or would do if it were the only objection. In a contact situation, where some other language is bringing a few words for some new thing with a new name, I think morphology would get lost. 'Aubergine' in French from Catalan from Arabic al+badinjan, where the 'al' is the definite article (same thing with 'algebra' and 'algorithm', etc). Something similar has happened with the ending in 'melanzana' in Italian/etc, for what is a loan from the same source. Even within a language, morphology gets re-analysed: 'an apron' reanalysed from 'a napron', cognate with 'napkin'.

So you could imagine that in a pidgin, the speaker of the lending language would drop any morphology or use a minimally marked form. Particularly if both languages have single-syllable stems. But if it's a case of mass borrowing (per Chang's claim) of many everyday words, that must mean an extended period of close contact. It's hard to imagine the overlords talking pidgin-Proto-Germanic all the time.

I surmise his work would've been better received as a "mass comparison" starting point for investigating Eurasian language contact.

We have to take his work as a case of garbage in-garbage out. The garbage being Pokorny's wordlists and Karlgren's "word-final *-b, *-d, *-g [that] are an error."[DM]

Chang's work is explicitly mentioned as an inspiration by both Chau Wu and PW. Then it's a shame neither of them paid close attention either to Chang's meticulousness or to the flaws that arise from garbage in.

Is it worth doing another mass comparison knowing what we know now? Perhaps limit the source wordlist to those known to be single-syllable (that is, for the unmarked forms in the lending language). I'm not saying that only single-syllable words were loaned; I'm saying that's how to start to build sound pattern correspondences. First question would be: do we start from Germanic or from Iranian or from Tocharian?
AntC said,

February 26, 2019 @ 2:55 am

Is it worth doing another mass comparison knowing what we know now?

Oh, and first do some endogenous/diachronic analysis in the borrowing language: when did words enter the language? Can we see some new form supplanting an old; or there being two sense-alike words at the same time: stomach/belly, cranium/skull, palate/roof-of-mouth?

Or do new words only arrive when the thing arrived? Particularly specifically nomadic technology/wheels/wagons, iron smelting/armoury, weaving/braiding, whatever the IE/Silk Route traders would be bringing. Honey, apparently.
AntC said,

February 26, 2019 @ 4:55 am

Can we see some new form supplanting an old;

Rereading the Honey thread, there's this from Axel Schuessler, forwarded by VHM

… ['honey' in English supplanting PIE *medh-] it happens all the time. The IE root for ‘dog’ survived in Engl. in the specialized word ‘hound’, therefore another word for ‘dog’ filled its semantic place,

Did it happen that way round? 'hound' got specialised, so needing another word to fill the gap? Or was it first that both words meant much the same, and one moved aside to avoid a clash? At first the difference was one of register: earthy Anglo-Saxon working/herding/guarding 'dog' vs refined pet 'hound' for pleasure pursuits/the hunt.

I'm not seeing either member of my belly/stomach etc. pairs moving aside. Again it's a difference of register.

Honey, apparently.

Or equally apparently (from that thread) not. Just as much the Kirin/hrein(deer) thread came to no discernable conclusion.

In fact, is there any well-agreed loanword(s) from any non-Sinitic source into OC or EMC?
David Marjanović said,

February 26, 2019 @ 5:43 am

Hmm? You would appear to be making an equally strong assumption the other way: that the only pastoralists throughout (Northern) Eurasia were IE speakers.

They're the ones we know.

Are you claiming that prior to the IE expansion out of Anatolia (or Yamna or wherever we start them from), the steppes were completely uninhabited?

Oh no – there were hunter-gatherers, and the mysterious Botai culture that had domesticated horses (apparently for food) but no other livestock, no wheels, no agriculture.

That is, apart from the pre-Sino-Tibetans and the pre-Mongolic.

It has never been suggested that Sino-Tibetan was spoken that far northwest. Pre-Mongolic may have been farther east, where pastoralism didn't arrive very soon.

How does that fit with the preponderance of Ket/Yeniseian toponyms in Siberia?

Those are farther north.

Again… this is manifestly false (or vacuous, depending on what you mean by "lines up"). Can you explain why this particular Hot Take, as you seem generally to avoid them in principle? Did the author of the paper make such a statement? Re-read?

The paper is here. Most of it deals with the comparison between Tangut and Rgyalrongic. The last paragraph of the Conclusions is this:

"Finally, Gong Hwang-cherng's observation (2007) that Old Chinese *-j- [i.e. Type B] corresponds with Tangut –j– [i.e. Grade III, reinterpreted as plain/unmarked in the paper] in Chinese-Tangut cognates acquire entirely different implications under the light of the hypotheses proposed in this essay. If Gong Hwang-cherng's hypothesis is correct, the A/B contrast in Old Chinese, now often reconstructed as pharyngealization (Norman 1994, Baxter & Sagart 2014), would be directly cognate with what I reconstruct here as Tangut uvularization. The potential reality of this hypothesis, which can only be decided with yet better descriptive and comparative work on modern Qiangic languages, holds revolutionary implications for Sino-Tibetan."

I have not read, indeed I'm unable to read, Gong (2007), which is:

Gong, Hwang-cherng (龔煌城). 2007. Xīxiàyǔ zài Hàn-Zàng yǔyán bǐjiào yánjiù zhōng de dìwèi 西夏語在漢藏語言比較研究中的地位 [The importance of Tangut in the comparative study of Sino-Tibetan languages]. Language and Linguistics 8(2). 447–470.

Naturally, I would appreciate your perspective on this!

First question would be: do we start from Germanic or from Iranian or from Tocharian?

Go ahead, try all three and see what works! Of course I predict that Germanic will turn out to be a waste of time – but that's a testable hypothesis.

Did it happen that way round? 'hound' got specialised, so needing another word to fill the gap? Or was it first that both words meant much the same, and one moved aside to avoid a clash? At first the difference was one of register: earthy Anglo-Saxon working/herding/guarding 'dog' vs refined pet 'hound' for pleasure pursuits/the hunt.

Both are Anglo-Saxon, of course. Dog at first had a specialized meaning, while hound was the cover term. That is one reason why it appears late in written sources and is at first very rare; the other is that it's morphologically a nickname. Slowly, it became the established term for certain breeds, and was exported to other European languages as such in the 17th century or so, if I remember that right; its meaning expanded more and more, while hound has ended up as a specialized term.

In fact, is there any well-agreed loanword(s) from any non-Sinitic source into OC or EMC?

"Honey" is one; the Tocharologists and the Sinologists seem to agree on it, and no controversy is mentioned on Wikipedia. Honey (at least for mead) was important in IE culture early on, which was not the case in China, IIRC.

"Wax" has been proposed so recently to be a loan from Vietnamese that there hasn't been much discussion of it. I'll post the link to the paper later.

Part of the reason why not more have been (to my knowledge) proposed is that it's difficult. The difficulty lies in reconstructing the languages under consideration to the appropriate time depths. For OC we've been seeing this rather dramatically.
AntC said,

February 26, 2019 @ 8:34 am

"Honey" is one [loanword]; the Tocharologists and the Sinologists seem to agree on it, and no controversy is mentioned on Wikipedia. Honey (at least for mead) was important in IE culture early on, which was not the case in China, IIRC.

Are you sure? Schuessler in the Honey thread seems unconvinced, saying the cognate should (by the phonological rules) be the word for mead rather than honey:

if the Toch.-OC relationship is one of borrowing and not chance resemblance, Chinese must have borrowed it from Toch. Why? The Chinese must have been aware of honey since time immemorial. But: languages engage in strangest borrowing, …

Beekeeping in ancient China has existed since ancient times and appears to be untraceable to its origin. [wikipedia, mentions written reference from the Spring and Autumn period, but that's hardly ancient.]

So I'd expect a pre-existing word for honey in OC, that got supplanted by Toch for mead.

If we don't know what was the pre-existing word, then how can we say *mit is a borrowing, as opposed to that it was there all along, and the resemblance is chance?

To say "the strangest borrowing" seems to me to open the barn door for more dubious word-chasing.
Jonathan Smith said,

February 26, 2019 @ 12:46 pm

@ David Marjanović. Ah thank you, the direct quote is much more cautious. Off-topic here but an interest of mine, so one of a number of examples: MC "Div. IV" syllables are transcribed into Tangut "Grade III" (hmmm might this suggest that "Grades" simply reflect northwestern late medieval Chinese where "Div. IV" >> "Div. III"?) but of course come from OC "Type A", so given the deep cognacy you suggest, we should be able to find Tangut "uvularized" = MC "Div. IV" in this subset of cases… but this might be hard since to begin with have defined Tangut "uvularized" largely by reference to transcriptional relationships with MC I/II AS OPPOSED TO with III/IV :/ :/ :/ etc., etc. It would probably be good to consider Tangut in light of the copious data on contemporary northwestern LMC… a point I think you would heartily agree with were this an IE problem :D
Chris Button said,

February 26, 2019 @ 12:50 pm

Regarding some comments above about Proto-Sino-Tibetan (i.e. Old Chinese plus Proto-Tibeto-Burman), James Matisoff's output is definitely worth checking out. He doesn't force the system to meet a preconceived agenda, but instead posits "allofamic" variation so as not to sweep anything under the rug. His system also happens to be very supportive of "ə/a" in Old Chinese since he has recently just removed "e" and "o" from Proto-Tibeto-Burman, and readily acknowledges that evidence for "i" and "u" is marginal at best.

As for Type A and B syllables in Old Chinese, I still think the best proposal is Pulleyblank's as bolstered by Ferlus' observations. Starostin's (later adopted by Baxter & Sagart) inspired suggestion of a correlation with Mizo (and by extension Kuki-Chin) vowel length is essentially looking at the same data (Pulleyblank went with Sizang as his Kuki-Chin language of choice), but does not take into account that the length is simply a surface phenomenon resulting from far more complex tense/lax alternations running across the syllable. As a consequence, Starostin's quest for a nice alignment of Kuki-Chin vowel length with Old Chinese syllable type is bound to be in vain, and Baxter & Sagart's (2016) attempt to imbue it with some statistical support is hardly convincing in my opinion.
AntC said,

February 26, 2019 @ 7:38 pm

First question would be: do we start from Germanic or from Iranian or from Tocharian?

Go ahead, try all three and see what works!

Come to think: going by those probabilities from Rosenfelder, there are bound to be (chance) pseudo-cognates. Then the real surprise is that so few have been found. There surely must be at least one between Tocharian and OC; then honey is it. I take Schuessler to be saying the correspondence isn't entirely explicable — that's consistent with it being chance.
David Marjanović said,

February 26, 2019 @ 7:54 pm

The paper that proposes 蠟 "wax" as a Vietnamese borrowing is here; the goods are mostly in two paragraphs on p. 5. ("Ancien chinois" is meant to be what others have called "Late Hàn Chinese", between Middle and Old.) The non-linguistic context is: the character only shows up in the 2nd to 3rd centuries, and is even kept out of the considerably later rhyme dictionaries; wax is important in Buddhist rituals and was imported from Cambodia.

If we don't know what was the pre-existing word, then how can we say *mit is a borrowing, as opposed to that it was there all along, and the resemblance is chance?

In short, we don't. The argument, however, rests on the cultural significance of mead in IE (where the meanings "mead", e.g. in Germanic, and "honey", e.g. next door in Slavic, interchange a lot): rather than basic vocabulary or the kind of landscape/flora/fauna term that is often kept from a substrate, it would be cultural vocabulary and as such easily borrowed. Likewise, wax cannot have been literally unknown in China before Buddhism was introduced, but Buddhism made it part of the cultural vocabulary and thus borrowable.

MC "Div. IV" syllables are transcribed into Tangut "Grade III" […] but of course come from OC "Type A", so given the deep cognacy you suggest, we should be able to find Tangut "uvularized" = MC "Div. IV" in this subset of cases…

As cognacy, not necessarily as transcription.

(hmmm might this suggest that "Grades" simply reflect northwestern late medieval Chinese where "Div. IV" >> "Div. III"?)

Indeed, the paper does not say or imply that "Mediaeval Héxī Chinese" distinguished Div. III from IV. Both III and IV are transcribed as Tangut III, which in turn is transcribed as Div. I/II in the other direction.

have defined Tangut "uvularized" largely by reference to transcriptional relationships with MC I/II

No – by reference to cognates in other Qiangic languages.

He doesn't force the system to meet a preconceived agenda, but instead posits "allofamic" variation so as not to sweep anything under the rug.

Does he now reconstruct specific affixes to explain this variation?

As for Type A and B syllables in Old Chinese, I still think the best proposal is Pulleyblank's

That's short vowels for Type A, right? Because pharyngealized vowels in Tuva and Tofalar correspond to short vowels elsewhere in Turkic.
Jonathan Smith said,

February 26, 2019 @ 9:45 pm

"No – by reference to cognates in other Qiangic languages."

No, Tangut "Grade" is an implicit lexicographical category (this itself presents problems which I tried to outline above maybe unsuccessfully.) If it had been possible to make primary reference to cognate relationships, that would obviously be great and we wouldn't have this paper / this debate.

"Both III and IV are transcribed as Tangut III, which in turn is transcribed as Div. I/II in the other direction."

No…

So I like Xun Gong's paper; it is author's prerogative to be bold. At the same time, it is readers' responsibility to question. Set these two forces against each other — progress!! Sounds-good-ism by folks who aren't that familiar with or invested in the problem in the first place, though, gums the works.
Chris Button said,

February 26, 2019 @ 11:11 pm

That's short vowels for Type A, right?

Pulleyblank's proposal deals with moraic weight which he terms a "prosodic accent". I've posted about this on LLog before – you respond by again citing the paper by Xun Gong (a version of which I believe I incidentally heard him present in Jena a little under a year ago):

http://languagelog.ldc.upenn.edu/nll/?p=41346#comment-1558927
Chris Button said,

March 1, 2019 @ 2:51 pm

Ferlus' suggestion that 蠟 "wax" is a borrowing from Vietnamese is interesting. I wonder if the same can be said about 蓮 "lotus" which is also in Maspero's (1912:80) list of four Vietnamese s- correspondences? However, I don't think a case can be made purely on phonological grounds since 力 also makes the list and goes all the way back to the oracle-bone inscriptions.
