Reinterpretation of Xianbei qifen ("grass") and its reflection in Mongolic


[This is a guest post by Penglin Wang]

The Chinese transcription of foreign words has made a unique and valuable contribution to our understanding of linguistic situations in early Inner Asia, but it was sometimes inevitably fraught with logographic confusion and scribal errors. Even given quite advanced word-processing and printing in modern times, one can hardly prevent miswriting or misspelling from happening. In ancient China, presumably, it was historians and other authors who heard foreign words spoken and jotted them down, and then further changes developed through the involvement of scribes, typographers, and printers, with each possibly committing their own miswritings and infelicities. It is therefore necessary to reinterpret certain transcriptions on the basis of the known philological and linguistic relevance of what came to be written down.

The choice of the character 俟 (pronounced qi and si in Putonghua depending on its meaning or contextual use) in 俟斤, 俟利發, and 俟汾 corresponding respectively to Old Turkic erkin~irkin (official title), elteber~ilteber (official title), and Written Mongolian ebesü(n) ‘grass’, must be an erroneous substitute for 埃 (ai). I have discussed the wrong use of 俟 in 俟斤 due to its phonetic deviance from the word-initial vowel e~i in Turkic erkin~irkin as a result of the logographic similarity between 俟 and 埃 (Wang 2020). In this short note I want to extend my points to the Xianbei 俟汾.

The earliest known mention of Qifen (俟分) as an ethnonym for one of the twelve hordes of Gaoche (高車) is in Weishu (103.2310) completed in 554 and then in Beishi (98.3273) completed in 659. In their ethnographic account of the Yuwen (宇文) tribe the authors of Xin Tangshu (71.2403) completed in 1060 consider that the ethnonym Yuwen originated in 俟汾 through gradual phonetic changes and try to justify their etymological solution with the meaning of 俟汾 as ‘grass’. The authors imply that Yuwen came forth from China’s Shennong (神農) ‘divine farmer’: “After Shennong was defeated by the Yellow Emperor, his descendants fled to the North for habitation. The Xianbei customarily called grass 俟汾, and since Shennong was able to taste grass, they called themselves 俟汾, which evolved to yuwen through phonetic erosions”. 

By reinterpreting 俟汾 to 埃汾 (aifen) and reconstructing the ancient pronunciation of the latter as *eben, we can locate its etymological and cognate reflection in Written Mongolian as ebesü(n)~ebüsü(n) ‘grass, hay, herb’, of which ebe-~ebü- is the root and -sü(n) is the suffix. It is worth noting that the stem ebesü-~ebüsü- is deep-rooted in Written Mongolian and is productive in its derivation: ebesüči(n) ‘mower, hay-maker, seller of grass or hay’, ebesüd- ‘to become weedy, be[come] covered with grass or herbage’, ebesüle- ‘to feed cattle with grass or hay; to feed on growing herbage; to lead to pasture’ (Lessing 1960:287, 291).

 

References 

Beishi (北史 History of the Northern Dynasties) compiled by Li Yanshou in 659. http://hanchi.ihp.sinica.edu.tw/ihp/hanji.htm. 

Lessing, F. D., General Editor. 1960. Mongolian-English Dictionary. Berkeley: University of California Press. 

Wang, Penglin. 2020. The Turkic influence on Kitan: Kitan bori (the name for an evil person) and Old Turkic böri ‘wolf’. Языки стран Дальнего Востока, Юго–Восточной Азии и Западной Африки: материалы XIV Международной научной конференции, edited by А. Ю. Вихрова, 55-59. Москва: Ключ-С. (See http://www.cwu.edu/anthropology/penglin-wang)

Weishu (魏書 Book of Wei) compiled by Wei Shou in 554. http://hanchi.ihp.sinica.edu.tw/ihp/hanji.htm.

Xin Tangshu (新唐書 New Book of Tang) compiled under Ouyang Xiu in 1060. http://hanchi.ihp.sinica.edu.tw/ihp/hanji.htm

 


24 Comments

  1. Victor Mair said,

    August 23, 2021 @ 5:04 am

    From Juha Janhunen:

    This sounds like a correct interpretation and provides evidence that the Xianbei (or some part of them) spoke (Pre-Proto-)Mongolic.

  2. Marcel Erdal said,

    August 23, 2021 @ 6:14 am

    Perfect connection. It should, also for this word, be noted that there is no Turkic cognate.

  3. David Marjanović said,

    August 23, 2021 @ 9:28 am

    I'm wondering about the f of fen, though. At the time it wasn't [f]; but was [pj], or whatever it was, really a good fit for a Mongolic b?

    Also, ai for what seems to have been [ə] strikes me as an odd choice (the [ɛ] of modern Khalkh, but not e.g. Chakhar, is most likely an innovation).

  4. Peter B. Golden said,

    August 23, 2021 @ 9:31 am

    See Andrew Shimunek, Languages of Ancient Southern Mongolia and North China. A Historical-Comparative Study of the Serbi or Xianbei Branch of the Serbi-Mongolic Language Family, with an Analysis of Northeastern Frontier Chinese and Old Tibetan Phonology (Wiesbaden: Harrassowitz Verlag, 2017), who notes (p. 338) Yuwen (a Middle Serbi [Xianbei] dialect): *ǝbun/*ǝbǝn, “cognate to Middle Mongol ebesü ~ ebesün, Common Serbi-Mongol *ǝbe-n ‘grass’”; see also pp. 454-458 for discussion of the *sU suffix, which he views (p. 338, n. 213) as a morphological innovation in Pre-Proto-Mongolic distinguishing the Mongolic branch from Serbi.

  5. Chau said,

    August 23, 2021 @ 11:37 am

    Thank you for the enlightening post.

    "By reinterpreting 俟汾 to 埃汾 (aifen) and reconstructing the ancient pronunciation of the latter as *eben, we can locate its etymological and cognate reflection in Written Mongolian as ebesü(n)~ebüsü(n) ‘grass, hay, herb’, of which ebe-~ebü- is the root…"

    Latin herba means 'grass'. Assuming H-dropping occurring in Northern China during the Serbi/Xianbei times, the L. herb- loan might develop into Serbi *eben 'grass'.

  6. Penglin Wang said,

    August 23, 2021 @ 3:45 pm

    I am thrilled with and highly appreciate all the comments on my post and want to add some remarks. Chau’s insightful tracing of WMo (Written Mongolian) ebesü(n) ‘grass’ to Latin herba ‘grass, herb’ is well supported by the loss of the Latin initial h in Mongolic. Moreover, the instability of the postvocalic liquid r~l in Mongolic is another factor for us to connect WMo ebesü(n) with Latin herba, i.e., the r of herba dropped off in Mongolic as well. Thus, we can reconstruct the root *herbe- for Xianbei and Mongolic. Such reconstructions find support in the connection between Old English hyhtan~hihtan ‘to hope, trust’ and WMo itege- (< *hilte-) ‘to believe, trust, rely on, hope’. In this case, the early Mongolic form *hilte- is echoed at present in two variants of Eastern Yugur, ļdeğe- and hətege- ‘to trust’, with the former having the initial hə deleted and the latter losing the ļ in the postvocalic position in the initial syllable (Wang 2000).

    The ancients disagree on the origin of the ethnonym yuwen. Zhoushu (1.1) completed in 636 holds that the Xianbei customarily called heaven yu (宇) and monarch wen (文), implicitly glossing yuwen as ‘heavenly monarch’. I do not believe that yuwen consists of two morphemes and instead treat it as monomorphemic. But I consider the sense of ‘heavenly monarch’ more or less relevant to yuwen. Phonetically, there is no problem at all reconstructing *man for wen of yuwen. And I prefer to reconstruct *ü or *ȵü for yu of yuwen. Thus, we have the form *üman or *ȵüman meaning ‘eight’ and connected with WMo naiman and Baoan nimaŋ ‘eight’. If the former, the initial nasal dropped off.

    Why ‘eight’? First, the people in Inner Asia were accomplished in using numerals in ethnonyms, including Naiman meaning ‘eight’. Second, the interested historians seem to be united in their opinion that Kitan developed ethnically from Yuwen. Third, Liaoshi (32.377), in its ethnographic and ethnogenetic account of Kitan, is characterized by its thick discourse of babu (八部) ‘eight hordes’ or gu babu (古八部) ‘ancient eight hordes’. And finally, in ancient times the Chinese numeral ba ‘eight’ was culturally and symbolically associated with Bagua (八卦) or eight diagrams. According to Zhouyi (周易 Book of Changes), Baoxishi (包牺氏), legendary Chinese progenitor and emperor, created Bagua to comprehend the virtue of divine brilliance and patternize the phenomena of myriad things. The octonary ‘divine brilliance’ here is compatible with ‘heavenly monarch’. That is why I am arguing for more or less relevance of ‘heavenly monarch’ to yuwen. This means: I do not agree with the proposition of the authors of Xin Tangshu to connect qifen with yuwen; qifen should be aifen; WMo ebesü(n) is cognate to Xianbei aifen; WMo naiman ‘eight’ is again cognate to yuwen.

    References

    Liaoshi (遼史 History of Liao) compiled by Tuotuo in 1344. http://hanchi.ihp.sinica.edu.tw/ihp/hanji.htm.

    Wang, Penglin. 2000. Lexical connections between Germanic and Mongolic. Interdisciplinary Journal for Germanic Linguistics and Semiotic Analysis 5.1:71-91.

    Zhoushu (周書 Book of Zhou) compiled by Linghu Defen in 636. http://hanchi.ihp.sinica.edu.tw/ihp/hanji.htm.

    Zhouyi (周易 Book of Changes). https://ctext.org/text.pl?node=46944&if=gb&show=parallel&remap=gb.

  7. Alexander Vovin said,

    August 23, 2021 @ 7:06 pm

    I am sorry, but your logic for replacing 俟 with 埃 here and in your 2020 article you refer to escapes me. Ditto for the same replacement that Shimunek suggested as Peter pointed out. What is the evidence besides the necessity to connect Mong. ebe(sü)n with this word? But I trust we might have an additional problem here: The book 103 of Wei Shu was largely reconstructed on the basis of the book 98 of Bei Shi by the Song period scholars. And this means that we cannot use LHC reading Ɂə of 埃, which is sorely needed for Mong. ebe(sü)n ( e = ə). At the earliest, it is EMC Ɂậi, that you mention, but it would underlie OT *ärkin, not ėrkin > erkin ~ irkin (leaving aside the problem whether it is Turkic, Mongolic, etc. title).
    Also MM h- is quite stable. MM ebesün is amply attested in both EMM and WMM, while MM *hebesün is not attested at all.

  8. Penglin Wang said,

    August 23, 2021 @ 11:01 pm

    Although I respect phonologists’ efforts to reconstruct ancient Chinese words as homogeneously as possible, in my discussion of the erroneous logographic substitution I rely on the bona fide established pattern of Chinese transcription of foreign words. In the case of the Chinese 埃 (ai), we have recurring and established pattern for 埃 to correspond to the foreign e: Eddington 埃丁頓, Edgar 埃德加, Edmund 埃德蒙, Edwin 埃德溫, Eleanor 埃莉諾, Emma 埃瑪, and so on and so forth. However, the pronunciation of 埃 is definitely not homogeneous in Chinese: Putonghua ai, Cantonese āi, oi, ai, Hakka ai, yai, and Wu ei, a, e, ɛ. Hypothetical reconstruction often in isolation and out of context pales in significance in the face of such a solid pattern. It will be wrong if the future phonologists try to reconstruct a single homogeneous form for 埃. On the other hand, it will be preferable if they reconstruct ai for Putonghua by recognizing the existence of regional dialects.

    It is, moreover, irrefutable that such dialectal differences existed throughout history. Confucius instigated the ideology of yayan (雅言) ‘gracious speech’ and became a role model in using it in education and ceremonies. Mencius once ridiculed the speech of southern dialect speakers as shrike twittering. It is my understanding that the Confucian concept of ‘gracious speech’ refers to one mainstream dialect which rose to the level of a common language serving as a cross-dialectal tool for communication and which eventually evolved to be what is today’s Putonghua. And Mencian ‘shrike twittering’ is a piece of strong evidence for the dialectal differences in ancient China.

    The attestation of the initial fricative h in Middle Mongolian is in itself indicative of its continuation from and does not preclude its loss in earlier Mongolic. With increases in transcontinental flow of human population and exchange of ideas in many parts of ancient central Eurasia, it is becoming increasingly more difficult to think of vocabulary of a language as an isolated, insulated entity. In this sense, Chau’s idea of connecting Latin herba with Xianbei *eben ‘grass’ could be a clue to the loss of word-initial h- in Xianbei and Mongolic *herbe- ‘grass’.

    I thank Professor Golden for drawing my attention to the reconstruction of yuwen in the form of *ǝbun/*ǝbǝn ‘cognate to Middle Mongol ebesü ~ ebesün, Common Serbi-Mongol *ǝbe-n “grass”’, but I do not see ‘Ditto for the same replacement’ in his posting. I will check this issue later.

  9. Andreas Johansson said,

    August 24, 2021 @ 1:06 am

    Also MM h- is quite stable. MM ebesün is amply attested in both EMM and WMM, while MM *hebesün is not attested at all.

    MM h- may be stable, but Imperial era Latin h- is not, so I don't think that's a problem.

    Why grasslands-dwellers in the far east would borrow a word for "grass" from the far west is another question of course.

  10. David Marjanović said,

    August 24, 2021 @ 3:46 am

    What Andreas Johansson said.

    I mean, it's possible that the speakers of Mongolic didn't "always" live in grasslands, and that, when they entered that landscape, they borrowed the vocabulary to describe that landscape, maybe even something as utterly basic as "grass", from a language that was already there. The list of choices for such languages is pretty long – but why in blazes would it include Latin? Likewise, how would a Germanic word get there that early, and why?

    Thus, we have the form *üman or *ȵüman meaning ‘eight’

    What happened to vowel harmony?

    In the case of the Chinese 埃 (ai), we have recurring and established pattern for 埃 to correspond to the foreign e: Eddington 埃丁頓, Edgar 埃德加, Edmund 埃德蒙, Edwin 埃德溫, Eleanor 埃莉諾, Emma 埃瑪, and so on and so forth.

    This pattern, quite reasonably, uses [æ] – the pronunciation in ai in many even near-standard Mandarin accents – to approximate foreign [ɛ].

    But early Mongolic, by all appearances, had no such sound at all, neither [æ] nor [ɛ] nor for that matter [e]. What Mongolists traditionally denote as e appears to have been [ə], which it still is in about half of the Mongolic language family today, and which fits the theoretical expectations for a vowel-harmony system based on neutral vs. retracted tongue root (as observed today in all Mongolic languages that haven't lost vowel harmony altogether).

    This is why Prof. Vovin just commented that the Late Han Chinese "reading Ɂə of 埃 […] is sorely needed for Mong. ebe(sü)n ( e = ə)".

    The reasons for the traditional denotation as e are as follows:

    1) The modern Mongolic idiom most Mongolists are most familiar with is Khalkh, which has shifted [ə] to [ɛ] (and spells it accordingly). As part of the same shift, it has also shifted [o] to [ɵ] – but, IIRC, just the short one; the long one is still [oː], which is noteworthy because vowel length is a pretty recent development that is still lacking in Written Mongolian.

    2) The vowel-harmony systems most phonologists are most familiar with are those found in and around Europe: the backness-based systems of Uralic and Turkic languages (recently lost in some of them without replacement, e.g. Estonian and Uzbek). The systems found in Mongolic (and most of Tungusic, and other languages of the region) are different, but similar enough that most Mongolists preferred to assume they were derived from a backness-based system by a few very odd sound shifts. This made it possible to assume that early Mongolic had practically the same vowel system as (early) Turkic. (This assumption used to be so widespread that it underlies the Cyrillic orthography of Khalkh and Buryat, and almost every Latin transcription I've ever seen.) Only in the last ten years has RTR-based harmony really been understood.

    3) The availability of e on typewriters. It's really hard to write ə on a typewriter designed for English, French or Russian: take the paper out, put it back in the other way around, finagle with the typewriter for ten minutes to find the exact spot, type e.

    4) In order to maintain backwards-compatibility, historical linguists tend to be very conservative in their choice of notation; compare the continuing use of *h₂ by Indo-Europeanists despite widespread agreement that it was a plain old [χ].

    (There is also a widespread myth in historical linguistics that sounds cannot be reconstructed anyway, only abstract phonemes – so if there's no evidence for a contrast between [ə] and [ɛ] or [e] in the language you're reconstructing, you can just write e without misleading anyone. That's true sometimes. In other cases, it is easier to reconstruct sounds very precisely than to figure out which phonemes they belonged to.)

  11. Chris Button said,

    August 24, 2021 @ 8:28 am

    @ Alexander Vovin

    And this means that we cannot use LHC reading Ɂə of 埃, which is sorely needed for Mong. ebe(sü)n ( e = ə). At the earliest, it is EMC Ɂậi, that you mention,

    埃 is EMC Ɂəj (LMC Ɂaj) in Pulleyblank’s reconstruction. I wonder what might have been more appropriate? Incidentally, the -j coda that remains unaccounted for in many Old Chinese reconstructions comes from an earlier velar glide coda in his system.

  12. Penglin Wang said,

    August 24, 2021 @ 12:48 pm

    Wonderful discussions.

    Andreas has made a very good point concerning the phonetic variation of h- and raised again a very good question: Why would grasslands-dwellers in the far east borrow a word for ‘grass’ from the far west? There are two schools of thought on how to deal with the similarities in languages. One school holds a genealogical approach and organizes language families. The late Professor Joseph Greenberg put forward a very large family called the Eurasiatic Language Family including Indo-European and Altaic among others. Given this approach, many words of similar phonetics and relatable meaning found in Altaic and Indo-European could be considered as cognates. So, we do not need to bother ourselves who borrowed from whom and why people borrowed words from other languages afar. The other school treats linguistic similarities as the result of language contact. I prefer to use this approach. In my article (Wang 2000) I argued:

    "My assumption is that the historical long-term contact between Altaic and Indo-European in the Eurasian steppes could lead to repeated circulation of basic words. Ancient vocabulary was not as complex and abundant as today’s. Moreover, the proportion of basic words in ancient vocabulary was significantly larger than today’s. For example, if basic words represent fifteen percent of today’s total vocabulary, they were eighty percent of the ancient one. So what circulated were mostly basic words."

    The ethnonym Yuwen can be traced back to the third century according to Weishu (1.5), which records the Bigman Mohuan (莫槐) of the Xiongnu Yuwen tribe. Mohuan lived in the third century up to the year 293. I have no knowledge of if vowel harmony existed in the language of those who initiated the name yuwen. Following David’s suggestion, I shall now take vowel harmony into account and adjust my hypothetical reconstruction of *üman or *ȵüman to *ïman or *ȵïman for yuwen.

    How would a Germanic word get there that early, and why? Because the Xiongnu people in Inner Asia in general and in historical Mongolia in particular were interacting with their western neighbors – the Yueshi and Wusun groups speaking Indo-European languages, the former of which, according to some researchers’ suggestion, was what would become the Tokharian, we must assess the impacts that the Xiongnu and Yueshi were having on each other in spite of their confrontation with each other. From that point it was a natural development to follow vocabulary pooling into each other’s language to see how the two languages interacted. The Altaic groups essentially inhabited the Xiongnu territory and could be componential parts of Xiongnu. According to Douglas Adams, the Tokharian and Germanic share lexicon (by memory). Furthermore, Ronald Ringer with a coauthor argue for Indo-European homeland in the area lying between the Black Sea and Caspian Sea (by memory, I’m responsible for any inaccuracies). This area is very close to Central Asia and hence easy for Indo-European to radiate from in the eastern, Altai direction. Some people see the Altai mountains as natural barriers for transcontinental exchange as if they were insurmountable. In reality, according to Russian archaeologists, there was crowded traffic in the Altai mountainous area in ancient times, and some people built up walls to block the traffic (by memory). Apparently, many early people saw mountains as connections between peoples and wanted to explore the other sides of the mountains.

  13. David Marjanović said,

    August 24, 2021 @ 2:30 pm

    One school holds a genealogical approach and organizes language families. The late Professor Joseph Greenberg put forward a very large family called the Eurasiatic Language Family including Indo-European and Altaic among others. Given this approach, many words of similar phonetics and relatable meaning found in Altaic and Indo-European could be considered as cognates. So, we do not need to bother ourselves who borrowed from whom and why people borrowed words from other languages afar. The other school treats linguistic similarities as the result of language contact.

    Oh, I'm sorry, this is a fundamental misunderstanding of both the "long-rangers" and the "splitters".

    Both schools agree that the null hypothesis, the default explanation, for any similarity between any languages is random chance. Only when chance is shown to be a very improbable explanation for a similarity can we look to other explanations and try to test them against each other and against the null hypothesis.

    Latin [(h)ɛrba] and Proto-Mongolic *[əb̥ə(su)n] are barely even similar. Why do you think what little similarity they have requires any explanation at all?

    I have no knowledge of if vowel harmony existed in the language of those who initiated the name yuwen.

    If that language was Mongolic or closely related to Mongolic, the simplest hypothesis is that it had RTR-based vowel harmony. This is both because this is so widespread (if not universal) among Mongolic languages today and because it is so widespread among other language families of northeastern Asia that it must be quite old in that area.

    According to Douglas Adams, the Tokharian and Germanic share lexicon (by memory).

    Not more than any other two branches of Indo-European. Tokharian and Germanic are not closely related within Indo-European either, and they have literally never been in direct contact (Iranian may have been in direct contact with both, but evidence for contact with Germanic before the Alans of the 5th century is very sparse at best).

    Ronald Ringer

    Donald Ringe.

    argue for Indo-European homeland in the area lying between the Black Sea and Caspian Sea

    Yes, this is the current consensus (at least for the last common ancestor of Tocharian and Indo-Actually-European, not necessarily for the last common ancestor of that and Anatolian).

    But Germanic formed only after an IE expansion in the opposite direction (northwest). There is ample loanword evidence showing that Germanic and Finnic have been in contact for thousands of years.

    In short, I expect future research (assuming past research hasn't done that already) to uncover plenty of Tocharian and Iranian words in the reconstructed lexicon of Proto-Mongolic, and perhaps vice versa (even if these words were perhaps passed on by intermediaries). But Germanic or Italic – I see no reason to consider this in the continued absence of much stronger evidence than you have yet provided.

  14. Penglin Wang said,

    August 25, 2021 @ 9:32 am

    This is an addendum to my posting: Shimunek (2017:338) identifies the substitute of ‘俟 for 矣’ and reconstructs the form *əbun/*əbən for Xianbei [俟汾] ‘grass’ cognate to Middle Mongol ebesü ~ ebesün ‘grass’.

    I am sorry that Shimunek’s book escaped me when I posted my note. I thank Professor Golden for drawing my attention to Shimunek’s findings and Professor Vovin’s follow-up reminder of ‘Ditto for the same replacement’. I apologize for my misunderstanding of their points. (Shimunek, Andrew Eric. 2017. Languages of Ancient Southern Mongolia and North China: a Historical-Comparative Study of the Serbi or Xianbei Branch of the Serbi-Mongolic Language Family, with an Analysis of Northeastern Frontier Chinese and Old Tibetan Phonology. Wiesbaden: Harrassowitz Verlag).

  15. Chau said,

    August 31, 2021 @ 9:07 pm

    @David Marjanović: "The list of choices for such languages is pretty long – but why in blazes would it include Latin?"

    I was the person who made the suggestion of a possible connection between Latin herb- 'grass' and Serbi *eben 'grass'. Therefore, I feel that it is my responsibility to answer any questions regarding this possible connection. Hence this reply. It turns out that Taiwanese (Tw) has a word corresponding to a descendant of L. herba, thereby prompting me to make the suggestion, but I will come to the Tw word later. First I would like to address the fundamental issues of (1) presence of Latin words in Asia (especially in Sinitic/Taiwanese), and (2) that the default explanation for any similarity between any languages is random chance.

    A single Latin word, standing alone in the vast landscape of Asia and itself not even 100% similar to the intended comparandum (L. herb- vs Serbi *eben), deserves all the skepticism. Even if we could round up a lot of words with some sort of similarities, still it would not be convincing. I totally agree. However, I think I can address the issue of presence of Latin words in Asia as well as removing doubts about random chance for correspondence. I will use Taiwanese-Latin correspondences as I know nothing about Serbi language.

    To address both issues at the same time, like killing two birds with one stone, I am going to present a pattern of sound correspondence (PSC) to which the pair of Latin herb- (or more precisely, a Romance descendant of it) and its corresponding Tw word belongs. To address the first issue, I select data only from Latin lexicon, excluding other sources such as Greek and Germanic. Second, when the pattern is supported with enough examples, the correspondence is said to be *regular*. This will remove the thorny issue of random chance. How many examples are needed for a pattern to be regular?

    David P. Branner states in his book, The Classification of Miin and Hakka (p. 15), that, "For a correspondence to be regular, the correspondence set in which it is attested should contain a reasonable number of examples. It is conventional to demand, at the very least, three." To be beyond any doubt, I set a minimum of six examples for a pattern to be qualified as regular.

    L. herba has two meanings: 'grass' and 'herb', and its descendants in Romance languages retain both meanings. In Sinitic/Taiwanese two Sinographs are used to write 'herb, drug, medicine': 藥 and 葯, both with the same two pronunciations. The major pronunciation is io̍k (l.) / io̍h (v.), which can be derived from Greek ΐός 'poison' through an -os > -ok/-oh sound change.

    The second pronunciation ia̍k, is less common but more interesting because it can be correlated with the descendants of L. herba in East Romance languages: Rumanian iarbă and Dalmatian (Vegliotic) yarba. The correlation is derived through a PSC -ar > -ak. Thus, Rum. iarbă (Dalm. yarba) > iar- > Tw ia̍k 藥 / 葯 'herb'. The sound correspondence can be generalized to L. -Vr > -Vk, where V is any vowel. Examples of Latin-Taiwanese correspondence set for the PSC are shown below. The colon symbol (:) stands for "corresponding to".

    At this point, it should be emphasized that different receiver languages may handle the same foreign loan differently, according to their own phonological rules. Sinitic/Taiwanese take the iar- > iak route, whereas Serbi/Xianbei obviously took another approach: L. (h)erb- > Serbi *eben.

    Abbreviations: lit., literally; l., literary pronunciation; v., vernacular pronunciation.

    Latin dictionary: Latin-English Dictionary, by Sir William Smith and Sir John Lockwood. Publishers: Chambers (Edinburgh) and John Murray (London), 1933.

    PSC: L. -Vr:Tw -Vk (with V = any vowel)

    (1) Late L. sera 'evening' > ser-:Tw se̍k 夕 'evening'.
    (2) L. cera 'a writing tablet covered with wax' > cer-:Tw chhek (l.) / chheh (v.) 冊 'book' (with palatalization of the Latin k sound to [tʃ], written in POJ as chh-). [There is in Museo Nazionale, Naples, a "Portrait of a Man and His Wife" from the house of Terentius Neo, Pompeii. It shows the wife holding wax tablets and a stylus. I will send a digital file of the painting to interested individuals upon request.]
    (3) L. bēryllus 'precious stone, beryl' > bēr-:Tw phek 碧 as in 碧玉 'emerald'
    (4) L. fortuna 'fortune' > for-:Tw hok 福 'fortune' (Tw lacks the f sound, f- > h-)
    (5) L. arbor 'tree' > (aphetic) *-bor:Tw bo̍k 木 'tree, wood'
    (6) L. taurus 'bull' > It./Sp. toro > *tor-:Tw to̍k 犢 'calf' (with a semantic shift)
    (7) L. corium 'skin, hide' > cor-:Tw kok 鞹 'skin, hide'
    (8) L. cerealis 'cereal' > cer-:Tw chhek (can be written as 粟) 'un-husked rice' (with palatalization of the Latin k sound to [tʃ], written in POJ as chh-)
    (9) L. spīrāre 'to breathe' > spīr- > *sir-:Tw sek 息 'breath' (*sik is written sek).
    (10) L. sera 'a movable bolt or bar for fastening door' > ser-:Tw sek 軾 'a horizontal bar in a carriage for resting arm/hand'
    (11) L. series 'a chain, row, succession' > ser-:Tw sek 式 'id.' as in hâng-liat-sek 行列式 'rows and columns', tiān-náu-thêng-sek 電腦程式 'computer program'
    (12) L. vērum 'what is true or real, the truth, reality, the fact' > ver-:Tw pe̍k 白 (original meaning of 白 is 'white', here it is borrowed to transcribe the sound with the meaning of 'the truth') as in piáu-pe̍k 表白 'to tell the truth'; kò-pe̍k 告白 'confession'; bêng-pe̍k 明白 (lit.) 'Clear is the truth' (Alles ist klar) > 'understand'. Side note: That v in a foreign loan becomes p in Tw is supported by many examples, the best known of which is a transcription from Buddhism: nirvana > Tw lia̍p-poân 涅槃 'nirvana'.

    The 12 examples above, twice the minimum of six I set at the beginning, strongly indicate that the PSC is regular. And we have seen not just a few isolated Latin words, but twelve, conforming to the rule. This is just one PSC. Since there are many other PSCs showing regular Latin-Taiwanese lexical sound correspondences, it is abundantly clear that there are not just a few single Latin loanwords in Sinitic, but they constitute an important portion, significantly enriching its lexicon.

    It is worthwhile to note that the PSC can be traced back to loans from Indo-European. Because the main theme of this reply is on Latin, I will just cite two homophonous PIE-Tw correspondence examples.

    PIE *(s)ker- 'rauhe Haut' (Pokorny 933) > *ker- > Tw kek 革 'skin, animal hide' as in phê-kek 皮革 'leather'.
    PIE *(s)ker- 'schneiden' (Pokorny 938) > *ker- > Tw kek 革 'cut'. Examples: kek-tû 革除 'cut and remove'; kek-sin 革新 'cut the old and replace with the new'; kek-chit 革職 'cut the job, dismissal'. The weightiest word, kek-bēng 革命, now used to translate 'revolution', originally meant 'toppling the old dynasty (cutting off its mandate)'. It is attested in one of the oldest of the Chinese classics, I-Ching 易經 'Book of Changes': 湯武革命, 順乎天而應乎人 Thong Bú kek-bēng, sūn-hoo thian jî èng-hoo jîn, 'Terminations of dynasties by Tang and Wu are following (the mandate of) the Heaven and responding to (the wishes of) the people'.

    It should not escape our notice that the same PIE stem *ker- with two different meanings resulted in two homophonous loans with the corresponding meanings. The probability of double matches of homophones due to random chance should be the square of the probability of a single match, which greatly reduces the probability of random chance as a contributing factor for the correspondence.
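    The squaring argument can be made concrete with a minimal sketch. The single-match probability used here is a hypothetical placeholder, not a figure from the comment, and the squaring holds only on the assumption that the two matches are statistically independent.

```python
def joint_chance_probability(p_single: float, n_matches: int = 2) -> float:
    """Probability that n_matches chance matches all occur.

    Assumes the matches are independent, so the probabilities multiply.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    if n_matches < 1:
        raise ValueError("n_matches must be at least 1")
    return p_single ** n_matches


if __name__ == "__main__":
    p = 0.05  # hypothetical chance of one sound-and-meaning match
    print(f"single match: {p}")
    print(f"double match: {joint_chance_probability(p)}")
```

    If the two meanings or the two sound shapes are correlated, the joint probability is larger than the square, so the result is best read as a lower bound under that assumption.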

  16. David Marjanović said,

    September 1, 2021 @ 4:17 am

    "In Sinitic/Taiwanese two Sinographs are used to write 'herb, drug, medicine': 藥 and 葯, both with the same two pronunciations. The major pronunciation is io̍k (l.) / io̍h (v.), which can be derived from Greek ἰός 'poison' through an -os > -ok/-oh sound change."

    Well, -s > -h is pretty unremarkable, but -s > -k or for that matter -h > -k would be nothing short of bizarre. Neither is such a sound change documented anywhere else that I know of, nor is there a known mechanism for how it could have worked. Random sounds don't change into random other sounds.

    "The second pronunciation ia̍k, is less common but more interesting because it can be correlated with the descendants of L. herba in East Romance languages: Rumanian iarbă and Dalmatian (Vegliotic) yarba."

    That would be an independent change of e to ya. That's entirely reasonable; this change has happened all over the place in a lot of languages independently. (Think of e as an open front vowel. Then separate the frontness and the openness as y and a.)

    "The correlation is derived through a PSC -ar > -ak."

    That's another entirely bizarre sound change, and it's very convenient that the second syllable just disappears for no discernible reason.

    "It should not escape our notice that the same PIE stem *ker- with two different meanings resulted in two homophonous loans with the corresponding meanings."

    Both require a highly improbable sound change, and they can hardly be from Latin, where *(s)ker- was not preserved.

    Rather, words for cutting contain [k] in very many, perhaps most languages around the world, because that's what it sounds like when a knife hits a board.

  17. David Marjanović said,

    September 1, 2021 @ 4:20 am

    "To be beyond any doubt, I set a minimum of six examples for a pattern to be qualified as regular."

    But if you allow any Latin sound to correspond to any Taiwanese sound in principle, you'll find six or indeed twelve examples of one particular correspondence just by chance. You need to correct for multiple testing of the same hypothesis.

    https://xkcd.com/882/
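    The multiple-testing point can be illustrated with a rough sketch. Every number below (how many word pairs are compared, how likely a pair is to fit one given pattern by chance, how many candidate patterns are in play) is a hypothetical value chosen for illustration, and the patterns are assumed to be independent of one another.

```python
from math import comb


def binom_tail(n: int, p: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


def any_pattern_passes(n_pairs: int, p_match: float, k_min: int, n_patterns: int) -> float:
    """Chance that at least one of n_patterns reaches k_min matches by luck.

    Assumes the candidate patterns behave independently.
    """
    one_pattern = binom_tail(n_pairs, p_match, k_min)
    return 1.0 - (1.0 - one_pattern) ** n_patterns


if __name__ == "__main__":
    # All four values are hypothetical.
    n_pairs, p_match, k_min = 500, 0.004, 6
    print("one pattern:  ", binom_tail(n_pairs, p_match, k_min))
    print("200 patterns: ", any_pattern_passes(n_pairs, p_match, k_min, 200))
```

    A threshold that looks strict for one pre-specified correspondence becomes much easier to reach once many correspondences are tried, which is why the threshold needs to be corrected for the number of patterns tested.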

  18. Chau said,

    September 1, 2021 @ 9:25 am

    @David Marjanović: "and it's very convenient that the second syllable just disappears for no discernible reason."

    A few examples in English: doctor > doc; brassiere > bra; mathematics > math; gymnasium > gym; representative > rep; canister > can; fanatic > fan; professional > pro; synchrony > sync; crocodile > croc, etc.

    I guess you have reasons for these examples?

    Then try these: influenza > flu; refrigerator > fridge, etc.

  19. David Marjanović said,

    September 2, 2021 @ 3:17 am

    OK, OK, but most of these are unusually long words that don't really fit English phonotactics (because they're learned loans of much less basic vocabulary than "grass"), so they get shortened individually and irregularly – and in most cases informally only; the full forms are mostly not forgotten. It's not a regular process across the whole language.

    Can you find me a reasonably uncontroversial example of a r > k shift anywhere on this planet?

  20. Chris Button said,

    September 2, 2021 @ 10:00 am

    @ David M and Chau

    An -r coda becoming a -k coda is well attested in certain languages. It has been discussed on LLog in the past. However, just because it occurs in certain languages in a language family does not mean it will occur in others.

    However, I think the discussion of “herb” is a red herring that is distracting from Penglin Wang’s more reasonable Mongolian proposal in the original post. For that, closer attention to what any reconstructed Chinese values actually were (both from a phonological and surface phonetic perspective) is important.

  21. Penglin Wang said,

    September 2, 2021 @ 12:37 pm

    Shimunek (2017:11) defines the hypothetical reconstruction of Ancient Mongol *p (> MMo h) as ‘evidence’ and a major contribution to ‘our understanding of Proto-Serbi-Mongolic phonology’, so that the scientific distinction between hypothetical reconstruction and evidence vanishes in his methodology. Having traded the obsolete, disproved, and indefensible hypothesis for evidence, Shimunek (2017:290) considers his Serbi *p(ʰ) : MMo h (> Khalkha Ø) one of ‘the most informative sound correspondences’ and declares: “Regular sound correspondences can be established for these cognates and regular sound changes can be identified”. Three out of his four examples are: Late Kitan *pʰɔ ‘season, time’ : MMo hon ‘calendrical year’; Middle Kitan *p(ʰ)ulu ‘intercalary’ : MMo hüle’ü > Khalkha илүү ‘id.’; Middle Kitan *p(ʰ)ar ‘ten’ : MMo harba-n > Khalkha арав ‘id.’. In my analysis, none of these Middle Mongolian initial fricatives h- derives from the hypothetical *p or an attested p. The p in Kitan po ‘time’ cannot explain the origin of the h of MMo hon. What it suggests is that the f of Manchu fon ‘time’ influenced the use of the Kitan p (Wang 1992:397). WMo (h)ilegü and MMo hüle’ü are etymologically connected with Greek khilioi ‘1,000’ (Wang 2015:66). As for MMo harban, it is related to Iranian aspa ‘horse, a troop of horse’, Kurdish hasp and Baloch hasp ‘horse’ (Wang 2018b). In another instance of turning hypothesis into evidence, Shimunek uses the hypothetically reconstructed ethnonyms Serbi and Shirwi for the ethnonyms Xianbei (鮮卑) and Shiwei (室韋), respectively.

    ‘Regular sound correspondences’ presuppose a considerable number of lexical items. Does Shimunek have any Ancient Mongol words beginning with p? So far I have not seen a single Ancient Mongol word beginning with p at his disposal in his book. To understand how the phoneme /p/ emerged in Altaic, especially in word-initial position, we should start with Old Turkic. Thanks to the Russian lexicographers of ДТС, we can see how word-initial /p/ came about. ДТС has 180 entries beginning with p on pages 396-399: about 82 came from Sanskrit, 17 from Iranic (Persian and Sogdian), and 10 from Chinese, while 56 alternated with forms beginning with b, including one with f. These preliminary statistics show that language contact is responsible for 60% of the p-initial words, and the b ~ p alternation for 31%. The remaining 9% might be internally motivated according to the early Old Turkic consonant system organized by Erdal (2004:61), who does not include sounds found only in loan words. So, his system includes /p/ but not /f/. The lack of /f/ in Old Turkic reminds us of the similar lack in Kitan. That was why Kitan responded to Manchu (Jurchen) fon with po.
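    The shares quoted above can be rechecked with a short sketch. It takes the counts exactly as stated in the comment (including the approximate figure of 82 for Sanskrit) and treats whatever is left of the 180 entries as the remainder; the function and variable names are illustrative only.

```python
# Counts as given in the comment for the ДТС entries beginning with p.
COUNTS = {"Sanskrit": 82, "Iranic": 17, "Chinese": 10, "b~p alternation": 56}
TOTAL_ENTRIES = 180


def p_initial_shares(counts: dict[str, int], total: int) -> dict[str, float]:
    """Return percentage shares for contact loans, b~p alternation, and the rest."""
    contact = counts["Sanskrit"] + counts["Iranic"] + counts["Chinese"]
    alternation = counts["b~p alternation"]
    remainder = total - contact - alternation
    return {
        "contact": 100 * contact / total,
        "alternation": 100 * alternation / total,
        "remainder": 100 * remainder / total,
    }


if __name__ == "__main__":
    for label, share in p_initial_shares(COUNTS, TOTAL_ENTRIES).items():
        print(f"{label}: {share:.1f}%")
```

    With these inputs the three shares come to roughly 61%, 31%, and 8%, which agrees with the quoted 60%, 31%, and 9% to within rounding of an approximate count.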

    Ironically, Shimunek (2017:32ff) has enacted a bunch of ‘must’ and ‘must not’ criteria for his comparative work. The criteria seem to be beautiful. But theorization is one thing; how to understand and apply it is another. Which of his criteria has he used in his comparison? Shimunek’s ‘must’ criteria stress ‘mainstream’. What is his ‘mainstream’? He does not elaborate on it. Having preliminarily surveyed what he has done in his pursuit of the so-called ‘regular sound correspondence’ between his Serbi *p(ʰ) and MMo h-, I find that the ‘mainstream’ overlaps with ‘herd mentality’. If everyone were confined and obligated to adhere to the ‘mainstream’, as the history of scientific development repeatedly tells us, there would be no innovations and inventions. If William Jones had followed a ‘mainstream’ insensitive to genealogical connection, he could not have initiated what would become the Indo-European family. It is no wonder that higher education trains students to develop critical and independent thinking skills.

    By observing the 'mainstream', Shimunek took it for granted that Tabγač is a separate language developed diachronically from Xianbei. The ethnonym Tabγač (Tuoba 拓跋 in Chinese) was not applied to the language spoken by that very people because of its onomastic orientation toward agricultural cultivators. There may not be a correlation between a polity name and its language(s). The state name Netherlands derives from the country’s low-lying position and does not apply to its language, which is called Dutch (named Helanyu (荷蘭語) in Chinese, in response to the Chinese name Helan (Holland) for the Netherlands). People in Mongolia speak Khalkha (this name could have been given by eastern Iranic speakers using their word χalg ‘person’). Sometimes one might be surprised to find that the Old Turkic phrases Tabγač äl and Tabγač budun, literally denoting ‘Tabγač state’ and ‘Tabγač people’, actually referred to the Chinese state and Chinese people instead of the Tabγač of Xianbei ethnicity. In the eyes of the contemporaneous Turkic and Xianbei peoples, Tabγač as an ethnic label and as an agricultural occupation was meant for agricultural China. Presumably, the contemporaneous Tuoba people were reluctant to equate the ethnic label Tuoba with their language in their own case: although they might have applied the ethnonym Tuoba to themselves, they did not consider Tuoba to be a suitable name for their language, since they lived and grew up speaking the Xianbei language per se from generation to generation (Wang 2018a:156ff). If and only if there was an Old Turkic expression meaning ‘Tabγač language’, it could refer to the Chinese language after the pattern of Tabγač äl, Tabγač budun, and Tabγač xan ‘Chinese emperor’, or to the farmers’ language in sociolinguistic terms.

    References

    ДТС = Древнетюркский словарь (Old Turkic Dictionary), 1969. Ленинград: Наука.

    Erdal, Marcel. 2004. A Grammar of Old Turkic. Leiden: Brill.

    Shimunek, Andrew Eric. 2017. Languages of Ancient Southern Mongolia and North China: a Historical-Comparative Study of the Serbi or Xianbei Branch of the Serbi-Mongolic Language Family, with an Analysis of Northeastern Frontier Chinese and Old Tibetan Phonology. Wiesbaden: Harrassowitz Verlag.

    Wang, Penglin. 1992. On the origin of the Middle Mongolian initial h- and the motivation for its loss. Archív Orientální 60.4:389-408.

    Wang, Penglin. 2015. Number Conception and Application. New York: The Nova Science Publishers.

    Wang, Penglin. 2018a. Linguistic Mysteries of Ethnonyms in Inner Asia. Lanham: Lexington Books.

    Wang, Penglin. 2018b. Number beasts and numerals in Altaic languages. Presented at the Western Conference on Linguistics (WECOL), California State University, Fresno, November 30-December 2. https://cwu.academia.edu/PenglinWang

  22. Chau said,

    September 2, 2021 @ 10:07 pm

    @David Marjanović

    We always try to come up with reasons to explain linguistic phenomena, but Nature likes to defy what the textbooks dictate. For example, the English name for the mouse-like winged quadruped, bat, is derived from ME backe / bakke, which are not unusually long words (as compared with the examples given in the previous comment) and yet the second syllable just disappears for no discernible reason.

    "Can you find me a reasonably uncontroversial example of a r > k shift anywhere on this planet?"

    Ancient Cretan δορά [f.] 'beam' is cognate to Classical Greek δοκός [f.] (later [m.] also) 'beam'. This example is quite incontrovertible.

    @ Chris Button

    Thank you for the reminder. As I commented early on, Penglin Wang's original post is an enlightening one. I admire his scholarly work. And I apologize for the distractions I caused.

  23. Andy said,

    September 3, 2021 @ 3:41 am

    'Ancient Cretan δορά [f.] 'beam' is cognate to Classical Greek δοκός [f.] (later [m.] also) 'beam'. This example is quite incontrovertible.'

    It's incontrovertible that δοκός means 'main beam' and that there's another word δορά, found in Hesychius, which may mean the same (the entry is potentially problematic), but there's no evidence whatsoever (or discussion, as far as I'm aware) to suggest that these words are cognate.

  24. Penglin Wang said,

    September 4, 2021 @ 11:31 am

    In my posting of September 2, 2021 @ 12:37 pm 'Tabγač is a separate language' should read 'Tabγač is a separate linguistic entity'. For the revised version of this post see the link: https://www.academia.edu/44751726/The_Altaic_Hypothesis_Revisited
