A Persian word in a Sinitic topolect
Yesterday afternoon at Indiana University I gave a wide-ranging lecture on Iranian and Chinese interconnections from the Bronze Age through the late imperial period. After the lecture, Chen Su, a doctoral candidate in Central Eurasian Studies, approached me and said that some of the points I made helped her to realize something about her own speech that had confused her for years.
Chen Su, who hails from Xi'an, where Guanzhong topolect is spoken, had noticed an interesting coincidence in the similarity of the pronunciation between Persian and Guanzhong topolect for the word “head”.
On the one hand, we have Persian sar سر (it's the same in Middle Persian).
The corresponding Guanzhong topolectal word is sá.
The usual Mandarin words for "head" are tóu 頭 and shǒu 首.
What I find most revealing is that there is no exact character for this oral term in Guanzhong topolect. According to Chen Su, even in modern Chinese dictionaries there is no parallel character that has the same pronunciation as this topolectal word for "head".
David Marjanović said,
March 10, 2020 @ 8:28 am
Interesting.
At least as a dysphemism ("noggin"), words for "head" can definitely be borrowed. Look no further than German, where Kopf is from Latin cuppa meaning "bowl" (and ancestral to cup, a later borrowing from Latin into English). (BTW, the vowel change is regular.)
I wonder about the second tone. AFAIK it's not regular for an inherited syllable that starts with a voiceless consonant, so that's a point in favor of a borrowing after the tone split of Middle Sinitic. Was the tone perhaps deliberately chosen to avoid confusion with another sa? Is there another sa it could be confused with?
Coby Lubliner said,
March 10, 2020 @ 11:23 am
The Wikipedia page you refer to says that "Xifu dialect … is the oldest language in China." What does that mean?
michaelyus said,
March 10, 2020 @ 12:05 pm
There are some lexemes with voiceless initials in modern Mandarin varieties that have a 陽平 tone, e.g. 邪 祥 神 隨 inherited from Middle Chinese. It is true that voiceless plosives and affricates in modern standard Mandarin in 陽平 would either be colloquial or from a 入聲 tone originally (e.g. 白); but fricatives are not subject to this (as there was no voiced fricative to "aspirated" voiceless fricative tendency).
Jonathan Smith said,
March 10, 2020 @ 1:21 pm
^ nonetheless "native" yangping obstruent words do have *voiced* antecedents in ping (and ru) tones in Guanzhong (including the items you mention), so David Marjanović seems to be right that there are implications for time of proposed borrowing.
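The regularity at issue in the comments above can be put as a toy rule. This is only a sketch under simplifying assumptions: it models the level (ping) tone category alone, it keys on the voicing of the historical initial rather than the modern one, and the function names are invented for illustration.

```python
# Toy model of the level-tone (ping) split discussed in this thread.
# Assumption: only the ping category is modelled, and "voiced" refers to the
# historical (Middle Chinese) initial, not the modern Mandarin one.

def expected_ping_reflex(initial_was_voiced: bool) -> str:
    """Expected modern tone category for an inherited ping-tone syllable."""
    return "yangping" if initial_was_voiced else "yinping"


def is_regular(initial_was_voiced: bool, attested: str) -> bool:
    """True if the attested tone matches what the split predicts."""
    return expected_ping_reflex(initial_was_voiced) == attested


# A historically voiceless s- predicts yinping, so an attested yangping
# syllable such as "sá" comes out as irregular under this toy rule.
print(expected_ping_reflex(False))
print(is_regular(False, "yangping"))
```

Under these assumptions, words like 邪 or 神 count as regular because their antecedents were voiced, while a yangping syllable with a voiceless source stands out.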
Martin Schwartz said,
March 10, 2020 @ 1:34 pm
For all it matters, the Persian word is pronounced [sær].
It has cognates in East Iranian languages, most relevantly
Sogdian sar- (which is inflected by addition of vowels),
the quality of whose -s- is uncertain.
Martin Schwartz
Victor Mair said,
March 10, 2020 @ 2:51 pm
From Chen Su:
A possible route by which this word could have entered Guanzhong topolect from Persian, as far as I can work out, is from Persian into the dialect of Xi’an’s Muslim community, from which it was then borrowed into Guanzhong dialect. This word is shared between Muslim and Han people. For the Muslim community’s dialect of northwestern China, there is a series of folk documents called Xiaoerjin(g), which record Chinese phonetically in Perso-Arabic script.
—–
Xiao'erjin(g) 小兒經 / 小儿经 // 小兒錦 / 小儿锦
https://en.wikipedia.org/wiki/Xiao%27erjing
https://languagelog.ldc.upenn.edu/nll/?p=45291#comment-1569430
Victor Mair said,
March 10, 2020 @ 3:24 pm
From a long-term resident of Xi'an:
I mostly use the Guanzhong topolect term for head for poking fun at my Shaanxi friends, a phrase that comes out something like “dalegesa”, which sort of means “your dad’s head!”.
Suzanne Valkemirer said,
March 10, 2020 @ 3:47 pm
The question has come up before on Language Log about whether we are dealing with chance formal and semantic similarities OR etymologically related forms.
That question needs to be asked here.
The greater the number of similarities, the likelier the chance of relationships of one or more kinds, and here the number, at least now, is just one, which suggests a chance similarity. Why should just one word have been borrowed and why one that belongs to the core vocabulary? Surely not to fill a (presumed) lexical gap in any Sinitic lect.
David Marjanović said,
March 10, 2020 @ 7:40 pm
I've already addressed these two points; "head" does not belong to the core-of-the-core vocabulary.
But yes: are there other Persian or Sogdian borrowings?
Chris Button said,
March 10, 2020 @ 11:49 pm
If the loan is supposed to go back to Middle Chinese times then the source of the modern "a" vowel could be a challenge because we'd presumably be looking for a stop coda or a retroflex onset, neither of which are attested in the Middle Persian form. Then again, the -r coda in the Persian form could perhaps be being reflected as the retroflexion. Notably, Pulleyblank originally treated his Early Middle Chinese *ʂaɨ as *saʳ.
Chris Button said,
March 10, 2020 @ 11:51 pm
That should be *ʂaʳ with retroflex *ʂ-
Suzanne Valkemirer said,
March 11, 2020 @ 1:27 am
"'head' does not belong to the core-of-the-core vocabulary" (David Marjanović).
You write as if a set of core concepts existed in some Platonic ultimate reality, only waiting to be discovered, when in fact it is a construct of linguists and therefore subject to discussion and modification, as Swadesh's list of 1952 (was that not the first attempt to establish a set of core concepts?) has been.
Therefore, "does not belong" should be "does not, in my opinion, belong."
To my mind, 'head' belongs to the core (in contrast, say, to 'wrist', which in my opinion does not).
Swadesh's final list, published in 1971, ranks 'head' in thirty-eighth place among the one hundred core concepts ranked according to core-ness. You are entitled to disagree with any list of core concepts but not to claim to possess the Irrefragable Truth.
Suzanne Valkemirer said,
March 11, 2020 @ 1:38 am
There is no consensus on the ultimate etymon of German Kopf 'head'. That the word goes back to Latin cupa is one suggestion. Another is that it goes back to Proto-Indo-European *kewp- 'a hollow'. And a third is that it goes back to Proto-Germanic *kuppaz 'crown [of the head]', literally 'bowl', more literally 'round object'.
Bob Ladd said,
March 11, 2020 @ 2:12 am
Wherever Kopf comes from, the fact remains that it replaced a perfectly good word for 'head' (Haupt) that the other Germanic languages seem to have been perfectly happy with. In Romance, too, the original Latin-based word for 'head' has been replaced in French and Italian by a word meaning 'pot' and in Sardinian by a word apparently meaning 'seashell'. In all those languages (I'm not sure about Sardinian) the original word has stayed in use but with a transferred meaning ('boss', 'principal', etc.). So regardless of how much a part of the core vocabulary it is, the Western European example suggests that head is subject to metaphor-based shifts. In the Chinese/Persian case at hand, is there any reason to think that the original word is still around with a meaning like 'chief'?
Chris Button said,
March 11, 2020 @ 5:55 am
@ Bob Ladd
Regardless of the case at hand, I think to be more precise, your example of "testa" (head) is a regular semantic extension of the original Latin sense (pot). A loan would be something like its specific botanical use in English.
David Marjanović said,
March 11, 2020 @ 7:41 am
So did you.
BTW, there's no need to rely on Swadesh's work from 40 years ago. As a starting point for current research in the area, try the Leipzig-Jakarta list. "Head" isn't in the top 100 at all.
In both of these cases, the Latin word cuppa must have been borrowed into Germanic at some stage. You can see that from Grimm's law: PIE *k at the beginnings of words always becomes *h in Germanic.
Haupt, the regular cognate of head and the almost regular cognate of Latin caput, means "head" only in poetic language anymore; other than that it survives as a prefix meaning "main". Haube refers to various head coverings (which ones exactly differs geographically).
The French outcome of caput is chef, BTW, including English chief.
David Marjanović said,
March 11, 2020 @ 7:52 am
And in both of these cases the word must have been borrowed late enough to have undergone the littera rule in Latin that changed it from cūpa ( < *kewp-) to cuppa.
David Marjanović said,
March 11, 2020 @ 7:54 am
I made a typo in the link to an elaborate conference handout on the littera rule.
Philip Taylor said,
March 11, 2020 @ 7:58 am
(core vocabulary) — this is way, way, way outside my areas of expertise, but — I find it remarkable that the Leipzig-Jakarta list includes (e.g.,) "liver" but not "heart", and that "25% of the words in the Leipzig–Jakarta list are body parts" but not one of those is "head". One cannot argue with statistics, but I do find these facts odd.
Jonathan Smith said,
March 11, 2020 @ 8:44 am
Re: "head" and "basic" vocabulary, in Chinese the etymon represented by Mand. tou2 頭 seems to have replaced the early Chinese word reflected as Mand. shou3 首 (now in compounds where it means most significantly 'first') across daughter varieties beginning 2000+ years ago. Perhaps tou2 頭 ultimately reflects 'pot', or 'bean', or 'skull'… at any rate it could be a "dysphemism" of some kind. And there are similar, newer forms in modern northern (and other) Chinese: nao3dai 腦袋 lit. "brain sack", nao3ke 腦殼 lit. "brain shell"… So re: Philip Taylor's remark, Chinese "head" is naturally "basic"/"core" vocabulary in terms of usage/semantics, but feels less so in terms of resistance to change/replacement, which is often what is meant by the term.
AntC said,
March 11, 2020 @ 3:59 pm
"brain sack" … "brain shell"
I'm (unusually) with Philip T in finding these etymons odd. There's far more prominent things about heads than what's inside them. Even a word derived from 'cup' seems odd: did the Latins habitually drink out of skulls? For a head to be seen as same-shaped as a cup, the cup (or the head) would have to be upside down(?)
Or do all these cultures regard heads as taboo, thus causing frequent turnover of eu/dysphemisms?
What range of languages/cultures did the Leipzig-Jakarta list survey in coming to the conclusion 'head' is not core-of-core?
Michael Watts said,
March 11, 2020 @ 4:37 pm
What range of languages/cultures did the Leipzig-Jakarta list survey?
Going by the wikipedia article and the project's self-description, it was these 41 languages:
Swahili, Iraqw, Gawwada, Hausa, Kanuri, Tarifiyt Berber, Seychelles Creole [it isn't clear to me how a creole could consist of less than 100% loanwords, though], Romanian, Selice Romani, Lower Sorbian, Old High German, Dutch, English, Kildin Saami, Bezhta, Archi, Manange, Ket, Sakha, Oroqen, Japanese, Mandarin Chinese, Thai, Vietnamese, White Hmong, Ceq Wong, Indonesian, Malagasy, Takia, Hawaiian, Gurindji, Yaqui, Zinacantán Tzotzil, Q'eqchi', Otomi, Saramaccan, Imbabura Quechua, Kali'na, Hup, Wichí, and Mapudungun.
They first picked 1460 semantic items and surveyed the expression of those items; it's not a comprehensive investigation into which concepts are most resistant to replacement by foreign loanwords. It's an investigation into which of 1460 concepts have seen the least replacement by foreign loanwords compared to the rest of the list. This doesn't impact "head" much, since it's on the list.
"The head" is vocabulary item 4.2, with a "borrowed" score of 0.2 on a scale from 0 to 1. It's quite unlikely to be borrowed. An item that did make the Leipzig-Jakarta list, "the horn", is vocabulary item 4.17, with a "borrowed" score of 0.16. But the number there is flawed, since the English word "antler" is not included as a reflex of the concept "horn", a clear mistake that would have added a "clearly borrowed" reflex to the "horn" concept. (At least, in American English, the difference between an antler and a horn is the species of the animal — deer have antlers; goats have horns — not anything in the semantics.)
(It's not clear to me why "resistance to replacement by foreign loanwords" is a measurement of interest — I don't see why replacement by an innovative native word would be theoretically distinct from replacement by a foreign word.)
Michael Watts said,
March 11, 2020 @ 4:39 pm
(If it wasn't clear, after looking into this I'm pretty comfortable with the conclusion that "head" is indeed a core-of-core concept.)
Jerry Friedman said,
March 11, 2020 @ 4:50 pm
Michael Watts: I was brought up to believe that the difference between antlers and horns is that antlers are shed each year and horns are not, which I think counts as a semantic difference. (Sorry, I have no comments on your main topic.)
David Marjanović said,
March 11, 2020 @ 5:38 pm
Dysphemisms are not reactions to taboo. They're common in youth slang as exaggerated efforts to "speak normal" as opposed to highfalutin'. Sometimes they take over; the Romance languages have replaced a lot of Latin body-part terms by dysphemisms, and evidently this spilled over into southern West Germanic.
Yes. Still, it's a round container, and typically hollow (especially as a dysphemism). The original meaning of noggin is "cup, mug" or thereabouts. In contemporary German, one way of saying "crazy" translates as "having a crack in the bowl".
Horns are living bone sheathed by horn substance (same as fingernails and beaks), and are not shed. Antlers are dead exposed bone that is shed every year, and exclusive to deer because some things are too crazy for evolution to come up with more than once in a billion years.
For topics precisely like the one of this thread: in the absence of historical documentation, how likely is it that sá "head" is a loanword?
cameron said,
March 11, 2020 @ 5:45 pm
English "horn", mentioned above, is actually cognate with Persian sar. "Cranium" is also from the same PIE root, along with a whole slew of other words pertaining to horns, skulls, and heads.
Chris Button said,
March 11, 2020 @ 7:10 pm
@ Cameron
You beat me to it in posting that. I wouldn't put much faith in any "core" vocabulary lists for that very reason.
Michael Watts said,
March 11, 2020 @ 7:17 pm
Nonsense. This is like saying that bugs are a particular order of insect characterized by sucking mouthparts. That's true as to specialized zoological usage, but it's quite false as to common vernacular English, where "bug" refers to any small crawly thing, such as a centipede (a couple dozen legs, biting mouthparts… it couldn't be less appropriate as a zoological "bug"!). Ordinary humans do not have a specialized semantic space for "little crawly things that suck as opposed to biting", and they also don't have a special semantic space for "horns, but made of dead bone instead of living bone".
Compare Merriam-Webster's glosses for "horn" and "antler":
This makes it clear that, in the eyes of Merriam-Webster, an antler is a type of horn — specifically, the type of horn that is attached to a deer. Any other type of horn is a "horn", including horns that do not contain any bone at all (1.a.3).
Moving to another language, which gives us some idea of what might or might not be a natural semantic space, I see that my English-Chinese dictionaries both gloss "antler" as 鹿角, "deer horn".
(In passing, I'm intrigued that the WOLD glosses for "the horn" in Mandarin and Thai are jiǎo and khǎo, both marked "no evidence for borrowing". They look like the same word to me (but I have absolutely no knowledge of Chinese-Indochinese loaning in general; I'm just looking at the words as they are right now).)
I see a lot of problems with the WOLD glosses in general. English has "horn" for "the horn" but doesn't have "antler". But for "the cliff or precipice", English has two (2) words: "cliff" and "precipice". Why those two? For "the insect", English is represented by a single word, clearly borrowed: "insect". The likely-native "bug" — which would be the correct choice to refer to the semantic space of "the insect" — doesn't exist. The same is true for Mandarin Chinese, which is represented by the zoological term 昆虫 but not the vernacular 虫子. For "the river or stream", Mandarin is represented by 河, 溪 (!), and 川… but not 江.
Except you haven't even addressed the problem I mentioned immediately after the quote you pulled — I see no reason to believe that the processes of (1) replacement of a word by a foreign loanword; and (2) replacement of a word by an alternative native word, differ in any way. But counting foreign replacements will only get you group (1), which will give you the wrong estimate of the rate of replacement. It doesn't make sense to try to count "replacement by foreign words" directly; it is surely better to count "replacement by anything" and adjust that rate for the local knowledge of foreign words.
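The counting objection in the comment above can be illustrated with a small sketch. Everything in it is hypothetical: the concept labels and outcomes are invented for illustration and are not WOLD data. It only shows why a loan-only count understates the overall replacement rate.

```python
from collections import Counter

# Hypothetical outcomes per concept: was the older word replaced by a
# loanword ("loan"), by a native innovation ("native"), or kept ("none")?
# These values are made up for illustration and are not WOLD data.
observations = {
    "concept_a": "native",
    "concept_b": "none",
    "concept_c": "loan",
    "concept_d": "none",
    "concept_e": "native",
}


def replacement_rates(obs: dict[str, str]) -> tuple[float, float]:
    """Return (loan-only replacement rate, rate of replacement by anything)."""
    counts = Counter(obs.values())
    n = len(obs)
    loan_rate = counts["loan"] / n
    total_rate = (counts["loan"] + counts["native"]) / n
    return loan_rate, total_rate


loan_rate, total_rate = replacement_rates(observations)
print(f"loan-only: {loan_rate:.0%}, any replacement: {total_rate:.0%}")
```

With these invented values the loan-only rate is a third of the overall rate, which is the kind of gap the comment is worried about.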
Chris Button said,
March 11, 2020 @ 10:21 pm
@ David Marjanović, michaelyus & Jonathan Smith
To clarify the confusion here, I think David M's above statement omitted the crucial qualification of "in (Early) Middle Chinese" at the end.
Although an earlier voiced fricative (later devoiced) would have regularly been needed to give the tonal reflex here, I'm not sure that would necessarily have any bearing on the time of the loan. We are talking about the lexeme in the context of a loanword after all, and so to David M's other point…
There appear to be no words in Mandarin of the shape sá (the other three tones are all attested).
My vote therefore is that something like Middle Chinese *ʂaʳ or *ʂaɨ becomes sá with the tone being associated with the external origin. I wonder if the fact that the source word had coronal /s/ might also have facilitated the loss of (perhaps lighter?) retroflexion, which is by no means untoward, although more uncommon than its retention (I don't know if a satisfactory account has ever been made for when it was retained and when it wasn't).
Jonathan Smith said,
March 12, 2020 @ 9:02 am
@ Chris Button It doesn't appear that the variety in question even contrasts s/sh, etc., so "loss of retroflexion" is not an issue. Tone remains an issue however since we lack support for the idea of yangping as "loan tone" or some such. There are interesting collections of purported colloquial Guanzhong items floating around the webs, but I don't have any more serious resources on hand unfortunately…
Philip Taylor said,
March 12, 2020 @ 9:14 am
The Chinese Text Project (Ctext.Org) offers (U+252CC, "to glance at, to catch sight of") as a hanzi pronounced sá.
Jerry Friedman said,
March 12, 2020 @ 10:01 am
Michael Watts: I think far more English speakers don't use "horn" for what deer have than restrict "bug" to Heteroptera (had to look that up). E.g., on the iWeb corpus, "deer horn" gets 98 hits and "deer antler" gets 1508. The differences are much more conspicuous on the human scale—antlers have branches and are shed. (Pronghorns are a bit problematic.)
I don't read the M-W definition as saying what antlers are in the dictionary's eyes. It's reflecting actual usage: some people do refer to deer's horns.
I'm not arguing anything about the "borrowed" score of "the horn" on the Leipzig-Jakarta list.
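For scale, the two iWeb hit counts quoted above (98 for "deer horn", 1508 for "deer antler") can be turned into a share and a ratio. This is a minimal arithmetic sketch; the function names are invented, and raw hit counts are taken at face value.

```python
# Hit counts as quoted in the comment above (iWeb corpus).
DEER_HORN_HITS = 98
DEER_ANTLER_HITS = 1508


def horn_share(horn: int, antler: int) -> float:
    """Fraction of the combined hits that use 'horn'."""
    return horn / (horn + antler)


def antler_to_horn_ratio(horn: int, antler: int) -> float:
    """How many 'antler' hits there are per 'horn' hit."""
    return antler / horn


print(f"{horn_share(DEER_HORN_HITS, DEER_ANTLER_HITS):.1%}")            # 6.1%
print(f"{antler_to_horn_ratio(DEER_HORN_HITS, DEER_ANTLER_HITS):.1f}")  # 15.4
```

So on these figures "deer horn" accounts for roughly one in sixteen of the combined hits.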
Philip Taylor said,
March 12, 2020 @ 11:23 am
I am not convinced that in British English, "bug" gets abused quite as much as it seems to in America. What (I think) Americans call "ladybugs", we call "ladybirds". OK, they're no more birds than they are bugs, but you get my point. We have (as far as I know) "bed bugs" (Cimex lectularius, Cimex hemipterus), "shield bugs" (Palomena prasina, Nezara viridula, Rhaphigaster nebulosa) and that's about it, really. The rest we call "insects" (even, sadly, some that don't have six legs !), and the shield bugs (which are true bugs) are often called "shield beetles" or "shield insects" in my experience. Apart from bed bugs, computer bugs, surveillance bugs and metaphorical uses such as "that's really bugging me", the word is rarely used over here.
As far as antlers/horns go, deer (inc. elk and moose) have antlers and everything else has horns.
Chris Button said,
March 12, 2020 @ 8:59 pm
So I've been ruminating on this tone thing for a while. And I think it's pretty decisive evidence of a loan.
Firstly, there appear to be only two syllables in EMC with the voiced retroflex sibilant onset /ʐ/. This is automatically extremely suspicious, and /ʐ/ is not even distinguished in the Guangyun rhyme tables (see Baxter 1992:206 for a brief discussion).
Secondly, there aren't many syllables with /z/ and they appear restricted to grade-III from OC type-B syllables (see Baxter again).
As I've mentioned elsewhere on LLog, I believe that /z/ in Early Middle Chinese was a variant reflex of /j/ coming from Old Chinese *l- (and I've just noticed that Baxter 1992:198 idly speculates that might be the case–he should have stuck with his hunch).
The (near) lack of /ʐ/ is then neatly accounted for because *lr- would be quite a challenging articulation.
So, in the case of "sá" from Persian "sar" سر, the tone would suggest Early Middle Chinese ʐaʳ (or ʐaɨ). But, we've just seen that /ʐ/ was not really allowed in Early Middle Chinese, and based on the Persian form it should have been voiceless *ʂaʳ in any case (and quite possibly without retroflexion in the onset at all to leave *saʳ ).
So the tone is quite simply exceptional and therefore smacks of a loanword.
As to why it didn't just develop as "sā" anyway, that could be to avoid homophony, but also it wasn't part of the native lexicon and so like most loans wasn't susceptible to the regular rules (I recall when I was working on some Kuki-Chin tones that words with a tense/long vowel and stop coda couldn't have a certain tone according to a particular phonological rule, but that tone was then exceptionally attested in loanwords and the like in spite of the phonotactic constraint).
Jonathan Smith said,
March 12, 2020 @ 10:26 pm
@ Chris Button
?! There was no "Early Middle Chinese" form precisely because such a hypothetical form in (e.g.) z- fails to match potential source items (not to mention that — aside from such a form being, sure, a bit special internally — we apparently do not see cognates elsewhere in Sinitic.) If the loan idea is right, this item has entered relatively recent, devoiced Guanzhong Mandarin in particular, which was the point of the brief comments upthread.
Chris Button said,
March 12, 2020 @ 11:13 pm
@ Jonathan Smith
Actually, the tone is evidence of a loanword because there's no reasonable basis for a form like "sá" to have emerged internally from Middle Chinese. It's no coincidence that the pronunciation doesn't exist apart from in this isolated case.
And it doesn't matter what the source is.
As established on internal grounds, it is right (or else something else weird was going on) regardless of the source.
Mandarin doesn't have voiced obstruent onsets. Or are there certain varieties of Mandarin that do?
Perhaps it was recent and the "r" coda was simply ignored.
Perhaps it entered much earlier and then we have an account for the "r" because it can account for the "a" vowel in Mandarin.
On what basis can you unequivocally choose the former explanation over the latter?
Jonathan Smith said,
March 13, 2020 @ 8:20 am
Speculation is always interesting; I just wanted to emphasize that invoking medieval forms in ʐ-/z- (absent, say, comparative or textual support) is precisely what is *not* warranted here. Yes it appears the tone is special, as pointed out repeatedly above, a fact for which *one* possible explanation could be that the item is borrowed. Also as noted already, the onset *could* have been "retroflex" sh- in earlier Guanzhong, which *could* be related to foreign -r.
Chris Button said,
March 13, 2020 @ 8:33 am
I'm saying the opposite–or rather I'm agreeing with you. A form with a voiced fricative would not have existed earlier in spite of being what is required for a regular tonal reflex today.
Andrew (not the same one) said,
March 13, 2020 @ 10:11 am
Apart from bed bugs, computer bugs, surveillance bugs and metaphorical uses such as "that's really bugging me", the word is rarely used over here.
In my usage there are also thunderbugs. And stomach bugs, etc., in which the 'bug' is a germ. I agree that in British English 'bug' is not a general term for little crawly things.
Regarding 'horn', the Abbots Bromley Horn Dance involves antlers, though as it goes back at least to the 17th century this may not perfectly reflect modern usage.
Jonathan Smith said,
March 13, 2020 @ 10:58 am
… so if a loan, it is (relatively) recent and entered yangping of Guanzhong (not registerless "EMC"), and there is no question of "why it didn't just develop as "sā" anyway"… right?
Chris Button said,
March 13, 2020 @ 11:36 am
No, wrong. If it entered during Middle Chinese it would of course have remained voiceless just like at any other time. For why it then developed a weird tone contour (as if it had been voiced, which it unequivocally wouldn't have been), see my earlier comment that includes the Kuki-Chin remarks.
Jonathan Smith said,
March 13, 2020 @ 12:43 pm
Gotcha; yes that seems to be the crux of the disagreement. And aside from the problem of differential development, you need of course to account for the lack of attestation in other (modern or historical) Chinese languages…
David Marjanović said,
March 13, 2020 @ 6:47 pm
I agree.
I see. I'm not sure I agree, but whether there's a difference may well depend on the borrowing culture…
Chris Button said,
March 14, 2020 @ 5:54 am
Interesting that 首 was then used in an alternate form 頁 for 葉 "leaf" as "page".
Chris Button said,
March 15, 2020 @ 12:47 am
Although, having said that, 頁 for 葉 (EMC jiap) is clearly just a modern loan presumably resulting from a modern convergence in pronunciation.
頁 itself appears to have a different pronunciation of EMC ɣɛt, which still bears no relation to 首 (EMC ɕuw').
Takashima's Bingbian vol.2 (p.330-331) has an interesting discussion of a possible oracle-bone source for 頁 as opposed to 首. I suppose 首 and 頁 could then have been confused graphically (regardless of their unrelated pronunciations) and 頁 could then much later have been used phonetically for 葉 (due to the modern convergence of their pronunciations).
Chris Button said,
March 15, 2020 @ 12:50 am
頁 from EMC ɣɛt giving the Mandarin reading "xié" as opposed to the more commonly encountered "yè"