Evolution of the sinograph and the word for horse
This is the regular script form of the Chinese character for horse: 馬.
When I used to give talks in schools, libraries, and retirement homes, anywhere I was invited, I would write 馬 (10 strokes, official in Taiwan) on the blackboard or a large sheet of paper and show it to the audience, then ask them what they thought it meant. Out of the hundreds, if not thousands, of people to whom I showed this character, not one person ever guessed what it signified. When I told those who were assembled that it was a picture of something they were familiar with, nobody got it. When I said it was a picture of a common animal, nobody could recognize what it represented.
All the more so when I showed audiences the simplified form of the character, 马 (3 strokes, official in the PRC): nobody could get it.
Here is the evolution of this character from the oldest form (about 3,200 years ago) at the top left, going to the top right, then going to the next line and proceeding from left to right, and the same for the third line, and ending with the regular, traditional form and regular, simplified form at the bottom right.
Juha Janhunen assembled a wealth of relevant data in “The horse in East Asia: Reviewing the Linguistic Evidence,” in Victor H. Mair ed., The Bronze Age and Early Iron Age Peoples of Eastern Central Asia (Washington, DC: Institute for the Study of Man; Philadelphia: The University of Pennsylvania Museum, 1998), vol. 1 of 2, pp. 415-430, but didn't draw a firm conclusion concerning possible relatedness between IE words for horse and Central and East Asian words for horse.
Here are Janhunen's latest thoughts (3/3/19, personal communication) on Eurasian words for horse:
I do not see any particular chronological problem in connecting Old Chinese *mra with IE "mare". A possible problem is, however, the geographical distance, as cognates of *mare* do not seem to have been attested in other IE branches except Germanic and Celtic.
However this may be, my point in the 1998 paper was that horse terminology is more diversified in the languages spoken in the region where the horse comes from, and where the wild horse still lives, that is, northern Kazakhstan, East Turkestan, and Mongolia. In view of this it looks like the word *mVrV 'horse' could be originally Mongolic. In any case, it was certainly borrowed from Mongolic to Tungusic (at least twice), and quite probably also to Koreanic (*morV) and Sinitic (*mVrV), from where it spread further to Japonic. From Tungusic it was borrowed to Amuric (Ghilyak). It may also have been borrowed westwards to some branches of IE, if we do not think that the geographical distance is a problem. However, even if the cognates of "mare" can mean 'horse' in general, this does not seem to have been the basic word for 'horse' in PIE. By contrast, in Mongolic *morï/n is the basic word for 'horse', while other items are used for 'stallion' (*adïrga, also in Turkic) and 'mare' (*gexü, not attested in Turkic, but borrowed to Tungusic).
I have always felt that Sinitic mǎ 馬 ("horse") is related to Germanic "mare", though not necessarily directly (from Germanic to Sinitic).
There are some problems, of course, namely:
- "mare" refers to the female of the species.
- Germanic is too late for Sinitic, which had the word mǎ 馬 ("horse") by 1200 BC (though Janhunen doesn't think it's an insuperable problem)
However, the word is also in Celtic (see below), and how far back would that take us?
Even the 5th ed. of the AH Dictionary cites Pokorny 700 "marko", but that may not be a reliable PIE root. Nonetheless, the phonology of the Celtic words alone fits quite well with the Old Sinitic reconstructions for mǎ 馬 ("horse"), namely:
(Baxter–Sagart): /*mˤraʔ/ (Zhengzhang): /*mraːʔ/
Here is what the Online Etymology Dictionary has to say about "mare":
…"female of the horse or any other equine animal," Old English meare, also mere (Mercian), myre (West Saxon), fem. of mearh "horse," from Proto-Germanic *marhijo- "female horse" (source also of Old Saxon meriha, Old Norse merr, Old Frisian merrie, Dutch merrie, Old High German meriha, German Mähre "mare"), said to be of Gaulish origin (compare Irish and Gaelic marc, Welsh march [VHM: "stallion; steed"], Breton marh "horse").
The fem. form is not recorded in Gothic, and there are no known cognates beyond Germanic and Celtic, so perhaps it is a word from a substrate language. The masc. forms have disappeared in English and German except as disguised in marshal (n.).
So the big questions are:
- how far back do the Celtic words go?
- how are the Germanic and Celtic words related?
- what came before the Celtic and Germanic words? "a word from a substrate language" OR Is Pokorny 700 "marko" for real? (He could not have dreamed it up to satisfy a possible relationship with Sinitic.)
From Axel Schuessler, ABC Etymological Dictionary of Old Chinese (Honolulu: University of Hawai'i Press, 2007), p. 373:
mǎ 馬 ("horse"), Minimal Old Chinese / Sinitic reconstruction *mrâʔ
Horse and chariot were introduced into Shang period China around 1200 BC from the west (Shaughnessy HJAS 48, 1988: 189-237). Therefore this word is prob. a loan from a Central Asian language, note Mongolian morin 'horse'. Either the animal has been known to the ST people long before its domesticated version was introduced; or OC and TB languages borrowed the word from the same Central Asian source.
Middle Korean mol also goes back to the Central Asian word, as does Japanese uma, unless it is a loan from CH (Miyake 1997: 195). Tai maaC2 and similar SE Asian forms are CH loans.
So much for horse-related words for now. There are many more posts and comments related to horses, horse chariots, horse riding, and so forth, and more to come.
Next up, we have to figure out the sexagenary cycle of 60 intermeshed 10 Heavenly Stems and 12 Earthly Branches (zodiacal animals) and how 5 duodenary cycles fit into that, horse being the 7th animal in that cycle of 12 zodiacal signs.
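The arithmetic behind that cycle can be sketched in a few lines of Python. This is only an illustrative sketch, not anything from the post itself: the function name `sexagenary`, the English animal labels, and the choice of 1984 as an anchor 甲子 year are assumptions made for the example, and years are treated as whole Gregorian years, ignoring the fact that the lunar new year falls in January or February.

```python
from math import gcd

STEMS = "甲乙丙丁戊己庚辛壬癸"            # the 10 Heavenly Stems
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"      # the 12 Earthly Branches
ANIMALS = ["rat", "ox", "tiger", "rabbit", "dragon", "snake",
           "horse", "goat", "monkey", "rooster", "dog", "pig"]

REFERENCE_YEAR = 1984  # assumed anchor for this sketch: a 甲子 (stem 1, branch 1) year


def sexagenary(year):
    """Return (stem+branch pair, zodiac animal) for a Gregorian year."""
    offset = (year - REFERENCE_YEAR) % 60
    # Stems and branches advance together, one step each per year.
    return STEMS[offset % 10] + BRANCHES[offset % 12], ANIMALS[offset % 12]


# The two wheels realign after lcm(10, 12) years, not 10 * 12 = 120.
cycle_length = len(STEMS) * len(BRANCHES) // gcd(len(STEMS), len(BRANCHES))

# Every year of one full cycle gets a distinct stem-branch pair.
pairs = {sexagenary(REFERENCE_YEAR + i)[0] for i in range(cycle_length)}

# Number of 12-year animal cycles inside one sexagenary cycle.
duodenary_cycles = cycle_length // len(BRANCHES)

# Position of the horse among the 12 animals, counting from 1.
horse_position = ANIMALS.index("horse") + 1
```

Because 10 and 12 share the factor 2, only 60 of the 120 conceivable stem-branch pairings ever occur, which is why exactly 5 duodenary cycles fit into one sexagenary cycle. Under the assumptions above, `sexagenary(2026)` returns `("丙午", "horse")`.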
Meanwhile, today let's celebrate the Year of the Horse in as many languages as we can think of:
The Horse zodiac sign (seventh in the cycle) represents energy and independence, known as mǎ (马) in Chinese, uma (午) in Japanese, and ngọ in Vietnamese. It is widely recognized in East Asian, Southeast Asian, and related zodiac systems that share the 12-animal cycle, including Thai, Korean, and Mongolian cultures. (AIO)
Hi Yo / Ho Silver, Away!
Selected readings
- "Mare, mǎ ("horse"), etc." (11/17/19)
- "Horse culture comes east" (11/15/20) — with very long bibliography
- "Once more on Sinitic *mraɣ and Celtic and Germanic *marko for 'horse'" (4/28/20)
- "Some Mongolian words for 'horse'" (11/7/19)
- "'Horse Master' in IE and in Sinitic" (11/9/19)
- "Horses, soma, riddles, magi, and animal style art in southern China" (11/11/19)
- "'Horse' and 'language' in Korean" (10/30/19)
- “Of horseriding and Old Sinitic reconstructions” (4/21/19)
- Mair, Victor H. “The Horse in Late Prehistoric China: Wresting Culture and Control from the ‘Barbarians.’” In Marsha Levine, Colin Renfrew, and Katie Boyle, ed. Prehistoric steppe adaptation and the horse, McDonald Institute Monographs. Cambridge: McDonald Institute for Archaeological Research, 2003, pp. 163-187 — The domesticated horse, the chariot, and the wheel came to East Asia from the west, and so did horse riding.
JMGN said,
February 16, 2026 @ 11:14 am
Here we are, horsing around with hanzi again… ;-)
Chris Button said,
February 16, 2026 @ 2:41 pm
Lai Guolong once gave a presentation where he suggested that the eye served as a quasi phonetic. I'm not sure if he ever published it.
A medial -r- is not necessary in 目 *mə̀q, but a reconstruction of mrə̀q would give the same reflex. And *mrə̀q chimes quite well with 馬 *mráɣʔ
Chris Button said,
February 16, 2026 @ 3:18 pm
Japanese uma is also attested as 馬 muma. It's similar to how 梅 ume is attested as muməj, which is also another early Sinitic loan.
anon said,
February 16, 2026 @ 4:02 pm
Can anybody illustrate for us the Mundari sadom "horse", Ho sadom "horse", Santali sadɔm "horse", Asuri sadom "horse" and proto-Kherwarian Munda *sadɔm 'horse'?
ajay said,
February 17, 2026 @ 5:33 am
I have maintained that the eye was exaggerated this way because, for Central Plains people who were not intimately acquainted with horses, they were terrified of the large jaws and especially the big, glaring eyes of the horse, though a few people have told me that they don't think those features reflected a fear of horses nor were they especially noticeable. I emphatically beg to differ.
I would not say that the eye is deliberately exaggerated. It's a dot. You can't make a dot any smaller than a dot. There would be no way, using that writing utensil and drawing a horse at that size, not to have an eye that is far bigger than it should be. See also the mane – if that's to scale, their horses have a mane of three hairs, each of which is about ten inches thick.
What it reminds me of is a child's drawing – just as a child will draw people with huge heads, huge facial features and small bodies, because the child is exaggerating the important things about a person. That doesn't necessarily mean that the child is exaggerating the features she is afraid of. And, really, who is afraid of horses because their eyes are so big? The jaws, yes, and the hooves (entirely reasonably) and the general size of it.
There are some problems, of course, namely:
1. "mare" refers to the female of the species.
It isn't unknown for the term for the female also to be the generic term for the animal. "Goose". "Hen". "Duck". "Cow". The common factor is that the female is the more useful of the sexes because it produces eggs or milk or whatever. Same could be true of horses; Central Asian horsemen I believe preferred to ride mares because they had better endurance.
Philip Taylor said,
February 17, 2026 @ 7:25 am
Two comments, in logically reverse order. (1) As to whether "Central Asian horsemen […] preferred to ride mares because they had better endurance", I cannot say, but a modern equestrian saying is "tell a gelding, ask a mare, but discuss it with a stallion". This may or may not be relevant. (2) Having read Victor's introduction, as soon as I saw the blue-grounded image at https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcRO3K6w7lDjKu4M1J4dFHRSN5hgraVOWXaMpR03i9aDkRqbax9T and long before reading the second paragraph below ("[…] the horse is standing vertically") my eye mentally rotated the top-left image through 90° anti-clockwise, but what I then saw appeared to be a cow rather than a horse.
Victor Mair said,
February 17, 2026 @ 8:04 am
In the oracle bone form, it was just a large, round eye. In the bronze form, they added a pupil.
Pamela said,
February 17, 2026 @ 9:44 am
How much are we projecting by making a problem of the fact that what is now a word that applies only to a particular sex of a species might once have been a term for a whole species? Since horse breeding began, mares have always vastly outnumbered stallions (and even geldings). Moreover, in most regions mares have tended to be local while stallions tended to be imported. It doesn't seem to me to be a problem that "mare" in some form or other could have been the normal way to indicate "horse" for some string of centuries. Horse breeders, of course, would always have needed sex-specific terminology, but breeders were generally small in number and a specialized sort of population. But they may have brought their terminologies along with them, as with their stallions, and depending on social dynamics, their terms could have caught on. In any case, awareness of horses as a wild species (and there are no wild horses surviving, only back-bred feral horses) must have predated interaction with domesticated horses by tens of thousands of years, so it seems to make sense to me that there will be a chronological layering of terms, along with interjection of breeding vocabulary as geography and social dynamics determine.
Victor Mair said,
February 17, 2026 @ 10:27 am
Sage words, Pamela!
ajay said,
February 17, 2026 @ 10:40 am
"In the oracle bone form, it was just a large, round eye. In the bronze form, they added a pupil."
Oh, you think that whole circle is the eye and the dot is just the pupil? Yes, that would be huge.
katarina said,
February 17, 2026 @ 1:22 pm
Happy Lunar New Year !
In the second graph/glyph of 馬 "horse", I am struck once again by the wit of the ancient graphic artist-cum-scribe. The scribe has made the elongated shape of the horse's head serve as the elongated shape of the horse's eye by adding the circle with dot to represent an iris with pupil.
The artist-cum-scribe has created a portmanteau graph by collapsing two images– head and eye– into one image. Eyes in a face convey intelligence and vitality, making the head-cum-eye graph dynamic, while most of the other 馬 graphs are static. Also dynamic is the third graph which seems to show the horse's mane flying in the air, suggesting speed.
VMartin said,
February 17, 2026 @ 3:21 pm
According to the author of Japhetidology Nikolai Marr the word for "horse" in Gothic language exists in Namai (translation of AI from Russian "языку нама").
Marr proposed his own explanation of Korean horse mal and Chinese ma. In his Marxist theory, the same name of an animal skipped from one domestic species to another as ancient societies developed. In this case, the same name might be used successively for a dog, donkey, horse, and camel.
In Namai common words from the circle of names of domestic animals: 'sheep' turns out to be the same word as 'cow', resp. 'bull'. The same is the case with 'horse' (ha-p), the name of which had time to be deposited even with the change (ha → ka → ga-) in the crossed terms of the Romance world (ka-bal: Latin caballus, etc.), as well as the Semitic (Hebrew ga-mal, etc., here according to the functional semantics meaning 'camel'), and the second element bal (val → mal) the same terms, Romance, "Indo-European", and Semitic, are associated with the Pacific East.
He elucidated his concept of Chinese ma and the details for the names of donkeys in ancient and Caucasian languages in the article Iberian Horse From Sea to Sea in Chinese Language and Speech Paleontology.
Yves Rehbein said,
February 17, 2026 @ 4:27 pm
The OBI shape (1) has a large head and an exaggerated eye, but the head does not resemble the eye sign itself. The Bronze Inscription shape (2) has the whole eye sign replacing the entire head. To think that this is due to shared onset is only logical (@ Chris Button), though it is not clear that that is intentional.
I had better say merging rather than replacing. Its modern form 馬 hardly resembles 目. The strokes extending to the right have apparently merged with the mane.
As for rotation, I guess that hunted animals were seen laid out on the ground and that this view became generalized. Carriages are depicted from above since the Neolithic.
Well, assume an arbitrary m-prefix, rhotacism of an approximant, tapped consonant, perhaps d after Kortlandt's effect *d ~ *h1 (say bottle of water in Bri'ish English) and don't take yourself too seriously. It might be possible to see some correspondence to PIE *h₁éḱwos. I have a couple of pet theories about this.
E.g. compare Latin asinus ("donkey, ass"): Usually compared to Ancient Greek ὄνος (ónos) (which cannot be its direct ancestor), and, just like other IE words for "ass", must be traced back to an unknown substrate source in Asia Minor (compare Hieroglyphic Luwian [script needed] (tarkasna), Sumerian (anše)). The lack of rhotacism of the single intervocalic -s- after a short vowel would point to a recent borrowing. https://en.wiktionary.org/wiki/asinus#Latin Cf. "The agnostic interpretation ‘equid’ is retained in Oreshko 2012b:161, 270." and "Luw. tarkasna- ‘donkey; horse(?)’ has no direct cognates outside Anatolian." (eDiAna #319). Note, incidentally, dragon, from δέρκομαι ("to see").
In another way to look at it *kw and *kw are prone to become bilabial (NB: Juliette Blevins's explanation of initial *m starts from plosive *b). That would leave the rest of (m)arkos to be explained, though. See at any rate δρόμος and dromedar.
The latest common ancestor of the Celtic languages is not much older than Proto-Germanic, and barely as old as late Shang (v. the Oracle Bone Script character). Wikipedia gives Proto-Celtic an era 1300–800 BC, but for Proto-Germanic it is precisely 500 BC. So that's apples to oranges, because Celtic on the mainland mostly is an archaeological, socio-cultural designation.
The bulk of loanwords in Proto-Germanic suggest a receiving position under Celtic influence (cf. The Indo-European puzzle revisited). The effect of Grimm's law is rarely observed in these (cf. most recently The Reconstruction of Indo-European Stop Systems). Germanic has another putative Mongolian loanword, quiver, however. So that is difficult to explain.
If Italo-Celtic may be considered a unitary branch (op. cit. citing Olander et al., The Indo-European language family) and if the Germanic word may be considered to be borrowed from Celtic, then the absence of evidence in Italic would set a lower bound for reconstruction. I do not adhere to that theory because I did not think it was widely accepted.
Are Mongolic, Tungusic and Turkic even that old, as their late attestation imposes limits on a feasible reconstruction?! "not earlier than the thirteenth century." (Janhunen apud WT: Proto-Mongolic).
Since I commented on the snake year, I feel compelled to add another comment. Actually, I am not sure.
Tashi Delek!
Coby said,
February 17, 2026 @ 7:35 pm
I assume that the (unstated, as far as I could tell) motivation for this post is that today marks the beginning of the year of the horse.
Victor Mair said,
February 17, 2026 @ 8:02 pm
@Coby
It's right there at the beginning of the penultimate paragraph.
ajay said,
February 18, 2026 @ 4:44 am
"Can anybody illustrate for us the Mundari sadom "horse", Ho sadom "horse", Santali sadɔm "horse", Asuri sadom "horse" and proto-Kherwarian Munda *sadɔm 'horse'?"
These are all derived from English. The horse is almost unique among animals in that it can be ridden by humans. It can be "sat on", hence "sadom".
Philip Taylor said,
February 18, 2026 @ 10:50 am
How many counter-examples would one have to adduce in order to challenge your "almost unique", Ajay ? I would add the donkey, the water-buffalo and the camel — perhaps others could add more.
ktschwarz said,
February 18, 2026 @ 2:03 pm
According to Wiktionary and etymological dictionaries, Proto-Germanic *marhaz meant 'horse', not specifically 'mare'; the specifically feminine word was *marhijō, formed by adding a suffix, and this is the ancestor of mare. Descendants of *marhaz without the suffix can also be seen e.g. in marshal (discussed previously at Language Log), Icelandic mar '(poetic) horse', and several East Germanic personal names (according to the OED's etymology).
If *marhaz was generically 'horse', then no explanation is needed for why the Celtic and potential other relatives also mean 'horse' rather than specifically 'mare'.
Jonathan Smith said,
February 18, 2026 @ 2:48 pm
ajay forgot the green font, but the Munda sadom-type words are interesting (see e.g. p. 141 in Suzuki et al., ed., Linguistic Atlas of Asia and Africa vol. 1, or all of Chap. 3 for "horse" generally)… let us test increasingly-embraced-on-LL-Asia GenAI, below ChatGPT, on "sadom type words [for horse] in smaller AA languages":
——
"Compare with Proto-Malayo-Polynesian *sadi(m) — 'riding animal, mount' (often specifically 'horse' in later daughter languages) […] this root shows up widely in western Indonesian languages. […] Blust et al., Austronesian Comparative Dictionary (ACD) entry (summary): *PMP sadi(m) — 'riding animal, mount.'"
When pressed
"I messed up. There is no solid evidence for a Proto-Malayo-Polynesian (PMP) reconstruction *sadi(m) ‘mount / riding animal’ in the major comparative sources. That form should not have been presented as a reconstruction. […] The claim that a form like *sadi(m) 'shows up widely in western Indonesian languages' is false."
Well! What is true? Who (that uses such tools) knows?! Good times.
Andreas Johansson said,
February 18, 2026 @ 3:35 pm
@Philip Taylor:
Animals that have been ridden on a regular basis include elephants (at least two species) and reindeer. "The" camel includes two regularly-ridden species, of course.
Chris Button said,
February 18, 2026 @ 4:13 pm
I would love to know where the Chinese word for camel (駱駝 / 橐駝) came from.
Philip Taylor said,
February 18, 2026 @ 5:27 pm
Ah yes, elephants I forgot, but I did not know that reindeer are / can be ridden. I have once again learned something from this erudite forum.
VMartin said,
February 18, 2026 @ 5:34 pm
According to Nikolai Marr ancient tribal concepts were diffuse and combined horse, water and river into one image, hence the frequent horse names of rivers. As an example, I found only Sequana and equus, but he also mentions the Danube. This should be reflected also in mythology: Poseidon Hippios, Neptune Equester.
Between Slovakia and Bohemia flows the river Morava, German Mähren/March. AI etymology of Mähren states that the name is different from Mähre, old horse. However, in Slovakia, a similar-sounding word for cattle is sometimes used, marha. It is also used as an insult. Marha is from Hungarian and means cattle. I remind you that Hungarians came to Europe from Asia on ponies and not horses, as they present it. They wedged themselves into the Old Slavic settlement with established agriculture. Here AI etymology teaches us that Marha is derived from German Marchat and Latin marcatus, as market. But why would Hungarians take this word from German and the other words inextricably linked to the cattle from Old Slovak? Compare "smith", Slovak kováč, Hungarian kovacs; "hay", Slovak seno, Hungarian szena; "a group of servants on a farm", Slovak čelaď, Hungarian cseléd; etc.
German Mähren is in Czech/ Slovak Morava. AI for Morava states that the word "evokes water".
DDeden said,
February 18, 2026 @ 11:04 pm
Jonathan Smith, I think Blust's correction is correct.
Today, in Malay and Indonesian, tunggang/menunggang refers to riding a horse or riding in a car, while naik refers to 'climb aboard (a bus)'. I've never heard of sadi(m) or sadom used in that sense.
David Marjanović said,
February 19, 2026 @ 8:46 am
That's why the German form has umlaut, Mähre. (It doesn't mean "mare" anymore, it's a pejorative for horses in general, but it's still grammatically feminine.)
There is no way to tell from the words themselves whether Proto-Germanic *marha- is cognate with Proto-Celtic *marko- or a loan from it; both ways we'd get the same result (as long as the loan happened before Grimm's law did, but 1. most Celtic loans in Proto-Germanic happened before Grimm's law, and 2. there's no obstacle to dating Grimm's law pretty late). It happened before first contact with the Romans, but the figure of "500 BC" that is traditionally bandied about is just a somewhat educated guess with a huge error margin.
If they're cognate, i.e. inherited from the last common ancestor of Proto-Germanic and Proto-Celtic, it does not follow that it was already present in PIE because it is not likely that PIE was the last common ancestor of PGmc and PC. Further, the *a is, at best, highly unusual for a PIE root, so this alone already makes the word look like a loan, even if it was loaned into an ancestor of "West IE" (Germanic + Italo-Celtic) long before 500 BC. We could explain the *a away by postulating a PIE-level *mh₂r-kó-, but, if this was even phonotactically possible, there doesn't seem to be a root *meh₂r- or a root *mh₂er- that *mh₂r- would be the zero grade of. Also, in this case, the stressed diminutive-or-so suffix *-kó- would guarantee that the word was borrowed from (Pre-)Celtic into Pre-Germanic because otherwise Verner's law would have turned the *k into *g [ɣ] instead of *h [x]. Celtic had already defaulted all stress to word-initial by that time, and initial stress would indeed generate the observed Germanic *h.
In short, the mare word looks like a loan from a local language in northwestern Europe into Celtic and, directly or indirectly, Germanic; and its similarity to the Sinitic version of the northeastern Asian "horse" word looks like just another coincidence.
There aren't even any traces of it in Turkic, are there?
Well, *ʔ argues against a loan directly from a form with *k, and the Mongolic etc. forms have no plosive at all.
ajay said,
February 19, 2026 @ 10:02 am
Philip et al: Routinely ridden animals, off the top of my head, are horse, donkey, mule, dromedary, Bactrian camel, Indian elephant, North African forest elephant (now extinct but used by Hannibal among others), water buffalo/carabao and no doubt other similar bovines, reindeer, yak.
I don't think anyone's managed to ride African savannah elephants.
A few other animals *could* be ridden (zebras, goats, giant tortoises, ostrich, maybe llamas) but that's more in the "stunt" category than "routinely ridden".
Jonathan Smith said,
February 19, 2026 @ 1:28 pm
@DDeden to be clear, all of that including under "when pressed…" is ChatGPT. I know nothing of the relevant languages but didn't on (quick) inspection find anything to suggest its statements re: Blust and associated reference works are anything other than sheerest hallucination, to use the current euphemism.
Yves Rehbein said,
February 19, 2026 @ 5:38 pm
@ DM, I am not sure if a suffix would participate in borrowing. Sinitic is perhaps not strictly monosyllabic, and there may well be lookalike compounds I don't know, but it is not agglutinative, though whether traces of affixes remained is another debate.
Depending on the interpretation of *h2 /*χ/ (after Kloekhorst e.g.) merging with *h1 /*ʔ/ in *H may be comparable with /*mˤraʔ/ under metathesis, although I am not sure it is reliable. For examples see Lubotsky, in the above cited ed. Mair 1998, vol. I, and Matisoff also said that r frequently participates in metathesis. A velar suffix could be added independently or be metathesized, too, see *meh₂ḱ-, macro-, *méǵh₂s ~ *m̥ǵh₂-, mega-, and no laryngeal for *megʰ-, (to be) mighty; similarly Latin ego and mihi, quite uncertain, hic sunt dragones. Besides, the sequence *ḱr is also seen in English horse.
Interestingly, a root *(s)meh₂- exists independently (s.v. manus), which is ultimately not clear to me but its Germanic meaning (*mundō "hand; protection, security") would remind of *peh2-. The parallel of *méh₂tēr and *ph₂tḗr should support this, although from my perspective it is not warranted, not only because Buba-Kiki words are frequently onomatopoeic, but since my intention was to compare *h₁éḱwos and *daps, I would likewise argue for a suffix, which agrees with *ḱwṓ ~ *ḱun- ("hound") and to a lesser extent with *h₂ŕ̥tḱos ("bear") and not very much with *wĺ̥kʷos ("wolf"), though *waylos (the same) should count for something. Incidentally, PST *d-kʷəj-n (s.v. 狗犬, Wiktionary) notes *m-par ~ pra "wild dog, wolf" with m-prefix. Cf. Burmese wampu.lwe "wolf", wam "bear" + pu.lwe "flute" (Wiktionary) makes sense because it howls (PIE *wáylos "howler"). As for pan-Slavic medjed, *med "honey" + *(j)ěsti "eat": We know that honey was borrowed in Sinitic (thus Lubotsky 1999), which would constrain the comparison somewhat.
Chris Button said,
February 19, 2026 @ 6:11 pm
Best not ignore the Mon-Khmer velar nasal coda that also pops up highly sporadically in Tibeto-Burman languages.
Michael Vnuk said,
February 20, 2026 @ 5:39 am
Many decades ago, when Christmas cards were far more common, I received one from an unknown person. The envelope had no return address and the only potential identifying feature was a signature, but I didn’t recognise it as a whole, nor could I make out any individual letters with confidence, apart from a possible J at the beginning. Later, I found out that the card was from one of the organisation’s top managers. This manager was many levels above me in the chain of command, and I had had almost no dealings with him, directly or indirectly, hence I didn’t recognise his signature. (There had been numerous recent resignations in the organisation and perhaps the card was to show that management appreciated those who stayed. I never got another card in the following years I was with the organisation.)
Signatures are often like that. When the signer is young, the signature usually comprises fairly recognisable letters, but signatures often evolve (mutate?) into something quicker to write, and legibility can suffer. The US Secretary of the Treasury (2013–2017), Jacob J Lew, had an illegible signature (like a piece of stretched Slinky), but he must have been convinced to be more careful for the version of his signature that ended up appearing on paper currency.
And signatures remind me of sinographs. In some, you can see more or less what the original was, but it is impossible in others, even when you are told what the sinograph or signature was based on. Victor Mair says that none of the hundreds of people he’s shown the modern ‘horse’ sinograph to have recognised it as a horse, even with some hints. (It’s just possible to see how it has evolved over the centuries in the diagram he provides, but that’s got nothing to do with recognising the modern sinograph.) So if no one can recognise it, why did he tell people that ‘it was a picture of a common animal’? It is not a ‘picture’.
David Marjanović said,
February 20, 2026 @ 5:58 am
As always that depends mostly on whether the borrowers recognize it as such. The classic case is Arabic nouns showing up with al- in Spanish but without it in Italian.
Old Chinese still had a few productive suffixes (mostly *-s).
More likely *[h] actually; this would not only explain why it behaved phonotactically like the other fricatives, but also Bozzone's law (…speaking of horses…!). The "Kortlandt effect" is not (as the Leiden school says) a merger of *d into *h₁ which then always disappears with compensatory lengthening, but a direct dissimilatory loss of *d with compensatory lengthening.
I do think *h₂ [χ] merged into *h₁ [h] in a very large IE branch (West IE + Indo-Slavic or so), but borrowing that as [ʔ] would require some additional explanation.
If you equate everything with everything, you'll end up going mad like Marr who in the end believed all language had developed from the four syllables "sal, ber, yon, rosh".
I'm not sure what you even mean. There's no agreement between "horse" and "dog", if that's what you mean; "horse" does not have a prefix – its root is *h₁éḱ-, "fast".
(Also, the form *h₁éḱwos with two unreduced vowels is not even Proto-Indo-Actually-European. Anatolian inherited the expected *h₁éḱus, e.g. Hieroglyphic Luwian á-zu- */hatsu/-. Then the word was turned into an *o-stem, giving *h₁ḱwos, the starting point for the Greek form (*i inserted as a zero-grade marker and stressed by nominalization; see link above). And then the stress was shifted back to the root to nominalize the adjective, giving *h₁éḱwos at last in the rest of the family.)
David Marjanović said,
February 20, 2026 @ 7:09 am
Forgot:
Does indeed contain a suffix.
No, medved, *medʰu- + *ed-. It looks like "honey-knower" (med- + ved-), but that's not what it is.
Chris Button said,
February 20, 2026 @ 7:57 am
Most of the affixes posited in Old Chinese are unwarranted in my opinion.
Regarding the comment about the word for dog above, I posted this in an earlier thread:
"A combination like *-waɲ could not occur in a native Old Burmese word (the combination of an earlier medial -j-, which would have palatalized an earlier -n coda, with the medial -w- is a phonotactic violation). So, the Burmese word /kʰwé/ "dog" is a perfect correlate for 犬 with its nasal coda in Old Chinese.
Unfortunately, comparisons in the academic literature tend to just talk about a spurious -n suffix, which is wholly unnecessary (like many of the pseudo affixes that are often posited)."
VMartin said,
February 20, 2026 @ 10:56 pm
Chris Button said,
February 21, 2026 @ 6:07 pm
I think I'm going to have to retract my support of this proposal.
目 reconstructs as *mjə̀q (as Pulleyblank 1977-8 has it) with a -j- rather than a -r- or nothing. The reflex would have been the same in EMC as mukʷ, but there is solid evidence for the -j- in its relationships.
David Marjanović said,
February 24, 2026 @ 2:47 pm
Last time we talked about this, I mentioned a conference abstract (which, unfortunately, I didn't link to and haven't found again) that proposed that 犬 did not have a nasal coda and instead ended in *-r. In your reply you didn't explain why you disagreed with that.
Just soon enough to save the lives of a whole bunch of actual linguists. The biologists were not so lucky.
Marrism : linguistics :: Lysenkoism : biology
Random pseudoscience.
Dkl said,
February 24, 2026 @ 5:50 pm
@VMartin: "According to Marr, who was indeed sent to a madhouse by some of his opponents…" and so on.
Apparently you are making it up. Nikolay Marr wasn't sent to any medical institution by any of his opponents. He ended his days in 1934 at the peak of recognition and glory. For fifteen more years his Japhetic theory and linguistic paleontology were used against the "bourgeois linguistics" the same way as Lysenkoism was used to fight the "bourgeois genetics". It was not until 1950 that linguistics in the Soviet Union was rehabilitated.
Lacking linguistic (or philological, as it was called in those days) education, he did not even understand the arguments of his opponents. What you cite is fringe (and cringy) pseudoscience.
Chris Button said,
February 24, 2026 @ 11:50 pm
@ David Marjanović
The presentation was by Laurent Sagart in 2018: "OC *-r in early Chinese loans to Bùyāng, and related issues".
I happened to be in attendance.
The idea that -r shifts randomly/dialectally to -j or -n is a wild card in my opinion.
The supposed contacts between -j and -n come rather from -l (the shift of -l to -j came later) and -n sometimes overlapping in rhyming.
As for Old Chinese 犬 kʰʷə́ɲˀ, it is the perfectly regular correlate of Old Burmese kʰwɨj² “dog” (tone ² even parallels the glottal in Old Chinese).
Take a look at how the anomalous word klwaɲ "serve" in inscriptional Burmese, which I believe is the only known case of -waɲ, is also written in a more expected form as klwɨj.
VMartin said,
February 25, 2026 @ 6:53 am
@Dkl
You are right, his opponents only said that he should be sent there. And yet Alpatov, in his critical The History Of One Myth (2004), also wrote positively about Marr's studies on Kartvelian languages and stated:
Marr as Caucasus scholar was also highly regarded abroad. In the early 1920s, he was invited to occupy the Georgian Chair at Cambridge. Even as late as 1946, V. Manning wrote in the journal "Armenian Quarterly": "Armenians and humanity will never forget Marr the Armenian scholar".
David Marjanović said,
February 25, 2026 @ 11:26 am
I was wondering why I hadn't seen that story before…!
Oh! Awesome. (That presentation is prominently featured on his academia.edu page; I just didn't look inside this time.)
Definitely not randomly, but of course it would help to have additional evidence for these eastern vs. western dialects that (regularly) merged -r with one or the other.
Is it clear what the regular OB correlate of OC -r is?
That's when he was still in shouting distance of sanity. He kept getting worse over the decades.
Chris Button said,
February 25, 2026 @ 1:49 pm
Random variation would of course need to be justified as some kind of "dialectal" variation to make it sound acceptable and not just a wildcard to be pulled out when needed.
David Marjanović said,
February 25, 2026 @ 5:58 pm
I found the apparently only proposal: zero.
Chris Button said,
February 25, 2026 @ 8:17 pm
… which presumably requires the assumption that there was an -r coda in Old Chinese as something distinct from -n?
Chris Button said,
February 26, 2026 @ 7:56 am
And so a better question is what Pulleyblank was asking back in 1962:
What was it about the /n/ coda that allowed it to overlap internally in Chinese with /l/ and also externally in foreign words with /r/?
I'm thinking of sounds like [ɾ̃] or [n̆] perhaps?
A look at the phonetics behind the Yue l ~ n onset interchange could be informative.
David Marjanović said,
February 26, 2026 @ 4:37 pm
Yes. It seems you doubt that?
Are you trying to imagine an actual human language in which syllables can end in the [ɾ̃] found in AmE international but not in something straightforward like [r] or [n]?
Chris Button said,
February 26, 2026 @ 5:33 pm
Of course!
The -r is just an artificial construct designed to account for the faulty notion that -j and -n overlap.
I'm trying to understand what it is about the -n coda that allowed it to overlap internally in Chinese with -l and also externally in foreign words with -r.
That is the question that you should be asking.
I would start by looking at things like the Spanish [ɾ] coda or the Quebecois rhotic articulations of nasalized vowels in syllable final position.
The problem since Karlgren's time is that phonetics have been blithely ignored in favor of the phoneme.
I can do no better than quote Egerod (1955) in his review of Martin's phonemicization of Karlgren:
"… it is precisely the richness of the living language, the wealth of phonetic information checked and rechecked with informants, which makes phonemic simplifications of a definite nature possible, whereas in a reconstructed language like Ancient Chinese the formulae, more or less inadequately draped in phonetic garb, do not possess sufficient phonetic richness to allow of much further interpretation. There is even a vicious circle involved: since often phonemic considerations of patterning and contrasts have been employed, consciously or unconsciously, to arrive at the phonetics, these in turn cannot in such cases help us to establish the phonemics of the language."
Chris Button said,
February 26, 2026 @ 5:50 pm
As Pulleyblank (1962) notes, we find 安 transcribing Antonius and Aršak.
David Marjanović said,
February 26, 2026 @ 6:05 pm
…Oh.
Evidence for a separate *-r in OC is right there in Sagart's presentation handout:
In short, Càijiā – like Mǐn – is descended from OC but not from MC; instead of merging *-r chaotically into *-n in some words but into *-j in others, it keeps *-r and *-n distinct as zero and whatever ɴ is (it's the IPA symbol for the uvular nasal, so maybe that's what's intended, but you never know).
The long next section shows that OC *-r corresponds to Tibetan zero after *i *u, but to Tibetan -r after all other vowels.
Later:
…and…
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Hill 2016 is also of interest, particularly (all brackets in the original, footnotes and superscript letters indicating different types of evidence omitted):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In addition to Dakota and Lakota, there is Nakota, which has merged *r into *n.
Cree has dialects that keep *r distinct as [r] or [ð], while others merge it into [j] or [n]… the endonym of the language illustrates all by itself that these are mergers: yêhiyawêwin in [j]-dialects, nêhiyawêwin in [n]-dialects.
David Marjanović said,
February 26, 2026 @ 6:10 pm
If you're trying to say that rhymes are the only evidence for *-r, that is not so, as I just posted.
In the same work, by the same transcriber?
David Marjanović said,
February 26, 2026 @ 6:12 pm
…in a long comment that then disappeared. Probably it's in moderation because it contains two links.
Chris Button said,
February 26, 2026 @ 7:26 pm
None of what you cite is evidence for -r in OC.
What you cite are hypotheses for which there are alternative interpretations. Circular arguments do tend to be self-reinforcing.
If you're really interested in OC, I recommend reading more widely. Bear in mind, this is not a popular hard science, such as biology, so older articles are often better than newer ones because there isn't much out there.
The question as to why some Tibeto-Burman languages show a three-way -n, -l, -r distinction where Chinese only shows two is an interesting one.
Regarding -r vs -n, Pulleyblank suggested "a merging of the two phonemes into a single one with some of the characteristics of both."
It's a good hypothesis. But it's also worth noting that /r/ patterns like /j/ and /w/ in OC as a prosody in the OC syllable. And neither -j nor -w existed as original codas. Rather, they resulted from coda palatalization/labialization.
Chris Button said,
February 26, 2026 @ 8:07 pm
Basically, it starts with a faulty premise: there is a weird OC -j ~ -n alternation (e.g. in the classic 殷 vs 衣). That faulty premise is then addressed with a spurious internally unattested coda -r. That spurious coda is then forced into service elsewhere to provide "evidence" for the faulty premise (e.g. in "dog").
Jonathan Smith said,
February 26, 2026 @ 11:39 pm
In that conference paper Sagart says (p. 4) "with exceptions, -ɴ is the expected Càijiā treatment of [OC] *-m, *n, *-ŋ" — but he then footnotes about as many "exceptions" as positive examples. And there are plenty more even in the short available wordlists… just one is wrt his remark (also p. 4) that
"Càijiā loses *-r" as shown by ka33 'dry' and ka33 'liver'… but looks like Caijia 'cold' is ka31, also Chinese-looking but AFAIK regarded as OC *-n… and these three items look kinda a lot like e.g. SMin (Taiwanese) koãA1 'dried [meat, fruit, etc.]', koãA1 'liver', and koãA2 'cold' w/ their nasalized a…… who knows about "OC", but I would hazard a guess on this and other bases that Min and Caijia progenitors were connected meaningfully and complexedly.
(If what Wiktionary reports about Bo 2004 is right, Caijia ka33 actually means [Mand.] shai4 曬 'expose to the sun (e.g.) to dry', not 'dry adj. (like e.g. earth)', bringing it closer to the Min item above, but who's counting.)
At any rate the idea that a clear signal of hypothetical OC *-r ≠ *-n was going to be straightforwardly reflected in short, vague (e.g.) Caijia wordlists was always pretty dubious… the nature of contact w/ multiple kinds of Chinese among others across millennia was just more complicated than such a notion allows… FWIW I don't think most people even regard this language as Sinitic, though Sagart (as is typical) states this as fact.
BUT no @Chris Button you are still not right :D Sagart at least looks at language data, and comes up with (wild) hypotheses about them. I really don't know what compels you to present as an authority on such matters (as if there were any) and exhort others to "read up" all the time *sighs deep*
Chris Button said,
February 26, 2026 @ 11:49 pm
The shift of the EMC palatal nasal ɲ to the LMC retroflex rhotic ɻ presumably via a retroflex nasal ɳ has me wondering if the OC -n coda was articulated more like -ɳ to distinguish it from OC -ɲ.
Chris Button said,
February 27, 2026 @ 12:39 am
@ Jonathan Smith
Agreed.
@ David Marjanović
Returning to "dog", the failure of the onset in the Lushai word for "dog" (cited above) to correspond with the OC velar is simply being ignored.
I'm intimately familiar with it because it was a word I carefully elicited when collecting fieldwork data to see if such words began with glottal stops or not.
The spectrogram clearly showed that it does, although several related Northern Chin languages do just have a vocalic (i.e. zero) onset there.
Chris Button said,
February 27, 2026 @ 1:01 am
@ David Marjanović
I apologize if this came across as patronizing.
And I don't mean to unduly criticize Baxter & Sagart, who are both scholars I have met and respect.
What I should have said is that Baxter & Sagart's work is always worth reading. But there is far more to Old Chinese than just what they say. And there are a lot of unresolved issues (as I'm sure they would both admit).
Yves Rehbein said,
February 28, 2026 @ 3:26 am
@ David Marjanović, since I mentioned onomatopoeia, I could see *h₁éḱ- being that clicking sound of a horse's galloping hooves. I knew a guy who made a convincing impression of it. "Beatbox impression horse" gives good results, too.
Incidentally, to tap is compatible with *d. Quasi-PIE *daps, that I mentioned before, does have a synonym PIE *h₁yaǵ- ("to sacrifice"), which may be cited in many different forms (s.v. Greek ἅγιος e.g.). However, the similar form of *h₁eyǵʰ-, including German jagen ("to chase")[1] and a Celtic cognate ("to scream, cry")[2] – though both are uncertain – has occurred to me while thinking about vulgar expressions gurken and juckeln ("to drive" vel sim.). Chatbot suggested that Jucker does in fact refer to a horse. Wiktionary: Proto-Germanic *jukkōną (“to hop; to run”) is undecided, and Wikipedia refers to a Hungarian kind of span ("Ungarische Anspannung (Jucker-Anspannung)"). At that point the similarity to yoke should be considered, cf. PIE *yewg- ("to join"). This would lead me to comment on the calendrical signs, since on the one hand a span or team is often grammatical dual, and seven on the other hand can be construed as five and two, thus Sumerian imin, though it is not actually clear how PST *s-ni-s (七 OC /*[tsʰ]i[t]/ "seven") should be connected to PST *kV-ni-s (二 OC /*ni[j]-s/ "two", 次 OC /*[s-n̥]i[j]-s/ "second; put in order")[3] and the similarity to the seventh earthly branch isn't great (午 OC /*[m].qʰˁaʔ/, B–S; *ŋ…, @CB).
[1]: e.g. Breton chas ("dog") may be from French chasse (“hunt, hunting”).
[2]: the Egyptian hieroglyph of an adorant with raised arms that is linked to the letter he, our E, is explainable from Semitic hll ("jubilation").
[3]: next consider the following, no pun intended.
Honestly, I wanted to comment on onomatopoeia neigh, German wiehern w.r.t. PIE *(H)wí- ~ dwoH (Kortlandt effect). It's such a huge topic, sure I like to compare everything with everything (@DM). That is called brute force search or backtracking. Yet I do not see how *kw or even *kʰʷ could be onomatopoeic w.r.t. 犬. So I disagree with myself a lot, when what you want is a professor who states everything as fact (straight from the horse's mouth :P). Remarkably though, PIE *h₁éḱ- is missing from the one branch you need it to be in. That's just a theory and I would hedge it appropriately.
Yves Rehbein said,
February 28, 2026 @ 3:46 am
Circular arguments do tend to be self-reinforcing. @ Chris Button, Hoenigswald, whose textbook is credited as the classical exposition of the comparative method by Ringe, wrote in a later article – I want to say admits – that the comparative method is to some degree circular and that that is not a problem as long as everyone agrees (Is the “comparative” method general or family specific? Henry Hoenigswald, in: Patterns of Change – Change of Patterns, ed. Philip Baldi 1991, p.183-192).
As far as language contact in prehistory, experts agree on hardly anything. Old Chinese on the other hand seems to suffer from a strong focus on internal derivation (cf. "bottom-up reconstruction principles adopted in Baxter and Sagart 2014", cited above).
David Marjanović said,
February 28, 2026 @ 5:46 pm
Points taken, no offenses taken! It also goes without saying that I don't mean to deify (or demonize) anybody; moreover, Baxter & Sagart don't do that themselves, as seen e.g. in their responses to criticism (of which I've read this response to Schuessler and this more recent response to Ho Dah-an). I'm also aware of criticism of very specific details within their system by other people, e.g. here and here.
There are cases of that, sure, and I absolutely agree that since you're trying to reconstruct an actual language that actual people actually spoke*, you need to have hypotheses on how your reconstruction was actually pronounced.
* Some historical linguists have tried to convince themselves that that's not their goal, not even aspirationally; that reconstructions only are notations for correspondence sets and can't be more than that. Frankly, that doesn't work.
And where would that come from? Is there a publication on this?
But there are three sets here: one only has evidence of *-n; one only has evidence of *-j; and one has evidence of both. Most words with *-n or *-j show no evidence of alternation. Some of them aren't attested widely enough to be sure we just haven't seen their alternations so far, but some are.
A straightforward way of accounting for why some words have this alternation while others lack it is to postulate a third set. Postulating, further, that this third set had specifically *-[r] makes sense of a number of further facts, including but not limited to external comparison, so I can't see why you seem to hate it so much. Why do you call it "faulty"? Where is the fault?
…I just learned it's somewhere around Bái, meaning that it's not clear if 1) it's descended from OC as I asserted above, or 2) is more closely related to Sinitic than to any other ST branch but not actually descended from OC, or 3) nowhere near Sinitic (perhaps Qiāngic or what's left of that), so that all (instead of just most) its Sinitic vocabulary is layer upon layer of loans. As long as the cited words belong to the same layer, it ought not to make a difference, but this would be good to clarify…
More likely via affrication, AFAIK. That's easier to imagine than a palatal becoming retroflex (unconditioned at that).
Even if you're right and there was an OC *-ɲ, I would not expect that for typological reasons. What you could get, though, is velarization, as found in Russian and Gaelic. (It's going to be a few months before I hear a lot of Lithuanian, but I expect the same there.) While I'm at it, velarized vowels (as opposed to consonants) are a thing in Gyalrongic languages…
I greatly appreciate your fieldwork, then! But how sure are you that this glottal stop doesn't correspond to an OC pharyngealized velar? Those easily become uvulars (as happened in Arabic, and in one of the early layers of Bái), and uvulars easily become glottals (as happened to voiceless stops in Egyptian Arabic, Maltese, and under certain conditions MC).
An obvious problem here is that ST phylogeny and sound correspondences are not well understood…
What, [hɛkʲ]? Really?
Googling "beatbox impression horse" does bring up lots of results, most of them on TikTok which is such bloatware that I prefer to avoid it. This impressive sample on YouTube consists entirely of clicks.
…maybe.
(For the modern language at least, I'd say "hunt" is the primary meaning; "chase" is rather hetzen. But such things change, as shown by chasser meaning "hunt"…)
There's really not much similarity between *h₁eyǵʰ- and *yewg-.
What do you mean? Anatolian? It's there.
I haven't read that, but so far I can't agree. It's quite ordinary science: it makes hypotheses and tests them against data and parsimony.
What exactly do you mean?
David Marjanović said,
February 28, 2026 @ 5:47 pm
(Comment with six links in moderation.)
Chris Button said,
March 1, 2026 @ 11:39 am
@ Yves Rehbein
Burmese shows "seven" to be a compound of "six-two" (i.e. the second number after you begin counting in fives again from 6 onward). There was a discussion about this on LLog a while back.
@ David Marjanović
– ʲəɣ > -əj
– ʲək > -əc
– ʲəŋ > -əɲ
etc.
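As a rough illustration only (not taken from any published reconstruction), the three rules listed above can be written as a small lookup table in Python. The function name `palatalize_rhyme` and the table `PALATALIZED_CODA` are hypothetical, and the rules elided by "etc." are deliberately not filled in:

```python
# Hypothetical sketch: only the three coda changes quoted above are encoded.
# Anything covered by "etc." in the comment is left untouched on purpose.
PALATALIZED_CODA = {"ɣ": "j", "k": "c", "ŋ": "ɲ"}

PROSODY = "ʲ"  # palatal prosody mark written before the rhyme


def palatalize_rhyme(rhyme: str) -> str:
    """Apply 'ʲəC > əC-palatal' to a rhyme that carries the palatal prosody.

    Rhymes without the prosody mark, or with a coda not in the table,
    are returned unchanged.
    """
    if not rhyme.startswith(PROSODY):
        return rhyme
    body = rhyme[len(PROSODY):]
    if body and body[-1] in PALATALIZED_CODA:
        return body[:-1] + PALATALIZED_CODA[body[-1]]
    return rhyme


if __name__ == "__main__":
    for r in ["ʲəɣ", "ʲək", "ʲəŋ", "ən"]:
        print(r, ">", palatalize_rhyme(r))
```

The point of the table form is that the prosody is consumed and the coda shifts in one step, so a palatal nasal coda falls out of the same process as the palatal stop and glide.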
I think you made a typo with -ɲ for -ɳ.
There were retroflex nasal codas of a sort in EMC when the "r" prosody in type-A syllables was thrown from the onset to the coda (e.g. -ʳán > aʳn), which would presumably give a phonetic output similar to the Norwegian retroflex nasal codas from -rn clusters. But I agree with your point about typology nonetheless.
I was actually just reading Henderson's (1966) analysis of Vietnamese, where she notes the following about Southern Vietnamese (SV): "SV -n is post-alveolar and markedly retroflex". Confusingly she then later uses the phonetic symbol -ŋ (rather than say -ṉ or -ɳ) in accordance with standard analyses of Southern Vietnamese.
But I do like your comments about Russian and Gaelic! The -n coda obviously can't merge with -ŋ, but an Irish-style -nˠ for -n could work really well! That would actually create a really nice counterpart to -ɲ (i.e., -nʲ).
They've created a problem that doesn't exist and then tried to solve it.
The connections in rhyming/xiesheng are not between -n and -j but between a far more expected/normal -n and -l.
Firstly, I don't believe there were any pharyngealized velars (by the way, they reconstruct pharyngealization across all major onsets, which includes pharyngealized uvulars).
But putting that aside for the moment, the only other comparable case I can come up with in my datasets where Mizo (Lushai) Ɂ- might somehow correspond with an Old Burmese velar onset is the word for "excrement": Mizo Ɂe² and Old Burmese *kʰlɨj² (compare 屎). But Benedict and Shorto both note an Austroasiatic source for the Mizo form.
And it's the phonetics (not the phonemics) that are behind the sound changes. To be clear, I'm not dismissing the tremendous value of phonemic analysis at all, but all to often it devolves into simplistic notations with no real attention to what could actually have been going on phonetically.
Chris Button said,
March 1, 2026 @ 11:42 am
Last sentence should say: "… but all too often it devolves…"
Chris Button said,
March 1, 2026 @ 2:49 pm
Yes, I agree with you about ɲ- not going through a ɳ- stage to get to ɻ-.
In addition to the phonetics, ɳ- remains as ɳ- between EMC and LMC (Pulleyblank actually treats ɳ- as an nr- affricate rather than a proper retroflex in any case).
Pulleyblank (1983:89) postulates: ɲ > ɲdʑ > ʑ̃ > ʐ̃ > ʐ. He adds a note that he regards ʐ- phonemically as r- but phonetically ɹ-, which he treats as a retroflex so should really be treated as ɻ-. For context, there was a general merger of EMC palatals with LMC retroflexes: ɕ > ʂ, ʨ > ꭧ, ʑ/ʥ > ʂʱ.
Chris Button said,
March 1, 2026 @ 2:50 pm
ɲ > ɲdʑ > ʑ̃ > ʐ̃ > ʐ
(missed a semicolon in the html)
Yves Rehbein said,
March 3, 2026 @ 3:30 am
@ DM, yes, but no. There is no Anatolian adjective "fast" either in the Wiktionary or eDiAna under *h1ek̂u̯o-.
And no, not [*kʲ], see AmE. tsk-tsk, BrE. tuk-tuk, for example.
David Marjanović said,
March 3, 2026 @ 4:08 pm
I'll try to find that, because I don't think this particular way of building numerals is attested anywhere.
So, where others reconstruct *-i, *-ik, *-iŋ for OC, you reconstruct *-əj, *-əc, *-əɲ? Where B&S and some others reconstruct *-e, *-ek, *-eŋ, you reconstruct *-aj, *-ac, *-aɲ? And where others reconstruct *-e and *-aj separately, you diagnose an error, presumably involving your *ɣ? Am I getting this right?
That's interesting especially because B&S *-r already corresponds to both -r and -l of other ST languages. Have you elaborated on this somewhere?
Yes, I know. That definitely makes for a consonant inventory on the large side, but surprisingly many East Caucasian languages distinguish plain from pharyngealized uvulars (and from plain velars, though they don't have pharyngealized ones). Given that B&S themselves have said in a publication that they think this extra-large consonant system only pertains to the final stage of what they call OC, and that */ʔ/ was a separate segment before that, I'm not losing sleep over that. (…Even though their speculation about where this */ʔ/ came from at a yet earlier stage is less compelling.)
Oh, I agree. Maybe it's IEists in particular who like to claim we can only reconstruct phonemes, and phonetic detail is difficult to reconstruct or lost forever, while in reality it's often easier to reconstruct sounds than to figure out what the sound system looked like, in other words which phonemes these sounds belonged to (which sometimes changes in a sound change).
We're in violent agreement. :-)
The *-o- is newfangled; it's missing in the Anatolian version of the "horse" word. "Thematicization" of an *u-stem adjective has precedents.
…I'm not claiming certainty that *h₁ was [h], but to suggest that it was a dental click… doesn't strike me as serious.
*ḱ pretty much has to have been [kʲ].
David Marjanović said,
March 3, 2026 @ 6:15 pm
Argh, I misclicked. I mean /ʕ/ of course, not "/ʔ/".
Chris Button said,
March 3, 2026 @ 6:42 pm
It's quite literally "six two" in Inscriptional Burmese. Unfortunately almost all comparisons with Burmese are made with Written Burmese in the literature.
B&S's *-e and *-aj are Pulleyblank's *-aj and *-al.
I recall when comparing data across languages that distinguishing the liquids unsurprisingly became very messy very quickly.
Northwest Caucasian too.
But the idea that pharyngealization occurs regularly across all basic onsets to account for a bifurcation in OC syllable types seems far fetched and typologically insupportable.
For an alternative perspective, I recommend you read Pulleyblank's short "Prosody or Pharyngealization in Old Chinese" article, which is a response to Norman's pharyngealization proposal.
Bear in mind, pharyngealization plays an important role in Pulleyblank's reconstruction of Middle Chinese, but it has nothing to do with what Norman (and by extension B&S) are talking about.
By the way, have you read Pulleyblank's "Middle Chinese" (1984) book? That is the one book I would recommend over anything else, but it is an insanely challenging read!
Chris Button said,
March 3, 2026 @ 11:17 pm
All things considered, I think a nasalized lateral [l̃] allophone for -n probably best accounts for the internal and external connections.
Incidentally, I have a theory that the attestation of -r, -l, -n codas in some TB languages, such as Kuki-Chin, versus just -l and -n in OC might perhaps be tied to a rhotic prosody in the syllable that paralleled the palatal and labial prosodies. That's why -j and -w do not occur originally as codas in OC either.
But the lack of need to reconstruct an -r coda in OC means I haven't really needed to consider the viability of the idea. And I have no idea what it would mean for OC-PTB comparisons.
David Marjanović said,
March 4, 2026 @ 6:12 pm
That sounds interesting and worth pursuing!
No, and I'm flattered that you even think it's possible I might have! I'm a complete autodidact in linguistics (thank Gore for the internet) – and what Sinitic I actually speak, let alone read, is limited to a few beginner's courses in Standard Mandarin, so I suspect I couldn't follow a lot of the book. My vocabulary is tiny.
But 1984 was a long time ago. Baxter's 1992 book on MC and all OC reconstructions except Karlgren's, I think, have built on that book. Is there really a point in climbing down the giants upon giants?
Chris Button said,
March 4, 2026 @ 8:08 pm
Baxter doesn't bother to really reconstruct Middle Chinese; he just provides what he calls a "notation".
Pulleyblank thinks a good reconstruction of Middle Chinese is a prerequisite for reconstructing Old Chinese.
David Marjanović said,
March 5, 2026 @ 10:47 am
I know Baxter's is a notation for what the rhyme books & stuff say; is Pulleyblank's an actual comparative reconstruction from the descendants of MC?
Chris Button said,
March 5, 2026 @ 4:35 pm
Pulleyblank attempts to use all the evidence available. That includes everything from internal Sinitic language/dialect data to Sino-Xenic data. So he covers everything from Min to Sino-Japanese. The base is still of course the Qieyun and Yunjing.
You should read the book. It is immensely challenging, but the challenge is to do with linguistic content rather than knowledge of the Chinese script. I recall that in his review of Pulleyblank's associated 1991 lexicon, Egerod said something about it being incredibly useful for anyone who has struggled to glean the essence of what Pulleyblank is saying in his 1984 book.
David Marjanović said,
March 7, 2026 @ 3:33 pm
Good to know, thanks!
Yves Rehbein said,
March 13, 2026 @ 2:53 pm
Well, you are potentially wrong, @ DM. David Sasseville et al. (eDiAna) reconstruct *h1ek̂u̯o- on the assumption that the Luwian u-stem is phonetically conditioned. I don't need to lecture you.
The fact is that the adjective seems to require abstract nonsense in the form of *Ho‑h1k̑‑u- (vel sim.) ‘swift’ (eDiAna), attested chiefly in Graeco-Indo-Iranian, which is too late to support your theory in most phylogenies. It is correctly presented as a hypothesis because it is based on internal reconstruction at a proto-stage. I mean, the collocation attested by Sanskrit āśvàśva‑ etc. could easily be poetically licensed backformation, even if you find evidence in Anatolian and want to project that back to some flavor of PIE.