The historical phonology of "Han", the main Chinese ethnonym


[VHM:  This is a guest post by Chris Button.  It will be primarily of interest to specialists in the phonological history of Sinitic.  Since there are quite a few such scholars on Language Log, I expect that it will occasion the usual lively debate that follows posts on such subjects.  It will also undoubtedly be of interest to historical phonologists in general, as well as to a broad spectrum of Sinologists and their colleagues focusing on other Asian cultures and languages.]

I've been thinking about the etymological associations of Hàn 漢. It's often reconstructed with an aspirated coronal nasal as *hn-, in spite of the Middle Chinese x- then being somewhat unexpected (Baxter and Sagart put it down to dialects), largely on the basis of the *n- in 難. But its etymological association with 艱 and its velar *k- makes this problematic. A regular source of MC x- would be *hŋ- which then at least would be a velar onset to parallel *k-. The *n- in 難 could perhaps be put down to some sort of assimilation of *ŋ- with the *-n coda (one might compare 般 *pán < *pám where there is dissimilation of the coda unlike in its phonetic 凡 *bàm). At the very least, 漢 most likely went back to something like *hŋáns and then *xáns with a velar onset and the -s eventually becoming qu-sheng. An alternative option is rhinoglottophilia whereby a *ʔ became *n- as attested in cases like 憂 *ʔə̀w and 獶(夒) *nə́w as I mentioned here.

The one thing that can be said for sure is that it was some kind of back onset. Speculating wildly, that does of course make me think of the Mongolian title "khan", but I doubt there is a convincing historical association to be made there. Just wondering if you had any thoughts on the matter?

I noticed that Schuessler doesn't reconstruct it with "n" but with a fricative of sorts. He could be right about the fricative, but I think he's wrong about the phonetic in the graph not playing a phonetic role. The etymological associations between 難 and 艱 and others in the word family show that it was undoubtedly phonetic. He also doesn't recognize the schwa/a ablaut so I suppose for him only the codas match. For me the problem is purely the onset.


  1. Pamela said,

    April 14, 2020 @ 7:40 am

    the Qianlong emperor (not a linguist) thought the association was through Korean han for the early kingdoms (e.g. 三韓). the han in the Korean usage could also be written 漢, but i don't think it was for any signifying reason. the origins of khan/khaghan are, of course, endlessly debated. given the circulation of political terms across Eurasia from Bronze Age forward, they could all be right in some degree.

  2. Alexander Vovin said,

    April 14, 2020 @ 9:46 am

    Khan is an apparent contraction of qaɣan, which is itself of a Yenisseian/Xiong-nu pedigree: PY *qa 'great', + *ɣï(y) 'ruler' + Mongolic -n as the icing on the cake. I trust that any similarity with 漢 is pure accidental.

  3. tom davidson said,

    April 14, 2020 @ 3:31 pm

    may I suggest checking out the 異體字典 published on line by Taiwan's Ministry of Education for more info on the character 漢 han4.

  4. Chris Button said,

    April 14, 2020 @ 4:57 pm

    Khan is an apparent contraction of qaɣan, which is itself of a Yenisseian/Xiong-nu pedigree: PY *qa 'great', + *ɣï(y) 'ruler' + Mongolic -n as the icing on the cake. I trust that any similarity with 漢 is pure accidental.

    That seems the most likely conclusion to me too.

    Also, for what it's worth, Old Chinese did have uvular onsets but they caused vocalic rounding in the evolution to Middle Chinese such that 漢 could not be reconstructed with one.

  5. Chris Button said,

    April 14, 2020 @ 10:47 pm

    the Qianlong emperor (not a linguist) thought the association with through Korean han for the early kingdoms (e.g. 三韓). the han in the Korean usage could also be written 漢, but i don't think it was for any signifying reason.

    As with "khan", I highly doubt (but don't outright reject) any connection of 漢 with 韓, but the comparison does raise an interesting question nonetheless.

    Voiced stops in Old Chinese were clearly prenasalized for phonetic reasons to maintain the voicing, so *g- would really have been *ᵑɡ-

    Maintaining the voicing on velars is harder than articulations farther forward in the mouth, so the occasional shift of ᵑɡ- to ŋ- (Japanese might make a good well-known example here) to fully retain the voicing would not be entirely untoward. It could also account for why we sometimes have nasals alternating with stops in etymologically related words in the same phonetic series in Old Chinese.

    韓 *gán, possibly from earlier **ʱkán if we reconstruct ɦ- as a voicing prefix of sorts, is in line with the evidence for a velar onset in 漢 as attested in counterparts like 艱 with *k-. But, if the onset in 漢 was indeed nasal then *ʰŋáns from earlier **sŋáns would assume that any g- to ᵑɡ- to ŋ- shift occurred prior to the s- prefixation (there is then an -s suffix to account for as well).

    Alternatively, there is evidence that Old Chinese kʰ- had an allophone *x- (in the same way as qʰ- with *χ- and *ts- with *s-) such that 漢 might well have been *xáns (as an allophonic variant of *kʰáns from **skáns), which seems to be more in line with Schuessler's thoughts.

  6. Victor Mair said,

    April 15, 2020 @ 6:15 am

    From Juha Janhunen:

    Just a few thoughts that come to my mind (I am staying in our wilderness cottage with no access to advanced sources):

    In Mongolic kan 'prince' the final -n is a suffix, so: *ka-n. The same root *ka appears in Mongolic *ka-xa-n < *ka-ga-n = Turkic kagan (qaghan) 'emperor' and *ka-tu-n 'empress, queen' (= Turkic *kadun, also attested as ka-su-n in Tabghach/Bei Wei). There is an immense amount of literature on this word, but no conclusion concerning its origin. However, I do not think that it can have any connection with 漢, which is originally a river name. Even so, it is interesting in this connection that the Yellow River in Mongolian is called Khatun ghol (Written Mongol qaduv ghuul = Modern Mongolian /xatǝn gol/) 'Queen River'.

  7. Pamela Crossley said,

    April 15, 2020 @ 8:34 am

    I should be more clear about the Qianlong emperor's theory. he was not connecting 漢 and 韓, he was connecting 三韓的韓 with Manchu han (from Jurchen ha-ghan or however you want to spell it) in the context of a continuous Northeastern political history. in view of Sasha's comments, this could be one of those much-derided traditional ideas that has some sense to it.

  8. Chau said,

    April 15, 2020 @ 10:39 am

    @Juha Janhunen, “Turkic *kadun, also attested as ka-su-n in Tabghach/Bei Wei.”

    It's interesting to see the d in Turkic *kadun becomes s in ka-su-n in Tabghach/Bei Wei. I am curious to know whether there are other examples of Turkic d changing to Tabghach s.

  9. Jonathan Smith said,

    April 15, 2020 @ 4:48 pm

    It's my understanding that BI characters with "難" above "水" write the name of the Han 漢 river. This certainly makes it look like 難 was being used as a phonetic, which might suggest contra Schuessler (2007) that there was indeed a close phonological relationship here. But of course there is the further question of structure of 難 itself. It is supposed to have been designed to write the name of a bird but I don't know of more specific claims or suggestions.

  10. ~flow said,

    April 16, 2020 @ 3:33 am

    @Chris Button concerning a remark in the earlier LL post "Guys and gals: Or, why the 'Chinese are called Han'" linked above; quote:

    "What I do believe is that the reconstruction process that attained the underlying schwa vowel (with its lower a variant) reflects the inherent underlying phonological "vowellessness" of language (distinct from surface phonetics replete with vowels) demonstrating the universal primacy of the syllable as its basic building block."

    I'd be very much interested in learning more about vowel-less phonology for languages with surface vowels as well as how that connects to the primacy of the syllable. As it stands, the remark is enigmatic to me.

    I'm aware this is OT but maybe you would still submit a guest post on the subject? Maybe you have some links? Thanks in advance!

  11. Alexander Browne said,

    April 16, 2020 @ 4:31 pm

    @VHM: Unrelated, except the general geographical area: Mongolia is reportedly planning to replace Cyrillic with the Mongolian script by 2025.

  12. Chris Button said,

    April 16, 2020 @ 5:58 pm

    @ Jonathan Smith

    Regarding 漢 as 難 with 水, the 邕 ~ 雝 connection is interesting.

    @ ~flow

    I've actually mentioned things to that effect in various LLog comments. This one was even addressed to you directly:

    Aside from various posts on LLog here and there, I'd prefer to let the evidence in my "Derivational Dictionary of Chinese and Japanese Characters" speak for itself rather than engage in a theoretical polemic. The dictionary will of course have an introduction, but you'll need to wait until it's written to read that!

  13. ~flow said,

    April 17, 2020 @ 2:30 am

    @Chris Button Thanks for digging out that link which I re-read, but it's just as enigmatic to me now as it was back then: "linguists are on the whole incredibly reluctant to accept the reality that vowels are nothing but a surface phenomenon".

    This of course reminds me of the famous quip about etymology being a science where consonants count little, and vowels don't count at all.

    My impression is that consonants can often be better traced back than vowels in historical linguistics, maybe because they are on average more 'categorically' produced and perceived (IOW there's a whole continuous range of sounds between [a … e … i] but more of a set of distinct points in the case of, say, [t]—[s]—[k]). In theory, tones should be rather less stable and less reliably reconstructable than vowels. Now, if I understand correctly, you would prefer to keep vowels out of the picture, or represent as side-effects of syllable structure plus consonants(?). Does that imply that, likewise, tones will also be treated as being dependent phenomena? They are thought, after all, to have originated from consonantal features.

    Now one thing I can say is that when one looks at Mandarin, then the constraints of phonotactics in that language allow one to underspecify syllables to a great extent. For example, knowing that a given syllable has a fricative in the onset and is followed by a non-zero medial, the nucleus, and a vocalic non-zero coda, then that syllable must pretty much be one of 'xiao', 'xiu', 'shuai', 'sui', 'shui'—there are a few but not a lot more possible ones. So in that sense I could rid the phonology of that language of some vocalic features, although it is not immediately clear which ones should be kept.

    However such eliminations are dependent on the phonotactics of a specific language. In a language that contrasts syllables like [ban], [ben], [bin], [bon], [bun], how could one claim that [b], [n] and the common syllable structure CVC are phonologically 'real', yet the contrasting [a], [e], [i], [o], [u] are 'nothing but a surface phenomenon'? For surely when there is a five-way distinction in the data, there must be some kind of mechanism in the phonology to account for that five-fold distinction?

    Maybe our notation was lacking in the first place and we should have been more precise and have written [ban], [bʲen], [bʲin], [bʷon], [bʷun], in which case one could try an analysis with three consonants and fewer (but still at least two) vowels. Yet, in the abstract this is little more than shifting around symbols (although in the concrete case, there may be compelling evidence for or explanatory power provided by such an alternative way of viewing things).

    Speaking of which, I really look forward to the evidence for and the explanatory power of your vowel-less analysis so be sure to remind readers here when your book comes out!

  14. Philip Taylor said,

    April 17, 2020 @ 2:06 pm

    ~flow — " there's a whole continuous range of sounds between [a … e … i] but more of a set of distinct points in the case of, say, [t]—[s]—[k]". At first sight I am inclined to agree with you, but if we were to replace your [t]—[s]—[k] graph with (say) [b]—[v] or [r]—[l], then I cannot but feel that a Spanish reader contemplating [b]—[v], or a Japanese reader doing the same with [r]—[l], might not agree. Would you be willing to accept this as a possible counter-argument ?

  15. Chris Button said,

    April 17, 2020 @ 7:44 pm

    @ ~flow

    Yet, in the abstract this is little more than shifting around symbols …

    And in the abstract, two fundamental questions are ignored:

    1. When does a sonorant consonant end and a vowel begin? People familiar with alphabets, be they non-linguists with the Roman alphabet or linguists with the IPA, sometimes falsely believe there is a clear phonological distinction (making a phonetic distinction is a separate matter).

    2. How deep should a phonological analysis go? A few languages are immediately amenable to a vowelless phonological (not phonetic!) analysis. Many other languages may only need such an analysis when subjected to the comparative method.

    … (although in the concrete case, there may be compelling evidence for or explanatory power provided by such an alternative way of viewing things)

    Mandarin has been analyzed as a vowelless language by several people. But what's to be gained from that? It all depends on what the analysis is trying to achieve.

    The one thing that can be said for sure is that forcing a language into a triangular vowel system to make it appear more "normal" but weaken its explanatory force is bad linguistics. Perhaps "forcing" is the wrong word. Let's just say that there exists an inherent bias in many analyses. In my opinion, the reconstruction of Old Chinese is often a case-in-point.

  16. Chris Button said,

    April 17, 2020 @ 10:51 pm

    Pulleyblank (1989) has a suggestion that the shift of *ʰŋ- to x- could sometimes be treated inversely as *x- under the influence of a voicing prefix giving ŋ-. It's certainly phonetically reasonable and could perhaps be applied here to give some credence to the 漢 as *xáns hypothesis. Whether it actually ever happened is another matter (personally I haven't really heeded the suggestion in the past). Confusion of x- with *ɬ- (as also discussed here) could perhaps then account for the n- in 難 (perhaps influenced by assimilation with the coda) as its apparent phonetic, the 隹 being lost in 漢 in the same way that it was later lost in 邕 from 雝 (雍).

  17. Chris Button said,

    April 18, 2020 @ 9:59 am

    @ Alexander Vovin

    Khan is an apparent contraction of qaɣan, which is itself of a Yenisseian/Xiong-nu pedigree: PY *qa 'great', + *ɣï(y) 'ruler' + Mongolic -n as the icing on the cake. I trust that any similarity with 漢 is pure accidental.

    Although I said the following…

    That seems the most likely conclusion to me too.

    I wonder perhaps if the problem is that we're looking at an origin for 漢 as a river name going as far back as the Zhou period and hence treating its use as a dynastic name as something native.

    Could we perhaps ignore that origin, and just assume that the character was borrowed to represent a new word "qaɣan / qan" and that the coincidence is rather that the river name happened to have a similar pronunciation (and hence be a suitable graphic representation)? The OC x- could perhaps represent a combination of q + ɣ?

    Or can that be discounted for some chronological reason?

  18. Chris Button said,

    April 18, 2020 @ 8:20 pm

    And to be clear, OC x- (as an allophone of *kʰ-) would be the next best thing after *χ- (as an allophone of qʰ-) because the OC uvulars may well have all already disappeared by the time of the loan. We can't reconstruct *χ- because that would have triggered labialization in the evolution of 漢 into Middle Chinese.

  19. ~flow said,

    April 19, 2020 @ 2:54 am

    @Philip Taylor: yes, [b]…[β]…[v] is much more of a continuum than [t]…[s]…[k] of course, and in some languages (Spanish as you mentioned) much more so than in others (English, German). Likewise, we find series of sounds only gradually differentiated by voice onset timing, or think of Mandarin [s]…[ɕ]…[ʂ].

    I think the point is that while we do find series of 'universally confusable' sounds—sounds that are kept categorically apart in some languages, but not differentiated by or hard to differentiate by speakers of other languages—among the so-called consonants, it is as if the entire vowel space (in the absence of secondary articulations like nasalization, pharyngealization, or the retroflective vowels of some English dialects) acts as *one* single space of gradual differentiations.

    Which is to say that if one were to draw a web of 'sounds' (based, say, on the symbols offered by the IPA, as a starting point) with 'similar', gradually differentiated nodes placed in proximity, and longer distances for categorically differentiated sound(group)s (e.g. [p]—[k]—[s]), then one could well end up with the entire space of oral vowels being close together in one spot, on an equal footing with other groups ([b]…[β]…[v]), ([s]…[ɕ]…[ʂ]) and so on, but, crucially, not so much the consonants in their entirety.

    In this simplified picture I have omitted the 'gateways' between sounds, like lenition processes where processes like [aga] > [aγa] > [a:] happen, and have not outlined ways to model the close proximity or even overlap of rhotics and other liquid consonants with vowels, or the systematic parallels between, say, [k]/[g], [t]/[d] and so on.

    While the model does not do away with vowels as such, it does seem to shift their status, from being right near the top in a tree that has 'sound' as its root and branches into 'consonants' on the one hand and 'vowels' on the other, to a network of interconnected 'clusters' the majority of which are consonant-like (because there are so many fricatives, plosives, flaps and trills), and the minority of which are vowel-like. Be it said that of course there's infinite possible variation between [i]…[e]…[a]; however, as in music, we might be justified to map out that infinitely wiggly coast line by merely noting the positions of the lighthouses and the promontories, as it were, in a fashion not unlike what we do in music and colors, where a few notes and hues get prominently named in spite of the infinite gradual variations they offer.

  20. Chris Button said,

    April 19, 2020 @ 6:13 am

    @ ~flow

    I think the distinction you are trying to get at is not one about a continuum. The whole buccal cavity is one of those.

    Rather, a "vowel" functions as a sonorant syllabic peak. An obstruent (which should by default be voiceless–hence it struggles to retain voicing as discussed above) is inaudible without the vowel. Its formants just affect the following vowel. To quote the phonetician Peter Ladefoged: "Many consonants are just ways of beginning or ending vowels".

    In Proto-Indo-European, sonorants (n, l, etc.) patterned as vowels. Many languages (including Sinitic ones) have sonorants as syllables.

    A far more useful distinction than consonant-vowel is obstruent-sonorant.

  21. Philip Taylor said,

    April 19, 2020 @ 6:25 am

    ~flow : all points noted and agreed. But I did find Peter Ladefoged's (via Chris B) "[m]any consonants are just ways of beginning or ending vowels" both amusing and intellectually stimulating.

  22. ~flow said,

    April 19, 2020 @ 7:21 am

    Yes, I find that quite an educational quote. Although, as we've heard here recently, there are apparently languages with words that lack sonorant nuclei, and at any rate [f],[s] (which I count as obstruents) are, similar to vowels, both audible on their own and indefinitely maintainable. But yes, I agree with something like sonorance/non-sonorance being probably at the (syllabic) heart of linguistic utterances.

  23. Rodger C said,

    April 19, 2020 @ 12:15 pm

    [f],[s] (which I count as obstruents) are, similar to vowels, both audible on their own and indefinitely maintainable

    Isn't it the case that we call the letters "eff," "ess" (like "ell, em, en, ar") because the Etruscans pronounced the letter-names as continuants? Leave it to the Etruscans.

  24. Chris Button said,

    April 19, 2020 @ 3:57 pm

    On a separate tack from 邕 ~ 雝 (雍), I wonder if the bird in 難 has anything to do with the dog in 然 (燃) ? The sense running through the word family being "torrid, parched" after all?

  25. Chris Button said,

    April 20, 2020 @ 8:02 pm

    For anyone still reading (and interested!), how about the following:

    The phonetic in 漢 (i.e., minus the three 水 marks on the left), had a velar onset. This onset is attested throughout its phonetic series and in the many etymologically related words in that series belonging to the "torrid, parched" word family (a sense supported by the earliest graphic form of the phonetic and its usage).

    An originally unrelated word, represented by 然 with a nasal onset, overlapped with the "torrid, parched" family above as a result of its similar rhyme and meaning. The character 難 (from which emerged a few other nasal derivatives) was used to represent a word etymologically related to this word family instead.

    Whether the use of 漢 with its velar onset as the name for "Han" was a loan from qaɣan "Khan" remains pure speculation, but at least it is linguistically plausible if nothing else.

  26. Chris Button said,

    April 21, 2020 @ 9:18 pm

    Regarding the 隹 component, 歎 and 嘆 in a sense of "bewail, sigh" make an interesting comparison with Peter Schrijver's proposed association of "woe" (hence the sense of distress) in English with Celtic words for "seagull" as the wailing bird (and also "wolf")…
