Comparative Common Shē and Common Neo-Hakka


I have observed the author working on this 749-page volume for many years, so it is with great rejoicing that I see it available in time to send to friends, colleagues, and students as a Yuletide gift:

South Coblin, Common Shē and Common Hakka-Shē: A Comparative Study
Language and Linguistics Monograph Series 68

Institute of Linguistics, Academia Sinica (Taipei:  November, 2025)

Introduction

The present work is divided into two parts. Part I is devoted to the reconstruction of the phonology of Common Shē, the ancestral form of the closely related Sinitic dialects spoken by the Shē ethnic minority of China. The approach applied is the classical comparative method, in which modern data from seventeen modern dialects are subjected to comparative reconstructive analysis. Data from additional Shē varieties are also adduced as needed. The end product of these procedures is a hypothetical phonological system, which for the sake of brevity we call Common Shē, though this term should more precisely encompass not only phonology but also syntax and lexicon.

As outlined elsewhere (Coblin 2018; 2019), we hold that Common Shē and Common Neo-Hakka, the proto-language from which the modern Neo-Hakka dialects derive, are closely related sister languages descended from a common speech form which in the present work we call Common Hakka-Shē. Part II below is accordingly devoted to the comparison of Common Shē and Common Neo-Hakka, so as to arrive at a higher order Common Hakka-Shē reconstructed system. This comparative exercise takes as its basis the Common Shē forms reconstructed in Part I and the Common Neo-Hakka ones presented in our earlier study of comparative Hakka (Coblin 2019). The final chapter of Part II summarizes and assesses our findings regarding Common Hakka-Shē and concludes with suggestions for the future study of even earlier stages in the history of early south central spoken Chinese.

At the end of the work, Appendix I gives the entire corpus of 647 Shē and Neo-Hakka comparative syllable sets used in the basic reconstruction of Common Hakka-Shē. Following this, in Appendix II, is a corpus of 658 comparative Shē lexical sets. Lexical material of this sort, which comprises both monosyllabic and polysyllabic words, is collected in a number of published Shē dialect surveys and sometimes studied in more or less detail there, but to our knowledge these data have not so far been treated from the standpoint of comparative reconstruction. We take this step here, first because the Shē dialects are relatively less well-known among students of Sinitic languages and, secondly, in order to present an experimental model for how a full comparative Shē etymological dictionary might someday be constructed. The data are arranged topically, and the entire Appendix is followed by an English index. Some data from this appendix are also adduced in the Hakka-Shē reconstructive work in Part II. Finally, a brief general index to pertinent topics in the work as a whole concludes the monograph.

References

Coblin, W. South. 2018. “Neo-Hakka, Paleo-Hakka, and Early Southern Highlands Chinese”, Yǔyán yánjiù jíkān 語言研究集刊 [Bulletin of Linguistic Studies], vol. 21. Shanghai: Shanghai cishu chubanshe. (Special number in honor of Jerry Norman.) pp. 175-238.

Coblin, W. South. 2019. Common Neo-Hakka: A Comparative Reconstruction. Language and Linguistics Monograph Series 63. Taipei: Institute of Linguistics, Academia Sinica.

[VHM:  If someone would like to have either / both of these two items, I think that I can supply them.  The copy of 2018 I can send restores all of the maps, which the publisher deleted without consulting the author.  

As to 2019, the electronic version of that is downloadable from the same Academia Sinica, Institute of Language and Linguistics website where the Hakka-She thing is found. The hard copy version has to be purchased, of course. Since about 2018 the electronic versions of all new monographs from there are available for free download. So you can get for free anything there that interests you from now on.]

The thirteen-page Table of Contents includes a two-page summary, a four-page preface, a two-page list of maps, and a two-page list of abbreviations and signs.  After that comes a detailed list of chapters and sub-chapters.  The book concludes with two appendices:

Hakka-Shē comparative data (262 pages)

Lexical sets (120 pages), which I find to be of extraordinary value and interest, so I will list them here:

1. Natural Phenomena   607
2. Earth, Fire, and Water   611
3. Man and Nature   616
4. Animals   616
5. Fowl    617
6. Domestic Animals   619
7. Insects   621
8. Fish   623
9. Man and Animals   624
10. Plants   625
11. Flowers and Grasses   627
12. Grains   627
13. Vegetables   629
14. Fruits   631
15. Man and Plants   632
16. Food and Drink   635
17. Cooking   638
18. Drugs   644
19. Clothing and Adornment   644
20. Dwelling   651
21. Furniture   655
22. Tools   657
23. Town and Country   660
24. Commerce   661
25. Measures   662
26. Communication and Travel   663
27. Culture and Education   664
28. Games and Entertainment   666
29. Religion   666
30. Social Customs   667
31. Human Body   669
32. Body Movements   675
33. Grooming   681
34. Life and Death   682
35. Sickness   683
36. Weapons   684
37. Human Relationships   685
38. Categories of People   691
39. Occupations    694
40. Activities   696
41. Mental Activities and Emotions   698
42. Sensations   701
43. Taste and Smell   702
44. Shape, Dimension, and Color   703
45. Sound   707
46. Quality   707
47. Time    712
48. Place   714
49. Motion   717
50. Existence, Location, and Possession   718
51. Quantity   719
52. Pro-words   722
53. Grammatical Functors   724

English Index to Appendix II   727


The She Ethnic Minority:  Preface

The She (畲) people are the 20th largest officially recognized ethnic minority in China, with a population of over 746,000 people as of 2021. 
 
  • Location: They primarily live in the mountainous border regions of the coastal provinces of Fujian and Zhejiang, with smaller populations in Jiangxi, Guangdong, and Anhui.
  • Language: Most She people today speak a Chinese variety known as She Chinese (畲话, Shēhuà), which is generally considered an unclassified Sinitic language and has been heavily influenced by Hakka Chinese.

She people and languages:  Introduction

The She people (Chinese: 畲族; She Chinese: [sa˦]; Cantonese: [sɛː˩], Fuzhou: [sia˥]) are an ethnic group in China. They form one of the 56 ethnic groups officially recognized by the People's Republic of China.

According to the 2021 China Statistical Yearbook, the total population of the She was 746,385, including 403,516 males and 342,869 females. The She are the largest ethnic minority in Fujian, Zhejiang, and Jiangxi Provinces. They are also present in the provinces of Anhui and Guangdong. Some descendants of the She also exist amongst the Hakka minority in Taiwan.

Today, over 400,000 She people of Fujian, Zhejiang, and Jiangxi provinces speak She Chinese, an unclassified Chinese variety that has been heavily influenced by Hakka Chinese.

There are approximately 1,200 She people in Guangdong province who speak a Hmong–Mien language called She, also called Ho Ne meaning "mountain people" (Chinese: 活聂; pinyin: huóniè). Some say they are descendants of the Dongyi, Nanman, or Yue peoples.

She Chinese (畲话) should not be confused with Shēyǔ (畲语), also known as Ho Ne, which is a Hmong-Mien language spoken in east-central Guangdong. She and Sheyu speakers have separate histories and identities, although both are officially classified by the Chinese government as She people. The Dongjia of Majiang County, Guizhou are also officially classified as She people, but speak a Western Hmongic language closely related to Chong'anjiang Miao (重安江苗语).

(Wikipedia)

She language

The She language (Mandarin: 畲語, Shēyǔ), autonym Ho Le or Ho Ne, /hɔ22 ne53/ or Ho Nte, is a critically endangered Hmong–Mien language spoken by the She people. Most of the over 709,000 She people today speak She Chinese (possibly a variety of Hakka Chinese). Those who speak Sheyu—approximately 1,200 individuals in Guangdong Province—call themselves Ho Ne, "mountain people" (活聶; huóniè).

(Wikipedia)

She Chinese

She or Shehua (畲话, Shēhuà, meaning 'She speech') is an unclassified Sinitic language spoken by the She people of Southeastern China. It is also called Shanha, San-hak (山哈) or Shanhahua (山哈话). She speakers are located mainly in Fujian and Zhejiang provinces of Southeastern China, with smaller numbers of speakers in a few locations of Jiangxi (in Guixi and Yanshan County), Guangdong (in Chaozhou and Fengshun County) and Anhui (in Ningguo) provinces.

She (畲话) is not to be confused with Shēyǔ (畲语, also known as Ho Ne), which is a Hmong–Mien language spoken in East-Central Guangdong. She and Sheyu speakers have separate histories and identities, although both are officially classified by the Chinese government as She people. The Dongjia of Majiang County, Guizhou are also officially classified as She people, but speak a Western Hmongic language closely related to Chong'anjiang Miao (重安江苗语).

(Wikipedia)

Hakka language and people

Hakka (Chinese: 客家话; pinyin: Kèjiāhuà; Pha̍k-fa-sṳ: Hak-kâ-va / Hak-kâ-fa, Chinese: 客家语; pinyin: Kèjiāyǔ; Pha̍k-fa-sṳ: Hak-kâ-ngî) forms a language group of varieties of Chinese, spoken natively by the Hakka people in parts of Southern China, Taiwan, some diaspora areas of Southeast Asia and in overseas Chinese communities around the world.

(Wikipedia — language)

The Hakka (Chinese: 客家), also referred to as Hakka Chinese or Hakka-speaking Chinese, are an ethnic group and subgroup of Han Chinese whose principal settlements and ancestral homes are dispersed widely across the provinces of southern China and who speak a language that is closely related to Gan, a Chinese language spoken in Jiangxi province. They are differentiated from other southern Han Chinese by their dispersed nature and tendency to occupy marginal lands and remote hilly areas. The Chinese characters for Hakka (客家) literally mean "guest families".

(Wikipedia — people)

 


23 Comments

  1. Philip Taylor said,

    November 22, 2025 @ 9:54 am

    One very quick question, if I may, Victor ? How should "Shē" be pronounced in this context ?

  2. Victor Mair said,

    November 22, 2025 @ 12:17 pm

    @Philip Taylor

    Their name in Sinitic 畲 is pronounced Shē in MSM. Their ethnonym in so-called She Chinese is [sa˦] (Cantonese: [sɛː˩]; Fuzhou: [sia˥]).

  3. Philip Taylor said,

    November 22, 2025 @ 1:11 pm

    Thank you. But please forgive my ignorance — would Shē (MSM) sound different to Shī (MSM) to a Western ear, and if so, in what way ?

  4. David Marjanović said,

    November 22, 2025 @ 2:17 pm

    would Shē (MSM) sound different to Shī (MSM) to a Western ear, and if so, in what way ?

    e here stands for [ɤ], the unrounded counterpart of [o], which is the sound of at least some versions of Vietnamese ơ; the other vowel (to the extent that it is a vowel) is somewhat difficult to describe, but I'm sure it overlaps with the range of Vietnamese ư.

  5. Jonathan Smith said,

    November 22, 2025 @ 2:51 pm

    @Philip Taylor why not copy the characters e.g. 畬 and 师 (respectively shē and shī) into e.g. Google Translate, select "Chinese Simplified", and press the "speaker" icon for audio (ignoring the 'translation' of course)? This is passable and you can decide for yourself if/how the two are impressionistically different.

    @David Marjanović re: Mandarin -e rhyme, while very different between e.g. heavy north (with a diphthong) and Taiwan (not), it indeed must at least overlap with Vietnamese ơ… re: the Mandarin vowel in shi zhi chi ri, someone pretty familiar with both the IPA vowel chart / how it works and Mandarin would IMO say that this "vowel" is not chartable. It's just kinda (e.g.) sh when you quit saying the consonant and switch voicing on i.e. maintain all other articulatory variables. But indeed Mandarin speakers (technically wrongly) tend to use just this sound for e.g. Japanese -u, Vietnamese -ư, etc. (and vice versa).

  6. Philip Taylor said,

    November 22, 2025 @ 3:07 pm

    Thank you, David. But if Google Translate is correct (and the narrator certainly sounds like a native speaker), then the vowel sound of 師 (shī) differs from the vowel sound of 是 (shì) not only in pitch (as one would expect) but in the "quality" of the vowel as well. Would you agree ?

  7. Philip Taylor said,

    November 22, 2025 @ 3:09 pm

    Oh, and thank you Jonathan. I was doing what you suggested without knowing that you had suggested it, and it was only on returning here to post a further comment that I read what you had written.

  8. Jonathan Smith said,

    November 22, 2025 @ 3:36 pm

    @Philip Taylor Haha no problem. The "Chinese Traditional" one you link to is not correct (I mean simply just not Chinese); switch to "Chinese Simplified." Have noticed this before but have no idea why it happens. There are also more sophisticated text to voice tools of course but GT should be OK with "Chinese Simplified" setting (even with "Traditional" characters as in your link)

  9. Philip Taylor said,

    November 22, 2025 @ 5:03 pm

    Oh. Sigh. Thank you. I do my best to avoid the simplified script for ideological reasons, but "needs must, when the devil drives", and on so doing I see (well, hear) exactly what you mean.

  10. Jerry Packard said,

    November 23, 2025 @ 2:49 pm

    The vowel of shē is [ɤ] as David points out, and the vowel of shī is the ‘retroflex mid-apical vowel’, which has a storied past in Chinese linguistics. As many have written, the phone can simply be considered the vocalic extension of the preceding retroflex consonants zh- ch- sh- (pinyin). I have seen it written as ‘iota’ (ɩ with a bottom-right hook, e.g., Pullum & Ladusaw), or as an iota with a top-left hook. I could reproduce none of them here.

  11. Chris Button said,

    November 24, 2025 @ 7:47 am

    @ Jerry Packard

    As many have written, the phone can simply be considered the vocalic extension of the preceding retroflex consonants zh- ch- sh- (pinyin). I have seen it written as ‘iota’ (ɩ with a bottom-right hook, e.g., Pullum & Ladusaw), or as an iota with a top-left hook. I could reproduce none of them here.

    I assume Karlgren introduced that?

    Needless to say, the idea of a "fricative vowel" totally flies in the face of the definition of a vowel.

  12. Jerry Packard said,

    November 24, 2025 @ 9:24 am

    @Chris The earliest place I’ve seen it is Chao 1934 and then Hockett 1947. Karlgren 1934 has it as an umlauted i. The vocalic element, however, is decidedly not fricated, would you agree?

  13. Philip Taylor said,

    November 24, 2025 @ 12:19 pm

    Returning (briefly) to the question I posed in my second comment ("would Shē (MSM) sound different to Shī (MSM) to a Western ear, and if so, in what way ?"), at the time I asked it I could not think of an MSM word ending with (Pinyin) "he", so had no internal clue as to how it might sound. It was only last night that I suddenly realised that 喝 (Pinyin) "hē" not only ends with "he", it also starts with it, and the sound of MSM 喝 is a sound with which I am very familiar, so I withdraw my question as totally unnecessary (with the benefit of hindsight, of course).

  14. Daniel said,

    November 24, 2025 @ 4:41 pm

    Chris, I disagree. Consider:
    "Psst! Want to know a secret?"

    One definition of a vowel is an unconstricted airstream (phonetic definition), but the other is that it is the nucleus of a syllable because it is the point of least restriction. Put another way, the nucleus of a syllable is usually a (phonetic) vowel, but it may also be a fricative or a liquid.

  15. Chris Button said,

    November 24, 2025 @ 5:10 pm

    @ Jerry Packard

    I think it's fricated throughout. I'd need to find a spectrogram though to confirm.

    @ Daniel

    Hence its treatment as a "syllabic fricative" rather than a vowel. Personally, I'm not a big fan of the notion of "vowels" outside of phonetics, but that's a big can of worms.

  16. KIRINPUTRA said,

    November 25, 2025 @ 5:12 am

    Wow, just downloaded. Looking forward to digging in. I wonder why he didn't work out at least the lexicon as well.

    @ Victor

    When you say maps in the 2018 paper, do you mean the five maps in the appendix at the end?

  17. KIRINPUTRA said,

    November 25, 2025 @ 5:17 am

    The speakers of these languages call themselves (canonically) the SANHAK, while the Hakka (traditionally) call themselves the HAK, and their songs are SAN songs.

    Connections like these are made less accessible — in practice — by the questionable neo-tradition of referring to everything in Pinyin Mandarin.

  18. Chris Button said,

    November 27, 2025 @ 6:36 am

    @ Jerry Packard

    The earliest place I’ve seen it is Chao 1934 and then Hockett 1947.

    I just found it on pages 295-297 of Karlgren's "Études sur la phonologie chinoise" (1915)! It's in Unicode as ɿ.

  19. Jerry Packard said,

    November 27, 2025 @ 8:37 am

    @ Chris
    Good find!

  20. Andreas Johansson said,

    November 28, 2025 @ 5:55 am

    Needless to say, the idea of a "fricative vowel" totally flies in the face of the definition of a vowel.

    The phrase has been used to describe certain syllabic nuclei in certain Swedish dialects, and I've seen a superscript z following an i or y used to indicate them.

    They're probably better analyzed as syllabic fricatives, though.

  21. Jinfu Ke said,

    December 2, 2025 @ 1:10 pm

    @ Chris @ Daniel @ Jerry
    The Mandarin “apical vowel” being the continuation of Pinyin zh, ch, sh, is better understood as a syllabic approximant [ɻ]: Lee-Kim, Sang-Im. “Revisiting Mandarin ‘Apical Vowels’: An Articulatory and Acoustic Study.” Journal of the International Phonetic Association 44.3 (2014): 261–282.

  22. Chris Button said,

    December 2, 2025 @ 6:43 pm

    And if there is indeed no frication, then one could claim that "r" is a vowel.

    That is not a wholly unreasonable position phonetically.

    But it does then raise the awkward question of what a vowel represents phonologically. And there, "syllabic approximant" still ends up behaving just like "syllabic fricative" in challenging the very notion of a phonological (rather than phonetic) vowel.

  23. Victor Mair said,

    December 3, 2025 @ 9:08 pm

    From South Coblin:

    The question of the autonyms of the She people is taken up in section 1.1 of the monograph. As noted there, 99% of the She call themselves [san1 haʔ7] or [san1 haʔ7 ŋin2]. (The 1% of exceptions are interesting, but I will eschew discussion of them here.) Your correspondent is mistaken in saying that they call themselves SANHAK, if s/he means that as an accurate phonetic rendering. The final of the second syllable is a glottal stop, not -k. So far as I am aware, no known She dialect has a final -k. It is a defining characteristic of these languages that Common Hakka-She *-k becomes Common She *-ʔ across the board in the modern dialects. As to the Hakka, I have never personally heard a Hakka speaker use the single syllable [hak1] in reference to Hakka people or matters. The speakers with whom I have had working contact always said [hak1 ka1], usually followed by another qualified word, e.g., [hak1 ka1 ua6] “Hakka language”, as in [ŋai3 m hiau3 koŋ3 hak1ka1 ua6] “I can’t speak Hakka.” A Hakka person is always called [hak1 ka1 ȵin2], never just [hak1]. As to the famous Hakka “mountain songs”, of which there are many examples available on YouTube, I have no comment, since I don’t know what your correspondent is attempting to convey by mentioning them.

    Linguists universally use the word “She” to refer to the language and ethnicity of these people. In a separate paper, published in 2018, I discussed the autonym question and the feasibility of using the common native word in my work. I decided to go instead with common usage, i.e., She, in order not to confuse readers, who would nearly all be linguists and dialectologists, not laymen. And I shall continue to do this for as long as I write anything on these matters. Communis error facit ius. Others may do as they please.
