Abstand und ausbau


Back in early April of this year, Kirinputra brought up this distinction at the very end of a comment thread on Cantonese, so, although it deserved discussion, there was no opportunity to hold one at that time. Consequently, I reopen the deliberations now by quoting Kirinputra's final comment:

"Sinitic is like Romance" is not a working, truth-bearing analogy, esp. not for a layman audience. (Maybe some subset of Sinitic is like Romance, though.) "Sinitic" arguably harbors much more abstand diversity than Romance, for one thing. More importantly, Romance is by now evidence-based. Humans have a detailed understanding of the mechanics & timing of the divergence from a common ancestor. Sinitic is belief-based. The approach to the detailed reality is largely speculative & often circular.

That reminds me of my former colleague Harold Schiffman's description of Ausbau and Abstand languages, following Heinz Kloss's 1952 criteria:

  • Ausbau languages are languages because they have been developed or `built-up'; they contain all the useful vocabulary they need and are recognized for all domains and registers of a language—technical, religious, etc. But they may be very close to some other, even mutually intelligible lect: The Scandinavian "languages", Czech and Slovak, Lao and Thai, etc. But they may depend on different classical (or other) languages as a source of learned vocabulary …
  • Abstand languages are definitely languages by `distance', i.e. there is no close relative with which they can be confused, or are mutually intelligible with: Japanese, Korean, Icelandic, etc. No chain of mutually intelligible lects merging with some other `language'. Thus many African languages, Amerindian languages, Malayo-Polynesian languages, Australian languages are so by Abstand, but not by Ausbau.
  • Many languages are languages by both criteria of Ausbau and Abstand, e.g. Japanese, English, French, etc. but some are languages by only one criterion, though some are attempting to become useful for all registers by developing their own Ausbau procedures;
  • Some languages that are so by Ausbau but not by Abstand might try to increase the distance by resorting to purism or some other distancing mechanism (borrowing from some other source). Maithili is thought of as a dialect of Hindi within Nepal, but within India, Maithili speakers wish to claim language status.

According to Wikipedia,

This framework addresses situations in which multiple varieties from a dialect continuum have been standardized, so that they are commonly considered distinct languages even though they may be mutually intelligible. The continental Scandinavian languages offer a commonly cited example of this situation. One of the applications of this theoretical framework is language standardization (examples since the 1960s including Basque and Romansh).

How does this scheme work for Sinitic?  Lord knows we need some sort of mechanism for dealing with the hundreds of Sinitic topolects that range in mutual intelligibility from near zero to almost complete.

 

Selected readings

When I was still teaching (until 2012), I had students from the Maghreb (Morocco, N. African Arabic) whose spoken language was completely unintelligible to Arab students from the Levant…or so they told me. Egyptian Arabic retains some old/archaic pronunciations, e.g. the letter "jim" (ج) is pronounced "g", not "j," "ž" etc. as in the Levant. Because of its population size and developed motion picture and music industries, Egyptian (Cairene) Arabic is widely understood. In addition to numerous local dialects, there were also religion-specific dialects, e.g. Judeo-Arabic written in Hebrew letters, and Christian Arabic written in a Syriac-based alphabet, Karshuni.

  • Hammarström, Harald. 2008. “Counting Languages in Dialect Continua Using the Criterion of Mutual Intelligibility.” Journal of Quantitative Linguistics 15(1). 36-45.
  • Kloss, Heinz. 1967. “‘Abstand languages’ and ‘ausbau languages’.” Anthropological Linguistics 9(7). 29-41.
  • Tamburelli, Marco & Brasca, Lissander. 2017. “Revisiting the classification of Gallo-Italic: a dialectometric approach.” Digital Scholarship in the Humanities 33(2). 442-455.
  • Tang, Chaoju & van Heuven, Vincent J. 2009. “Mutual Intelligibility of Chinese Dialects Experimentally Tested.” Lingua 119(5). 709-732.



62 Comments »

  1. Jerry Packard said,

    October 28, 2025 @ 7:12 pm

    The distinction between Ausbau and Abstand seems to boil down to distance between the languages that comprise the groups. So Ausbau languages are close to each other and Abstand languages are more distant from each other, if they are in fact related.

    ‘Sinitic is like Romance’ I feel is in fact a useful analogy, because it captures the fact that the Sinitic languages are demonstrably related, even if they may not derive cleanly from a common ancestor as is the case with Romance and Latin. It is useful because the alternative would be all kinds of strange a priori beliefs possessed by laypersons concerning, e.g., the relation between Cantonese, Min, Shanghainese and Mandarin.

    The statement that “Romance is by now evidence-based…[and]…Sinitic is belief-based” strikes me as odd, for while more phonologically straightforward textual evidence exists and more work has been done on Romance, Sinitic is just as evidence-based – especially if you consider primary Chinese-language sources and all the fieldwork done over the past 100 years or so. I’ll stop here.

  2. Jonathan Smith said,

    October 28, 2025 @ 9:55 pm

    One thing is these concepts — language by virtue of de facto ("objective"?) separateness from neighbors vs. "standard" language as a product of social construction — could in principle be regarded independently, but are usually considered in the context of a picture in which there (seem to?) exist coherent, self-contained dialect groupings on the basis of which are eventually constructed one or a couple umbrella standards. Not coincidentally, these groupings and/or the scope of the standards wind up coterminous-ish with emergent Western European polities. Whereas absent durable political/cultural separation, as in the Chinese case, potential "abstand" groups never get to fully "abstand"… the Yue, Hokkien, etc., regions could have been like European (or just Indian for that matter) states on some other timeline, but centripetal cultural and political forces have held them — forces of which one component has ever been the so-called "classical" "language" which is in some sense neither of these things but rather in effect an accretion on a relevant national/regional standard such that its perpetuation and exaltation in the provinces amounts to an accommodation to that standard.

    And re: "Sinitic is belief-based", let us rather say that the above centripetal forces are partly emotional/psychological. The Sinitic of historical linguistics is (or should be) based on systematic phonological correspondences across putative daughter languages. Now interestingly, one probably *could* argue on reasonably scientific grounds that this is insufficient to speak of language "inheritance" and thus "relationship" sensu stricto and that such can really only exist given certain traceable morphological features. It would be interesting to read an exposition of such a view. But outside of that, the notion of timeless Chineseness/Han identity (like all "racisms") as well as the (generally well-meaning) opposing rejection of scientific "Sinitic" are equally emotional/psychological positions.

  3. Peter Cyrus said,

    October 29, 2025 @ 4:53 am

    A language is a dialect with a standard.

    We no longer need the army and the navy.

  4. Chris Button said,

    October 29, 2025 @ 12:57 pm

    I once did some phonetic analysis of a Kuki-Chin language spoken in India for an academic working on the language. However, for socio-political reasons, the speakers of the language identified ethnically as Naga rather than Kuki-Chin.

  5. Andreas Johansson said,

    October 30, 2025 @ 4:15 am

    Am I reading Kirinputra correctly as saying that the genetic unity of Sinitic is unproven? That Cantonese might be closer to Burmese than to Mandarin or something like that?

  6. Chris Button said,

    October 30, 2025 @ 6:10 am

    One myth that does need dispelling is that Min languages somehow represent a more conservative ancient version of Chinese that better reflects Old Chinese than any other branch. Rather, they are just interesting in their own right and in how they don't seem to fully align with Middle Chinese.

  7. Blake Shedd said,

    October 30, 2025 @ 8:47 am

    Would a comparison between Indo-European and Sinitic better represent the differences than a comparison between Sinitic and Romance? I took a cursory look at the Wikipedia Swadesh list of Sinitic languages and wonder if a specialist could say whether the breadth of differences is similar to that which one sees among IE languages.

    Granted, a comparison with Indo-European might not be as easily understood since familiarity with Romance languages might be higher.

  8. Nelson Goering said,

    October 31, 2025 @ 3:21 am

    "One myth that does need dispelling is that Min languages somehow represent a more conservative ancient version of Chinese that better reflects Old Chinese than any other branch. Rather, they are just interesting in their own right and in how they don't seem to fully align with Middle Chinese."

    I'm wondering if you could expand on that. I've certainly heard (without being qualified to judge) that Min branched off earlier than the rest of Sinitic, and is descended from Late Old Chinese rather than Middle Chinese. That is, of course, not at all the same thing as a claim that it's more conservative as a whole, though it does certainly imply that it fails to show some innovations of Middle Chinese. I'm aware that Baxter & Sagart take Min as a source of evidence for different kinds of pre-initials in Old Chinese, but (as far as I understand them) without claiming that Min actually preserves any pre-initials as such.

  9. Nelson Goering said,

    October 31, 2025 @ 4:26 am

    Or, to put matters more simply: subgrouping and conservatism are different things. I'm basically asking whether you're doubting that Min actually split off as a separate branch compared to Middle Chinese and its descendants (something that needn't imply that either branch is more conservative — a distinct branch of a family can, potentially, be highly innovative), or whether you're doubting that Min is particularly conservative (either actually, or in preserving, in indirect and innovative form, evidence for pre-initials that happens to have been effaced in Middle Chinese), regardless of its phylogeny.

  10. Chris Button said,

    October 31, 2025 @ 5:39 am

    Definitely the conservatism rather than subgrouping point. I wouldn't dare comment on the subgrouping question! I don't think even experts on Min can make definitive statements there. And I am certainly no Min expert.

    I was just trying to make the simple point that Min languages have naturally evolved over time like any other languages. So, while they are conservative in some ways, they are not conservative in others.

    (Meanwhile, they separately also seem to have some "unique" characteristics relative to other Sinitic languages. But I'm not qualified to opine on whether those characteristics make them more conservative or more innovative).

  11. Nelson Goering said,

    October 31, 2025 @ 6:20 am

    "I was just trying to make the simple point that Min languages have naturally evolved over time like any other languages. So, while they are conservative in some ways, they are not conservative in others."

    I see! I can imagine that trying to frame things in that (obviously very sensible) way could be struggling against a certain degree of romanticization. I definitely see something similar in my home turf of Germanic often enough, where languages like Icelandic and Elfdalian — which *do* have some legitimately conservative features, but have also both undergone some really major changes over the centuries — are presented as unchanged fossils of the Viking Age (which neither is, by a very long shot).

  12. Jerry Packard said,

    October 31, 2025 @ 10:31 am

    I’ve been taught that Min branched off earlier (Bodman, Norman, Coblin). But I’ve always thought that this doesn’t necessarily imply that it was therefore closer to Old Chinese. I thought the Yue dialects had that honor.

  13. wgj said,

    November 1, 2025 @ 10:50 am

    The only way for the Yue dialects to be closer to Old Chinese is if the (main branch of) Han ethnic group originated in SE Asia (Thai-Viet) as the original rice farmers and migrated northwards along the coast into China following the warming climate after the last glacial maximum. There is some very tangential evidence for this hypothesis from paleogenetics, but it's far from mainstream.

  14. Nelson Goering said,

    November 3, 2025 @ 2:13 am

    "The only way for the Yue dialects to be closer to Old Chinese is if the (main branch of) Han ethnic group originated in SE Asia (Thai-Viet) as the original rice farmers and migrated northwards along the coast into China following the warming climate after the last glacial maximum."

    This is confusing geography with linguistic conservatism. Estimating the latter is hard (people have a tendency to underrate conservative features in "mainstream" varieties and exaggerate those in "marginal" ones), but insofar as conservatism is measurable, it's not particularly related to geography. Despite my objections to Icelandic getting romanticized, there are a number of ways in which it really is fairly conservative, but it is in a region only settled by Norse speakers relatively late on, and far removed from the original locus of North Germanic.

    I'm not saying that the Yue varieties really are exceptionally conservative (I have no idea), but this would be a matter purely of how much *linguistic* change there has been: do they preserve more features (in this case, I guess phonological ones are meant) from Old Chinese than other varieties do? Geography and population dynamics have absolutely nothing to do with that question.

  15. KIRINPUTRA said,

    November 4, 2025 @ 8:16 am

    @ Andreas Johansson

    Am I reading Kirinputra correctly as saying that the genetic unity of Sinitic is unproven?

    That’s right. It is literally unproven that all of Sinitic is “genetically Sinitic”.

    Some find this statement remarkable, hence the passive-aggressive pushback, etc. If anything, I find the underlying state of affairs more remarkable.

    > That Cantonese might be closer to Burmese than to Mandarin or something like that?

    Wild! I wouldn’t go anywhere near there.

  16. KIRINPUTRA said,

    November 4, 2025 @ 9:10 am

    @ Jerry Packard

    The statement that “Romance is by now evidence-based…[and]…Sinitic is belief-based” strikes me as odd

    It would’ve struck me as odd too at some point — a non-starter, even. Then the reality confronted me, almost (?) unexpectedly.

    especially if you consider primary Chinese-language sources and all the fieldwork done over the past 100 years or so.

    I do read Mandarin fluently, and I def. did consider modern Chinese-language sources. Regardless of its pedigree, the “Sinitic assumption” is Chinese-led by default today since Western China scholars are deferent to “ethnic Chinese” scholars when the latter are in consensus, which is often.

    “Primary” is an interesting word here too, BTW.

  17. KIRINPUTRA said,

    November 4, 2025 @ 9:12 am

    @ Jonathan Smith

    The Sinitic of historical linguistics is (or should be) based on systematic phonological correspondences across putative daughter languages.

    Using just “character readings”, right? No real words, etc. ;)

  18. KIRINPUTRA said,

    November 4, 2025 @ 9:18 am

    @ Blake Shedd

    No-brainer for an expert in adjacent fields (or a non-expert, of c.) to look at the Swadesh lists on the Wikis. But they’re treacherous for the so-called Chinese languages.
     
    Taking the list for Hokkien, for example…. Extreme Mandarin-fusion expressions (to the point of nonsense) are packed in together with the common, cardinal word or phrase on many lines — typically preceding the common word — as if these forms simply coexisted timelessly in Hokkien (at best). Meanwhile, basic monosyllabic variants & whole swathes of words with no demonstrable Mandarin cognate (e.g. ENG-IA for DUST) are left out.
     
    Why such a state? Raw ignorance (of the point of Swadesh lists; of the language itself) is probably involved, but the timeless Mando-fusion, the omission of “non-pan-Chinese” words, etc…. All that happens to be politically correct in the context of Making China Great Again.
     
    https://en.wiktionary.org/wiki/Appendix:Hokkien_Swadesh_list

    You def. couldn’t fool around like this with the Swadesh list for, say, Albanian, or even Manchu. To be clear, it’s not just the Wikis. This kind of problem is pervasive — although typically in even more subtle forms — in the data for the so-called Chinese languages (excl. Mandarin, ironically).

  19. Chris Button said,

    November 4, 2025 @ 9:04 pm

    @ Kirinputra

    I would be wary of Swadesh list(s) regardless of the languages involved.

  20. Andreas Johansson said,

    November 5, 2025 @ 2:38 am

    That’s right. It is literally unproven that all of Sinitic is “genetically Sinitic”.

    Thank you for clarifying :)

    Some find this statement remarkable

    I do find it remarkable, if only because I don't believe I've ever seen the unity of Sinitic questioned before.

    Wild! I wouldn’t go anywhere near there.

    Do you have a suggestion what a plausible alternative to Sinitic unity could look like?

  21. Nelson Goering said,

    November 5, 2025 @ 3:00 am

    Is the objection just that there is vocabulary variation within Sinitic?

  22. KIRINPUTRA said,

    November 5, 2025 @ 10:31 am

    Not sure what you’re asking, but vocab. variation among “Chinese” languages is indeed deemed a nuisance or a hindrance in some contexts. You can see a hint of that thinking in the aforelinked Hokkien Swadesh list on Wiktionary. Comparative linguists often get around the “problem” by comparing “character readings” instead of real words.

  23. KIRINPUTRA said,

    November 5, 2025 @ 10:32 am

    @ Chris Button

    I hear you.

  24. KIRINPUTRA said,

    November 5, 2025 @ 10:42 am

    I don't believe I've ever seen the unity of Sinitic questioned before.

    That’s remarkable in itself, for something that hasn’t been proven or demonstrated in any remotely scientific fashion.

    Certainly, if one approaches “Sinitic” via Mandarin — as almost all modern Western scholars do, and many non-Western (incl. Chinese) scholars — the uniform aspects of the elephant are always emphasised, almost systematically. For ex., even the go-to “polar opposite of Mandarin” role — in the imagination of the Western scholar — has been claimed by Hong Kong-Canton Cantonese, which is, lexically & structurally, the most Mandarin-influenced language on the seaboard south of the Yangtze Basin. (So the very imagination is stunted, most cruelly for those who’ve power-studied Cantonese in order to see, they hoped, the far side of the elephant.)

    Also, the monogenesis of the “Han Chinese” people — and, in parallel, their language(s) — is an existential pillar of the Chinese nation. So there’s that. Def. something to keep in mind.

    Another factor is the special “competitive” type of “community-based” Chinese nationalism that surrounds, from birth, native speakers of any big-market Deep South or Deep Southeast Chinese language — i.e., most of the very people best positioned to approach “Sinitic” from some other side of the elephant. Western scholars don’t give a damn about the Party line, in theory — but the subjective beliefs of the Subject are kind of sacred nowadays. Sorry for the over-comment!

  25. KIRINPUTRA said,

    November 5, 2025 @ 10:45 am

    Do you have a suggestion what a plausible alternative to Sinitic unity could look like?

    Let’s do the science and find out.

  26. Chris Button said,

    November 5, 2025 @ 10:31 pm

    I am looking forward to seeing Jerry Norman's posthumous dictionary when it is published.

    Certainly, if one approaches “Sinitic” via Mandarin — as almost all modern Western scholars do, and many non-Western (incl. Chinese) scholars …

    Isn't the approach normally via Middle Chinese?

  27. KIRINPUTRA said,

    November 6, 2025 @ 4:40 am

    I mean “approach Sinitic” as in “come to Sinitic”. (So, I hope there is somebody out there that “learned” Middle Chinese before Mandarin. Judging from all the Mandarin stuff in our tools & resources for Middle Chinese, though, most people come to Middle Chinese via Mandarin, or another modern language plus Mandarin.)

  28. KIRINPUTRA said,

    November 6, 2025 @ 4:41 am

    Do you have a suggestion what a plausible alternative to Sinitic unity could look like?

    Let’s do the science and find out.

    I was reminded of this today. A friend, an adult heritage learner of Taioanese, used Verb + Target State + Object order (*看清現實 = *KHÒAᴺ CHHENG HIĀN-SI̍T, à la Mandarin) in a post where Verb + Object + Target State (共現實看乎清 = KĀ HIĀN-SI̍T KHÒAᴺ HŌ͘ CHHENG) was the correct order.

    Verb + Target State + Object is very much legal in Cantonese (睇清現實), and probably in nearly all the socially Chinese languages.

    Genetically related languages can be, and more or less always are, structurally different. What could be more obvious? But … what is the idea that Hoklo is genetically related to all other types of “Chinese” even based on?

    Hoklo — esp. seen diachronically — is not qualitatively more “Chinese” than the “non-Sinitic” Mien. However, genetic relationship is a matter of history, not features, typology, etc. The idea is that past scholarship has demonstrated that the Hoklo languages are descended via continuous imperfect cloning (although we don’t use this term in linguistics) from some ancient language that Cantonese, Mandarin & Changsha speech — but not Mien — also descend from via cloning.

    That it hasn’t, though.

    This is not procedural trivia. Going by the evidence we do have, it seems quite possible, if not likely, that Hoklo & (say) Mandarin don’t actually share an ancestor in the conventional sense, via unbroken chains of imperfect cloning.

    Put provocatively, similarities among the socially Chinese languages in their entirety may well be the product of convergence, not divergence. (Intuitively, though, I do believe a large subset of “Sinitic” is genetically related in the conventional sense.)

    Clearer, deeper, moar high-level top-down work doesn’t get us closer to the truth. The genetic affiliation of a Teochew or a Hainamese has to be based first on in-depth documentation & analysis of Teochew, etc., and then on detailed understandings of past & present relationships between Teochew and similar or neighboring languages. High-level top-down work is only as good as the low-level bottom-up work it builds on. The state of the art in Romance linguistics is built on copious data & contested insights on every modern language & some dead ones. Individual scholars bring expertise & working knowledge in different & often pretty random combinations of Romance (and relevant non-Romance) variants. The careers of many individual scholars thus form an interlocking whole that is always becoming more & more robust.

    Every sociopolitical environment is different, but standards by definition should be consistent. There should be no shame in saying We Don’t Know Yet. *看清現實-level ignorance (errors) shouldn’t be showing up in comparative linguistics papers. The data needs to be better, and linguists need to be better at evaluating it: Stick to what you know, or know what you stick to.

  29. Chris Button said,

    November 6, 2025 @ 8:07 am

    I was reminded of this today. A friend, an adult heritage learner of Taioanese…

    But there is a reason why historical linguists tend to ignore syntax in the same way they tend to ignore Swadesh lists. Neither are particularly helpful for establishing genetic relatedness.

    Hoklo — esp. seen diachronically — is not qualitatively more “Chinese” than the “non-Sinitic” Mien.

    Hmong-Mien and Tai languages were originally treated as branches like Sinitic and Tibeto-Burman from the same "Indo-Chinese" source. Hmong-Mien and Tai were then deemed genetically unrelated and pulled off.

    To pull Min off from Sinitic represents more than just separating the Hmong-Mien branch from the Sinitic branch though. You are pulling it out from its position within a branch.

    To be clear, I am no Min expert, and I'm not about to google lots of articles and then feign some expertise here!

    However, from my cursory exposure to articles on Min languages over the years while working on Sino-Tibetan/Tibeto-Burman languages, your proposal does seem drastic in the state of current knowledge. Having said that, I for one am interested to see what you can uncover.

  30. David Marjanović said,

    November 6, 2025 @ 2:18 pm

    if the (main branch of) Han ethnic group originated in SE Asia (Thai-Viet) as the original rice farmers and migrated northwards along the coast into China following the warming climate after the last glacial maximum

    Wikipedia says rice was domesticated "some 13,500 to 8,200 years ago". The Last Glacial Maximum ended some 16,000 years ago, so I'd postulate any such migration before the origin of rice farming.

    whole swathes of words with no demonstrable Mandarin cognate (e.g. ENG-IA for DUST) are left out

    "Dust" is not terribly basic vocabulary*; it is, however, the kind of word that is often borrowed from a local language during language shift.

    * German: Staub; neither English nor German even seem to have a cognate of the other word, respectively, in any meaning.

  31. Jerry Packard said,

    November 6, 2025 @ 5:20 pm

    Ok. If you take either the 汉语方言词汇 (Chinese Dialect Word List) or the 汉语方言字汇 (Chinese Dialect Character List), both first published in the early ’60s, you find lists of words and characters pronounced by speakers of 8 Mandarin dialects and speakers of 10 languages of the 6 Sinitic dialects. An investigator could do an analysis of the Sinitic dialect data regarding their cognate status and their phonological resemblance, and compute the statistics to estimate the likelihood that they do or do not ‘match’, that is, do or do not originate from the same body of data. I am not aware of such an analysis, but I know where I would put my $$ on a bet as to whether they do or do not match.
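    One way such a statistic might be computed is a simple permutation test: count how many meaning slots show a resemblance between two varieties, then check how often a random re-pairing of meanings does as well. The sketch below is only an illustration under loud assumptions: the function names are hypothetical, the word forms are invented toy transcriptions (not data from either list), and "same initial segment" is a crude stand-in for a real test of regular sound correspondences.

```python
import random


def initial_match_count(pairs):
    """Count meaning slots whose two forms begin with the same segment."""
    return sum(1 for a, b in pairs if a[0] == b[0])


def permutation_p_value(list_a, list_b, trials=10_000, seed=0):
    """Estimate how often a random re-pairing matches as well as the real one."""
    rng = random.Random(seed)
    observed = initial_match_count(zip(list_a, list_b))
    shuffled = list(list_b)
    at_least_as_good = 0
    for _ in range(trials):
        # Shuffling breaks the meaning-to-meaning alignment (the null case).
        rng.shuffle(shuffled)
        if initial_match_count(zip(list_a, shuffled)) >= observed:
            at_least_as_good += 1
    # Add-one smoothing avoids reporting an exact zero from a finite sample.
    return observed, (at_least_as_good + 1) / (trials + 1)


# Hypothetical toy forms, aligned by meaning; NOT real dialect data.
variety_a = ["pa", "ti", "ku", "ma", "sa", "lo", "ne", "ho"]
variety_b = ["pe", "to", "ka", "mi", "su", "la", "no", "hu"]

observed, p = permutation_p_value(variety_a, variety_b)
print(observed, p)
```

    A small estimated p-value would only say that the aligned lists resemble each other more than chance re-pairings do; it would not by itself distinguish inheritance from borrowing or convergence.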

  32. KIRINPUTRA said,

    November 6, 2025 @ 9:35 pm

    @ Chris Button

    your proposal does seem drastic in the state of current knowledge

    The idea that we hold our guesses and do the science first?

    If anything, it seems drastic to view the science as irrevocably not-done.

  33. KIRINPUTRA said,

    November 6, 2025 @ 11:42 pm

    @ David Marjanović

    (The can't-miss point was that there is systematic bias embedded in the data — the omission of ENG-IA from the Hokkien Swadesh list on Wikt. being an “organic” example — making it “improbable” for a non-Hoklo-speaking linguist, regardless of intent, to truthfully work with Hoklo data, let alone draw conclusions as to its genetic affiliation….)

    "Dust" is not terribly basic vocabulary*; it is, however, the kind of word that is often borrowed from a local language during language shift.

    How about FOOT/LEG, or (HU)MAN? The Hoklo words for these aren't cognate to the Hakka or Cantonese or Mandarin words either — as we all know.

  34. Andreas Johansson said,

    November 7, 2025 @ 3:49 am

    As it happens, the German and English words for "leg" and "human" aren't cognate either.

    (Those for "foot" and "man" are.)

  35. wgj said,

    November 7, 2025 @ 5:29 am

    The genetic unity of Sinitic is not only unproven, but it's being challenged increasingly by both paleogenetic and paleoanthropological studies, including by mainland Chinese researchers affiliated with mainstream state institutions.

    The "Western Xia, Eastern Yi" hypothesis, which says the civilization later known as Han came to be through merger of two large civilizations (one coastal and the other inland), each with their own (unrelated and starkly different) language, proto-writing, spiritual beliefs (shamanism vs. ancestor worship), economic production (rice vs. millet) and material culture, was proposed decades ago, remains somewhat fringe to this day, but is slowly picking up support.

    https://youtu.be/bKFHZ6biVCw
    https://youtu.be/xZv_FgqgqrA

  36. Chris Button said,

    November 7, 2025 @ 5:35 pm

    @ wgj

    The phonetic elements in oracle-bone inscriptions, which represent an advanced form of writing, often continue to this day. Reconstructions of earlier forms of Chinese demonstrate an unbroken tradition in that regard.

    My only gripe is that people who reconstruct Old Chinese are often poorly versed in oracle bone inscriptions. Random examples are sometimes thrown in haphazardly without any real understanding.

    Worse still is when the oracle bones are dismissed as not containing enough linguistic evidence to be useful! It reminds me of how inscriptional Burmese (albeit from a much later time depth) has often been dismissed as too haphazard and unreliable. The real reason was a lack of readily accessible lexicons/concordances.

  37. Jonathan Smith said,

    November 8, 2025 @ 12:58 am

    Re: Andreas Johansson said, "Am I reading Kirinputra correctly as saying that the genetic unity of Sinitic is unproven?" and KIRINPUTRA said, "It is literally unproven that all of Sinitic is 'genetically Sinitic'"

    Nothing is "proven." "Sinitic" in its current form is a (good) hypothesis attempting to account for phonological correspondences in basic vocabulary across "Taiwanese", "Mandarin", and the rest. Not to say this should not be done (much) better or that there is no other way of approaching the problem; see to follow…

    Andreas Johansson said, "Do you have a suggestion what a plausible alternative to Sinitic unity could look like?"

    If we are to accept the notion of genetic relationship and certain of that metaphor's key implications, such an alternative could take a form such as "Min is a valid genetic grouping but represents a branch not of Sinitic but of (e.g.) Mien whose Sinitic features are ultimately contact phenomena." Problem with this is that the (e.g.) Mien-looking elements of Min lexicons are not particularly extensive and seem to correspond rather unevenly to one another and to Mien, while the Chinese-looking elements of Min lexicons are extensive and seem to correspond to one another and to less-controversial Sinitic rather evenly. Thus the state-of-the-art. There are of course other possible directions… while "Min is not a valid group" does not look super promising to me, something more radical like "genetic relationship is just an imperfect or indeed invalid metaphor given lack of evidence from e.g. derivational morphology" might be an OK beginning, as I noted above. FWIW, people like Baxter & Sagart 2014 treat Vietic, HM, etc., loans exactly like Sinitic daughter material for purposes of "reconstruction"… if one is methodologically OK with this, one is probably obligated to accept the equivalent assertion that e.g. "Middle Chinese"-to-Min correspondences need not imply "genetic" relatedness…

    KIRINPUTRA said (re: J Smith said "The Sinitic of historical linguistics is (or should be) based on systematic phonological correspondences across putative daughter languages") "Using just “character readings”, right? No real words, etc. ;)"

    Taking this seriously (and forgetting "Sinitic") for a moment, the idea of a historical relationship between specifically say "Hoklo" and "Hokchiu" looks to me to be pretty darn securely founded in the very grimiest (dustiest?) of real words… I had thought of drawing something up along these lines which while bound to be to a large degree redundant with stuff that is already out there could serve as a kind of one-stop-shop for the curious generalist… having accepted a unity of this kind, the question then becomes what one's actual hypothesis is re: it and its relationship to uncontroversially "Sinitic" neighbors (see above)… cuz one needs one.

    KIRINPUTRA said, Wiktionary-type data issues are "pervasive […] "in the data for the so-called Chinese languages (excl. Mandarin, ironically)."

    right but… emphatically *including* Mandarin. Rural Shandong might as well be Mars to such data sets. Note the "Chinese Dialect Word List" and "Chinese Dialect Character List" mentioned by Jerry Packard are unfortunately also of this general kind, i.e. not "real words," as are tons of other very-legit-seeming materials…

    KIRINPUTRA said, “A friend, an adult heritage learner of Taioanese, used Verb + Target State + Object order (*看清現實 = *KHÒAᴺ CHHENG HIĀN-SI̍T, à la Mandarin) in a post where Verb + Object + Target State (共現實看乎清 = KĀ HIĀN-SI̍T KHÒAᴺ HŌ͘ CHHENG) was the correct order. […] *看清現實-level ignorance (errors) shouldn’t be showing up in comparative linguistics papers."

    Sure but fact is structural merger of the kind reflected in this "error" is ubiquitous/insidious/irresistible such that "relative purity" claims are a pot-kettle situation where everyone instantly gets potted. E.g. in your correction, kā is historically a benefactive-and-such marker and in particular is not applied to inanimate "DOs" literally ever… like never in say pre-WWII POJ literature and I would be surprised if you could find one single instance even in e.g. the Maryknoll textbook series from the 80s (I know I know 言無難). It remains rare in TV/radio into the 90s. Uses of this kind — largely modeled on Mandarin ba3 — have become widespread only over the course of the past couple decades.

    KIRINPUTRA said, "Hoklo — esp. seen diachronically — is not qualitatively more 'Chinese' than the “non-Sinitic” Mien. […]

    It sure is, both qualitatively and quantitatively; see above — but note I also allow that "Min is Mien" or something is not a categorically insane hypothesis. Neither is "Yue is Tai" for that matter, though you would regard such as "wild." Both hinge on one's interpretation of large-scale phonological correspondences.

    Put provocatively, similarities among the socially Chinese languages in their entirety may well be the product of convergence, not divergence."

    Cf. Haspelmath's "Standard Average European" (convergence product), so not particularly provocative except again there are actual regular phonological correspondences readily interpretable as the product of regular sound change i.e. "divergence" — so a better-formulated devil's advocate position would be e.g., "'Sinitic' could in theory represent a superstrate across a broad swathe of its range; (factual!) phonological correspondences alone aren't diagnostic of something called "genetic relationship"; shared syntax is just contact (not at all provocative FWIW); any say Mien or Tai language of the Chinese south could in theory be swept into common-sense 'Sinitic' given more intense contact / another hundred years."

    below desultory —

    Chris Button said, "One myth that does need dispelling is that Min languages somehow represent a more conservative ancient version of Chinese"

    I've been pointing out for a while that "Min as objectively 'conservative'" is a lazy formulation; see e.g. LL thread "Tang (618-907) poetry in Min pronunciation". Re: specifically applications of Min data to reconstructions of early complex onsets of various exotic kinds, anyone interested is cordially invited to check out my (kind of) recent paper "Some problems involving Proto-Mǐn onsets and new Old Chinese" in Diachronica.

    Blake Shedd said, “Would a comparison between Indo-European and Sinitic better represent the differences than a comparison between Sinitic and Romance?”

    Arguably… early European learners of Amoy etc. drew analogies between e.g. Germanic and JUST what is now called "Southern Min", so…

    wgj said, "The only way for the Yue dialects to be closer to Old Chinese is if the (main branch of) Han ethnic group originated in SE Asia"

    In this and other comments you are "confusing geography [and genetics — JS] with linguistics" to paraphrase Nelson Goering.

  38. wgj said,

    November 8, 2025 @ 7:05 am

    @Chris: The oracle bones from (late) Shang period aren't the oldest form of Sinitic writing – they're merely the oldest Sinitic writing undisputedly recognized as such. Both Erlitou culture outside Luoyang and Longshan culture outside Jinan – both considered candidates for the mysterious and unconfirmed earlier "Xia dynasty" – have proto-writings on pottery. The second video I posted above talks about writings older than the oracle bones.

    The merger of two (or more) populations into Sinitic would have happened much earlier than Shang, based on both paleogenetic and paleoanthropological evidence.

  39. Chris Button said,

    November 8, 2025 @ 9:07 am

    @ wgj

    I'm aware of those, and ignore them. I recall Qiu Xigui gives a good overview/appraisal.

    In any case, the use of the phonetic elements in the oracle-bone inscriptions confirms an unbroken Sinitic line through to the present. The details remain debatable.

    Of course, many scholars have speculated about earlier connections with Sinitic (e.g., Pulleyblank with Indo-European, Sagart with Austronesian, Starostin with North Caucasian), but none seem convincing. The only one of any real interest seems to be Pulleyblank's proposed Indo-European connection because he is hitting on more primordial features of language in general.

  40. Jerry Packard said,

    November 8, 2025 @ 3:27 pm

    “Note the "Chinese Dialect Word List" and "Chinese Dialect Character List" mentioned by Jerry Packard are unfortunately also of this general kind, i.e. not "real words," as are tons of other very-legit-seeming materials…”

    Not real words? How would you define words such that the entries in the "Chinese Dialect Word List" would not be considered words? My copy (1964) has 474 pages, with each page containing two 18-line columns = 17,064 word entries. Perusing the data, it is very difficult to find non-words (i.e., syntactically non-free forms). The majority, as you would expect, are multi-syllabic (2-3 syllables) with less than 50% single-syllable. These data would be among the best that one could expect from a language field work perspective.

  41. Chris Button said,

    November 8, 2025 @ 4:18 pm

    Is the issue with the word lists perhaps because they contain "citation forms"?

    I recall looking at one (or both) of those word lists years ago, but I don't remember paying attention to the methodology used.

    When I was collecting word lists, I deliberately used frame sentences in addition to recording the words in isolation.

  42. Jonathan Smith said,

    November 8, 2025 @ 4:23 pm

    @Jerry Packard
    To be more specific, things like 汉语方言字汇 if I remember correctly feature columns labeled with characters ("树" "日" etc.), rows labeled with locales ("北京" "厦门" etc.), and cells populated with IPA-ish values with four-corner-type tone marking. SO there is no telling which of these cells contain real words in the (hopefully obvious) sense of being things actually said by speakers of whatever variety to mean the suggested meaning. And the problems really are deeper than this — see, e.g., discussion of Linguistic Atlas of Chinese Dialects in LL threads from last year like Dialectometry. Good data is e.g. an actual dictionary or lexicon specific to some variety… all the better if one is able to gain some real practical knowledge of the language at issue, easier said than done of course…

    Returning for a second to Abstand und Ausbau since I managed to make a long post (#2 above) without making the point clear enough: Mandarin in most places including e.g. Taiwan is certainly not "ausbau" with respect to local languages in the sense that standard German or French are. Instead it's like, oh, Hindi if it were imposed as the language of government and education in southern India… which would go over not well. There isn't and never was an ausbau based on Taiwanese varieties or Hoklo varieties more broadly. Interestingly but not surprisingly when set alongside e.g. France, (potential) movement in such a direction is if anything often resisted by community members…

  43. Jerry Packard said,

    November 8, 2025 @ 5:20 pm

    @Jonathan
    Your description of the work is accurate, but I’m not sure why you think that, for some reason, rules them out as “real words in the (hopefully obvious) sense of being things actually said by speakers of whatever variety to mean the suggested meaning.”

  44. Jonathan Smith said,

    November 8, 2025 @ 11:47 pm

    @Jerry Packard
    It doesn't rule them out (as words), it fails to rule them in — and so rules them out (for consideration in comparative work.) The locales listed could just as well include "东京", "河内", etc. Surely the type of question to be asked is e.g. "how do you say 'sun' in the Sinitic languages?", a question which must have thousands of distinct phonological answers and dozens of distinct etymological ones… whereas "how do you read '日'?" or whatever is just not a linguistic issue in the strict sense.

  45. KIRINPUTRA said,

    November 9, 2025 @ 5:57 am

    @ Chris Button

    historical linguists tend to ignore syntax in the same way they tend to ignore Swadesh lists. Neither are particularly helpful for establishing genetic relatedness.

    What is particularly helpful, BTW?

    Amidst their own ad hoc arguments, “Sinitic”-oriented scholars, in particular, seem to lose sight of what genetic relatedness even means.

    Hmong-Mien and Tai languages were originally…. Hmong-Mien and Tai were then deemed genetically unrelated and pulled off.

    To pull Min off from Sinitic represents more than just …

    This goes to the structure of our hallucinations (however interesting in their own right) — not to the truth of the matter.

    The belief that “Min” (as commonly conceived) is a branch of “Sinitic” reflects our imaginings (which may yet prove accurate, or not) & “aspirations” more than it does the systematically understudied, poorly known reality.

  46. Jerry Packard said,

    November 9, 2025 @ 7:39 am

    @Jonathan
    As a scientist, then, your task is to test your research hypothesis (= the languages do not derive from a common source) by trying to reject the null hypothesis (= they do derive from a common source) via data analysis. You have probabilities of type I error (accepting the null hypothesis as incorrect when it is incorrect) and type II error (rejecting the null hypothesis when it is correct). Believing the Sinitic dialects are not closely related or do not derive from a common source is in my opinion a type II error.

    But your complaint is not really that, but rather that the word lists are improperly elicited. Perusing the data, there are enough non-cognates in Min as well as obvious cognates to convince me that the geographic place name approach is valid – how else would you do it?

  47. Jerry Packard said,

    November 9, 2025 @ 7:45 am

    Sorry, should say: probability of type I error (accepting the null hypothesis as incorrect when it is correct).

  48. KIRINPUTRA said,

    November 9, 2025 @ 11:54 am

    > Nothing is "proven."

    Right. As David M. said here but elsewhere, best we can do is a parsimony argument.

    The point is to argue parsimoniously, not to Make Reality Parsimonious — such as by “minimising” the # of language families, a scholarly pastime that in rare (?) cases coincides with a stately nationalistic drive to lower the # of languages.

    > If we are to accept the notion of genetic relationship and certain of that metaphor's key implications,

    Genetic relationship is a metaphor, but the basis is past reality.

    As I understand it, Languages X & Y are genetically related if both descend from Language A; Language X descends from Language A if it is a(n albeit imperfect) clone of a(n albeit imperfect) clone and so on … of a(n albeit imperfect) clone of Language A. The stuff of family trees is chains of unbroken (albeit imperfect) cloning.

    Creoles like (basilectal) Singlish ain’t clones of anything. They mixes. They ain’t “descended”, in this sense. Despite “phonological correspondences in basic vocab.” vs Germanic, Singlish is not a Germanic language, nor will its descendants (clones) be.

    Reconstructed linguistic history is a proxy for the real thing — not the thing itself. Let’s keep this in mind. A proxy is the best thing sometimes, but it’s still a proxy.

    And let’s keep things in perspective. There’s so much more to human languages than genetic affiliation. Let’s not lose sight of that. The obsession with genetic affiliation is like the obsession with paternity in many societies, incl. Anglophone society till not too long ago. This is a great example of our discipline still (partly) being self-impaled on the 1800s. Oh! God forbid a language not have a brand-name daddy….

  49. KIRINPUTRA said,

    November 9, 2025 @ 12:11 pm

    > “Sinitic" in its current form is a (good) hypothesis attempting to account for phonological correspondences in basic vocabulary across …

    Faking certainty is not good. Admitting uncertainty is fine, at least for a time. And it’s the default in questions of genetic affiliation. There’s no claim in linguistics that, say, Tai & Hmong aren’t ultimately genetically related. The idea is that if they are, it’s in some fashion, or at some time-depth, beyond the capacity of our craft to ascertain at the moment.

    Phonological correspondences across core cognates don’t mechanically establish genetic relationship. If they did, Singlish would be Germanic™. (Prof. Smith also seems to underestimate the # of unshared core lexical items between Hoklo & Mandarin, as well as the # of Hoklo words thought to “correspond irregularly”. Why isn’t Vietnamese “Sinitic” again?) Genetic relationship is a matter of the past linguistic reality that we try to reconstruct. Among other things, was there an unbroken chain of (imperfect) cloning or not, as far as we know? And how far do we know?

    > the question then becomes what one's actual hypothesis is re: it and its relationship to uncontroversially "Sinitic" neighbors (see above)… cuz one needs one

    This prioritises appearances & politics. In some sense, concerns like these are part of the burden that stunts the study of all modern “Sinitic” languages but one (or part of one).

    A hypothesis is only as good as the data & insights it’s built on. Not to deny any aspect of reality — the pursuit of truth sometimes has marketing needs, etc. — but false certainty is worse than none.

    I too suspect that Hokchiu and several of its close neighbors are blood kin of Hoklo — co-clones of some Language A. But don’t take my word for it. I don’t know much about Hokchiu or Hokchia or 蛮講 or 蛮話 or Henghoa, partly b/c the language-learning threshold for being able to use (evaluate, synthesise, etc.) the data truthfully is pretty high. Robust scholarship calls for many specialties; all take time. Making the data more accessible, more conscious, & more abundant is least glamourous but most needed. Comparative work is only as good as the data, mostly. Sky’s the limit, but shitty shortcuts are how we got something like “Sinitic”, a solid (?) core wrapped in layers of circular guesswork.

    The phrase “uncontroversially Sinitic” is revealing. Very interesting.

  50. KIRINPUTRA said,

    November 9, 2025 @ 12:19 pm

    > It sure is, both qualitatively and quantitatively; see above — but note I also allow that "Min is Mien" or something is not a categorically insane hypothesis. Neither is "Yue is Tai" for that matter, though you would regard such as "wild."

    1800s obsession with who fathered who….

    > Both hinge on one's interpretation of large-scale phonological correspondences.

    Doubt this.

    A top-down Sinologist (& their students) will overlook whole categories of evidence in a mad rush to tree the fathers & sons based on intuition.

    > Cf. Haspelmath's "Standard Average European" (convergence product), so not particularly provocative except again there are actual regular phonological correspondences readily interpretable as the product of regular sound change i.e. "divergence"

    (I meant what I wrote. Hoklo has changed a good deal & then some in the last hundred-some years; the main thrust is convergence to Mandarin, b/c of the traditional book koine being mostly replaced by written Mandarin. The idea of Hoklo & Mandarin being one language was an abstraction to the bilingual of 200 years ago; today, even most bilingual linguists may not comprehend how much of the perceived lexical & structural likeness between the languages is recent.)

    > a better-formulated devil's advocate position would be e.g., "'Sinitic' could in theory represent a superstrate across a broad swathe of its range; (factual!) phonological correspondences alone aren't diagnostic of something called "genetic relationship"; shared syntax is just contact (not at all provocative FWIW); any say Mien or Tai language of the Chinese south could in theory be swept into common-sense 'Sinitic' given more intense contact / another hundred years."

    (I agree with some of this, in spirit if not to the letter.)

  51. KIRINPUTRA said,

    November 9, 2025 @ 12:33 pm

    > kā is historically a benefactive-and-such marker and in particular is not applied to inanimate "DOs" literally ever… like never in say pre-WWII POJ literature and I would be surprised if you could find one single instance even in e.g. the Maryknoll textbook series from the 80s (I know I know 言無難). It remains rare in TV/radio into the 90s. Uses of this kind — largely modeled on Mandarin ba3 — have become widespread only over the course of the past couple decades.

    Partly right, and my sentence may be in that part. Well done.

    There are “native” KĀ uses with inanimate DOs that couldn’t be derived from Mandarin. (But I think mine may be the type that is.)

    It’s not about pots & kettles. It’s about getting better & getting well.

  52. Nelson Goering said,

    November 11, 2025 @ 10:43 am

    "Phonological correspondences across core cognates don’t mechanically establish genetic relationship."

    Not mechanically, no. But, in the absence of much morphology, it is helpful to look at a large swathe of vocabulary, and when there are systematic similarities, evaluate how to account for them. English and French are a good example. English has borrowed an enormous number of French words, and there are plenty of systematic correspondences with modern French. Some of these are even fairly basic vocabulary items. But taken as a whole, the hypothesis that a French layer (or several) has been borrowed over a Germanic language is clearly the best hypothesis — and that would be so even without historical records of English. The basic reason is this: we get the most French vocabulary in areas of the lexicon most prone to influence in contact situations, and we get a preponderance (not an exclusivity!) of Germanic vocabulary (a small proportion of which is actually Norse, not West Germanic) in senses that should be most resistant to borrowing. You can get borrowing almost anywhere, but if most of your grammatical particles, words for daily objects, basic verbs, and the like are from a particular language family, the most economical hypothesis is probably going to be that the language belongs to that family. Reasonable human judgement is needed to holistically assess patterns of vocabulary, so it's not mechanical — but it's also not a mysterious or conceptually complicated procedure.

    My understanding is that Jerry Norman did quite a lot of direct work with Min, including field work and a good knowledge of at least some Min varieties. Is this wrong? If not, his judgement would seem to count for quite a lot in this regard — pushing matters decidedly away from "uncertain for now" to "presumably X, which now needs to be argued against". This is how I, not a Sinologist, would approach the matter in any family where I'm not personally able to easily and rigorously assess the primary data. If Norman *is* wrong, that would of course be interesting, and a demonstration of it would be nice to read. But the methodology is pretty straightforward anyway — it would be a matter of a bit of linguistic nosegrinding to get any further at this point.

  53. Victor Mair said,

    November 12, 2025 @ 10:23 am

    For additional remarks on the linguistics, genetics, archeology, history, etc. of "Sinitic", see "Abstand und ausbau, part 2" (11/11/25) and the comments thereto.

  54. Jonathan Smith said,

    November 12, 2025 @ 6:26 pm

    Re: Nelson Goering on "the most economical hypothesis", Norman, etc., yes — Min seen as part of Sinitic is no more or less than that

    Re: Jerry Packard on "rejecting the null hypothesis" — to be clear, I was contemplating the space of reasonable-seeming objections to the null hypothesis… so Sinitic incorporating Min is "economical" but nothing more; tons of (most? almost all?) interesting work on these languages has no need even to make reference to this framework

    Re: KIRINPUTRA "There’s so much more to human languages than genetic affiliation" and related comments: YES, classic (IMO) statement being Chirkova, "On principles and practices of language classification"

    Re: KIRINPUTRA "getting better & getting well," my impression is that for some, Taiwanese is a shibboleth landmine where the shibboleths are just as often arbitrary products of recent in-group consensus as they are historically meaningful. Maybe this isn't "well." My 2 cents (worth less than ever — inflation) is just "be the change you want to see (if you want to see change that is)"

  55. Jonathan Smith said,

    November 12, 2025 @ 6:28 pm

    *landmine > minefield

  56. Nelson Goering said,

    November 13, 2025 @ 2:11 am

    "There’s so much more to human languages than genetic affiliation"

    This is of course entirely true, though not at all a new observation. In a diachronic perspective, there are basically two types of affiliation: inheritance (vertical) and contact (horizontal). Both are hugely important. If people go back and read classic works from even the Neogrammarians, you'll find that this is already fully recognized and understood. The idea of the "linguistic area" (after the Neogrammarians, but nonetheless now rather venerable itself) is one way of formalizing some contact-based phenomena in a way no less reified than the language family. Terms like the Sanskrit Cosmopolis and Sinitic Cosmopolis are historical ones, in the first instance about language in history rather than history in language, but could be reapplied easily enough to the family-cross-cutting effects of Sanskrit and Sinitic in various parts of the world.

  57. KIRINPUTRA said,

    November 14, 2025 @ 5:35 am

    my impression is that for some, Taiwanese is a shibboleth landmine where the shibboleths are just as often arbitrary products of recent in-group consensus as they are historically meaningful

    (Literally not sure what you mean, but….) You’ve got me curious: What makes a shibboleth “historically meaningful”? What makes a shibboleth historically meaningless?

    on "the most economical hypothesis", Norman, etc., yes — Min seen as part of Sinitic is no more or less than that

    The evidence doesn’t indicate whether related Languages X & Y descend from Language A — so we “economise” by making believe that they do?

    (Am I missing something? Is there some key assumption that’s not being articulated?)

    tons of (most? almost all?) interesting work on these languages has no need even to make reference to this framework

    Yet [these languages] are systematically understudied compared to non-“Chinese” languages generally. Compare, for ex., Teochew vs Hiligaynon. This has more than a little to do with the Sinitic Assumption as-it-is, doesn’t it?

    (Might it be “economic” to describe the Sinitic “framework” as a pseudo-scientific step-down version of the fantasy of a Chinese monolanguage?)

  58. KIRINPUTRA said,

    November 14, 2025 @ 5:45 am

    In a diachronic perspective, there are basically two types of affiliation: inheritance (vertical) and contact (horizontal). Both are hugely important.

    Thanks for the comment, which I applaud.

    Some or arguably (?) all of what’s known as “Sinitic” also falls within the now-called Mainland Southeast Asia linguistic area (Sprachbund).

  59. KIRINPUTRA said,

    November 14, 2025 @ 6:39 am

    in the absence of much morphology

    We hear this a lot. What is it based on?

    Against some odds, the actual — and circular — logic seems to be that since there is very little Morphology that supports the Sinitic Assumption, Morphology is therefore irrelevant to the theme as a whole. (B/c it doesn’t apply to “all of Sinitic”.)

    On a related note, for ex., what sense does it make to tree the Hoklo languages — vs each other, vs their closest kin, & of c. vs supposed distant kin — w/o examining the (tonal) run-stand patterns in each language?

    This is an entire, partly morphological dimension that is ignored, apparently b/c it doesn’t apply to Cantonese or Mandarin.

    “What have Cantonese or Mandarin got to do with it?” This, now, is a question to be settled scientifically. As you can see, the Assumption has cut off the science and in a real sense preempted it.

    I say this not in ignorance (as one may assume, in haste), but from years of pondering the state of the science: A seductive, or desired, conclusion has bonsaied the scholarship — the doing of the science — from the dawn of modernity (which came late to E/SE Asia). That bonsaied science — in itself far from all bad — is now cited in circular support of the desired conclusion.

    it is helpful to look at a large swathe of vocabulary, and when there are systematic similarities, evaluate how to account for them

    Of course. This helps even when there is “abundant morphology”.

    Non-similarities have to be dealt with at some point, right? The Sinitic Assumption brushes non-similarities aside, in practice. It simply concludes based on what it includes.

    You can get borrowing almost anywhere,

    Sure.

    but if most of your grammatical particles, words for daily objects, basic verbs, and the like are from a particular language family, the most economical hypothesis is probably going to be that the language belongs to that family.

    The theory is fine.

    Let me guess: You thought “most of [the] grammatical particles, …, basic verbs, and the like” in (say) Hoklo are derived from some ancestor of, say, Mandarin.

    Not so. W/i the range of our tools, at least, much of the lexical core of Teochew, Hokkien, etc. cannot be shown to derive from any ancestor of Mandarin via either inheritance or borrowing. Such words lack cognates across “Chinese” at large …

    … and “comparative Chinese” scholars tend to simply ignore (if not naively overlook) them in their studies, which is mind-boggling, esp. if their conclusions cite to “the entirety of the evidence”.

    Systematic ignoring upstream is also why non-expert, non-speaker linguists, who know what they know about Hoklo (for ex.) from the writings of others, grossly misunderstand the lexical core of Hoklo to be “largely Sinitic”.

    It’s not….

    There’s also the Chinese-nationalist pop scholarship, etc., in Hoklo-speaking society that (put simply) assumes all “native” lexicon to be “Chinese”; if the Chineseness of a given word is not apparent, it’s b/c the “passage of time” has “obscured” the connection. This greatly affects the native scholarship & scholarly institutions and thus, indirectly, Western scholarship as well.

    My understanding is that Jerry Norman did quite a lot of direct work with Min, including field work and a good knowledge of at least some Min varieties.

    I think he did interesting work, and he worked on what he was interested in — which is real nice, when the times allow it.

    On the flip-side, it seems that, besides his work with specific varieties, he preferred to work with “the Min languages” in highly abstracted form, focusing on words that complemented his interesting theories. Aside from the fieldwork, etc., he didn’t do gruntwork. He focused on duckweed & children; he didn’t trace or chase grammatical particles through “the Min languages”. He didn’t deal with “the Min languages” in their tedious entirety. Honestly, given the tools & data of his (or even our) time, who could?

    In context, a field of study needs a range of scholars. A Norman-type scholar is a fine thing for any field. But his fields of study have been undermanned, creating something like overreliance on the one man’s intuition- and interest-driven (not heavy-duty reality-driven) scholarship. And today, I imagine, there are people (in a less favorable environment) who might like to be the next Norman. There are still too few people laying the foundations that are & were (in Norman’s time) lacking — and next to no social or material incentives for anyone to do so, for reasons intimately related to the Sinitic Assumption.

    his judgement would seem to count for quite a lot in this regard — pushing matters decidedly away from "uncertain for now" to "presumably X, which now needs to be argued against"

    (Point taken to some extent, but….) I don’t think he judged that “the Min languages” were genetically related to each other and to all of “Chinese”. He just assumed it. He also seemed to tacitly assume — someone please correct me if I’m wrong — that if “Min” wasn’t “Chinese”, it would have to be part of some other known language family, as if there was no other logical possibility. He also doesn’t seem to have taken language mixing (creoles, etc.) into account.

    Given the times, his assumptions & blind spots were probably reasonable. Not so for scholars of our time to “inherit” — to some degree willfully — the same assumptions & blind spots. Do we not stand on the shoulders of giants?

  60. Nelson Goering said,

    November 14, 2025 @ 10:13 am

    "Morphology is therefore irrelevant to the theme as a whole."

    That's rather a leap from what I said. The point is that morphology is a great tool where we have it, particularly inflectional morphology, and particularly irregular inflectional morphology. Especially with derivational morphology, though, be careful: this is often easily borrowed.

    I have no idea about the etymology of Min grammatical particles. I certainly wouldn't expect perfect correspondences with any other branch of Sinitic, and I certainly wouldn't take the starting point as back-projected Mandarin or Cantonese (which would miss any cases where Min was conservative against the others). I was commenting on the methodology, not the data — which, to be discussed usefully, needs to actually be discussed concretely.

    On "creoles", I would note that at least some prominent scholars of creoles (in their proper historical reference) doubt that "creolization" is a special type of language change. We of course have examples of languages that have borrowed a great deal of material from Chinese, and there is of course no particular reason Min couldn't have done the same thing. But the methodology for demonstrating this is just the same as I said, and as has already been applied to a number of other languages. Follow the method, see what the result is.

  61. KIRINPUTRA said,

    November 15, 2025 @ 5:39 am

    “Follow the method, see what the result is.”

    100%.

    That’s what I’m saying too.

    “The point is that morphology is a great tool where we have it,”

    Right. I’m questioning the idea that we don’t have it (or much of it, anyway) in the case of “Sinitic”.

    What can we know, for instance, of genetic relationship between Hakka & Hoklo w/o having compared the run-stand tonal patterns, which are borrowing-resistant to some high degree? (This is a matter of morphology, at least synchronically where Hakka is concerned.)

    “I have no idea about the etymology of Min grammatical particles.”

    You suggested that “if most of [the] grammatical particles [among other elements … in Hoklo, for ex.] are from [Chinese], the most economical hypothesis is probably going to be that the language belongs to [Chinese]”.

    They’re not, though. That’s the thing.

    Did you think they were? What led you to believe that, and how? (This too is not trivial.)

  62. Jonathan Smith said,

    November 17, 2025 @ 7:41 pm

    Ah maybe late to contribute meaningfully to this thread (and it's probably all been said before) but —

    re: "grammar", every language in the area has its own idiosyncratic collection of "content words" and "grammar words" (the latter sometimes transparent grammaticalizations of "content words", sometimes not, but anyway not in general seeming to have deep histories traceable to "proto-languages") with combinatory rules involving sequencing/nesting, i.e. no (recognized) inflection and (scarcely) any recognized productive derivation.

    So e.g. Mandarin's "grammar words" are by no means less idiosyncratic than those of e.g. Shanghainese or Taiwanese or whatever (though do note that they all constantly grab stuff from each other and from prestige/book idioms.)

    Thus the words of primary interest to, I don't know, Norman or Schuessler or Baxter & Sagart are almost exclusively "content words." And the lexicon of e.g. Taiwanese shows large-scale and reasonably systematic phonological correspondences to the lexicon of e.g. Mandarin. So (e.g.) "Taiwanese is Sinitic" is far and away the "most economical", that is to say the best, framework to work within until it breaks. Which it might. Since there remain lots of fuzzy points. That's the (not inherently political or ideological) situation. That's OK.

    As I pointed out, one could acknowledge the above and still attempt to escape e.g. "Taiwanese is Sinitic" with arguments about the very nature of genetic relationship. If one wanted.

    KIRINPUTRA, possibly you know Taiwanese (and Hoklo more broadly) too well, thus it/they seem very special (which they are!). A good idea (for you, me, anyone) would be to learn Hokchiu or Guing-iong or Iu Chiu or whatever (very) well, at which point they will of course seem every bit as special and you (me, anyone) might make impassioned arguments about how the "Sinitic Assumption" obscures practically everything interesting about them. Which it does. That is the nature of such comparative work, which ultimately deals with a relatively small number of putatively cognate content words. That's OK.
