Language Log

Sumerian and Sinitic

December 17, 2023 @ 7:54 pm · Filed by Victor Mair under Artificial intelligence, Classification, Grammar, Translation

This amounts to an afterword to this post: "Hype over AI and Classical Chinese / Literary Sinitic" (11/9/23)

Four decades ago, when I was trying to determine what type of language Sinitic was (synthetic, analytic, inflected, isolating, agglutinative, fusional, polysynthetic, etc.), from a survey of all the world's languages that I could get a grasp of, I came across Sumerian, which seemed to have many features that were similar to Sinitic, so I decided to look into that a bit more deeply.

Fortunately, I discovered this excellent book, which had just come out around that time:

Marie-Louise Thomsen, The Sumerian Language: An Introduction to Its History and Grammatical Structure (Mesopotamia: Copenhagen Studies in Assyriology, Volume 10) (Akademisk Forlag, 1984).

In it, she said, "…the study of the Sumerian language is not easy: the meaning of many words and grammatical elements is far from evident, the writing is defective…". She also declared, "The orthography of the Old Sumerian texts is rather defective."

I was shocked. These are rather harsh judgements, so I asked my colleagues who were specialists in Sumerian. Those were the days of Samuel Noah Kramer (1897-1990) and Åke W. Sjöberg (1924-2014), who were world-renowned Sumerologists. Åke assured me that Thomsen was a reliable scholar.

Hmmm, I thought, might I conceptually apply a similar characterization to Sinitic? Of course, I chose to focus on the earliest stages of Sinitic, since later stages (after the Han Dynasty [202 BC – 9 AD; 25–220 AD]) have been so influenced by areal features that one can no longer think of them as pristinely Sinitic. (This is why some of the most pathbreaking, creative linguists among my colleagues [e.g., John McWhorter, Charles N. Li, and David Prager Branner] think of post-classical Sinitic as a creole-like or mixed language.)

After that, around the mid to late nineties, I let the matter drop — under force of teaching, publishing, lectures elsewhere, expeditions to Central Asia — and thought of the Sumerian parallels only sporadically, until the last few days, when I was preparing that post about using AI to translate Classical Chinese, which forced me once again to think about the basic nature of Old Sinitic.

So I went back and asked the current generation of Penn Sumerologists, Steve Tinney and Phil Jones.

Phil tells me: "I've seen Piotr Michalowski (University of Michigan) refer to earlier cuneiform as 'nuclear' where it probably ignores most of the verbal and nominal affixes." Phil continues:

There's actually a couple of different phenomena happening.
One is how to represent CVC sequences: early on there is a small number of explicit CVC-signs, but the only certain way of representing that final C comes with the creation of a full inventory of VC signs.
The other is what prompts Michalowski's use of "nuclear" –
- a later form such as mu-un-da-an-peš-e can be analyzed as mu.n.da.n.peš.e
- does an earlier form mu-da-peš-e imply mu.da.peš.e with the later n.'s (basically 3rd animate singular markers) representing a linguistic innovation or were they already in the language and the writing system didn't feel the need to represent them in its earlier stage?
I think I've also seen the term "mnemonic" in place of "nuclear".
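
To make the two spellings Phil cites easier to compare, here is a minimal Python sketch. It rests on assumptions that are mine and not Phil's: that a VC sign repeating the vowel of the preceding sign (mu-un) only writes a final consonant (mun), and that the four transliteration vowels are a, e, i, u. The function names `join_signs` and `unwritten_segments` are hypothetical helpers invented for illustration. The sketch only isolates what the later spelling writes and the earlier one does not; it cannot tell us whether those segments were a linguistic innovation or were simply left unwritten.

```python
VOWELS = set("aeiu")  # assumption: the vowels used in the transliterations above


def join_signs(transliteration: str) -> str:
    """Collapse a hyphenated, sign-by-sign transliteration into one string.

    A VC sign that repeats the vowel of the preceding sign (mu-un) is read
    as writing only the final consonant of a CVC sequence (mun).
    """
    signs = transliteration.split("-")
    word = signs[0]
    for sign in signs[1:]:
        if len(sign) > 1 and word[-1] in VOWELS and sign[0] == word[-1]:
            word += sign[1:]  # drop the repeated vowel
        else:
            word += sign
    return word


def unwritten_segments(earlier: str, later: str) -> list[str]:
    """Return the segments of `later` with no counterpart in `earlier`,
    matching the two strings greedily from left to right."""
    extras = []
    i = 0
    for ch in later:
        if i < len(earlier) and ch == earlier[i]:
            i += 1
        else:
            extras.append(ch)
    return extras


later = join_signs("mu-un-da-an-peš-e")    # "mundanpeše"
earlier = join_signs("mu-da-peš-e")        # "mudapeše"
print(unwritten_segments(earlier, later))  # ['n', 'n']
```

The two segments that surface are exactly the n's of mu.n.da.n.peš.e, the markers whose status in the earlier texts is the open question.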

When I was thinking about this Sumerian-Sinitic symplegma fairly intensively in the mid-80s, I also found another descriptor for the grammar of Sumerian, and it seemed to fit quite well, without in the least sounding pejorative. It was something related to ellipsis, i.e., prone to the elision of inessential elements. I also considered "minimalist" and several other similar descriptors.

Please rest assured that I am not trying to posit a genetic relationship between Sinitic and Sumerian, as certain Hungarian and Turkish scholars have done for their own languages, not in the slightest. Rather, I am looking at Sumerian for comparative typological purposes, as explained in the second paragraph of this post.

Selected readings

"'The world's oldest in-use writing system'?" (5/12/12)
"ChatGPT does cuneiform studies" (5/21/23)
"Pleiades: From Sumer to Subaru" (4/25/22)
"The Sound of Ancient Languages, parts 1 and 2" (10/24/23)
"An example of ChatGPT 'hallucinating'?" (4/16/23)
"Desultory philological, literary, and historical notes on Xanadu" (4/4/23)
"Hallucinations: In Xanadu did LLMs vainly fancify" (4/3/23)
"Detecting LLM-created essays?" (12/20/22)
"Alexa down, ChatGPT up?" (12/8/22)
"Bing gets weird — and (maybe) why" (2/16/23)
"ChatGPT-4: threat or boon to the Great Firewall?" (3/21/23)
"ChatGPT writes VHM" (2/28/23)
"ChatGPT: Theme and Variations" (2/21/23)
"GLM-130B: An Open Bilingual Pre-Trained Model" (1/25/2023)
"ChatGPT writes Haiku" (12/21/22)
"Artificial Intelligence in Language Education: with a note on GPT-3" (1/4/23)
"DeepL Translator" (2/16/23)

9 Comments

Martin Schwartz said,

December 17, 2023 @ 8:11 pm

In my early teens I saw something (an article, not a book) referring to or summarizing the views of a Ball. I was dubious at the time, and now I see listings for the book Chinese and Sumerian, Charles James Ball, O.U.P., 1913. ?
Martin Schwartz

Martin Schwartz said,

December 17, 2023 @ 8:19 pm

Samuel Noah Kramer–I remember him as a charismatic and inspiring scholar. The Wiki article on him makes good reading.
Martin Schwartz

Jonathan Smith said,

December 17, 2023 @ 10:34 pm

Re: "Defective", this = imperfect (ambiguous, partial…) mappings from written sign to phonological form; nothing pejorative and nothing (directly) related to the Sumerian language per se. Yeah Chinese writing is also highly "defective" in this technical sense, in some respects more so now than it is thought to have been in the past.

Re: Chinese+Sumerian, an old idea is that given lots of monosyllabic words, logographs of these early writing systems lent themselves relatively straightforwardly to "syllabogram"-ish phonetic uses… I guess this makes sense.

Re: Chinese+creolization, a number of ideas have been floated, some involving the very earliest stages of Sinitic. While Chinese(s) clearly became typologically "MSEAsian" over time, IMO it's not clear that the standard creolization narrative is that useful here: why not plain old contact + time; after all, AA, AN… languages having fallen into this orbit in more recent times follow the same trajectory. Or maybe the bigger question is "why/whence this typology" in the first place; if one is inclined to accept "Austro-Tai", HM is the only relevant family without members lying plainly outside of the MSEA sphere…

Stephen Goranson said,

December 18, 2023 @ 9:11 am

Chinese and Sumerian, by C. J. Ball, is available online:
https://catalog.hathitrust.org/Record/012262001

Rodger Cunningham said,

December 18, 2023 @ 11:09 am

Jonathan Smith: I doubt that I'm the only LLog reader who'd like to see your final paragraph with the abbreviations expanded.

David Marjanović said,

December 18, 2023 @ 11:37 am

Do you mean the typology of the languages or that of the writing systems? For the writing systems I can see that: as the Sumerian notations gradually became writing during their attested history, they started as suitable only for bean-counting, with symbols only for concrete noun roots, numerals and measures, and gradually expanded by the use of the same signs for synonyms, homonyms and eventually just syllables, until it ended up as a full-blown writing system, capable of recording any text in specifically the Sumerian language. On the Sinitic side, it seems that various "presyllables" and prefix consonants came to be written (by modification of the base characters) late or in some cases never, implying a similar development that just isn't as exceptionally well documented as Proto-Cuneiform is. Incidentally, that would support the idea that Chinese characters were developed completely independently of any other writing system; only the abstract idea of notation itself could still have arrived from elsewhere in this scenario.

The languages, not so much. Sumerian was polysynthetic with lots of inflectional morphology expressed as affixes on either side of the noun and verb roots, and agreement between words in a sentence. Even in Old Sinitic as reconstructed by Baxter & Sagart (2014), there are hardly ever more than two affixes on the same root.

"final paragraph with the abbreviations expanded"

MSEA: Modern (?) Southeast Asian
IMO: in my opinion
AA: Austroasiatic
AN: Austronesian

Jonathan Smith said,

December 18, 2023 @ 12:15 pm

Sorry — Sinitic is (/ has become) part of the so-called "Mainland Southeast Asia linguistic area", with phonemic tone / largely monosyllabic morphemes / simplified syllable onsets and esp. codas but plentiful vowel contrasts, etc. Very different from many of its "Sino-Tibetan" relatives.

This has invited an explanation beyond just "contact", thus e.g. DeLancey's proposal ("The origins of Sinitic") that Chinese may have begun as a "creolized" lingua franca among (relatively) indigenous Hmong-Mien speaking, etc., populations — thus, some "Sino-Tibetan" lexical stock + relic morphology but largely "Mainland Southeast Asian" typological profile. (Lots more details in the papers including notion of a connection to the Shang-Zhou political transition…)

However, plenty of other "Sino-Tibetan" languages have undergone / are undergoing similar typological transitions… also the Tai part of Austro-Tai, if the latter is a thing, and more recently Cham within Austronesian, and of course Vietnamese, etc., within Austroasiatic, and so on. So I don't know that creolization per se needs to be part of the explanation re: Chinese; this just seems to be what happens to languages in this particular zone over time. Or maybe the "MSEA" profile just happens to feature certain kinds of "simplifications" associated elsewhere with creolization…

Hmong-Mien is special in that this family lies entirely within the zone in question and apparently originated here-ish. IDK if this means it had a special role to play at early periods as DeLancey and others suspect…

Stephen Anderson said,

December 18, 2023 @ 12:55 pm

The Thomsen book is also available online:
https://theswissbay.ch/pdf/Books/Linguistics/Mega%20linguistics%20pack/Cuneiform/Sumerian%20Language%20%28Thomsen%29.pdf

Chris Button said,

December 18, 2023 @ 9:36 pm

"On the Sinitic side, it seems that various 'presyllables' and prefix consonants came to be written (by modification of the base characters) late or in some cases never"

"… Even in Old Sinitic as reconstructed by Baxter & Sagart (2014), there are hardly ever more than two affixes on the same root."

The Baxter & Sagart system is already really extreme in terms of prefixes and presyllables. Personally, I remain unconvinced by many of them. It is quite on the fringes of Old Chinese reconstruction in that regard.

"Sinitic is (/ has become) part of the so-called 'Mainland Southeast Asia linguistic area'"

"… Very different from many of its 'Sino-Tibetan' relatives."

I'm not really in agreement with that. Take just the "Burman" side of Tibeto-Burman for instance. Granted, it doesn't seem the same can be said for the "Tibeto" side, but I'm not well-versed there.

As for all the other languages in the family, one big standout with which I have some personal familiarity is the Kuki-Chin family. Their use of complex verbal inflections appears highly exceptional on the surface, but ultimately they go back to a far simpler and typologically well-attested proto-paradigm.
