Language Log

Latinxua / Latinization — it worked in the 30s and 40s

December 21, 2021 @ 10:41 pm · Filed by Victor Mair under Alphabets, Romanization, Spelling, Topolects

Tweet from Alan DAI:

“the people’s paper” 老百姓报, a wall newspaper in yan’an, june 1937 — note that it is written with latinxua sin wenz, a romanization system developed by chinese and soviet sinologists during the early 1930s that did not mark tones and used a lot of fun x’s and z’s pic.twitter.com/vTMA7cUhKQ
— alan DAI (@dai_alan_dai) December 21, 2021

[Click on the photograph to see the complete Twitter thread, which has additional illustrations of printed Latinxua texts.]

We may think of Latinxua ("Latinization") as a forerunner of Hanyu Pinyin, which is the official Romanization of the People's Republic of China:

Latinxua Sin Wenz (Chinese: 拉丁化新文字; pinyin: Lādīnghuà Xīn Wénzì; lit. 'Latinized New Script'; also known as Sin Wenz "New Script", Zhungguo Latinxua Sin Wenz "China Latinized New Script", Latinxua "Latinization") is a historical set of romanizations for Chinese languages, although references to Sin Wenz usually refer to Beifangxua Latinxua Sin Wenz, which was designed for Mandarin Chinese. Distinctively, Sin Wenz does not indicate tones, under the premise that the proper tones could be understood from context.

Latinxua is historically notable as being the first romanization system used in place of Chinese characters by native Chinese speakers. It was originally developed by groups of Chinese and Russian scholars in the Soviet Union and used by Chinese immigrants there until the majority of them left the country. Later, it was revived for some time in Northern China where it was used in over 300 publications before its usage was ended by the People's Republic of China.

…

The work towards constructing the Beifangxua Latinxua Sin Wenz (北方話拉丁化新文字) system began in Moscow as early as 1928 when the Soviet Scientific Research Institute on China sought to create a means through which the large Chinese population living in the far eastern region of the USSR could be made literate, facilitating their further education.

This was significantly different from all other romanization schemes in that, from the very outset, it was intended that the Latinxua Sin Wenz system, once established, would supersede the Chinese characters. They decided to use the Latin alphabet because they thought that it would serve their purpose better than Cyrillic. Unlike Gwoyeu Romatzyh, with its complex method of indicating tones, the Latinxua Sin Wenz system does not indicate tones at all.

For those who are interested in the history of Chinese writing during the 20th century, I strongly recommend that they read the rest of the Wikipedia article, which ends thus:

Because Sin Wenz is written without indicating tones, ambiguity could arise with certain words with the same sound but different tones. In order to circumvent this problem, Sin Wenz defined a list of exceptions: "characters with fixed spellings" (Chinese: 定型字). For example, 买 (pinyin: mǎi; lit. 'buy') and 卖 (pinyin: mài; lit. 'sell') are of the same sound but different tones. The former is written as maai and the latter is written as mai in Sin Wenz. The word 有 (pinyin: yǒu; lit. 'to have') is also special; it is written as iou, as opposed to iu, which may be 又 (pinyin: yòu; lit. 'once more').

Telegrams sent by workers for the railways in the northeast of China switched from Zhuyin to Sin Wenz in 1950, then from Sin Wenz to Hanyu Pinyin in 1958; the 5 irregular spellings of 买 maai, 试 shii, 板 baan, 不 bu, and 李 lii, in use during the Hanyu Pinyin period, were inherited from Sin Wenz.

In addition, Sin Wenz also calls for the use of the postal romanization when writing place names in China, as well as preservation of foreign spellings (hence Latinxua rather than *Ladingxua).

Think what might have been. They had already published half a million issues with more than 300 publications and were making great strides in combating illiteracy, taking different topolects into account. But then the Hanziphiles spoke out, and even the communists quaked.

Selected readings

John DeFrancis, Nationalism and Language Reform in China (Princeton: Princeton University Press, 1950).
John DeFrancis, The Chinese Language: Fact and Fantasy (Honolulu: University of Hawaii Press, 1984).
Zhou Youguang, The Historical Evolution of Chinese Languages and Scripts (Columbus: National East Asian Languages Resource Center, The Ohio State University, 2003).
"Writing Sinitic languages with phonetic scripts" (5/20/16)

[h.t. Geoff Wade]


62 Comments

James Griffiths said,

December 22, 2021 @ 2:54 am

What's doubly tragic when it comes to Latinxua is that the early advocates were also strong proponents of preserving other Chinese languages and topolects. They were critical of other Latinisation projects for only focusing on (what would become) Putonghua.

Shanghai Society to Study the Latinization of Chinese Writing released a statement in 1935 (via DeFrancis, 1950):

"The National Language Romanization venerates the speech of Peiping as the National Language; nominally it advocates the unification of the National Language, but actually it sets up a dictatorship of the Peiping speech. In the opinion of people with leisure and money, no special effort is involved in learning Peiping speech and then using the roman letters to read and write. But if a poor person of Shanghai or Foochow or Canton has to study the Peiping speech and at the same time to learn romanization, that is almost as difficult as learning a foreign language."
John Rohsenow said,

December 22, 2021 @ 3:07 am

"In order to circumvent this problem, Sin Wenz defined a list of exceptions: "characters with fixed spellings" (Chinese: 定型字). For example, 买 (pinyin: mǎi; lit. 'buy') and 卖 (pinyin: mài; lit. 'sell') are of the same sound but different tones. The former is written as maai and the latter is written as mai in Sin Wenz…"
This particular solution immediately calls to mind the famous Gwoyeu Romatzyh 國語羅馬字 Transcription System, which was created in the late 1920s and served as a transcription system during the Republican period (1912-1949) and is still in use, at least to a certain extent, in Taiwan. It is most familiar to older American students as G.R., used in Prof. Y. R. Chao's famous beginning Chinese language textbooks. Its most interesting feature is the inclusion of the tone pitches of the syllables into the transcription by altering the transcription, instead of adding a special symbol to the basic transcription, so that in the two examples cited above, 买 (pinyin: mǎi; 'buy') and 卖 (pinyin: mài; 'sell'),
in G.R. the former is written as maai and the latter is written as may.
Victor Mair said,

December 22, 2021 @ 8:46 am

Gwoyeu Romatzyh (GR; National Language Romanization; Guóyǔ luómǎ zì 國語羅馬字 / 国语罗马字), which we have often mentioned and discussed on Language Log, was a creation of elegance and intelligence. Unfortunately, it was also difficult to master for those who were not geniuses. This is one of the main differences between GR and Latinxua Sin Wenzi, which was simple and easy to use for common people. Moreover, as rightly pointed out by James Griffiths, "the early advocates [of Latinxua Sin Wenzi] were also strong proponents of preserving other Chinese languages and topolects."

https://en.wikipedia.org/wiki/Gwoyeu_Romatzyh

https://en.wikipedia.org/wiki/Latinxua_Sin_Wenz
Mark S. said,

December 22, 2021 @ 10:13 am

For a book from 1936 discussing Sin Wenz, the need for it, and its orthography, as well as how it might handle fangyin (regional pronunciations), see Sin Wenz Rhumen (A Primer on Sin Wenz).

Please note that the final section of the book offers examples of texts in Sin Wenz.
Chris Button said,

December 22, 2021 @ 3:02 pm

Interesting how “wenz” is written as if one syllable. Perhaps reflecting the neutral tone of the “z” syllable?
J.W. Brewer said,

December 22, 2021 @ 6:38 pm

The wiki article on Sin Wenz notes rather self-referentially that "For example, 新 (pinyin: xīn; lit. 'new') can be written as xin or sin in Sin Wenz," although it doesn't explain why it ended up in this particular instance as Sin Wenz rather than Xin Wenz. (In hanyu pinyin the same characters would be "Xin Wenzi" and in Wade-Giles "Hsin Wen-tzu," I think.) Of course the same morpheme comes out "sin" in Sinkiang, the old postal-map romanization of what the current regime calls Xinjiang.

But this reminds me of something about the relationship between orthography and Mandarin phonology that I've puzzled over before. In bopomofo ㄒ, ㄕ, and ㄙ are three different glyphs representing three different syllable-initial sibilants, which are generally mapped to three different latin-alphabet equivalents (e.g. x, sh, and s, respectively, in hanyu pinyin). But as best as I can figure out Mandarin's phonotactics from online lists of the permissible syllables, these three are allophones in complementary distribution. E.g., the "new" morpheme is said to be pronounced ㄒㄧㄣ, but there is no possible syllable ㄕㄧㄣ or ㄙㄧㄣ for that to be contrasted with in a phonemically meaningful way. So my question is what motivated the designers of bopomofo and the various romanization schemes to treat these as three different consonants that needed to be distinguished from each other orthographically rather than a single phoneme whose realized pronunciation would vary (predictably) by context depending on the following vowel etc? The variability between g, k, x and z, c, s that wikipedia reports for Sin Wenz (with xin/sin as an illustration) suggests a somewhat inconsistent approach to maintaining part of this three-way distinction, but that just makes the whole thing even more puzzling.
David Marjanović said,

December 22, 2021 @ 6:41 pm

Perhaps reflecting the neutral tone of the “z” syllable?

No, Pinyin si, zi, ci were always written s, z, c without any vowel letter.
J.W. Brewer said,

December 22, 2021 @ 6:44 pm

Actually I need to amend the prior comment to note that while there do seem to be some minimal pairs where a contrast between ㄕ and ㄙ is phonemic, e.g. ㄕㄢ v. ㄙㄢ, there don't seem to be any minimal pairs where either of those two initials contrasts with ㄒ. So maybe you need two glyphs to avoid ambiguity but you don't need three.
David Marjanović said,

December 22, 2021 @ 6:48 pm

But as best as I can figure out Mandarin's phonotactics from online lists of the permissible syllables, these three are allophones in complementary distribution.

Well, you can regard the consonants of these syllables as allophones in complementary distribution conditioned by the vowels, or you can regard the vowels as allophones in complementary distribution conditioned by the consonants. If you're willing to accept syllables that don't contain phonemic vowels at all, as Sin Wenz makes explicit, you can analyze Standard/Northeastern Mandarin as having just two vowel phonemes.
J.W. Brewer said,

December 22, 2021 @ 6:52 pm

It may be significant to David M.'s point that hanyu pinyin si, zi, ci come out as single glyphs in bopomofo (ㄙ, ㄗ, and ㄘ), meaning the consonant without further specification somehow implies the vowel. The same is true (i.e. only one bopomofo glyph which somehow implies a default vowel if no second glyph is given) for the syllables written in hanyu pinyin as (leaving out vowel-initial ones) chi, ri, shi, wu, yu, & zhi.
IA said,

December 22, 2021 @ 7:00 pm

Reply to C Button of December 22, 2021 @ 3:02 pm

'Interesting how “wenz” is written as if one syllable. Perhaps reflecting the neutral tone of the “z” syllable?'

Nothing to do with that whatsoever. It is rather because Beila wrote z c s zh ch sh (*without* a vowel) for so-called 「空韻」. Why? Because the spellings zi ci si meant something else. What exactly? So far as graphemes go, Beila maintained the 尖團 differentiation, such that (regardless of how a person might actually realise them) zi ci si ≠ gi ki xi (which of course are merged in Pekinese as what HP writes as j q x). Thus Beila 'guozi' （國際） ≠ (supposing this lexicon existed for them at all) 'guoz' （國字）. Etc. etc.

(Why not use for 空韻, as GR did? Because in Beila, means /y/.)
IA said,

December 22, 2021 @ 7:02 pm

Correction of last sentence. Should read:

(Why not use the letter 'y' for 空韻, as GR did? Because in Beila, 'y' means IPA /y/.)
IA said,

December 22, 2021 @ 7:11 pm

And by 空韻 I mean the sounds [ɿ] and [ʅ] (to use the symbols deemed illegal by the International Phonetic Association).
IA said,

December 22, 2021 @ 7:24 pm

One should not say 'Latinxua' or 'Sin Wenz' when referring *only* to 'Beifangxua Latinxua Sin Wenz' (which latter should be called 'Beila'). This is because there were *many* areal (= 'dialectal') manifestations of Sin Wenz. Among which, for their respective details: https://zh.wikipedia.org/wiki/Category:拉丁化新文字.
IA said,

December 22, 2021 @ 7:27 pm

And here: https://zh.wikipedia.org/wiki/拉丁化新文字.
IA said,

December 22, 2021 @ 7:37 pm

Some persons might benefit from reading this: 'Against the Myth of the Russian-Cyrillic Origin of Certain Pinyin Letters', which covers more topics than the title suggests. https://www.angelfire.com/pop2/pkv/myth.html
Chris Button said,

December 22, 2021 @ 9:23 pm

Ah ok, so it’s bopomofo-like in that regard then. Guess I should have looked it up on Wikipedia …
IA said,

December 22, 2021 @ 10:45 pm

And, on top of the graphemic distinction of 尖團, Beila also differentiated between -e and -o where Pekinese phonology (written as HP) only has -e. (Cf original Wade-Giles, which also had the distinction of some sort or another, for example「客」with -ê, vs「課」 with -o.)

This means that corresponding to Pekinese (here in HP) 'jue', Beila had four different (written) syllables: gyo, gye, zyo, zye.

(Remember, the letter 'y' was used for 'u' umlaut.)
Terpomo said,

December 22, 2021 @ 11:10 pm

@Victor Mair
Why need one be a genius to learn GR? The fundamental concepts of it are not difficult at all; it may be tricky at first to memorize the rules for how different types of syllables are respelled, but it quickly becomes quite intuitive, especially if you're learning the language at the same time or already have some grasp of it. I didn't even find it particularly difficult to pick up Chao's General Chinese, and I don't think I'm a genius.
@J. W. Brewer
I believe there's at least one Braille scheme for Mandarin that treats x j q as allophones of h g k, and then Palladius Cyrillization analyzes them as allophones of s z c. Personally I think the latter analysis makes more sense for a few reasons.
IA said,

December 22, 2021 @ 11:33 pm

Chart in middle of https://www.angelfire.com/pop2/pkv/ipa.html shows what various systems (including the Russian, French, and Italian) have done about the vexing problem of [ɿ] and [ʅ]. (The German Lessing-Othmer is missing, but it used 'ï' for these.)

Supposing a system does not differentiate 尖團, and supposing it recognises as mandated the pronunciations [tɕ] [tɕʰ] [ɕ] for the now merged series, how are they to be written? With their own, independent symbols (as do HP and Zhuyin Fuhao)? Or, are they to be treated as conditioned allophones (occurring only before [i] and [y]) of one of the following sets of initials, with which they are in complementary distribution?

[k][kʰ][x~h]
[tʂ][tʂʰ][ʂ]
[ts][tsʰ][s]

Don't think that the last one is only hypothetical. In Riedlinger's magnificent book 'Likbez — Alphabetisierung bei den sowjetischen Dunganen seit 1927 und ihr Zusammenhang mit den Latinisierungsbestrebungen in China' there is an example of Beila by some residents of Beiping. Since the 尖團 differentiation was not something they had in their own speech, and apparently not of a mind to bother with normative Beila, they simply used z- c- s- everywhere to represent their own monolithic [tɕ] [tɕʰ] [ɕ]. Thus, where normal Beila would have 'zgi' （「自己」）, they had 'zzi'.
IA said,

December 22, 2021 @ 11:47 pm

I'm looking at the 'Latinxua Xanz Pinjin Biao' in 《陈望道语文论集》.

In it:

zyo 爵
zye 絕
gyo 覺
gye 決

Better than my idle speculation, Chris Button will be able to tell us the actual historical-linguistic/'dialectal' background to this -e / -o difference.
Victor Mair said,

December 22, 2021 @ 11:57 pm

I knew for sure that some smart aleck would say you don't have to be a genius to learn GR, but the only people I know who learned it well — and they are few and far between — are extraordinarily smart, unlike for Hanyu Pinyin, which just about anyone can learn with a minimum of effort.
Terpomo said,

December 23, 2021 @ 1:11 am

Mightn't the fact that GR is used much less often have anything to do with it, since it means people have less need to learn it? After all, millions of people of average intelligence do manage to learn far worse orthographic systems, such as those already in use for English or Thai.
IA said,

December 23, 2021 @ 2:02 am

For what it's worth, in 顧黔、RV Simmons《漢語方言共同音系研究》 —

(I give here Beila spelling/representative morpheme mentioned in previous post, followed by the coda for that morpheme given in their book: )

zye 絕 -yet
gye 決 -yet
gyo 覺 -eok
zyo 爵 -iok (嚼)

(Their 'y' is [y], and the 'e' in -eok just means that a velar initial is palatalised in the North … something to that effect.)
IA said,

December 23, 2021 @ 3:31 am

Here below is the article 'KY CIUBAI TUNGZH' (i.e. 瞿秋白同志 ) from the book 'Beifangxua Sin Wenz Koben' （倪海曙編輯, 1950）.

By this time what Hanyu Pinyin always (?) has as 'de' was sometimes 'de', sometimes '-d', sometimes -di. This was for syntactical reasons I have forgotten after 20 years.

'Solian' is because of 'Soviet Lianbang'. (Cf 'Moskwa' or 'Gwoyeu Romatzyh' with the same principle of conservation of the original spelling (including Latin alphabet transcription) of a loanword. This, regardless of how it might be adapted to native phonology when spoken. Consider the English vs French pronunciations of 'Paris'.)

************

Ky Ciubai t-zh, ta sh Giangsu Changzhourhen, 1800 n. sheng.

Ta siao de shxou, giali xen kyng, tad mucin sh inwei bei shengxo bide meijou banfa, chle xungtou-xuochai zsha de.

1920 n., ta zai Beiging cangia "5-4" Xyosheng Yndung, xoulai bei Beiging Chenbao cingky dang gizhe, bingcie pai ta dao Moskwa ky kaocha Oguo de geming, ta zai nabian ba Oguo de geming dui guonei zole xen xaod baogao.

1923 n., ta xui dao Zhungguo lai, cangia Zhungguo Gungchandang Di-3 C Cyan Guo Daibiao Dahui, dangsyan zhungyang weiyan.

1927 n., zai cingsuan Chen Dusiu gixuizhuji lusian de 8-7 huiji ixou, ta danrhen Dang-Zhungyang de shugi. 1928 n., ta daibiao Zhungguo Gungchandang dao Solian ky cangia Gungchan Guozi Di-6 C Daxui. Zai zhego shki litou, ta iu zinxingle fandui go guo shexui minzhudang xo go guo gungchandang neibu gixuizhuji lusian de douzheng, bingcie kaish iangiu Zhungguo wenz Latinxua de wenti, kicaole zui zaod Zhungguo Latinxua Sin Wenz fang'an.

1930 n., ta xui dao Shangxai, zai wenxua ssiang zhansian shang zinxingle xuixuangd gungzo. Ta sh Zhungguo geming wenxyo de zhujao chuangzaozhe litou de igo, tungsh ie sh Zhungguo Latinxua Sin Wenz de chuangzaozhe. Tad ji pian guanjy Zhungguo wenz gaige de zhumingd lunwen, du sh zai zhego shki zhung sie de.

1933 n., ta dao Giangsi ky cangia Soviet de gungzo, danrhen Zhungguo Soviet Zhengfu de giaoybu buzhang. 1935 n., ta zai Fugian bei Guomindang de gyndui fulo, zai dirhen de fating xo gianjy zhung chungfen biaoxianle igo gungchandangjyan de gang-tie izh xo ingxyng kigai. Zuixou, ziu zai zhe i nian de 6 ye 18 rh, ta ingjyngdi wei geming xianchule zgid shengming.
IA said,

December 23, 2021 @ 3:33 am

Correction: 1899 of course, not 1800.
Victor Mair said,

December 23, 2021 @ 8:58 am

GR was the official National Language Romanization of the Republic of China. It was promoted with the wonderful Gwoyeu Tsyrdean, a marvelous dictionary of the National Language that I still regularly use, and other publications. In later times, I myself promoted GR with my own academic writings and the journal called Shin Tarng (name changed to Xin Tang starting from the fourth issue, because our readers, most of whom were in China, told us that they didn't like / couldn't master GR), which was partially supported by enlightened benefactors such as Elling Eide. Nonetheless, GR did not catch on because it was too hard for ordinary mortals. Do not bring up the quirks of English and Thai, which evolved organically within natural languages. They are irrelevant in this discussion which is about Romanizations invented from scratch.

The same sort of non-usage pertains to sìjiǎo hàomǎ 四角號碼 (four corner method), the brilliant character lookup system, which was also a product of the Republic of China era. It too was promoted by the publication of valuable research tools such as the indices to the shí tōng 十通 (ten traditional encyclopedic works) from Shāngwù yìnshū guǎn 商務印書館 (Commercial Press). From the time I first went to China in 1981, I met many scholars who realized the value of these research tools, but would always ask me to look things up in them because they were not able to master the Four Corner system. In my whole career, I only encountered four people who were capable of readily looking things up with the four corner method, and they were all laowai ("old furriners").
J.W. Brewer said,

December 23, 2021 @ 11:22 am

One seeming oddity struck me (after a bit of delay) about the vintage "wall newspaper" pictured in the original post. It's not that the text is in latin-alphabet letters, it's that it's written in cursive. This is not a style I'm used to seeing used for transliterated Sinitic, and I likewise don't remember seeing cursive romaji during my childhood years in Tokyo. Maybe it is (or used to be) a thing that I just haven't been exposed to or don't recall, or maybe this was an odd outlier. I don't have enough of a sense of when cursive is and isn't used in Cyrillic to know if the Soviet influence might have been a factor here.
Chris Button said,

December 23, 2021 @ 12:27 pm

If you're willing to accept syllables that don't contain phonemic vowels at all, as Sin Wenz makes explicit, you can analyze Standard/Northeastern Mandarin as having just two vowel phonemes.

Or even no vowels at all, depending on your theoretical persuasion! The key point is that you need to treat “vowels” as syllabic forms of sonorants—something familiar from Proto-Indo-European.

In my opinion, the concept of “vowel” is really only useful for phonetic analyses. Once you get into phonology, it stops being a useful category. Unfortunately, the concept of consonants vs vowels is too entrenched in the discipline of linguistics for it to be overhauled any time soon.
Antonio L. Banderas said,

December 23, 2021 @ 4:28 pm

@C Button

Standard Chinese can be analyzed as having between two to six vowel phonemes
https://en.wikipedia.org/wiki/Standard_Chinese_phonology#Vowels
Chris Button said,

December 23, 2021 @ 5:16 pm

I was thinking of some of the discussions by Pulleyblank.

Incidentally, for what it's worth, I would analyze Burmese as having a vertical "vowel" system too. Pulleyblank wrote a paper about Burmese once too in that regard, although I would analyze things slightly differently in places (he wasn't a specialist in Burmese).
David Marjanović said,

December 23, 2021 @ 6:45 pm

It's not that the text is in latin-alphabet letters, it's that it's written in cursive. This is not a style I'm used to seeing used for transliterated Sinitic

This is what handwritten Latin letters look like.

Only Americans draw printed letters by hand. That seems to be because they're taught downright calligraphy in the 3rd grade instead of handwriting in the 1st.

What is interesting is the complete absence of capital letters, but that's shared with the initial versions of some of the Early Soviet orthographies in the Latin alphabet, so I do suspect specifically Soviet influence there. That said, the Cree orthography insists on not using uppercase, too.

The key point is that you need to treat “vowels” as syllabic forms of sonorants—something familiar from Proto-Indo-European.

Doesn't work for *e and *o in PIE, let alone for *ē and *ō (which weren't completely predictable in PIE anymore, although it doesn't take a lot of internal reconstruction to make them disappear).

For Mandarin, I can't see how you could get to one or zero vowel phonemes without adding epicycles that don't correspond to any phonetic reality. One such attempt was once made for a West Caucasian language: the difference between /ə/ and /a/ was reinterpreted as a difference between consonants lacking or having an "openness" feature (thus doubling the already huge consonant inventory). If you can see a parsimony-based reason to do that, please tell me, because I can't.
Chris Button said,

December 23, 2021 @ 7:08 pm

If I were looking into it, I would ponder two things:

1. The actual nature of “e” and “o” in PIE—vertical vowel systems are an established reality, horizontal ones are not

2. Schwa as a default feature of syllabification—that leaves just “a”
julie lee said,

December 23, 2021 @ 9:39 pm

Amusing tidbit:

Just now I went to the Chinese Wikipedia (Weiji Baike) to look up the Four Corner Method (四角號碼) of looking up characters, and found the Wiki. article written in Cantonese! Every now and then I'll find a Chinese Wiki. article written in Cantonese instead of the usual Mandarin. This is like finding an article in the English-language Wiki. written in Gaelic instead of English. I interpret these Cantonese entries as acts of defiance.
Jonathan Smith said,

December 23, 2021 @ 11:15 pm

Wow; thanks for this. It would be cool if more "Beila" documents were readily available online… lots of interesting features from a linguistic point of view of which a couple random ones are "du" not "dou" consistently for the word 'all' (e.g. line 2 in image) and "xuan" at times (?) for the word(s) 'still', MSM (pinyin) hai but in some northern varieties still ~han or the like (e.g. line 11 in image).
Jonathan Smith said,

December 23, 2021 @ 11:22 pm

And the contrasts g/z k/c x/s where pinyin has j q x are indeed fascinating — the implication would seem to be that velars have palatalized before i/y but remain readily interpretable as allophones of the velars; that is, as IA suggests, a "tuan/jian" contrast survived among investors and early users of Beila. Incidentally such a situation would readily account for spellings like "Peking" and the rest without requiring phonetically velar articulations for k, etc.
Jonathan Smith said,

December 23, 2021 @ 11:23 pm

*inventors
Chris Button said,

December 24, 2021 @ 8:30 am

zyo 爵
zye 絕
gyo 覺
gye 決

Better than my idle speculation, Chris Button will be able to tell us the actual historical-linguistic/'dialectal' background to this -e / -o difference

Sorry, I missed this earlier. That's not something I know much about. However, Pulleyblank's reconstruction of Early Mandarin in his Lexicon (based on the Mengu Ziyun from 1308 and the Zhongyaun Yinyun from 1324) offers the following bolded forms below:

zyo 爵 tsjɛw
zye 絕 tsɥɛ
gyo 覺 kjaw
gye 決 kɥɛ

So we're still several centuries out, but the correspondence patterns are clear.
Chris Button said,

December 24, 2021 @ 8:31 am

*Zhongyuan
Chris Button said,

December 24, 2021 @ 9:01 am

*Menggu (Lol, I really miss that old editing feature)
Philip Taylor said,

December 24, 2021 @ 10:33 am

David M. — Could you possibly clarify what you meant by "Only Americans draw printed letters by hand. That seems to be because they're taught downright calligraphy in the 3rd grade instead of handwriting in the 1st" ? In the first sentence, for example, where is the stress — is it on "printed" or on "hand" ? And what do you mean by "downright calligraphy" ? I ask because as a Briton, I print rather than "use joined-up writing" when I handwrite because I switched school so many times in my formative years that I never developed the skill of doing "joined-up writing". By contrast, my maternal grandfather was taught copperplate at school, and his handwriting was a sheer joy to see (and he was politely horrified by my failure to grasp what was to him a basic and necessary skill).
B.Ma said,

December 24, 2021 @ 3:11 pm

@julie lee

You do know there is a Cantonese wikipedia separate from the "Chinese" (i.e. Standard Written Chinese) wikipedia?

There are also Wikipedias in Shanghainese, Minnan, Mindong, Gan, etc. but the Cantonese one seems to show up in search results a lot more (since I don't know any other topolects I don't visit the others, except to admire the Minnan one which is written in Romanization).
Jerry Packard said,

December 24, 2021 @ 6:36 pm

“…the relationship between orthography and Mandarin phonology that I've puzzled over before. In bopomofo ㄒ, ㄕ, and ㄙ are three different glyphs representing three different syllable-initial sibilants, which are generally mapped to three different latin-alphabet equivalents (e.g. x, sh, and s, respectively, in hanyu pinyin). ….
… So my question is what motivated the designers of bopomofo and the various romanization schemes to treat these as three different consonants that needed to be distinguished from each other orthographically rather than a single phoneme whose realized pronunciation would vary (predictably) by context depending on the following vowel etc? “

In the field of Chinese phonology this is usually termed “the problem of the palatals”. The multiple complementation problem is that the palatals only ever occur before [i] and [y], while the apicals and retroflexes never do. The problem is solved by getting rid of the palatals, and saying that they are actually gi, ki and hi everywhere. The retroflexes and apicals are then uniquely distinguished by zhi, chi, shi in pinyin, thereby getting rid of the ‘illegal’ vowels with extremely restricted distribution.

I think the reason it was never implemented in pinyin is because although it was a fine phonemic analysis, it was deemed not user-friendly. In addition to the giang, kiang etc. problem, ‘zhi’ etc. would have to drop the i to go from zhiang to zhang.
David Marjanović said,

December 24, 2021 @ 6:58 pm

I'm pretty sure the Chinese Wikipedia (zh.wikipedia.org) is blocked in the PRC, so its very existence is already defiance.

If I were looking into it, I would ponder two things:

1. The actual nature of “e” and “o” in PIE—vertical vowel systems are an established reality, horizontal ones are not

Indeed, it is likely that that was actually vertical at some point. The Anatolian reflexes of *o are all unrounded – a and e in different languages –, and the Tocharian ones – likewise a and e – are unrounded as well, so while *o was probably a rounded back vowel in Proto-Indo-Actually-European, there's no reason to think it was also rounded in Proto-Indo-Anatolian.

The Hittite outcome of stressed *ō does seem to have been u, i.e. /o/. But [a aː] becoming [æ ɒ] is a very common sound change, so it's very weak evidence that the stressed *o was rounded, too.

2. Schwa as a default feature of syllabification—that leaves just “a”

This is easily ruled out for PIE (i.e. all of Proto-Indo-Anatolian, Proto-Indo-Tocharian and Proto-Indo-Actually-European) because there was pretty clearly no schwa in syllables that contained a resonant but neither *e nor *o, and when *e or *o was present, its position was not usually predictable: *CReC, *CeRC, *CRoC, *CoRC and *CRC were five phonemically different syllables.

Could you possibly clarify what you meant by "Only Americans draw printed letters by hand. That seems to be because they're taught downright calligraphy in the 3rd grade instead of handwriting in the 1st" ?

From what I've gathered, they have a separate subject called "cursive" that only starts in the 3rd grade – so the pupils have been writing for years by printing fluently – and puts great emphasis on getting the prescribed shapes of the letters exactly right. Those shapes have a lot of curls. The teaching of this subject is frequently accompanied by rather blatant lies like "you'll need this later". It is widely hated, and again and again there are calls to abolish it.

My experience, in Austria in the late 80s*, was close to the opposite: I was never taught to print, and hardly expected to even have that ability. When we were taught a letter, we were taught how to read the printed versions (Druckschrift, "printing-script") and how to write the joined-up version (Schreibschrift, "writing-script"). Very soon, or so I think in retrospect, the teachers stopped caring about the exact shapes of our letters and only minded if our writing became illegible or ugly (with a great tolerance for pretty but illegible writing). Reading about the American experience in the early 2010s was a massive culture shock for me.

I've blathered about this before in a LL thread I can't find through Google. :-( Anyway, the wording "draw printed letters by hand" is meant to express how utterly alien the concept of printed handwriting is to me.

* And again for Russian in the late 90s.

except to admire the Minnan one which is written in Romanization

Specifically in Taiwanese "Church" Romanization.
Chris Button said,

December 24, 2021 @ 9:22 pm

@ David M

Indeed, it is likely that that was actually vertical at some point

Now we're talking.

… because there was pretty clearly no schwa in syllables that contained a resonant but neither *e nor *o …

Or rather the schwa was specified as "e" or unspecified as "zero". I think you might enjoy Pulleyblank's 1965 paper "The Indo-European Vowel System and the Qualitative Ablaut" and his continuation of the discussion as part of his 1993 paper "The Typology of Indo-European".
julie lee said,

December 24, 2021 @ 10:17 pm

@B.Ma:

Thank you! I knew there are Wikipedias in all sorts of languages but didn't know there were Wikipedias in so many Chinese languages or topolects.
Rodger C said,

December 25, 2021 @ 10:24 am

I was taught writing the way David says, including that foofy "Palmer" script with its ridiculous exercises ("PUSH! PUSH! PUSH!"). My execution was incurably ugly from the outset. At age 20 I taught myself Italic chancery hand, which appealed to me by its crispness and its obvious relation to "printed" letters. I still use it on the increasingly rare occasions on which I use cursive at all. I find that a number of authors I admire also picked up Chancery at some time.
David Marjanović said,

December 25, 2021 @ 1:47 pm

Or rather the schwa was specified as "e" or unspecified as "zero".

What do you mean? Isn't the specification is phonemic, then?

I should have spelled out why I'm so sure "there was pretty clearly no schwa in syllables that contained a resonant but neither *e nor *o": things like the continuation of syllabic /r/ as such (in most positions) in Sanskrit, and more importantly the change of syllabic nasals to /a/ in Indo-Iranian and Greek, which makes perfect sense if an intermediate [ə̃] was involved; there are nice parallels in Bavarian-Austrian dialects.

I think you might enjoy Pulleyblank's 1965 paper "The Indo-European Vowel System and the Qualitative Ablaut" and his continuation of the discussion as part of his 1993 paper "The Typology of Indo-European".

I haven't been able to get the 1993 paper, but I just found, downloaded and read the 1965 one. It makes some of the usual arguments that the *e-*o contrast was vertical at some point, and makes an interesting proposal for what the original morphological function of the *e-*o ablaut was. I've never seen that proposal mentioned elsewhere, so it does seem to be underresearched. However, it does not get rid of the fact that *e and *o contrasted not only with each other, but also, unlike in Kabardian, with zero. Shifting the problem to a late pre-PIE stage, so we can ignore words like *septḿ and *wĺkʷos, does not seem to help. Current work on how PIE stress worked and how it relates to zero-grade hypothesizes, without running into further complications, that *e-*o ablaut was fully morphologized and therefore phonemic when the zero grade was still a living phonological process: pretonic *e, pretonic *o, posttonic *o followed by resonants (IIRC) and posttonic *e (except in the invariant nominative count-plural ending *-es) were deleted. On this I recommend the academia.edu pages of Andrew Byrd, Anthony Yates and Ronald I. Kim among others.

This conference handout does propose a phonetic origin for *e-*o ablaut, but in a way that does not imply that there ever were fewer than two vowel phonemes: there was a length distinction in the vowel system, and then all long vowels merged as (what later became) *o while all short ones merged as *e, very similar to what Indo-Iranian did later.

(Note that there's Uralic-internal as well as typological evidence that the Uralic unstressed non-low vowel, represented as *i in the handout, was actually [ə].)
David Marjanović said,

December 25, 2021 @ 2:28 pm

Isn't the specification is phonemic, then?

…Isn't the specification phonemic, then?

No, conditioned by stress, as Pulleyblank's 1965 paper implies. I wrote that before I actually looked for and read the paper, and then forgot to edit it out. Anyway, conditioning *e-zero by stress doesn't work because *o and zero also alternate.
IA said,

December 26, 2021 @ 2:32 am

@ Chris Button on December 24, 2021 @ 8:30 am

Thanks for your helpful reply.

The possible relationship of Beila to Shandong has been noted or speculated about before, I believe. I notice just today through use of the interactive tools at http://sino.kaom.net that there is a strong correspondence of the Beila written forms zyo (for 爵), zye (for 絕), gyo (for 覺), gye (for 決) to actual pronunciation in parts of coastal eastern Shandong; in at least one case even a complete match. (Aside from the similar finals, the 尖團 distinction of course is there as well.)

Here: http://sino.kaom.net/si_word.php .

Unfortunately the information is spotty. No Dongbei (part of which anyway is part of that same eastern Shandong language-area).
IA said,

December 26, 2021 @ 2:45 am

two book covers:

'Sin Vensh Kuben' i.e.《新文字課本》as written in 江南新文字. (Note that the 'h' in 'sh' has nothing to do with retroflexing. It indicates that the 's' in front of it is voiced — or perhaps more correctly described as 清音濁流 .)
http://image.yjcf360.com/20180405/29196d1c7c414127a99172251251479d.png

'Zhonguo Latinhuadi Zemu' :
http://shorturl.at/lANSW

This latter is the first* pre-version of Beila by Qu Qiubai (with the assistance of the sinologist and linguist, VS Kolokolov, aka 郭質生), written at the end of 1929 and published 1930.

Among many other differences from finalised Beila in this booklet, those especially striking to the eye may be:

1) The digraph 'ng' was not used at all; 'ń' was used instead, and in some syllables it could be dropped. This is why 'Zhonguo' is written with one 'g', yet no acute over the 'n'. (Note that '-on' ≠ '-un'; the latter's 'n' is [n], not [ŋ].)

2) After zh ch sh jh z c s, 空韻 ([ɿ] / [ʅ]) is indicated by 'e'. Thus 'she' = HP 'shi'; 'shé' = HP 'she'. (After other letters, the diacritic over 'é' is dropped, since 空韻 cannot occur there and there is no need for differentiation.)

3) The letter 'h' is used, there is no 'x'.

4) -ae, not -ai is used.

5) -iou, not -iu is used.

*I say ' "first" pre-version' because QQB later has a second 'pre-version' of Beila: 新中国文草案.
Chris Button said,

December 26, 2021 @ 7:27 am

@ David M

That’s why I recommended the 1993 paper with its more considered discussion almost 30 years later. Personally, I have more issue with the idea of “a” insertion than any partial specification of schwa.
J.W. Brewer said,

December 26, 2021 @ 2:03 pm

To follow up on my own question upthread about bopomofo's three-way distinction among ㄒ, ㄕ, and ㄙ, while as Jerry Packard notes it is followed by hanyu pinyin (and also by Wade-Giles), I am interested to learn based on poking around pinyin.info that some other romanization systems did opt to collapse three orthographic representations into two, because of the allophone thing, but did not all do so the same way.

Consider the bopomofo almost-minimal-trio ㄒㄧㄣ (the "new" morpheme mentioned above), ㄕㄣ, and ㄙㄣ. In hanyu pinyin they're xin, shen, and sen (hsin, shen, and sen in Wade-Giles). But in MPS2 and GR (first tone only …) they're shin, shen, and sen, collapsing ㄒ and ㄕ into "sh," while in Tongyong they're sin, shen, and sen, collapsing ㄒ (at least before ㄧ) and ㄙ into "s." Yale has syin, shen, and sen, but is congruent with Tongyong if you analyze "syin" as s/yi/n rather than sy/i/n, which probably makes more sense to a Sinophone than to an Anglophone.
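
For anyone who wants to see the collapses side by side, here is a minimal Python sketch that just tabulates the spellings given above. The split of each syllable into (initial, rest) is done by hand and is an assumption of the sketch, as are the names TRIO, SYSTEMS and merged_initials; it covers only this one trio and is not a general romanization converter.

```python
# The bopomofo initials of the trio ㄒㄧㄣ / ㄕㄣ / ㄙㄣ, in that order.
TRIO = ("ㄒ", "ㄕ", "ㄙ")

# Spellings as listed in the comment above, split by hand into
# (initial, rest). The split itself is an assumption of this sketch.
SYSTEMS = {
    "Hanyu Pinyin": [("x", "in"), ("sh", "en"), ("s", "en")],
    "Wade-Giles": [("hs", "in"), ("sh", "en"), ("s", "en")],
    "MPS2": [("sh", "in"), ("sh", "en"), ("s", "en")],
    "GR (first tone)": [("sh", "in"), ("sh", "en"), ("s", "en")],
    "Tongyong": [("s", "in"), ("sh", "en"), ("s", "en")],
    "Yale (sy/i/n)": [("sy", "in"), ("sh", "en"), ("s", "en")],
    "Yale (s/yi/n)": [("s", "yin"), ("sh", "en"), ("s", "en")],
}


def merged_initials(syllables):
    """Return the groups of bopomofo initials that share one romanized initial."""
    groups = {}
    for bopomofo, (initial, _rest) in zip(TRIO, syllables):
        groups.setdefault(initial, []).append(bopomofo)
    return [tuple(group) for group in groups.values() if len(group) > 1]


for name, syllables in SYSTEMS.items():
    spelled = ", ".join(initial + rest for initial, rest in syllables)
    merged = merged_initials(syllables)
    print(f"{name}: {spelled}; merged initials: {merged or 'none'}")
```

On the s/yi/n reading, Yale comes out with the same merger as Tongyong; on the sy/i/n reading it keeps all three initials apart, like hanyu pinyin and Wade-Giles.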
Jerry Packard said,

December 26, 2021 @ 6:56 pm

@ J.W. Brewer

Thanks for that great post.
David Marjanović said,

December 29, 2021 @ 5:59 am

Personally, I have more issue with the idea of “a” insertion than any partial specification of schwa.

What do you mean by "a" insertion? It could refer to several different interesting things… :-)
Philip Taylor said,

December 29, 2021 @ 4:27 pm

David M. — Thank you for your explanation of "Only Americans draw printed letters by hand. That seems to be because they're taught downright calligraphy in the 3rd grade instead of handwriting in the 1st". Co-incidentally I received a handwritten poem at Christmas, with beautiful calligraphy, and it was only on the second or third reading that two things struck me :
1. It is printed, not cursive (every letter is separate)
2. The poet/calligrapher frequently (but not invariably) separates tall punctuation from the preceding letter by a thin space.
Scanned copy here.
David Marjanović said,

December 29, 2021 @ 6:16 pm

That is beautiful!

The hair spaces preceding punctuation used to be much more common. I've recently seen them in print. In French, larger spaces are even considered obligatory – MS Word automatically adds non-breaking spaces every time you type tall punctuation if you set the language to French.
IA said,

December 30, 2021 @ 5:08 am

@ James Griffiths on December 22, 2021

Yes indeed. It is often forgotten that until the time Maoism-in-power hijacked the word 'Putonghua' and made it identical in meaning and intent to the KMT's Dictatorship of 'Guoyu', 'Putonghua' had meant, since 1906 when 朱文熊 used it, the mutually intelligible lingua franca, *not* some fussily, narrowly prescribed standard phonology based on the speech of Peking.

(In 《國語留聲片》, the pedagogic phonograph recordings he made in 1922 for 老國音 (which was pronunciation-prescriptionist, but not Pekinese), YR Chao referred by name to 'Putonghua'. How so? Denigratingly, as a 「南腔北調」 without definition. QQB, on the contrary, also refers to it as 「南腔北調」 (and as 「藍青官話」), but *positively*. (Historical irony: democrat as authoritarian; Stalinist as libertarian.))
Chris Button said,

December 30, 2021 @ 8:32 am

@ David Marjanović

What do you mean by "a" insertion? It could refer to several different interesting things

I mean the way in which Pulleyblank discusses the role of "a" in his 1965 and 1993 papers.
David Marjanović said,

December 30, 2021 @ 11:48 am

Oh, this?

In Indo-European studies it is usual to regard "e" as representing the normal grade, but it might well be simpler and more elegant to follow the example of the Indian grammarians and regard the zero grade as fundamental and "e" as a guṇa grade, appearing only in specified circumstances, to bear the accent or between consonants of certain types or at certain fixed intervals in a string of consonants.

(1965, p. 94)

That suggestion has not been taken up, firstly because it's not (for the most part) predictable whether the e-grade of *CRC is *CeRC or *CReC (and likewise the o-grade *CoRC or *CRoC), secondly because the hypothesis of stress-dependent reduction of both *e and *o (with various complications) has recently turned out to work quite well, even though work on this only started in earnest in 2010.

As I've said before, the following, from the same page:

An alternation in Kabardian morphemes between zero (realized as ə when stressed, etc.) and a is therefore phonologically precisely parallel to the so-called "e-o" alternation in Indo-European.

doesn't work: ə and 0 may be the same in Kabardian, but they clearly weren't in PIE.

The idea presented on the following pages, that *o */a/ could have been a morpheme with about the same set of functions as /a/ in Kabardian, is very interesting, but really only as a hypothesis about how *o could have become morphologized, not so much as a hypothesis about how it could have originated. And indeed, Zhivlov's idea (linked above) that *o is the merger product of all long vowels of a pre-PIE stage, where vowel length itself had a phonological origin from a yet earlier stage, seems to work stunningly well, even though it hasn't made it to peer-reviewed publication yet.
Chris Button said,

December 30, 2021 @ 3:14 pm

@ David M

I think you are talking about the specification of schwa again. As I said before, you should read the 1993 paper and indeed Eric Hamp's response.

My issue is not with comparisons with Kabardian, but with "a" as an infix that marks an "introvert" distinction. I find the evidence wanting.
Chris Button said,

December 30, 2021 @ 3:28 pm

–wanting in Old Chinese and, I suppose, PIE (to the second part of your post), although I hesitate to comment on areas where I lack any formal training.
