Pure Pinyin


A father speaks

[This is a guest post by Alex Wang, following up his remarks in "Learning to read and write Chinese" (7/11/16).]

The more I learn Chinese in order to teach my younger son to read and write it, the more I realize, for lack of a better word, how “ridiculous” it is for a “significant / modern” country to use such a reading and writing system. Perhaps I am wrong because I’m not informed.

To provide some background, I grew up speaking only Chinese in the house. I went to Saturday school for a few years to learn a little reading and writing, but had mostly forgotten all of it by the time I came to Shenzhen 9 years ago. I did not learn pinyin; I was taught Bopomofo, which I have forgotten entirely. I say this so that you understand my relative fluency in the spoken language. As for reading characters, I can now recognize perhaps several hundred.

I have looked at and tried reading pure pinyin texts like Pīnyīn Rìjì Duǎnwén by Zhāng Lìqīng.

http://www.pinyin.info/readings/pinyin_riji_duanwen.html

I have to say that, because I know the spoken language, I can read and understand more in pure pinyin than I can in characters.

What’s even more amazing is that I can understand most without understanding the tonal diacritics.

I have also asked some adult friends and their children to read the links and provide feedback.

Many who are character literate said it takes more time, but is still relatively easy.

It is no surprise that the children can read the pure pinyin faster.

I would have to believe that if children are taught pure pinyin and nothing else from a young age, they should be able to read it as fast as we read English. I’d have to believe reading the diacritics would become second nature if taught at a young age.

As I have neither enough background nor enough experience in linguistic studies, I wanted to know your opinion. I do not have as extensive a vocabulary in Chinese. Do you and others in the field believe that pinyin alone can be used? Is digraphia just a suggested compromise for the sake of culture? From Language Log I would think the answer is yes, but perhaps I misunderstood. I wanted to understand what the difficult parts would be for reading and writing if only pinyin were used. That is assuming a child only learns pinyin and is provided with many pure pinyin books. If there is a link for such reading material, that would be appreciated. Also, if pure pinyin is taught, what would you do with proper nouns? Would you use the pinyin for the loan words or do something else? Would they be written / spelled so as to resemble the language of origin as closely as possible? Can you provide any pinyin-only websites?

I bolded the previous paragraph so that my questions don't get lost in my other ramblings.

I have about 50 staff here in Shenzhen and every day I ask 1 or 2 if they can read the pinyin story at pinyin.info. Interestingly enough, I also asked my kids’ haircutter at the salon, and I think with many it’s psychological. Because it’s strange, it’s very hard at first.

The phenomenon is like one I have encountered in taxi cabs. I have some western-looking friends, as in white and black, here in Shenzhen. Some of them have very good spoken Chinese without much accent. However, when we jump into the taxi and they say the location, the taxi driver can't understand them. Perhaps the brain at first doesn’t register due to their non-Asian face. When they say it again, it begins to register. Then when I ask them during the ride in Chinese was it hard to understand. They reply no and many honestly say they didn’t expect it. It’s like the brain says he will be speaking English since he is non-Asian looking.

I think those to whom I give more information before asking them to read (like “here is what you are about to see”), rather than just “hey, I was wondering, can you read this?”, do much better on the initial look.

Many of the older 30-40 year old women who taught their kids do comment that pure pinyin perhaps is much better for learning as they are the ones who suffered having to scream the bihua [VHM:  "strokes"], etc.

As for my two boys, I have decided that I have no “demand” for excellent grades in written vocabulary exams as I do in math, science, English, etc. I do not believe the incremental time needed to score an A or even a medium B is worth it. I do want my kids to know I have high expectations, but seriously, the negative aspects of the lost time and happiness are not worth it to me. My older son, now 8, uses a word processor to write Chinese. My only demand is that he really learns the pinyin for writing and will still be able to read the characters. (My compromise to the language, learning to read the characters, is inefficient but bearable.) He has many Chinese books, so he reads and learns new characters, and he writes my father a daily email in Chinese so that he uses the pinyin.

As for my younger son, now just 5, his English has overtaken his Chinese, even though he has grown up in a Chinese environment. He spends more time each day learning Chinese, and yet his English reading ability has surpassed his Chinese. This happened even though he had a year's head start reading Chinese. He and I both loathe the process of learning how to write Chinese. At times I wish I had never investigated this and taken part in his learning. They say ignorance is bliss; if I hadn’t investigated, it wouldn’t have bothered me so much how much time my kids are “wasting” and, more important, how much they are suffering. They love to read the National Geographic for Kids in English but are now loath to read Chinese books on the same topics that don’t have the pinyin. Proper nouns of people, places, things, etc. are such a hassle. The kids have decided.

[NOTES by VHM:  1. Alex has several times asked me for information about the best input systems for Pinyin only typing.  Some of them are already on the Pinyin Literature Contest website, but I will be happy to send more suggestions to anyone who requests them.  2. I have answers for the questions Alex posed in the bolded paragraph, but will refrain from replying to them until Language Log readers have a chance to express their opinions.]



63 Comments

  1. liuyao said,

    October 15, 2016 @ 10:37 pm

    I wondered about Alex Wang's background, and it seems he meant that he grew up speaking MANDARIN at home, probably in the United States, and was taught bopomofo and some characters at Saturday schools before Pinyin became more popular there.

    I do not know of any pinyin-only website (other than the Riji duanwen), and that underscores the problem. He may get away with his children not memorizing and writing Chinese characters by hand (by not going to standard Chinese schools), but to completely do away with even reading (i.e. recognizing when seeing) Chinese characters, and to bet that China or Taiwan will become pinyin-only in 10 or 20 years, is very risky for their future.

  2. Jim Breen said,

    October 15, 2016 @ 11:16 pm

    Re: "Then when I ask them during the ride in Chinese was it hard to understand. They reply no and many honestly say they didn’t expect it. It’s like the brain says he will be speaking English since he is non-Asian looking."

    This is a well-known phenomenon in Japan too. There are countless stories of fluent Japanese speakers getting blank looks and "No English" responses. I suspect it's fading as more and more foreigners in Japan speak Japanese. I also noticed when I was living in Tokyo that the "No English" response was largely absent in my rather working-class suburb, but more common in the touristy areas like Shibuya and Meguro.

  3. Alex said,

    October 16, 2016 @ 12:54 am

    Although I wish for a future China of just pinyin, rest assured I am pragmatic and realise that for my kids it will happen too late. As I said, a "low B" is fine on the writing, and that is my pragmatic compromise. I won't go crazy, as many do, with the bihua bihua. My son goes to the public school here for the reason you and I both stated. Both can read; even the younger one, 5, has a vocabulary of about 1000 characters. (Kids really do learn and retain a lot faster and more than 46 year olds.) Growing up I "wasted" a lot of time learning and practicing cursive, so some loss is inevitable and a necessary evil. On the lack of reading material, I hope to be a part of a solution on that. I do believe that's very important.

  4. Alex said,

    October 16, 2016 @ 12:57 am

    @jim breen

    Is there a word for this in any language?

  5. Alex said,

    October 16, 2016 @ 1:20 am

    @liuyao

    Thinking about it, the risk is that they would be like me and many other fluent English-speaking people. That said, I do believe in reading a lot of Chinese books when young to broaden the mind and to master the language composition-wise. One needs to read a lot to be able to compose/write well in any language, even if the "writing" of characters ends up being inputted via pinyin on a word processor. I view knowing both languages well as an advantage, especially for many careers. I want them to have that advantage. My issue is far more with having to be able to hand write something than with reading, especially at that age.

  6. Bathrobe said,

    October 16, 2016 @ 4:38 am

    This is a well-known phenomenon in Japan too

    If I remember rightly, Jack Seward, who wrote some popular books on Japanese way back in the 60s, called this phenomenon the baka valve. That's the 'idiot valve'. To reset the 'baka valve' he recommended drastic measures. One of these was to be 'caught out' writing the Imperial Rescript on Education, a difficult Meiji-era document that all pre-war children had to learn. He did this one time to a maid who apparently refused to understand his Japanese. Unable to contain her curiosity, she had to have a look at what the foreigner was writing. When she saw what it was there was an audible intake of breath and an immediate apology.

  7. Bathrobe said,

    October 16, 2016 @ 4:44 am

    Let me add that writing the Imperial Rescript on Education might not work nowadays. Any Japanese who saw a gaijin writing it might be highly impressed with the musty old knowledge he/she had mastered but arguably less willing to grant that he/she could actually speak the language.

  8. Guy_H said,

    October 16, 2016 @ 4:55 am

    Is *reading* pinyin text actually quicker/easier than reading chinese characters? I'm assuming the choice must be highly personal. If given a choice between reading romanized text and character text, I would always choose the latter. It feels so much faster and no chance of ambiguous readings. I actually find large slabs of pinyin text quite daunting, though perhaps that is because it is so unusual. (Writing is a different matter for me, where character amnesia is rampant & pinyin wins hands down).

  9. Jim Breen said,

    October 16, 2016 @ 5:35 am

    @Alex. I'm not aware of a term for it.

  10. E. T. said,

    October 16, 2016 @ 6:46 am

    Firstly, I must say I agree wholeheartedly with the idea of learning to read characters while not learning to write them by hand. Writing characters in the day of pinyin input methods is really just reading – recognizing the correct characters for the word you have typed out. Learning to write by hand is not worth the time investment in my opinion, and in fact I basically boycotted all 听写 quizzes while I was learning Chinese. This has not hampered my ability to read any kind of text in Chinese.

    As to your actual questions, proper names in Pinyin are written capitalized as in English, and loanwords should reflect the pronunciation. It is very common for languages to change the spelling of loanwords to reflect local orthography, and for a nearly perfectly phonetic script such as Pinyin this is essential.

  11. Victor Mair said,

    October 16, 2016 @ 7:24 am

    Mental block; psychological barrier.

    When it comes to fluent speakers of Mandarin professing that they cannot read Pinyin, it's a common phenomenon. I have written about it here:

    "Pinyin memoirs" (8/13/16)

  12. Victor Mair said,

    October 16, 2016 @ 7:36 am

    @Guy_H:

    I read Pinyin texts as easily as I read English texts and as rapidly as I do Chinese character texts, but that's probably because I was used to working with "Pure Pinyin" for decades when I edited Xin Tang.

    @E. T.

    "…I basically boycotted all 听写 quizzes while I was learning Chinese."

    Bravo for you! My wife told me that her very best / smartest students at Harvard and elsewhere did exactly the same thing. They were more interested in learning the language than they were in slaving over the refractory writing system.

    For tīngxiě 听写 ("dictation"), see (among other posts):

    "The future of Chinese language learning is now" (4/5/14)

    "Spelling bees and character amnesia" (8/7/13)

    And especially:

    "How to generate fake Chinese characters automatically" (12/30/15)

  13. Victor Mair said,

    October 16, 2016 @ 9:17 am

    Concerning the lack of reading materials in Pinyin, that is definitely a problem in popularizing it as an alternative / complement to Hanzi texts. But that situation will change within five years, and people like Alex Wang will be part of the solution.

    I should also note that the organizers of the Pinyin Literature Contest (see also here) are already — a year before the deadline — receiving excellent submissions. When published, these will constitute a substantial body of original, attractive Pinyin reading materials in a variety of genres.

    One thing we're noticing with the entries that have come in so far is that the authors are writing orthographically proper Pinyin, but without tones. While many of my colleagues, especially those who are Mandarin teachers, would prefer tones, I have a feeling that the way this is going to work out in practice is that people will naturally tend to gravitate to writing pinyin texts without tones except for disambiguation when necessary. There are several reasons for this:

    1. They feel that it is extra work to add them.

    2. They believe that it is unnecessary to have them, i.e., they think that, since they can read the texts without all of the tones marked, why should they go to the trouble of adding them.

    3. They recognize that:

    a. speakers are often idiosyncratic / idiolectal or topolectal in their use of tones

    b. intonation often overrides tone

    For standardization and ease of writing with tones, I think it would help greatly if the following type of Pinyin inputting were developed:

    1. The writer simply types in orthographically correct Pinyin (i.e., with spaces between words) but without tones.

    2. The system adds the tones automatically.

    3. The writer checks and revises.

    This is essentially what Google Translate already does, but there may be other systems out there that also do something similar or even better. I would appreciate notice of all superior Pinyin inputting systems. Eventually, we will compile a list of the best software and make it available on various websites.
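The three-step workflow described in the comment above (type toneless Pinyin, let the system add tones, then check and revise) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `LEXICON` table and the function name `add_tones` are invented for the example, and a real input method would need a full dictionary and context-sensitive disambiguation rather than simply taking the first candidate.

```python
# Hypothetical mini-lexicon (an assumption for this sketch):
# toneless spelling -> candidate spellings with tone marks.
LEXICON = {
    "hanyu": ["Hànyǔ"],
    "pinyin": ["pīnyīn"],
    "xuexi": ["xuéxí"],
    "shi": ["shì", "shí", "shǐ"],  # ambiguous without context
}


def add_tones(text):
    """Add tone marks word by word.

    Returns (toned_text, review), where review lists the input words
    the writer should check: unknown words and ambiguous ones.
    """
    toned = []
    review = []
    for word in text.split():
        candidates = LEXICON.get(word.lower())
        if not candidates:
            # Step 3 falls to the writer: leave the word as typed.
            toned.append(word)
            review.append(word)
        elif len(candidates) == 1:
            toned.append(candidates[0])
        else:
            # Take the first candidate as a guess and flag it.
            toned.append(candidates[0])
            review.append(word)
    return " ".join(toned), review


toned_text, to_check = add_tones("xuexi hanyu shi")
print(toned_text)  # words with tones filled in
print(to_check)    # words flagged for the writer to revise
```

Because the input is already segmented into words by spaces, the lookup works on whole words rather than single syllables, which is what keeps the number of ambiguous candidates small.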

  14. cliff arroyo said,

    October 16, 2016 @ 11:08 am

    "This is a well-known phenomenon in Japan too. "

    I had the same thing happen to me in Mexico.

  15. cliff arroyo said,

    October 16, 2016 @ 11:11 am

    ". If given a choice between reading romanized text and character text, I would always choose the latter. "

    How much of that is simply being used to characters?

    A few years ago I tried to learn some Vietnamese in characters (there was a site with some graded lessons that has, I think, since disappeared) and found it much more difficult and confusing than simple, clear Quoc Ngu… (just to balance the usual 'characters are easier than pinyin/kana' argument)

  16. Jon Forrest said,

    October 16, 2016 @ 12:16 pm

    For a very funny example of non-native looking fluent speakers having trouble (in Japan) see https://www.youtube.com/watch?v=oLt5qSm9U80 .

  17. liuyao said,

    October 16, 2016 @ 1:15 pm

    Just a quick thought, what if we start writing wikipedia in pinyin, just like the simple English wikipedia https://simple.wikipedia.org/wiki/Main_Page aimed at children and adult learners?

    The fact that no one has started doing it is very telling, given that there are pages written in Wu and Gan topolects, not to mention Cantonese (all in characters).

    One could even imagine toggling between pinyin with tones and pinyin without tones, just like the toggles between simplified and traditional characters currently in place.
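The toggle imagined in the comment above only needs to work in one direction if the text is stored with tones: the toneless view can be derived mechanically. A minimal Python sketch follows, assuming the tones are written with the four standard Pinyin diacritics; the function name `strip_tones` is made up for the example.

```python
import unicodedata

# Combining marks used for the four Pinyin tones:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(text):
    """Return the text with tone marks removed.

    The diaeresis on ü is a different combining mark, so it survives.
    """
    # NFD splits each accented letter into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # NFC recombines what is left (e.g. u + diaeresis back into ü).
    return unicodedata.normalize("NFC", kept)


print(strip_tones("Xiàndài Biāozhǔn Hànyǔ"))
```

Going the other way, from toneless text to text with tones, is not mechanical, which is why the stored version would have to be the one with tones.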

  18. Wang Yujiang said,

    October 16, 2016 @ 1:35 pm

    @Alex Wang
    In the sixth paragraph of OP
    “What’s even more amazing is that I can understand most without understanding the tonal diacritics.”
    The reason is that when we speak, frequently we are not according to “four-tone” which is for reading a single character only.

  19. AntC said,

    October 16, 2016 @ 6:11 pm

    @VHM [and several others make the same point] the authors are writing orthographically proper Pinyin, but without tones.

    So the tones can in general be guessed from context. This is similar to omitting nikudim in writing Hebrew. The extra marks are included for children/learners.

    But Mandarin already suffers from many homophones (going by Victor's many posts). If you omit the tones from pinyin, doesn't that increase the ambiguity?

    In contrast, English orthography has a childish obsession with pernickety rules (there/their/they're), which don't reliably distinguish homophones (write/rite/right/right/right/right) and don't avoid homographs (red/read/read).

  20. Jenny Chu said,

    October 16, 2016 @ 7:12 pm

    @AntC – this struck me, too. I sincerely hope that if Pinyin is more widely adopted for mainstream writing, that the tones will be included. Compare Vietnamese: without diacritics, it can be read, and even quickly, but it's really annoying. Typing with ordinary word processing, especially in "telex" style, is extremely fast and simple. There's no excuse for leaving out tones and other diacritics!

  21. Alex Wang said,

    October 16, 2016 @ 7:30 pm

    I think a good test of a written language is how clear it can be when writing/reading legal contracts. Would pure pinyin cause any issues with the many homophones even with using yin biao? Mainly are there many related words that have the same exact pinyin but not character that could cause confusion even in a contextual setting? I wouldn't know since my oral and reading vocabulary is limited and haven't looked at legal Chinese lingo enough so hoping someone with the background can answer the question. I understand things can be rewritten in pinyin to not cause confusion but would it be "weird", not to say legalese isn't weird in English.

    On loanwords being written closer to the native language with no corresponding character, there lies the slippery slope that I'm sure the government is worried about, even though the goal would be, like Vietnam and Korea, to one day have characters for art's sake.

    I guess I'm trying to think things through and understand more. I have read many of the previous comments and articles on homophones but still can't get a good sense since I don't know the language well enough. Can someone let me know if there are other languages that have close to the same number of homophones?

  22. Elessorn said,

    October 16, 2016 @ 8:22 pm

    @Bathrobe

    I remember that Seward story! Has that ever happened to you?

  23. Victor Mair said,

    October 16, 2016 @ 11:24 pm

    There will be a Pinyin Wikipedia in due course.

    If Mandarin or other Sinitic language had an excessive amount of homophony, people would not be able to hold intelligible, meaningful conversations in it.

  24. Alex said,

    October 17, 2016 @ 12:01 am

    "If Mandarin or other Sinitic language had an excessive amount of homophony, people would not be able to hold intelligible, meaningful conversations in it."

    Never thought about it that way :-) makes sense.

    Thanks,

  25. cliff arroyo said,

    October 17, 2016 @ 1:07 am

    "The reason is that when we speak, frequently we are not according to “four-tone” which is for reading a single character only."

    I think the problem here is in seeing pinyin primarily as a heuristic tool (and the subsequent idea of including tonal sandhi).

    I think the two most optimal solutions are

    Mostly not writing tones at all (which increases homography)

    Always writing the tone a particular morpheme has in isolation (as far as that is feasible). This decreases homography while also reducing the number of rules a writer needs to learn.

    This latter is the system used by Vietnamese, whose speakers have led the most successful transition from characters to a phonemic(ish) system. The Vietnamese system uses tone markings as an integral part of the orthography (and thus reflects changes in speech less than optimally, but helps create a maximum number of visually distinct syllables, which probably has processing benefits).

  26. Àilì Hēijiāo said,

    October 17, 2016 @ 3:04 am

    per Professor Mair, list of Pinyin test Wikipedia articles.

  27. Wang Yujiang said,

    October 17, 2016 @ 7:47 am

    @Alex said
    There is not any homophony or homonym in spoken language including Chinese and English. The homophony is in written language only.
    Most homonyms are monosyllable. Because Chinese characters are monosyllable, they have excessive amount of homonyms. English has many monosyllable homonyms too.
    After adopting pinyin, Chinese will solve this problem like English.

  28. Alex said,

    October 17, 2016 @ 8:22 am

    @ Wang Yujiang

    "Because Chinese characters are monosyllable, they have excessive amount of homonyms"

    Thanks, feedback like this really helps me understand things. So it would seem, from glancing over the rules for writing pinyin, that putting the pinyin together into words should clarify most ambiguity. For example, elephant, which is composed of two characters, would be written daxiang (that is, if I am understanding this rule correctly: "Write words as complete words, not as bro ken syl la bles.").

    @Àilì Hēijiāo
    thanks for that link! that's super.

  29. Victor Mair said,

    October 17, 2016 @ 4:40 pm

    @Àilì Hēijiāo

    A zillion thanks for those Wikipedia test articles.

  30. Eidolon said,

    October 17, 2016 @ 5:44 pm

    "Is *reading* pinyin text actually quicker/easier than reading chinese characters? I'm assuming the choice must be highly personal. If given a choice between reading romanized text and character text, I would always choose the latter. It feels so much faster and no chance of ambiguous readings. I actually find large slabs of pinyin text quite daunting, though perhaps that is because it is so unusual. (Writing is a different matter for me, where character amnesia is rampant & pinyin wins hands down)."

    It would be an interesting study for a Ph. D student in Chinese linguistics, and that's an invitation.

    Besides the mental block and lack of material, however, I would actually offer that there is a bit of a structural problem with writing Mandarin in pinyin, when compared to English. In English, words typically have a length and form to them that is, if not unique, at minimum well-distinguished. Take what I just wrote as an example. Each block of letters – that is, each word – is sufficiently unique such that even if you didn't know English, you could scan the blocks rapidly and pick out the differences.

    Compare that to this:

    Xiàndài biāozhǔn hànyǔ, yě chēng pǔtōnghuà, guóyǔ huò huáyǔ, shì yīzhǒng guǎngfàn tōngxíng yú dà zhōnghuá, huárén yǔ hànzú shè qún de biāozhǔn yǔ, zì 20 shìjì yǐlái zài huárén dìqū tōngxíng.1923 Nián zhōnghuá mínguó jiàoyù bù guóyǔ tǒngyī chóubèi huì dì wǔ cì huìyì juédìng jīyú xiàndài běifāng guānhuà de báihuàwén yǔfǎ hé běijīng huà yǔyīn zhìdìng,1932 nián jīng zhōnghuá mínguó jiàoyù bù bānbù “guó yīn chángyòng zìhuì” hòu, bèi cǎinà wéi zhōngguó de guānfāng yǔyán, dāngqián zài zhōnghuárénmín gònghéguó, zhōnghuá mínguó, xīnjiāpō děng guó jūn wèi guānfāng yǔyán zhī yī, yěshì dōngnányà jí qítā hǎiwài huá rén qúntǐ guǎngfàn cǎiyòng de gòngtōng kǒuyǔ huò shūmiànyǔ; qiě zuòwéi liánhéguó liù zhǒng guānfāng gōngzuò yǔyán zhī yī, chéngwéi guójì rénshì xuéxí hànyǔ de zhǔyào cānzhào.

    A quick observation would be that this latter passage has many more similar length letter blocks, due to 70-80% of Mandarin words being monosyllabic or bisyllabic. But it's not just the length of the word that is at issue – there are many more similar looking syllable blocks, too. In English, we often take liberties with the writing of such blocks, especially with respect to homophones and near homophones. For example, we distinguish between "dam" and "damn" orthographically, even though phonetically they are virtually the same. Many more examples can be found here for British English: http://www.singularis.ltd.uk/bifroest/misc/homophones-list.html. This is the benefit – or curse, depending on where you stand – of centuries of alphabetic tradition. Pinyin, absent such tradition, is also absent of such distinctions, by and large. A "zhi" is a "zhi" is a "zhi," so to speak, made worse by the lack of tone marks, without which I found the above passage to be incomprehensible unless read slowly and deliberately.

    But if you do use tone marks, that imposes an additional burden on reading comprehension, and I find myself having to enunciate each word from the pinyin passage before understanding it, whereas with English and characters I can typically scan. This is probably capable of being fixed with time, since reading comprehension is ultimately facilitated by the long-term learning of visual patterns. This is why, after all, we're able to read a lot faster than we can talk or listen. Yet it still raises the problem of whether word symbols in pinyin Mandarin are distinct enough for efficient pattern learning. More research is definitely needed.

  31. Victor Mair said,

    October 17, 2016 @ 6:54 pm

    @Eidolon

    Your Pinyin text has no capitalization except on the very first letter. The presence of a full complement of punctuation will help to make reading faster, surer, and easier.

    Another factor that will alleviate the alleged problem of a predominance of one and two syllable words is that new types of word formation and borrowing will arise, leading to words of variable length, a length not determined by the old "one syllable one character" metric.

    You talk about "centuries of alphabetic tradition" as being a key factor in what makes English writing work efficiently. Well, as you also recognize, Pinyin — thus far — essentially has no tradition behind it. Give it some time, my man.

  32. Wong Chao said,

    October 17, 2016 @ 7:47 pm

    Hi, Alex, I haven't read all the comments, so I have no idea whether my concern has been raised yet. Pure pinyin will not function, even if it seems more efficient for kids or adults to read, due to the fact that across many regions of China people pronounce the same character in horribly different ways, a situation more complicated and subtle than a matter of accent or dialect as far as I know. This unique linguistic phenomenon is considered one significant factor in the integrity of Han culture, in that it allows geographically scattered people to be united in written communication, thereby securing the rule of the central power.

    I hope this comment will help in thinking of the possibility of Pure Pinyin reform (which has been well discussed already in Wusi Period 五四时期 by those famous intellectuals. ).

  33. Victor Mair said,

    October 17, 2016 @ 9:12 pm

    @Wong Chao

    Since the Republican period, it has always been the nationalistic goal to unify the pronunciation of Modern Standard Mandarin (MSM), and great strides have been made in that direction. My students come from all over China, and they all speak standard MSM.

  34. JS said,

    October 17, 2016 @ 9:40 pm

    @Àilì Hēijiāo, many thanks, I had no idea this existed. Do you or does anyone know to what extent production of these transcriptions (assuming that's how they're done) is automated? They are superb; the one small thing I noticed is wei2 'be, etc.' is consistently being represented as wei4 (like Eidolon's "…jūn wèi guānfāng yǔyán zhī yī" from the putonghua article). Re: tonal diacritics, as I've said many times, they're excluded (like occasionally in this comment) for reasons of convenience, but arguments that the phonemes they represent are in some sense less substantial or more changeable than the segments with which speakers of Western languages are more familiar are wrong. Wang Yujiang: phonemic-level representation FTW. Especially as we should conceive of written pinyin also as a teaching tool, they must be included whenever resources permit.

  35. JS said,

    October 17, 2016 @ 9:43 pm

    To put a finer point on the last sentence: toneless pinyin would only be for fluent speakers of Mandarin. Even advanced students would have tremendous difficulties reading such alphabetic texts aloud. What a shame that would be.

  36. Àilì Hēijiāo said,

    October 17, 2016 @ 9:43 pm

    if you do use tone marks, that imposes an additional burden on reading comprehension

    This seems like a damned-if-you-do situation (not to be confused with dammed-if-you-do). Indicating tones is a “burden” even though it’s “incomprehensible” without them?

    I’m not disagreeing, but how is that any harder than no capitalization or word boundaries and only the slimmest connection between phoneme and grapheme?

    The comparison, really, is characters, not other languages written with alphabets.

    And, of course, simplified Chinese does compound the number of homographs.

    There may be languages written with the Roman alphabet even easier to read than English. Who cares? Should we start writing English with hanzi, then?

  37. liuyao said,

    October 18, 2016 @ 12:42 am

    Eidolon's example and analysis are great! I was also trying to scan for key words and had the same reaction.

    For longer words Chinese has a trove of four-character set phrases, and I believe the ABC dictionaries tried to hyphenate in the middle.

  38. liuyao said,

    October 18, 2016 @ 1:04 am

    Let me capitalize the paragraph (and correct some typos)

    Xiàndài Biāozhǔn Hànyǔ, yě chēng Pǔtōnghuà, Guóyǔ huò Huáyǔ, shì yīzhǒng guǎngfàn tōngxíng yú Dà Zhōnghuá, Huárén yǔ Hànzú shèqún de biāozhǔn yǔ, zì 20 shìjì yǐlái zài huárén dìqū tōngxíng. 1923 nián Zhōnghuá Mínguó Jiàoyù Bù Guóyǔ Tǒngyī Chóubèi Huì dì wǔ cì huìyì juédìng jīyú xiàndài Běifāng Guānhuà de báihuàwén yǔfǎ hé Běijīng huà yǔyīn zhìdìng. 1932 nián jīng Zhōnghuá Mínguó Jiàoyù Bù bānbù “Guóyīn chángyòng zìhuì” hòu, bèi cǎinà wéi Zhōngguó de guānfāng yǔyán, dāngqián zài Zhōnghuá Rénmín Gònghéguó, Zhōnghuá Mínguó, Xīnjiāpō děng guó jūn wéi guānfāng yǔyán zhī yī, yěshì Dōngnányà jí qítā hǎiwài Huárén qúntǐ guǎngfàn cǎiyòng de gòngtōng kǒuyǔ huò shūmiànyǔ; qiě zuòwéi Liánhéguó liù zhǒng guānfāng gōngzuò yǔyán zhī yī, chéngwéi guójì rénshì xuéxí hànyǔ de zhǔyào cānzhào.

    I'm not sure about the gòngtōng in the last sentence. Is it 共通 or something else?

  39. cliff arroyo said,

    October 18, 2016 @ 6:34 am

    I did some fooling with it. In marking tones is it really necessary to distinguish first and zero tones? I removed all the first tone markers (44 in all) and the result looks like this. Maybe a bit less cluttered.

    Xiàndài Biaozhǔn Hànyǔ, yě cheng Pǔtonghuà, Guóyǔ huò Huáyǔ, shì yizhǒng guǎngfàn tongxíng yú Dà Zhonghuá, Huárén yǔ Hànzú shèqún de biaozhǔn yǔ, zì 20 shìjì yǐlái zài huárén dìqu tongxíng. 1923 nián Zhonghuá Mínguó Jiàoyù Bù Guóyǔ Tǒngyi Chóubèi Huì dì wǔ cì huìyì juédìng jiyú xiàndài Běifang Guanhuà de báihuàwén yǔfǎ hé Běijing huà yǔyin zhìdìng. 1932 nián jing Zhonghuá Mínguó Jiàoyù Bù banbù “Guóyin chángyòng zìhuì” hòu, bèi cǎinà wéi Zhongguó de guanfang yǔyán, dangqián zài Zhonghuá Rénmín Gònghéguó, Zhonghuá Mínguó, Xinjiapo děng guó jun wéi guanfang yǔyán zhi yi, yěshì Dongnányà jí qíta hǎiwài Huárén qúntǐ guǎngfàn cǎiyòng de gòngtong kǒuyǔ huò shumiànyǔ; qiě zuòwéi Liánhéguó liù zhǒng guanfang gongzuò yǔyán zhi yi, chéngwéi guójì rénshì xuéxí hànyǔ de zhǔyào canzhào.
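The edit described in the comment above, removing only the first-tone markers and counting how many were removed, is easy to reproduce. Here is a minimal Python sketch, assuming the first tone is written with a macron as in standard Pinyin; the function name `drop_first_tone` is invented for the example, and nothing here shows how the commenter actually did it.

```python
import unicodedata

MACRON = "\u0304"  # combining macron, the first-tone mark


def drop_first_tone(text):
    """Remove first-tone macrons only.

    Returns (new_text, number_of_macrons_removed). Second, third and
    fourth tone marks, and the diaeresis on ü, are left untouched.
    """
    # NFD separates base letters from their combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    removed = decomposed.count(MACRON)
    stripped = decomposed.replace(MACRON, "")
    # NFC puts the remaining accented letters back together.
    return unicodedata.normalize("NFC", stripped), removed


new_text, removed = drop_first_tone("Zhōnghuá Mínguó de guānfāng yǔyán")
print(new_text)
print(removed)
```

Run on the full capitalized paragraph, the second return value gives the number of first-tone marks removed, which can be checked against the figure quoted in the comment.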

  40. Victor Mair said,

    October 18, 2016 @ 6:47 am

    Thank you very much for that bit of fine tuning, cliff arroyo. The addition of capitals certainly does speed up and sharpen our reading.

    As for removing first tone markers, I'd like to hear the opinions of others on that. Would it not lead to confusion between first and neutral tone?

  41. 艾力·黑膠(Àilì Hēijiāo) said,

    October 18, 2016 @ 7:59 am

    Eidolon:

    The article you pasted has very poor Pinyin. Bear in mind that most Chinese speakers have very little experience with languages that use spacing and capitalization, and a good deal of it is, as JS alluded, written by editors who are still “thinking in Written Chinese” as it were, with all the visually-bound terseness and influence from Classical Chinese that entails.

    JS:

    Who cares about non-fluent speakers when devising an efficient orthography? Should Hebrew and Arabic start marking all their vowels now for the benefit of “advanced students?” (Well, maybe, but it’s a marginal use case for such a sweeping reform.)

    I had no idea it existed either until Professor Mair’s heads-up; I must’ve referenced the en: Pinyin article half a dozen times in the past year and never noticed the “Pinyin test of Wikipedia” bug under the infobox (I usually expect it under External links but w/e).

    I don’t think they’re being run through converters (or else it’d have more articles in two years, sheesh!); the ideal is wholly Mandarin compositions but I do think some editors are just trusting whatever their IMEs spit out (in terms of capitalization, word boundaries; see above). Although automatic conversions from zh: could be a very useful starting point for editors to go back and clean up by hand, as Dr. Mair has pointed out in the rules for the Pinyin Literature Contest, if you convert from SWC, it will likely be noticeable. Bot-generated content has jumpstarted the growth of several minority-language wikis and anecdotally, I’ve noticed that momentum seems to favour existing pages; readers will get pissed off at a shitty article and click the Edit button out of spite, whereas fewer will go to the trouble of creating an article from scratch to fill a perceived lacuna.

    After the Mandarin Wikipedia gets a userbase of active editors I'd love to see a Mandarin Wikinews site get off the ground, which could be an excellent source of relevant, realtime content in Pinyin. (I see Cantonese and Shanghainese/Wu already have Wikinews projects on Wikimedia Incubator.)

  42. Àilì Hēijiāo said,

    October 18, 2016 @ 8:07 am

    liuyao:
    Thank you! Please update the article with your fixes! :)

  43. Chris Kern said,

    October 18, 2016 @ 9:02 am

    Wouldn't a good portion of the "homonyms in writing" problem be solved just by making writing closer to speech, and avoiding the literary/classical terms that are only possible to use because of the characters?

    As for tone marks, foreigners of course would welcome them, but I can understand that native speakers might find them less necessary, and cumbersome to use all the time. But I remember John DeFrancis writing that Vietnamese writing always uses the tone marks?

  44. Wang Yujiang said,

    October 18, 2016 @ 9:56 am

    Every language is learned. Practice makes perfect.
    Which is easier, pinyin or characters? It depends on how familiar you are with them.

  45. Wang Yujiang said,

    October 18, 2016 @ 9:59 am

    If you use pinyin as IPA, four-tone markers would help you sometimes. If you use pinyin as a written language, they would be a burden.

  46. Alex Wang said,

    October 18, 2016 @ 10:00 am

    @wong chao
    " Pure pinyin will not function even it seems more efficient in reading for kids or adults, due to the fact that in far more regions of China, people pronounce the same character in horribly different ways, situations more complicated and subtle than the matter of accent or dialect as far as I know."

    Actually I think quite the opposite. I think having the pinyin tightens the pronunciation. I don't see how the character adds any value. Once again, this is based on my personal experience.

    The first time I tried durian (liu lian) was here. The person who introduced it to me said niu (almost like in "cow") lian. It was an elder extended family member, and it seems like everyone in the family said it incorrectly. Now if I had just seen it spelled liu lian, problem solved. It's like mishearing the words of a song: we have probably all had that embarrassing moment where we sing it wrong and our friends laugh. Clarity happens when we look up the lyrics. Now how would I have ever known I was saying it wrong just based on the character if I had never seen that character before? So if a teacher in some remote area says it incorrectly, everyone in the class will. Whereas if it were phonetic-based, sometimes logic would prevail. (Please don't give English examples of words; I know there are many, but I think you get my point.)

    The most wide spread word I have found mispronounced is for blood. Ever since I was growing up, and even widely here in Shenzhen, many people pronounce it like the word for snow. My son's tutor corrected my son, his teacher verified it, and then he corrected me. I was amazed. Half the office pronounced it one way, the other half properly. That said, perhaps it's just like having a southern accent, but liu lian versus niu lian certainly isn't just accent.

    Back then voice recordings weren't cheap or widely available, so I'd imagine standardizing pronunciation would have been even harder.

  47. JS said,

    October 18, 2016 @ 11:02 am

    @艾力·黑膠(Àilì Hēijiāo)

    Your point is interesting as lack of vowel marking in the Semitic cases goes back to Egyptian hieroglyphs, just as non-inclusion of tonal diacritics in writing Mandarin alphabetically is not a design decision per se but is rather a product of the exigencies of the QWERTY keyboard.

    However, an argument could be made that abjads are a reasonable fit for Semitic typology, whereas no such argument exists for toneless Mandarin. Sound design — respect for language structure — would call for tone marking just as is done for Vietnamese, Hmong, Zhuang, Amoy, etc. While I take your point that learners need not be a focus, it would seem that good design and accessibility to learners are in many cases going to be the same thing.

    But of course preexisting conditions are real, and what happens with pinyin, happens. People (like Wang Yujiang) are used to reading and producing toneless text and find tone marks irritating. Maybe this perspective will carry the day.

  48. Alex Wang said,

    October 18, 2016 @ 11:45 pm

    Can someone tell me if there are any Chinese-language children's stories online that have expired copyrights or are in the public domain? I am building a pinyin website organized by age of reader, starting with children and broken down into preschool through primary.
    Thanks

  49. Àilì Hēijiāo said,

    October 19, 2016 @ 3:07 am

    Alex Wang:

    I see dialect levelling as a bug, not a feature.
    What, exactly, makes non-standard native pronunciations “incorrect?”
    Who decides that, after school tutors?

    JS:

    I am about to wade deeply into Talking Out My Ass territory, as I don’t really know anything about Mandarin, let alone Vietnamese, but hubris compels me to take a crack at this anyway.

    non-inclusion of tonal diacritics in writing Mandarin alphabetically is not a design decision per se but is rather a product of the exigencies of the QWERTY keyboard.

    I don’t know that that’s true. First of all, Vietnamese speakers on early computer networks resorted to workarounds to represent tone. If Mandarin speakers haven’t done the same, I would submit this is due to lack of necessity, not lack of ingenuity. Mandarin’s tones vary by dialect and can shift in colloquial speech; Vietnamese has a more complex tone system than MSM, or at least I think there are more of them. Either way, people find toneless pinyin comprehensible.

    Furthermore, Vietnamese orthography does seem to be broken down at the syl la ble level, resulting in more ambiguities for individual units. I don’t know enough about Vietnamese morphology but it’s likely that if words were written as words, instead of as syllables, much tone marking could become optional. “Ma” may mean a lot of things, but no one is going to be confused about the meaning of “Zhongguo” (or “pīnyīn!”), diacriticals or no.

    And I can’t speak for Hmong, Zhuang, or Amoy, but I do know that Chinese-illiterate Cantonese speakers have resorted to English-based ad hoc romanizations with no tone indications and high degrees of comprehension. Again, context does tons of heavy lifting.

    Not all tonal languages are created equal, and many get by fine without explicitly marking such in writing.

    I should point out that I’m getting a lot of this based on Prof. Mair’s research into Dungan, which if u haven’t read Dungan + Chinese script reform you probably want to check out.

  50. Wang Yujiang said,

    October 19, 2016 @ 8:44 am

    @Alex Wang said
    Since you are in Shenzhen now, you could buy Chinese textbooks (Yuwen语文) for elementary school, and put the word lists together.
    It is imperative to compile a dictionary of Chinese pinyin, or a pinyin word list at the beginning. Several years ago, I was planning to do this project.
    Waiting for your pinyin website.

  51. Wang Yujiang said,

    October 19, 2016 @ 8:56 am

    @Alex Wang said
    “The most wide spread word I have found mispronounced is for blood.”
    血 is a polyphonic character (多音字). Both “xie” and “xue” are correct pronunciations. Many Chinese characters are polyphonic. Therefore, we urgently need a dictionary of Chinese pinyin.

  52. Alex Wang said,

    October 19, 2016 @ 9:30 am

    @Wang Yujiang

    Thanks for the response. I have all the first through sixth grade books that the kids use in school here. For the website I would like stories or passages. I know I can easily add a dictionary, as many are public domain (no copyright); however, the reading passages in the class book might be copyright protected.

    For example, the first grade shang (first volume) book has a reading passage to teach reading. I am only typing the pinyin, but the book has the pinyin with the characters underneath.

    ye ye he xiao shu

    wo jia men kou you yi ke xiao shu
    dong tian dao le
    ye ye gei xiao shu chuan shang nuan hou de yi shang

    etc etc

    I am hoping to find public domain material, as creating stuff takes time, my Chinese isn't good enough, and I am not an educator. I am sure these stories were well thought out for teaching purposes.

    I hope to also have on the site the advised way of typing the pinyin. Clearly in these primary school books every character's pinyin is separated, like dong tian whereas I think according to suggestions it should be dongtian?

    The website we are creating will only have pinyin but upon mouse over it will have a bubble that contains the Chinese characters. This way people who know the characters can try to read without the characters but if stuck can mouse over and see. Figuring out the best IT way to do this.

    Thanks I hope there is more feedback!
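
    One simple way to get the "pinyin with a character bubble" behaviour described above is the standard HTML `title` attribute, which most desktop browsers show as a tooltip on hover. The Python sketch below only generates such markup; it is an illustration under that assumption, not the design the site will actually use, and the function name `annotate` and the sample word pairs are made up for this example.

```python
import html


def annotate(pairs):
    """Build HTML where each pinyin word carries its characters as a tooltip.

    `pairs` is a list of (pinyin, characters) tuples supplied by an editor.
    """
    spans = []
    for pinyin, hanzi in pairs:
        spans.append(
            '<span title="{}">{}</span>'.format(
                html.escape(hanzi, quote=True), html.escape(pinyin)
            )
        )
    return " ".join(spans)


if __name__ == "__main__":
    # Sample data for the line "dong tian dao le", written as words.
    line = [("dongtian", "冬天"), ("dao", "到"), ("le", "了")]
    print(annotate(line))
```

    A plain `title` tooltip needs no scripting, but it does not appear on touch screens, so a tap-to-show bubble would need a different mechanism.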

  53. Alex Wang said,

    October 19, 2016 @ 9:51 am

    @Wang Yujiang

    As an aside, these "Yuwen语文" books have not been updated since 2001 for the first grade book and 2003 for the third grade. According to the staff at my son's school, only typos have been corrected. For my older son's third grade Yuwen语文, I think it's crazy that they make them memorize and mo xie (write out from memory) passages of Tang dynasty poems. Needless to say, his tutor taught him the thoughts behind the poems and I didn't require him to mo xie.

  54. Alex Wang said,

    October 19, 2016 @ 10:15 am

    @Àilì Hēijiāo

    Thanks for the link to dialect leveling; it was very interesting! It will take some (a lot of) time for me to digest.

  55. Wang Yujiang said,

    October 19, 2016 @ 11:15 am

    @Alex Wang said
    1. “like dong tian whereas I think according to suggestions it should be dongtian?”
    You are right. 冬天 should be “dongtian”, which is in the dictionary 现代汉语词典. “dong tian” is a misspelling.
    2. “The website we are creating will only have pinyin but upon mouse over it will have a bubble that contains the Chinese characters. ”
    For polysyllable word, no problem, I think. For monosyllable word, or a single Chinese character, it would be difficult because of homophone at present. After having a pinyin dictionary, this problem will be solved eventually.

  56. Wang Yujiang said,

    October 19, 2016 @ 11:24 am

    @Alex Wang said
    “ye ye gei xiao shu chuan shang nuan hou de yi shang.”
    yeye, nuanhou, and yishang should each be written as a single word.
    Yeye gei xiao shu chuan shang nuanhou de yishang.
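
    The correction above amounts to joining syllables that form a dictionary word. A minimal sketch of that step, assuming a pinyin word list is available (the tiny `LEXICON` set and the function name `join_words` below are made up for this example and stand in for a real dictionary), is a greedy longest-match pass over the syllables:

```python
# Stand-in for a real pinyin word list; only the three words from the comment.
LEXICON = {"yeye", "nuanhou", "yishang"}


def join_words(text, lexicon=LEXICON, max_syllables=4):
    """Join syllable-spaced pinyin into words using greedy longest match."""
    syllables = text.split()
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, falling back to a single syllable.
        for n in range(min(max_syllables, len(syllables) - i), 0, -1):
            candidate = "".join(syllables[i:i + n])
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return " ".join(words)


if __name__ == "__main__":
    line = "ye ye gei xiao shu chuan shang nuan hou de yi shang"
    joined = join_words(line)
    print(joined[:1].upper() + joined[1:])
```

    Greedy matching is only as good as the word list behind it, and syllables that could join in more than one way would still need a human editor to decide.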

  57. Eidolon said,

    October 19, 2016 @ 4:59 pm

    "The comparison, really, is characters, not other languages written with alphabets."

    I'd argue it's both, because the argument for using pure pinyin becomes a lot weaker if it turns out that it's much worse than other alphabetic scripts, even if it is better than characters. Such a discovery would push for the development of a new script specific to Mandarin via the hindsight argument that the English alphabet is not an effective fit for Mandarin, which most Chinese would, again in hindsight, assume to be common sense.

    Pinyin is a young script. Its creation & adoption was almost strictly a consequence of recent history & politics rather than because the underlying alphabet was specifically engineered for Mandarin. Alternatives to pinyin include bopomofo, which but for the success of a certain political party in the 1940s, would have taken the place of pinyin in the Sinitic world. After all, during the original commission of the national language in 1912, only 2 out of 17 delegates supported the use of the Latin alphabet for Chinese, while 15 supported either the use of Chinese characters for their sounds, or the creation of a new alphabet for Chinese. The triumph of pinyin under the PRC was in no way inevitable.

    So while I do recognize the fact that pinyin is the most practical option, today, for writing Mandarin with an alphabet, it was not always so, and may not always be so in the future. Political fortunes wax and wane, and since the PRC is in no hurry to replace characters, there's plenty of room to examine & study the efficiency of pinyin as a stand alone language, regardless of characters.

  58. Thorin said,

    October 19, 2016 @ 6:15 pm

    @Jim Breen

    Regarding your comment about it being a frequent occurrence in Japan, I sometimes get it when I speak Arabic. I remember one time a man from Jordan asked what I do for a living.
    Me: "Ana motargem." (I'm a translator)
    Him: A what?
    His friend: "Motargem."
    Him: "Oooh! You said it fine, I just didn't expect it!"

  59. Victor Mair said,

    October 19, 2016 @ 6:16 pm

    "since the PRC is in no hurry to replace characters, there's plenty of room to examine & study the efficiency of pinyin as a stand alone language, regardless of characters."

    Sure, go ahead and study the efficiency of Pinyin as an independent writing system, if that's what you want to do. As for me, I'm mainly interested in watching Pinyin increasingly manifest itself in various ways here and now and in the immediate future.

  60. Alex Wang said,

    October 19, 2016 @ 7:38 pm

    @Wang Yujiang

    "For polysyllable word, no problem, I think. For monosyllable word, or a single Chinese character, it would be difficult because of homophone at present"

    We are still in the design doc phase; it will take at least a month as we think things through and receive feedback. All feedback is appreciated as it helps us think things through! During that time we are researching what open source/IT and public domain content is out there already.

    The key is the input interface. The user will type pinyin as they do now, and there will be many option settings for the interface, for example:
    1 The user sees only the pinyin and chooses from the tone options
    2 Characters appear below the tone options for the choices

    The goal is to make input and edit as easy as possible.

    The website is in English and Pinyin, with pinyin mouseovers for the English (a way for Chinese users to learn English words) and Chinese character hover-overs for the pinyin.

    More than likely, to begin with, the mouseovers will be entered and edited manually. Not all English words will have mouseovers.

    @Eidolon

    Language is an evolution, so one must start somewhere. English isn't the most efficient, as many have pointed out, but it is used because of its wide use, like Microsoft Windows. So we are starting with Pinyin. Will it evolve? Yes, perhaps.

  61. John said,

    October 20, 2016 @ 8:01 am

    @Alex Wang

    As someone who works in technology, I always advise my clients against designing for mouse overs. Not only are they annoying when you move your mouse around, they also don't really work on mobile, which is how many consume content (you can't hover your finger over the screen).

  62. JS said,

    October 20, 2016 @ 1:38 pm

    @Àilì Hēijiāo "context does tons of heavy lifting"

    You are certainly right that it does a lot, and right about Cyrillic (and Xiaojing) for Dungan — but in part wrong about the comparison to Vietnamese, where the writing system has long been alphabetic. In China, basically nobody does or even can read pinyin texts. Only if that starts happening at scale, as you describe for Vietnam, will we know the fate of tone marking.

    Maybe you're right and it won't happen… but I admit I will have trouble getting my head around it. Even properly spaced, I would have read the title of the story above — Yeye he xiaoshu — as Yéye hé xiǎoshū 'grandpa & brother-in-law', but one line in come to find out it's Yéye hé xiǎoshù 'grandpa & the little tree'… but even at this point Yèye hé xiǎoshù 'the leafies and the little tree' remains perfectly plausible, if less likely, to say nothing of more imaginative possibilities. MDBG gives six very common words of the form xiaoshu, for instance.

    So — you end up leaving an awful lot to the reader. Isn't it weird not to be able to read the title of this little story correctly? Wouldn't it be weird to sign your name to a letter and have no one know how to say it?
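
    The ambiguity described above can be made concrete: once tone marks are dropped, the three readings of the title collapse into one string. The Python sketch below groups tone-marked readings by their toneless form; it is only an illustration, and the function names `strip_tones` and `group_by_toneless` are made up for this example.

```python
import unicodedata
from collections import defaultdict

# Combining marks for the four Pinyin tones: macron, acute, caron, grave.
# The diaeresis (U+0308) is left alone so that ü is preserved.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(text):
    """Return the toneless spelling of tone-marked pinyin."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def group_by_toneless(readings):
    """Map each toneless spelling to the tone-marked readings behind it."""
    groups = defaultdict(list)
    for reading in readings:
        groups[strip_tones(reading)].append(reading)
    return dict(groups)


if __name__ == "__main__":
    readings = ["Yéye hé xiǎoshū", "Yéye hé xiǎoshù", "Yèye hé xiǎoshù"]
    for toneless, marked in group_by_toneless(readings).items():
        print(toneless, "<-", marked)
```

    Any group with more than one entry is a spot where toneless text leaves the choice to the reader and the context.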

  63. Alex said,

    October 20, 2016 @ 7:58 pm

    @john

    Thanks for the feedback. We do enterprise middleware, thus the hover-over. I guess the goal, though, is how to let the user know what the word means. Kindle uses press and bubble; I know Kindle is an app rather than a browser. We are still investigating frameworks and add-ons.

    Sincere thanks for the feedback. Any other suggestions are appreciated. As said, our firm creates enterprise middleware, so our users are IT admins, which is very different from the consumer market, so there is a learning curve.

    Currently, for the design doc, we are on the requirements list. So any ideas are appreciated. I know this isn't the forum for tech discussion, but to create good software one needs experts in the field, just as accounting software needs accountants. We will put up a page in Nov for the IT.
