Archive for Language and computers

Really weird sinographs

Scott Wilson has written an entertaining, and I dare say edifying, article, "W.T.F. Japan: Top 5 strangest kanji ever 【Weird Top Five】", SoraNews24 (10/6/16); sorry I missed it when it first came out.  Wilson refers to the "Top 5 strangest kanji", but he actually treats nearly three times that many.  The reason he emphasizes "5" is so that he can stick with his theme of W.T.F., cf.:

Scott Wilson, "W.T.F. Japan: Top 5 most difficult kanji ever【Weird Top Five】", SoraNews24 (8/4/16)

Scott Wilson, "W.T.F. Japan: Top 5 kanji with the longest readings【Weird Top Five】", SoraNews24 (4/20/17)

Read the rest of this entry »


Kanji as commodity

On Friday, April 27, I participated in "Seeking a Future for East Asia’s Past:  A Workshop on Sinographic Sphere Studies" at Boston University.  Among the participants was Terry Kawashima, who talked about the commodification and fetishization of kanji.  The following paragraphs are a revised version of a portion of her remarks:

Read the rest of this entry »


Colossal translation fail at the Boao Forum for Asia

China is currently hosting the Boao Forum for Asia in Hainan, the smallest and southernmost province of the PRC.  The BFA bills itself as the "Asian Davos", after the World Economic Forum held annually in Davos, Switzerland.  The BFA draws representatives from many countries, so naturally they have to provide translation services.  Unfortunately, the machine translation system they used this year failed miserably.  Here are screenshots of a couple of examples:

Read the rest of this entry »


The elegance of Google Translate

When I was in graduate school, some of my best friends were mathematicians.  I was always intrigued by their approach to problem solving.  They told me that merely solving problems was not satisfying to them.  Rather, their goal was to solve problems elegantly.

This morning, I was reminded of the modus operandi of mathematicians when I asked Google Translate (GT) to render a short passage of German into English.

Read the rest of this entry »


The letter * has bee* ba**ed in Chi*a

Since the announcement by the Chinese Communist Party (CCP) yesterday that the President of China would no longer be limited to two five-year terms in office, as had been the case since the days when Chairman Mao ruled, there has been much turmoil and trepidation among China watchers and Chinese citizens.  Essentially, it means that Xi Jinping has become dictator for life, which is not what people had been hoping for since Richard Nixon went to China 46 years and 5 days ago.  What everyone had expected was that China would "reform and open up" (gǎigé kāifàng 改革開放), which became an official policy as of December, 1978.  Instead, all indications from the first five years of Xi's regime and the newly announced policy changes regarding Xi Jinping thought and governance are that China has jumped right back to the 1950s in terms of policies and procedures.

Read the rest of this entry »


Shadowsocks

The immediate reason for writing this post is the curiosity of an important Chinese product, Shadowsocks, whose name is known only in English and whose author, clowwindy, has only an English name.

Shadowsocks is an open-source encrypted proxy project, widely used in mainland China to circumvent Internet censorship. It was created in 2012 by a Chinese programmer named "clowwindy", and multiple implementations of the protocol have been made available since. Typically, the client software will open a SOCKS5 proxy on the machine on which it is run, to which internet traffic can then be directed, similarly to an SSH tunnel. Unlike an SSH tunnel, Shadowsocks can also proxy UDP traffic.

Source
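To make the "socks5 proxy on the local machine" idea a little more concrete, here is a minimal sketch of the first two messages that an application would send to such a local proxy, following the SOCKS5 wire format of RFC 1928. It only builds the bytes and opens no network connection. The function names are hypothetical, and nothing here is taken from the Shadowsocks code itself; the encrypted hop between the Shadowsocks client and its remote server is a separate protocol and is not shown.

```python
import ipaddress
import struct

SOCKS_VERSION = 0x05   # protocol version byte for SOCKS5
CMD_CONNECT = 0x01     # ask the proxy to open a TCP connection
ATYP_IPV4 = 0x01       # address type: 4-byte IPv4 address
ATYP_DOMAIN = 0x03     # address type: length-prefixed domain name
ATYP_IPV6 = 0x04       # address type: 16-byte IPv6 address


def build_greeting() -> bytes:
    """Client greeting: version 5, one method offered, 0x00 = no authentication."""
    return bytes([SOCKS_VERSION, 1, 0x00])


def build_connect_request(host: str, port: int) -> bytes:
    """CONNECT request asking the proxy to reach host:port on our behalf."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP address, so send it as a domain name;
        # the proxy side then does the DNS lookup.
        name = host.encode("idna")
        if len(name) > 255:
            raise ValueError("domain name too long for SOCKS5")
        address = bytes([ATYP_DOMAIN, len(name)]) + name
    else:
        atyp = ATYP_IPV4 if ip.version == 4 else ATYP_IPV6
        address = bytes([atyp]) + ip.packed
    # Header: version, command, reserved byte; then address; then port (big-endian).
    header = bytes([SOCKS_VERSION, CMD_CONNECT, 0x00])
    return header + address + struct.pack("!H", port)


if __name__ == "__main__":
    print(build_greeting().hex())
    print(build_connect_request("example.com", 443).hex())
```

An application configured to use the local proxy would write these bytes to the proxy's listening socket (often on port 1080, though that is a convention rather than a requirement) and, once the proxy replies with success, send its ordinary traffic over the same connection.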

Read the rest of this entry »


Don't blame Google Translate

Douglas Hofstadter has a critical article in the latest issue of The Atlantic (1/30/18):

"The Shallowness of Google Translate:  The program uses state-of-the-art AI techniques, but simple tests show that it's a long way from real understanding." (1/30/18).

Hofstadter criticizes GT for not being as good as he is at translating from French, German, and Chinese into English.  I will let others respond to his critique of the French and German translations, but I will comment on his critique of the Chinese-to-English translation.

Read the rest of this entry »


News program presenter meets robot avatar

Yesterday BBC's Radio 4 program "Today", the cultural counterpart of NPR's "Morning Edition", invited into the studio a robot from the University of Sheffield, the Mishalbot, which had been trained to conduct interviews by exposure to the on-air speech of co-presenter Mishal Husain. They let it talk for three minutes with the real Mishal (video clip here, at least for UK readers; may not be available in the US). Once again I was appalled at the credulity of journalists when confronted with AI. Despite all the evidence that the robot was just parroting Mishalesque phrases, Ms Husain continued with the absurd charade, pretending politely that her robotic alter ego was really conversing. Afterward there was half-serious on-air discussion of the possibility that some day the jobs of the Today program presenters and interviewers might be taken over by robots.

The main thing differentiating the Sheffield robot from Joseph Weizenbaum's ELIZA program of 1966 (apart from a babyish plastic face and movable fingers and eyes, which didn't work well on radio) was that the Mishalbot is voice-driven (with ELIZA you had to type on a terminal). So the main technological development has been in speech recognition engineering. On interaction, the Mishalbot seemed to me to be at sub-ELIZA level. "What do you mean? Can you give an example?" it said repeatedly, at various inappropriate points.

Read the rest of this entry »


CCP approved image macros

Two powerful agencies of the PRC central government, Zhōnggòng zhōngyāng jìlǜ jiǎnchá wěiyuánhuì 中共中央纪律检查委员会 ("Central Commission for Discipline Inspection") and Zhōnghuá rénmín gònghéguó jiānchá bù 中华人民共和国监察部 ("People's Republic of China Ministry of Supervision"), have issued "bā xiàng guīdìng biǎoqíng bāo 八项规定表情包" ("emoticons for the eight provisions / stipulations / rules"); see also here.  The biǎoqíng bāo 表情包 (lit., "expression packages") were announced on December 4, 2017, five years to the day after the rules themselves were promulgated.

English translations of the so-called "Eight-point austerity rules" or "Eight-point regulations" may be found here and here.  The rules were designed to instill greater discipline among Chinese Communist Party (CCP) members, to bring the Party "closer to the masses", and to reduce bureaucracy, extravagance, and undesirable work habits among Party members.

Read the rest of this entry »


Ask Language Log: Looking up hanzi for ignoramuses

From Mark Meckes:

I'm a regular Language Log reader, completely ignorant of Chinese languages.  I was just wondering whether there exist worthwhile online tools to help someone like me figure out the meaning of something written only in hanzi.  (The question is occasioned by my looking at a package of tea given to me by a Chinese student; the writing on the package is mostly hanzi, with a little English and no pinyin.)  I'm perfectly competent to use Google Translate and similar tools (and know how much skepticism to approach the results with) for the last stage of the process.  But starting from written hanzi on a physical object, I first need some way to translate that image into either pinyin, Unicode, English, or something equivalent to one of the above — and something that relies on no knowledge of the meaning or pronunciation of the characters, or knowledge of the structure of Chinese characters in general.  Do you have any suggestions?

Read the rest of this entry »


Duolingo Mandarin: a critique

A friend sent this Lifehacker article to me:

"Mandarin Chinese Is Now Available on the Language Learning App Duolingo", by Patrick Allan (11/16/17)

Duolingo claims that it "is the world's most popular way to learn a language. It's 100% free, fun and science-based. Practice online on duolingo.com or on the apps!"

After reading Allan's article, I sent the following note to my students and colleagues:

Judging from the description in this article, I'm dubious about the efficacy of their method.  Never mind the misleading statements emanating from the author of the article (e.g., that there are 1.2 billion native speakers of "Chinese"); they seem to overemphasize individual characters, downplay words, say nothing about sentence structure, grammar, and syntax, and give no indication of how or whether pinyin is used.

Has anyone checked this app out?

Read the rest of this entry »


Just press Pay

This is a screenshot I snapped during a recent attempt to purchase something (can't remember what) on the web:

Notice that in order to continue, it tells me (twice) that I have to press "Pay". Can you see any button labeled "Pay" on the screen?

If you are itching to tell me what I should have done, you are missing my point.

Read the rest of this entry »


Is there a practical limit to how much can fit in Unicode?

A lengthy, important article by Michael Erard recently appeared in the New York Times Magazine:

"How the Appetite for Emojis Complicates the Effort to Standardize the World’s Alphabets:  Do the volunteers behind Unicode, whose mission is to bring all human languages into the digital sphere, have enough bandwidth to deal with emojis too?" (10/18/17)

The article brought back many vivid memories.  It reminded me of my old friend, Joe Becker, who was the seminal designer of the phenomenal Xerox Star's multilingual capabilities in the mid-80s and instrumental in the organization and foundation of the Unicode Consortium in the late 80s and early 90s.  Indeed, it was Becker who coined the word "Unicode" to designate the project.
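As a side note on the question in the title: whatever the limits on volunteer time may be, the technical ceiling is fixed by the size of the Unicode code space, which runs from U+0000 to U+10FFFF in 17 planes of 65,536 code points each. The following minimal sketch simply does that arithmetic; the constant and function names are my own, chosen for illustration.

```python
import sys

PLANES = 17                       # plane 0 (the BMP) plus 16 supplementary planes
CODE_POINTS_PER_PLANE = 0x10000   # 65,536 code points in each plane
MAX_CODE_POINT = 0x10FFFF         # highest code point in the code space
SURROGATES = 0xDFFF - 0xD800 + 1  # code points reserved for UTF-16 surrogates


def codespace_size() -> int:
    """Total number of code points, assigned or not."""
    return PLANES * CODE_POINTS_PER_PLANE


def scalar_value_count() -> int:
    """Code points that can stand for a character (everything but surrogates)."""
    return codespace_size() - SURROGATES


if __name__ == "__main__":
    print(codespace_size())
    print(scalar_value_count())
    # Python's own upper bound for chr() matches the top of the code space.
    print(sys.maxunicode == MAX_CODE_POINT)
```

Only a fraction of these code points has been assigned so far, so the pressure the article describes falls on the people doing the encoding work rather than on the code space itself.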

Read the rest of this entry »
