Archive for Language and computers

Fub

The University of Pennsylvania is instituting a Two-Step Verification for PennKey WebLogins. Up till now, our PennKey for login consisted of a Username and Password. After much effort and practice, I finally mastered that. Now, however, for the sake of greater security, after using our PennKey to log in, we will in addition be asked to go through a second step that requires us to enter a randomly generated number that will be sent to us via cell phone.

That really freaked me out, since I don't have a cell phone.

Read the rest of this entry »

Comments (48)

Corpora and the Second Amendment: Responding to Weisberg on the meaning of "bear arms" [Updated, and updated again]

[An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here.]

[Update: Broken link fixed.]

The Originalism Blog has a guest post, by David Weisberg, taking issue with the conclusion in Dennis Baron's Washington Post op-ed that newly available evidence of historical usage shows that in District of Columbia v. Heller, Justice Scalia misinterpreted the phrase keep and bear arms. That's an issue that I wrote about yesterday ("The coming corpus-based reexamination of the Second Amendment") and that I'm going to be dealing with in a series of posts over the next several weeks.

One of Weisberg's arguments concerns a linguistic issue that I'm planning to address, and I think that Weisberg is mistaken. At the risk of getting out ahead of myself, I want to respond to Weisberg briefly now, with a more detailed explanation to come.

Read the rest of this entry »

Comments (36)

Really weird sinographs

Scott Wilson has written an entertaining, and I dare say edifying, article on "W.T.F. Japan: Top 5 strangest kanji ever 【Weird Top Five】", SoraNews24 (10/6/16) — sorry I missed it when it first came out.  Wilson refers to the "Top 5 strangest kanji", but he actually treats nearly three times that many.  The reason he emphasizes "5" is so that he can stick with his theme of W.T.F., cf.:

Scott Wilson, "W.T.F. Japan: Top 5 most difficult kanji ever【Weird Top Five】", SoraNews24 (8/4/16)

Scott Wilson, "W.T.F. Japan: Top 5 kanji with the longest readings【Weird Top Five】", SoraNews24 (4/20/17)

Read the rest of this entry »

Comments (18)

Kanji as commodity

On Friday, April 27, I participated in "Seeking a Future for East Asia’s Past:  A Workshop on Sinographic Sphere Studies" at Boston University.  Among the participants was Terry Kawashima who talked about the commodification and fetishization of kanji.  The following paragraphs are a revised version of a portion of her remarks:

Read the rest of this entry »

Comments (4)

Colossal translation fail at the Boao Forum for Asia

China is currently hosting the Boao Forum for Asia in Hainan, the smallest and southernmost province of the PRC.  The BFA bills itself as the "Asian Davos", after the World Economic Forum held annually in Davos, Switzerland.  The BFA draws representatives from many countries, so naturally they have to provide translation services.  Unfortunately, the machine translation system they used this year failed miserably.  Here are screenshots of a couple of examples:

Read the rest of this entry »

Comments (14)

The elegance of Google Translate

When I was in graduate school, some of my best friends were mathematicians.  I was always intrigued by their approach to problem solving.  They told me that merely solving problems was not satisfying to them.  Rather, their goal was to solve problems elegantly.

This morning, I was reminded of the modus operandi of mathematicians when I asked Google Translate (GT) to render a short passage of German into English.

Read the rest of this entry »

Comments (39)

The letter * has bee* ba**ed in Chi*a

Since the announcement by the Chinese Communist Party (CCP) yesterday that the President of China would no longer be limited to two five-year terms in office, as had been the case since the days when Chairman Mao ruled, there has been much turmoil and trepidation among China watchers and Chinese citizens.  Essentially, it means that Xi Jinping has become dictator for life, which is not what people had been hoping for since Richard Nixon went to China 46 years and 5 days ago.  What everyone had expected was that China would "reform and open up" (gǎigé kāifàng 改革開放), which became an official policy as of December, 1978.  Instead, all indications from the first five years of Xi's regime and the newly announced policy changes regarding Xi Jinping thought and governance are that China has jumped right back to the 1950s in terms of policies and procedures.

Read the rest of this entry »

Comments (34)

Shadowsocks

The immediate reason for writing this post is the curiosity of an important Chinese product, Shadowsocks, whose name is known only in English and whose author, clowwindy, has only an English name.

Shadowsocks is an open-source encrypted proxy project, widely used in mainland China to circumvent Internet censorship. It was created in 2012 by a Chinese programmer named "clowwindy", and multiple implementations of the protocol have been made available since. Typically, the client software will open a socks5 proxy on the machine it is run, which internet traffic can then be directed towards, similarly to an SSH tunnel. Unlike an SSH tunnel, shadowsocks can also proxy UDP traffic.

Source
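For readers curious what "opening a socks5 proxy" means on the wire, here is a minimal sketch of the client side of the SOCKS5 handshake as laid out in RFC 1928. It is not Shadowsocks code. The function names are hypothetical, and the proxy address in the usage note (127.0.0.1, port 1080) is an assumption about how a local client might be configured, not something stated in the passage above.

```python
import socket
import struct

SOCKS_VERSION = 5  # protocol version byte defined by RFC 1928


def build_greeting() -> bytes:
    """Client greeting: version 5, one auth method offered, 0x00 = no authentication."""
    return bytes([SOCKS_VERSION, 1, 0x00])


def build_connect_request(host: str, port: int) -> bytes:
    """CONNECT request: VER, CMD=1, RSV=0, ATYP=3 (domain name), length, name, port."""
    host_bytes = host.encode("idna")
    if not 1 <= len(host_bytes) <= 255:
        raise ValueError("host name must be 1-255 bytes")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    header = bytes([SOCKS_VERSION, 0x01, 0x00, 0x03, len(host_bytes)])
    return header + host_bytes + struct.pack("!H", port)  # port is big-endian


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the proxy hangs up early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("proxy closed the connection")
        buf += chunk
    return buf


def open_via_proxy(proxy_host: str, proxy_port: int,
                   dest_host: str, dest_port: int) -> socket.socket:
    """Hypothetical helper: return a socket tunnelled through a local SOCKS5 proxy."""
    sock = socket.create_connection((proxy_host, proxy_port))
    try:
        sock.sendall(build_greeting())
        if _recv_exact(sock, 2) != bytes([SOCKS_VERSION, 0x00]):
            raise ConnectionError("proxy refused the no-auth method")
        sock.sendall(build_connect_request(dest_host, dest_port))
        _ver, rep, _rsv, atyp = _recv_exact(sock, 4)
        if rep != 0x00:
            raise ConnectionError(f"proxy reported error code {rep}")
        # Skip the bound address and port that follow the reply header.
        if atyp == 0x01:      # IPv4
            addr_len = 4
        elif atyp == 0x04:    # IPv6
            addr_len = 16
        elif atyp == 0x03:    # domain name, prefixed by a one-byte length
            addr_len = _recv_exact(sock, 1)[0]
        else:
            raise ConnectionError("unknown address type in proxy reply")
        _recv_exact(sock, addr_len + 2)
        return sock
    except Exception:
        sock.close()
        raise
```

With a proxy listening locally, a call such as `open_via_proxy("127.0.0.1", 1080, "example.com", 80)` would hand back a socket whose traffic goes to the destination by way of the proxy. Whatever encryption happens between the local client and the remote server is outside the scope of this sketch.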

Read the rest of this entry »

Comments (9)

Don't blame Google Translate

Douglas Hofstadter has a critical article in the latest issue of The Atlantic (1/30/18):

"The Shallowness of Google Translate:  The program uses state-of-the-art AI techniques, but simple tests show that it's a long way from real understanding." (1/30/18).

Hofstadter criticizes GT for not being as good as he is at translating from French, German, and Chinese into English.  I will let others respond to his critique of the French and German translations, but I will comment on his critique of the Chinese to English translation.

Read the rest of this entry »

Comments (21)

News program presenter meets robot avatar

Yesterday BBC's Radio 4 program "Today", the cultural counterpart of NPR's "Morning Edition", invited into the studio a robot from the University of Sheffield, the Mishalbot, which had been trained to conduct interviews by exposure to the on-air speech of co-presenter Mishal Husain. They let it talk for three minutes with the real Mishal. (Video clip here, at least for UK readers; it may not be available in the US.) Once again I was appalled at the credulity of journalists when confronted with AI. Despite all the evidence that the robot was just parroting Mishalesque phrases, Ms Husain continued with the absurd charade, pretending politely that her robotic alter ego was really conversing. Afterward there was half-serious on-air discussion of the possibility that some day the jobs of the Today program presenters and interviewers might be taken over by robots.

The main thing differentiating the Sheffield robot from Joseph Weizenbaum's ELIZA program of 1966 (apart from a babyish plastic face and movable fingers and eyes, which didn't work well on radio) was that the Mishalbot is voice-driven (with ELIZA you had to type on a terminal). So the main technological development has been in speech recognition engineering. On interaction, the Mishalbot seemed to me to be at sub-ELIZA level. "What do you mean? Can you give an example?" it said repeatedly, at various inappropriate points.

Read the rest of this entry »

Comments off

CCP approved image macros

Two powerful agencies of the PRC central government, Zhōnggòng zhōngyāng jìlǜ jiǎnchá wěiyuánhuì 中共中央纪律检查委员会 ("Central Commission for Discipline Inspection") and Zhōnghuá rénmín gònghéguó jiānchá bù 中华人民共和国监察部 ("People's Republic of China Ministry of Supervision"), have issued "bā xiàng guīdìng biǎoqíng bāo 八项规定表情包" ("emoticons for the eight provisions / stipulations / rules"); see also here.  The biǎoqíng bāo 表情包 (lit., "expression packages") were announced on December 4, 2017, five years to the day after the rules themselves were promulgated.

English translations of the so-called "Eight-point austerity rules" or "Eight-point regulations" may be found here and here.  The rules were designed to instill greater discipline among Chinese Communist Party (CCP) members, to bring the Party "closer to the masses", and to reduce bureaucracy, extravagance, and undesirable work habits among Party members.

Read the rest of this entry »

Comments off

Ask Language Log: Looking up hanzi for ignoramuses

From Mark Meckes:

I'm a regular Language Log reader, completely ignorant of Chinese languages.  I was just wondering whether there exist worthwhile online tools to help someone like me figure out the meaning of something written only in hanzi.  (The question is occasioned by my looking at a package of tea given to me by a Chinese student; the writing on the package is mostly hanzi, with a little English and no pinyin.)  I'm perfectly competent to use Google Translate and similar tools (and know how much skepticism to approach the results with) for the last stage of the process.  But starting from written hanzi on a physical object, I first need some way to translate that image into either pinyin, Unicode, English, or something equivalent to one of the above — and something that relies on no knowledge of the meaning or pronunciation of the characters, or knowledge of the structure of Chinese characters in general.  Do you have any suggestions?

Read the rest of this entry »

Comments (9)

Duolingo Mandarin: a critique

A friend sent this lifehacker article to me:

"Mandarin Chinese Is Now Available on the Language Learning App Duolingo", by Patrick Allan (11/16/17)

Duolingo claims that it "is the world's most popular way to learn a language. It's 100% free, fun and science-based. Practice online on duolingo.com or on the apps!"

After reading Allan's article, I sent the following note to my students and colleagues:

Judging from the description in this article, I'm dubious about the efficacy of their method.  Never mind the misleading statements emanating from the author of the article (e.g., that there are 1.2 billion native speakers of "Chinese"); Duolingo seems to overemphasize individual characters, downplay words, say nothing about sentence structure, grammar, and syntax, and give no indication of how or whether pinyin is used.

Has anyone checked this app out?

Read the rest of this entry »

Comments (32)