Archive for Language and computers

Automated transcription-cum-translation

Marc Sarrel received the following message on his voicemail:

Chinese translation app with built-in censorship

What good is a translation app that automatically censors politically sensitive terms?  Well, a leading Chinese translation app is now doing exactly that.

"A Chinese translation app is censoring politically sensitive terms, report says", Zoey Chong, CNET (11/27/18)

iFlytek, a voice recognition technology provider in China, has begun censoring politically sensitive terms from its translation app, South China Morning Post reported citing a tweet by Jane Manchun Wong. Wong is a software engineer who tweets frequently about hidden features she uncovers by performing app reverse-engineering.

In the tweet, Wong shows that when she tried to translate certain phrases such as "Taiwan independence," "Tiananmen square" and "Tiananmen square massacre" from English to Chinese, the system failed to churn out results for sensitive terms or names. The same happened when she tried to translate "Taiwan independence" from Chinese to English — results showed up as an asterisk.

Idiosyncratic stroke order

I pressed the "correct" button three times and the ATM ate my card

That's what happened to Paul Midler when confronted with this display on an ATM in China:

Words in Vietnamese

In "Diacriticless Vietnamese on a sign in San Francisco" (9/30/18), we discussed the advisability of joining syllables into words or separating all syllables.  The ensuing string of comments revealed that there is a correlation between linking syllables and word spacing on the one hand and the necessity for diacritical marks on the other hand.

This prompted me to ask the following questions of several colleagues who are specialists on Vietnamese:

Roughly what percentage of Vietnamese lexemes (words) are monosyllabic? Disyllabic? Any trisyllabic or higher?

The average length of a word in Mandarin is almost exactly two syllables.

Can you think of examples in Vietnamese parsing where it would be clearer or more helpful to have the syllables of words joined together?

The growing impact of "biaoqing" ("expressions") on the internet in China

Gabriele de Seta has a serious, scholarly article on "Biaoqing: The circulation of emoticons, emoji, stickers, and custom images on Chinese digital media platforms" in First Monday, Volume 23, Number 9 – 3 September 2018.  Here's the abstract:

The Mandarin Chinese term biaoqing, or ‘expression’, categorizes genres of visual content ranging from emoticons and emoji to stickers and custom images. This article is grounded on ethnographic research and approaches biaoqing in terms of their circulation across Chinese digital media platforms. By formulating a comprehensive typology of biaoqing genres, I foreground the situated socio-technical specificities of their circulation: the creative play with typographical compositions, the affective repurposing of graphical emoticons, the platformed monetization of proprietary stickers, and the user-driven proliferation of custom images. Drawing on this typology, I argue for the need to recognize the circulation of biaoqing as an emergent and malleable category of semiotic resources profoundly shaped by two decades of development of the Internet in China.

Spectral Sinographs

Opening and closing necrophilia

Fub

The University of Pennsylvania is instituting a Two-Step Verification for PennKey WebLogins. Up till now, our PennKey for login consisted of a Username and Password. After much effort and practice, I finally mastered that. Now, however, for the sake of greater security, after using our PennKey to log in, we will in addition be asked to go through a second step that requires us to enter a randomly generated number that will be sent to us via cell phone.

That really freaked me out, since I don't have a cell phone.

Corpora and the Second Amendment: Responding to Weisberg on the meaning of "bear arms" [Updated, and updated again]

[An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here.]

The Originalism Blog has a guest post, by David Weisberg, taking issue with the conclusion in Dennis Baron's Washington Post op-ed that newly available evidence of historical usage shows that in District of Columbia v. Heller, Justice Scalia misinterpreted the phrase "keep and bear arms". That's an issue that I wrote about yesterday ("The coming corpus-based reexamination of the Second Amendment") and that I'm going to be dealing with in a series of posts over the next several weeks.

One of Weisberg's arguments concerns a linguistic issue that I'm planning to address, and I think that Weisberg is mistaken. At the risk of getting out ahead of myself, I want to respond to Weisberg briefly now, with a more detailed explanation to come.

Really weird sinographs

Scott Wilson has written an entertaining, and I dare say edifying, article on "W.T.F. Japan: Top 5 strangest kanji ever 【Weird Top Five】", SoraNews24 (10/6/16) — sorry I missed it when it first came out.  Wilson refers to the "Top 5 strangest kanji", but he actually treats nearly three times that many.  The reason he emphasizes "5" is so that he can stick with his theme of W.T.F., cf.:

Scott Wilson, "W.T.F. Japan: Top 5 most difficult kanji ever【Weird Top Five】", SoraNews24 (8/4/16)

Scott Wilson, "W.T.F. Japan: Top 5 kanji with the longest readings【Weird Top Five】", SoraNews24 (4/20/17)

Kanji as commodity

On Friday, April 27, I participated in "Seeking a Future for East Asia’s Past:  A Workshop on Sinographic Sphere Studies" at Boston University.  Among the participants was Terry Kawashima, who talked about the commodification and fetishization of kanji.  The following paragraphs are a revised version of a portion of her remarks:

Colossal translation fail at the Boao Forum for Asia

China is currently hosting the Boao Forum for Asia in Hainan, the smallest and southernmost province of the PRC.  The BFA bills itself as the "Asian Davos", after the World Economic Forum held annually in Davos, Switzerland.  The BFA draws representatives from many countries, so naturally they have to provide translation services.  Unfortunately, the machine translation system they used this year failed miserably.  Here are screenshots of a couple of examples:
