Archive for Language and computers

Spamferences thrive; junk journals prosper

I was recently moved (screaming and struggling, as four strong men held me down by my arms and legs) to a new web-based university email system designed and run by Microsoft: Office 365. Naturally, it's ill-designed slow-loading crap, burdened by misfeatures and pointless pop-ups that I do not want popping up, and it fails to allow various elementary operations that I often need (every upgrade is a downgrade). But that is not my topic today. I want to note one special sad consequence of moving to an entirely new system: all my previous email system's Bayesian machine learning about spam classification has been lost. The Office 365 system has had hardly any data to learn from as yet, so I am seeing some of the stuff that would have been coming to me all along if it had not been caught by machine learning and dumped in the spam bin. And what has truly amazed me is the daily flow of advertising for spamferences and junk journals.
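For readers wondering why a fresh system starts out so helpless, here is a minimal sketch of a Bayesian (naive Bayes) spam filter. It is a generic textbook illustration, not the method used by Office 365 or by my previous email system; the class name, method names, and training messages are all made up for the example. The point it shows is that an untrained filter can do no better than a coin flip, and only becomes opinionated once it has seen labeled mail.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy word-count naive Bayes classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # label must be "spam" or "ham"
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        words = text.lower().split()
        total = sum(self.message_counts.values())
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior: with no training data both classes get 0.5.
            log_p = math.log((self.message_counts[label] + 1) / (total + 2))
            n_words = sum(self.word_counts[label].values())
            for w in words:
                # Add-one smoothing so unseen words never give probability 0.
                count = self.word_counts[label][w]
                log_p += math.log((count + 1) / (n_words + len(vocab) + 1))
            log_scores[label] = log_p
        # Convert the two log scores into a normalized probability of spam.
        top = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - top)
        ham = math.exp(log_scores["ham"] - top)
        return spam / (spam + ham)


fresh = NaiveBayesSpamFilter()
print(fresh.spam_probability("call for papers"))  # no data yet: a coin flip

fresh.train("call for papers international conference submit", "spam")
fresh.train("submit your manuscript to our journal", "spam")
fresh.train("lunch tomorrow at noon", "ham")
fresh.train("draft of the syntax paper attached", "ham")
print(fresh.spam_probability("submit papers to our international conference"))
```

Years of accumulated counts like these are exactly what gets thrown away when the mail moves to a new system.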

Firestorm over Chinese characters

It began with a one-page think piece by Ted Chiang in the New Yorker (5/16/16) that we describe and discuss here:

"Ted Chiang uninvents Chinese characters" (5/13/16)

Backward Thinking about Orientalism and Chinese Characters

This is a guest post by David Moser of Beijing Capital Normal University.

For those of us who teach and research the Chinese language, it is often difficult to describe how the Chinese characters function in conveying meaning and sound, and it’s always a particular challenge to explain how the writing system differs from the alphabetic systems we are more familiar with. The issues are complex and multi-layered, and have important implications for basic literacy and the teaching of Chinese to both native speakers and foreign learners. Tom Mullaney, a professor of history at Stanford University, has lately been muddying these pedagogical waters in a series of articles and interviews that seriously misrepresent the merits and relative advantages of the alphabet over the Chinese script.

Character amnesia redux

This is a topic that we have frequently broached on Language Log:

In several recent messages to me, Guy Almog has raised the issue once again. This is not unexpected for someone whose ongoing research focuses on the changing writing and reading habits of native Chinese and Japanese speakers, and mainly on issues of memory and forgetfulness of hanzi / kanji.

AI for youth: success and failure

Success: Xiaoice is a Microsoft chatbot program that has become popular in China.  Her name is written in various ways:

"Xiaoice" 42,400 ghits (that's pronounced "xiǎo ice")
"小冰" 362,000 ghits (that's pronounced "xiǎo bīng")
"小ice" 11,200 ghits (that's pronounced "xiǎo ice")
"Little Bing" 16,000 ghits (she's obviously named after Microsoft's search engine*)
"Little Ice" for the chatbot doesn't work, because that's the name of Ice-T's son.

Not all of these ghits are to the Chinese chatbot program; some are for Facebook and Twitter monikers, etc., but most do refer to the Microsoft chatbot.

Ask Language Log: Why are some Chinese PDFs garbled on iPad?

Mark Metcalf writes:

Since Language Log addresses lots of interesting language-related issues, I was wondering if you'd ever encountered a problem with Chinese PDFs being incorrectly displayed on an iPad. I searched the LL website and didn't find it previously addressed. I also unsuccessfully searched the Web for solutions.

Here's the issue: Last week I downloaded several articles from CNKI and they all display correctly on my Windows machine. However, when I transferred them to an iPad the Chinese text was garbled. Since I haven't had iPad problems with Chinese PDFs from other sources, one thought is that CNKI uses a modified PDF file format that can't be properly handled by the iPad OS.

Has anyone previously addressed this problem? If so, could you point me to a solution? If not, would you be interested in addressing this on 'Language Log'? Below I've attached before/after versions of the displays.

I asked several colleagues and students whom I've often observed reading Chinese PDFs on their iPads what their experience with CNKI has been.  Here are a few of the replies that I received.

Kongish, ch. 2

In "Kongish" (8/6/15), we looked at the phenomenon of extensive mixing of English and Cantonese by young people in Hong Kong.  We also became acquainted with the Kongish Daily, a Facebook page written in and about Kongish.  Many Language Log readers thought it was a satire or parody and that it was an ephemeral fad that would swiftly fade away. But here we are, half a year later, and the movement is still going strong, and even, it would seem, gaining momentum.

How to generate fake Chinese characters automatically

On the otoro blog, there is another amazing article about sinograms:

"Recurrent Net Dreams Up Fake Chinese Characters in Vector Format with TensorFlow" (12/28/15)

I say "another amazing article" because, just a week ago, in "Character building is costly and time consuming" (12/22/15), we looked at a fascinating report on the vast amount of labor necessary to build fonts made up of real Chinese characters.  Basically, the latter report examined the history of Chinese characters and then explained how typographers create new fonts comprising all the characters necessary for printing books, newspapers, magazines, advertising copy, and so forth.

God use VPN

One of Kohei Jose Shimamoto's photos on Facebook:

Japan's continuing love affair with the fax machine

Periodically, someone will write an article about how the Japanese are still inordinately fond of fax machines, such as this one from the BBC News "Technology of Business" section:

Not a word about kanji.


Here's another eye-opening article from Quartz:

"Stop texting right now and learn from the Chinese: there’s a better way to message" (7/02/15) by Josh Horwitz.

I missed the article when it came out back in July, and even now wouldn't have known about this new fad that is sweeping China if Kyle Wilcox hadn't called it to my attention.

What the article describes is the craze for sending short audio clips instead of text messages.

Pinyin spam text message

From David Moser:

Just got this spam text, all in pinyin, to avoid spam detectors. The usual spam offering fake certificates and chops, plus their Weixin contact. What's novel is the tone markings, don't see that very often.
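To see why tone-marked pinyin might slip past a naive keyword filter, and how little it takes to undo the trick, here is a small sketch. It assumes a detector that matches toneless pinyin keywords, which is only a guess about how such filters work; the function names and the sample keyword are hypothetical and are not taken from the actual message. The code uses Python's standard `unicodedata` module to decompose each letter, drop the four pinyin tone diacritics, and recompose, so that ü keeps its diaeresis.

```python
import unicodedata

# Combining macron, acute, caron, grave: the four pinyin tone diacritics.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tone_marks(text):
    """Return text with pinyin tone marks removed (ü is preserved)."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def contains_keyword(text, keywords):
    """Check for any toneless keyword, ignoring tone marks, case, and spaces."""
    flattened = strip_tone_marks(text).lower().replace(" ", "")
    return any(k in flattened for k in keywords)


# Hypothetical keyword for illustration only.
print(strip_tone_marks("bàn zhèng"))
print(contains_keyword("Bàn Zhèng", ["banzheng"]))
```

A filter that compares raw strings would treat "bàn" and "ban" as unrelated, which is presumably the gap the spammer was counting on.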

Chinese character inputting

During my "Language, Script, and Society in China" class on this past Thursday (10/15/15), I asked the students the following questions:

1. What is your primary method for inputting Chinese characters?

2. What percentage of the time do you use your primary method for inputting Chinese characters?

3. What is your secondary method for inputting Chinese characters?

4. What percentage of the time do you use your secondary method for inputting Chinese characters?
