Archive for Language and computers

Attribution of the WannaCry ransomware to Chinese speakers

The notorious WannaCry malware infestation began on Friday, May 12, 2017, and spread rapidly throughout the world, infecting hundreds of thousands of computers and causing major damage. Speculation concerning the identity of the perpetrators focused on North Korea, but the supposed connection was never convincingly demonstrated, and there were no other serious suspects.

Yesterday, Jon Condra, John Costello, and Sherman Chu published a stunning report which suggests that the authors of WannaCry — or someone they hired — spoke fluent Chinese:

"Linguistic Analysis of WannaCry Ransomware Messages Suggests Chinese-Speaking Authors" (Flashpoint [5/25/17])

Read the rest of this entry »

Comments (17)

Similes for quality of computer code

I must admit to having enjoyed the series of savage similes about quality of computer program code presented in three xkcd comic strips. They show a female character, known to aficionados as Ponytail, reluctantly agreeing to take a critical look at some code that the male character Cueball has written. Almost at first sight, she begins to describe it using utterly brutal similes. In the first strip (at http://xkcd.com/1513) she announces that reading it is "like being in a house built by a child using nothing but a hatchet and a picture of a house." But Ponytail is not done: there is more bile and contempt where that came from.

Veggies for cats and dogs

This video was passed on by Tim Leonard, who remarks, "real-time video translation at its best":

The miracle of reading and writing Chinese characters

We have the testimony of a colleague whose ability to write Chinese characters has been adversely affected by her not being able to visualize them in her mind's eye.  See:

"Aphantasia — absence of the mind's eye" (3/24/17)

This prompts me to ponder:  just how do people who are literate in Chinese characters recall them?

Password nerdview

Steve Politzer-Ahles was trying to change his password on the Hong Kong Polytechnic University system, and found himself confronted with this warning:

You may not use the following attribute values for your password:

puAccNetID
puStaffNo
puUserGivenName
puUserSurname

Attribute values? This is classic nerdview.
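
To make the warning concrete, here is a minimal Python sketch of the kind of check it presumably describes. The four attribute names come from the warning itself; everything else is an assumption made for illustration: the sample user record, the case-insensitive substring rule, and the plain-language labels. The university's actual rule is not known from the message.

```python
# Plain-language labels for the directory attribute names shown in the warning.
# The labels are guesses at what each attribute holds, not documented meanings.
FRIENDLY_LABELS = {
    "puAccNetID": "your NetID",
    "puStaffNo": "your staff number",
    "puUserGivenName": "your given name",
    "puUserSurname": "your surname",
}


def forbidden_attributes(password, user_attributes):
    """Return the names of attributes whose values occur in the password.

    Assumed rule: a case-insensitive substring match. Empty values are skipped.
    """
    lowered = password.lower()
    return [
        name
        for name, value in user_attributes.items()
        if value and value.lower() in lowered
    ]


def explain(password, user_attributes):
    """Describe the violations in user-facing terms rather than attribute names."""
    hits = forbidden_attributes(password, user_attributes)
    if not hits:
        return "Password accepted."
    labels = [FRIENDLY_LABELS.get(name, name) for name in hits]
    return "Your password may not contain " + ", ".join(labels) + "."


# Hypothetical user record, invented for this example.
user = {
    "puAccNetID": "jsmith",
    "puStaffNo": "12345",
    "puUserGivenName": "Jane",
    "puUserSurname": "Smith",
}

print(explain("JaneRocks!", user))
```

The `explain` function is the non-nerdview half of the sketch: the check is the same, but the message names things the user recognizes instead of the internal field names.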

Why electronic machine translation services sometimes seem to fail

The inability of Google Translate, Microsoft Translator, Baidu Fanyi, and other translation services to correctly render jī nián dàjí 鸡年大吉 ("may the / your year of the chicken be greatly auspicious!") in various languages points up a vital distinction that I have long wanted to make, and now is as good a time as any. Namely, just as you could not expect these translation services to handle Cantonese, Shanghainese, Taiwanese, etc. (unless specifically and separately programmed to do so), we should not expect them to deal with Literary Sinitic / Classical Chinese (LS / CC).

Finding non-Roman letters and characters in an MS Word document

Somebody asked Mark Swofford to help her devise a speedy, easy way to locate all the Chinese characters in a book-length manuscript that she was working on.  Mark set to work on the problem, and this is what he came up with:

"How to find Chinese characters in an MS Word document" (12/10/16)

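For readers who can export the manuscript to plain text, the same task can be sketched in a few lines of Python. This is not the Word-based method described in the linked post; it is an alternative that assumes the text is already available as a string. The code point ranges below cover the main Han ideograph blocks (CJK Unified Ideographs, Extension A, Compatibility Ideographs, and supplementary-plane Extensions B through F), and the function name and sample sentence are made up for the example.

```python
import re

# Runs of Han characters. Ranges (an assumption about what should count as
# "Chinese characters"): Extension A, the main CJK Unified Ideographs block,
# CJK Compatibility Ideographs, and supplementary-plane Extensions B-F.
HAN_PATTERN = re.compile(
    "[\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff\U00020000-\U0002ebef]+"
)


def find_han_runs(text):
    """Return (offset, run) pairs for each consecutive run of Han characters."""
    return [(match.start(), match.group()) for match in HAN_PATTERN.finditer(text)]


sample = "The greeting jī nián dàjí 鸡年大吉 is not the same register as 你好."
for offset, run in find_han_runs(sample):
    print(offset, run)
```

Pinyin with tone marks is deliberately not matched, since letters such as ī and á sit in the Latin blocks; only the characters themselves are reported, along with their offsets in the string.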

Offal is not awful

My son sent me this wonderful, learned post called "The best bits" from the "Old European culture" blog (12/7/2015).  It begins:

Offal, also called variety meats or organ meats, refers to the internal organs and entrails of a butchered animal. The word does not refer to a particular list of edible organs, which varies by culture and region, but includes most internal organs excluding muscle and bone.

The word shares its etymology with several Germanic words: Frisian ôffal, German Abfall (offall in some Western German dialects), afval in Dutch and Afrikaans, avfall in Norwegian and Swedish, and affald in Danish. These Germanic words all mean "garbage", or —literally— "off-fall", referring to that which has fallen off during butchering. However, these words are not often used to refer to food with the exception of Afrikaans in the agglutination afvalvleis (lit. "off-fall-meat") which does indeed mean offal. For instance, the German word for offal is Innereien meaning innards. According to the Oxford English Dictionary, the word entered Middle English from Middle Dutch in the form afval, derived from af (off) and vallen (fall).

Mystery modal window error message

Almost every day, when looking through the headlines on Google News, I see one or two stories where what's meant to be a snippet from the first paragraph of the story contains not a single word from the story but instead says this:

This is a modal window. This modal can be closed by pressing the Escape key or activating the close button. Close Modal Dialog. This is a modal window.

Chinese typewriter redux

We have looked at the Chinese typewriter again and again:

"Chinese Typewriter" (6/30/09)

"Chinese typewriter, part 2" (4/17/11)

"Chinese character inputting" (10/17/15)

By now we are thoroughly familiar with this unwieldy contraption.  Given that it has long since been consigned to the museum, where it properly belongs, it is strange that some folks continue to tout it as the wave of the future in information processing.

How many more Chinese characters are needed?

I was stunned when I read this op-ed piece in the NYT yesterday (10/24/16):  "China's Digital Soft Power Play".  In it, the author, Jing Tsu (a professor of Chinese literature and culture at Yale), writes:

This month, the Chinese government plans to introduce codes for some 3,000 Chinese characters as part of a grand project, known as the China Font Bank, to digitize 500,000 characters previously unavailable in electronic form. Until now, only 80,388 characters have been encoded in the international computing standard, Unicode.

The project highlights 100,000 characters from the country’s 56 ethnic minorities, and another 100,000 rare and ancient characters from China’s written corpus. Deploying almost 30 companies, institutions and universities, it’s the largest state-funded digitization project ever undertaken.

Pure Pinyin

A father speaks

[This is a guest post by Alex Wang, following up his remarks in "Learning to read and write Chinese" (7/11/16).]

The more Chinese I learn in order to teach my younger son to read and write it, the more I realize how, for lack of a better word, “ridiculous” it is for a “significant / modern” country to use such a reading and writing system. Perhaps I am wrong because I’m not well informed.

To provide some background, I grew up speaking only Chinese in the house. I went to Saturday school for a few years to learn a little bit of reading and writing, but had mostly forgotten all of it by the time I came to Shenzhen 9 years ago. I did not learn pinyin; I was taught Bopomofo, which I have forgotten entirely. I say this so that you understand my relative fluency in the spoken language. As for reading, I can now recognize perhaps several hundred characters.

A child's substitution of Pinyin (Romanization) for characters, part 2

This is a photograph of a page from an essay written by a third grade student at an elementary school in Suining, Sichuan Province, China:
