Archive for Language and computers

Chinese Telegraph Code (CTC)

Michael Rank has an interesting article on Scribd entitled "Chinese telegram, 1978" (5/22/2015).

It's about a 1978 telegram that he bought on eBay.  Here's a photograph:

7,530,000 mainlanders petition Taiwan actress to change her name

From David Moser:

Paperless reading

Just a little over a year ago, I made the following post:

"The future of Chinese language learning is now"  (4/5/14)

The second half of that post consisted of an account of a lecture that David Moser (of Beijing Capital Normal University and Academic Director of Chinese Studies at CET Beijing) had delivered a few days earlier (on 4/1/14) at Penn:  "Is Character Writing Still a Basic Skill?  The New Digital Chinese Tools and their Implications for Chinese Learning".

Autocomplete strikes again

I think I know how an unsuitable but immensely rich desert peninsula got chosen by FIFA (the international governing body for major soccer tournaments) to host the soccer World Cup in 2022.

First, a personal anecdote that triggered my hypothesis about the decision. I recently sent a text message from my smartphone and then carelessly slipped it into my pocket without making sure it had gone to sleep.

Cantonese input methods

Despite the efforts of the central government to clamp down on and diminish the role of Cantonese in education and in public life generally, the language has been experiencing a heady resurgence, especially in connection with the prolonged Umbrella Movement last fall.

"Cantonese resurgent" (12/11/12)

"Here’s why the name of Hong Kong’s 'Umbrella Movement' is so subversive" (10/23/14)

"Translating the Umbrella Revolution" (10/3/14)

"Cantonese protest slogans" (10/26/14), etc.

Zhou Youguang, 109 and going strong

A year ago, I wrote "Zhou Youguang, Father of Pinyin" (1/14/14) to celebrate Zhou xiansheng's 108th birthday and his many accomplishments in language reform and applied linguistics.  Included in that post were a portrait of ZYG in his study and numerous links concerning the man and his works.

Stylometric analysis of the Sony Hacking

The question of who was behind the hacking of Sony peaked a couple of weeks ago, but it is still a live issue.  The United States government insists that it was the North Koreans who did it:

"Chief Says FBI Has No Doubt That North Korea Attacked Sony" (New York Times — January 8, 2015)

James B. Comey, director of the Federal Bureau of Investigation, said on Wednesday that no one should doubt that the North Korean government was behind the destructive attack on Sony’s computer network last fall.

Kazakh

Google Translate just keeps getting bigger and bigger and better and better.  As of today, it now includes Kazakh.  And here's the first word that I typed in Google Translate + Kazakh:

Қазақ

Tim Cook, Bent Man

Last week, China was gaga over Facebook chairman Mark Zuckerberg for gamely, if somewhat lamely, speaking Mandarin before an audience of Tsinghua University students:

"Zuckerberg's Mandarin" (10/23/14)

In the days following his sensational performance at Tsinghua, while not universally showered with adulation (and Facebook is still blocked in China), Zuckerberg was generally acclaimed for his gutsy, good-natured effort to speak to Chinese people in their own language.

In stark contrast, poor Tim Cook (Apple CEO) was mocked by the Chinese netizenry for his declaration in Bloomberg Businessweek:  "So let me be clear: I’m proud to be gay…."

"Tim Cook Speaks Up" (10/30/14)

The resultant hullabaloo on the Chinese internet was instantaneous:

"Tim Cook Coming Out Has Turned China Into a Nation of 5th-Graders:  Despite the Apple CEO's good intentions, Chinese netizens can't seem to stop mocking iPhones for being gay." (10/30/2014)

The paucity of two-letter words

The number of possible two-letter lower-case strings over the English alphabet (not including the apostrophe) is 26² = 676. This morning I ran a script to test which two-letter sequences show up as words in the standard 25,143-word list supplied with many Unix-derived systems (usually at /usr/share/dict/words). I found that the proportion of two-letter sequences that are two-letter words is roughly 9 percent (59/676 ≈ 0.09). That is, more than 90 percent of the logically possible two-letter combinations from aa to zz do not occur as spellings of common English words. You might think a lot of the explanation lies in phonetics: vowelless combinations like pq or bn are unpronounceable. But I then did the same thing for two-letter standard Unix commands: bc (basic calculator), cp (copy files), ls (list files), mv (move or rename files), etc. These arbitrarily adopted program names do not have to be pronounceable, and usually aren't. And I found that the proportion of two-letter sequences that are Unix commands (more precisely, two-letter commands that have manual entries on Apple OS X version 10.6.8) is almost exactly the same (62/676 ≈ 0.09). Why? Could it be that some kind of natural law discourages packing too many meanings into character strings (or phoneme sequences) of a given length, because it is likely to give rise to confusion or mnemonic problems? Does every language waste (as it were) at least 90 percent of the space available in the length-N sequences of letters or sounds that it uses, possibly for every N > 1?
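For readers who want to try the word-list count themselves, here is a minimal Python sketch. It is not the script used for the post, which is not shown; the function names `two_letter_hits` and `proportion` are made up for illustration, and the sketch assumes the word list has one entry per line and that only all-lower-case entries count as matches. The totals will vary with the word list installed on a given system.

```python
from itertools import product
from pathlib import Path
from string import ascii_lowercase


def two_letter_hits(words):
    """Return the sorted two-letter lower-case strings found among `words`."""
    # All 26 * 26 = 676 candidate strings, from "aa" to "zz".
    candidates = {a + b for a, b in product(ascii_lowercase, repeat=2)}
    # Strip newlines and surrounding whitespace, then keep exact matches only.
    return sorted(candidates & {w.strip() for w in words})


def proportion(hits):
    """Share of the 676 possible two-letter strings that are attested."""
    return len(hits) / 26 ** 2


if __name__ == "__main__":
    # Usual location on Unix-derived systems; skipped quietly if absent.
    path = Path("/usr/share/dict/words")
    if path.is_file():
        with path.open(encoding="utf-8", errors="replace") as f:
            hits = two_letter_hits(f)
        print(len(hits), "of 676:", round(proportion(hits), 3))
```

The same two functions can be applied to a list of command names (for example, one collected from the manual pages on a particular system) to repeat the comparison for two-letter Unix commands.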

Stray Chinese characters in English language documents

Lawrence Evalyn wrote to me saying that he received the official communication below about a new student card that is being issued by his university.  He was perplexed by all the Chinese characters that got inserted in the text.  They seem to appear consistently in certain places and for certain letters.  [N.B.:  The communication has been anonymized for posting on Language Log.]

Is the Urdu script on the verge of dying?

Hindi-Urdu, also referred to as Hindustani, is the classic case of a digraphia, so much so that there has been a long-standing controversy over whether they are one language or two.  Their colloquial spoken forms are nearly identical, but when written down, the one in the Devanāgarī script, the other in the Nastaʿlīq script, they have a very different look and "feel".

Language notes from Macao and Hong Kong

From June 13 until the 18th, I was at a conference on Buddhist culture and society held at the University of Macao.  There were about thirty participants, all except me from East Asia, and the East Asians were about evenly divided among scholars from Taiwan, China, Macao, and Hong Kong, plus one each from Japan and Korea.
