Archive for Language and technology

The future of Chinese language learning is now

When I began learning Mandarin nearly half a century ago, I knew exactly how I wanted to acquire proficiency in the language.  Nobody had to tell me how to do this; I knew it instinctively.  The main features of my desired regimen would be to:

1. pay little or no attention to memorizing characters (I would have been content with actively mastering 25 or so very high-frequency characters and passively recognizing at most a hundred or so high-frequency characters during the first year)

2. focus on pronunciation, vocabulary, grammar, particles, morphology, syntax, idioms, patterns, constructions, sentence structure, rhythm, prosody, and so forth — real language, not the script

3. read massive amounts of texts in Romanization and, if possible later on (after about half a year when I had the basics of the language nailed down), in character texts that would be phonetically annotated

Emojify the Web: "the next phase of linguistic evolution"

Today's announcement from the Google Chrome team (yes, note the date):


Swype and Voice Recognition for mobile device inputting

In late 2012, while visiting my son Tom in Dallas, I noticed that he was doing something very odd with his cell phone.  Most people enter text into their cell phone by pressing their thumbs (or their fingertip) on the letters of a small keyboard, whether virtual or actual.  But Tom was doing something altogether different:  he was sliding his finger over the glass surface of his phone and somehow, by so doing, he was able to enter text.  I was dumbfounded!  What amazed me most of all was how casual he was about it.  He'd be talking to me about something, then glance down at his cell phone, move his fingertip around on the glass, and — presto digito! — he'd have typed a message to someone and sent it off.

A fair-use victory for Google in these United States

US Circuit Judge Denny Chin has ruled in favor of Google in its long-running copyright litigation with the Authors Guild over the scanning and digitization of books. Chin ruled that the Google Books project constitutes fair use because it is "highly transformative" and "provides significant public benefits." In explaining those public benefits, Chin cited the use of Google Books data for Ngram queries, and pointed to a research example that we've discussed several times on Language Log.

Stupid FBI threat scam email

I recently heard of another friend-of-a-friend case in which people were taken in by one of the false email help-I'm-stranded scams, and actually sent money overseas in what they thought was a rescue for a relative who had been mugged in Spain. People really do respond to these scam emails, and they lose money, bigtime. Today I received the first Nigerian spam I have seen in which I am (purportedly) threatened by the FBI and Patriot Act government if I don't get in touch and hand over personal details that will permit the FBI to release my $3,500,000.

I wish there was more that people with basic common sense could do to spread the word about scamming detection to those who are somewhat lacking in it. The best I have been able to do is to write occasional Language Log posts pointing out the almost unbelievable degree of grammatical and orthographic incompetence in most scam emails. Sure, everyone makes the odd spelling mistake (childrens' for children's and the like), but it is simply astonishing that literate people do not notice the implausibility of customs officials or bank officers or police employees being as inarticulate as the typical scam email.

The one I just received is almost beyond belief (though see my afterthought at the end of this post). The worst thing I can think of to do to the senders is to publish the message here on Language Log, to warn the unwary, and perhaps permit those who are interested to track the culprit down. I reproduce the full content of the message source below, with nothing expurgated except for the x-ing out of my email address and local server names. I mark in red font the major errors in grammar and punctuation, plus a few nonlinguistic suspicious features.

Garakei: Galapagos cell phone

Recently I've been hearing about a Japanese electronic device called a "garakei ガラケイ". Mystified by this katakana word, which I assumed to be at least partially the transcription of some foreign term, I set about trying to find out more about it.

It wasn't hard to discover that the word basically means "Galapagos cell phone". What a strange name for a kind of cell phone!

More on Juola's stylometry

Worth reading if you were interested in the computational stylometric analysis by Patrick Juola that helped to unmask J. K. Rowling as the author of The Cuckoo's Calling: an article in The Chronicle of Higher Education about Juola's work.

Rowling and "Galbraith": an authorial analysis

The Sunday (UK) Times recently revealed that J.K. Rowling wrote the detective novel The Cuckoo's Calling under the pen name Robert Galbraith. The newspaper explained that, as part of their investigation, they sought the assistance of two scholars who have developed software to help with authorship attribution: Peter Millican of Oxford University and Patrick Juola of Duquesne University. Given the public interest in the Rowling revelation, I asked Patrick to write a guest post describing the authorial analysis that he conducted. (For more on the story, see my post on the Wall Street Journal's Speakeasy blog.)

Cupertinos in the spotlight

About seven years ago, in March 2006, I wrote a Language Log post about "the Cupertino effect," a term to describe spellchecker-aided "miscorrections" that might turn, say, Pakistan's Muttahida Quami Movement into the Muttonhead Quail Movement. It owes its name to European Union translators who had noticed the word cooperation getting replaced with Cupertino by a spellchecker that lacked the unhyphenated form of the word in its dictionary. Since then, I've had occasion to hold forth on the Cupertino effect in various venues (OUPblog, Der Spiegel, Radiolab, the New York Times, etc.). Now, Cupertinos are getting yet another flurry of publicity, thanks to a new book by the British tech writer Tom Chatfield called Netymology.

Providence talks

Emily Badger, "Providence Wins Mayors Challenge Prize for Early Childhood Project", The Atlantic Cities, 3/13/2013:

New York City Mayor Michael Bloomberg likes to say that cities are the new laboratories of democracy in the United States (sorry states!), particularly in an era of political paralysis in Washington. This was the premise behind the $9 million Mayor's Challenge launched last summer by Bloomberg Philanthropies, inviting any city with a population larger than 30,000 to submit a groundbreaking idea for funding. This morning, Bloomberg announced the five winners – including a $5 million grand prize to Providence, Rhode Island – for potentially replicable innovations "bubbling up" from cities in early childhood education, recycling, data analytics, civic entrepreneurship and resident wellbeing. [...]

Grand Prize ($5 million): Providence, Rhode Island: Research suggests that in just the first few years of life, low-income children hear millions fewer words than their middle- and upper-income counterparts, impacting the development of their vocabularies and setting back their long-term prospects for academic and career success. This program aims to close that "word gap."

Transit is departing

The electric train that runs between the different parts of Terminal 5 at London's Heathrow Airport insists on referring to itself as a "transit".

What's more, the remarkably annoying female voice that tells you needlessly that the doors are closing and that the train is about to start moving says "Transit is departing."

Google Translate Chinese inputting

Google Translate is so incredibly good — especially for typing Chinese and producing Pinyin (Romanization) with tones — that I rely on it a lot and am always afraid that, like so many software developers (e.g., Microsoft), they are going to add some unwanted bells and whistles or take away some basic features.  So today, when I turned on my Google Translate and saw a new wrinkle in the bottom left corner of the box into which you input Chinese, I was worried that it would lose the features that make it so easy for me to enter text.

iPhone Math

The rumors are flying that Apple will introduce a new device called the "iPhone Math" in June of this year.  Since that is a highly improbable name for an iPhone (is this going to be some kind of fancy calculator?), skeptical minds have been trying to find the source of the rumors.  The earliest known occurrences of the expression "iPhone Math" are to be found in Taiwanese media, so one suspects that there was some sort of distortion of a hypothetical "iPhone Plus / iPhone +" (semantic garbling) or a hypothetical iPhone Max (phonetic garbling).  After jumbled translation or transcription from English to Chinese, then back again into English, either of those names might conceivably have come out as "iPhone Math", which would indeed be a weird name for an iPhone.

Counting words

Far be it from me to pervert the noble institution of Language Log by exploiting it as a place to rant about the shortcomings of an unusably vile word processor. I know you wouldn't want that. This is Language Log, not Vile Word Processing Software Log. However, since the topic seems to have come up… Could I make a brief remark?

Remembering Aaron Swartz (and Infogami)

There have been many online remembrances of Aaron Swartz, the brilliant young programmer and Internet activist who killed himself on Friday at the age of 26. (See, for instance, Caleb Crain's piece for The New Yorker's Culture Desk blog and the many tributes linked therein.) It's typically noted that in 2005 Swartz founded the startup Infogami, which then merged with Reddit shortly thereafter. (In obituaries, Swartz has often been identified as a co-founder of Reddit — some dispute that characterization, but it's true that the Infogami wiki platform was a key to Reddit's early success.) I don't have any first-hand reminiscences to share, but with Infogami back in the news I thought it would be a good time to look back on something I wrote in 2006 about the company's name.
