Archive for Language and technology

Remembering Aaron Swartz (and Infogami)

There have been many online remembrances of Aaron Swartz, the brilliant young programmer and Internet activist who killed himself on Friday at the age of 26. (See, for instance, Caleb Crain's piece for The New Yorker's Culture Desk blog and the many tributes linked therein.) It's typically noted that in 2005 Swartz founded the startup Infogami, which merged with Reddit shortly thereafter. (In obituaries, Swartz has often been identified as a co-founder of Reddit; some dispute that characterization, but it's true that the Infogami wiki platform was a key to Reddit's early success.) I don't have any first-hand reminiscences to share, but with Infogami back in the news I thought it would be a good time to look back on something I wrote in 2006 about the company's name.

Read the rest of this entry »

The he's and she's of Twitter

My latest column for the Boston Globe is about some fascinating new research presented by Tyler Schnoebelen at the recent NWAV 41 conference at Indiana University Bloomington. Schnoebelen's paper, co-authored with Jacob Eisenstein and David Bamman, is entitled "Gender, styles, and social networks in Twitter" (abstract, full paper, presentation).

Read the rest of this entry »

A new chapter for Google Ngrams

When Google's Ngram Viewer was launched in December 2010, it encouraged everyone to be an amateur computational linguist, an amateur historical lexicographer, or a little of both. Today, the public interface that allows users to plumb the Google Books megacorpus has been relaunched, and the new version makes it even more enticing to researchers, both scholarly and nonscholarly. You can read all about it in my online piece for The Atlantic, as well as in Jon Orwant's official introduction on the Google Research blog.

Read the rest of this entry »

Panel on Digital Dictionaries (MLA/LSA/ADS)

Eric Baković has noted the happy confluence of the annual meetings of the Linguistic Society of America and the Modern Language Association, both scheduled for January 3-6, 2013 at sites within reasonable walking distance of each other in Boston. (The LSA will be at the Boston Marriott Copley Place, and the MLA at the Hynes Convention Center and the Sheraton Boston.) Eric has plugged the joint organized session on open access for which he will be a panelist, so allow me to do the same for another panel with MLA/LSA crossover appeal. The MLA's Discussion Group on Lexicography has held a special panel for several years now, but many lexicographers and fellow travelers in linguistics have been unable to attend because of the conflict with the LSA and the concurrent meeting of the American Dialect Society. This time around, with the selected topic of "Digital Dictionaries," the whole MLA/LSA/ADS crowd can join in.

Read the rest of this entry »

Global email skepticism

To my considerable astonishment I read this in a piece of boilerplate automatically tacked onto the end of an email reply that I received when I emailed my personal contact person and account manager at my bank:

This message originated from the Internet. Its originator may or may not be who they claim to be and the information contained in the message and any attachments may or may not be accurate.

I can't see anything in it that is actually incorrect (and I like the use of singular they); it just seems extraordinary to receive a sort of endorsement of global skepticism from one's bank. My philosophical friends tend to have no time at all for global skepticism of this sort. They would ask the sender, "Should we therefore not assume that this caveat is accurate? Should we doubt that it originated from the Internet, since the sentence saying so did?" And eventually the sender would vanish in a puff of logic.

Read the rest of this entry »

Remove this

In the bathroom at a friend's house tonight I saw, on the underside of the toilet lid, firmly affixed with adhesive, a printed paper sign that I truly do not understand. That is, although I comprehend it (it is in six languages, all of which I read well enough to be able to follow the legend in question), I don't follow what its purpose could possibly be. I am truly baffled. Let me show you what it said. Keep in mind that the following is all of what it says. Nothing is missing from the label, and there is no other wording at all (and incidentally, the various accent mistakes are not mine; they are copied from the original). See if you are as baffled as I am:

Read the rest of this entry »

Taikonaut

From a correspondent in Taiwan who wishes to remain anonymous:

Sometimes the word 'taikonaut' will be seen in news articles about PRC astronauts. This cuto-chinoiserie is really stupid. The premise seems to be that since Russian astronauts are called cosmonauts, PRC astronauts ought to have a special name too.

Read the rest of this entry »

Stupid answering machine program code

Perhaps it is because I'm at the Computability in Europe 2012 conference, a big meeting honoring the centenary of Alan Turing's birth, that I was reflecting on algorithms today. My phone answering machine at home is programmed to count the number of messages waiting to be listened to, storing the total in a variable I will call N, and then set another variable that I will call M to the initial value of 1; and the playback button causes the running of a routine of which the pseudo-code would be this:

if N = 1
   speak "You have one new message."
else
   speak "You have N new messages."
end if
for each M from 1 to N
   speak "Message M:"
   play message M
end for
speak "End of messages."
speak "To delete all messages, press Delete."

Can you see what's so incredibly annoying here, to a linguist, or anyone with some basic common sense about pragmatics?
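
For readers who would rather run the routine than read it, here is a minimal Python sketch of the pseudocode above. It is only a transcription for experimenting, not the machine's real firmware: the function name playback and the placeholder recording are my own assumptions, and the spoken lines are collected in a list instead of being sent to a speech synthesizer.

```python
def playback(messages):
    """Return, in order, the lines the machine would speak for these messages."""
    n = len(messages)  # N in the pseudocode: number of waiting messages
    spoken = []
    if n == 1:
        spoken.append("You have one new message.")
    else:
        spoken.append(f"You have {n} new messages.")
    # M in the pseudocode runs from 1 to N.
    for m, message in enumerate(messages, start=1):
        spoken.append(f"Message {m}:")
        spoken.append(message)  # stands in for "play message M"
    spoken.append("End of messages.")
    spoken.append("To delete all messages, press Delete.")
    return spoken


# Hypothetical single recording, to hear the routine with N = 1.
for line in playback(["(caller's recording)"]):
    print(line)
```

Calling playback with a single message, or with an empty list, makes it easy to hear exactly what the routine says in the edge cases.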

Read the rest of this entry »

Personal electronic deDeputys

On the heels of the notorious Nooking of War and Peace, Shane Horan sends along "a priceless search-and-replace error on the rules page of an Irish secondary school." St. Joseph's College in Borrisoleigh, County Tipperary has an entire section on "personal electronic deDeputys": though "mobile phones and other electronic deDeputys can be very useful and helpful," the school's rules say "these deDeputys may not be powered on while the student is on the school grounds, including before classes begins or at break or lunch time."
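
Damage of this kind often leaves a telltale capital letter in the middle of an otherwise lowercase word, which suggests a cheap way to hunt for it. The Python sketch below is a rough heuristic of my own, not anything the school or its software is known to use; the function name flag_suspect_tokens is hypothetical, and the sample text is just the fragments quoted above.

```python
import re

# One or more lowercase letters, then a capital, then any further letters:
# a token such as "deDeputys" that is capitalized in mid-word.
MID_WORD_CAPITAL = re.compile(r"\b[a-z]+[A-Z][A-Za-z]*\b")


def flag_suspect_tokens(text):
    """Return every token in text that has a capital letter in mid-word."""
    return MID_WORD_CAPITAL.findall(text)


# Fragments quoted from the school's rules page.
fragments = [
    "personal electronic deDeputys",
    "mobile phones and other electronic deDeputys can be very useful and helpful",
    "these deDeputys may not be powered on while the student is on the school grounds",
]

print(flag_suspect_tokens(" ".join(fragments)))
# ['deDeputys', 'deDeputys', 'deDeputys']
```

The heuristic also flags legitimate names such as iPhone or eBay, so its output is a list of candidates for a human proofreader and not a verdict.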

Read the rest of this entry »

"It was as if a light had been Nookd…"

Here on Language Log we've often talked about unfortunate search-and-replace miscorrections, which now seem to be infecting poorly edited e-reader texts. The latest example, via Kendra Albert on Jonathan Zittrain's Future of the Internet blog, is a doozy. The Nook edition of Tolstoy's War and Peace (in its English translation) has been de-Kindled, quite literally. Every instance of the text string kindle has been replaced by Nook.
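
The mechanics are easy to reproduce. The Python sketch below contrasts a blind substring replacement with one restricted to the whole word Kindle; it illustrates the general pitfall and is not a reconstruction of whatever tool the e-book's producer actually ran. The function names are hypothetical.

```python
import re


def naive_replace(text, old="kindle", new="Nook"):
    """Replace every occurrence of the substring, even inside longer words."""
    return text.replace(old, new)


def whole_word_replace(text, old="Kindle", new="Nook"):
    """Replace only standalone, case-matching occurrences of the word."""
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, new, text)


sentence = "It was as if a light had been kindled"
print(naive_replace(sentence))       # It was as if a light had been Nookd
print(whole_word_replace(sentence))  # It was as if a light had been kindled
print(whole_word_replace("Read it on your Kindle."))  # Read it on your Nook.
```

The word-boundary version leaves Tolstoy's verb alone while still catching the brand name, which is presumably all anyone intended to change.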


Read the rest of this entry »

Hyperbolic lots

For the past couple of years, Google has provided automatic captioning for all YouTube videos, using a speech-recognition system similar to the one that creates transcriptions for Google Voice messages. It's certainly a boon to the deaf and hearing-impaired. But as with Google's other ventures in natural language processing (notably Google Translate), this is imperfect technology that is gradually becoming less imperfect over time. In the meantime, however, the imperfections can be quite entertaining.

Read the rest of this entry »

Tasty cupertinos

A correction from The New York Times on Damon Darlin's article, "Economic Theory Plots a Course for Good Food" (4/10/12 online, p. D3 in the 4/11/12 print edition):

This article has been revised to reflect the following correction:

Correction: April 10, 2012

An earlier version of this article incorrectly referred to the Ethiopian dish doro wot as door wot. Additionally, the article referred incorrectly to awaze tibs as aware ties.

As noted on the Slate Twitter feed, these goofs are almost certainly the result of overzealous autocorrect — or, as we say in these parts, they're due to the Cupertino effect. We've documented many such cupertinos over the years (old site, new site). Foreign food terms have cropped up before — way back in 2005, before we even knew the Cupertino effect had a name, I noted that menus and recipes had fallen prey to the unfortunate spellcheck miscorrection of prostitute for prosciutto. At least prosciutto is likely to be in spellcheck dictionaries these days — the same can't be said for Ethiopian doro wot or awaze tibs, no matter how delectable those dishes may be.
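
To see how a spellchecker can arrive at such results, here is a toy Python sketch. It assumes a drastically simplified autocorrect that swaps any unknown word for its closest match in a word list, using the standard library's difflib as a stand-in for whatever similarity measure real spellcheckers use. The three-word dictionary and the function name autocorrect are invented for the illustration.

```python
import difflib


def autocorrect(word, dictionary, cutoff=0.6):
    """Return word if it is known; otherwise its closest dictionary match, if any."""
    if word in dictionary:
        return word
    matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
    # With no sufficiently similar candidate, the word is left alone.
    return matches[0] if matches else word


# Toy word list (an assumption): it lacks the Ethiopian food terms.
toy_dictionary = ["door", "aware", "ties"]

for phrase in ["doro wot", "awaze tibs"]:
    print(" ".join(autocorrect(w, toy_dictionary) for w in phrase.split()))
# door wot
# aware ties
```

In this toy setup wot survives only because nothing in the tiny word list resembles it closely enough; a real spellcheck dictionary is vastly larger, and its choices depend on ranking details this sketch ignores.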

(Craig Silverman of Poynter's Regret the Error is also on the case.)

Annals of airport Chinglish, part 3

Carley De Rosa spotted this sign in the Kunming airport on her way to Laos. She was dumbfounded by the Chinglish, not least because what it called an "elevator" was actually an "escalator", so on her way back from Laos she made sure to photograph the sign and send it to me for analysis:

Read the rest of this entry »
