Language Log

The quality of quantity

April 24, 2012 @ 1:45 pm · Filed by Mark Liberman under Computational linguistics

The longer it is, the higher the rating:


Scientific study of affirmative-response indicators

April 23, 2012 @ 3:13 am · Filed by Geoffrey K. Pullum under Language and the media, Pragmatics, Style and register

My Breakfast Experiments™ aren't quite as rigorous as Mark Liberman's. He has direct access via a high-speed line to the entire Linguistic Data Consortium collection of corpora at his breakfast table, and writes R scripts for statistical analysis as if R was his native language (it may well be, come to think of it). My breakfast table has just a digital radio, a cereal bowl, and a mug bearing the legend "Keep calm and drink tea." But I'll give you some hard quantitative data for two different ways of expressing an affirmative response to a yes/no question or agreeing with a presented statement in contemporary British English. The frequency of people (especially experts) speaking to Radio 4 news programs saying "That's correct" falls in the monstrogacious to huge range (as measured by my casual early-morning impressions), while the frequency of that mode of affirmative responding in ordinary real-life conversation is roughly zero (source: vague memories of hearing people chat to each other). I hope that's rigorous enough for present purposes.


Possession and agency in editions of Barth

April 22, 2012 @ 8:43 am · Filed by Mark Liberman under Language and culture

John Barth is visiting Penn, and so I took the opportunity to catch up on his most recent meta-fictions, specifically The Development and Every Third Thought. I read the second one first, and will make no comment on it here, except to note that (while suitably Barthian) it lacked the feature that struck me so forcefully in reading the first one.


Watson v. Watson

April 21, 2012 @ 8:32 am · Filed by Mark Liberman under Computational linguistics

As Wikipedia explains,

Watson is an artificial intelligence computer system capable of answering questions posed in natural language,[2] developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's first president, Thomas J. Watson.

But as a page at AT&T Labs Research tells us,

AT&T WATSON℠ is AT&T's speech and language engine that integrates a variety of speech technologies, including network-based, speaker-independent automatic speech recognition (ASR), AT&T Labs Natural Voices® text-to-speech conversion, natural language understanding (which includes machine learning), and dialog management tasks.

WATSON has been used within AT&T for IVR customers, including AT&T's VoiceTone® service, for over 20 years during which time the algorithms, tools, and plug-in architecture have been refined to increase accuracy, convenience, and integration. Besides customer care IVR, AT&T WATSON℠ has been used for speech analytics, mobile voice search of multimedia data, video search, voice remote, voice mail to text, web search, and SMS.


Poetical etymologies

April 21, 2012 @ 6:17 am · Filed by Mark Liberman under Linguistics in the comics

Wondermark #829, 4/20, "In which pepper is explained":


Coin change 'skin problem fear' hed noun pile puzzle

April 21, 2012 @ 5:43 am · Filed by Mark Liberman under Crash blossoms, Headlinese, Psychology of language

SC, a native reader of British headlinese, was baffled by the noun pile-up "Coin change 'skin problem fear'" on the BBC News web site, because he hadn't previously encountered the story.


Rapper 50 Cent converted into Malaysian currency

April 20, 2012 @ 10:49 am · Filed by Ben Zimmer under Errors, Language and the media

Making the rounds today, from Andrew Bloch's Twitter feed:

Bloch's comment: "Reuters applies foreign exchange rate to 50 Cent. He is now known as RM1.50 in Malaysia."


Ongoing lexical fascism

April 20, 2012 @ 7:12 am · Filed by Geoffrey K. Pullum under Language and the media, Language change, Words words words

Over at Lingua Franca, where I do weekly blog posts for The Chronicle of Higher Education, I tried to refer to some ongoing research the other day, and called it that, and I was slapped down by my editor (she knows the New York Times style manual prohibitions far too well), quoting a remark by the managing editor: "If I see someone using ongoing in The Chronicle, I will be downcoming and he or she will be outgoing."

Lexical fascism! They would fire me for using ongoing as an adjective? Thank goodness for Language Log, I thought, where lexical liberty survives. So I'm back over here today, choosing my own words, ruminating resentfully on this stylistic bullying.


Ask a baboon

April 19, 2012 @ 7:41 am · Filed by Mark Liberman under Psychology of language

Sindya N. Bhanoo, "Real Words or Gibberish? Just Ask a Baboon", NYT 4/16/2012:

While baboons can’t read, they can tell the difference between real English words and nonsensical ones, a new study reports.

“They are using information about letters and the relation between letters to perform the task without any kind of linguistic training,” said Jonathan Grainger, a psychologist at the French Center for National Research and at Aix-Marseille University in France who was the study’s first author.


"Lie Fallow Small And Pave"

April 19, 2012 @ 2:45 am · Filed by Victor Mair under Lost in translation

Murray Clayton, a statistician from the University of Wisconsin, sent in this photograph of a sign on the Tamsui Fisherman's Wharf in Taiwan:



Jailed for tweeting

April 18, 2012 @ 1:13 pm · Filed by Geoffrey K. Pullum under Language and the law, Taboo vocabulary

The marginally linguistic topic of freedom of linguistic expression occasionally occupies me here on Language Log, as you probably know. And you may be aware that my instincts tend toward the libertarian end of the spectrum, and the defense of the First Amendment. Possibly you are also aware that there really isn't anything I despise and abhor more than racism. So the recent case of Liam Stacey here in the UK puts my principles in tension. He has been jailed for exercising what you might describe (incorrectly, I think) as his free speech rights on Twitter, having apparently forgotten that the UK does not have any analog of America's First Amendment. I'll review the facts of the case, including the language that he used. But do not read on unless you are prepared to see some seriously offensive linguistic material.


Pulling out (the words whose distribution is most similar to that of) a plum

April 17, 2012 @ 5:04 am · Filed by Mark Liberman under Computational linguistics

A few days ago ("Evaluative words for wines", 4/7/2012), I illustrated how a trivial method can help us uncover the contribution of individual words to the expression of opinion in text. For this morning's Breakfast Experiment™, I'll illustrate an equally trivial approach to learning how words fit together structurally, using the same small collection of 20,888 wine reviews.


The first "asshole" in the Times?

April 16, 2012 @ 10:47 am · Filed by Ben Zimmer under Language and politics, Language and the media, Taboo vocabulary

In "Larkin v. the Gray Lady," Mark Liberman credits a Language Log reader with pointing out that "the NYT printed asshole for the first time a couple of weeks ago" ("Race, Tragedy and Outrage Collide After a Shot in Florida", 4/1/2012):

Mr. Zimmerman told the dispatcher that this “suspicious guy” was in his late teens, with something in his hands. He asked how long it would be before an officer arrived, because “these assholes, they always get away.”

But this wasn't, in fact, the first time that asshole graced the pages of the Times. That verbal transgression was pioneered, like so many others, by Richard Nixon in the Watergate tapes.

