Language Log

Archive for Computational linguistics

Linguistic Deception Detection: Part 1

December 6, 2011 @ 1:01 pm· Filed by Mark Liberman under Computational linguistics

In "Reputable linguistic "lie detection"?", 12/5/2011, I promised to scrutinize some of the research on linguistic deception detection, focusing especially on the work cited in Anne Eisenberg's 12/3/2011 NYT article "Software that listens for lies". This post is a first installment, looking at the work of David Larcker and Anastasia Zakolyukina ("Detecting Deceptive Discussions in Corporate Conference Calls", Rock Center for Corporate Governance, Working Paper No. 83, July 2010).

[Update: as of 6/5/2019, the working-paper version no longer exists, but a version under the same title was published in the Journal of Accounting Research in 2012.]

Reputable linguistic "lie detection"?

December 5, 2011 @ 12:18 pm· Filed by Mark Liberman under Computational linguistics

Several readers have noted the article by Anne Eisenberg in Saturday's New York Times, "Software that listens for lies":

SHE looks as innocuous as Miss Marple, Agatha Christie’s famous detective.

But also like Miss Marple, Julia Hirschberg, a professor of computer science at Columbia University, may spell trouble for a lot of liars.

That’s because Dr. Hirschberg is teaching computers how to spot deception — programming them to parse people’s speech for patterns that gauge whether they are being honest.

For this sort of lie detection, there’s no need to strap anyone into a machine. The person’s speech provides all the cues — loudness, changes in pitch, pauses between words, ums and ahs, nervous laughs and dozens of other tiny signs that can suggest a lie.

Dr. Hirschberg is not the only researcher using algorithms to trawl our utterances for evidence of our inner lives. A small band of linguists, engineers and computer scientists, among others, are busy training computers to recognize hallmarks of what they call emotional speech — talk that reflects deception, anger, friendliness and even flirtation.

The immortal Pierre Vinken

December 1, 2011 @ 9:16 pm· Filed by Philip Resnik under Computational linguistics, Obituaries

On November 7, publishers Reed Elsevier announced the passing of Pierre Vinken, former Reed Elsevier CEO and Chairman, at age 83. But to those of us in natural language processing, Mr. Vinken is 61 years old, now and forever.

Though I expect it was unknown to him, Mr. Vinken has been the most familiar of names in natural language processing circles for years, because he is the subject (in both senses, not to mention the inaugural bigram) of the very first sentence of the Wall Street Journal (WSJ) corpus:

Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29.

But there's a fascinating little twist that most NLPers are probably not aware of. I certainly wasn't.

Towel-snapping semiotics: How the frontal lobe comes out through the mouth

November 30, 2011 @ 5:45 am· Filed by Mark Liberman under Computational linguistics

Yesterday's Tank McNamara:

Listeners needed for TTS standards intelligibility test

November 24, 2011 @ 6:36 am· Filed by Mark Liberman under Computational linguistics

Email from Ann Syrdal on behalf of the S3-WG91 Standards Working Group:

The "Text-to-Speech Synthesis Technology" ASA Standards working group (S3-WG91) is conducting a web-based test that applies the method it will be proposing as an ANSI standard for evaluating TTS intelligibility. It is an open-response test ("type what you hear"). The test uses syntactically correct but semantically meaningless sentences, Semantically Unpredictable Sentences (SUS).

To take the test, click here.

Spinoculars re-spun?

November 23, 2011 @ 8:43 am· Filed by Mark Liberman under Computational linguistics

Back in September of 2008, a Seattle-based start-up named SpinSpotter offered a tool that promised to detect "spin" or "bias" in news stories. The press release about the "Spinoculars" browser toolbar was persuasive enough to generate credulous and positive stories at the New York Times and at Business Week. But ironically, these very stories immediately set off BS detectors at Headsup: The Blog ("The King's Camelopard, or …", 9/8/2008) and at Language Log ("Dumb mag buys grammar goof spin spot fraud", 9/10/2008), and subsequent investigation verified that there was essentially nothing behind the curtain ("SpinSpotter unspun", 9/10/2008). SpinSpotter was either a joke, a fraud, or a runaway piece of "demoware" meant to create enough buzz to attract some venture funding. Within six months, SpinSpotter was an ex-venture.

An article in yesterday's Nieman Journalism Lab (Andrew Phelps, "Bull beware: Truth goggles sniff out suspicious sentences in news", 11/22/2011) illustrates the same kind of breathless journalistic credulity ("A graduate student at the MIT Media Lab is writing software that can highlight false claims in articles, just like spell check."). But the factual background in this case involves weaker claims (a thesis proposal, rather than a product release) that are more likely to be workable (matching news-story fragments against fact-checking database entries, rather than recognizing phrases that involve things like "disregarded context" and "selective disclosure").

Justin Bieber Brings Natural Language Processing to the Masses

November 18, 2011 @ 10:12 pm· Filed by Philip Resnik under Computational linguistics, Language and culture

Forget Watson. Forget Siri. Forget even Twitterology in the New York Times (though make sure to read Ben Zimmer's article first). You know natural language processing has really hit the big time when it's featured in a story in Entertainment Weekly.

Speech-based "lie detection"? I don't think so

November 10, 2011 @ 3:09 pm· Filed by Mark Liberman under Computational linguistics

Mike Paluska, "Investigator: Herman Cain innocent of sexual advances", CBS Atlanta, 11/10/2011:

Private investigator TJ Ward said presidential hopeful Herman Cain was not lying at a news conference on Tuesday in Phoenix.

Cain denied making any sexual actions towards Sharon Bialek and vowed to take a polygraph test if necessary to prove his innocence.

Cain has not taken a polygraph but Ward said he does have software that does something better.

Ward said the $15,000 software can detect lies in people's voices.

This amazingly breathless and credulous report doesn't even bother to tell us what the brand name of the software is, and certainly doesn't give us anything but Mr. Ward's unsupported (and in my opinion almost certainly false) assertion about how well it works:

Ward said the technology is a scientific measure that law enforcement use as a tool to tell when someone is lying and that it has a 95 percent success rate.

Real trends in word and sentence length

October 31, 2011 @ 8:34 am· Filed by Mark Liberman under Computational linguistics, Linguistic history

A couple of days ago, The Telegraph quoted an actor and a television producer emitting typically brainless "Kids Today" plaints about how modern modes of communication, especially Twitter, are degrading the English language, so that "the sentence with more than one clause is a problem for us", and "words are getting shortened". I spent a few minutes fact-checking this foolishness, or at least the word-length bit of it — but some readers may have misinterpreted my post as arguing against the view that there are any on-going changes in English prose style.

Sirte, Texas

October 31, 2011 @ 1:23 am· Filed by David Beaver under Computational linguistics, Language and politics, Linguistics in the news

According to Ben Zimmer, I'm writing from the front lines. But it's pretty quiet here, sitting at home in Texas, looking at tweets that have come out of Libya in the last couple of weeks. And somehow I don't think I'll be the first twitterologist to suffer from combat fatigue. Maybe that's because my students Joey Frazee and Chris Brown, together with our collaborator Xiong Liu, have been the ones doing computational battle in our little research team. That and the fact that nobody is firing mortars around here.

Yet quiet as it is where I'm sitting, it's a startling fact that today it's easy to hear far-off clamor, to listen to the online noise made by thousands of ordinary people. Ordinary people in war zones. What are those people thinking?

On the front lines of Twitter linguistics

October 30, 2011 @ 6:27 pm· Filed by Ben Zimmer under Computational linguistics, Language and technology, Language on the internets

I have a piece in today's New York Times Sunday Review section, "Twitterology: A New Science?" In the limited space I had, I tried to give a taste of what research is currently out there using Twitter to build various types of linguistic corpora. Obviously, there's a lot more that could be said about these projects and other fascinating ones currently underway. Herewith a few notes.

Where he at now?

October 30, 2011 @ 9:23 am· Filed by Mark Liberman under Computational linguistics, Linguistics in the comics

That's the question on a t-shirt designed by John Allison, the author of the Bad Machinëry comic series:

Remember that dude? Always poppin' up in the corner? Wonder what he doin' now? Where he at now?

For those who are too young (or too old, or too fortunate in some other way) to have encountered Microsoft's Office Assistant "Clippit", nicknamed "Clippy", the Wikipedia page may be helpful.

Amy was found dead in his apartment

October 26, 2011 @ 5:15 pm· Filed by Mark Liberman under Computational linguistics, Pragmatics, Semantics

I'm spending three days in Tampa at the kick-off meeting for DARPA's new BOLT program. Today was Language Sciences Day, and among many other events, there was a "Semantics Panel", in which half a dozen luminaries discussed ways that the analysis of meaning might play a role again in machine translation. The "again" part comes up because, as Kevin Knight observed in starting the panel off, natural language processing and artificial intelligence went through a bitter divorce 20 years ago. ("And", Gene Charniak added, "I haven't spoken to myself since.")

The various panelists had somewhat different ideas about what to do, and the question period uncovered a substantially larger range of opinions represented in the audience. But it occurred to me that there's a simple and fairly superficial kind of semantic analysis that is not used in any of the MT systems that I'm familiar with, to their considerable detriment — despite the fact that algorithms with decent performance on this task have been around for many years.
