Spinoculars re-spun?

Back in September of 2008, a Seattle-based start-up named SpinSpotter offered a tool that promised to detect "spin" or "bias" in news stories. The press release about the "Spinoculars" browser toolbar was persuasive enough to generate credulous and positive stories at the New York Times and at Business Week. But ironically, these very stories immediately set off BS detectors at Headsup: The Blog ("The King's Camelopard, or …", 9/8/2008) and at Language Log  ("Dumb mag buys grammar goof spin spot fraud", 9/10/2008), and subsequent investigation verified that there was essentially nothing behind the curtain ("SpinSpotter unspun", 9/10/2008). SpinSpotter was either a joke, a fraud, or a runaway piece of "demoware" meant to create enough buzz to attract some venture funding. Within six months, SpinSpotter was an ex-venture.

An article in yesterday's Nieman Journalism Lab (Andrew Phelps, "Bull beware: Truth goggles sniff out suspicious sentences in news", 11/22/2011) illustrates the same kind of breathless journalistic credulity ("A graduate student at the MIT Media Lab is writing software that can highlight false claims in articles, just like spell check.")  But the factual background in this case involves weaker claims (a thesis proposal, rather than a product release) that are more likely to be workable (matching news-story fragments against fact-checking database entries, rather than recognizing phrases that involve things like "disregarded context" and "selective disclosure").

Phelps explains that

Dan Schultz, a graduate student at the MIT Media Lab […], is devoting his thesis to automatic bullshit detection. Schultz is building what he calls truth goggles — not actual magical eyewear, alas, but software that flags suspicious claims in news articles and helps readers determine their truthiness. It’s possible because of a novel arrangement: Schultz struck a deal with fact-checker PolitiFact for access to its private APIs. […]

Schultz is careful to clarify: His software is not designed to determine lies from truth on its own. That remains primarily the province of real humans. The software is being designed to detect words and phrases that show up in PolitiFact’s database, relying on PolitiFact’s researchers for the truth-telling. “It’s not just deciding what’s bullshit. It’s deciding what has been judged,” he said. “In other words, it’s picking out things that somebody identified as being potentially dubious.”

The problem, as Schultz explained to Phelps, is that

Things get trickier when a claim is not a word-for-word match [to something in PolitiFact's database]. […]

Schultz’s work explores natural language processing, in which computers learn to talk the way we do. If you’ve ever met Siri, you’ve experienced NLP. Schultz’s colleagues at the Media Lab invented Luminoso, a tool for what the Lab calls “common sense computing.” The Luminoso database is loaded with simple descriptions of things: “Millions and millions of things…Like, ‘Food is eaten’ or ‘Bananas are fruit.’ Stuff like that, where a human knows it, but a computer doesn’t. You’re taking language and turning it into mathematical space. And through that you can find associations that wouldn’t just come out of looking at words as individual items but understanding words as interconnected objects.

“Knowing that something has four legs and fur, and knowing that a dog is an animal, a dog has four legs, and a dog has fur, might help you realize that, from a word you’ve never seen before, that it is an animal. So you can build these associations from common sense. Which is how humans, arguably, come to their own conclusions about things.”

In other words, Schultz's dissertation project (which is apparently in fairly early stages of planning) will try to scan text for "claims" that match items in a database of statements that have been fact-checked. And the tricky part indeed will be the matching.

Some connections are going to be pretty easy. When there are proper names in association with exact quotes or specific numbers, so that PolitiFact's database tells us something like "X claimed that Y said Z", or "X claimed that Y dollars went to Z", just the set of people, numbers, and word-strings {X, Y, Z, …} is likely to be an effective diagnostic, independent of how many legs the set's members have or whether any of them are fruit. Using this approach depends on finding names, numbers, quotes, etc., both in the fact-checking database and in the stories being analyzed; but such "entity tagging" software is fairly mature and works fairly well.
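To make that idea concrete, here is a toy sketch (in Python) of what entity-overlap matching might look like. The regular expressions, the Jaccard-overlap threshold, the function names, and the sample database entries are all my own illustrative inventions, not anything from Schultz's design; a real system would use a trained entity tagger rather than these crude patterns.

    import re

    def shallow_entities(text):
        """Pull out capitalized name-like spans, numbers, and quoted strings."""
        names = re.findall(r'\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b', text)
        numbers = re.findall(r'\$?\d[\d,.]*(?:\s*(?:million|billion|percent|%))?', text)
        quotes = re.findall(r'"([^"]+)"', text)
        return set(names) | set(numbers) | set(quotes)

    def jaccard(a, b):
        """Overlap between two entity sets, as a fraction of their union."""
        return len(a & b) / len(a | b) if a | b else 0.0

    def best_match(claim, checked_claims, threshold=0.3):
        """Return the fact-checked entry whose entity set overlaps most with the claim."""
        claim_ents = shallow_entities(claim)
        scored = [(jaccard(claim_ents, shallow_entities(c)), c) for c in checked_claims]
        score, entry = max(scored, default=(0.0, None))
        return entry if score >= threshold else None

    # Toy usage; both "database" entries are invented placeholders.
    db = ['Smith claimed that 300 million dollars went to the Acme Project',
          'Jones said that unemployment fell to 8 percent in 2011']
    print(best_match('According to Smith, the Acme Project received $300 million', db))

Even this crude overlap score links the paraphrased claim to the right entry, because the proper names do most of the work.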

The matching will be harder if a suspect quotation is misquoted or paraphrased, or if names have to be matched to descriptions, or if numbers have to be matched to evaluative descriptions or comparisons — or in general if the software really needs to "understand" at least some of the propositional and referential content of the fact-check database and the stories being evaluated, and to do the matching in a semantic space. It's easy to construct artificial examples that seem to require full "understanding" for a successful match, but I suspect that in the real world, these will be rare, and that relatively simple techniques can work reasonably well. Still, the system will need to be more sophisticated than simple string-matching, involving techniques like within-document co-reference determination, cross-document reference resolution, normalization of various kinds of quantification, some treatment of hypernyms and hyponyms, etc.
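As one illustration of what "more sophisticated than simple string-matching" might involve, here is a small sketch of hypernym expansion using WordNet, via NLTK (it assumes the WordNet data has been installed with nltk.download('wordnet')). The function name and the depth parameter are my own; the point is only that a claim mentioning a "dog" can be made to overlap with a checked statement about an "animal", even though the surface strings differ.

    from nltk.corpus import wordnet as wn

    def hypernym_expansion(word, depth=3):
        """Return the word plus the lemma names of its hypernyms, up to `depth` levels."""
        expanded = {word}
        frontier = wn.synsets(word)
        for _ in range(depth):
            parents = [hyper for synset in frontier for hyper in synset.hypernyms()]
            for hyper in parents:
                expanded.update(name.lower() for name in hyper.lemma_names())
            frontier = parents
        return expanded

    # "animal" turns up among the hypernyms of "dog" within a couple of levels,
    # so the two words now overlap even though the strings don't.
    print('animal' in hypernym_expansion('dog'))

This is just one plausible ingredient; co-reference resolution and number normalization would need machinery of their own, but the example conveys the flavor of the problem.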

How should we quantitatively evaluate how well such a program is working? One now-common approach would use a test sample to estimate the system's recall and precision. Suppose that we have a set of stories containing (references to) N claims that actually do correspond to items in PolitiFact's database (according to consensus human judgment). Then the system's recall will be the proportion of those N claims that it flags and associates with the correct PolitiFact entry. And if the system flags M claims in the stories, its precision will be the proportion of those whose links to PolitiFact entries are correct. The system's F-measure is the harmonic mean of its precision and its recall.
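For concreteness, here is a minimal sketch of that evaluation scheme. The gold-standard links and the system's links in the example are invented, chosen only to show how the three measures are computed.

    def evaluate(gold_links, system_links):
        """Both arguments map claim identifiers to PolitiFact entry identifiers."""
        correct = sum(1 for claim, entry in system_links.items()
                      if gold_links.get(claim) == entry)
        recall = correct / len(gold_links) if gold_links else 0.0         # out of N true claims
        precision = correct / len(system_links) if system_links else 0.0  # out of M flagged claims
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)                      # harmonic mean
        return precision, recall, f_measure

    # Ten checked claims in the gold standard; the system flags eight, six of them correctly.
    gold = {f'claim{i}': f'pf{i}' for i in range(10)}
    system = {f'claim{i}': f'pf{i}' for i in range(6)}
    system.update({'claim6': 'pf99', 'claim7': 'pf98'})
    print(evaluate(gold, system))   # precision 0.75, recall 0.60, F-measure about 0.67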

Estimates of a system's precision and recall can be very different for different selections of test material; and as Yogi Berra told us, it's tough to make predictions, especially about the future. But based on what I know about the technologies involved, I'd expect a competent implementation of Dan Schultz's "truth goggles" to have precision and recall in the range of 0.6 to 0.8 — that is, the system will find roughly 70% of the PolitiFact-checked claims, and roughly 70% of the claims that it flags will be correctly linked to a PolitiFact check. (And as usual, you could tune the system to increase precision at the expense of recall, or vice versa.) If Schultz is able to get an F-measure as high as 0.9, on a representative sample of realistic material, I'll be impressed — and I'll be cheering from the sidelines for him to make that kind of score.
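Here, for what it's worth, is a toy illustration of that precision/recall trade-off: as you raise the score threshold below which candidate matches are discarded, precision generally goes up and recall goes down. The candidate scores and correctness labels are made up for the purpose of the example.

    def sweep_thresholds(candidates, n_gold):
        """candidates: list of (match_score, is_correct) pairs; n_gold: total true claims."""
        for threshold in (0.2, 0.4, 0.6, 0.8):
            kept = [correct for score, correct in candidates if score >= threshold]
            precision = sum(kept) / len(kept) if kept else 0.0
            recall = sum(kept) / n_gold
            print(f'threshold={threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}')

    # Eight candidate links over ten gold claims; higher-scoring candidates are more often correct.
    candidates = [(0.9, True), (0.85, True), (0.7, True), (0.65, False),
                  (0.5, True), (0.45, False), (0.3, True), (0.25, False)]
    sweep_thresholds(candidates, n_gold=10)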

I'm a bit less sympathetic to Andrew Phelps, whose Nieman Journalism Lab article reads more like a hype-generating press release than a serious journalistic presentation of an inchoate PhD project.

If you'd like to calibrate your own built-in psychological "truth goggles", you could point them at Claire Cain Miller's "Start-Up Attacks Media Bias, One Phrase at a Time", NYT 9/8/2008:

SpinSpotter, a new start-up, could send shivers across many a newsroom. The Web tool, which went live Monday at the DEMO technology conference in San Diego, scans news stories for signs of spin.

Users download Spinoculars, a toolbar that sits atop the browser and lets readers know if the story they are reading has any phrases or words that indicate bias. […]

The Spinoculars find spin in three ways […]. First, it uses an algorithm to seek out phrases that violate six transgressions that the company’s journalism advisory board came up with based on the Society of Professional Journalists’ Code of Ethics. They are personal voice, passive voice, a biased source, disregarded context, selective disclosure and lack of balance. […]

SpinSpotter’s algorithm also uses a database of common phrases that are used when spinning a story. Finally, readers can flag instances of spin.

Or Jon Fine's "Media Bias? Not if This Web Site Can Help It", Business Week 9/8/2008:

In a development that could titillate political partisans of all stripes, a new Web application promising to spot bias in news stories will launch on Monday, Sept. 8, just as this ferociously contested election season shifts into high gear.

[…] When turned on in a user's Web browser's toolbar, Spinoculars scans Web pages and spots certain potential indicators of bias. The toolbar also will allow its users to flag phrases in news stories and opine on those called out by other Spinspotter users. The application's algorithms work off six key tenets of spin and bias, which the company derived from both the guidelines of the Society of Professional Journalists' Code Of Ethics and input from an advisory board composed of journalism luminaries. […]

An early guided tour of a news story on the Web seen through the prism of Spinoculars showed how the service highlights, in red, phrases that may tip off instances of opinion creeping into reportage. Users can then mouse over a red "S" icon near the offending phrases on the Web page to see what caught Spinocular's attention, and see a more neutrally worded recasting of that portion of the article. […]

Before critics on the right or left rear up with accusations that this new arbiter of bias is lousy with biases of its own, it's worth noting that while Herman calls himself a conservative, CEO John Atcheson describes himself as "very progressive." SpinSpotter's advisory board contains notable names from the center, right, and left. Among them are longtime commentator and co-founder of blogging network Pajamas Media Roger L. Simon, National Review's Jonah Goldberg, and Mother Jones and The Nation contributor Brooke Allen.

You'll need to rely on your own BS-detection circuitry in this case — or on a comparison to "SpinSpotter unspun", 9/10/2008 — because a search of PolitiFact's database for "SpinSpotter" or "Spinoculars" comes up empty. And snopes.com is also unaware of this curious little historical exercise in social-media/NLP vaporware…



8 Comments

  1. J. W. Brewer said,

    November 23, 2011 @ 12:55 pm

    So if this is going to be "just like spell check," I'm wondering what hilarious side effect will be "just like Cupertinoisms."

  2. Dakota said,

    November 24, 2011 @ 5:00 am

    "…an algorithm to seek out … passive voice …" ??!? GKP will so be salivating.

    On the other hand, it's pretty easy for a human to scan a list of comments on the end of a news article and immediately determine the political persuasion of the commenter by which test-marketed words and phrases are being used. It should be easy enough to deliver up a list of those phrases, but less easy to keep up with them, as during election season they change frequently, depending on the latest poll that tells what resonates with a particular county that has the swing vote at any given moment.

  3. Stitch said,

    November 24, 2011 @ 10:43 am

    "Schultz is building what he calls truth goggles — not actual magical eyewear, alas, but software that flags suspicious claims in news articles and helps readers determine their truthiness."

    That word "truthiness?" I don't think it means what he thinks it means.

  4. Dan Hemmens said,

    November 24, 2011 @ 12:55 pm

    That word "truthiness?" I don't think it means what he thinks it means.

    Indeed, I'd always understood that "truthiness" was exactly what spin was designed to produce.

  5. maidhc said,

    November 25, 2011 @ 4:34 am

    I suspect that if it partially worked (because I'm dubious that it would ever fully work), what people would do is run their campaign commercials through it over and over, adjusting the wording each time until it passed the test, then advertise themselves as "Passed by SpinSpotter!"

On the other hand, there are certain words or phrases that signify to me that the person has no idea what they are talking about and it would be a waste of time to read or listen any further. Unfortunately not every area of discourse has such words for me.

  6. KeithB said,

    November 28, 2011 @ 11:28 am

I think that the article is using truthiness correctly; in other words, if the BS quotient is high, the truthiness quotient is high.

    I remember seeing some Made For TV SF movie where political speeches were broadcast on TV. The computer would analyze the speech and put little subtitles like "Emotional Appeal, no factual comment" and "This fact is not true."

  7. Catherine Havasi said,

    November 29, 2011 @ 12:29 am

The knowledge base that's used in Truth Goggles is indeed ConceptNet. You can see our current version (http://conceptnet5.media.mit.edu/).

    The tool that Truth Goggles uses for doing document understanding with ConceptNet is called Luminoso, and it has recently been spun out into a company by that name. Visit http://lumino.so for more information about this process and the company.

  8. Related by rj3sp - Pearltrees said,

    December 14, 2011 @ 6:32 am

    […] Language Log » Spinoculars re-spun? Things get trickier when a claim is not a word-for-word match [to something in PolitiFact's database]. [..] Schultz’s work explores natural language processing, in which computers learn to talk the way we do. If you’ve ever met Siri, you’ve experienced NLP. Schultz’s colleagues at the Media Lab invented Luminoso, a tool for what the Lab calls “common sense computing.” […]
