## SpinSpotter unspun

What is spin? According to the OED's 1993 additions,

2. g. fig. A bias or slant on information, intended to create a favourable impression when it is presented to the public …

What is SpinSpotter? According to Claire Cain Miller in the NYT ("Start-Up Attacks Media Bias, One Phrase at a Time", 9/8/2008), it's a Web tool that "scans news stories for signs of spin".

The Spinoculars find spin in three ways, said Mr. Herman. First, it uses an algorithm to seek out phrases that violate six transgressions that the company’s journalism advisory board came up with based on the Society of Professional Journalists’ Code of Ethics. They are personal voice, passive voice, a biased source, disregarded context, selective disclosure and lack of balance. […]

SpinSpotter’s algorithm also uses a database of common phrases that are used when spinning a story. Finally, readers can flag instances of spin. Other SpinSpotter users can see these flags, and the reported phrases will enter the spin database.

The guy being quoted is  "SpinSpotter founder and chief creative officer, Todd Herman". Other stories about SpinSpotter — and there are quite a few of them — give a similar picture.

But here's another definition, offered by me in comments on a weblog post yesterday:

This might be an unusual type of demoware …, one that is released for general use in the hope that enough people will submit their proposed spin-spots to give the company enough free training data to actually develop some of the technology that they pretended to have in the first place.

I said this because when I downloaded and installed SpinSpotter, and tested it on several dozen news and opinion pieces, it did nothing at all. I expected to see silly complaints, like the ones that you often get from "grammar checkers". But it flagged nothing, right or wrong, as spin. The only exception was on Jon Fine's Business Week story about SpinSpotter, in which it highlighted two innocuous phrases as "spin", although the same phrases were not flagged in other stories, or even in the printable version of the very same Business Week page. [Update: Kip's analysis of SpinSpotter's code supports the view that it does no local analysis at all, but just "calls home" with the current page's URL.  Available evidence is that all the (rare) spin-spots that come back have been submitted by users — if any automated spin-spotting is going on, no one has yet reported an example.]

In other words, Mr. Herman's description — and the company's explanation on its web site of "what we do" — are prime examples of spin. In fact, I'd go a little further, and suggest that Mr. Herman's description is beyond "spin", edging into the territory of good old-fashioned "lies".

Being an optimistic sort of person who hates to think ill of his fellow humans, I hope that this is all just some kind of mistake, caused by a bug in the software release. But here are the facts as I uncovered them during my breakfast hour yesterday morning, in time taken away from much-needed course preparation, described in an over-long series of comments on a post in which Geoff Pullum complained about the company's common error of viewing "passive" as meaning "vague about agency".

My first comment:

The documentation seems to be rather carefully spun. It claims that as a first step, "SpinSpotter Looks for areas of news which appear to present editorial opinion as fact or other instances of "spin" from a published list of rules of spin".

The natural way to read this sentence is that SpinSpotter starts by analyzing the text algorithmically for instances of spin that meet the definitions on its published list. However, you could also read it to mean that SpinSpotter just looks the page up in the company's database.

The cited page says that the third step is (or perhaps will be?) that "SpinSpotter learns from the input of trusted users, which enables the system to find more spin with technology." The natural way to read this is that feedback from users will be used to re-train the system and make it better. However, you could also read it to mean that at present, they have no real spin analysis technology at all, but want to enlist users to provide them with free training material that they hope to use in the future to develop some spin-finding technology.

I downloaded and installed SpinSpotter and tried it on web pages that contained many instances of what should count as spin, by their "rules". Although I set their "slider" on the most sensitive setting, the only things that were flagged by the system on any of these pages were two word sequences that might have been detected by string search. Curiously, both of these spin-spots were in Jon Fine's article, where two two-word sequences are flagged. One is "some people" in the quote from Martha Steffens:

And SpinSpotter "might prove some stories don't have the level of bias that some people perceive," says Martha Steffens, a SpinSpotter adviser who's a professor at the Missouri School of Journalism.

The other is "may be" in the quote from Tom Curley, chief executive officer of the AP:

But he said "having some observations [in a news story] that are shaded one way or another may be good or may not, regardless of whether an algorithm spotted a word" or phrase.

When I checked a half-a-dozen other news articles containing the phrases "some people" and "may be" in similar contexts, nothing was flagged.

I also checked pages on the DNC and RNC web sites, and editorials by William Kristol and Bob Herbert, and pages on Free Republic and Daily Kos, and nothing at all was flagged on any of them. Perhaps the good people at SpinSpotter — or some of their early adopters — added a couple of spin-instances from Jon Fine's article to their database? I guess it could also be that they have some sensitive spin-detection algorithms that are set off by their own journalism-professor advisor and by the chief executive of the AP, but not by the Freepers or the Kossacks. Then again, maybe the whole thing is vaporware.

Or rather, since the browser toolbar and other aspects of the user interface seem to work quite nicely, I should say that it's apparently "demoware", in the sense of "software that has no real functionality, but instead only functions for demos, […] to give the person viewing the demo a sense of how the program will look…"

This might be an unusual type of demoware, though, one that is released for general use in the hope that enough people will submit their proposed spin-spots to give the company enough free training data to actually develop some of the technology that they pretended to have in the first place.

With respect to the passive-voice foolishness, there are several instances of genuine passive voice in Fine's article, and also several instances of active-voice sentences where the agent is unspecified or vague. SpinSpotter flagged none of them.

My tentative conclusion: SpinSpotter is basically a scam, and the documentation is a prime example of spin, and Jon Fine at Business Week got spun.

My second comment:

Further evidence that SpinSpotter is demoware: when I copy Jon Fine's article to my own computer and browse it at the new location, or even when I browse the printable version on the Business Week site, the two spin-spots go away. This is evidence, obviously, that the spotted spin is not being generated by algorithmically scanning the text in the browser, but by checking the URI in the company's database.
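The inference here can be made concrete with a minimal sketch, which is not SpinSpotter's code. It contrasts two hypothetical strategies: `lookupByUrl`, which consults a table keyed by page address, and `scanText`, which searches the page text for phrases on a fixed list. The URLs, the table contents, and both function names are invented for illustration.

```javascript
// Hypothetical illustration only: none of these names or URLs come from SpinSpotter.

// Strategy A: markers stored per URL, as a server-side database might do.
const markerTable = new Map([
  ["http://example.com/original-article", ["some people", "may be"]],
]);

function lookupByUrl(url) {
  // Returns whatever was recorded for this exact address, ignoring page content.
  return markerTable.get(url) || [];
}

// Strategy B: scan the text itself for phrases on a fixed list.
const phraseList = ["some people", "may be"];

function scanText(text) {
  const lower = text.toLowerCase();
  return phraseList.filter((phrase) => lower.includes(phrase));
}

const articleText =
  "Some stories don't have the level of bias that some people perceive; " +
  "shading may be good or may not.";

// The same text served from two different addresses.
const original = { url: "http://example.com/original-article", text: articleText };
const localCopy = { url: "file:///tmp/copy.html", text: articleText };

for (const page of [original, localCopy]) {
  console.log(page.url, {
    byUrl: lookupByUrl(page.url),
    byText: scanText(page.text),
  });
}
```

Under the URL-keyed strategy the local copy comes back with no markers, while a text scanner returns the same phrases at both addresses. Spin-spots that vanish when the page moves are what the first strategy predicts.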

It looks like misunderstanding the passive voice is the least of this enterprise's credibility problems. I speculate that the SpinSpotter management team are aiming their efforts at credulous venture capitalists, and that Jon Fine has swallowed the bait.

It's a bit surprising that the download wouldn't even contain a list of keyword-sequences to check by string matching, or some other simple-minded algorithm along the lines of the usage advice that some word processors give you. To start promoting a piece of software that does absolutely nothing whatsoever would be unusually brazen — maybe there's a bug in the initial download that prevents part of it from running, or at least prevented it from running on my machine?

It's hard to believe that (what appear to be) reputable people could be associated with promoting (what appears to be) such a crock, but there you are.

My third comment:

More on the SpinSpotter scam-spotting: I've tried the download in Firefox 3.x on three different machines, under OS X (10.4.11), Windows (Vista), and Linux (Ubuntu 7.04), with identical results.

I'll add this morning that I get essentially the same results as yesterday, except that in Jon Fine's Business Week article on SpinSpotter, an additional item is highlighted as "spin" — the company's name:

And SpinSpotter "might prove some stories don't have the level of bias that some people perceive," says Martha Steffens, a SpinSpotter adviser who's a professor at the Missouri School of Journalism.

I'll also add that in addition to the Business Week and NYT items cited above, I've read a half a dozen other pieces on SpinSpotter, including:

Wired News, "SpinSpotter Combats Unethical, Biased Journalism"
Washington Post, "Funding In Hand, SpinSpotter Ready To Call Out Media Bias"
Wall Street Journal, "At Tech Show, Even News Is Target for Improvement"

And so far, none of the journalists writing about this start-up appear to have noticed that its spin-spotting software doesn't actually do anything at all. [That's not exactly true — Claire Miller at the NYT notes the paucity of spin-spots, though she doesn't draw the conclusion that the text-scanning part of the algorithm apparently doesn't exist.]

Or rather, SpinSpotter does just one of the three things that its founder claims it does (and its press release and its web site likewise). The company has apparently implemented a system for sharing distributed human commentary on news and other web pages, maintaining a database of which text stretches in which web pages have been flagged by users as allegedly being instances of "spin". That's an interesting thing to do, though anyone who's spent much time reading about politics or language on Usenet groups or web forums will be forgiven for wondering whether the resulting annotations will be worth much.

The thing is, though, SpinSpotter's founder claims that their software uses "an algorithm to seek out phrases that violate six transgressions that the company’s journalism advisory board came up with based on the Society of Professional Journalists’ Code of Ethics […]", and that "SpinSpotter’s algorithm also uses a database of common phrases that are used when spinning a story".

And then, "Finally, readers can flag instances of spin". Well, that last part's apparently true (though to be honest, I haven't tried it — perhaps the reader flagging part doesn't work either, and the spin instances in Jon Fine's article were added at SpinSpotter headquarters). The first two claims? Pure spin. Or rather, to be blunt about it, falsehoods. Otherwise known as lies.

The best spin that I can put on this episode, from the company's point of view, is that they hope to recruit users to do free data annotation for them, and they genuinely plan to use this annotation to create algorithms for automated spin detection. As Dan Tobias suggested, this would make it the first "software version of stone soup". On this view, SpinSpotter is a sort of benign and well-meaning fraud, eventually to be redeemed by the wisdom of crowds.

I'm skeptical that this could lead to anything useful, basically for the reasons that Fev gives at Headsup. "Spin" isn't going to reduce very well to the kinds of local textual features that work for entity tagging or sentiment analysis; and even if it did, it's not clear that a free-for-all public political finger-pointing exercise is going to generate the training data that could show your system how to do it.

But the main point here is not about whether spin is a local feature of text, or whether the wisdom-of-crowds idea works in political discourse, or whether SpinSpotter is a benign fraud or a malignant one. The most striking thing about the whole business is how lazy and credulous the journalists who report on this stuff often are.

It takes about 15 minutes to download SpinSpotter and discover that it doesn't actually spot any spin. Why didn't Jon Fine at Business Week do this, as I did, and for that matter as Claire Miller at the NYT did? At a guess, because he wrote his story from a press release, maybe a demo at a conference, and perhaps an interview with the company's founder and other flacks.

Alas, these are the same fingerprints that can be found at many other journalistic crime scenes (e.g. "Flacks and hacks and brainscans", 11/23/2007).

1. ### Mark P said,

September 10, 2008 @ 9:02 am

PR releases are a godsend for some journalists. There's all that space that has to be filled, and it has to be filled almost every day.

A lot of media outlets consider it part of their job to promote economic activity, especially the kind of activity that results in advertising sales. And I'm not even being snarky now; it's literally true.

2. ### Rod Whiteley said,

September 10, 2008 @ 9:10 am

I added that spin marker on the company name yesterday. The database clearly works.

In political discourse, it's important to find ways to make falsehoods believable. When there are enough SpinSpotter users, the database will provide a sentence-by-sentence index of perceived truthfulness in political articles on the web. The SpinBot algorithm, the categories of spin, and the actual comments that users add are all irrelevant. Only the number of markers in a sentence is important.

If you were a writer of political spin, wouldn't you value this detailed feedback?

3. ### Mark Liberman said,

September 10, 2008 @ 9:12 am

Mark P: PR releases are a godsend for some journalists. There's all that space that has to be filled, and it has to be filled almost every day.

Yes, it's a symbiotic relationship, as I've observed.

4. ### Chris said,

September 10, 2008 @ 9:43 am

based on the Society of Professional Journalists’ Code of Ethics

In my opinion, based on observations of recent American journalism, the most surprising thing about this story is that such an object exists at all.

Next most surprising would be if anyone paid attention to it.

As Mark P points out, I think most journalists consider their primary, if not their only, responsibility to be raising the ad sales/circulation/ratings of their employer.

5. ### Stephen said,

September 10, 2008 @ 9:54 am

"Why didn't Jon Fine at Business Week and Claire Miller at the NYT do this [install SpinSpotter]?"

I think you're being slightly unfair to Miller, who does claim in her article to have installed the add-on and tested it a little. Though, granted, she doesn't appear to be especially skeptical when, after all her attempts, it only manages to detect a single instance of "spin" (and not even a particularly convincing one). Presumably by that point she'd already written most of the article and didn't want to scrap it.

[(myl) Point taken. I've adjusted the main text to correct this error.]

6. ### rpsms said,

September 10, 2008 @ 10:55 am

An algorithm is a finite set of discrete steps. While computers are generally assumed, especially in this context, I don't think they are necessary for implementation.

But the company definitely seems to be in the con business.

7. ### Trent said,

September 10, 2008 @ 11:27 am

I certainly won't pretend to speak for all journalists, but when I was a newspaper journalist in the 1980s, I and my fellows did _not_ consider it our duty to increase ad sales or circulation. In general, we were (at worst) hostile to capitalism or (at best) skeptical of capitalism.

We felt we were in a constant battle with folks on the revenue side of the business. When we were in school, we rolled our eyes at the fact that newspaper advertising and public relations were part of the department of journalism.

Perhaps broadcast journalists view things differently. I can tell you that we newspaper types did not consider broadcast journalists as journalists. We were insufferably arrogant, of course, but perhaps you have to be if you choose a profession that pays so terribly. We fed on the Ideals of the profession, I suppose, because we barely earned enough to eat in other ways.

8. ### Rick S said,

September 10, 2008 @ 11:49 am

Imagine that this bit of fogware catches on and lots of users start marking what they perceive to be spin. Imagine that, as Rod Whiteley suggests, those users start to see value arising out of the company's database of markers per sentence. It could become a viable product that way, and people might in time come to trust it without subjecting the texts to much scrutiny.

Imagine, then, the potential commercial and political value of applying weighting factors to the database contents, so that certain sites and authors have a higher threshold before spin is indicated. That is, imagine the value of spinning the spin spotting. Such is the danger of letting somebody else think for you.

(I have no compunction about being snarky.)
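The weighting scenario imagined in this comment can be sketched in a few lines. This is a hedged illustration under stated assumptions, not a description of anything SpinSpotter does: the site names, the per-sentence marker counts, the thresholds, and the function `flaggedSentences` are all invented.

```javascript
// Hypothetical sketch: site names, counts, and thresholds are invented.

// Number of user-submitted markers recorded for each sentence of an article.
const markerCounts = [0, 3, 1, 5];

// Per-site thresholds: a sentence is shown as "spin" only if its marker
// count reaches the threshold. A favoured site gets a higher bar.
const thresholds = new Map([["favoured.example", 6]]);
const DEFAULT_THRESHOLD = 2;

function flaggedSentences(site, counts) {
  const threshold = thresholds.has(site)
    ? thresholds.get(site)
    : DEFAULT_THRESHOLD;
  // Return the indices of sentences whose count meets the threshold.
  return counts
    .map((count, index) => ({ count, index }))
    .filter(({ count }) => count >= threshold)
    .map(({ index }) => index);
}

console.log(flaggedSentences("ordinary.example", markerCounts));
console.log(flaggedSentences("favoured.example", markerCounts));
```

With identical marker counts, the ordinary site has two sentences flagged and the favoured site has none. The only difference is a threshold that the reader never sees.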

9. ### kip said,

September 10, 2008 @ 1:22 pm

It's actually possible to view the source code of Firefox toolbars. From the SpinSpotter download page, save the toolbar.xpi file to your local machine (rather than opening/installing it). Then rename it to toolbar.xpi.zip and open it (e.g. with WinZip). Extract all the files somewhere. All the code is in the chrome/content/scripts directory.

I've looked through it very briefly and it seems that your assumptions are correct. In toolbar.js, when the page is loaded, a call is made to the getMarkers() function in markers.js, which phones home to SpinSpotter with the current URL, and gets back a list of markers (phrases which have been identified as spin on that URL).

I didn't exhaustively review all the code here, but I certainly didn't see any advanced algorithm to analyze the text. I did notice this:

    // Skip pages that link back to us! (not spotting of spin at spinspotter.com )
    if ( dom.getElementById("ssptr-lb-wrapper") ) {
        if ( this.DEBUG ) {
            ss_debug( this.PREFIX + "Content Changed.. skipping because this is us!" );
        }
        return;
    }

This means that if you don't want spin spotted on your site, you can just put this somewhere in your html:
<div id="ssptr-lb-wrapper"></div>
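
The flow described in this comment can be sketched as follows. This is a hedged reconstruction, not the toolbar's actual source: the function names `onPageLoad`, `fakeDocument`, and `fakeServer`, the example URL, and the returned object shape are assumptions made for illustration. Only the element id `ssptr-lb-wrapper` is taken from the quoted snippet.

```javascript
// Hypothetical reconstruction of the flow described above; not the toolbar's source.
// `fetchMarkers` stands in for the network call and is supplied by the caller.

const OPT_OUT_ID = "ssptr-lb-wrapper"; // element id quoted from the toolbar snippet

function onPageLoad(doc, url, fetchMarkers) {
  // A page containing the wrapper element is skipped entirely.
  if (doc.getElementById(OPT_OUT_ID)) {
    return { skipped: true, markers: [] };
  }
  // Otherwise only the URL is sent; the page text is never inspected.
  return { skipped: false, markers: fetchMarkers(url) };
}

// Minimal stand-in for a DOM document so the sketch runs outside a browser.
function fakeDocument(ids) {
  const present = new Set(ids);
  return { getElementById: (id) => (present.has(id) ? { id } : null) };
}

// Stand-in for the server: returns markers for one invented URL.
const fakeServer = (url) =>
  url === "http://example.com/story" ? ["some people"] : [];

const plainPage = onPageLoad(
  fakeDocument([]),
  "http://example.com/story",
  fakeServer
);
const optedOutPage = onPageLoad(
  fakeDocument([OPT_OUT_ID]),
  "http://example.com/story",
  fakeServer
);

console.log(plainPage);
console.log(optedOutPage);
```

In this sketch the page's text never enters the function at all, so the markers depend only on the URL and on whether the opt-out element is present.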

10. ### Mark Liberman said,

September 10, 2008 @ 1:25 pm

Chris: I think most journalists consider their primary, if not their only, responsibility to be raising the ad sales/circulation/ratings of their employer.

Trent: when I was a newspaper journalist in the 1980s, I and my fellows did […] were (at worst) hostile to capitalism or (at best) skeptical of capitalism.

There may be a greater tendency towards cheerleading in reporting on high-tech start-ups, or scientific "discoveries", or new ideas in health and beauty; but my impression is that the reason that (for example) reporters and editors at BBC News chose to promote breast-enlargement chewing gum was not that they saw the success of the product as good for their employer, but rather that they were (and are) in the habit of generating copy from light and credulous rewrites of press releases.

At least, that's true in certain areas. There are other contexts in which journalists are excoriated for being too cynical and too reflexively critical; the poor hacks, they just can't win.

September 10, 2008 @ 2:10 pm

Mark – I installed it – with some scepticism – to give it a 'test drive.' I have yet to get it to work on anything…

12. ### Nathan Myers said,

September 10, 2008 @ 2:15 pm

Trent: Those days are long gone.

The lesson for scientists hoping to communicate their work is to hand the journalists print-ready copy. If you can't write in that register, find somebody who can, but don't count on whoever gets your press release or abstract to do it right. Repeat important facts two or more ways in different paragraphs to make it more work to alter them. One hard fact is enough for a story; two might be too many. Assume some paragraphs will be printed, with idiotic alterations, others discarded, at random. Make sure that what matters comes through anyhow.

(A final note for astronomers: to end each press release with a prayer for insight into dark matter demeans everyone.)

13. ### Mark P said,

September 10, 2008 @ 3:25 pm

There is probably a difference in how business-related journalism is done at the larger newspapers or at networks and at small- to medium-sized papers or stations. In a smaller city, the media outlet is definitely part of the business community and dependent on its good will. Outlets in larger markets can afford some independence, but the tone is usually positive even there. My view of this might be influenced by the fact that I was a reporter at a medium-sized daily for several years in the mid-70s.

14. ### Mr Punch said,

September 10, 2008 @ 3:31 pm

Spin IIRC originally referred to efforts to control the public understanding of news events — done by "spin doctors" whose commentary influences perceptions of actual developments. Mostly, it's still used that way; and that fits with the rather vague OED definition.

SpinSpotter's releases are not spin in that sense; rather, they are "media events," created for publicity purposes and injected into the news stream. Larry Tye's book about Edward L. Bernays, who popularized media events, is indeed called The Father of Spin, but — let's just say that it would be useful to have a term referring specifically to manipulation through commentary.

15. ### Trent said,

September 10, 2008 @ 6:11 pm

When we are knowledgeable in a particular field, we notice errors made by outsiders. I've heard much silliness about journalism over the years. Some criticism is valid, some is just glib. Journalists cover the entire range from good to bad, from progressive to reactionary, from conscientious to careless. Because the typical journalist at a newspaper is a generalist, and because he or she may have to write 10 column inches within 20 minutes about something unfamiliar, there are bound to be errors — some substantive, some not so. A credible newspaper will correct errors. There's not much more that can be reasonably asked. Even in the '80s, budgets were tight; nowadays newspapers are barely able to survive. Demanding that a newspaper hire experts in all fields is just … unreasonable. Demanding that a journalist spend hours researching the material — well, you can get it perfectly accurate, or you can get it fast. Newspapers are in the business of being fast. Journals are in the business of being rigorous.

Each type of media has a different function. No serious researcher will go to a newspaper to learn about the intricacies of any particular field. A newspaper will tell you that a new study on Gene X has been published, but if you want an in-depth understanding of Gene X, you must go elsewhere. Newspapers can raise awareness — they are less able to raise understanding. That's not a criticism; that's recognition of the different functions of various media.

Glib statements that journalists are unthinking boosters of capitalism — that's too easy. Look, I might as well say that the chief joy of descriptivists is to mock prescriptive rules and standards, to grab that MWDEU off the shelf and go to town.

I was called a hack, once, in a letter to the editor complaining about a typo in one of my stories. Whatever. The letter, from an academic btw, contained two hardcore grammatical errors (the kind descriptivists would acknowledge, I mean). We corrected them and ran it. C'est la vie.

As far as TV goes — I'll be as glib as the rest of ya.

17. ### Jahi Chappell said,

September 11, 2008 @ 1:52 pm

I think SpinSpotter's claims actually may not (merely) be lies, but also Bullshit in the Frankfurtian sense. That is, since their software and press releases are so easily deconstructed, it may qualify for the moniker of insidious bullshit, at least at the point where it becomes expressed by journalists. (What is it with journalists named Miller at the Times, anyway?) Indeed, whether or not the press releases are bullshit, the whole endeavor of assuming that "spin" can be so precisely targeted, analyzed, and exposed is certainly Frankfurtian bullshit wrt the SpinSpotter Co., because it's not as if they're lying that it can be done — they quite assuredly don't KNOW if it can, and don't seem to have consulted seriously with anyone (i.e. Language Log-types) that would be able to give them an idea of its possibility. Thus, "It is just this lack of connection to a concern with truth—this indifference to how things really are—that I regard as the essence of bullshit." (A decent plain-words explanation of Harry G. Frankfurt's "On bullshit" is here, though likely most LL'ers won't need the layman's deconstruction.)

18. ### John Atcheson said,

September 15, 2008 @ 3:02 pm

Thanks to all for taking the time to throw a critical eye on our service. Though postings like this are hard to read sometimes, ultimately they're always helpful – if for no other reason than drawing our attention to areas where we messed up.

My full response is available here.

To see a couple of articles we believe give an accurate description of what we do, try Elinor Mills' article in CNET or Jake Swearingen's article in Venture Beat.

Sincerely,

John Atcheson
CEO
SpinSpotter

19. ### Spyware? said,

September 18, 2008 @ 10:18 am

So this sends all sites you visit back to SpinSpotter?

It's spying, then?