apostrophree

Our recent adventures with the vaporware/demoware SpinSpotter (here and here), which purports to detect passages of untrustworthy spin, reminded me of last month's software delight, apostrophree, which, it was said, automatically and silently

corrects common errors of spelling, punctuation, grammar, and usage in blogs and especially comments and discussion forms.

(this from a Typical Programmer interview with apostrophree's founder John Scogan).



SpinSpotter unspun

What is spin? According to the OED's 1993 additions,

2. g. fig. A bias or slant on information, intended to create a favourable impression when it is presented to the public …

What is SpinSpotter? According to Claire Cain Miller in the NYT ("Start-Up Attacks Media Bias, One Phrase at a Time", 9/8/2008), it's a Web tool that "scans news stories for signs of spin".

The Spinoculars find spin in three ways, said Mr. Herman. First, it uses an algorithm to seek out phrases that violate six transgressions that the company’s journalism advisory board came up with based on the Society of Professional Journalists’ Code of Ethics. They are personal voice, passive voice, a biased source, disregarded context, selective disclosure and lack of balance. […]

SpinSpotter’s algorithm also uses a database of common phrases that are used when spinning a story. Finally, readers can flag instances of spin. Other SpinSpotter users can see these flags, and the reported phrases will enter the spin database.

The guy being quoted is "SpinSpotter founder and chief creative officer, Todd Herman". Other stories about SpinSpotter (and there are quite a few of them) give a similar picture.

But here's another definition, offered by me in comments on a weblog post yesterday:

This might be an unusual type of demoware …, one that is released for general use in the hope that enough people will submit their proposed spin-spots to give the company enough free training data to actually develop some of the technology that they pretended to have in the first place.



Dumb mag buys grammar goof spin spot fraud

A SpinSpotter tool (a plugin for the Firefox browser) has been announced in a credulous article by Jon Fine in Business Week. It will (its inventors claim) scan the text of web pages that you view, and identify passages of untrustworthy spinspeak. Our experts at Language Log's research laboratory have run it through our secret multi-million-dollar bullshit detector, and we got a strong positive. Having written several times before on Language Log about people who publish claims about language, and mention the passive voice, when they are completely unable to tell an active clause from a passive clause, I was delighted to see one more instance. Look at this passage from Jon Fine's description of SpinSpotter, detailing the "tenets" (i.e., diagnostics) that enable SpinSpotter to spot spin:

The tenets are: reporter's voice (adjectives used by a journalist that go beyond the supporting evidence in the article); passive voice (example: a story says "bombs land" without stating which party is responsible for them); a biased source (a quoted source's partisanship is not clearly identified); disregarded context (a political rally's attendance is reported to be "massive," but would it have been so huge had the surviving members of the Beatles not played?); and lack of balance (a news story on a controversial topic gives much more credence to one side's claims).

Bombs land is of course an active clause. Passive clauses always have a participial form of the verb, in almost all cases (setting aside "concealed passives" like "This needs looking at") a past participle. The past participle of land has the form landed. So quite independently of the absurdity of an algorithm running on raw text being able to spot things as subtle as strength of supporting evidence or balance on controversial topics, the inventors of this crucially linguistic tool (or the people who wrote their press release) don't know even the most elementary things about English grammar. Caveat downloader.
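To make the grammatical point concrete, here is a rough sketch in Python of the test described above: look for a form of be immediately followed by something that could be a past participle. This is not SpinSpotter's method (nothing published says how their tool works). The word lists and the function names `is_past_participle` and `looks_passive` are made up for illustration, and a raw-text heuristic like this will miss concealed passives and will misfire on adjectives such as "tired".

```python
import re

# Hypothetical, deliberately tiny word lists; a real detector would need a parser.
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}
IRREGULAR_PARTICIPLES = {"seen", "taken", "given", "written", "known", "done", "made", "sent"}


def is_past_participle(word: str) -> bool:
    """Crude guess: regular -ed forms plus a few listed irregulars."""
    return word in IRREGULAR_PARTICIPLES or (len(word) > 3 and word.endswith("ed"))


def looks_passive(clause: str) -> bool:
    """True if a form of 'be' is immediately followed by a likely past participle."""
    tokens = re.findall(r"[a-z']+", clause.lower())
    return any(
        aux in BE_FORMS and is_past_participle(verb)
        for aux, verb in zip(tokens, tokens[1:])
    )


print(looks_passive("Bombs land"))                           # False: no 'be', no participle
print(looks_passive("Bombs landed"))                         # False: past tense, still active
print(looks_passive("Bombs were dropped by the air force"))  # True: 'were' + 'dropped'
```

Even this toy version gets "bombs land" right, which is more than can be said for the tenet as described.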



Authorship identification in the news

One of the curious things about the uses of linguistics in the legal context is that the smallest units of language get the most public attention. Linguists analyze language in all its shapes and forms, from minute sounds to broad discourse structures, but the media's interest is in the smaller language units like letters, punctuation, and words, not the larger language units like syntax, discourse structure, and conversational strategies.

A case in point is the area of authorship identification, which typically focuses on small language units such as morphology, lexicon, or stylistic choices found in evidence documents. It's tempting to think that such language features can actually identify authors with as much validity and precision as the way DNA analysis helps law enforcement identify suspects. Personally, I have some reservations about what I see linguists doing as they try to help the police and the courts determine issues of innocence or guilt.



Epenthesis, IPA, and r-fulness

John Wells has been posting a lot of nice stuff recently on his daily phonetics blog. The current page (no permalinks yet, alas) discusses epenthesis in toponyms and similar forms — why are graduates of Harrow "Harrovians" while people from Congo are "Congolese"? And what about "Kittitian" from St. Kitts, and "Torontonian", and "tobacconist", and so on? (Some relevant socio-historical information can be found in "Who let the 'n' in?", 1/22/2006; and "Chinian, not Chinese?", 1/26/2006. You may also be interested in the theological implications of such sound-pattern irregularities.)



If only the voters knew Greek

Many commentators have observed that John McCain is campaigning as if it were the Democrats, not the Republicans, who had been in office for the last eight years, hoping that voters will forget about George Bush and view the Republicans as the party of reform. If only more people had a classical education, McCain's choice of Sarah Palin as his running mate would have provided yet another point for the Democrats: the Ancient Greek word whose transliteration is the same as her family name, πάλιν, means "again" or "back".



Sentimental mush from the Washington Post

Jonathan Yardley in the Washington Post published a piece of pompous, sentimental mush yesterday. It's all about a little book he learned about in college and still carries around to this day and will love till he dies (yadda yadda yadda; violins, please); and yes, you guessed it, the book is E. B. White's disgusting and hypocritical revision of William Strunk's little hodgepodge of bad grammar advice and stylistic banalities, The Elements of Style. I have discussed it many times before here on Language Log. The appearance of this slop would have made me pretty sick, except that the wonderful Jan Freeman was on it like greased lightning. Jan's piece is called Return of the living dead, and it's a delight. (It contains original scholarship, too.) I have nothing more to say, except read Jan Freeman. She is a wonderful language writer. She should be writing for Language Log. But our organization, vast and powerful though it is, doesn't have the resources to steal her from the Boston Globe.



Wretched analysis, appalling reporting

This is a little lesson in how not to investigate a linguistic question and in how not to use expert opinion about that question. For a change, our target is not BBC News, but instead Wired.com. The piece, by Brian X. Chen, begins promisingly:

The validity of recent e-mails supposedly sent by Steve Jobs to Apple customers is questionable, according to an analysis by Wired.com.

We carefully examined the writing style and grammar of three recent e-mails claimed to have been sent by Jobs with three samples of his confirmed writing.

With help from Wired.com's copy editors and Patrick Farrell, head of the UC Davis linguistics department, we observed that the customer-reported e-mails contained elementary grammatical errors, which are absent from Jobs' real e-mails; the CEO has a much stronger command of the English language than recent e-mails suggest.

It appears, however, that the Wired staff started with a hypothesis, that the e-mail messages were not genuine, based on considerations that had nothing to do with the language of the messages, and then searched for linguistic features that would support this hypothesis, labeling as "grammatical errors" entirely acceptable variants in standard English.

Then, when the article finally gets around to Patrick Farrell, it turns out that what he actually said didn't agree with Wired's opinion:

"The grammar in all the e-mails is competent, native, and standard English," he said.

However, he said the evidence of just three short e-mails was too scant to come to a conclusion.

"I don't see anything obvious that would lead me to believe that the three questionable emails are fake," he said. "I think one would need more evidence. Longer emails or something."

Wretched analysis, appalling reporting.



Uptalk anxiety

A few days ago, I received this poignant note from an anxious parent in Pittsburgh:

I have developed a serious interest in the origination of uptalking and methods to treat it. As absurd as it may sound, my daughter is a Ph.D. and lives in another city. When she visits me, she populates most of her explanations with uptalking. She is a psychologist.

When I am conversing with her I become extremely anxious since I have fixated on the uptalking and it puts me at a severe level of discomfort. I discussed it with her several times. She claims that it is a speech pattern she developed which is normal and that it is my problem. I noticed that many of her friends, all professionals including psychologists, attorneys and physicians also engage in uptalking. Though she vehemently denies that she can stop uptalking to me, when she is angry she speaks perfectly. It appears that it is a psychological insecurity requesting some sort of approval or affirmation from the listener that what the talker says is correct, approved by the listener or adequately explained to the listener.

My daughter recommended that I seek therapy and that it is my problem. Has any research been done to show that not only has the phenomenon of uptalking been documented and described, but that it can have very negative affects on the listener?



Sarah Pawlenty?

Adding to the growing corpus of speech errors connected to the 2008 U.S. presidential campaign, we have Jo Ann Davidson, Co-Chairman of the Republican National Committee, at the Republican convention in St. Paul, 9/2/2008:

We are holding a convention to ((el- )) nominate a Republican woman governor, Sarah Pawlenty, our next vice president!



My, Karl, that's so 1984 of you

Comedy Central is currently showcasing this "astoundingly popular" video clip from The Daily Show:

Throughout the clip, Jon Stewart juxtaposes comments about [Alaska Governor and Republican V.P. nominee] Sarah Palin by [former Republican strategist] Karl Rove, [FoxNews blowhard] Bill O'Reilly, ["lying sack of shit"] Dick Morris, and [McCain's senior policy advisor] Nancy Pfotenhauer with other comments previously made by those same people about [Virginia Governor] Tim Kaine (Rove), Jamie Lynn Spears (O'Reilly), and Hillary Rodham Clinton (Morris, Pfotenhauer). The juxtaposition exposes a high level of hypocrisy among these conservative commentators: they all defend Palin with the same swords they use to attack Kaine, Spears, and Clinton. If you haven't already, please watch the video (better yet, the full episode): it's one of those laughs that'll make you cry.



Somewhere, at the end of the rainbow

The LPGA has announced that it is backing down from its "plans to suspend players who could not efficiently speak English at tournaments" (which I posted about here).

[Democratic California State Sen. Leland] Yee said he understood the tour's goal of boosting financial support, but disagreed with the method. "In 2008, I didn’t think an international group like the LPGA would come up with a policy like that," Yee said. "But at the end of the rainbow, the LPGA did understand the harm that they did."

This understanding is indirectly reflected in a statement from the LPGA:

"We have decided to rescind those penalty provisions," [LPGA Tour commissioner Carolyn] Bivens said in a statement. "After hearing the concerns, we believe there are other ways to achieve our shared objective of supporting and enhancing the business opportunities for every tour player."

[ Hat tip to Ben Zimmer. ]



Paying tax(es)

Having just posted (again) on less/fewer with plural C (count) nouns, I was primed to catch the following in Gail Collins's op-ed piece ("Sarah Palin Speaks!") in the NYT yesterday:

How many times have you heard McCain promise to slash taxes and pay for it by eliminating unnecessary programs? And who better to help carry out that agenda than the governor of a state whose residents pay less taxes than anyplace else in the union, because of their genius in making the federal government pay the tab for virtually everything?

Collins could have written pay less tax, with an M (mass) use of the lexical item TAX, or she could have written pay fewer taxes, with the modifier fewer that some usage critics insist on with plural C nouns. All three variants are attested, but not (apparently) with equal frequencies. I found Collins's pay less taxes entirely natural, indeed preferable to pay fewer taxes, and I would have found pay less tax natural too.

There are two points of interest here: yet another context where less is fine with plural C nouns, plus the double classification of TAX as M and C, with the result that M tax and plural C taxes overlap significantly in their meaning (a situation also seen for E-MAIL, SPAM, and some other lexical items).
