Language Log

Language guru runs with the journalistic pack

June 17, 2010 @ 3:03 pm · Filed by Mark Liberman under Language and politics, Language and the media

[Update 6/20/2010 — The linked CNN story has been extensively modified, for the better. The headline is now "Language mavens exchange words over Obama's Oval Office speech," and the article now highlights Ron Yaros along with Payack, and incorporates some information from this post. Fev at headsuptheblog has some before-and-after analysis.]

It's amazing what a grip Received Perceptions have on what passes for journalism these days. Today, CNN enlisted Paul Payack to lead us through an unusually contentless version of one of the standard categories of Obama criticism ("Language guru: Obama speech too 'professorial' for his target audience", 6/17/2010):

President Obama's speech on the gulf oil disaster may have gone over the heads of many in his audience, according to an analysis of the 18-minute talk released Wednesday.

How can we tell? Well, for a start,

Tuesday night's speech from the Oval Office of the White House was written to a 9.8 grade level, said Paul J.J. Payack, president of Global Language Monitor. The Austin, Texas-based company analyzes and catalogues trends in word usage and word choice and their impact on culture.

Wait, what? Text at a ninth-grade reading level is too professorial for the American people to understand? When it's read out loud to them? Color me skeptical. But wait, there's more…

Though the president used slightly less than four sentences per paragraph, his 19.8 words per sentence "added some difficulty for his target audience," Payack said.

Oh, OK, that's all right then. And Mr. Payack's numbers are right on the money here, I checked. By my count, Obama's speech had 2,668 words in 135 sentences, for an average of 19.76 words per sentence.

I think we can all agree that those are shockingly long professor-style sentences for a president to be using, especially in addressing the nation after a disaster. Why, they were almost as long as the ones that President George W. Bush, that notorious pointy-headed intellectual, used in his 9/15/2005 speech to the nation about Hurricane Katrina, where I count 3283 words in 140 sentences, for an average of 23.45 words per sentence! And we all remember how upset the press corps got about the professorial character of that speech!
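
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the words-per-sentence comparison. The function name `words_per_sentence` is just an illustrative label, and the counts are my own tallies as given above; a different way of splitting sentences would give somewhat different numbers.

```python
def words_per_sentence(words: int, sentences: int) -> float:
    """Mean sentence length: total words divided by total sentences."""
    return words / sentences

# Counts quoted in the post (the author's own tallies).
obama = words_per_sentence(2668, 135)  # Oval Office oil-spill speech, 2010
bush = words_per_sentence(3283, 140)   # Hurricane Katrina speech, 2005

print(f"Obama: {obama:.2f}, Bush: {bush:.2f}")  # prints Obama: 19.76, Bush: 23.45
```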

[Someone at CNN crafted this little bubble of nonsense based on Payack's original analysis here, which pushes the same empirically-empty pack-journalism meme ("Obama Oil Spill Speech Echoes Elite, Aloof Ethos"), but is slightly less contentless, in the sense that it presents a larger number of mostly-meaningless numbers.

A sample of previous LL Payackology: "There will be passives", 11/7/2008; "The million word hoax rolls along", 1/3/2009; "Forbes on neologisms, and the return of the million-word bait-and-switch", 4/23/2009; "The millionth word in English could be 'sucker'", 5/12/2009; "End times at hand", 6/6/2009.

And some may be amused to compare the current ("aloof, elite") meme about presidential language with the equally overblown "bushisms" meme that Jacob Weisberg milked for eight long years…]

[Update — R.L.G. at The Economist's Johnson blog got to this before I did, contributing a well-phrased debunking of the "grade level" calculation as

a mindless bit of math that takes average word-length and sentence-length and assigns it a misleading equivalent in high-school or university reading levels. It takes no account of the skill with which words are chosen and sentences constructed, nor (in an orally delivered speech) how well or ploddingly it is delivered.

as well as continuing the reductio ad absurdum of this mis-analysis:

Microsoft Word can calculate the "Flesch-Kincaid" reading level for any bit of text. It tells me that the Gettysburg Address is on a 10.9 reading level, and the first section of Winston Churchill's storied "We shall fight them on the beaches" speech rates a downright incomprehensible 12.6. Yet of course neither speech is called "professorial". It seems that for the gullible reporters at CNN passing along Mr Payack's "analysis", confirmation bias is alive and well.

]


30 Comments

Coffee Tea Linguistics said,

June 17, 2010 @ 3:28 pm

Since when is sentence-length considered some absolute measurement of being professorial or a cause for processing difficulty? We can create many long sentences which are structurally and semantically quite simple; the guru's method is too naive to tell anything substantive.
Dierk said,

June 17, 2010 @ 3:55 pm

Let me get that straight … journalists from CNN and some communication expert first define a 'target audience' for a prime time televised presidential speech, then say the president doesn't reach that audience because his text is too complex and cerebral.

Several questions:

– Isn't the target audience of such a speech the whole citizenry [and more] of the US of A?
– What's the target group for CNN and communication experts?
– Is CNN saying US Americans are by and large not more intelligent and educated than 9th graders? Or, in other words, Europeans are right when they say US Americans are stupid?

It could, OTOH, just be what Jon Stewart found out about Faux and Fiends – they just hate Barack Obama, regardless what he does or says, he's wrong.

[(myl) When it comes to pack journalism, I'm afraid that your questions are misguided, in that they presuppose some minimal standard of rational discourse. In my experience, work of this kind is generated by a process of association in which logic has little or no role, and therefore questions like "is CNN saying …" can't be answered in a way that assumes a chain of connections between facts and conclusions.

To avoid accusations of partisanship, let me observe that the journalistic obsession with "Bushisms" — more or less the opposite accusation from the current "too professorial" meme — was cut from the same cloth.

Things of this kind can be motivated by ideology — no doubt Fox "news" people are anti-Obama, just as those who pushed the "bushisms" business were anti-Bush — but in most cases, I think the motivation is a combination of snark and intellectual laziness. Consider for example Maureen Dowd, who finds some dismissive stereotype to associate with every public figure that she ever writes about.]
John Cowan said,

June 17, 2010 @ 3:57 pm

Pointed-headed is new to me. Is it a mere thinko for pointy-headed? Google gives pointed-headed cabbage, pointed headed bolts, etc.

[(myl) Slip of the finger-mind combination. Fixed now.]
goofy said,

June 17, 2010 @ 4:05 pm

"Characters per words – Obama’s words had an average of 4.5 letters in them, a bit longer than typical for him."

This is even more meaningless than the other meaningless numbers.

[(myl) Word length is an important factor in the calculation of so-called readability indices, like the Automated Readability Index, which is just a linear combination of letters per word and words per sentence, minus a constant offset. Most of these indices are amazingly simple-minded — they don't even consider word frequency statistics, much less anything about syntactic or semantic complexity, and as a result, it's easy to construct texts that would test many "grades" away from their nominal index, if you actually checked how well kids of different ages were able to understand them. And (though I haven't seen much work on this) the application of these indices to real-world materials has a high enough variance, relative to actually measured readability, that I don't believe that they tell us much about speeches like those under discussion here. (If in fact they tell us anything much useful about anything.)]
Croogs said,

June 17, 2010 @ 5:00 pm

"Though the president used slightly less than four sentences per paragraph, his 19.8 words per sentence "added some difficulty for his target audience," Payack said."

Using a 25-27 word sentence (is nineteen-point-eight one word or three?) to criticize sentences averaging 19.8 words is probably not the most self-aware thing one can do.
Ralph Hickok said,

June 17, 2010 @ 5:12 pm

Does the word "paragraph" have any genuine meaning when applied to a speech, as opposed to a written document?

[(myl) The division of such a speech into "paragraphs" is more or less well-defined, if you start from the version released by the source (here the White House). But in fact, an editor could obviously redo the paragraph divisions — and for that matter the sentence divisions — so as to change the counts significantly, without changing the word sequence or any aspects of the structure or meaning of the speech.]
Mark P said,

June 17, 2010 @ 5:29 pm

The empty-headed gullibility of the media and the self-promotion of a self-styled expert seem a marriage made in heaven.
Lazar said,

June 17, 2010 @ 5:36 pm

"Ethos" seems like a remarkably elite and aloof word for them to be using.

[(myl) Next they'll be telling us about Logos and Pathos…]
Nathan Myers said,

June 17, 2010 @ 6:34 pm

I had never heard Obama's speech until immediately before the election. (Who has time for audio, at one second per second? Transcripts, please!) What has struck me since is not anything "professorial", it's how he pronounces particles like "to". Perhaps it is just Chicago speech that I don't hear much of, but it always sounds to me as if he can't be bothered to finish his words. He doesn't sound like he's in a hurry to get to the next word, so the impression is that the word ending just doesn't merit being pronounced. Curiously, British doesn't give this impression; there, it almost feels as if the speaker dwells on word endings by omitting them.

Nothing here should be taken to suggest judging Obama himself. It's the speech mannerism that grates, not what he says or why. (If he could just bring himself to prosecute torturing murderers, I would be able to listen to anything he says without being distracted by a grip in the pit of my stomach.)
Layra said,

June 17, 2010 @ 8:14 pm

I would have boggled at the supposed difficulty of a 9.8 grade comprehension level, but then I remembered that we live in a country where "Are You Smarter Than A Fifth-Grader?" is a viable show. Which is not to imply that the show is any better a measure of the intelligence of a fifth grader than the various readability indexes are measures of readability.
Rubrick said,

June 17, 2010 @ 8:16 pm

Since when is sentence-length considered some absolute measurement of being professorial or a cause for processing difficulty?

Since textbook evaluators got lazy and stingy, I suspect. Why hire humans with expertise to evaluate the readability of texts (or, heaven forbid, actually conduct studies in which students of the target age are given the material and then tested for comprehension) when you can just get a computer to generate meaningless numbers for cheap? And if it's good enough for school boards, it must be good enough for journalists, right?

[(myl) The readability formulae go back a ways. I think that the earliest one was described in Rudolf Flesch, "Marks of readable style: a study in adult education", Bur. of Publ., Teachers Coll. Columbia University, 1944; and a revised version in "A New Readability Yardstick", Journal of Applied Psychology, 32(3): 221-233, Jun 1948. Flesch was a serious researcher, who made significant contributions to reading research. But I don't think that he would have had much patience for some of the contemporary applications of his ideas.]
Jack Lynch said,

June 17, 2010 @ 8:29 pm

For a lark I once ran the first sentence of Milton's Paradise Lost through some software that estimates "reading level." Different tests give you very different numbers. I was delighted, though, to see that one of them said PL was appropriate for (something like) 187th graders. Just another 166 years of school and I should be ready for it!

[(myl) Well, for example, the grade-level estimate of the Automated Readability Index is

4.71*characters-per-word + 0.5*words-per-sentence – 21.43

Since the first sentence of Paradise Lost has 523 alphabetic characters in 125 words, we get

4.71*523/125 + 0.5*125 – 21.43 = 60.78

That's not quite the 187th grade, but spring semester of the 60th grade is still more education than most of us have had. In comparison, George Washington's first Inaugural address ranks merely at a 30th-grade reading level, according to ARI.

This is all somewhat amusing, but it underlines the difficulties with estimating reading level by means of a linear-regression formula based on easy-to-measure proxies like word and sentence length. It's better than a poke in the eye with a sharp stick, as the saying goes (in fact Paradise Lost and GW's first Inaugural are indeed somewhat hard going), but we shouldn't attribute much authority to it, and numbers like 60th and 30th grade are simply silly.]
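
The ARI arithmetic above can be reproduced in a few lines of Python. This is only a sketch: the function name `ari_grade_level` and its parameter names are illustrative labels of mine, the coefficients are the ones in the formula quoted above, and the counts are those given for the first sentence of Paradise Lost.

```python
def ari_grade_level(characters: int, words: int, sentences: int) -> float:
    """Automated Readability Index, using the formula quoted above."""
    chars_per_word = characters / words
    words_per_sentence = words / sentences
    return 4.71 * chars_per_word + 0.5 * words_per_sentence - 21.43

# First sentence of Paradise Lost: 523 alphabetic characters in 125 words,
# treated as a single sentence.
paradise_lost = ari_grade_level(characters=523, words=125, sentences=1)
print(f"{paradise_lost:.2f}")  # prints 60.78
```

Note that the sentence-length term alone contributes 62.5 "grades" here, which is why a single very long sentence sends the index off the scale.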
Steambadger said,

June 17, 2010 @ 8:41 pm

"Obama’s spoke in long, though well-crafted, sentences about 20 words in length."

You'd think a "language guru" could write a better sentence than that. Sounds like something out of a Barbara Cartland novel.
Mark said,

June 17, 2010 @ 10:32 pm

The third sentence of The Berenstain Bears and Too Much TV, which I just read to my 2 year old daughter:

"And except for one small cloud of dust billowing behind the school bus as it came over the hill, the air was sparkling clean." (24 words)

[(myl) FYI, the "grade level" for that sentence, by various automated indicators, is:

Gunning Fog index: 9.6
Coleman Liau index: 7.49
Flesch Kincaid Grade Level: 8.52 (the one that Payack used)
Automated Readability Index: 10.2
SMOG: 8.48

]
Nijma said,

June 17, 2010 @ 11:54 pm

Nathan Myers,
What has struck me since is not anything "professorial", it's how he pronounces particles like "to". Perhaps it is just Chicago speech
Not Chicago, I'm from Chicago and points west, and say something more like tə. (It drives me crazy too.) He pronounces it more like taa or tuh, and draws it out. It sounds to me more like Iowa; could it be a souvenir of his mother's family's Kansas background?
J. Goard said,

June 18, 2010 @ 12:43 am

Did they average all the sentences? Could you nearly halve your average by saying "that's right" or "that's my plan" or "i guarantee it" after every long sentence? How about saying "thank you" twice after every burst of applause?

[(myl) The index calculations depend only on the mean values, not on any other properties of the distributions, for sentence length as well as other variables. So a text where all the sentences are twenty words long will score the same as one where half the sentences are 2 words long and half of them are 38 words long; or any other distribution whose mean is the same.]
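
The point about means can be illustrated with a short Python sketch using the Automated Readability Index. The function name `ari_from_sentence_lengths` is hypothetical, and holding characters-per-word fixed at 4.5 (the figure quoted earlier in the thread for this speech) is an assumption made purely so that only the sentence-length distribution varies.

```python
def ari_from_sentence_lengths(sentence_lengths, chars_per_word=4.5):
    """ARI computed from a list of per-sentence word counts.

    chars_per_word is held fixed (an assumption for illustration) so that
    only the sentence-length distribution differs between the two texts.
    """
    mean_length = sum(sentence_lengths) / len(sentence_lengths)
    return 4.71 * chars_per_word + 0.5 * mean_length - 21.43

uniform = [20] * 10            # every sentence is 20 words long
lopsided = [2] * 5 + [38] * 5  # half are 2 words, half are 38 words

# Both lists have a mean of 20 words per sentence, so the index cannot
# tell them apart.
print(ari_from_sentence_lengths(uniform) == ari_from_sentence_lengths(lopsided))  # prints True
```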
Nathan Myers said,

June 18, 2010 @ 4:19 am

I know people from Kansas who have no hint of that. Javanese I've known don't either.
Ginger Yellow said,

June 18, 2010 @ 6:21 am

I'm struggling to get my head around the idea that 20 words is long for a sentence. At my paper, we have a (very loosely enforced) cap of 30 words, designed to help keep our style punchy. That's 50% longer than Obama's average.
Mark P said,

June 18, 2010 @ 8:15 am

The story is not that Obama is too smart, but that the American people are too stupid to understand material at the 10th grade level. And the idea that Obama is the darling of the (liberal) media is just another (right-wing) journalistic meme. I think part of the reason for this story is the journalistic urge to find a twist on a story. The story used to be that Obama was a terrific public speaker. Now we find that experts think he isn't. And a new meme is born.

[(myl) People like to have a thumbnail characterization of other people, and journalists are especially big fans of this kind of stereotyping, since it gives them ready-made no-thought-required story themes that will resonate with their readers.

The empirical emptiness of most of these thumbnails is underlined by the fact they're often mutually inconsistent. For a while, the thumbnail on Obama was "all style, no substance". Now it's mostly "too professorial, not inspiring enough". He's now alternately tagged as "Obambi" (MoDo's coinage) and as that "Chicago shake-down artist".

You have to keep in mind that there's generally an inverse correlation between pushing such stereotypes and having any real evidence to offer. We last saw that displayed in the great parade of punditry focusing on "Obama the pronoun-abusing narcissist", in which tens of thousands of words of thumb-sucking were based on a frothy mixture of nothing much at all.

This case is no exception. Mr. Payack trots out a few nearly-meaningless numbers, and the media (including many bloggers) obliges by arguing about whether the problem is Obama's elite aloofness or the public's bad education. In fact, the problem is none of the above: it's the media's shallowness and credulity.]
Ross Bender said,

June 18, 2010 @ 9:42 am

Ages ago (1994) I presented a paper at the CALICO Symposium titled "Sentence Complexity and Electronic Text Type."

http://rossbender.org/CALICO94.html

Intrigued by the possibilities of the new "style-checkers" (I was using a product called READABILITY), I analyzed then available digital samples of everything from King Lear to the Book of Genesis to Time Magazine, the Wall Street Journal and the Book of Mormon. In addition I had some samples of early synchronous chat, including some from the Media Moo at MIT.

I believed that I had shown a distinction between synchronous chat, which was more like conversation on scales like those Douglas Biber and John Swales had posited. But that was long ago, and READABILITY was actually a fairly sophisticated instrument compared to what Microsoft Word does for you.
Mark P said,

June 18, 2010 @ 10:03 am

I didn't mean to say that the American people are really too stupid to understand 10th grade material. What I intended, of course, is that the point of Payack's "analysis" and CNN's story is that the American people need their information predigested and strained so that even an idiot can understand.

I'm sure many in the media are reasonably intelligent, so I can't figure out why they act like they aren't.
Mark P said,

June 18, 2010 @ 10:04 am

Well, if not the point, at least the underlying assumption.
Jim said,

June 18, 2010 @ 1:22 pm

"I'm sure many in the media are reasonably intelligent, so I can't figure out why they act like they aren't."

Because they are at least as cynical as they are intelligent. Business is business, after all, and aw shucks stupidity and other down-homey poses sell.
James Kabala said,

June 18, 2010 @ 2:50 pm

Just curious: Grade-level rating is a key part not only of Microsoft Office and similar programs but, more significantly, of many standardized tests. Is it meaningless there as well? Has the educational establishment been suckered?
Don Sample said,

June 18, 2010 @ 6:12 pm

Stephen Colbert had a nice bit on this, on Thursday's show, complete with his version of the speech Obama should have given to avoid speaking above the comprehension level that the writers of the article seemed to want. "See spot. [Image of Gulf, with small oil spill] See spot grow. [Image of bigger oil spill] Grow spot grow! [Gulf totally covered in oil]" and so on.

(I'm also a little surprised that there weren't any comments about him pulling out his copy of Strunk & White last week, when arguing about the merits of the Oxford Comma with his guests.)
Ross Bender said,

June 19, 2010 @ 10:49 am

Not to continue to beat a horse which may in fact be deceased, but in defense of Microsoft Word, its "readability" count does have some utility for writers of academic prose. Unfortunately one has to run through the gauntlet of its spelling and grammar checker to obtain the Flesch-Kincaid scores. But if you're writing, say, a chapter for a textbook aimed at college undergraduates it can be useful. Much academic writing seems to be done by professors whose style is that of an old horse with the blind staggers.

To return to my CALICO 94 paper (ahem – that link is

http://rossbender.org/CALICO94.html ), no less an authority than Douglas Biber wrote a book back in 1988 titled Variation Across Speech and Writing. He used the terms "speaking" "oral" "literate" and "writing" to map the differences between speaking and writing, using as examples the four genres of:

1) face-to-face communication
2) academic lectures
3) personal letters
4) academic expository prose

Number 1 is a spoken genre with highly oral situational characteristics.
Number 2 is a spoken genre with highly literate characteristics
Number 3 a written genre with oral characteristics
Number 4 a written genre with highly literate characteristics

M.A.K. Halliday in his 1989 Spoken and Written Language posited that written language is characterized by "lexical density" and spoken language by "grammatical intricacy", using his own definition of lexical, or content words, and grammatical items. He came up with his own index of "lexical density."

All this is to say that while Mr. Payack's analysis of Obama's speech in the context of modern journalism is something of a joke, the basic issues have been seriously addressed for some time. I don't know what the wizards over at the Linguistic Data Consortium are up to these days, but however it is they're massaging their megagig corpora certainly has its roots in the sort of analysis that Biber, Halliday, Swales and others did some years ago, and in the history of creation of corpora like the Brown and LOB.
Szwagier said,

June 19, 2010 @ 11:20 am

@Ross.

Thanks for reminding me about Biber's book and Halliday's work! Somehow I'd completely forgotten about them, despite the fact I read them both many years ago (although I can't pretend I understand the intricacies of Biber's statistical methods – my bad).

I wonder what the readability index for Variation Across Speech and Writing is. While reading it, I felt it was somewhere around Jack Lynch's 187th grade.
McLemore said,

June 20, 2010 @ 1:22 pm

@Don Sample
(I'm also a little surprised that there weren't any comments about him pulling out his copy of Strunk & White last week, when arguing about the merits of the Oxford Comma with his guests.)
I loved that! and the thought that Colbert's character has a mirror-image affinity with a certain raving linguist. :) They were talking about Vampire Weekend's lyric "fuck the Oxford Comma," which Geoff might especially enjoy, being a former rock star himself.
Mark Dowson said,

July 12, 2010 @ 9:54 am

Back in the 1960's, when the UK Daily Mirror was a moderately left-wing popular paper (GKP will probably remember), their editorial policy was that the ideal sentence length was 14 words – not a cap, but a guideline. Within this guideline the Mirror subs (sub-editors/copy editors) did a terrific job. It was almost impossible to further sub a Mirror article without losing content or clarity.
Sophmoric application of readability tests said,

May 21, 2012 @ 10:12 pm

[…] also recommend Mark Liberman's 2010 critique on Language Log of a conversely simple-minded conclusion, a suggestion that a speech by President Obama was […]
