Last night, I got back from England in time to be faced with a dilemma: the third presidential debate between Barack Obama and John McCain, starting at 9:00 p.m., conflicted with the fifth game of the National League championship series between the Philadelphia Phillies and the Los Angeles Dodgers, starting around 8:30.
Based on past performances, I expected the NLCS to be more exciting than the debate. And there's this nifty method for summarizing debates: for each participant P, rank the words that P uses more than 10 times according to the ratio of P's count to the opponent's count. And CNN publishes an instant debate transcript…
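For the curious, the ranking described above can be sketched in a few lines of Python. This is only a rough illustration, not the script actually used for this post: the function names, the crude tokenizer, and the choice to treat an opponent count of zero as one (so the ratio stays finite) are all assumptions of mine.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(own_text, opponent_text, min_count=10):
    """Rank words a speaker uses more than `min_count` times by the
    ratio of the speaker's count to the opponent's count."""
    own = Counter(tokenize(own_text))
    opp = Counter(tokenize(opponent_text))
    ratios = {
        # Assumption: an opponent count of 0 is treated as 1 to avoid
        # dividing by zero; the method as described leaves this case open.
        word: n / max(opp[word], 1)
        for word, n in own.items()
        if n > min_count
    }
    # Highest ratio first
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `distinctive_words(mccain_text, obama_text)` and then swapping the arguments would give one list per candidate, where `mccain_text` and `obama_text` are hypothetical strings holding each speaker's turns from the transcript. Note that the ratio uses raw counts, so a candidate who simply talks more will get somewhat inflated ratios across the board.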
Still, I felt that I should pay at least some attention to what was going on at Hofstra University. So my solution involved a couple of radios, a TV with picture-in-picture, and several sites that were live-blogging one or the other event. In the end, the Phillies won the game 5-1, and will be going to the World Series. What about the debate?
Well, here's John McCain's word list:
And here's Barack Obama's:
In the abstract, "Obama business wants wealth" vs. "here economic McCain policies" seems like a plausible account of a debate between these two men. Alas, this misses (what most people took to be) the big stories of the evening.
Luckily, it's all on YouTube, including pre-digested thematic excerpts or collections, from the size of Joe the Plumber's health-care fine to Senator McCain's "I'm not Bush" line, the "health of the mother" exchange, the whole exasperation factor, and so on.
Still, I think my trivial word lists are not entirely without interest. In particular, I'm curious about something less down-to-earth than "plumber": why did Barack Obama use the little words "if" and "some" more than three times as often as John McCain did?
Here's some more data, a comparison of counts across all three debates for "if" (at least according to the CNN transcripts):
And for "some":
Given a set of observations like this, we could come to several different sorts of conclusions.
Maybe it's a meaningless, random statistical fluctuation. After all, there are lots of words, and people vary randomly in how often they choose different words on different occasions, and the way I've gone about this analysis is likely to turn up some differences that arise purely by chance.
Then again, maybe the difference (between individuals or across occasions) is real, but reflects a stylistic difference in the way messages are framed (e.g. "If we want to do X, we need Y" versus "In order to do X, we need Y"), rather than a difference in the underlying distribution of messages. If the difference is a stylistic one, it might be a stable feature of the different individuals involved, or it might reflect a more temporary priming effect, whether lexical or semantic or rhetorical.
Or perhaps the observation reflects a genuine difference in the kinds of ideas that the two candidates are presenting, or at least the spin they want to put on these ideas.
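The first of these possibilities, chance fluctuation, is the easiest to put a rough number on. Here is a minimal sketch of a two-proportion z-test on one word's rate of use. The function name is my own, and the test assumes each word is an independent draw, which running speech certainly is not, so treat the result as a loose guide at best.

```python
import math


def two_proportion_z(count_a, total_a, count_b, total_b):
    """Compare how often a word occurs in two speakers' output.

    count_* is the number of times the word was used; total_* is the
    speaker's total word count. Returns (z, two-sided p-value).
    """
    p_a = count_a / total_a
    p_b = count_b / total_b
    # Pooled rate under the null hypothesis of no difference
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

One would call it as `two_proportion_z(if_obama, words_obama, if_mccain, words_mccain)`, with the four hypothetical arguments filled in from the transcript counts. Because the word lists were picked out by scanning many words for large ratios, a small p-value for any single word overstates the evidence; dividing the significance threshold by the number of words examined (a Bonferroni correction) is one conservative adjustment.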
What do you think? Please try, as we professors say, to be specific and to give reasons for your answer.