Political vocabulary display


In a comment on "The most Trumpish (and Bushish) words", D.O. wrote "It seems that we are missing (at least I was missing) a key piece about Mr. Trump's and Mr. Bush's speaking style. Mr. Bush is using significantly more words than Mr. Trump".

What he means is not that Bush talks more than Trump; in fact, the opposite is true. Donald Trump's announcement of his presidential run totaled about 6,400 words, whereas Jeb Bush's announcement came to fewer than 2,300. In the first Republican presidential debate, Trump contributed more than 1,800 words, while Bush contributed fewer than 1,600.

What D.O. means, I think, is that Bush displays his vocabulary at a greater rate; that is, he uses a larger number of distinct word types for a given number of word tokens. A traditional way to look at things of this kind uses a plot of word type count against word token count. And a type-token plot suggests that Mr. Bush's rate of vocabulary display is indeed greater than Mr. Trump's:

The specific recipe for this plot strung together the following texts:


Trump:

His announcement
His side of a press conference held on 8/11/2015
His side of a press conference held on 9/3/2015
His remarks from the first Republican candidate debate

Bush:

His announcement
His side of Face The Nation 5/31/2015
His side of an interview with Sean Hannity 6/16/2015
A speech he gave on 8/11/2015
His remarks from the first Republican candidate debate
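
A type-token curve of the kind described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the plot in this post: the regex tokenizer, the function names `tokenize` and `type_token_curve`, and the sample sentence are all assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and pull out word tokens.

    Assumption: a "word" is any run of letters, digits, or apostrophes.
    A real analysis would need a more careful tokenizer.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def type_token_curve(tokens):
    """Return a list whose i-th entry is the number of distinct
    word types seen among the first i + 1 tokens."""
    seen = set()
    curve = []
    for tok in tokens:
        seen.add(tok)
        curve.append(len(seen))
    return curve


# Hypothetical sample text, standing in for a concatenated transcript.
sample = "the cat sat on the mat and the dog sat on the cat"
tokens = tokenize(sample)
curve = type_token_curve(tokens)

print(len(tokens), "tokens,", curve[-1], "types")
```

To compare two speakers, one would string each speaker's transcripts together in order, compute one curve per speaker, and plot type count against token count on the same axes. A curve that climbs faster indicates a greater rate of vocabulary display.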

One factor to consider is that Trump's "texts" are all extemporized, while two of Bush's texts (his announcement and the 8/11/2015 speech) were delivered from a written version. Still, I suspect that even if we limited consideration to transcripts of extemporized speech, there would be a (smaller) difference of the cited kind.

Some relevant past posts:

"Britain's scientists risk becoming hypocritical laughing-stocks, research suggests", 12/16/2006
"Cultural specificity and universal values", 12/22/2006
"Vicky Pollard's Revenge", 1/2/2007
"Ask Language Log: Comparing the vocabularies of different languages", 3/31/2008
"Betting on the poor boy: Whorf strikes back", 4/5/2009
"Nick Clegg and the Word Gap", 10/16/2010
"Lexical bling: Vocabulary display and social status", 11/20/2014



  1. HKlang said,

    September 10, 2015 @ 6:01 am

    I need a Bush-Bush comparison to calibrate the device.

  2. D.O. said,

    September 10, 2015 @ 10:32 am

    Yes, this is more or less what I meant. Of course, what Prof. Liberman calls "types" would better be represented by lexemes, but distinct tokens is good enough.

    The plot that Prof. Liberman published in OP is, I believe, showing cumulative number of distinct tokens vs. the number of tokens uttered so far in some natural progression of speech. So, roughly and loosely speaking, it is the vocabulary display progression with time. In addition to numerical differences, we see that Bush's curve has 3 very distinct intervals. Maybe this is because of the mixture of the types of the speeches used.

    I have made another plot, the cumulative number of tokens versus distinct tokens, arranged not in the natural progression of speech, but from more to less frequent. I think two features are interesting: the difference is clear from as early as 20 distinct words (ok, tokens), and Bush's curve is much more of a straight line (showing that Mr. Bush adheres much more strongly to Zipf's law?), whereas Mr. Trump really likes his first 2 to 3 hundred (distinct) words and then doesn't mind using however much vocabulary is needed to say what he wants.

  3. Helma said,

    September 11, 2015 @ 10:56 am

    Now I wonder, how would Trump compare to Rudy Noun-Verb-9/11 Giuliani?

  4. Jeff said,

    September 13, 2015 @ 4:35 pm

    I hope there will be at least one linguistic benefit to emerge from Donald Trump's campaign: the revival of the now rarely heard fine old English word "trumpery". The man embodies the concept so precisely, it seems he wants to see his picture in the dictionary, as well as in all the tabloids.
