Inaugural Americans again


In response to my post "Inaugural Americans", Steven Bird wrote:

It's easy to do something like this with NLTK:

import nltk
inaugural = nltk.Text(nltk.corpus.inaugural.words())
inaugural.dispersion_plot(['America'])

This produces plots like:

[Image: lexical dispersion plot of "America" across the inaugural addresses]



3 Comments

  1. NLTK: a natural language processing toolkit in Python said,

    October 11, 2008 @ 9:22 am

    […] (Spotted on Language Log.) […]

  2. Mark F. said,

    October 20, 2008 @ 8:37 pm

    Looking through old posts trying to find a book recommendation I remember, I come across this and am reminded to wonder: How do I interpret this plot? One possibility seems to be that the "word offset" is the sequential position of each occurrence when all the words in all the speeches are ordered chronologically, but I can't convince myself that there isn't some other possible interpretation.

  3. Steven Bird said,

    March 31, 2009 @ 6:01 am

    Yes, that's the correct interpretation. For more discussion of this example and others like it, see chapter 1 of the NLTK book [http://www.nltk.org/book].
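
The interpretation confirmed in the last comment can be illustrated without NLTK. The following is only a minimal sketch of the idea, not NLTK's own implementation: the function name `word_offsets` and the toy `speeches` data are invented for illustration, and it assumes the speeches are supplied in chronological order. All the tokens are treated as one long sequence, and the "word offset" of an occurrence is its position in that sequence.

```python
def word_offsets(documents, target):
    """Return the position of each occurrence of `target` when all
    documents are laid end to end as a single token sequence."""
    offsets = []
    position = 0  # running index across every document seen so far
    for tokens in documents:
        for token in tokens:
            if token == target:
                offsets.append(position)
            position += 1
    return offsets


# Toy stand-in for the corpus (hypothetical data, oldest speech first).
speeches = [
    ["fellow", "citizens", "of", "America"],
    ["America", "is", "strong"],
    ["we", "the", "people", "of", "America"],
]

print(word_offsets(speeches, "America"))  # [3, 4, 11]
```

Each offset would become one tick on the horizontal axis of a dispersion plot, so ticks that cluster toward the right correspond to more recent speeches under the chronological-order assumption.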
