Geo-political agency
In a couple of earlier posts, I noted a gradual change in the tendency of American newspapers and U.S. Supreme Court opinions to use the phrase "the United States" as a syntactic subject ("The United States as a subject", 10/6/2009; "'The United States' as a subject at the Supreme Court", 10/20/2009). Thus, in a small sample of instances of "the United States" in SCOTUS opinions from each of six years between 1800 and 2000, the percentage of instances in subject position rose from 1.8% to 19%:
YEAR | Subject-position rate (per 100 instances) |
1800 | 1.8 |
1810 | 3.5 |
1850 | 7 |
1900 | 7 |
1950 | 12 |
2000 | 19 |
It's now possible to parse unrestricted text automatically but fairly accurately, and I expect to see large collections of automatically parsed text become generally available soon (see e.g. Courtney Napoles, Matthew Gormley, and Benjamin Van Durme, "Annotated Gigaword", Proc. of the Joint Workshop on Automatic Knowledge Base Construction & Web-scale Knowledge Extraction, ACL-HLT 2012). And I was recently trying to persuade some colleagues that parsing a large historical books collection would be a Good Thing, even for people who aren't interested in syntactic structure per se. So for this morning's Breakfast Experiment™, I decided to take a look at the proportion of subject positioning for three country names in three geographically diverse news sources.
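For concreteness, here is a minimal sketch of the kind of count involved, not the actual procedure used for the experiment. It assumes that a parser has already reduced each occurrence of a country name to a (name, dependency label) pair; the function name `subject_rate_per_100`, the set of labels treated as "subject", and the toy data are all hypothetical choices made for illustration.

```python
from collections import Counter

# Dependency labels counted as "subject position". This set is an assumption:
# label inventories differ from parser to parser.
SUBJECT_LABELS = {"nsubj", "nsubjpass", "nsubj:pass"}


def subject_rate_per_100(mentions):
    """Return {name: percentage of its occurrences found in subject position}.

    `mentions` is an iterable of (name, dependency_label) pairs,
    one pair per occurrence of a name in the parsed text.
    """
    totals = Counter()
    subjects = Counter()
    for name, label in mentions:
        totals[name] += 1
        if label in SUBJECT_LABELS:
            subjects[name] += 1
    return {name: 100.0 * subjects[name] / totals[name] for name in totals}


# Made-up toy data, for illustration only (not drawn from any real corpus).
toy_mentions = [
    ("the United States", "nsubj"),
    ("the United States", "pobj"),
    ("the United States", "pobj"),
    ("the United States", "poss"),
    ("China", "nsubj"),
    ("China", "dobj"),
]

rates = subject_rate_per_100(toy_mentions)
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per 100")
```

Whether passive subjects should count as "subject position" is a judgment call; dropping them from `SUBJECT_LABELS` gives the stricter active-subject count.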