For the last couple of days, I've been in Chicago at an NSF-sponsored workshop on "animacy and information status annotation", organized by Annie Zaenen, Cathy O'Connor and Gregory Ward.
A traditional and characteristic example of the role of animacy in English syntax is the way it affects the choice between the two ways of expressing genitive relations, X's Y vs. the Y of X. In general, the apostrophe-s structure is said to be preferred for animate Xs, while inanimates tend to go with the of-phrase. I'm a believer in Yogi Berra's dictum that you can observe a lot just by watching, especially if you count things. So during the wrap-up session this afternoon, I thought I'd try using some simple web searches to probe this animacy-genitive relationship.
Since we can only do string searches (the web isn't parsed), I wanted a reliable string-wise proxy for finding the head of the of-phrase, and decided to try entities with single-word names. Specifically, I picked people, companies, countries, and chemical elements. And to pin it down further, I chose some contemporary American politicians and some IT companies. (I started with four members in each category, but then I added one more country and one more element.)
Thus the search string "google's" got 17 million hits, while the search string "of google" got 13.4 million hits, for a ratio of about 1.27. I repeated the analogous searches for the other 17 names. The results are kind of cute:
The first thing to observe is that the four categories are non-overlapping. (I'm sure that if you added more names, you'd discover some overlaps; but still.)
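For anyone who wants to redo the arithmetic, here is a minimal sketch in Python of the two calculations involved: the genitive ratio (hits for the 's string divided by hits for the of string) and a check for whether categories overlap. The function names and the dictionary layout are illustrative assumptions of mine, not part of the original tally, and the only counts included are the two Google figures quoted above.

```python
def genitive_ratio(s_hits, of_hits):
    """Return hits for the "X's" string divided by hits for the "of X" string."""
    if of_hits <= 0:
        raise ValueError("of_hits must be positive")
    return s_hits / of_hits


def ranges_overlap(ratios_by_category):
    """Return True if any two categories' [min, max] ratio ranges intersect.

    ratios_by_category is a hypothetical layout: category name -> list of ratios.
    """
    spans = sorted((min(v), max(v)) for v in ratios_by_category.values() if v)
    # After sorting by lower bound, any overlap shows up between neighbours.
    return any(prev[1] >= cur[0] for prev, cur in zip(spans, spans[1:]))


# Counts quoted in the post for "google's" and "of google", in millions.
google_ratio = genitive_ratio(17.0, 13.4)
print(round(google_ratio, 2))  # 1.27
```

Filling in a dictionary of per-category ratios and passing it to `ranges_overlap` would test the non-overlap observation directly, though raw hit counts from a search engine are rough estimates, so the ratios should be read loosely.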
Within categories, the ordering makes a certain amount of sense. Lithium is a drug as well as an element, which makes it partly a member of another class that might be expected to rank higher on this scale (whatever this scale is, exactly). Arsenic is a poison. And it's not a surprise that America comes out as apparently the most animate of the countries.
The ranking of the politicians and the IT firms puzzled me a bit at first. But then I conjectured that perhaps the scale is measuring a sort of unpredictable agency, what you might call the "Maverick factor". Maybe by changing all their connectors every 18 months, and building laptops that freeze up every other time you plug them into a projector, Apple gains in (this measure of) animacy, just as a cantankerous old car comes to seem more alive every time you have to beg it to start. On this theory Google and Obama, by being more reliable, seem a bit less agentive.
[Of course, there are many other factors that could (and doubtless did) skew the counts enough to change the genitive-ratio rankings. For example, "Google" is often used as a modifier, in phrases like "Google Analytics" or "Google News", and as a result, strings like "new version of Google Analytics" increase the count of the "of google" search. Still, it's nice that the ratio held up in a semi-sensible way over 3-4 orders of magnitude.]