A correlate of animacy

For the last couple of days, I've been in Chicago at an NSF-sponsored workshop on "animacy and information status annotation", organized by Annie Zaenen, Cathy O'Connor and Gregory Ward.

A traditional and characteristic example of the role of animacy in English syntax is the way it affects the choice between the two ways of expressing genitive relations, X's Y vs. the Y of X. In general, the apostrophe-s structure is said to be preferred for animate Xs, while inanimates tend to go with the of-phrase. I'm a believer in Yogi Berra's dictum that you can observe a lot just by watching, especially if you count things. So during the wrap-up session this afternoon, I thought I'd try using some simple web searches to probe this animacy-genitive relationship.

Since we can only do string searches — the web isn't parsed — I wanted to find a reliable string-wise proxy for finding the head of the of-phrase, and decided to try entities with single-word names. Specifically, I decided to try people, companies, countries, and chemical elements. And to pin it down further, I decided to try some contemporary American politicians and some IT companies. (I started with four members in each category, but then I added one more country and one more element.)

Thus the search string "google's" got 17 million hits, while the search string "of google" got 13.4 million hits, for a ratio of about 1.27. I repeated the analogous searches for the other 17 names. The results are kind of cute:

|  | __'s | of __ | ratio |
|---|---|---|---|
| Giuliani | 1.14M | 140K | 8.14 |
| McCain | 23.6M | 4.42M | 5.34 |
| Clinton | 11.6M | 2.81M | 4.13 |
| Obama | 26M | 7.6M | 3.42 |
| Apple | 22.6M | 9.39M | 2.41 |
| IBM | 6.97M | 4.03M | 1.73 |
| Microsoft | 35.5M | 21.3M | 1.67 |
| Google | 17M | 13.4M | 1.27 |
| America | 113M | 131M | 0.863 |
| Canada | 26.8M | 60.5M | 0.443 |
| Thailand | 3.96M | 11.8M | 0.336 |
| England | 10.9M | 48M | 0.227 |
| Belgium | 799K | 6.31M | 0.127 |
| lithium | 60.7K | 1.73M | 0.035 |
| arsenic | 21.7K | 1.19M | 0.018 |
| silicon | 50.9K | 5.93M | 0.0086 |
| hydrogen | 45.3K | 9.44M | 0.0048 |
| cadmium | 4.01K | 2.2M | 0.0018 |

The first thing to observe is that the four categories are non-overlapping. (I'm sure that if you added more names, you'd discover some overlaps; but still.)
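For anyone who wants to redo the arithmetic, here is a minimal Python sketch that recomputes the ratios from the hit counts in the table and checks that the four categories really don't overlap. The counts are copied from the table; the names `COUNTS`, `parse_count` and `genitive_ratio` are just illustrative choices of mine, and nothing here queries a search engine.

```python
# Hit counts copied from the table: ("__'s" hits, "of __" hits).
# K = thousand, M = million. The category grouping follows the post.
COUNTS = {
    "politicians": {
        "Giuliani": ("1.14M", "140K"), "McCain": ("23.6M", "4.42M"),
        "Clinton": ("11.6M", "2.81M"), "Obama": ("26M", "7.6M"),
    },
    "IT companies": {
        "Apple": ("22.6M", "9.39M"), "IBM": ("6.97M", "4.03M"),
        "Microsoft": ("35.5M", "21.3M"), "Google": ("17M", "13.4M"),
    },
    "countries": {
        "America": ("113M", "131M"), "Canada": ("26.8M", "60.5M"),
        "Thailand": ("3.96M", "11.8M"), "England": ("10.9M", "48M"),
        "Belgium": ("799K", "6.31M"),
    },
    "elements": {
        "lithium": ("60.7K", "1.73M"), "arsenic": ("21.7K", "1.19M"),
        "silicon": ("50.9K", "5.93M"), "hydrogen": ("45.3K", "9.44M"),
        "cadmium": ("4.01K", "2.2M"),
    },
}

MULTIPLIERS = {"K": 1e3, "M": 1e6}


def parse_count(text):
    """Convert a count such as '1.14M' or '60.7K' to a plain number."""
    return float(text[:-1]) * MULTIPLIERS[text[-1]]


def genitive_ratio(s_hits, of_hits):
    """Ratio of apostrophe-s hits to of-phrase hits."""
    return parse_count(s_hits) / parse_count(of_hits)


ratios = {
    category: {name: genitive_ratio(*pair) for name, pair in members.items()}
    for category, members in COUNTS.items()
}

# (lowest, highest) ratio observed within each category.
ranges = {
    category: (min(values.values()), max(values.values()))
    for category, values in ratios.items()
}

# Order categories by their highest ratio, then check that each category's
# lowest ratio is still above the next category's highest ratio.
ordered = sorted(ranges.items(), key=lambda item: item[1][1], reverse=True)
non_overlapping = all(
    ordered[i][1][0] > ordered[i + 1][1][1] for i in range(len(ordered) - 1)
)

for category, (low, high) in ordered:
    print(f"{category}: {low:.4g} to {high:.4g}")
print("non-overlapping:", non_overlapping)
```

With only four or five names per category this is a sanity check on the table rather than a statistical test.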

Within categories, the ordering makes a certain amount of sense. Lithium is a drug as well as an element, which makes it partly a member of another class that might be expected to rank higher on this scale (whatever this scale is, exactly). Arsenic is a poison. And it's not a surprise that America comes out as apparently the most animate of the countries.

The ranking of the politicians and the IT firms puzzled me a bit at first. But then I conjectured that perhaps the scale is measuring a sort of unpredictable agency, what you might call the "Maverick factor". Maybe by changing all their connectors every 18 months, and building laptops that freeze up every other time you plug them into a projector, Apple gains in (this measure of) animacy, just as a cantankerous old car comes to seem more alive every time you have to beg it to start. On this theory Google and Obama, by being more reliable, seem a bit less agentive.

[Of course, there are many other factors that could (and doubtless did) skew the counts enough to change the genitive-ratio rankings. For example, "Google" is often used as a modifier, in phrases like "Google Analytics" or "Google News", and as a result, strings like "new version of Google Analytics" increase the count of the "of google" search. Still, it's nice that the ratio held up in a semi-sensible way over 3-4 orders of magnitude.]
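The "3-4 orders of magnitude" figure can be checked with one line of arithmetic. This is a small Python sketch that uses only the highest and lowest ratios printed in the table (Giuliani and cadmium); the variable names are my own.

```python
import math

highest_ratio = 8.14    # Giuliani, as printed in the table
lowest_ratio = 0.0018   # cadmium, as printed in the table

# Orders of magnitude between the two extremes of the scale.
span = math.log10(highest_ratio / lowest_ratio)
print(f"the ratios span about {span:.1f} orders of magnitude")
```

The result falls between 3 and 4, in line with the estimate above.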



  1. mike said,

    September 27, 2008 @ 10:26 pm

    It probably skews the statistics that (at least, English-speaking) people are more likely to say "the United States of America" than "the Kingdom of Thailand" or whatever, which is presumably pushing up the "of America" count in a not-really-possessive use case.

    If you could subtract these cases, you might no longer conclude that "America [is]…the most animate of the countries," but I doubt it'd affect your overall results too much.

  2. Drew Smith said,

    September 27, 2008 @ 10:50 pm

    Wouldn't "America" skew as more animate because "America" would necessarily include such cases as "Miss America" and "Captain America"? Not to mention the use of "America" as a first name, such as for the actress America Ferrera.

  3. Ethan Merritt said,

    September 27, 2008 @ 11:12 pm

    I wonder if the company name category is biased by financial news, as in "shares of IBM and Apple were up today". The name "IBM" is standing in for a more complex notion.

  4. mud and flame said,

    September 27, 2008 @ 11:15 pm

    I'm dubious that the different ratios for politicians really reflect differences in perceived animacy. When I search on the phrase "of Obama," for instance, nearly all the results are sentences in which you couldn't substitute "Obama's":

    "…a meeting of Obama and Bartlet"
    "Biden's description of Obama"
    "The Folly of Obama's Tax Plan"
    "UK's Brown accused of Obama gaffe"
    "Advice for Mobile Marketers, Courtesy of Obama"

    And so on. This matches my intuition that the apostrophe-s structure is virtually always chosen over an of-phrase for proper nouns, when there's a choice. I suspect there are other factors at work here, such as the percentage of the webpages about each politician that are newspaper articles; quite a few of the results for "of Obama" are headlinese ("FAA Tapes Reveal Drama of Obama Jet Incident").

    Clinton is a special case because there are two well-known politicians with that surname, and writers often have to disambiguate. Your searches will pick up "Hillary Clinton's" but not "of Hillary Clinton."

    Also, the results for Apple could be skewed by uses of the word "apple" that don't refer to the company — "Town of Apple Valley," "a slice of apple pie," etc.

  5. mud and flame said,

    September 27, 2008 @ 11:17 pm

    D'oh. When I said "proper nouns" above, I meant "people's names."

  6. Ran Ari-Gur said,

    September 27, 2008 @ 11:39 pm

    @mike: But by Dr. Liberman's technique, the string "United States of America" would skew "America" in the inanimate direction.

    There are a lot of concerns I'd have had with this technique, since -'s can also mean "is" or "has", and "of ___" of course has many non-genitive uses ("I think of Obama as ___") even when it does govern what follows ("the candidate we spoke of, Clinton"), and genitives can sometimes be expressed in other ways ("America's largest city" = "the largest city in America", not "of"; and heck, confusingly, "friend of McCain's"); so it's cool that in practice, all the concerns seem to cancel out somehow, at least on the large scale.

  7. rootlesscosmo said,

    September 28, 2008 @ 6:30 pm

    Wouldn't the chemical element numbers be skewed by the existence of phrases like "a salt of cadmium," "a compound of hydrogen" etc. where "*a hydrogen's compound" simply isn't a standard usage option?

  8. Bill Walderman said,

    September 28, 2008 @ 10:13 pm

    Doesn't the newspaper headline effect play a role here? In some contexts people are probably using the "apostrophe s" construction because it's shorter.

    As a lawyer, I was at first shocked and appalled when I saw apostrophe s tacked on to the names of statutes in phrases such as "ERISA's prohibitions against fiduciary self-dealing" (ERISA is the Employee Retirement Income Security Act of 1974). But I've become inured to this over time and occasionally indulge in it myself when it's convenient.

  9. Stephen Jones said,

    September 30, 2008 @ 3:11 am

    I was told the two genders in English were higher animals (which often include cars, ships and countries and frequently exclude young children) and lower animals and inanimate objects.

    Institutions are interesting because they often take the 'animate' genitive but the inanimate pronoun 'it'. Babies and other domestic animals show the opposite characteristic. They can optionally take the inanimate pronoun 'it' but always use the animate genitive.

  10. Sili said,

    October 2, 2008 @ 10:28 am

    There must be some personal – or perhaps rather professional – variation in this usage.

    As a (failed) chemist I have no problem talking about "Cadmium's melting point" or some such. But I'll admit that I prolly have something approaching free variation. Similarly "The coördination compound's central ion" and "the central ion of the CC" are equally good for me.

    In maths "the function's derivative" is as good as "the derivative of the function".

    So in some sense familiarity seems to breed animacy. I wonder if this is related to the habit of sailors of referring to ships as women.
