Why definiteness is decreasing, part 3


Ten days ago, I documented a striking 20th-century decrease in the frequency of the definite article "the" ("Decreasing definiteness", 1/8/2015): from about 6.6% to about 5.4% in the Corpus of Historical American English; from about 6.4% to about 5.2% in the Google Books ngram indices; and from about 9.3% to about 4.7% in U.S. presidents' State of the Union messages.

In two follow-up posts, I offered some additional ideas about this change:

In "Why definiteness is decreasing, part 1", I suggested that it might be connected to an overall decrease in the formality of published English, starting with the observation that in contemporary English, the frequency of "the" varies by a large factor between very formal material (6.42% in the "Academic" genre of the Corpus of Contemporary American English) and conversational speech (2.47% in the Fisher corpus).

In "Why definiteness is decreasing, part 2", I noted that both in a collection of Facebook posts and in Fisher conversational speech transcripts, older people use "the" more often than younger people, and men use "the" more often than women; and I wondered whether this is a stable life-cycle and gender-identity difference, or the result of a change in progress. (Or both…)

Today, I want to discuss a third idea about the decreasing frequency of "the", suggested to me by Jamie Pennebaker.

Jamie points out that the frequency of 's-genitives has been increasing relative to of-genitives. This is relevant because 's-genitives fill the determiner position in noun phrases, thus displacing some instances of "the": "Russia's government" vs. "the government of Russia".

We know that over the course of the 20th century, 's-genitives definitely increased relative to of-genitives — for documentation, see "The genitive of lifeless things", 10/11/2009, and "Mechanisms for gradual language change", 2/9/2014.

This can't be the whole story, though. In COHA, 's increased in frequency from about 0.51% in 1900 to about 0.98% in 2000, for an extra 47 instances per 10,000 words. But "the" decreased in frequency from 6.53% in 1900 to 5.37% in 2000, for a loss of 116 instances per 10,000 words.

And the numerical disproportion is greater than that. Only about 60% of 's instances in the 2000 text sample are genitives; the other 40% are contractions of "is" or "has". This reduces the potential contribution from 47 to about 28 per 10,000 words, and so I conclude that at most about a quarter of the decline of "the" (28 out of 116 instances per 10,000 words) might be due to the rise of 's.
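
The arithmetic behind that bound can be checked with a few lines of Python. This is only a sketch that uses the COHA percentages quoted above; the variable and function names are mine, and applying the 60% genitive share to the whole increase in 's follows the simplification made in the text rather than a separate count of genitives in 1900.

```python
# Figures quoted in the post (COHA, as a percentage of all words).
S_1900, S_2000 = 0.51, 0.98        # frequency of 's
THE_1900, THE_2000 = 6.53, 5.37    # frequency of "the"
GENITIVE_SHARE = 0.60              # share of 's tokens that are genitives (2000 sample)


def per_10k(percent):
    """Convert a percentage of words into instances per 10,000 words."""
    return percent * 100


# Gain in 's and loss in "the", both per 10,000 words.
s_gain = per_10k(S_2000 - S_1900)
the_loss = per_10k(THE_1900 - THE_2000)

# Assumption taken from the text: only about 60% of the gain is genitive 's.
genitive_gain = s_gain * GENITIVE_SHARE

# Upper bound on the share of the decline in "the" that 's could explain.
upper_bound = genitive_gain / the_loss

print(f"gain in 's:          {s_gain:.0f} per 10,000")
print(f"loss of 'the':       {the_loss:.0f} per 10,000")
print(f"genitive 's gain:    {genitive_gain:.0f} per 10,000")
print(f"upper bound on share: {upper_bound:.0%}")
```

The bound is generous on purpose: it treats every added genitive 's as if it had displaced exactly one "the".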

And not all of the new 's-genitives are plausible replacements for of-genitives or other phrases with determiner "the".

But still…

Jamie also reminds me that there's good evidence for stable gender and life-cycle effects in usage (e.g. James Pennebaker and Lori Stone, "Words of Wisdom: Language Use Over the Life Span", Journal of Personality and Social Psychology, 2003).


  1. Eric P Smith said,

    January 18, 2015 @ 9:01 am

    As a child I was taught to use 's with a human possessor only: the child's leg, but the leg of the table – an unnecessary restriction. I was further taught that the reason was that "the child's leg" is short for "the child his leg" – wrong.

    We live and learn.

  2. John Shutt said,

    January 18, 2015 @ 10:18 am

    I've this old memory of a line from Doctor Who. "You may be a doctor. But I'm *the* Doctor. The definite article, you might say." A decline in the general use of the article should increase its potency as a means of emphasizing definiteness, making it more useful for that purpose; one wonders if the usefulness of that emphatic sense could be providing positive feedback to the trend.

  3. Ray Girvan said,

    January 18, 2015 @ 1:11 pm

    @ Eric P Smith: "the child his leg"

    I remember that explanation too, with the specific example "Henry Baker, his book" – and a look in Google Books finds this was still around in 1987 in the textbook New Junior English (page 23). NJE fails to explain how this applies to the accompanying examples "the girl's hair" and "the week's work".

  4. Adrian said,

    January 18, 2015 @ 1:46 pm

    This is tangential but interesting. In Lawrence Wright's new book on the Camp David summit between Begin and Sadat, in a discussion of Jimmy Carter's attempt to bring the two sides closer with respect to the Sinai Peninsula by introducing room to wriggle, Wright writes that

    Carter might have employed another famous example of constructive ambiguity with which the delegates were more familiar: UN Resolution 242. The language that had been proposed by the Arab states and the Soviet Union demanded that Israel withdraw from "all the territories occupied during the hostilities of June 1967." That was modified to read as just "the territories." To further fudge the matter, the definite article was finally removed from the English-language version of the text but was retained in the French version. Since both were official UN documents, the Arabs could say that the resolution bound Israel to withdraw from the lands it had conquered and Israel could say that it agreed to withdraw from some, while not committing to which ones.

  5. DaveK said,

    January 18, 2015 @ 1:56 pm

    I've seen the his for miscorrection in 16th century books. It may set some longevity record for a zombie rule.

  6. Frank Y. Gladney said,

    January 18, 2015 @ 1:57 pm

    The count of -'s genitives misses cases where only the apostrophe follows an s-final noun and yet an additional syllable is pronounced.

  7. DaveK said,

    January 18, 2015 @ 1:57 pm

    "his for apostrophe s". Don't know why html knocked it out.

  8. JS said,

    January 18, 2015 @ 2:47 pm

    "X of the Y" can also alternate with Y X (bare determiner + N). Ngrams indicates that "United States government" passed "government of the United States" around 1920; over the last 65ish years, "survivors of the holocaust" was first favored but lost out to "holocaust survivors" c. 1985, etc. It is also easy to imagine cases where Y X could have taken over from other prepositional constructions with "the" — "visitors to the museum" vs. "museum visitors" looks similar in ngrams, for example. Maybe bare noun determiners are just more acceptable in more contexts than they once were?

    [(myl) There are other alternations not involving of-genitives as well, like "the world's largest economy" vs. "the largest economy in the world". In that case the 's does displace a the, and that's really the question — what fraction of the (historically innovative) 's-constructions decrease the count of definite articles? The proportion is fairly high, I think, but not 100%.]

  9. JS said,

    January 18, 2015 @ 5:35 pm

    As regards bare noun determiners, working from the first SOTU, we find phrases like "Constitution of the United States," according to ngrams earlier overwhelmingly preferred but now outnumbered by "U.S. Constitution," "interests of the United States," now eclipsed by "U.S. interests," etc. But such a tendency, not involving 's, would be hard to get a statistical big picture on.

    [(myl) One crude bound on such phenomena would be the overall frequency of "of the", which fell by about 0.25% over the course of the 20th century:

    Of course it's not clear what fraction of the missing "of the" sequences were replaced by complex nominals via the "X of the Y" ~ "Y X" alternation (as opposed to omission, or replacement by other constructions like "X of the Y" ~ "the Y's X").]

  10. Emily M. Bender said,

    January 18, 2015 @ 9:18 pm

    Probably a minor contributing factor at best, but I found (in a never-published QP; thinking about just putting it on my webpage as is) that one side effect of people starting to avoid so-called gender-neutral he in the 1970s and 1980s was a decrease in the use of singular definite generic NPs (mostly in favor of plurals, without 'the').

  11. Sjiveru said,

    January 19, 2015 @ 12:13 am

    It seems to me that it's also the case that there's a much higher proportion of common-use nouns in English these days that are names for companies or titles for things or so on (which don't require a determiner) rather than names for objects (which do). Compare 'check Facebook' and 'look up on Google' with 'listen to the radio' or 'change the channel on the television'. A lot of older people still expect to use determiners, resulting in classic 'old-person' errors like *'look up on the Google'.

  12. John Walden said,

    January 19, 2015 @ 3:08 am

    I blame Pink Floyd and Led Zeppelin.

  13. Alan Palmer said,

    January 20, 2015 @ 4:40 am

    @John Walden – At least mainly Eighties band The The bucked the trend. The popularity of the movie of The Lion, The Witch and The Wardrobe also probably helped.

  14. Steve Rapaport said,

    January 20, 2015 @ 7:49 pm

    Sjiveru's answer is what I came to point out.

    But I think it goes further. I think the bare proper noun is an English innovation that's growing because it works.

    There are more bare proper nouns these days because it's a novel (on some timescale) feature of English that a lot of proper nouns and names for unique things get to drop the "the", as opposed to, say, French, which hardly ever does.

    The Fregean definition λf: ∃x(f(x)=1 & ∀y(f(y)=1 → y=x)).[the unique y such that f(y)=1]
    of "the" entails that existence of an object with the property f implies its uniqueness. Since by naming these things, I am already asserting their existence and their uniqueness, there's no need for 'the'.

    Highway 401 (not 'the Highway 401')
    French Language (not 'le francais')
    University of Edinburgh (not 'The University of Edinburgh')

    English is dropping definiteness when it can, because in those cases it's redundant, and it makes people feel good to drop redundant excrescences. Plus of course, it makes the English feel good whenever they can one-up the French.

  15. John Walden said,

    January 21, 2015 @ 4:01 am

    A very general and unoriginal observation to make is that perhaps we live in a world of fewer certainties. If "the" is a kind of cosmic finger, where speaker/writer knows that listener/reader will in turn know which one the former is talking about, then what is to be made of a company with ad hoc working committees and non-hierarchical structures?
    Who's the boss?

    I live in a semi-autonomous region of Spain, which in turn is in the European Community. If you say "The President" I can think of three people who it might be.

    Perhaps for older people English still inhabits a small town where in some glorious mish-mash of BrE and AmE we go from the bank to the store to the cinema to the pub to the fish and chip shop to the drug-store. In parts of the country there was even The Mine (or The Pit) and The Big House.

    Other posters have disagreed, but even on holiday in Paris I might say "I need to go to the bank" when it really doesn't matter which one it is.

    But then I'm nearly 60. Younger speakers' usage may reflect the fact that in many areas of life there isn't just one of everything nowadays. In Paris would I say "I need to go to the cash-point (the ATM)" or "I need to go to a cash-point (an ATM)"? Probably the second. And somehow it does sound younger.

  16. Martha said,

    January 21, 2015 @ 10:15 am

    For me, "I need to go to the bank" and "I need to go to a bank" describe different situations. The "the bank" statement would describe my regular banking, even though I personally have three branches at two banks that I regularly visit. The "a bank" statement sounds to my ear like the speaker is not doing their regular banking and probably like they are going to have to look for a bank.

    Similarly, "I took the bus" sounds to my ear like the person is talking about using the bus system, but "I took a bus" sounds kind of like they just hopped on the next one that came along.

  17. Martha said,

    January 21, 2015 @ 10:17 am

    …without much regard to where it was going, that is.

  18. David said,

    January 21, 2015 @ 8:00 pm

    @Martha and others. If my American ear serves to detect Britishisms then: "I went to hospital" is OK. 'I went to bank' sounds wrong. (Unless I mean that I went to bank but decided on a dunk instead.) I do think one needs to split the cases with proper names from those with common nouns.

  19. John Walden said,

    January 22, 2015 @ 3:59 am

    Yes. In this town that English seems to be describing, where there is one of everything, in its BrE version there is a set of places for which "the" is used to describe the physical place and no article for its purpose: school, college, hospital, prison (gaol), church, university, chapel. There must be others.

    So my children go to school but if I go to the school it could be for any number of reasons.

    One might think that these places are almost capitalised in the mind's eye, or ear. But why the cinema or the opera and many other places with very specific functions should escape the zero-article treatment is beyond me. Are they newer?

  20. John Walden said,

    January 22, 2015 @ 4:44 am

    Mind you, my children don't necessarily "go to school" where you might think they do. If I said "they go to the primary school" then it'd be the one you were probably thinking of.
