Archive for Linguistic history

Dutch DE

Following up on yesterday's post "The case of the disappearing determiners", Gosse Bouma sent me some data from the CGN ("Corpus Gesproken Nederlands"), about determiner use in spoken Dutch by people born between 1914 and 1987. According to the CGN website,

The Spoken Dutch Corpus project was aimed at the construction of a database of contemporary standard Dutch as spoken by adults in The Netherlands and Flanders. […] In version 1.0, the results are presented that have emerged from the project. The total number of words available here is nearly 9 million (800 hours of speech). Some 3.3 million words were collected in Flanders, well over 5.6 million in The Netherlands.

It's not clear to me exactly when the recordings were made, but the project ran from 1998 to 2004.

Gosse sent data focused on the word de, which is the definite article for masculine and feminine ("common") nouns in Dutch, cognate with English the. (The definite article for neuter nouns, het, is less frequent and can also be used as a pronoun.)

The results are similar to those that I reported earlier for English: Older people use the definite article more frequently than younger people (at least for people born in the 1950s onwards), and at every age, men use the definite article more than women.


The case of the disappearing determiners

For the past century or so, the commonest word in English has gradually been getting less common. Depending on data source and counting method, the frequency of the definite article THE has fallen substantially — in some cases at a rate as high as 50% per 100 years.

At every stage, writing that's less formal has fewer THEs, and speech generally has fewer still, so to some extent the decline of THE is part of a more general long-term trend towards greater informality. But THE is apparently getting rarer even in speech, so the change is more than just the (normal) shift of writing style towards the norms of speech.

There appear to be weaker trends in the same direction, at overall lower rates, in German, Italian, Spanish, and French.

I'll lay out some of the evidence for this phenomenon, mostly collected from earlier LLOG posts. And then I'll ask a few questions about what's really going on, and why and how it's happening. [Warning: long and rather wonky.]


Irish DNA and Indo-European origins

"Scientists sequence first ancient Irish human genomes", Press Release from Trinity College Dublin:


A team of geneticists from Trinity College Dublin and archaeologists from Queen's University Belfast has sequenced the first genomes from ancient Irish humans, and the information buried within is already answering pivotal questions about the origins of Ireland's people and their culture.  

The team sequenced the genome of an early farmer woman, who lived near Belfast some 5,200 years ago, and those of three men from a later period, around 4,000 years ago in the Bronze Age, after the introduction of metalworking. […]

These ancient Irish genomes each show unequivocal evidence for massive migration. The early farmer has a majority ancestry originating ultimately in the Middle East, where agriculture was invented. The Bronze Age genomes are different again with about a third of their ancestry coming from ancient sources in the Pontic Steppe.

"There was a great wave of genome change that swept into Europe from above the Black Sea into Bronze Age Europe and we now know it washed all the way to the shores of its most westerly island," said Professor of Population Genetics in Trinity College Dublin, Dan Bradley, who led the study, "and this degree of genetic change invites the possibility of other associated changes, perhaps even the introduction of language ancestral to western Celtic tongues."


Which-hunting — and relative decline?

In "A quantitative history of which-hunting", I reproduced a plot due to (an anonymous colleague of) Jonathan Owen, showing that texts from the last half of the 20th century saw a decrease in the relative frequency of NOUN which VERB, and an increase in the relative frequency of NOUN that VERB. Jonathan took this to indicate the success of (usage guides like) Strunk & White's The Elements of Style in persuading writers and copy-editors to avoid which in "restrictive" (AKA "defining" or "integrated") relative clauses.

Here are some plots showing the effect, for data (without smoothing) from the Google Books ngram corpus. The "British English" dataset shows about the same increase in NOUN that as the "American English" collection does, but somewhat less decrease in NOUN which:

[Figure panels: American English; British English]


"… to do is (to) VERB …"

Dyami Hayes writes to point out that there has been a change over the past century in the relative popularity (at least in printed text) of constructions like these:

What this book sets out to do is to provide some tools, ideas and suggestions for tackling non-verbal reasoning questions.

What it attempts to do is provide a framework for understanding how local governments are organized.

The Google Books ngram plots for provide, look, tell, and say show similar patterns — or summed for those four verbs (with the to do is VERB version in red and the to do is to VERB version in blue):


New discovery in English historical lexicography

A retired lecturer in medieval history, Dr Paul Booth, has discovered a reference in a 1310 court record to a man named Roger Fuckebythenavele, and he believes it really does mean that the man was known as Roger Fuck-By-The-Navel, the surname (possibly a nickname given by enemies) actually meaning "fuck via the belly button", so this may be the earliest known use of the verb fuck in its sexual sense.


A decision entirely

Urgent bipartite action alert for The Economist: First, note that my copy of the July 18 issue did not arrive on my doormat as it should have done on Saturday morning, so I did not have my favorite magazine to read over the weekend; please investigate. And second, the guerilla actions of the person on your staff who enforces the no-split-infinitives rule (you know perfectly well who it is) have gone too far and are making you a laughing stock. Look at this sentence, from an article about Iran (page 21; thanks to Robert Ayers for pointing it out; the underlining is mine):

Nor do such hardliners believe compliance will offer much of a safeguard: Muammar Qaddafi's decision entirely to dismantle Libya's nuclear programme did not stop Western countries from helping his foes to overthrow and kill him.


Lhomond

One of the small streets near where I'm staying for a couple of months is the Rue Lhomond, which the street signs tell me is named for a grammarian, Charles François Lhomond (1727-1794). Since I pass the intersection every day on my way to the LPP, I've been curious about what this grammarian's grammar was like. And Gallica offers his Élémens de la Grammaire Françoise (1780), which begins like this:

La Grammaire est l'art de parler & d'écrire correctement. Pour parler & pour écrire on emploie des mots : les mots sont composés des lettres.

Il y a deux sortes de lettres, les voyelles et les consonnes.

Les voyelles sont a , e , i , o , u , & y. On les appelle voyelles, parce que, seules, elles forment une voix, un son.

Il y a trois sortes d'e ; e muet, e fermé, e ouvert.

Grammar is the art of speaking and writing correctly. To speak and to write one uses words : words are made up of letters.

There are two kinds of letters, vowels and consonants.

The vowels are a , e , i , o , u , & y. We call them vowels, because, alone, they form a voice, a sound.

There are three kinds of e ; mute e, closed e, open e.


Dilige et quod vis fac

A few weeks ago, Eric Baković organized a "Short 'schrift" in honor of Alan Prince's forthcoming retirement, asking for

– a paean
– a poem
– a story
– a greeting
– an expression of gratitude
– a work of art (whatever that may mean to you)
– a 'classic-style' squib (à la 1970s-era LI)
– a brief analytical argument
– a simple formal proof
– a spoof of any of the above

I contributed a story, "Dilige, et quod vis fac". The result has now been revealed (squibs, greetings and thanks, stories, music, images, poetry & prose, from the archives, family & friends), so I'm reprinting my contribution below.


Solving the mystery of "off the cuff"

Peter Jensen Brown, "Paper Linen and Crib Notes – A Well-Planned History of 'Off the Cuff'", Early Sports and Pop Culture History Blog, 2/20/2015, following up on "The 'off the cuff' mystery", 8/16/2012:

The idiom, “off the cuff,” meaning “without preparation . . . as if from impromptu notes made on one’s shirt cuffs,” dates to the 1930s. Mark Liberman, the Christopher H. Browne Distinguished Professor of Linguistics at the University of Pennsylvania, pushed the earliest known use of “off the cuff” back from 1938 to 1936; but wondered how or why the expression came into being decades after detachable paper cuffs had long fallen out of fashion, and with no apparent immediate impetus. Charlie Chaplin’s film, Modern Times, released in February 1936 (which features a scene in which Chaplin’s Tramp writes notes on his cuffs), notwithstanding; he could not find a satisfactory reason for the decades-long gap between paper-cuff fashion and the “off the cuff” expression; none of the seemingly plausible explanations made sense. “So what happened?”

For the answer, see the rest of Peter's post.

[h/t Peter Reitan]


John McWhorter responds

Some clarifications about my Wall Street Journal article, which seems to have led to some misunderstandings among Language Log’s readers (as well as over at Languagehat). Since the readers here are the most well-informed audience that piece will ever reach outside of professional linguists, I thought it’d be useful to clarify what I based the observations in that piece on.


All ADJ and shit

Howard Oakley ("Birth of a new English phrase", 1/23/2015) was struck by the phrase "all proper and shit", in the context of a tweet by Christopher Phin noting that "[choice of printing mode] makes my writing seem all proper and shit". So Howard investigated the history of that four-word sequence by means of various web search tools.

I strongly support the combination of linguistic curiosity and empirical methods, but in this case, I'm puzzled by the fact that Howard saw the phrase as novel. As far as I can see, "all proper and shit" is a syntactically, semantically, and pragmatically compositional combination of two constructions that have existed in English for hundreds of years.


Why definiteness is decreasing, part 3

Ten days ago, I documented a striking 20th-century decrease in the frequency of the definite article the ("Decreasing definiteness", 1/8/2015) — from about 6.6% to about 5.4% in the Corpus of Historical American English; from about 6.4% to 5.2% in the Google Books ngram indices; and from about 9.3% to about 4.7% in U.S. presidents' State of the Union messages.
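To make the size of these drops easier to compare, here is a minimal sketch that converts each pair of percentages into a relative decline. The function name `relative_decline` and the dictionary labels are illustrative assumptions; only the start and end percentages come from the figures quoted above.

```python
def relative_decline(start_pct, end_pct):
    """Return the drop from start_pct to end_pct as a percentage of start_pct."""
    return (start_pct - end_pct) / start_pct * 100


# (start %, end %) pairs as quoted in the paragraph above;
# the dictionary keys are just labels chosen for this sketch.
figures = {
    "Corpus of Historical American English": (6.6, 5.4),
    "Google Books ngram indices": (6.4, 5.2),
    "State of the Union messages": (9.3, 4.7),
}

for source, (start, end) in figures.items():
    print(f"{source}: {relative_decline(start, end):.1f}% relative decline")
```

On these numbers the two book-based sources each lose a bit under a fifth of their starting frequency, while the State of the Union figure loses roughly half.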

In two follow-up posts, I offered some additional ideas about this change:

In "Why definiteness is decreasing, part 1", I suggested that it might be connected to an overall decrease in the formality of published English, starting with the observation that in contemporary English, the frequency of the varies by a large factor between very formal material (6.42% in the "Academic" genre of the Corpus of Contemporary American English) and conversational speech (2.47% in the Fisher corpus).

In "Why definiteness is decreasing, part 2", I noted that both in a collection of Facebook posts and in Fisher conversational speech transcripts, older people use the more often than younger people, and men use the more often than women; and I wondered whether this is a stable life-cycle and gender-identity difference, or the result of a change in progress. (Or both…)
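The percentages in these summaries are token frequencies: the count of the divided by the total number of word tokens. Here is a rough sketch of that calculation, assuming a simple lowercase, letters-and-apostrophes tokenizer. This is not the counting method of any of the corpora mentioned, and the function name `the_percentage` and the sample sentence are made up for illustration.

```python
import re


def the_percentage(text):
    """Percentage of word tokens in text that are 'the' (case-insensitive)."""
    # Assumed tokenization: runs of letters and apostrophes, lowercased.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return 100 * tokens.count("the") / len(tokens)


# Toy sentence, deliberately heavy on definite articles.
sample = "The cat sat on the mat, and the dog watched the cat."
print(f"{the_percentage(sample):.2f}% of tokens are 'the'")
```

Different tokenization choices (how to treat punctuation, contractions, or numbers) change the denominator, which is one reason figures from different sources are not directly comparable.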

Today, I want to discuss a third idea about the decreasing frequency of the, suggested to me by Jamie Pennebaker.
