Archive for Linguistic history

Tom Wolfe takes on linguistics

Or maybe I should say, Tom Wolfe's take on linguistics.

I've been an avid reader of Tom Wolfe's works since the 60s: The Electric Kool-Aid Acid Test, Radical Chic & Mau-Mauing the Flak Catchers, The Kandy-Kolored Tangerine-Flake Streamline Baby, The Right Stuff, The Painted Word, Bonfire of the Vanities. What I like most about his non-fiction is that, as a leader and exponent of the New Journalism, he writes with a flair that captures the reader's attention without sacrificing accuracy and objectivity. What attracts me to his novels is that they convey the impression of having been based on a huge amount of research, without in the least being turgid or dull.

Read the rest of this entry »

Comments (17)

"Among the New Words"

Ben Zimmer, Jane Solomon, and Charles Carson, "Among The New Words", American Speech May 2016:

In this installment we continue our consideration of items nominated at the American Dialect Society’s 2015 Word of the Year proceedings […]

The overall winner is considered here: they used as a singular third-person pronoun, a gender-neutral (or “epicene”) alternative to the binary of he and she. One might object that there is nothing particularly new about singular they, as the Oxford English Dictionary (3rd ed.) includes examples back to the fourteenth century […]

What is genuinely new, however, is the use of they to refer to a known person in order to transcend the binary of he and she in the construction of a “non-binary” gender identity, such as transgender, gender-fluid, genderqueer, or agender.

Read the rest of this entry »

Comments (18)

Two linguists explain

You should go read "Two Linguists Explain Pseudo Old English in The Wake", The Toast 6/14/2016. Gretchen McCulloch interviews Kate Wiles about the imitation-Old-English that Paul Kingsnorth uses in The Wake, a novel about resistance to the Norman invasion of England in 1066.

Read the rest of this entry »

Comments (9)

Shifty merchants with 251 secret words for trade

Lila Gleitman points out to me that in one of the slowly increasing number of articles passing round the pseudoscientific story about Yiddish originating in four villages in Turkey you can see that hallmark of non-serious language research, the X-people-have-Y-words-for-Z trope:

Putting together evidence from linguistic, history, and genetics, we concluded that the ancient Ashkenazic Jews were merchants who developed Yiddish as a secret language — with 251 words for "buy" and "sell" — to maintain their monopoly. They were known to trade in everything from fur to slaves.

You can see the article here, but don't take that as a recommendation; it looks to me like unsubstantiated drivel. Exactly 251 words for buying and selling? No examples cited, and no hint of how more than two basic words and a few random approximate synonyms could be the slightest bit useful? It looks like classic myth-repetition of the usual Eskimo-words-for-snow sort.

Read the rest of this entry »

Comments off

"Is a thing" antedated to 1783

In the comments on my post "When did 'a thing' become a thing", 4/18/2016, James Barrett points us to a video from the Royal Society that includes the following passage from a letter, dated 1783, from one Eberhard Johann Schröter in St. Petersburg, addressed to Dr. Daniel Solander, an associate of Sir Joseph Banks:

If any body could be thoroughly convinced that a prediction of winds is a thing and possible and real, then to such a person a proper classification of them would be useful.

(This letter was selected to be read because its card was the very last item in the card catalogue of the Royal Society's library.)

This citation suggests that the "is a thing" usage has always been Out There in platonic Idiom World, and may have been incarnated many times through history before it finally caught the memetic brass ring. And never mind that Eberhard Schröter was presumably not a native speaker of English.

Comments (11)

Firing and wiring

In discussions about the history of usage, like this one, people often bring out generic memories ("I heard this all the time back in such-and-such a time period") or even more specific recollections ("I remember so-and-so saying this back in 19XX"). I've done this myself more than once. But recently something happened that made me wonder whether these memories can sometimes be false ones.

Read the rest of this entry »

Comments (19)

When did "a thing" become a thing?

Alexander Stern, "Is That Even a Thing?", NYT 4/16/2016:

Speakers and writers of American English have recently taken to identifying a staggering and constantly changing array of trends, events, memes, products, lifestyle choices and phenomena of nearly every kind with a single label — a thing. In conversation, mention of a surprising fad, behavior or event is now often met with the question, “Is that actually a thing?” Or “When did that become a thing?” Or “How is that even a thing?” Calling something “a thing” is, in this sense, itself a thing.

Read the rest of this entry »

Comments (68)

Style or artefact or both?

In "Correlated lexicometrical decay", I commented on some unexpectedly strong correlations over time of the ratios of word and phrase frequencies in the Google Books English 1gram dataset:

I'm sure that these patterns mean something. But it seems a little weird that OF as a proportion of all prepositions should correlate r=0.953 with the proportion of instances of OF immediately followed by THE, and it seems weirder that OF as a proportion of all prepositions should correlate r=0.913 with the proportion of adjective-noun sequences immediately preceded by THE.

So let's hope that what these patterns mean is that the secular decay of THE has somehow seeped into some but not all of the other counts, or that some other hidden cause is governing all of the correlated decays. The alternative hypothesis is that there's a problem with the way the underlying data was collected and processed, which would be annoying.

And in a comment on a comment, I noted that the corresponding data from the Corpus of Historical American English, which is a balanced corpus collected from sources largely or entirely distinct from the Google Books dataset, shows similar unexpected correlations.

So today I'd like to point out that much simpler data — frequencies of a few of the commonest words — shows some equally strong correlations over time in these same datasets.
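
For readers who want to see what a correlation over time of this kind amounts to, here is a minimal sketch in Python. The per-decade proportions are invented placeholder values, not figures from Google Books or COHA, and the function and variable names are mine; the only point is to show how Pearson's r between two frequency series is calculated.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# HYPOTHETICAL values, one per decade, made up for illustration only:
# the share of all word tokens accounted for by each of two common words.
word_a_share = [0.060, 0.058, 0.057, 0.054, 0.052, 0.049]
word_b_share = [0.034, 0.033, 0.033, 0.031, 0.030, 0.029]

r = pearson_r(word_a_share, word_b_share)
print(f"r = {r:.3f}")
```

Any two series that both drift steadily in the same direction will produce a high r, which is one reason strong correlations of this kind are hard to interpret on their own.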

Read the rest of this entry »

Comments (9)

The determiner of the turtle is heard in our land

One useful way to look at "The case of the disappearing determiners" is to compare Bible translations, because this controls to some extent for variation in the underlying message. So as a first tentative step on that path, I compared the Song of Solomon in the King James Version, first published in 1611, with the Song of Solomon in the Message Bible, published between 1993 and 2002.

The overall statistics for the Song of Solomon in the two sources show a relative fall of about 38% in the frequency of "the":

Version  # words  # the  % the
kjv      2663     175    6.57%
msg      2737     111    4.06%
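
The percentages and the relative fall follow directly from the counts in the table. A minimal sketch of the arithmetic in Python (the function and variable names are mine, purely for illustration):

```python
# Word and "the" counts for the Song of Solomon, copied from the table above.
counts = {
    "kjv": {"words": 2663, "the": 175},
    "msg": {"words": 2737, "the": 111},
}

def the_rate(version):
    """Return 'the' as a percentage of all words in the given version."""
    c = counts[version]
    return 100.0 * c["the"] / c["words"]

def relative_fall(old, new):
    """Return the drop from old to new, as a percentage of old."""
    return 100.0 * (old - new) / old

kjv_rate = the_rate("kjv")
msg_rate = the_rate("msg")
fall = relative_fall(kjv_rate, msg_rate)

print(f"kjv: {kjv_rate:.2f}%  msg: {msg_rate:.2f}%  relative fall: {fall:.1f}%")
```

The relative fall is the difference between the two rates divided by the earlier rate, as opposed to the absolute difference of about 2.5 percentage points.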

And here are a couple of specific verses to compare:

kjv 2:12: The flowers appear on the earth; the time of the singing of birds is come, and the voice of the turtle is heard in our land;
msg 2:12: Spring flowers are in blossom all over. The whole world's a choir – and singing! Spring warblers are filling the forest with sweet arpeggios.

kjv 2:17: Until the day break, and the shadows flee away, turn, my beloved, and be thou like a roe or a young hart upon the mountains of Bether.
msg 2:17: Until dawn breathes its light and night slips away. Turn to me, dear lover. Come like a gazelle. Leap like a wild stag on delectable mountains!

Read the rest of this entry »

Comments (35)

Dutch DE

Following up on yesterday's post "The case of the disappearing determiners", Gosse Bouma sent me some data from the CGN ("Corpus Gesproken Nederlands"), about determiner use in spoken Dutch by people born between 1914 and 1987. According to the CGN website,

The Spoken Dutch Corpus project was aimed at the construction of a database of contemporary standard Dutch as spoken by adults in The Netherlands and Flanders. […] In version 1.0, the results are presented that have emerged from the project. The total number of words available here is nearly 9 million (800 hours of speech). Some 3.3 million words were collected in Flanders, well over 5.6 million in The Netherlands.

It's not clear to me exactly when the recordings were made, but the project ran from 1998 to 2004.

Gosse sent data focused on the word de, which is the definite article for masculine and feminine ("common") nouns in Dutch, cognate with English the.  (The definite article for neuter nouns, het, is less frequent and also can be used as a pronoun.)

The results are similar to those that I reported earlier for English: Older people use the definite article more frequently than younger people (at least for people born in the 1950s onwards), and at every age, men use the definite article more than women.

Read the rest of this entry »

Comments (5)

The case of the disappearing determiners

For the past century or so, the commonest word in English has gradually been getting less common. Depending on data source and counting method, the frequency of the definite article THE has fallen substantially — in some cases at a rate as high as 50% per 100 years.

At every stage, writing that's less formal has fewer THEs, and speech generally has fewer still, so to some extent the decline of THE is part of a more general long-term trend towards greater informality. But THE is apparently getting rarer even in speech, so the change is more than just the (normal) shift of writing style towards the norms of speech.

There appear to be weaker trends in the same direction, at overall lower rates, in German, Italian, Spanish, and French.

I'll lay out some of the evidence for this phenomenon, mostly collected from earlier LLOG posts. And then I'll ask a few questions about what's really going on, and why and how it's happening. [Warning: long and rather wonky.]

Read the rest of this entry »

Comments (54)

Irish DNA and Indo-European origins

"Scientists sequence first ancient Irish human genomes", Press Release from Trinity College Dublin:


A team of geneticists from Trinity College Dublin and archaeologists from Queen's University Belfast has sequenced the first genomes from ancient Irish humans, and the information buried within is already answering pivotal questions about the origins of Ireland's people and their culture.  

The team sequenced the genome of an early farmer woman, who lived near Belfast some 5,200 years ago, and those of three men from a later period, around 4,000 years ago in the Bronze Age, after the introduction of metalworking. […]

These ancient Irish genomes each show unequivocal evidence for massive migration. The early farmer has a majority ancestry originating ultimately in the Middle East, where agriculture was invented. The Bronze Age genomes are different again with about a third of their ancestry coming from ancient sources in the Pontic Steppe.

"There was a great wave of genome change that swept into Europe from above the Black Sea into Bronze Age Europe and we now know it washed all the way to the shores of its most westerly island," said Professor of Population Genetics in Trinity College Dublin, Dan Bradley, who led the study, "and this degree of genetic change invites the possibility of other associated changes, perhaps even the introduction of language ancestral to western Celtic tongues."

Read the rest of this entry »

Comments (14)

Which-hunting — and relative decline?

In "A quantitative history of which-hunting", I reproduced a plot due to (an anonymous colleague of) Jonathan Owen, showing that texts from the last half of the 20th century saw a decrease in the relative frequency of NOUN which VERB, and an increase in the relative frequency of NOUN that VERB. Jonathan took this to indicate the success of (usage guides like) Strunk & White's The Elements of Style in persuading writers and copy-editors to avoid which in "restrictive" (AKA "defining" or "integrated") relative clauses.

Here are some plots showing the effect, for data (without smoothing) from the Google Books ngram corpus. The "British English" dataset shows about the same increase in NOUN that as the "American English" collection does, but somewhat less decrease in NOUN which:

[Two plots: American English; British English]

Read the rest of this entry »

Comments (12)