You should go read "Two Linguists Explain Pseudo Old English in The Wake", The Toast 6/14/2016. Gretchen McCulloch interviews Kate Wiles about the imitation-Old-English that Paul Kingsnorth uses in The Wake, a novel about resistance to the Norman invasion of England in 1066.
Archive for Linguistic history
Lila Gleitman points out to me that in one of the slowly increasing number of articles passing round the pseudoscientific story about Yiddish originating in four villages in Turkey you can see that hallmark of non-serious language research, the X-people-have-Y-words-for-Z trope:
Putting together evidence from linguistic, history, and genetics, we concluded that the ancient Ashkenazic Jews were merchants who developed Yiddish as a secret language — with 251 words for "buy" and "sell" — to maintain their monopoly. They were known to trade in everything from fur to slaves.
You can see the article here, but don't take that as a recommendation; it looks to me like unsubstantiated drivel. Exactly 251 words for buying and selling? No examples cited, and no hint of how more than two basic words and a few random approximate synonyms could be the slightest bit useful? It looks like classic myth-repetition of the usual Eskimo-words-for-snow sort.
In the comments on my post "When did 'a thing' become a thing", 4/18/2016, James Barrett points us to a video from the Royal Society that includes the following passage from a letter, dated 1783, from one Eberhard Johann Schröter in St. Petersburg, addressed to Dr. Daniel Solander, an associate of Sir Joseph Banks:
If any body could be thoroughly convinced that a prediction of winds is a thing and possible and real, then to such a person a proper classification of them would be useful.
(This letter was selected to be read because its card was the very last item in the card catalogue of the Royal Society's library.)
This citation suggests that the "is a thing" usage has always been Out There in platonic Idiom World, and may have been incarnated many times through history before it finally caught the memetic brass ring. And never mind that Eberhard Schröter was presumably not a native speaker of English.
In discussions about the history of usage, like this one, people often bring out generic memories ("I heard this all the time back in such-and-such a time period") or even more specific recollections ("I remember so-and-so saying this back in 19XX"). I've done this myself more than once. But recently something happened that made me wonder whether these memories can sometimes be false ones.
Alexander Stern, "Is That Even a Thing?", NYT 4/16/2016:
Speakers and writers of American English have recently taken to identifying a staggering and constantly changing array of trends, events, memes, products, lifestyle choices and phenomena of nearly every kind with a single label — a thing. In conversation, mention of a surprising fad, behavior or event is now often met with the question, “Is that actually a thing?” Or “When did that become a thing?” Or “How is that even a thing?” Calling something “a thing” is, in this sense, itself a thing.
I'm sure that these patterns mean something. But it seems a little weird that OF as a proportion of all prepositions should correlate r=0.953 with the proportion of instances of OF immediately followed by THE, and it seems weirder that OF as a proportion of all prepositions should correlate r=0.913 with the proportion of adjective-noun sequences immediately preceded by THE.
So let's hope that what these patterns mean is that the secular decay of THE has somehow seeped into some but not all of the other counts, or that some other hidden cause is governing all of the correlated decays. The alternative hypothesis is that there's a problem with the way the underlying data was collected and processed, which would be annoying.
And in a comment on a comment, I noted that the corresponding data from the Corpus of Historical American English, which is a balanced corpus collected from sources largely or entirely distinct from the Google Books dataset, shows similar unexpected correlations.
So today I'd like to point out that much simpler data — frequencies of a few of the commonest words — shows some equally strong correlations over time in these same datasets.
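A small synthetic check helps show why a shared decline can, by itself, produce correlations this strong. The sketch below is only an illustration: both series are invented (they are not Google Books or COHA counts), and `pearson` and `first_differences` are helper names chosen here, not anything used in the original analysis.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def first_differences(xs):
    """Change from each point to the next, which removes a linear trend."""
    return [b - a for a, b in zip(xs, xs[1:])]

# HYPOTHETICAL per-decade rates (percent of tokens), made up for illustration:
# two series that both drift downward, each with its own unrelated wiggle.
decades = range(11)
series_a = [6.0 - 0.2 * t + 0.05 * (-1) ** t for t in decades]
series_b = [3.0 - 0.1 * t + 0.05 * (-1) ** (t // 2) for t in decades]

r_levels = pearson(series_a, series_b)
r_changes = pearson(first_differences(series_a), first_differences(series_b))

print(f"r on raw levels:           {r_levels:.3f}")
print(f"r on decade-to-decade change: {r_changes:.3f}")
```

With these made-up numbers the levels correlate strongly while the decade-to-decade changes do not, which is the pattern one would expect if a common trend, rather than a direct link between the two counts, were doing the work.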
One useful way to look at "The case of the disappearing determiners" is to compare Bible translations, because this controls to some extent for variation in the underlying message. So as a first tentative step on that path, I compared the Song of Solomon in the King James Version, first published in 1611, with the Song of Solomon in the Message Bible, published between 1993 and 2002.
The overall statistics for the Song of Solomon in the two sources show a relative fall of about 38% in the frequency of "the":
[Table columns: Version | # words | # the | % the]
And here are a couple of specific verses to compare:
kjv 2:12: The flowers appear on the earth; the time of the singing of birds is come, and the voice of the turtle is heard in our land;
msg 2:12: Spring flowers are in blossom all over. The whole world's a choir – and singing! Spring warblers are filling the forest with sweet arpeggios.
kjv 2:17: Until the day break, and the shadows flee away, turn, my beloved, and be thou like a roe or a young hart upon the mountains of Bether.
msg 2:17: Until dawn breathes its light and night slips away. Turn to me, dear lover. Come like a gazelle. Leap like a wild stag on delectable mountains!
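For readers who want to try this kind of count themselves, here is a minimal sketch applied to the two versions of verse 2:12 quoted above. The tokenizer (lowercase runs of letters and apostrophes) and the function name `the_stats` are assumptions made for this example, not the method behind the book-level figures. A single verse is far too small a sample to say anything about the whole book.

```python
import re

def the_stats(text):
    """Return (number of words, number of 'the' tokens) for a text.

    Words are lowercase runs of letters and apostrophes; this is a
    simplifying assumption, not the original counting method.
    """
    words = re.findall(r"[a-z']+", text.lower())
    n_the = sum(1 for w in words if w == "the")
    return len(words), n_the

kjv_2_12 = ("The flowers appear on the earth; the time of the singing of birds "
            "is come, and the voice of the turtle is heard in our land;")
msg_2_12 = ("Spring flowers are in blossom all over. The whole world's a choir "
            "– and singing! Spring warblers are filling the forest with sweet "
            "arpeggios.")

kjv_words, kjv_the = the_stats(kjv_2_12)
msg_words, msg_the = the_stats(msg_2_12)

kjv_rate = kjv_the / kjv_words
msg_rate = msg_the / msg_words

# Relative fall: how much lower the newer rate is, as a share of the older rate.
relative_fall = 1 - msg_rate / kjv_rate

print(f"KJV: {kjv_the}/{kjv_words} = {kjv_rate:.1%}")
print(f"MSG: {msg_the}/{msg_words} = {msg_rate:.1%}")
print(f"relative fall: {relative_fall:.1%}")
```

The same function could be run over the full text of each translation to get book-level numbers, and the "relative" in "relative fall" is the ratio computed in `relative_fall`, not the difference in percentage points.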
Following up on yesterday's post "The case of the disappearing determiners", Gosse Bouma sent me some data from the CGN ("Corpus Gesproken Nederlands"), about determiner use in spoken Dutch by people born between 1914 and 1987. According to the CGN website,
The Spoken Dutch Corpus project was aimed at the construction of a database of contemporary standard Dutch as spoken by adults in The Netherlands and Flanders. […] In version 1.0, the results are presented that have emerged from the project. The total number of words available here is nearly 9 million (800 hours of speech). Some 3.3 million words were collected in Flanders, well over 5.6 million in The Netherlands.
It's not clear to me exactly when the recordings were made, but the project ran from 1998 to 2004.
Gosse sent data focused on the word de, which is the definite article for masculine and feminine ("common") nouns in Dutch, cognate with English the. (The definite article for neuter nouns, het, is less frequent and also can be used as a pronoun.)
The results are similar to those that I reported earlier for English: Older people use the definite article more frequently than younger people (at least for people born in the 1950s onwards), and at every age, men use the definite article more than women.
For the past century or so, the commonest word in English has gradually been getting less common. Depending on data source and counting method, the frequency of the definite article THE has fallen substantially — in some cases at a rate as high as 50% per 100 years.
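To get a feel for what "50% per 100 years" means year by year, the figure can be converted to a constant annual rate. This sketch assumes a steady proportional (compound) decline, which is a simplification introduced here and not a claim about the actual shape of the curves. The function names are likewise chosen for illustration.

```python
def annual_decline(total_decline, years):
    """Constant yearly proportional decline that compounds to
    `total_decline` over `years` (assumes exponential decay)."""
    return 1 - (1 - total_decline) ** (1 / years)

def remaining_after(total_decline, years, elapsed):
    """Fraction of the starting frequency left after `elapsed` years,
    under the same constant-rate assumption."""
    return (1 - total_decline) ** (elapsed / years)

rate = annual_decline(0.50, 100)
halfway = remaining_after(0.50, 100, 50)

print(f"annual decline: {rate:.2%}")
print(f"share remaining after 50 years: {halfway:.1%}")
```

Under this assumption, a halving per century works out to roughly 0.7% per year, small enough to be invisible in any one year's writing.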
At every stage, writing that's less formal has fewer THEs, and speech generally has fewer still, so to some extent the decline of THE is part of a more general long-term trend towards greater informality. But THE is apparently getting rarer even in speech, so the change is more than just the (normal) shift of writing style towards the norms of speech.
There appear to be weaker trends in the same direction, at overall lower rates, in German, Italian, Spanish, and French.
I'll lay out some of the evidence for this phenomenon, mostly collected from earlier LLOG posts. And then I'll ask a few questions about what's really going on, and why and how it's happening. [Warning: long and rather wonky.]
"Scientists sequence first ancient Irish human genomes", Press Release from Trinity College Dublin:
A team of geneticists from Trinity College Dublin and archaeologists from Queen's University Belfast has sequenced the first genomes from ancient Irish humans, and the information buried within is already answering pivotal questions about the origins of Ireland's people and their culture.
The team sequenced the genome of an early farmer woman, who lived near Belfast some 5,200 years ago, and those of three men from a later period, around 4,000 years ago in the Bronze Age, after the introduction of metalworking. […]
These ancient Irish genomes each show unequivocal evidence for massive migration. The early farmer has a majority ancestry originating ultimately in the Middle East, where agriculture was invented. The Bronze Age genomes are different again with about a third of their ancestry coming from ancient sources in the Pontic Steppe.
"There was a great wave of genome change that swept into Europe from above the Black Sea into Bronze Age Europe and we now know it washed all the way to the shores of its most westerly island," said Professor of Population Genetics in Trinity College Dublin, Dan Bradley, who led the study, "and this degree of genetic change invites the possibility of other associated changes, perhaps even the introduction of language ancestral to western Celtic tongues."
In "A quantitative history of which-hunting", I reproduced a plot due to (an anonymous colleague of) Jonathan Owen, showing that texts from the last half of the 20th century saw a decrease in the relative frequency of NOUN which VERB, and an increase in the relative frequency of NOUN that VERB. Jonathan took this to indicate the success of (usage guides like) Strunk & White's The Elements of Style in persuading writers and copy-editors to avoid which in "restrictive" (AKA "defining" or "integrated") relative clauses.
Here are some plots showing the effect, for data (without smoothing) from the Google Books ngram corpus. The "British English" dataset shows about the same increase in NOUN that as the "American English" collection does, but somewhat less decrease in NOUN which:
[Plot panels: American English | British English]
Dyami Hayes writes to point out that there has been a change over the past century in the relative popularity (at least in printed text) of constructions like these:
What this book sets out to do is to provide some tools, ideas and suggestions for tackling non-verbal reasoning questions.
What it attempts to do is provide a framework for understanding how local governments are organized.
A retired lecturer in medieval history, Dr Paul Booth, has discovered a reference in a 1310 court record to a man named Roger Fuckebythenavele. He believes it really does mean that the man was known as Roger Fuck-By-The-Navel, with the surname (possibly a nickname given by enemies) actually meaning "fuck via the belly button". If so, this may be the earliest known use of the verb fuck in its sexual sense.