Archive for February, 2013

Word String frequency distributions

Several people have asked me about Alexander M. Petersen et al., "Languages cool as they expand: Allometric scaling and the decreasing need for new words", Nature Scientific Reports 12/10/2012. The abstract (emphasis added):

We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use which has a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This “cooling pattern” forms the basis of a third statistical regularity, which unlike the Zipf and the Heaps law, is dynamical in nature.

The paper is thought-provoking, and the conclusions definitely merit further exploration. But I feel that the paper as published is guilty of false advertising. As the emphasized material in the abstract indicates, the paper claims to be about the frequency distributions of words in the vocabulary of English and other natural languages. In fact, I'm afraid, it's actually about the frequency distributions of strings in Google's 2009 OCR of printed books — and this, alas, is not the same thing at all.

It's possible that the paper's conclusions also hold for the distributions of words in English and other languages, but it's far from clear that this is true. At a minimum, the paper's quantitative results clearly will not hold for anything that a linguist, lexicographer, or psychologist would want to call "words". Whether the qualitative results hold or not remains to be seen.
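To make the string-versus-word distinction concrete, here is a minimal Python sketch. It is not the paper's method and uses no data from the paper. The two token lists are invented toy examples, the OCR-style misreadings ("tbe", "1og") are hypothetical, and the helper names `rank_frequency` and `vocabulary_growth` are mine. It only shows that counting whitespace-delimited strings treats every misreading as a new vocabulary item, which shifts both the rank-frequency table and the vocabulary growth curve.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, token, count) triples, most frequent first.

    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(tokens)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, tok, n) for rank, (tok, n) in enumerate(ordered, start=1)]


def vocabulary_growth(tokens):
    """Number of distinct types seen after each token.

    This is the corpus-size versus vocabulary-size curve that
    Heaps' law describes.
    """
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


# Hypothetical toy text (an assumption for illustration only).
clean = "the cat sat on the mat and the dog sat on the log".split()

# The same text with two invented OCR-style misreadings.
noisy = "the cat sat on tbe mat and the dog sat on the 1og".split()

clean_types = vocabulary_growth(clean)[-1]
noisy_types = vocabulary_growth(noisy)[-1]

print("types in clean text:", clean_types)
print("types in noisy text:", noisy_types)
print("top string, clean:", rank_frequency(clean)[0])
print("top string, noisy:", rank_frequency(noisy)[0])
```

In this toy case the noisy version has one more type than the clean one, and the most frequent string loses a count to its misread variant. Whether effects of this kind matter at the scale of the Google Books data is exactly the open question.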

Comments (13)

What "foreign language" is on this Dickens poster?

A few years ago, "KateMonkey" posted this query on Flickr:

What is this language?

This was a poster of a book cover on the wall at the Dickens Museum, and all it said was "foreign language".

Really? You can't do better than that? "Foreign language"?

Charles Dickens Museum, Bloomsbury, London

Comments (18)

Love toilet

Comments (36)

Ask Language Log: preference as a verb

Jessica Mason Pieklo, "Texas GOP Considers Turning State Into Tax Dodge Over Contraception Mandate", RH Reality Check 1/30/2013 (emphasis added):

To be considered constitutional, a state tax generally cannot discriminate against interstate commerce. Broadly speaking, the Supreme Court has taken that to mean that any tax which, by its terms or operations, imposes greater burdens on out-of-state goods, activities, or enterprises than on any competing in-state goods, activities or enterprises violates the Commerce Clause and will be struck down. The basic logic of this conclusion is pretty clear—states shouldn't be able to simply preference their own industries at the expense of others if those industries touch or are part of national commerce.

AC asks:

Is this use of "preference" as a verb commonplace? It didn't sound right to my ear. We already have the verbs "prefer" and "show preference".

Comments (31)

Ask Language Log: "In wildcat form"

Joseph Berger, "Modesty in Ultra-Orthodox Brooklyn Is Enforced by Secret Squads", NYT 1/29/2013 (emphasis added):

“We give out proclamations,” said Rabbi Yitzchok Glick, its executive director. “We don’t enforce. It’s like people can decide to keep Shabbos or not. If someone wants to turn on the light on Shabbos, we cannot put him in jail for that.”

But Hasidim interviewed said squads of enforcers did exist in wildcat form.

“There are quite a few men, especially in Williamsburg, who consider themselves Gut’s polizei,” said Yosef Rapaport, a Hasidic journalist, using the words for “God’s police.” “It’s somebody who is a busybody, and they’re quite a few of them — zealots who take it upon themselves and they just enforce. They’re considered crazy, but people don’t want to confront them.”

About the expression "in wildcat form", AMG asks:

I have never heard of this expression and when I Googled it, I only found the football term "wildcat formation" but no references that seem to indicate that this term has entered popular (e.g., non-football) culture. Have you heard of it? Do you know what it means? It seems odd to use such an obscure phrase in a NYTimes article.

Comments (16)