Style or artefact or both?

In "Correlated lexicometrical decay", I commented on some unexpectedly strong correlations over time of the ratios of word and phrase frequencies in the Google Books English 1-gram dataset:

I'm sure that these patterns mean something. But it seems a little weird that OF as a proportion of all prepositions should correlate r=0.953 with the proportion of instances of OF immediately followed by THE, and it seems weirder that OF as a proportion of all prepositions should correlate r=0.913 with the proportion of adjective-noun sequences immediately preceded by THE.

So let's hope that what these patterns mean is that the secular decay of THE has somehow seeped into some but not all of the other counts, or that some other hidden cause is governing all of the correlated decays. The alternative hypothesis is that there's a problem with the way the underlying data was collected and processed, which would be annoying.

And in a comment on a comment, I noted that the corresponding data from the Corpus of Historical American English, which is a balanced corpus collected from sources largely or entirely distinct from the Google Books dataset, shows similar unexpected correlations.

So today I'd like to point out that much simpler data, frequencies of a few of the commonest words, shows some equally strong correlations over time in these same datasets.

Read the rest of this entry »



Annals of singular "they"

Shane Hickey, "The innovators: the app promising the perfect-fitting bra", The Guardian 1/10/2015:

The sizing technology works via an iPhone app. To use it, a woman must take two pictures of themselves while wearing a tight fitted top in front of a mirror. The phone is held at the bellybutton and a picture is taken from the front and the side. Software developed by Thirdlove then draws up measurements by calculating the distance between the mirror and the contours of the body.

Maybe an editor changed "women" to "a woman" and neglected to change "themselves" to "herself". But I prefer to think that it's just another brick in the singular-they wall — and maybe a vote for "themselves" as the reflexive form?

[h/t Bob Ladd]

 



Spoken Sanskrit

From December 13-17, 2015, I participated in an international workshop at the Israel Institute for Advanced Studies (IIAS) on the Edmond J. Safra campus of Hebrew University in Jerusalem.  The title of the workshop was "A Lasting Vision: Dandin’s Mirror in the World of Asian Letters".  Here's the workshop website.

The workshop was about Sanskrit poetics, especially as detailed in the Kāvyādarśa (simplified transliteration:  Kavyadarsha; Mirror of Poetry) of Daṇḍin (circa AD 7th c.), the earliest surviving systematic treatment of poetics in Sanskrit.

Read the rest of this entry »



Correlated lexicometrical decay

This is a brief progress report on "The case of the disappearing determiners", which I've continued to poke at in my spare time.

As the red line in the plot below shows, the proportion of nouns immediately preceded by THE decreased over the course of the 20th century, from an average of 18.9% for books published in 1900-1910 to 13.5% for books published in 1990-2000.  The blue line shows that the proportion of adjective+noun sequences immediately preceded by THE was higher, overall, but followed a remarkably similar falling trajectory, from 29.1% in 1900-1910 to 21.2% in 1990-2000:

Read the rest of this entry »



ADS Word of the Year is singular "they"

At the American Dialect Society annual meeting in Washington, D.C. (held in conjunction with the Linguistic Society of America), the 2015 Word of the Year selection has been made. The winner is "they" used as a gender-neutral singular pronoun. "They" was recognized by the society particularly for its emerging use as a pronoun to refer to a known person, often as a conscious choice by someone rejecting the traditional gender binary of "he" and "she".

Check out the press release here and my full writeup for Vocabulary.com here. The WOTY vote also has received coverage from Time, the Washington Post, and Business Insider, among others.



Chinese characters and the left-brain vs. right-brain hypothesis

Report of the results of a study that I've long been awaiting:

"Different languages spark same brain activity: study"

by Chen Wei-han, Taipei Times (1/6/16)

TOPIC OF DEBATE:
  An NTNU [National Taiwan Normal University] psychology professor said the results debunk a myth that Chinese and alphabetic languages are processed by different sides of the brain

Read the rest of this entry »



The determiner of the turtle is heard in our land

One useful way to look at "The case of the disappearing determiners" is to compare Bible translations, because this controls to some extent for variation in the underlying message. So as a first tentative step on that path, I compared the Song of Solomon in the King James Version, first published in 1611, with the Song of Solomon in the Message Bible, published between 1993 and 2002.

The overall statistics for the Song of Solomon in the two sources show that the frequency of "the" falls by about 38% in relative terms:

Version   # words   # the   % the
kjv       2663      175     6.57%
msg       2737      111     4.06%
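
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses only the word and "the" counts from the table above; the variable and function names are mine, chosen purely for illustration.

```python
# Counts copied from the table above; nothing else is assumed.
counts = {
    "kjv": {"words": 2663, "the": 175},
    "msg": {"words": 2737, "the": 111},
}


def the_rate(version):
    """Return "the" as a fraction of all words in the given version."""
    c = counts[version]
    return c["the"] / c["words"]


kjv_rate = the_rate("kjv")
msg_rate = the_rate("msg")

# Relative fall: the share of the KJV rate that is gone in the Message version.
relative_fall = (kjv_rate - msg_rate) / kjv_rate

print(f"kjv: {kjv_rate:.2%}  msg: {msg_rate:.2%}  relative fall: {relative_fall:.1%}")
```

The absolute difference is about 2.5 percentage points, but measured against the KJV rate it comes to roughly 38%, which is the sense of "relative" intended here.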

And here are a couple of specific verses to compare:

kjv 2:12: The flowers appear on the earth; the time of the singing of birds is come, and the voice of the turtle is heard in our land;
msg 2:12: Spring flowers are in blossom all over. The whole world's a choir – and singing! Spring warblers are filling the forest with sweet arpeggios.

kjv 2:17: Until the day break, and the shadows flee away, turn, my beloved, and be thou like a roe or a young hart upon the mountains of Bether.
msg 2:17: Until dawn breathes its light and night slips away. Turn to me, dear lover. Come like a gazelle. Leap like a wild stag on delectable mountains!

Read the rest of this entry »



Linguists strike back…



Lu Xun and the Zhao family

Lu Xun (1881-1936) is generally regarded as the greatest Chinese writer of the twentieth century.  Despite his tremendous reputation and enormous influence through the 70s and into the 80s, in recent decades Lu Xun had fallen somewhat into disfavor as the CCP (Chinese Communist Party), which transformed itself into what I call the CCCCMMMMPPPP (Chinese Communist Christo-Confucian Marxist Maoist Militant Mercantilist Propagandistic Pugnacious Plutocratic Party), no longer took kindly to his radical critique of corrupt, feudalistic society.

Read the rest of this entry »



Malheur militia snark

The internet has responded with a wave of snarky hashtags to the self-appointed militia occupying the visitors' center at the Malheur National Wildlife Refuge in Oregon. Many are inappropriately anti-rural (#YokelHaram, #YeeHawdists), or irrelevantly anti-southern (#YallQaeda), but in a case like this, snarky stereotype-based ridicule is a better weapon than gun battles, I guess.

Read the rest of this entry »



R.I.P. John Holm (1943-2015)

Today's New York Times includes an obituary for the pioneering creolist John Holm, with some remembrances from our own Sally Thomason.

Read the rest of this entry »



Dutch DE

Following up on yesterday's post "The case of the disappearing determiners", Gosse Bouma sent me some data from the CGN ("Corpus Gesproken Nederlands"), about determiner use in spoken Dutch by people born between 1914 and 1987. According to the CGN website,

The Spoken Dutch Corpus project was aimed at the construction of a database of contemporary standard Dutch as spoken by adults in The Netherlands and Flanders. […] In version 1.0, the results are presented that have emerged from the project. The total number of words available here is nearly 9 million (800 hours of speech). Some 3.3 million words were collected in Flanders, well over 5.6 million in The Netherlands.

It's not clear to me exactly when the recordings were made, but the project ran from 1998 to 2004.

Gosse sent data focused on the word "de", which is the definite article for masculine and feminine ("common") nouns in Dutch, cognate with English "the".  (The definite article for neuter nouns, "het", is less frequent and also can be used as a pronoun.)

The results are similar to those that I reported earlier for English: Older people use the definite article more frequently than younger people (at least for people born from the 1950s onwards), and at every age, men use the definite article more than women.

Read the rest of this entry »



Chinese phrases of the year 2015

We've already had a look at the candidates for Chinese Word of the Year 2015, but apparently that is too tame and lame, so now we also have to think about the top Chinese phrases of the year.  This photograph illustrates (or perhaps I should say "spawned") one:

Read the rest of this entry »
