Why definiteness is decreasing, part 2

In an earlier post on this topic ("Why definiteness is decreasing, part 1"), I suggested that the decrease in definite-article frequency in published English text, over the course of the past century, might be connected with a decrease in formality.  Roughly, this means that writing has been becoming more like speech (though speech has also been changing, and writing and speech remain very different).

In this post, I want to discuss two other socio-stylistic dimensions: age and sex. If the language is changing, then we expect to see "age grading", where younger people tend to exhibit the innovative pattern, while older people's usage is more old-fashioned. And because women are generally the leaders in language change, we expect to see women at every age being more linguistically innovative and men being more conservative. In other words, "young men talk like old women".  And as the plot on the right illustrates, differences by age and sex in the frequency of "the" seem to confirm this hypothesis.

Why definiteness is decreasing, part 1

I ended yesterday's post ("Decreasing Definiteness") with a promise to say more about why the frequency of "the" has decreased so much over the past century or so, and this morning's post will start to redeem that promise.

As several commenters observed, there are probably several different things going on here. But I think that one relevant factor is decreasing formality of style.

I'll leave for another day the question of what formality really is, and why a decrease in formality correlates with a decrease in the frequency of "the". In this post, I'll try to establish two simpler points:

  1. In English text that's more formal, in common-sense terms, "the" is more common;
  2. The formality of (various genres of) English writing has been decreasing over the past century or so.
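
For readers who want to check the first point on their own texts, here is a minimal sketch of what "frequency of the" can mean operationally: the number of occurrences of "the" per 100 word tokens. The function name `the_rate`, the crude tokenization rule, and the two sample sentences are all illustrative assumptions, not the method or data behind the counts discussed in these posts.

```python
import re


def the_rate(text: str) -> float:
    """Return occurrences of 'the' per 100 word tokens (0.0 for empty text)."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return 100.0 * tokens.count("the") / len(tokens)


if __name__ == "__main__":
    # Invented sample sentences, for illustration only.
    formal = "The committee approved the proposal at the end of the session."
    casual = "We liked it, so we said yes and went home."
    print(f"formal: {the_rate(formal):.1f} per 100 words")
    print(f"casual: {the_rate(casual):.1f} per 100 words")
```

A real comparison would need much larger samples and a tokenizer matched to the corpus; two sentences only show the shape of the calculation.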

Labiality and femininity

I recently got this note from Bill Labov, following up on a conversation about UM and UH (see "UM / UH update", 12/13/2014, for a summary):

I've been thinking about the female preference for the labial gesture in hesitation forms, and this returned me to the issues raised by Gordon and Heath in their paper on sex and sound symbolism (Matthew Gordon and Jeffrey Heath, "Sex, Sound Symbolism, and Sociolinguistics", Current Anthropology 1998). I think it's an important contribution because it brings in quite a bit of data on general patterns of sex preference and it's well reviewed by the commentators. I've always been interested in G&H's efforts to explain the general principles of chain shifting that I've extracted.

Gordon and Heath develop the notion of sex differentiation by sound symbolism on an acoustic basis. I'm more inclined to look to articulatory factors, associating the female preference for movement to more peripheral vowels with the expressive gestures of lip spreading and lip rounding. These are associated with fronting and backing somewhat more than with raising. So the preference for um might go along with the female orientation to labial gestures.

Little Urban Anna

A note from David Donnell:

A friend in Urbana, IL informed me this afternoon that a fellow Urbana-ite, Melissa Applebee, was appearing on the game show "Jeopardy" this evening (12/23). However, she lamented, Jeopardy host Alex Trebek pronounced the name of her town as "Urbahna". (It reminded me of people from Colorado and Nevada lamenting that outsiders don't pronounce the penultimate syllables in those Latinate state-names as a short 'a' vowel. Whaddyagonnado?)

So I went in search of the origin of the seemingly-Latinate name of my friend's Illinois town. (Of course, in Italian, Spanish & Portuguese, it means “urban”.)

UM / UH update

Nine years ago, I stumbled on an unexpected fact about the filled pauses UM and UH ("Young men talk like old women", 11/6/2005). I found, as I expected, that older people tend to use UH more often than younger people do, and that males tend to use UH more than females. The surprising thing was that UM seemed to work in the opposite way, at least in the (large) American conversational-speech corpus that I looked at: younger people use UM more than older people, and females use UM more than males.
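
One simple way to put a number on this kind of comparison is the share of UM among all filled pauses, UM / (UM + UH), computed separately for each group of speakers. The sketch below is only an illustration under stated assumptions: the function name `um_share`, the group labels, and the toy token lists are invented, and this is not necessarily the measure used in the posts cited here.

```python
from collections import Counter


def um_share(tokens):
    """Return UM / (UM + UH) for a list of transcript tokens, or None if neither occurs."""
    counts = Counter(t.upper() for t in tokens if t.upper() in {"UM", "UH"})
    total = counts["UM"] + counts["UH"]
    return counts["UM"] / total if total else None


if __name__ == "__main__":
    # Invented toy transcripts keyed by hypothetical (sex, age band) groups.
    groups = {
        ("F", "younger"): "um so I um think uh yes".split(),
        ("M", "older"): "uh well uh I um suppose uh so".split(),
    }
    for group, tokens in groups.items():
        print(group, um_share(tokens))
```

With real data, each group would pool many speakers, and the shares would be compared across age bands within each sex.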

Last summer, some colleagues and I began a study of interviews with adolescents on the autism spectrum compared with neurotypical controls, and one of the features that we looked at was filled pause usage. We found a significant difference in UM vs. UH usage; and subsequently learned that some researchers from OGI had reported a similar finding in a poster at the 2014 International Meeting for Autism Research ("Fillers: Autism, gender, and age", 7/30/2014).

A couple of weeks later, this came up in coffee-break conversation at the Methods in Dialectology meeting in Groningen, and a few of the people sitting around the table in the break room immediately pulled out their laptops and started looking at other datasets. To our surprise, we found essentially the same pattern in the Philadelphia Neighborhood Corpus, in (the spoken part of) the British National Corpus, in the Edinburgh-Glasgow Map Task Corpus, and in collections of Dutch, German, and Norwegian conversational speech. This work has continued (for a partial progress report, see "UM / UH in Norwegian", 10/8/2014), and we hope to finish a journal paper on the topic over the holiday break. As part of the effort, I've looked a bit more closely at one of the datasets used in my 2005 post, and below I'll show you a few of the resulting pictures.

Accent elimination class

In a better world, the speakers of the "standard" variety would take a prejudice elimination class instead.

UM / UH map in the media

Jack Grieve's map ("UM / UH geography", 8/13/2014) has been featured in an article by Nikhil Sonnad, "Um, here’s an, uh, map that shows where Americans use 'um' vs. 'uh'", Quartz 9/15/2014. Unfortunately, the lovely map in the article reverses the UM and UH areas (just as I did in the first version of the 8/13 post).

Um and Uh in Dutch

Below is a guest post by Martijn Wieling, following up on a series of LLOG postings over the years on the effects of sex, age, geography and other factors on the relative frequency of the filler words um and uh: "Young men talk like old women", 11/6/2005; "Fillers: Autism, gender, and age", 7/30/2014; "More on UM and UH", 8/3/2014; "UM UH 3", 8/4/2014; "Educational UM / UH", 8/13/2014; "UM / UH geography", 8/13/2014; "UM / UH: Life-cycle effects vs. language change", 8/15/2014; "Filled pauses in Glasgow", 8/17/2014.

I was surprised to see this effect in the first place; and more surprised to see it robustly replicated in a variety of American English datasets; and even more surprised to see the same pattern in Glasgow. The fact that the same pattern is also found in Dutch raises some interesting questions, about which more later.

Transitive marvel wonders reader

From J.M.:

Am I misreading this cryptic headline (I do confess my severe deficiency of "urban cool"), or has "marvel" become a transitive verb, a synonym for "amaze"? "Rihanna front row as Wang urban cool marvels New York", AFP 9/7/2014.

More on tonal variation in Sinitic

In a number of posts, we have discussed departure from stipulated tonal configurations in speech, e.g.:

"Dissimilation, stress, sandhi, and other tonal variations in Mandarin"

"When intonation overrides tone"

"Where did Chinese tones come from and where are they going?"

In this post, we will focus on the wide variation of tone in names for some family relationships.

Dissimilation, stress, sandhi, and other tonal variations in Mandarin

A few months ago on the Penn campus I heard a Chinese guy and a girl having a conversation in Mandarin, and I was surprised when he twice said, "Wo3 ming2bai4 le."  The rest of his speech was standard, but then he came out with this strange transformation of "Wo3 ming2bai le".  Of course, I shouldn't have been surprised, because I've heard the exact same thing before.  Nonetheless, it still sounded odd to me, since from first-year Mandarin on I've had it drilled into me that this sentence should be pronounced "Wo3 ming2bai le" and that any other pronunciation of ming2bai was wrong.  This was reinforced by the canonical pronunciation ming2bai given in dictionaries and other authoritative sources.

ER and ERM in the spoken BNC

From John Coleman:

Inspired by your recent Language Log pieces, I tried an analysis of "er" vs "erm" in the Spoken BNC. These are the two main transcriptions for filled pauses labelled as "UNC" in the Claws-5 tagset and also "UNC" in the richer set of pos labels used in BNC. I.e. they are distinguished from items labelled as ITJ / INTERJ, in which the few tokens of "uh" and "um" are classified. These "uh"s are almost all in "uh huh" meaning "yes", and many of the "um"s and "mm"s are also in contexts where the "yes" sense is clear. So I disregarded the ITJs and restricted the analysis to UNC "er" and "erm", which are far more numerous in any case. As these are mostly nonrhotic dialects one can interpret "erm" as just schwa + nasality, with no implication of rhoticity; ditto for "er".

UM / UH: Life-cycle effects vs. language change

In English-language conversations, older people tend to use UH more often and UM less often. And at every age, men tend to use UH more than women, and women tend to use UM more than men.  These effects are large and robust: they've been documented in at least five independent datasets, from both North America and Great Britain. For details, see the links at the end of this post.

The cited patterns are consistent with two quite different classes of explanation:

  • There might be a language change in progress, with older people reflecting the patterns of an earlier time and younger people showing the language of the future, while women are leading the change, as they often do.
  • There might be stable gender and life-cycle effects, so that the UM and UH sex and age associations looked the same a few decades in the past, and will look the same a few decades in the future.

And there's an independent question about the functions of the classes of vocalizations that we transcribe as UM and UH:

  • Perhaps UM and UH are simply alternative expressions of the same compositional or communicative function — say, two different (classes of) ways of stalling for time in the process of speaking — or alternatively
  • perhaps UM and UH have partly or entirely different functions, and it's differences in the frequency of these functions that are associated with age, sex, and so on.

In neither case are the alternatives mutually exclusive — the truth might be some mixture of the two.

Yesterday, Joe Fruehwald looked at UM and UH usage in a dataset with enough time depth that we can tell the difference between a change in progress and a stable life-cycle effect. And he found that the truth seems to be a bit of both.

