In English-language conversations, older people tend to use UH more often and UM less often. And at every age, men tend to use UH more than women, and women tend to use UM more than men. These effects are large and robust: they've been documented in at least five independent datasets, from both North America and Great Britain. For details, see the links at the end of this post.
The cited patterns are consistent with two quite different classes of explanation:
- There might be a language change in progress, with older people reflecting the patterns of an earlier time and younger people showing the language of the future, while women are leading the change, as they often do.
- There might be stable gender and life-cycle effects, so that the UM and UH sex and age associations looked the same a few decades in the past, and will look the same a few decades in the future.
And there's an independent question about the functions of the classes of vocalizations that we transcribe as UM and UH:
- Perhaps UM and UH are simply alternative expressions of the same compositional or communicative function — say, two different (classes of) ways of stalling for time in the process of speaking — or alternatively
- perhaps UM and UH have partly or entirely different functions, and it's differences in the frequency of these functions that are associated with age, sex, and so on.
In neither case are the alternatives mutually exclusive — the truth might be some mixture of the two.
Yesterday, Joe Fruehwald looked at UM and UH usage in a dataset with enough time depth that we can tell the difference between a change in progress and a stable life-cycle effect. And he found that the truth seems to be a bit of both.
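To see why time depth matters, here's a toy sketch of the two "pure" hypotheses. The function names and all the numbers are invented for illustration, and have nothing to do with Joe's data or model. The point is only that a single-year snapshot looks the same under both stories, while recordings made decades apart pull them apart.

```python
# Toy illustration only: every number here is made up.

def um_rate_age_graded(age, year):
    """Stable life-cycle effect: the rate depends only on the speaker's age."""
    return 12 - age / 10  # hypothetical UM per 1000 words

def um_rate_change_in_progress(age, year):
    """Change in progress: the rate depends only on the speaker's year of birth."""
    birth_year = year - age
    return (birth_year - 1900) / 10  # hypothetical UM per 1000 words

for model in (um_rate_age_graded, um_rate_change_in_progress):
    # One snapshot (1980): younger speakers use more UM under BOTH models.
    snapshot = (model(40, 1980), model(70, 1980))
    # Two 40-year-olds, recorded 30 years apart.
    same_age = (model(40, 1980), model(40, 2010))
    # One speaker born in 1940, recorded at 40 and again at 70.
    same_person = (model(40, 1980), model(70, 2010))
    print(model.__name__, snapshot, same_age, same_person)
```

Under the age-graded story, the two 40-year-olds match and the re-recorded speaker changes. Under the change-in-progress story, it's the other way around. A mixture of the two would show some of each.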
The dataset Joe used was (transcripts from) 395 speakers over the four decades of the Philadelphia Neighborhood Corpus. He fitted models of UM and UH frequencies with a three-way interaction between age, sex, and year. Since the number of speakers is relatively small, he sensibly chose to plot the predictions of the fitted model rather than the (much noisier) observations, so think of those plots as a graphical expression of the relationships found by the statistical analysis.
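For readers who don't think in model formulas, here's a minimal sketch of what a full three-way interaction between age, sex, and year expands to: an intercept, three main effects, three two-way products, and one three-way product. This is not Joe's code. The function names, the 0/1 coding of sex, and the coefficients are my assumptions, and I'm leaving out the model family, link function, and any centering or scaling of the predictors.

```python
def interaction_terms(age, female, year):
    """Return the eight predictor columns of a full age * sex * year model.

    `female` is coded 0/1 here; that coding is an assumption for illustration.
    """
    return {
        "intercept": 1,
        "age": age,
        "female": female,
        "year": year,
        "age:female": age * female,
        "age:year": age * year,
        "female:year": female * year,
        "age:female:year": age * female * year,
    }

def linear_predictor(coefs, age, female, year):
    """Sum of coefficient times predictor over all eight terms."""
    terms = interaction_terms(age, female, year)
    return sum(coefs[name] * value for name, value in terms.items())

# Hypothetical coefficients, chosen only to show the mechanics.
coefs = {name: 0.0 for name in interaction_terms(0, 0, 0)}
coefs["intercept"] = 1.0
coefs["age:female:year"] = 0.5
print(linear_predictor(coefs, age=2, female=1, year=3))
```

The three-way term is what lets the age slope differ by sex, and lets that difference itself shift from year to year.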
If you're familiar with R, Joe used knitr to produce a neat record of his efforts that you'll want to take a look at. Below I'll just present his graphics and the accompanying narrative, omitting the R code.
Here's the predicted frequency of UM per 1000 words, with the lines joined up according to year of study.
You can see the apparent-time change, which shifts upwards every year of the study.
Connecting up the predicted values by Date of Birth cohorts gives a different view:
It looks like, if anything, speakers in a DOB cohort shift towards more UM across their lifespan, which is also the direction of the real- and apparent-time change.
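The two views are the same predicted points, just joined up differently: once by year of study, and once by date of birth (year of study minus age). Here's a small sketch of that regrouping. The predicted rates are invented, and `group_lines` is a hypothetical helper, not anything from Joe's R code.

```python
from collections import defaultdict

# Hypothetical predicted UM rates (per 1000 words), keyed by (year of study, age).
predictions = {
    (1980, 20): 5.0, (1980, 50): 3.0,
    (2010, 20): 9.0, (2010, 50): 6.0,
}

def group_lines(preds, key):
    """Group (age, rate) points into plot lines according to key(year, age)."""
    lines = defaultdict(list)
    for (year, age), rate in sorted(preds.items()):
        lines[key(year, age)].append((age, rate))
    return dict(lines)

# One line per year of study: the apparent-time view.
by_study_year = group_lines(predictions, lambda year, age: year)

# One line per date-of-birth cohort: the lifespan view.
by_birth_cohort = group_lines(predictions, lambda year, age: year - age)

print(by_study_year)
print(by_birth_cohort)
```

In this made-up example, the 1960 cohort shows up twice, at age 20 in 1980 and at age 50 in 2010, so its line traces what happens to the same group of speakers as they age.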
And here are some more UM plots, emphasizing the sex differences:
Now the corresponding plots for UH. Again, there’s an apparent-time trend towards less UH, which advances in that direction over each year of the study:
Past LLOG posts on this topic:
"Young men talk like old women", 11/6/2005
"Fillers: Autism, gender, age", 7/30/2014
"More on UM and UH", 8/3/2014
"UM UH 3", 8/4/2014
"Male and female word usage", 8/7/2014
"UM / UH Geography", 8/13/2014
"Educational UM / UH", 8/13/2014