We've previously observed a surprisingly consistent pattern of age and gender effects on the relative frequency of filled pauses (or "hesitation sounds") with and without final nasals — what we usually write as "um" and "uh" in American English, or often as "er" and "erm" in British English.
Specifically, younger people use the UM form more than older people, while at any age, women use the UM form more than men do. We've seen this same pattern in various varieties of American English and in John Coleman's analysis of the spoken portion of the British National Corpus, and we found the sex effect in the HCRC Map Task Corpus, which involves task-oriented dialogues among college students from Glasgow in Scotland.
It was even more surprising that Martijn Wieling found the same pattern in a collection of Dutch conversational speech. And to make the puzzle more puzzling, Joe Fruehwald's analysis of the Philadelphia Neighborhood Corpus, which includes recordings across several decades of real time, suggests an on-going change in the direction of greater overall UM usage, as well as a life-cycle effect within each cohort of speakers. And Jack Grieve's analysis of Twitter data indicates a pattern of geographical variation within the U.S.
For additional details, see "Young men talk like old women", 11/6/2005; "Fillers: Autism, gender, age", 7/30/2014; "More on UM and UH", 8/3/2014; "UM UH 3", 8/4/2014; "Male and female word usage", 8/7/2014; "UM / UH geography", 8/13/2014; "Educational UM / UH", 8/13/2014; "UM / UH: Lifecycle effects vs. language change", 8/15/2014; "Filled pauses in Glasgow", 8/17/2014; "ER and ERM in the spoken BNC", 8/18/2014; "Um and uh in Dutch", 9/16/2014.
Now Martijn Wieling has found the same pattern in German. His guest post follows.
After conducting the analysis about the uh/um distinction and its relation to gender and age for Dutch speakers, I decided to investigate the same pattern in German. For this purpose I obtained (with help from Thomas Schmidt) frequencies of ‘uh’ and ‘um’ together with age and gender information from the Forschungs- und Lehrkorpus für gesprochenes Deutsch.
The German speakers in this dataset seem to use ‘uh’ (äh or öh) slightly more than ‘um’ (ähm or öhm), 60% versus 40%, but the imbalance is much smaller than for the Dutch data.
A mixed-effects logistic regression model predicting the probability of using ‘um’ (as opposed to ‘uh’) revealed that the relative frequency of ‘um’ is significantly higher for women than for men (p = 0.007) and for younger than for older speakers (p < 0.0001). The table and figure below illustrate this relationship by showing the relative frequency of ‘um’ in four age groups (each containing approximately 25% of the speakers). (Note that the relative frequency of ‘uh’ can be obtained by subtracting these values from 1.)
| Relative frequency of ‘um’ | Male | Female |
|---|---|---|
| Born between 1930 and 1964 | 0.204 | 0.139 |
| Born between 1965 and 1981 | 0.333 | 0.420 |
| Born between 1982 and 1986 | 0.463 | 0.543 |
| Born between 1987 and 2006 | 0.495 | 0.626 |
While the graph suggests an interaction between gender and year of birth, this interaction was not significant (p = 0.33). All results of the analysis can be viewed and replicated here: http://www.let.rug.nl/wieling/ll/analysis-German.html.
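For readers who want to work with the table directly, here is a minimal sketch (not part of the linked analysis) that derives the ‘uh’ frequencies by subtracting from 1 and computes the female-minus-male difference in ‘um’ usage for each birth cohort. The names `UM_FREQ`, `uh_freq`, and `gender_gap` are hypothetical and chosen only for illustration; the numbers are copied from the table above.

```python
# Relative frequency of 'um' by birth cohort and gender,
# copied from the table in the post (not recomputed from the corpus).
UM_FREQ = {
    "1930-1964": {"male": 0.204, "female": 0.139},
    "1965-1981": {"male": 0.333, "female": 0.420},
    "1982-1986": {"male": 0.463, "female": 0.543},
    "1987-2006": {"male": 0.495, "female": 0.626},
}


def uh_freq(um):
    """Relative frequency of 'uh' is the complement of 'um'."""
    return round(1.0 - um, 3)


def gender_gap(cohort):
    """Female minus male 'um' frequency for one cohort (illustrative helper)."""
    row = UM_FREQ[cohort]
    return round(row["female"] - row["male"], 3)


# Print one line per cohort: 'uh' for men, 'uh' for women, and the gender gap.
for cohort, row in UM_FREQ.items():
    print(cohort, uh_freq(row["male"]), uh_freq(row["female"]), gender_gap(cohort))
```

In these figures the gap is negative only for the oldest cohort, which is the visual hint of an interaction that the model did not find to be significant.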
The above is a guest post by Martijn Wieling.
Several things about all this are interesting, not to say puzzling:
- The pattern (greater UM usage by younger people and females) is robust across many geographical and social varieties of English, and at least two other Germanic languages, despite what appear to be overlaid changes across time and space;
- These are very large effects in the distribution of a very common feature, and yet no one seems to be consciously aware of them. I stumbled on the American English pattern in 2005 while looking for something else.
Three sorts of explanation seem to be available:
- Hesitation sounds with and without final nasals have some intrinsic properties, e.g. phonetic symbolism, that differentially attract speakers of different ages and genders;
- Hesitation sounds with and without final nasals have different functions, retained across Germanic languages and dialects, which are differentially useful to speakers of different ages and genders (like uncertainty about what to say vs. uncertainty about how to say it);
- The age and sex associations of hesitation sounds with and without final nasals are purely conventional, like the different lateralization of male and female shirt buttons, but have somehow been retained or reinforced over thousands of years and thousands of miles.
None of these explanations seems very plausible to me — but the facts are clear.