Nine years ago, I stumbled on an unexpected fact about the filled pauses UM and UH ("Young men talk like old women", 11/6/2005). I found, as I expected, that older people tend to use UH more often than younger people do, and that males tend to use UH more than females. The surprising thing was that UM seemed to work in the opposite way, at least in the (large) American conversational-speech corpus that I looked at: younger people use UM more than older people, and females use UM more than males.
Last summer, some colleagues and I began a study of interviews with adolescents on the autism spectrum compared with neurotypical controls, and one of the features that we looked at was filled pause usage. We found a significant difference in UM vs. UH usage, and subsequently learned that some researchers from OGI had reported a similar finding in a poster at the 2014 International Meeting for Autism Research ("Fillers: Autism, gender, and age", 7/30/2014).
A couple of weeks later, this came up in coffee-break conversation at the Methods in Dialectology meeting in Groningen, and a few of the people sitting around the table in the break room immediately pulled out their laptops and started looking at other datasets. To our surprise, we found essentially the same pattern in the Philadelphia Neighborhood Corpus, in (the spoken part of) the British National Corpus, in the Edinburgh-Glasgow Map Task Corpus, and in collections of Dutch, German, and Norwegian conversational speech. This work has continued (for a partial progress report, see "UM / UH in Norwegian", 10/8/2014), and we hope to finish a journal paper on the topic over the holiday break. As part of the effort, I've looked a bit more closely at one of the datasets used in my 2005 post, and below I'll show you a few of the resulting pictures.