UM/UH accommodation

Over the years, we've presented some surprisingly consistent evidence about age and gender differences in the rates of use of different hesitation markers in various Germanic languages and dialects. See the end of this post for a list; or see Martijn Wieling et al., "Variation and change in the use of hesitation markers in Germanic languages", forthcoming:

In this study, we investigate cross-linguistic patterns in the alternation between UM, a hesitation marker consisting of a neutral vowel followed by a final labial nasal, and UH, a hesitation marker consisting of a neutral vowel in an open syllable. Based on a quantitative analysis of a range of spoken and written corpora, we identify clear and consistent patterns of change in the use of these forms in various Germanic languages (English, Dutch, German, Norwegian, Danish, Faroese) and dialects (American English, British English), with the use of UM increasing over time relative to the use of UH. We also find that this pattern of change is generally led by women and more educated speakers.

For other reasons, I've done careful transcriptions (including disfluencies) of several radio and television interview programs, and it occurred to me to wonder whether such interviews show accommodation effects in UM/UH usage. As a first exploration of the question, I took a quick look at four interviews by Terry Gross of the NPR radio show Fresh Air: with Willie Nelson, Stephen King, Jill Soloway, and Lena Dunham.

Willie Nelson was born in 1933; Stephen King in 1947; Jill Soloway in 1965; and Lena Dunham in 1986. Nelson and King are male, while Soloway and Dunham are female. So on the basis of both age and gender, we expect some differences in UM/UH usage. And the interview transcripts confirm the expectation. The percentage of UM among UM+UH hesitation markers varies from 0% for Willie Nelson to 69.4% for Lena Dunham:

               Words  #UM  #UH  UM per KW  UH per KW  UM/(UM+UH)
Willie Nelson   3318    0   92          0       27.7          0%
Stephen King    4750    4   65        0.9       13.7        5.8%
Jill Soloway    4327   42   25        9.7        5.8       62.7%
Lena Dunham     5740   25   11        4.4        1.9       69.4%
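
The derived columns in the table above follow directly from the raw counts: "per KW" is occurrences per thousand words, and the last column is UM as a share of all UM+UH tokens. Here is a minimal Python sketch of that arithmetic. The function and variable names are illustrative assumptions, not part of the original analysis; only the counts come from the table.

```python
# Counts copied from the table above: (words, #UM, #UH) for each guest.
GUESTS = {
    "Willie Nelson": (3318, 0, 92),
    "Stephen King": (4750, 4, 65),
    "Jill Soloway": (4327, 42, 25),
    "Lena Dunham": (5740, 25, 11),
}


def per_kw(count, words):
    """Occurrences per thousand words (the 'per KW' columns)."""
    return 1000 * count / words


def um_share(um, uh):
    """UM as a percentage of all UM+UH tokens; None if there are no tokens."""
    total = um + uh
    return None if total == 0 else 100 * um / total


for name, (words, um, uh) in GUESTS.items():
    share = um_share(um, uh)
    share_text = "n/a" if share is None else f"{share:.1f}%"
    print(f"{name:14s} UM/KW={per_kw(um, words):5.1f} "
          f"UH/KW={per_kw(uh, words):5.1f} UM share={share_text}")
```

Values recomputed this way may differ from the table in the last digit because of rounding.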

What about Terry Gross? She's female, and born in 1951, so we might expect her rates to fall somewhere in the middle of the observed distributions. And overall, that's true — in the transcripts of those four interviews (or at least the parts that are presented on the Fresh Air website) her combined numbers show an UM proportion of 52.8%:

             Words  #UM  #UH  UM per KW  UH per KW  UM/(UM+UH)
Terry Gross   5885   38   34        6.5        5.8       52.8%

But if we look at the numbers interview by interview, we see the sort of pattern that we would expect if some accommodation were going on, with her UM percentages varying from 26.7% with Stephen King to 82.6% with Lena Dunham:

                   Words  #UM  #UH  UM per KW  UH per KW  UM/(UM+UH)
TG(Willie Nelson)   1482    6   14        4.0        9.4         30%
TG(Stephen King)    1172    4   11        3.4        9.4       26.7%
TG(Jill Soloway)    1516    9    5        5.9        3.3       64.3%
TG(Lena Dunham)     1715   19    4       11.1        2.3       82.6%
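
One way to make the apparent accommodation pattern concrete is to pair each guest's UM share with the host's UM share in the same interview and see whether the two move together. The sketch below does this with a hand-rolled Pearson correlation. This is an illustrative choice of summary statistic, not something computed in the original analysis, and with only four data points it is a description of these tables rather than evidence of a general effect.

```python
from math import sqrt

# (guest #UM, guest #UH, host #UM, host #UH), copied from the two tables above.
INTERVIEWS = {
    "Willie Nelson": (0, 92, 6, 14),
    "Stephen King": (4, 65, 4, 11),
    "Jill Soloway": (42, 25, 9, 5),
    "Lena Dunham": (25, 11, 19, 4),
}


def um_share(um, uh):
    """UM as a fraction of all UM+UH tokens (assumes at least one token)."""
    return um / (um + uh)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


guest = [um_share(gu, gh) for gu, gh, _, _ in INTERVIEWS.values()]
host = [um_share(hu, hh) for _, _, hu, hh in INTERVIEWS.values()]

for name, g, h in zip(INTERVIEWS, guest, host):
    print(f"{name:14s} guest UM share={g:6.1%}  host UM share={h:6.1%}")

r = pearson(guest, host)
print(f"Pearson r over {len(guest)} interviews: {r:.2f}")
```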

Four interviews is not nearly enough to draw any strong conclusions from. But I predict that we should find similar accommodation effects, at least for some speakers, if we look more broadly.

Unfortunately for this purpose, the transcripts provided on the Fresh Air website omit all hesitation markers, and editing them in by hand takes 1.5 to 2 times the interview time. But there are other collections with more complete transcriptions, so you'll be seeing some more on this topic in the future.

1 Comment

  1. D.O. said,

    November 24, 2015 @ 9:51 pm

    Prof. Liberman, because you've already done this hard work of annotating the 4 interviews, it might be interesting to look at the process of accommodation. Does Mrs. Gross' UM-frequency change in a regular way during the interviews adjusting to the UM-frequency of her interlocutor?

    [(myl) I don't think that the amount of data available here is enough for that to be meaningful. Note that TG's overall rate of hesitation-marker usage (UM+UH) in these interviews is around 12.3 per thousand words, and her word counts range from 1172 to 1715 (mean 1471). So if we looked at (say) the first quarter of the interview vs. the last quarter, we'd be down to expected overall hesitation-marker counts of about 4, with a hypothetical base UM/(UM+UH) rate of about 50%, and asking whether there's evidence of slight deviations up or down from the base rate. This might be feasible with 100 interviews, but I don't think it makes statistical sense with four.

    Also, these are not live interviews, and I don't know how much conversation there might have been outside the broadcast segments. And in fact the Willie Nelson recording is actually a re-run, combining pieces of two earlier interviews. So both for sample-size reasons and for other reasons, this is not the right dataset in which to look for possible dynamics of UM/UH accommodation.]
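
The sample-size argument in the reply above can be sketched numerically. The totals below come from the Terry Gross table; the 50% base rate is the hypothetical one named in the reply; and treating each hesitation marker as an independent coin flip (a binomial model) is a simplifying assumption made here for illustration only.

```python
from math import comb

# Totals for Terry Gross across the four interviews, from the table in the post.
TOTAL_WORDS = 5885
TOTAL_MARKERS = 38 + 34  # UM + UH
N_INTERVIEWS = 4

rate_per_word = TOTAL_MARKERS / TOTAL_WORDS
mean_words = TOTAL_WORDS / N_INTERVIEWS

# Expected UM+UH tokens in one quarter of an average-length interview.
expected_in_quarter = rate_per_word * mean_words / 4
print(f"Markers per thousand words: {1000 * rate_per_word:.1f}")
print(f"Expected markers per quarter-interview: {expected_in_quarter:.1f}")


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# With about 4 markers and a hypothetical 50% base rate, every possible
# UM count is fairly likely by chance alone (assumed binomial model).
n = 4
pmf = [binom_pmf(k, n, 0.5) for k in range(n + 1)]
for k, prob in enumerate(pmf):
    print(f"P({k} of {n} markers are UM) = {prob:.4f}")
```

Under these assumptions, even the most extreme outcomes (no UMs, or all UMs) each turn up by chance about 6% of the time, which is why a within-interview comparison on this dataset would say very little.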

