Language Log

Filled pauses in Glasgow

August 17, 2014 @ 9:06 am · Filed by Mark Liberman under Language and gender

In previous posts about filled pauses, we've seen a consistent and large sex difference: women use (what's transcribed as) "um" somewhat more than men do, and men use (what's transcribed as) "uh" a lot more than women do. This pattern has been found in two large conversational telephone speech corpora involving a mix of ages and American regions, in a collection of undergraduate speed-dating transcripts, in a collection of undergraduate "tell me about your weekend" interviews, and in a collection of several hundred sociolinguistic interviews collected over a period of four decades in Philadelphia.

There are apparently also effects of age, of region, of time period, of years of education, of Autism diagnosis, and so on. Today I'll add one more geographical data point — young adults from the Glasgow area — and one more variable — friends vs. strangers.

For this morning's Breakfast Experiment™, I've analyzed the transcripts from the HCRC Map Task Corpus. You can follow this link to read about the design of this collection — the relevant part is here:

Subjects are necessarily paired for the task, and since the pairing is under the experimenter's control we were able to vary systematically the familiarity between the participants, by asking subjects to attend with a friend. Each pair of familiar subjects was tested in coordination with another pair who were unknown to either member of the first pair. Two pairs formed a quadruple of subjects who used among them a different set of four map-pairs, with maps being assigned to pairs by Latin Square. Each subject participated in four dialogues, twice as Instruction Giver and twice as Instruction Follower, once in each case with a familiar partner, and once with an unfamiliar partner. As Instruction Giver they gave directions on the same map, but when following they used different maps each time. Half of the subjects gave instructions to a familiar partner first, the others to an unfamiliar partner first. […]

All sixty-four subjects who participated were undergraduates at the University of Glasgow. Sixty-one of the 64 subjects were Scottish, 56 of them having been born or brought-up within a thirty mile radius of Glasgow. Half the subjects were male, half were female, and their mean age was 20. […]

The experiment uses a Latin Squares design. Participants were asked to come to the experiment with someone they knew, thus forming familiar pairs.

Among other sorts of annotation, the Map Task transcripts are coded for parts of speech. Among the POS tags used is "FP", for "Filled Pause", which covers eight letter-strings:

FP FILLED PAUSE: eh, ehm, er, erm, hmm, mm, uh, um

Of these, "ehm" and "eh" are by far the most common. But for purposes of comparison with other transcription schemes that don't make so many distinctions, I've lumped the five m-final filled-pause letter strings (ehm, erm, hmm, mm, um) as UM-type, and the three vowel-final filled-pause letter strings (eh, er, uh) as UH-type.
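The lumping can be sketched in a few lines of Python. This is only an illustration, not the script used for the counts in this post: the function names `lump` and `count_lumped` are made up here, and the sketch assumes the tokens passed in have already been restricted to those tagged FP in the corpus.

```python
from collections import Counter

# The eight FP letter-strings, split as described above.
UM_TYPE = {"ehm", "erm", "hmm", "mm", "um"}   # m-final
UH_TYPE = {"eh", "er", "uh"}                  # vowel-final

def lump(token):
    """Return 'UM', 'UH', or None for a token that is not a filled pause."""
    t = token.lower()
    if t in UM_TYPE:
        return "UM"
    if t in UH_TYPE:
        return "UH"
    return None

def count_lumped(tokens):
    """Count UM-type and UH-type tokens in an iterable of word strings."""
    counts = Counter({"UM": 0, "UH": 0})
    for token in tokens:
        kind = lump(token)
        if kind is not None:
            counts[kind] += 1
    return dict(counts)
```

Matching on the letter-string alone would also catch any "eh" or "mm" that the corpus tags as something other than FP, which is why the sketch assumes pre-filtered input.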

And in addition to dividing transcript-sides by the sex of the speaker, I've also divided them according to whether the interlocutor was a friend or a stranger.

The individual and lumped counts are as follows:

	All	Male	Female	Friends	Strangers
ehm	693	188	505	298	395
erm	139	100	39	48	91
hmm	13	6	7	8	5
mm	155	71	84	79	76
um	108	30	78	50	58
All M-FP	1108	395	713	483	625
eh	651	501	150	330	321
er	162	128	34	74	88
uh	79	48	31	48	31
All V-FP	892	677	215	452	440
M/(M+V)	0.554	0.368	0.768	0.517	0.587

Here are the same numbers scaled by overall word counts per speaker category, in frequency per 1,000 words:

	All	Male	Female	Friends	Strangers
ehm	4.51	2.31	6.97	3.58	5.59
erm	0.90	1.23	0.54	0.58	1.29
hmm	0.08	0.07	0.10	0.10	0.07
mm	1.01	0.87	1.16	0.95	1.08
um	0.70	0.37	1.08	0.60	0.82
M-FP	7.21	4.86	9.84	5.81	8.85
eh	4.23	6.16	2.07	3.97	4.54
er	1.05	1.57	0.47	0.89	1.25
uh	0.51	0.59	0.43	0.58	0.44
V-FP	5.80	8.32	2.97	5.44	6.23
TOTAL WORDS	153780	81354	72436	83132	70648
M/(M+V)	0.554	0.368	0.768	0.517	0.587
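For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the lumped rows from the counts and word totals given in the tables. The dictionary and function names are illustrative choices made for this sketch, not part of any corpus tooling.

```python
# Lumped counts and word totals copied from the tables in this post.
WORDS = {"All": 153780, "Male": 81354, "Female": 72436,
         "Friends": 83132, "Strangers": 70648}
M_FP = {"All": 1108, "Male": 395, "Female": 713,
        "Friends": 483, "Strangers": 625}
V_FP = {"All": 892, "Male": 677, "Female": 215,
        "Friends": 452, "Strangers": 440}

def per_thousand(count, total_words):
    """Frequency per 1,000 words."""
    return 1000.0 * count / total_words

def um_proportion(m_count, v_count):
    """Share of filled pauses that are UM-type: M/(M+V)."""
    return m_count / (m_count + v_count)

def summarize():
    """Return {group: (M-FP rate, V-FP rate, M/(M+V))}, rounded as in the tables."""
    rows = {}
    for group, total in WORDS.items():
        rows[group] = (
            round(per_thousand(M_FP[group], total), 2),
            round(per_thousand(V_FP[group], total), 2),
            round(um_proportion(M_FP[group], V_FP[group]), 3),
        )
    return rows

if __name__ == "__main__":
    for group, (m_rate, v_rate, prop) in summarize().items():
        print(f"{group}\t{m_rate}\t{v_rate}\t{prop}")
```

Dividing a per-1,000 rate by 10 gives the percentages quoted in the observations that follow.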

General observations:

The frequency of UM is more than twice as great for female speakers compared to male speakers: 0.984% vs. 0.486%.
The frequency of UH is nearly three times as great for male speakers compared to female speakers: 0.832% vs. 0.297%
The proportion UM/(UM+UH) is more than twice as great for female speakers compared to male speakers: 76.8% vs. 36.8%
The frequency of UM-words is somewhat greater between strangers than between friends: 0.885% vs. 0.581%; and the frequency of UH-words is also a little greater: 0.623% vs. 0.544%. Because UM rises more than UH does, the UM/(UM+UH) proportion is also somewhat greater between strangers: 0.587 vs. 0.517.

I'll also note that here again, the males are on average somewhat more talkative than the females.

I remain puzzled about what is really going on here, and I continue to think that we'll need to look at the range of functions for specific instances of filled pauses in order to understand the nature and source of the effects.

Past LLOG posts on UM vs. UH:

"Young men talk like old women", 11/6/2005
"Fillers: Autism, gender, age", 7/30/2014
"More on UM and UH", 8/3/2014
"UM UH 3", 8/4/2014
"Male and female word usage", 8/7/2014
"UM / UH Geography", 8/13/2014
"Educational UM / UH", 8/13/2014
"UM / UH: Life-cycle effects vs. language change", 8/15/2014

16 Comments

Ben Hemmens said,

August 17, 2014 @ 9:31 am

Try Dundee, famous (at least to itself) for turning everything into an eh.
An admirer said,

August 17, 2014 @ 9:38 am

I'm curious whether the way these fillers are spoken makes a difference to why they're used. On a completely unscientific hunch I'd think "um" is more reserved than "uh", as the mouth closes.

When I have time, I'll have to follow up and read what the genders of the pairings were. Aforementioned hunch would expect women to be more reserved towards men in general than men towards women, given men seem to be much more prone to harassment.

Then again, maybe I'm just asking leading questions because I read a lot about that subject recently. Still, the opposite gender to matching gender comparisons might be an interesting tidbit to add to your analysis.
Coby Lubliner said,

August 17, 2014 @ 10:11 am

I wonder what the phonetic meaning of "er" and "erm" is. In non-rhotic England, "er" is the same as "uh", isn't it? But I believe that Glasgow speech is rhotic, so does "er" imply an "r" sound?
Gregory Kusnick said,

August 17, 2014 @ 12:22 pm

Coby Lubliner: I wondered about that too. I'm guessing they're non-rhotic, and the difference between er and uh is the difference between lurk and luck.

[(myl) Seems like a good guess to me. But the Map Task dialogue recordings are available, so we can check. I'll do so when I have a spare couple of hours.]
Pflaumbaum said,

August 17, 2014 @ 3:56 pm

Yes er represents the NURSE vowel in non-rhotic English – realised in most English English accents as something like a lengthened schwa, when not immediately preceding a vowel.

Scottish accents, meanwhile, are not only rhotic, but in most cases lack a NURSE vowel as such. They have /ɛɾ/, /ɪɾ/ and /ʌɾ/, corresponding more or less to what you'd expect from the spelling.

However, I'm pretty certain the interjection spelt er is generally non-rhotic in Scottish English too, varying between /ʌː/ and /ɛː/ (at least when not followed closely by a vowel.)

Alex Ferguson has a few versions, with and without final /m/, in the first few minutes of this interview. All sound non-rhotic to my ear. (It's true he's lived in Manchester for the best part of thirty years, but he's still pretty Glasgow and most of his other r's are rhotic.)
Bob Ladd said,

August 17, 2014 @ 5:26 pm

Since the HCRC Map Task Corpus is a local product and was for some time a minor local industry, it was easy for me to check with some of its creators. Part of one response was "Of course there are no FPs with [r] in the [Map Task Corpus]", which is what the commenters on this thread already suspected. As for the multiple spellings, the same person replies that "apart from the nasal ending, this largely seems to have been at the whim of the transcribers". If that's the case, my guess is that the cases of er/erm come from Southern English transcribers, the cases of uh/um from North American ones, and the rest from people who were paying some attention to the local phonetics of the vowel, which tends to be pretty eh-like in Scotland.
D.O. said,

August 17, 2014 @ 8:43 pm

I looked at the same corpus (it's free!) and tried to figure out where in the conversation one inserts these filled pauses. Nothing dramatic, but here are the results. Maybe someone is more sharp-eyed than I am.

First, classified by the conversation side (not turn; the same speaker may continue before or after the filled pause). Percentages:

	unique	start	end	inside
'UM'	18.4	38.9	15.1	27.6
'UH'	4.8	33.9	11.3	50.0
all	12.4	36.7	13.4	37.6

"Unique" means the whole side consists of the filled pause. Start and end exclude "unique" category, "inside" means there's some speech before and after.

Now by turn. Percentages:

	unique	start	end	inside
'UM'	6.6	32.6	6.7	54.2
'UH'	1.9	27.5	5.2	65.5
all	4.5	30.3	6.0	59.2

Separate results for women and men are not very different from the overall picture.

The only thing that strikes the eye is that 'UM' much more correlates with the beginning/end of the side than 'UH', but it does not lead to the switch of the turn.
Bob Ladd said,

August 18, 2014 @ 12:35 am

In case my earlier comment gives the impression that the transcribers were simply careless about the phonetics, a response from one of the other originators of the Map Task Corpus (and indeed, of the Map Task itself) adds a clarification: "…we felt we could about standardize to things that had dictionary spellings – like 'yeah' – but spending the time on the fine details of filler pronunciation … was not worthwhile".
Coby Lubliner said,

August 18, 2014 @ 10:31 am

While "er" is basically just how "uh" is written in England, I know Americans who (perhaps from reading too many English novels) actually use rhotic "er" as their pause filler. Just like American parents who, when reading "Winnie-the-Pooh" to their kids, say "Eeyore" with the R sound instead of "eeyaw" (meant to represent braying).
Levantine said,

August 18, 2014 @ 5:21 pm

Coby Lubliner, I never realised that about "Eeyore" (though I'm a Londoner, so I would pronounce it non-rhotically anyway). Do Scots sound the final R of this name?
Piotr Gąsiorowski said,

August 19, 2014 @ 5:11 am

Just like American parents who, when reading "Winnie-the-Pooh" to their kids, say "Eeyore" with the R sound instead of "eeyaw" (meant to represent braying).

And thus poor Eeyore, a non-rhotic ass becomes a rhotic…
Brett said,

August 19, 2014 @ 10:06 am

As Eeyore (unlike, say, Owl) was one of the actual stuffed toys in the Milne household, he was presumably named by Christopher Robin, but with the spelling chosen by the boy's father. The name has made a lot more sense to me since I learned how it was supposed to be pronounced, but I still wonder why A. A. Milne chose to spell it that way. Is that a standard way to spell a donkey's bray ("hee-haw" in American English) in England?
Gregory Kusnick said,

August 19, 2014 @ 11:07 am

Even in England (and Boston), eeyaw becomes (rhotic) eeyore when followed by a vowel in phrases like Eeyore and Piglet. Perhaps that influenced the spelling.
SK said,

August 19, 2014 @ 1:28 pm

Brett: I think 'heehaw' is normal for us in England as well, as a spelling for a donkey's bray. I certainly wouldn't expect to see the sound spelt out as 'eeyore' or anything similar. And in general, when aiming to spell out a meaningless sound, I think we would use 'aw' more readily than 'ore' – so for example it seems more appropriate to me to think of the noise an ambulance makes as 'nee-naw' rather than 'nee-nore', even though I'd pronounce them both identically.

I think that might be the motivation behind the spelling 'Eeyore' for the character – it looks like it ought to be a real word with content, rather than just a transcription of a noise (and that's why even non-rhotic speakers often don't spot that it does stand for a donkey's bray). That is to say, 'Eeyore' looks like it could have been a name in its own right, just a funny-looking one that you happen never to have seen before, whereas if you just call a donkey 'Heehaw' or the like, you haven't gone very far towards giving it an individual name at all.
Anthony said,

August 19, 2014 @ 2:47 pm

Kipling writes "dorg" rather than "dawg" (granted this is indeed not a transcription of a meaningless sound).
Ray Girvan said,

August 19, 2014 @ 6:08 pm

@SK: Same here. I'm a native English speaker, somewhat rhotic (southern English), and it took me decades to realise that "Eeyore" was not some nonsense fantasy name like "Skeletore" but a representation of the donkey's bray.

Same with "erm" (the filler sound) in print: even knowing what it's supposed to mean, I still read it as /ɜːrm/, not /əm/.
