Wow, patterns!

In "Wow…?", 7/17/2011, I presented 10 isolated examples of "wow" or "oh wow" from published telephone conversations, and invited readers to judge the intensity and valence of each of the ten items (where "valence" is taken to mean the speaker's apparent negative or positive evaluation of the situation under discussion). There were 56 usable responses; I discarded another 5 or 6 because of problems like 9 or 11 judgments instead of 10. I've done some simple analysis, described below.

The 56 sets of usable responses were well differentiated and fairly consistent: people evaluated these utterances in a lawful way. This kind of survey has promise as a source of input for efforts to learn the mapping between acoustic properties and human responses.

There's no obvious independent check on the "intensity" judgments, so the main question was how consistent they would be. In the case of the "valence" judgments, we can also look at the context to see how the speaker seems to be evaluating the state of affairs that they're responding to.

Here's a plot of the mean intensity and valence responses to the ten test items (item 10 is coded as "0" for graphical convenience):

Here are the values, with standard errors:

Item   Mean intensity   Std. error   Mean valence   Std. error
(1)    4.77             0.16         5.04           0.13
(2)    4.41             0.18         4.84           0.19
(3)    2.54             0.15         2.57           0.16
(4)    3.32             0.16         2.54           0.19
(5)    4.96             0.16         5.48           0.14
(6)    4.95             0.17         4.09           0.21
(7)    4.70             0.13         4.68           0.20
(8)    4.39             0.14         5.86           0.15
(9)    2.55             0.15         4.71           0.16
(10)   2.07             0.14         2.82           0.16

As several respondents observed, there's an obvious correlation between intensity and valence estimates. Specifically, the correlation between the mean intensity and mean valence responses is r=0.69 — but there seems to be some additional structure in there as well.
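
As a check on that figure, here is a minimal Python sketch that recomputes the Pearson correlation from the ten item means in the table above. The helper name pearson_r is purely illustrative; only the twenty numbers come from the table.

```python
from math import sqrt

# Item means copied from the table above (items 1 through 10).
mean_intensity = [4.77, 4.41, 2.54, 3.32, 4.96, 4.95, 4.70, 4.39, 2.55, 2.07]
mean_valence = [5.04, 4.84, 2.57, 2.54, 5.48, 4.09, 4.68, 5.86, 4.71, 2.82]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(mean_intensity, mean_valence)
print(f"r = {r:.2f}")  # prints: r = 0.69
```

Note that this is the correlation across the ten item means, not across the 560 individual judgments, which could come out differently.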

For a more graphical presentation of the location and differentiation of the responses, here's a plot of 100 "bootstrap" re-estimates of the means:

If you want the raw data, here are the valence judgments and the intensity judgments.
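
For readers who want to try the bootstrap on the raw judgments, the re-estimation step can be sketched roughly as follows. This assumes that each re-estimate resamples the respondents with replacement and keeps each respondent's intensity and valence ratings for an item together; the post does not spell out the exact procedure, so treat this as one plausible version. The function name and the toy ratings are hypothetical placeholders, not the survey data.

```python
import random
from statistics import mean


def bootstrap_means(pairs, n_boot=100, seed=0):
    """Return n_boot (mean intensity, mean valence) re-estimates for one item.

    pairs is a list of (intensity, valence) ratings, one per respondent.
    Each re-estimate draws len(pairs) respondents with replacement.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _step in range(n_boot):
        sample = rng.choices(pairs, k=len(pairs))
        estimates.append(
            (mean(i for i, v in sample), mean(v for i, v in sample))
        )
    return estimates


# Made-up placeholder ratings on a 1-7 scale, NOT the actual responses.
toy_pairs = [(5, 5), (4, 6), (6, 5), (5, 4), (3, 5), (5, 6), (4, 4), (6, 6)]
cloud = bootstrap_means(toy_pairs)
print(len(cloud))  # one (intensity, valence) point per re-estimate
```

Plotting the points in cloud for each item gives a scatter whose spread reflects how much the item's means would move under resampling.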

Now let's look at — and listen to — the contexts. All of these examples came from the Switchboard collection, which comprises about 2,400 two-sided telephone conversations among 543 speakers from all areas of the United States. These calls were recorded by Texas Instruments in 1990-1, under DARPA sponsorship, and distributed by the Linguistic Data Consortium starting in 1992-3.

(1) From sw02153. This is a discussion of paying for college. A's husband went to Notre Dame; B went to Wisconsin, and was able to pay off his loans within two years; A is (positively) impressed by this financial success.

A: well we're still writing checks for the loans for Notre Dame
B: um-hum yeah i'm sure that that's uh it's not uh cheap either
A: but he's really happy he went there
B: well that's good
B: i i got uh by with the minimal amount of financial aid and was able to pay it all off within two years of graduation
A: wow

(2) From sw02201. This is a discussion of the bad consequences of failing a drug test at work. B tells the story of someone who failed a drug test after taking prescription medicine and was suspended without pay during obligatory counseling; A is (negatively) impressed by the financial consequences of this situation.

A: oh i didn't know they took you off the job.
B: i was that i was told that uh somebody had uh some woman had had a cold or something and had some cough medicine prescribed and her husband or maybe the other way around one of them anyway had uh come down with the same thing and used some of the medicine and was tested the next day and failed and was out of a job until he went through counseling
A: wow
B: and it's like you know i can't afford that kind of loss of income you know
A: who can afford that my God i can't afford to miss a day let alone ((six))
B: i mean i'm like then yeah the first time so what i'll go explain myself and

(3) From sw02273. This is a discussion of spending time with children. B explains that his son is off on a world trip, and has been gone for seven months; A is impressed by this long period of separation. The evaluation is negative, in that it reminds him that he didn't spend enough time with his own children, though perhaps there is some positive evaluation of the child's enterprise and independence.

B: uh my oldest son he's uh out seeing the world right now he's in australia at the present time
A: wow
A: so you don't get a chance to to spend much time with him then until when he comes home
B: well he's yeah he's been gone for about seven months now we're expecting him to be getting home here before too much longer hopefully
A: wow
A: i uh unfortunately didn't spend enough time with my children uh where i i had a lot of things that i thought that i needed to do that were more important than than spending time with my children in fact when they were younger

(4) From sw02274. Two women are commiserating about health care costs. B notes that her out-of-pocket medical expenses, not covered by her health plan, were big enough to itemize on her tax return; A's "wow" registers appreciation for this bad situation.

B: you know i'm thinking of you know i ((was)) just doing my tax return for last year
A: um-hum
B: you know if your medical bills are more than a certain percent of your income
A: um-hum
B: you know it's worthwhile to itemize and and last year i was the first year we reached that mark
A: oh you did
B: where my medical where it was worth it to itemize for medical expenses
A: wow
B: because they were just they were that much of a percentage of my income
A: um-hum
B: that it just
A: um-hum
B: you know i just did it it's very frustrating a hot topic for me
A: i know well well you know you're not the only one though um i'm in a situation where when I started my job …

(5) From sw02317. This is a discussion of hobbies: B makes clothes, and has made clothes for weddings; this positively impresses A.

A: what do you do?
B: well i sew
A: oh
B: clothes you know
A: yeah
B: that's like a hobby because i don't ever make anybody nothing but me
A: [laughs]
B: or maybe little kids i've i've done weddings but
A: oh wow
B: not no- not big weddings like you know
A: um-hum
B: big big weddings just bridesmaids' dresses and stuff like that for people that i know

(6) Again from sw02317. The same two women are discussing ballooning. A is positively impressed by the size of the Albuquerque balloon festival.

A: what about that one they do in Albuquerque
B: now that's old that is that is the largest
A: yeah that's the oldest one isn't it
B: yeah that's the largest one in the world they they have over six hundred balloons
A: wow
B: um-hum

(7) From sw02390. In a discussion of credit cards, B talks about a store credit card with a very high interest rate, and A is (negatively) impressed by the rate.

A: credit cards almost seem unfair to a person who's who's got a victim of impulse buying
B: oh yeah store credit cards
A: like unfair advertising or something
B: store credit cards are even higher interest rate than say Master Card or Visa
B: because i i had a i had a Neiman Marcus card for a while
B: and i used it once and then i cut up because the interest rate was like almost twenty three percent
A: oh wow
B: so i i cut i cut that thing up

(8) From sw02430. In a discussion of automobile maintenance, A is (positively) impressed that B replaced a timing chain.

A: well what is it that you prefer to provide as far as maintenance on your vehicle
B: i do all my own maintenance matter of fact i just finished putting a timing chain in my wife's Toyota
A: wow how about that
B: i do i do all of that myself

(9) Again from sw02430. Six minutes later, B continues to impress. His interlocutor may be a bit wowed out at this point…

B: and i even lubricate the car myself
A: wow
B: because i have a i have means of of running it up on on jack stands and and you know i have a creeper and i crawl around underneath it so it's not a big deal

(10) From sw02517. Discussing cats. A is (positively) impressed by the fact that a four-month-old kitten weighs five and a half pounds.

B: but this one is almost all black she's black and chocolate and silver
A: hum she sounds really pretty
B: hm she is she she's gorgeous she
A: how old is she
B: she is four months old
A: well she's just a tiny little thing they're adorable at that age too
B: hum five and a half pounds
A: wow
B: she's uh laying right in front of my keyboard here on the desk right now

The valence judgments were roughly as consistent across listeners as the intensity judgments were, but we can see that this consensus doesn't correspond consistently to the implicit content of the context the utterances came from. There's apparently some signal there, but it seems that listeners are influenced by the perceived intensity. Wows where the speaker seems animated and aroused tend to be interpreted as positive in valence, while low-intensity wows tend to strike listeners as emotionally negative.
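
To make the comparison concrete, here is a small Python sketch that sets the direction of each mean valence against the evaluation suggested by the context. Two things in it are assumptions rather than facts from the survey: the midpoint of 4.0 treats the response scale as running from 1 to 7, and the positive/negative coding of each context is one reading of the descriptions above (item 9 in particular is coded positive because B "continues to impress").

```python
# Mean valence per item, copied from the table above.
mean_valence = {
    1: 5.04, 2: 4.84, 3: 2.57, 4: 2.54, 5: 5.48,
    6: 4.09, 7: 4.68, 8: 5.86, 9: 4.71, 10: 2.82,
}

# Assumed coding of the contexts: True = positive, False = negative.
context_positive = {
    1: True, 2: False, 3: False, 4: False, 5: True,
    6: True, 7: False, 8: True, 9: True, 10: True,
}

MIDPOINT = 4.0  # assumed center of a 1-7 rating scale

# Items whose mean valence falls on the other side of the midpoint
# from the evaluation suggested by the context.
mismatches = [
    item for item, v in mean_valence.items()
    if (v > MIDPOINT) != context_positive[item]
]
agreements = len(mean_valence) - len(mismatches)

print(mismatches)  # [2, 7, 10]
print(agreements)  # 7
```

Under this coding the listeners' mean valence points the same way as the context for seven of the ten items, and the three that disagree are items 2, 7 and 10.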

I should also note again the obvious fact that the prosodic and emotional landscapes are much more complex than the aspects this little experiment tried to address. But I'll leave the rest of the discussion to the comments for now.



14 Comments

  1. Ryan Denzer-King said,

    July 19, 2011 @ 10:11 am

    I'm wondering how well things like average and peak amplitude, average and peak F0, and average and peak intensity, as well as duration, would predict the "intensity" responses, either on their own or in combination. From my casual browsing I've assumed that I can't access the Switchboard collection since I'm not part of an LDC member organization?

    [(myl) Actually, Rutgers has a license — Marian Reed will send you information about the contact person there. If you have any trouble getting a copy from them, let me know and I can send you a duplicate.]

  2. Russell said,

    July 19, 2011 @ 10:44 am

    Ryan:

    You might find that the CS department has access to some LDC corpora, and might be able to convince someone there to do a little resource-sharing (and maybe even on a long-term basis).

    Alternatively, there are some smaller speech corpora distributed freely. I've used the Santa Barbara Corpus of Spoken American English and the CallFriend databases in some of my projects, and they're findable at TalkBank. A preliminary search finds 36 wows in the latter, and 11 in the former.

  3. Ben Hemmens said,

    July 19, 2011 @ 10:53 am

    I really thought this was going to be junk. I wondered whether we were supposed to use the 1-7 scales to represent our imagined spectrum of what wows could be or should set 1 and 7 to the extremes of this set. And after doing the series once I was sure I had assigned more or less random scores. So, several hours later I gave it another go, and was surprised to see I gave quite similar scores.

    Well, well.

    Then again: that the intensity scores are fairly consistent, but the valence scores largely follow the intensity is not really that strange of a result. I don't think we're too good at the valency without context.

  4. Viktor said,

    July 19, 2011 @ 11:02 am

    I am surprised by how much my judgement as a non-native differs from that of natives. I trended in the right direction but I had a very hard time picking up negative valence. Even when listening through them again I can't quite pick up on whatever quality it is I should listen for.

    That'll teach me to be too cocky about my skills with the English language!

  5. Russell said,

    July 19, 2011 @ 11:28 am

    It looks like "wow" communicates, "that's unexpected" and "I'm aligning my pos/neg evaluation of the situation with yours." That is, "wow" is not the optimal way to introduce something like, "but hey, you don't have to look at it that way." This makes it a useful backchannel, but potentially dangerous if you don't listen closely and deploy it at the wrong time.

    So then what we were trying to do was see if acoustic features of a "wow" reflect the stance that the "wow"-speaker believes the interlocutor had. Not that this means it should be more difficult or variable, of course.

    (I guess my description doesn't really work for things like "Wow, we got totally different answers," which I would hesitate to say specifically communicates a positive or negative evaluation. "Wow" also need not even be a response to anything linguistic: Wow, look, they have three moons on their planet. Okay, so some revisions are needed.)

    [(myl) It's plausible that "wow" simply communicates a certain kind of surprise, with the "I'm aligning my pos/neg evaluation of the situation with yours" part just being the default for non-oppositional dialogue. Like other such implicatures, it can be overridden explicitly.]

  6. Sili said,

    July 19, 2011 @ 1:01 pm

    This kind of survey has promise as a source of input for efforts to learn the mapping between acoustic properties and human responses.

    The Zooniverse are looking for new projects that are best solved by throwing lots of (amateur) man-hours at them.

  7. Russell said,

    July 19, 2011 @ 2:21 pm

    @myl

    My intuition is that "wow" encodes some measure of stance-taking that's not necessarily present in, e.g., "huh!" But yes, it may be farfetched to say that it specifically encodes that sort of evaluation alignment.

  8. Bathrobe said,

    July 19, 2011 @ 5:30 pm

    Would the use of 'wow' mark one as a sympathetic or attentive listener? Hearing a person use it would make them sound like they are quite emotionally involved in what you are saying. Also, are there dialect differences? Heavy use to me sounds rather American, but I could be wrong. I don't think I use it that much, but I've never really monitored myself.

  9. Tim Leonard said,

    July 19, 2011 @ 7:49 pm

    Mechanical Turk could provide evaluations cheaply with lots of evaluators who speak English as a second language.

  10. Adrian Morgan said,

    July 19, 2011 @ 9:52 pm

    Regarding the mean intensity scores, my biggest surprise is that #3 was not the absolute lowest. I thought it was clearly so, but according to the hive mind, #10 was even less intense. I would rank the clips from least to most intense as: 3 10 9 4 2 7 8 1 5 6, which is similar to the mean results.

    Regarding the mean valence scores, my biggest surprise is that #6 got such a mediocre result. I would rank the clips from lowest to highest valence as: 4 9 10 3 1 8 2 7 5 6, which bears no relation to the mean results. (This is not the same as what I said in my official survey response, as I've made some changes with hindsight, in particular moving 1 and 8 down.)

    Some scenarios that made sense to me for each clip:

    1 – Response to anecdote about eccentric behaviour: not good or bad, just extreme.
    2 – Intrigued. Valence focused on fact of the telling rather than on what is told.
    3 – Completely emotionless, sounded like voice synthesiser.
    4 – Speaker is bored and making no secret of it.
    5 – Sounded like someone putting on voice. e.g. children's animal cartoon character.
    6 – As per 5. Influenced by thinking wow in children's cartoon would be high valence.
    7 – This was the hardest. No hypothetical scenario came to mind.
    8 – Unsure of valence, but sounded cognitive: "I'll have to think about that" undertone.
    9 – Speaker uninterested and dismissive of the topic, but trying to be polite.
    10 – Speaker finds topic trivial, but is content, doesn't mind much.

  11. Rick Sprague said,

    July 19, 2011 @ 10:07 pm

    MYL: "…we can see that this consensus doesn't correspond consistently to the implicit content of the context the utterances came from."

    I don't think we did too badly. If you compare the direction of the mean valences (from the scale center at 4) to the positive/negative valences predicted for a sympathetic wow-utterer based on the context, we got 7 out of 10 right, which is better than chance, I imagine. The ones we got wrong were samples 2, 7, and 10. Of these, the first two were pretty close to neutral (4.84 and 4.68, respectively) and the last had the weakest mean intensity of any sample (2.07), all of which suggests to me that the poll respondents might have had less convincing evidence/been more uncertain about these samples than the others.

    I'd also like to point out that it's not established that the wow-utterers were themselves consistent with the sympathy-predicted valences. For example, you said of sample 9 that "his interlocutor may be a bit wowed out at this point." His weak "wow" might have been an effort to convey the pragmatic message "I'm bored with this topic" without being overtly rude. You can posit similar unsympathetic mindsets for the wow-utterers in most of the examples, based on such mental/emotional states as envy, relief, doubt, boredom, or ignorance, which they're choosing to hide from their interlocutor. So the fact that the poll respondents didn't sense a sympathy-generated valence in three cases doesn't prove that they missed one.

  12. Dan M. said,

    July 19, 2011 @ 11:58 pm

    Wow, I'm surprised that there was this much consensus. I'm particularly surprised by the agreement on valence.

    I agree with Adrian in thinking that 3 was markedly less intense than 10 when in isolation, but actually hearing it with more from that speaker makes the consensus that it was slightly more intense seem quite reasonable. That makes me wonder if there's inter-accent error in intensity judgement, because I also felt that speaker had a strong regional accent (from Minnesota?).

  13. General updates 2011: July (including new Zazzle product) « The Outer Hoard said,

    July 21, 2011 @ 12:42 am

    […] asking people to give scores for valence and intensity to ten recorded instances of the word "wow", in order to investigate both the accuracy and consistency of people's context-free […]

  14. Charly said,

    July 23, 2011 @ 3:26 am

    me: my heart rate was f*cking insane

    155ish most the time, as high as 180

    c: wow

    ___________

    me: "fantastically pathetic pink garden deer"

    Guess how much that is

    C: haha

    how much

    me: 950 [dollars]… This is what is wrong with bourgeois American society

    Christopher: haha

    wow

    ________

    me: Wow I leave Seattle and we let the [Anaheim] ANGELS get into first place?!
    And 7.5 gb [games behind] in the wild card? Yuck

    ________

    I think these three real-life examples from recent google chats of mine show various shades of connotative meaning for "wow." The first registers surprise and impressedness; the second, negative surprise and disbelief; the third serves as a conversational "filler" phrase (since it was not in response to anything previously said but a fact found out outside the context of the conversation) as well as negative surprise.
