Last week, I discussed some of the things that Rev. Jeremiah Wright had to say at the National Press Club about race, language, and the brain ("Wright on language and linguistics", 4/29/2008). But I didn't discuss the passage that many journalists identified as the rhetorical and emotional core of his outburst. (Click the link to hear the audio.)
This is the transcript:
In our community, we have something called playing the dozens.
If you think I'm going to let you talk about my momma,
and her religious tradition, and my daddy, and his religious tradition, and my grandpa,
you got another think coming.
Or is it?
Some newspapers transcribed those last few words the way I did, for example the Boston Globe:
"If you think I'm going to let you talk about my mama," Wright said, "then you've got another think coming."
But many others rendered it a little differently:
“If you think I’m going to let you talk about my mama, and her religious tradition,” he said, pausing a beat, “you got another thing coming.” (New York Times, 4/28/2008)
He added: "If you think I'm going to let you talk about my mama and her religious tradition . . . you got another thing coming." (Washington Post, 4/29/2008)
If you thought that once the Rev. Jeremiah Wright traded his pulpit for a podium he'd be less of a threat to Barack Obama's presidential aspirations, in the words of Wright, "you got another thing coming." (Minneapolis Star-Tribune, 4/30/2008)
Let's leave aside "you've got" vs. "you got", and focus on "another think coming" vs. "another thing coming". The testimony of transcribers is clearly equivocal. As for my own impressions, I first heard "think", but I can hear it as "thing" if I try.
The difference doesn't have any political implications, but several readers asked about it, and it brings up some interesting linguistic issues. So I spent some time this morning writing an explanation of why this word sequence is likely to be ambiguous.
I don't need to say anything here about the history of the eggcorn "another thing coming" for "another think coming", since we've already covered that question at length. (For details, see "Another thing coming", 9/28/2007; "Have another think", 9/28/2008; and also "The thin line between error and mere variation (Part 1 of 2)", 6/29/2004.) The original expression is "another think coming", but "another thing coming" has been competing with it since 1919 at least, and is winning the battle of Google hits these days. So if Rev. Wright made that choice, he's in good company.
The fact that this came up in a recorded question-and-answer session draws our attention to a central feature of this kind of variation: the spelling is different but the sound is similar or identical. As a result, it's often impossible to decide which alternative a speaker chose on some particular occasion. And of course, that's why the variation arises in the first place.
When you write (either version of) this phrase in conventional English orthography, you have to choose between writing "think" and writing "thing". These are very different words. They mean different things, they mostly occur in different contexts, and in many contexts they sound quite different. In isolation, the standard pronunciations, rendered in the International Phonetic Alphabet, look like this: "think" is [θɪŋk] and "thing" is [θɪŋ].
[θ] represents the voiceless dental fricative at the start of "thistle" or "thought", or the end of "moth" or "wrath".
[ɪ] represents the vowel in "hit" or "miss".
[ŋ] represents the velar nasal at the end of "ding" or "dong".
And [k], of course, is the voiceless velar stop at the start of "kiss" or the end of "sick".
Except that /k/ is not so simple.
Well, when you look at the details, nothing about pronunciation is simple, alas. But in this case, the tricky part is the way that English speakers usually pronounce /k/ when it's at the end of a syllable, followed by another stop consonant at the beginning of the next syllable.
Let's take a canonical /k/ case first. Between vowels, at the start of a stressed syllable, English /k/ is a voiceless aspirated velar stop.
The "stop" part means that there's a complete closure of the airway. The "velar" part means that the airway is closed by pressing the body of the tongue against the soft palate, otherwise known as the velum. The "voiceless" part means that the vocal cords don't vibrate during the closure, and the "aspirated" part means that they don't start vibrating again for 60 milliseconds or so after the release of the closure. (The span of time just after the release is filled with noise, created partly by turbulent flow of air through the narrow passage as the tongue body moves away from the palate, and partly by turbulent flow through the glottis as the vocal cords come together to (re-)establish voicing.)
As an illustration, here's an acoustic picture of Amanda Seidl saying the word accomplish, taken from the American English Spoken Lexicon, LDC99L23. (Time goes from left to right; the bottom panel is a waveform display, in which the vertical axis indicates sound pressure level; the top panel is a spectrogram, in which the vertical axis indicates frequency, and the blackness of the plot at a given location indicates the energy of the signal at the corresponding frequency and time.)
The colored lines in the middle panel show the voiceless closure of the [k] (in red), the noisy release and aspiration of the [k] (in green), and the voiced region of the stressed second syllable of accomplish, which overlaps with the aspiration in this case.
(You can click on the display for a larger version.)
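For readers who want to see what a spectrogram display is computing, here is a minimal sketch in Python. It is not the software used for the pictures in this post; the function name `spectrogram`, the frame length, the hop size, and the Hann window are all assumptions chosen for illustration. Each row of `energy` corresponds to one moment in time, and each entry in the row is the signal energy at one frequency, which is what the darkness of the plot encodes.

```python
import cmath
import math

def spectrogram(samples, rate, frame_len=256, hop=128):
    """Return (times, freqs, energy); energy[t][k] is the power at
    frequency freqs[k] in the frame centered at times[t] (seconds).
    frame_len and hop are illustrative choices, not values from the post."""
    # Hann window, to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    bins = frame_len // 2 + 1
    freqs = [k * rate / frame_len for k in range(bins)]
    times, energy = [], []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(bins):
            # Naive discrete Fourier transform of one windowed frame.
            s = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len))
            row.append(abs(s) ** 2)
        times.append((start + frame_len / 2) / rate)
        energy.append(row)
    return times, freqs, energy

# Synthetic test signal (an assumption, not real speech): 0.1 s of a 1000 Hz tone.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(800)]
times, freqs, energy = spectrogram(tone, rate)
peak_bin = max(range(len(freqs)), key=lambda k: energy[0][k])
print("strongest frequency in first frame (Hz):", freqs[peak_bin])
```

A real analysis would use a fast Fourier transform rather than this naive sum, which is written out only to make the frequency-by-time layout explicit.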
Now let's look at /k/ in a different kind of context. I've taken this from the start of yesterday's Radio Times on WHYY, in which Marty Moss-Coane interviewed Gary Marcus about his new book, Kluge. She introduces him like this (again, click the link to play the sound):
My guest, psychologist Gary Marcus, uses the word "kluge" to describe memory, language, and the human mind. And this is the way he uses the word: "a clumsy or inelegant yet surprisingly effective solution to a problem".
Marty Moss-Coane is a skillful and practiced professional speaker, and as you can hear if you click the link, she's enunciating especially carefully in this introductory passage, because she wants to be sure that her audience won't miss any details. Nevertheless, if we look at an acoustic picture of her pronunciation of the word effective, this is what we see:
In the middle of effective, there's a /kt/ cluster. And between vowels, at the start of a stressed syllable, /t/ would be a voiceless aspirated stop just like /k/ is. But here, we don't see two closures and two release-and-aspiration segments, one for /k/ and one for /t/ — instead, there's just one closure (marked by the red line) and just one release and aspiration (marked by the green line).
OK, this means that the /k/ is unreleased. So how do we know that it's there?
Well, this is a curious thing. If we listen to the whole word, we hear a nice clear [kt] cluster. (Well, I do, at least.) But if we listen only to the middle syllable — even including the entire following silence — we don't hear the [k] at all. Instead, we just hear something that you might transcribe in pseudo-phonetic English as "feh", or in IPA as [fɛ].
That's because Marty has used her glottis to stop the voicing — and therefore all the sound — before she makes the [k] closure. What she says for this syllable really *is* "feh".
And when we listen just to the last syllable, again including the whole preceding silence, we don't hear [ktɪv]. In fact, we don't really even hear [tɪv]! Most native speakers of English will hear this syllable, played in isolation, as [dɪv]. That's because the last syllable of effective is unstressed, and so the aspiration is too short for a proper [t] in an isolated (and therefore stressed) syllable.
For the whole story, you need to take a phonetics course. But a crucial piece of the story, here, is that the 85 milliseconds of silent closure between the second and the third syllable is too long to explain any other way.
OK, now back to Rev. Wright. Listen to the audio of the critical phrase, "… you've got another ??? coming", and look at an acoustic picture of the "think/thing coming" sequence:
In between the nasal murmur [ŋ] at the end of thing-or-think, and the vowel [ʌ] of the first syllable of coming, there's a stop closure of about 55 milliseconds (marked in red) and about 60 milliseconds of aspiration (marked in green). (There's some noise during what I've marked as the "closure", but that is probably the start of the audience reaction, though it's possible that the /k/ closure has weakened to the point that it becomes fricative-like.)
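I marked the closure by eye, but the idea behind the measurement can be sketched in a few lines of Python: chop the signal into short frames, compute the energy of each frame, and find the longest stretch where the energy stays low. Everything here is an assumption made for illustration: the function names, the 5 ms frame, the relative threshold, and above all the synthetic signal (tone, silence, tone), which stands in for a real recording.

```python
import math

def frame_rms(samples, rate, frame_ms=5.0):
    """RMS energy of consecutive, non-overlapping frames."""
    n = max(1, int(rate * frame_ms / 1000))
    return [math.sqrt(sum(x * x for x in samples[i:i + n]) / n)
            for i in range(0, len(samples) - n + 1, n)]

def longest_quiet_run_ms(samples, rate, frame_ms=5.0, rel_threshold=0.1):
    """Duration in ms of the longest run of frames whose RMS is below
    rel_threshold times the RMS of the loudest frame."""
    rms = frame_rms(samples, rate, frame_ms)
    limit = rel_threshold * max(rms)
    best = run = 0
    for value in rms:
        run = run + 1 if value < limit else 0
        best = max(best, run)
    return best * frame_ms

def synthetic_token(rate=16000, closure_ms=55):
    """Hypothetical stand-in for 'voiced sound + stop closure + vowel':
    100 ms of a 150 Hz tone, a silent gap, then 100 ms of tone again."""
    def tone(ms):
        n = int(rate * ms / 1000)
        return [math.sin(2 * math.pi * 150 * t / rate) for t in range(n)]
    silence = [0.0] * int(rate * closure_ms / 1000)
    return tone(100) + silence + tone(100)

rate = 16000
print(longest_quiet_run_ms(synthetic_token(rate, 55), rate))
```

On a real recording the closure is rarely this clean. As noted above, there is noise during the interval I've marked as closure, so a fixed threshold like this one would need tuning, and the result would still deserve a check against the spectrogram.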
That's just one closure and just one aspiration, not two — so does that mean that there's only one /k/, and therefore that Rev. Wright said "thing coming", not "think coming"?
Well, no. No native speaker of English would release the final /k/ of think in a fluent pronunciation of "think coming", any more than they would in the second syllable of "effective".
So does the duration of this stop closure tell us whether Rev. Wright meant to say one /k/ or two?
No, I don't think so. As an example of why, consider a case elsewhere in his performance where Rev. Wright used a word sequence that clearly does involve two voiceless stops across a word boundary: "They have a different person to whom they are accountable".
Here's the acoustic picture:
Again, the /t p/ sequence yields just one closure and just one aspiration. And in this case, the closure is again about 50 to 55 milliseconds long.
In comparison, there's a nearby phrase "… based on sound bites, based on polls …", in which he says a word sequence ("on polls") where we know that a word-final nasal is followed by a word-initial voiceless stop.
Here's the acoustic picture:
In this case as well, the closure is between 50 and 55 milliseconds long.
We can't conclude anything for sure from one example of each category, and I don't have the time this morning to do a larger instrumental analysis of Rev. Wright's speech. But these examples are enough to suggest what I expect, on other grounds, that we'd find: the acoustic distributions in fluent speech of the patterns
V N C # C V vs. V N # C V [where V="vowel", N="nasal", C="voiceless stop"]
are going to overlap enough that the case we started with — "think coming" vs. "thing coming" — will often not be clearly assigned to one category or the other.
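The three closure measurements quoted above can be put side by side in a small Python sketch. The numbers are the approximate figures given in this post ("about 55", "about 50 to 55", "between 50 and 55" milliseconds); treating "about 55" as a single point and the others as closed ranges is my simplification, and the names `closures_ms` and `overlaps` are made up for the example.

```python
# Closure durations in milliseconds as (low, high), taken from the
# approximate figures in the text. One token per category, so this
# illustrates the problem rather than settling it.
closures_ms = {
    "think/thing coming (disputed)": (55, 55),
    "different person (V N C # C V)": (50, 55),
    "on polls (V N # C V)": (50, 55),
}

def overlaps(a, b):
    """True if two closed intervals (low, high) share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

disputed = closures_ms["think/thing coming (disputed)"]
two_stops = closures_ms["different person (V N C # C V)"]
one_stop = closures_ms["on polls (V N # C V)"]

fits_two_stops = overlaps(disputed, two_stops)
fits_one_stop = overlaps(disputed, one_stop)

print("consistent with two stops (think coming):", fits_two_stops)
print("consistent with one stop (thing coming):", fits_one_stop)
```

Both checks come out true: the disputed closure sits inside the range of the two-stop example and inside the range of the one-stop example, so closure duration alone doesn't separate them.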
[If you'd like to see the broader context of this passage, it's at about 3:28 of this video.]