Paul Krugman, "My Lawn Guyland Roots Are Showing", The Conscience of a Liberal, 11/24/2013:
I see that some commentators were wondering why, in an earlier post, I wrote “pedal to the medal” instead of “pedal to the metal”. The answer is, typing fast, and writing what I heard in my head. The truth is that sometimes, usually when I’m tired, I do hear myself referring to a bottle of water as a boddle of oo-waugh-duh — gotta get the three-syllable pronunciation there.
I’ve never made a conscious effort to change my accent, and I know from recordings that a bit of the Noo Yawk is still there, but four decades in academia have, I believe, flattened it out into Mid-Atlantic neutral most of the time. But not always, and sometimes not even when I write.
Several commenters noted that Prof. Krugman's explanation is too narrow:
I'm confused by this post. Are there a lot of Americans who sharply differentiate those words? I mean, the whole point about the phrase is that it rhymes.
Generic American accents pronounce metal as 'medal' too, actually.
Those observations are essentially true, though terse. One commenter offers a more extended explanation:
In American English the contrast between d and t is neutralized between vowels. The actual sound is a quick tap of the tongue. Words with these sounds are still distinct, however, since a vowel before a voiced consonant (d) is longer than a vowel before an unvoiced consonant (t), and this length difference has been preserved even though the voiced/unvoiced consonant contrast has been lost for d/t between vowels. Compare rider/writer. The d and t are pronounced the same, but the i of rider is held a bit longer before the tap consonant is pronounced.
This is less accurate and also less relevant.
It's true that in many (but not all) American varieties of English, the phonemes /t/ and /d/ both become the voiced alveolar "tap" or "flap" [ɾ] when they precede a vowel and are not in the onset of that vowel's syllable (see "Raising and lowering those tighty whities", 3/20/2005, for some further details). This happens in some varieties of English outside the U.S. as well.
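It's not true that this happens simply "between vowels". That condition is neither sufficient nor necessary: flapping doesn't happen to the /t/ in "attack", which is between vowels, but it does happen to the /t/ in "parting" and (for many speakers) to the /t/ in "panting", where the /t/ follows a consonant.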
Nor is it true in general that "the length difference is preserved". Most Americans pronounce latter and ladder in exactly the same way, though many feel instinctively that these words are different in sound. In some other cases the lexical distinction is maintained by other means, despite flapping and voicing: thus some speakers have a raised and fronted vowel in write versus ride, and maintain the vowel-quality (and perhaps some vowel length) difference in writer vs. rider. (For some background and discussion of interesting further developments, see e.g. Josef Fruehwald, "The Spread of Raising: Opacity, Lexicalization, and Diffusion", 2008.)
But none of this is strictly relevant to the metal/medal/mettle case. It's true that for most Americans, these words are homophones. But the second syllable is pronounced as a syllabic /l/, not a vowel; and the /t/ or /d/, though short and voiced, is not a flap, but rather a voiced stop released into that syllabic /l/. Like flapping, this is an aspect of the more general lenition of non-onset consonants in American English; but it's not flapping.
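For readers who like to see a rule spelled out mechanically, here is a toy sketch of the conditions described above. It is only an illustration under loud assumptions: the syllabification of each word is supplied by hand (so the claim that the /t/ of "parting" is not an onset is an input, not something the code discovers), the names `Syllable`, `realize`, `VOWELS` and `SYLLABIC_L` are made up for this example, the vowel inventory is just the handful of symbols the examples need, and nothing here models speaker variation, stress, or phonetic detail such as closure duration.

```python
from dataclasses import dataclass
from typing import List

FLAP = "ɾ"
SYLLABIC_L = "l\u0329"  # "l" plus the IPA syllabicity mark (combining vertical line below)
# Hypothetical, deliberately tiny vowel inventory: just enough for the examples.
VOWELS = {"ə", "ɛ", "æ", "ɑ", "ɪ", "ɔ"}


@dataclass
class Syllable:
    """A hand-syllabified chunk: onset consonants, one nucleus, coda consonants."""
    onset: List[str]
    nucleus: str
    coda: List[str]


def realize(syllables: List[Syllable]) -> List[str]:
    """Return surface phones for a word given as a list of syllables.

    A /t/ or /d/ that ends a syllable (i.e. is not an onset) and is
    immediately followed by the next syllable's nucleus is
      - a flap if that nucleus is a vowel,
      - a voiced stop [d] if that nucleus is syllabic /l/.
    Anything else is left unchanged.
    """
    phones: List[str] = []
    for i, syl in enumerate(syllables):
        coda = list(syl.coda)
        nxt = syllables[i + 1] if i + 1 < len(syllables) else None
        if coda and coda[-1] in ("t", "d") and nxt is not None and not nxt.onset:
            if nxt.nucleus in VOWELS:
                coda[-1] = FLAP   # parting, metaphor
            elif nxt.nucleus == SYLLABIC_L:
                coda[-1] = "d"    # metal, medal: voiced stop, not a flap
        phones.extend(syl.onset + [syl.nucleus] + coda)
    return phones


# Hand-syllabified examples (an assumption of this sketch, not a dictionary lookup).
attack = [Syllable([], "ə", []), Syllable(["t"], "æ", ["k"])]
parting = [Syllable(["p"], "ɑ", ["r", "t"]), Syllable([], "ɪ", ["ŋ"])]
metaphor = [Syllable(["m"], "ɛ", ["t"]), Syllable([], "ə", []),
            Syllable(["f"], "ɔ", ["r"])]
metal = [Syllable(["m"], "ɛ", ["t"]), Syllable([], SYLLABIC_L, [])]
medal = [Syllable(["m"], "ɛ", ["d"]), Syllable([], SYLLABIC_L, [])]

for name, word in [("attack", attack), ("parting", parting),
                   ("metaphor", metaphor), ("metal", metal), ("medal", medal)]:
    print(name, " ".join(realize(word)))
```

Under these assumptions, "attack" keeps its [t] because the /t/ is an onset; "parting" and "metaphor" come out with [ɾ]; and "metal" and "medal" come out identical, with a [d] rather than a flap before the syllabic /l/.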
Here's the pronunciation of medal from the online Merriam-Webster dictionary entry, and the corresponding spectrogram:
Here's the same thing for metal:
And to illustrate a true flap, metaphor:
Those pronunciations are all rather precise and even hyper-articulated, as is appropriate for isolated dictionary examples. In more natural speech, the closures in medal and metal would be even shorter and weaker, and might even disappear altogether. But they wouldn't technically become flaps or taps, because the following syllabic /l/ keeps the blade of the tongue pressed against the palate.