Ken Auletta, "Changing Times: Jill Abramson takes charge of the Gray Lady", The New Yorker 10/24/2011:
The first thing that people usually notice about Jill Abramson is her voice. The equivalent of a nasal car honk, it’s an odd combination of upper- and working-class. Inside the newsroom, her schoolteacherlike way of elongating words and drawing out the last word of each sentence is a subject of endless conversation and expert mimicry. When she appeared on television after her appointment as executive editor, the blogger Ben Trawick-Smith wrote, “Speech pathologists and phoneticians, knock yourself out: what’s going on with Abramson’s speech?” He was deluged with responses. One speculated that, like a politician, she had trained herself to limit the space between sentences so that it would be hard to interrupt her; another said she had probably acquired the accent in an attempt to not sound too New York while she was an undergraduate at Harvard. The writer Amy Wilentz, a college roommate of Abramson’s, has said that the accent probably has something to do with trying to sound a bit like Bob Dylan.
The cited blog post is "Jill Abramson’s Accent", The dialect blog 7/28/2011. LLOG readers were apparently all playing beach volleyball that week, and so no one drew my attention to Ben Trawick-Smith's plea for assistance.
Auletta and Trawick-Smith are talking about Ms. Abramson's performance in a PBS NewsHour interview, "New York Times Names First Woman to Executive Editor Job", 6/2/2011:
The features that caught their attention are (I think) on display in Ms. Abramson's response to Jim Lehrer's first question, "what does it mean to you to become the executive editor of The New York Times?":
It means the world to me uh I grew up
uh here in Manhattan. And uh
The New York Times was worshipped in my family. And what
The Times said was true was the truth. And so,
uh I became an avid reader of the paper as a young schoolkid. And
it seems scarcely believable to me that I will
hold the top uh ed- editorial position in the newsroom.
In fact, two of the unusual features of her vocal style are on display in her first six words, "It means the world to me uh":
The first feature is the long, low, loud, level tail of the phrase — 1.048 seconds of essentially level pitch at the bottom of her pitch range, which is about 141 Hz, and at an average sound level only about 2 dB below the average level of "world", the most prominent syllable in the phrase.
The second, even more unusual, feature is the low-frequency (about 23-24 Hz) amplitude modulation of this stretch of speech. You can see it in the wide-band spectrogram as a pattern of waxing and waning intensity at a rate corresponding to about 1/6 of her fundamental frequency, so that the fine vertical lines representing her pitch pulses get darker and lighter in groups of six or so. I've adjusted the dynamic range of the plot to make this feature saliently visible:
In the frequency domain, 141/6 = 23.5 Hz; in the time domain, 6*.0071 = .043 seconds.
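For readers who want to check the arithmetic, here is a minimal sketch in Python. The 141 Hz fundamental and the grouping of six pulses come from the text above; the function name `modulation` is just a label chosen for this illustration.

```python
def modulation(f0_hz, pulses_per_group):
    """Return (modulation rate in Hz, modulation period in seconds)
    for an amplitude pattern that repeats every `pulses_per_group` pitch pulses."""
    pitch_period_s = 1.0 / f0_hz
    rate_hz = f0_hz / pulses_per_group
    period_s = pulses_per_group * pitch_period_s
    return rate_hz, period_s


# Values quoted in the text: f0 of about 141 Hz, groups of about six pulses.
rate_hz, period_s = modulation(141.0, 6)
print(f"pitch period      : {1.0 / 141.0:.4f} s")
print(f"modulation rate   : {rate_hz:.1f} Hz")
print(f"modulation period : {period_s:.3f} s")
```

The two views are the same fact stated twice: the modulation period is simply the reciprocal of the modulation rate.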
The human ear can't resolve repeated things happening faster than about 15-16 times a second (corresponding to a period of about 0.06 seconds) as sequences of events in time, but rather hears them as low frequency pitches. If you create a sequence of clicks happening every 0.050 seconds (= 20 Hz) and listen to the result, you'll hear a sustained low-frequency buzz rather than a sequence of individual sounds.
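If you'd like to try the click experiment yourself, the following is a rough sketch using only Python's standard library. The function names (`click_train`, `write_wav`), the 44.1 kHz sample rate, and the output file name are assumptions made for this example, not anything used in preparing the post.

```python
import struct
import wave


def click_train(rate_hz, duration_s, sample_rate=44100):
    """Return a list of samples: silence with a one-sample click
    every 1/rate_hz seconds."""
    n = int(duration_s * sample_rate)
    samples = [0.0] * n
    k = 0
    while True:
        idx = round(k * sample_rate / rate_hz)  # sample index of the k-th click
        if idx >= n:
            break
        samples[idx] = 1.0
        k += 1
    return samples


def write_wav(path, samples, sample_rate=44100):
    """Write mono 16-bit PCM audio; samples are floats in [-1, 1]."""
    frames = b"".join(struct.pack("<h", int(s * 0.8 * 32767)) for s in samples)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(frames)


# One click every 0.050 seconds, i.e. 20 clicks per second.
clicks = click_train(20, 2.0)
print("clicks in 2 seconds:", int(sum(clicks)))
```

Calling `write_wav("clicks_20hz.wav", clicks)` saves the result for listening. Regenerating the train at a slower rate, say `click_train(5, 2.0)`, gives a version in which the individual clicks should be easy to pick out, for comparison with the buzz.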
So Ms. Abramson's 20-25 Hz phrase-final amplitude modulation of her 140-145 Hz fundamental frequency is heard as a sort of superimposed infrasound. (Technically "infrasound" should be below 20 Hz, but this is close.) 140 Hz is not unusually low for the bottom of an adult woman's pitch range — but 20-25 Hz is low for humans of any kind.
It's quite common to see phrase-final low-pitched stretches in which alternate pitch periods start to vary systematically in amplitude. This can result in an abrupt apparent doubling in period (and a corresponding halving in fundamental frequency). A clinical variant of this is called "diplophonia", but most people do it sometimes: it's a natural consequence of oscillation in non-linear physical systems, and is a specific instance of the phenomenon of "period doubling" much loved by complexity buffs. I've sometimes seen period-tripling. But this is the first time that I've ever seen such a large-factor amplitude modulation so stably superimposed on a speaker's sequence of pitch pulses.
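To make the idea of an "apparent" change in period concrete, here is a toy sketch in Python. It is not a model of vocal-fold dynamics and not a measurement procedure: it simply takes a made-up list of pulse amplitudes, finds the smallest number of pulses after which the pattern repeats, and divides the 141 Hz fundamental by that number. The amplitude values and the helper name `repeat_period` are invented for illustration.

```python
import math


def repeat_period(amplitudes, tol=1e-9):
    """Smallest shift p (in pulses) at which the amplitude sequence repeats."""
    n = len(amplitudes)
    for p in range(1, n):
        if all(abs(amplitudes[k] - amplitudes[k + p]) <= tol for k in range(n - p)):
            return p
    return n


F0_HZ = 141.0  # fundamental quoted in the text

patterns = {
    # every pulse the same size: ordinary voicing
    "steady": [1.0] * 12,
    # alternate pulses strong and weak: period doubling
    "alternating": [1.0, 0.6] * 6,
    # amplitudes waxing and waning over groups of six pulses
    "grouped by six": [1.0 + 0.5 * math.cos(2 * math.pi * k / 6) for k in range(12)],
}

for name, amps in patterns.items():
    p = repeat_period(amps)
    print(f"{name:15s} repeats every {p} pulse(s); pattern rate {F0_HZ / p:.1f} Hz")
```

With alternating amplitudes the pattern repeats every two pulses, which is the halving of fundamental frequency described above; with groups of six, the pattern rate drops to 141/6 = 23.5 Hz.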
These two features — long, low, loud phrasal tails with near-infrasonic amplitude modulation — are seen again and again in Ms. Abramson's side of this interview. Here's a spectrogram of the last two syllables ("and uh") of the next phrase from her interview:
I grew up here in Manhattan and uh
The ratio between the fundamental and the infrasonic modulation is variable — it seems to be more like 8-to-1 towards the end of this sample — but the general pattern remains the same. Her long/low/loud phrase endings also often shift into "vocal fry", which is a kind of chaotic oscillation; thus the pronunciation of the second syllable of "schoolkid" in the passage quoted above:
Listening to a sample suggests that the same features will be found in other available records of her speech (e.g. the Book Review podcast for 10/14/2011, "Jill Abramson’s ‘Puppy Diaries’" (link to mp3) and many YouTube videos).
There are some other interesting features as well, but these two are what struck me most forcefully in a quick peek.