Archive for Prosody

Jeopardy gossip

The internet has been working hard at providing Deborah Cameron with material for a book she might write on attitudes towards women's voices. (Background: "Un justified", 7/8/2015; "Cameron v. Wolf" 7/27/2015.)

To see what I mean, sample the tweets for #JeopardyLaura, or read some of the old-media coverage, like "Is this woman the most annoying 'Jeopardy!' contestant ever?", Fox News 11/24/2015:

"Jeopardy!" contestant Laura Ashby is causing quite a stir on social media. The Marietta, Georgia, native isn't getting attention for her two-day winning streak but instead the tone of her voice.  

Ashby first appeared on the competition show on Nov. 6 and when she returned this week the Internet went crazy over her voice.

Several tweeters went out of their way to exemplify Cameron's observation that "This endless policing of women’s language—their voices, their intonation patterns, the words they use, their syntax—is uncomfortably similar to the way our culture polices women’s bodily appearance":

Read the rest of this entry »


"Often more [difficulty] than in this chosen pair"

We've often complained about the ignorant aftermath of E.B. White's ignorant 1959 incitement to which-hunting, which launched the idea that restrictive (or integrated, or defining) relative clauses in English should always and only be introduced by "that", while non-restrictive (or supplementary, or non-defining) relative clauses should be introduced by "which". (See "Reddit blewit" 12/24/2012 for details and additional links. Note that for simplicity, I'm considering only relative clauses with inanimate/nonhuman heads, though the fundamental point remains the same when we add "who" to the mix.)

My point today is that the whole distinction is a false one.

More exactly: The traditional restrictive/non-restrictive dichotomy merges distinct morphological, syntactic, semantic, prosodic, rhetorical, and psychological questions; the correlation among these different dimensions is loose at best; several of the relevant distinctions are gradient rather than categorical; and some of the distinctions are sometimes a matter of pragmatic vagueness rather than grammatical ambiguity.

If I'm right, then modern linguists have been committing White's sin in a less extreme form, trying to impose an over-simplified rationalist taxonomy on a more complex linguistic reality.

Read the rest of this entry »


Cantonese intonation

On a recent flight across the Atlantic, I watched a Hong Kong movie called Gangster Payday in English, 大茶飯 ("Big Tea Rice"?) in Chinese, directed by Lee Po-Cheung. One of the things that struck me was a particular pattern of pitch and time at the ends of certain phrases, involving elongation of the final syllable, typically on a mid-level pitch. It seems to come in bunches, and to occur on quite different phrase-final syllable sequences, so I'm guessing that it's an intonational pattern rather than a series of lexical tones.

The movie is available on YouTube, so I've pulled out a few examples of this phenomenon, in the hope that someone who knows Cantonese (and perhaps also the speech patterns of Hong Kong gangsters — or at least older Hong Kongers of lower-SES origin?) can explain it.

Read the rest of this entry »


Political pitch ranges

I don't have time for much this morning, but here's a plot of the F0 quantiles of the first minute or so of each of six speeches from the 2015 NRA-ILA Leadership Forum:

["F0", pronounced "eff zero", is a conventional designation for the fundamental frequency of the voice, which represents the rate of oscillation of the vocal folds in voiced speech, and is a physical proxy for the psychological dimension of "pitch". "Hz" is the standard abbreviation for "Hertz", the international unit of frequency (cycles per second) named after Heinrich Hertz.]
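For readers who want to see what "F0 quantiles" means in practice, here is a minimal Python sketch. It is not the script behind the plot. The function name, the convention that unvoiced frames are marked with 0, and the sample values are all assumptions made up for illustration.

```python
from statistics import quantiles

def f0_quantiles(f0_track, n=10):
    """Return the n-1 cut points that split the voiced F0 values into n groups.

    Frames with F0 <= 0 are treated as unvoiced and dropped. This is an
    assumed convention; pitch trackers differ in how they mark unvoiced frames.
    """
    voiced = [f for f in f0_track if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    return quantiles(voiced, n=n, method="inclusive")

# Made-up F0 values in Hz; 0.0 marks an unvoiced frame.
track = [0.0, 110.0, 115.0, 120.0, 0.0, 130.0,
         125.0, 140.0, 0.0, 150.0, 135.0, 145.0]

deciles = f0_quantiles(track)         # nine cut points
quartiles = f0_quantiles(track, n=4)  # three cut points
```

Plotting one such set of cut points per speaker gives a compact picture of each speaker's pitch range, without depending on a handful of extreme values.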

Read the rest of this entry »


REAPER

A couple of days ago, I mentioned ("Sarah Koenig", 2/5/2015) that David Talkin was releasing a new pitch-tracking program called REAPER (available from GitHub). After a few minor improvements in documentation, it's ready for the general public.

The reaper program uses the EpochTracker class to simultaneously estimate the location of voiced-speech "epochs" or glottal closure instants (GCI), voicing state (voiced or unvoiced) and fundamental frequency (F0 or "pitch"). We define the local (instantaneous) F0 as the inverse of the time between successive GCI.
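The definition in the last sentence of that description can be written out in a few lines of Python. This is only an illustration of the arithmetic, not REAPER's code. The function name and the GCI times are hypothetical, and the sketch assumes that every interval lies inside a voiced region.

```python
def f0_from_gci(gci_times):
    """Local F0 in Hz for each pair of successive glottal closure instants.

    gci_times: strictly increasing GCI times in seconds (assumed to fall
    within voiced speech, so that each interval is one glottal period).
    """
    f0 = []
    for earlier, later in zip(gci_times, gci_times[1:]):
        period = later - earlier
        if period <= 0:
            raise ValueError("GCI times must be strictly increasing")
        f0.append(1.0 / period)  # F0 is the inverse of the period
    return f0

# Hypothetical GCI times: two 10 ms periods followed by one 8 ms period.
gci = [0.100, 0.110, 0.120, 0.128]
local_f0 = f0_from_gci(gci)  # about 100, 100, and 125 Hz
```

A real tracker also has to decide which intervals are voiced at all, which is why the description lists voicing state as a separate estimate.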

After trying it out, I can recommend it whole-heartedly — it's robust and accurate and fast. It's my new standard pitch tracker.

Read the rest of this entry »


Vocal creak and fry, exemplified

There are several different sorts of things involved on the perceptual side of the phenomena that people call "vocal fry" and (less often but more appropriately) "vocal creak".

One perceptual issue is the auditory equivalent of the visual "flicker fusion threshold". If regular impulse-like oscillations in air pressure are fast enough, we hear them as a tone; as they get slower and slower, we can increasingly separate the individual pressure pulses as independent events. The threshold at which the pulses fuse into a tonal percept is called "auditory flutter fusion" or sometimes "auditory flicker fusion". The transition between separation and fusion is a gradual one, and in the boundary region, we can hear the pattern in both ways, sometimes as what is called a "creak" sound, because it sounds like the creaking of a sticky hinge.

The other issue is the perceptual effect of pressure oscillations that are irregular as well as relatively low in frequency. Large amounts of random local variation in period sound like the sound of frying food, as bubbles of steam randomly form and pop here and there.
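The difference between the two cases is regular versus irregular pulse timing at a low rate, and a small Python sketch can make that concrete. It models only the timing of the pulses, not how they are perceived. The function names, the 40 Hz rate, and the jitter parameter are arbitrary choices for illustration, not measured values.

```python
import random
from statistics import mean, pstdev

def pulse_times(rate_hz, n_pulses, jitter=0.0, seed=0):
    """Times in seconds of n_pulses impulses at a nominal rate.

    jitter is the fraction by which each period may deviate (uniformly)
    from the nominal period; jitter=0.0 gives a regular train.
    """
    if not 0.0 <= jitter < 1.0:
        raise ValueError("jitter must be in [0, 1)")
    rng = random.Random(seed)
    nominal = 1.0 / rate_hz
    times, t = [], 0.0
    for _ in range(n_pulses):
        times.append(t)
        t += nominal * (1.0 + rng.uniform(-jitter, jitter))
    return times

def period_variability(times):
    """Coefficient of variation of the inter-pulse periods (near 0 if regular)."""
    periods = [b - a for a, b in zip(times, times[1:])]
    return pstdev(periods) / mean(periods)

regular = pulse_times(rate_hz=40, n_pulses=50)                # creak-like timing
irregular = pulse_times(rate_hz=40, n_pulses=50, jitter=0.5)  # fry-like timing
```

Comparing `period_variability(regular)` with `period_variability(irregular)` gives one rough number for the "random local variation in period" described above.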

Both creak and fry can occur in the vocal-cord oscillation of human speech. But what people generally call "vocal fry" is more often "vocal creak".

Read the rest of this entry »


Jazz Dispute

Just in case you haven't seen this:

[h/t Taylor Jones]


Phrasal trends in pitch, or, the lab subject's moan

It's been a while since I posted a Breakfast Experiment™ — things have been hectic here — but yesterday in a discussion with some phonetics students, I learned that certain old ideas about (linguistic) intonation have passed out of memory. And in trying to explain these ideas, I posed a problem for myself that is a suitable subject for a little hacking during this morning's breakfast hour. Attention Conservation Notice: We're going to wander in the history-of-phonetics weeds for a while here.

Read the rest of this entry »


Combating stereotypes — with stereotypes

Laura Starecheski, "Can Changing How You Sound Help You Find Your Voice?", NPR All Things Considered 10/14/2014:

Just having a feminine voice means you're probably not as capable at your job.  

At least, studies suggest, that's what many people in the United States think.

There's a gender bias in how Americans perceive feminine voices: as insecure, less competent and less trustworthy.  This can be a problem — especially for women jockeying for power in male-dominated fields, like law.

Read the rest of this entry »


The shape of a spoken phrase in Mandarin

A few years ago, with Jiahong Yuan and Chris Cieri, I took a look at variation in English word duration by phrasal position, using data from the Switchboard conversational-speech corpus ("The shape of a spoken phrase", LLOG 4/12/2006; Jiahong Yuan, Mark Liberman, and Chris Cieri, "Towards an Integrated Understanding of Speaking Rate in Conversation", InterSpeech 2006). As is often the case for simple-minded analysis of large speech datasets, this exercise showed a remarkably consistent pattern of variation — the plot below shows mean duration by position for phrases from 1 to 12 words long:

The Mandarin Broadcast News collection discussed in a recent post ("Consonant effects on F0 in Chinese", 6/12/2014) lends itself to a similar analysis of phrase-position effects on speech timing. So for this morning's Breakfast Experiment™, I ran a couple of scripts to take a first look.
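The tabulation behind this kind of plot (mean word duration by position, computed separately for each phrase length) can be sketched in Python as follows. This is not the script used for either analysis. The input format (one list of word durations per phrase), the function name, and the toy durations are assumptions made up for illustration.

```python
from collections import defaultdict

def mean_duration_by_position(phrases, max_len=12):
    """Map (phrase_length, position) to mean word duration in seconds.

    phrases: iterable of lists of word durations, one list per phrase.
    Positions are 1-based; empty phrases and phrases longer than
    max_len words are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for durations in phrases:
        n = len(durations)
        if n == 0 or n > max_len:
            continue
        for pos, dur in enumerate(durations, start=1):
            sums[(n, pos)] += dur
            counts[(n, pos)] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Toy input: two 2-word phrases and one 3-word phrase (made-up durations).
toy = [[0.20, 0.30], [0.40, 0.50], [0.10, 0.20, 0.60]]
table = mean_duration_by_position(toy)
```

Plotting `table[(n, 1)]` through `table[(n, n)]` as one line per phrase length `n` would give a picture of the same general kind as the one described above.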

Read the rest of this entry »


Consonant effects on F0 in Chinese

Following up on two earlier Breakfast Experiments™ ("Consonant effects on F0 of following vowels", 6/5/2014; "Consonant effects on F0 are multiplicative", 6/6/2014), here are some semi-comparable measurements of consonant effects on fundamental frequency (F0) in Mandarin Chinese broadcast news speech.

[As I warned potential readers of those earlier posts, this is considerably more wonkish than most LLOG offerings.]

Why do people care about the effects of consonant features on F0? The main reason is that tonogenesis — the historical development of lexical tones — often arises from re-interpretation of "micromelodies" of this kind, typically driven by laryngeal features of consonants such as voiceless vs. voiced (e.g. p,t,k,s vs. b,d,g,z). So it's natural to wonder whether languages where this has already happened, like Mandarin Chinese, retain or suppress such effects.

Read the rest of this entry »


Consonant effects on F0 are multiplicative

[Warning: an unusually nerdy follow-up to an unusually nerdy post…] In the comments on yesterday's post "Consonant effects on F0 of following vowels", the question came up whether the effect of consonant voicing on vowel pitch is additive (e.g. plus or minus N Hz) or multiplicative (up or down by M percent). The fact that I calculated the effects in proportional terms indicates that I assumed, without checking, that the effects are multiplicative.

One easy way to check this assumption is to redo the calculations for female vs. male speakers independently, since we expect the overall F0 patterns of female speakers to be about 65-70% higher on average. So for this morning's Breakfast Experiment™ I did just that — it required changing just two characters in the scripts I wrote yesterday, so this was the easiest experiment ever…
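The logic of that check can be shown with a few lines of Python. The numbers below are invented, and deliberately constructed so that the effect is a 5% increase in both groups; they are not results from the experiment. The function name is also hypothetical.

```python
def additive_and_multiplicative_effects(f0_voiceless, f0_voiced):
    """Return (difference in Hz, ratio) for two mean F0 values."""
    return f0_voiceless - f0_voiced, f0_voiceless / f0_voiced

# Invented mean F0 values in Hz after voiceless vs. voiced consonants,
# built so that the voiceless mean is 5% higher in each group.
groups = {"male": (105.0, 100.0), "female": (178.5, 170.0)}

effects = {name: additive_and_multiplicative_effects(*pair)
           for name, pair in groups.items()}

diff_m, ratio_m = effects["male"]
diff_f, ratio_f = effects["female"]
```

If the effect is multiplicative, the ratios agree across the two groups while the differences in Hz scale with overall F0, as they do in this constructed example. If it were additive, the differences in Hz would agree and the ratios would not.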

Read the rest of this entry »


Consonant effects on F0 of following vowels

I spent the past couple of days at a workshop on lexical tone, organized by Kristine Yu at UMass. A topic that came up several times was the question of whether "segmental" influences on pitch — for instance, the fact that voiceless consonants are typically associated with a higher pitch in the first part of a following vowel — might be diminished or even eliminated in languages with lexical tone. Several participants observed that the evidence for this is not very strong: the classical paper on the subject studied a small number of utterances from one speaker in Thai, for example.

So for this morning's Breakfast Experiment™, I wrote a little script that calculates and displays (one way of looking at) these effects in the TIMIT dataset, which includes 10 English sentences spoken by each of 630 speakers. (Specifically, there are two sentences spoken by all 630 speakers; 450 sentences spoken by 7 speakers each; and 1890 sentences each spoken by a single speaker.)
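The breakdown in the parenthesis can be checked against the overall figure with a little arithmetic, here in Python. The variable names are made up for illustration, and the last group is read as 1890 sentences with one speaker each.

```python
speakers = 630
sentences_per_speaker = 10

# (number of distinct sentences, speakers reading each one), as stated above
sentence_groups = [(2, 630), (450, 7), (1890, 1)]

# Total recorded utterances implied by the breakdown
total_utterances = sum(n_sentences * n_speakers
                       for n_sentences, n_speakers in sentence_groups)

# Number of distinct sentence texts
distinct_sentences = sum(n_sentences for n_sentences, _ in sentence_groups)

# The breakdown agrees with 10 sentences from each of 630 speakers.
consistent = total_utterances == speakers * sentences_per_speaker
```

So the 6300 utterances cover 2342 distinct sentences, most of which are read by only one speaker.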

I had to go to a meeting before I had a chance to write up the results, but the meeting ended early enough for me to find 15 minutes before lunch, so:

Read the rest of this entry »
