Archive for Prosody

Cantonese intonation

On a recent flight across the Atlantic, I watched a Hong Kong movie called Gangster Payday in English, 大茶飯 ("Big Tea Rice"?) in Chinese, directed by Lee Po-Cheung. One of the things that struck me was a particular pattern of pitch and time at the ends of certain phrases, involving elongation of the final syllable, typically on a mid-level pitch. It seems to come in bunches, and to occur on quite different phrase-final syllable sequences, so I'm guessing that it's an intonational pattern rather than a series of lexical tones.

The movie is available on YouTube, so I've pulled out a few examples of this phenomenon, in the hope that someone who knows Cantonese (and perhaps also the speech patterns of Hong Kong gangsters — or at least older Hong Kongers of lower-SES origin?) can explain it.

Read the rest of this entry »

Comments (18)

Political pitch ranges

I don't have time for much this morning, but here's a plot of the f0 quantiles of the first minute or so of each of six speeches from the 2015 NRA-ILA Leadership Forum:

["F0", pronounced "eff zero", is a conventional designation for the fundamental frequency of the voice, which represents the rate of oscillation of the vocal folds in voiced speech, and is a physical proxy for the psychological dimension of "pitch". "Hz" is the standard abbreviation for "Hertz", the international unit of frequency (cycles per second) named after Heinrich Hertz.]
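For readers who want to try this kind of summary on their own recordings, here is a minimal sketch of computing F0 quantiles from a pitch track. The track values below are made up for illustration (they are not measurements from the speeches), and the convention that 0 marks an unvoiced frame is an assumption about the pitch tracker's output.

```python
import statistics

def f0_quantiles(f0_hz, n=10):
    """Return the n-1 cut points that split the voiced F0 values into n groups."""
    # Assumption: the tracker writes 0 for unvoiced frames, so we drop them.
    voiced = [f for f in f0_hz if f > 0]
    return statistics.quantiles(voiced, n=n, method="inclusive")

# Hypothetical F0 track in Hz, one value per analysis frame.
track = [0, 110, 120, 0, 130, 140, 150, 0, 160, 170, 180, 190]

# Quartile cut points of the voiced frames.
print(f0_quantiles(track, n=4))
```

Plotting these cut points side by side for several speakers gives a rough picture of each speaker's pitch range.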

Read the rest of this entry »

Comments (6)

REAPER

A couple of days ago, I mentioned ("Sarah Koenig", 2/5/2015) that David Talkin was releasing a new pitch tracking program called REAPER (available from GitHub at the link). After a few minor improvements in documentation, it's ready for the general public.

The reaper program uses the EpochTracker class to simultaneously estimate the location of voiced-speech "epochs" or glottal closure instants (GCI), voicing state (voiced or unvoiced) and fundamental frequency (F0 or "pitch"). We define the local (instantaneous) F0 as the inverse of the time between successive GCI.
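The definition in that last sentence is easy to write down. The sketch below is not REAPER's code or API; it is just a plain illustration of "F0 as the inverse of the time between successive GCI", with a hypothetical function name and made-up epoch times, and it ignores the voicing decision entirely.

```python
def f0_from_gci(gci_times_s):
    """Local F0 in Hz: the inverse of each interval between successive GCIs."""
    pairs = zip(gci_times_s, gci_times_s[1:])
    # Skip non-increasing pairs rather than divide by zero.
    return [1.0 / (t1 - t0) for t0, t1 in pairs if t1 > t0]

# Hypothetical glottal closure instants, in seconds.
epochs = [0.000, 0.005, 0.010, 0.020]

# Two 5 ms periods followed by one 10 ms period.
print(f0_from_gci(epochs))
```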

After trying it out, I can recommend it whole-heartedly — it's robust and accurate and fast. It's my new standard pitch tracker.

Read the rest of this entry »

Comments (5)

Vocal creak and fry, exemplified

There are several different sorts of things involved on the perceptual side of the phenomena that people call "vocal fry" and (less often but more appropriately) "vocal creak".

One perceptual issue is the auditory equivalent of the visual "flicker fusion threshold". If regular impulse-like oscillations in air pressure are fast enough, we hear them as a tone; as they get slower and slower, we can increasingly separate the individual pressure pulses as independent events. The threshold at which the pulses fuse into a tonal percept is called "auditory flutter fusion" or sometimes "auditory flicker fusion". The transition between separation and fusion is a gradual one, and in the boundary region, we can hear the pattern in both ways, sometimes as what is called a "creak" sound, because it sounds like the creaking of a sticky hinge.

The other issue is the perceptual effect of pressure oscillations that are irregular as well as relatively low in frequency. Large amounts of random local variation in period sound like the sound of frying food, as bubbles of steam randomly form and pop here and there.
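The regular-versus-irregular distinction can be made concrete with a small sketch. It generates pulse times with a chosen amount of random variation in period, then measures the irregularity as the mean absolute difference between consecutive periods divided by the mean period (one common way to define jitter). The 20 ms period and the 30% variation are illustrative assumptions, not thresholds taken from the discussion above.

```python
import random

def pulse_times(n, mean_period_s, jitter_frac, seed=0):
    """Times of n+1 pulses whose periods vary uniformly by +/- jitter_frac."""
    rng = random.Random(seed)
    t, times = 0.0, [0.0]
    for _ in range(n):
        t += mean_period_s * (1.0 + rng.uniform(-jitter_frac, jitter_frac))
        times.append(t)
    return times

def relative_jitter(times):
    """Mean absolute change between consecutive periods, over the mean period."""
    periods = [b - a for a, b in zip(times, times[1:])]
    diffs = [abs(q - p) for p, q in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

regular = pulse_times(50, 0.02, 0.0)    # creak-like: slow but evenly spaced
irregular = pulse_times(50, 0.02, 0.3)  # fry-like: slow and randomly spaced

print(relative_jitter(regular), relative_jitter(irregular))
```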

Both creak and fry can occur in the vocal-cord oscillation of human speech. But what people generally call "vocal fry" is more often "vocal creak".

Read the rest of this entry »

Comments (5)

Jazz Dispute

Just in case you haven't seen this:

[h/t Taylor Jones]

Comments (16)

Phrasal trends in pitch, or, the lab subject's moan

It's been a while since I posted a Breakfast Experiment™ (things have been hectic here), but yesterday in a discussion with some phonetics students, I learned that certain old ideas about (linguistic) intonation have passed out of memory. And in trying to explain these ideas, I posed a problem for myself that is a suitable subject for a little hacking during this morning's breakfast hour. Attention Conservation Notice: We're going to wander in the history-of-phonetics weeds for a while here.

Read the rest of this entry »

Comments off

Combating stereotypes — with stereotypes

Laura Starecheski, "Can Changing How You Sound Help You Find Your Voice?", NPR All Things Considered 10/14/2014:

Just having a feminine voice means you're probably not as capable at your job.  

At least, studies suggest, that's what many people in the United States think.

There's a gender bias in how Americans perceive feminine voices: as insecure, less competent and less trustworthy.  This can be a problem — especially for women jockeying for power in male-dominated fields, like law.

Read the rest of this entry »

Comments (9)

The shape of a spoken phrase in Mandarin

A few years ago, with Jiahong Yuan and Chris Cieri, I took a look at variation in English word duration by phrasal position, using data from the Switchboard conversational-speech corpus ("The shape of a spoken phrase", LLOG 4/12/2006; Jiahong Yuan, Mark Liberman, and Chris Cieri, "Towards an Integrated Understanding of Speaking Rate in Conversation", InterSpeech 2006). As is often the case for simple-minded analysis of large speech datasets, this exercise showed a remarkably consistent pattern of variation — the plot below shows mean duration by position for phrases from 1 to 12 words long:

The Mandarin Broadcast News collection discussed in a recent post ("Consonant effects on F0 in Chinese", 6/12/2014) lends itself to a similar analysis of phrase-position effects on speech timing. So for this morning's Breakfast Experiment™, I ran a couple of scripts to take a first look.
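The core of a "mean duration by position" analysis is simple enough to sketch. This is not the script used for either study; the function name and the toy durations are hypothetical, and it assumes each phrase has already been reduced to a list of word (or syllable) durations in seconds.

```python
from collections import defaultdict

def mean_duration_by_position(phrases, max_len=12):
    """Map phrase length -> list of mean durations, one per position."""
    groups = defaultdict(list)
    for durations in phrases:
        if 1 <= len(durations) <= max_len:
            groups[len(durations)].append(durations)
    # zip(*rows) walks the phrases of one length position by position.
    return {
        n: [sum(col) / len(col) for col in zip(*rows)]
        for n, rows in sorted(groups.items())
    }

# Toy input: two 2-unit phrases and one 1-unit phrase.
toy = [[0.2, 0.4], [0.4, 0.6], [0.3]]
print(mean_duration_by_position(toy))
```

Plotting one line per phrase length from the resulting table gives the kind of picture described above.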

Read the rest of this entry »

Comments (3)

Consonant effects on F0 in Chinese

Following up on two earlier Breakfast Experiments™ ("Consonant effects on F0 of following vowels", 6/5/2014; "Consonant effects on F0 are multiplicative", 6/6/2014), here are some semi-comparable measurements of consonant effects on fundamental frequency (F0) in Mandarin Chinese broadcast news speech.

[As I warned potential readers of those earlier posts, this is considerably more wonkish than most LLOG offerings.]

Why do people care about the effects of consonant features on F0? The main reason is that tonogenesis — the historical development of lexical tones — often arises from re-interpretation of "micromelodies" of this kind, typically driven by laryngeal features of consonants such as voiceless vs. voiced (e.g. p,t,k,s vs. b,d,g,z). So it's natural to wonder whether languages where this has already happened, like Mandarin Chinese, retain or suppress such effects.

Read the rest of this entry »

Comments (3)

Consonant effects on F0 are multiplicative

[Warning: an unusually nerdy follow-up to an unusually nerdy post…] In the comments on yesterday's post "Consonant effects on F0 of following vowels", the question came up whether the effect of consonant voicing on vowel pitch is additive (e.g. plus or minus N Hz) or multiplicative (up or down by M percent). The fact that I calculated the effects in proportional terms indicates that I assumed, without checking, that the effects are multiplicative.

One easy way to check this assumption is to redo the calculations for female vs. male speakers independently, since we expect the overall F0 patterns of female speakers to be about 65-70% higher on average. So for this morning's Breakfast Experiment™ I did just that — it required changing just two characters in the scripts I wrote yesterday, so this was the easiest experiment ever…
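The logic of the check can be sketched with made-up numbers. If the effect is multiplicative, the ratio of F0 after voiceless vs. voiced consonants should be about the same for both groups, while the difference in Hz should be larger for the higher-pitched group; if it is additive, the reverse should hold. The values below are hypothetical (chosen so that the female baseline is about 67% higher) and are not results from the experiment.

```python
def effect_sizes(f0_after_voiceless, f0_after_voiced):
    """Express the same consonant effect as a Hz difference and as a ratio."""
    return {
        "difference_hz": f0_after_voiceless - f0_after_voiced,
        "ratio": f0_after_voiceless / f0_after_voiced,
    }

# Hypothetical group means in Hz, constructed to be purely multiplicative.
male = effect_sizes(126.0, 120.0)
female = effect_sizes(210.0, 200.0)

# Same ratio in both groups, but different differences in Hz.
print(male)
print(female)
```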

Read the rest of this entry »

Comments off

Consonant effects on F0 of following vowels

I spent the past couple of days at a workshop on lexical tone, organized by Kristine Yu at UMass. A topic that came up several times was the question of whether "segmental" influences on pitch — for instance, the fact that voiceless consonants are typically associated with a higher pitch in the first part of a following vowel — might be diminished or even eliminated in languages with lexical tone. Several participants observed that the evidence for this is not very strong: the classical paper on the subject studied a small number of utterances from one speaker in Thai, for example.

So for this morning's Breakfast Experiment™, I wrote a little script that calculates and displays (one way of looking at) these effects in the TIMIT dataset, which includes 10 English sentences spoken by each of 630 speakers. (Specifically, there are two sentences spoken by all 630 speakers; 450 sentences spoken by 7 speakers each; and 1890 sentences spoken by a single speaker.)
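As a quick sanity check on those counts, the breakdown does add up to 10 sentences for each of 630 speakers. The variable names are just for illustration.

```python
speakers = 630
sentences_per_speaker = 10

# (number of distinct sentences, number of speakers reading each one)
breakdown = [(2, 630), (450, 7), (1890, 1)]

total_utterances = sum(n * k for n, k in breakdown)
distinct_sentences = sum(n for n, _ in breakdown)

print(total_utterances, speakers * sentences_per_speaker, distinct_sentences)
```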

I had to go to a meeting before I had a chance to write up the results, but the meeting ended early enough for me to find 15 minutes before lunch, so:

Read the rest of this entry »

Comments (8)

Final rises

As Eric Baković recently noted, there's been a lot of buzz about a presentation on "uptalk" by Amanda Ritchart and Amalia Arvaniti at the 2013 Acoustical Society meeting. All we have so far is a sort of press release ("Do We All Speak Like Valley Girls? Uptalk in Southern Californian English", ASA Lay Language Papers, 12/5/2013), but this is enough to see that Ritchart and Arvaniti have made a valuable contribution.

They based their work on systematic analysis of a good-sized recorded dataset (23 "native speakers of SoCal English", who were asked to describe a muted video clip and to participate in a "map task" interaction). They distinguished among different interactional functions ("simple statement", "question", "floor holding", "confirmation request"), they systematically noted aspects of the location and extent of rises, and they based their conclusions on a statistical analysis of the interrelationship of these features.

Read the rest of this entry »

Comments (7)

Media uptake on uptalk

Yesterday afternoon, UC San Diego Linguistics grad student Amanda Ritchart presented her research (joint with Amalia Arvaniti) on the use and realization of uptalk in Southern California English at the 166th Acoustical Society of America meeting. This work is profiled in the ASA's press room, and has thus far received a fair amount of attention. You can hear and/or read about it on KPBS (San Diego's public radio station), at WBUR's Here & Now, on BBC News, and in the Washington Post. (See also this shout-out on the Linguistic Society of America website.)

Uptalk has been discussed many times here on Language Log, so regular readers are probably not unfamiliar with it. But one of the most recent Language Log posts on the topic ("Uptalk awakening", 9/29/2013) shows how relatively unaware of this long-standing feature of many varieties of English some folks still are. So the media coverage of Ritchart & Arvaniti's work is welcome — and on the whole pretty good, if a little biased toward a "wow, it's spreading to men!" interpretation of the research results, which kinda misses the point. But of course, if you scroll down to the comments (why oh why do I ever scroll down to the comments???), you'll see that many appear to think that the use of rising intonation at the ends of (some!) statements is the clearest evidence we have of the decline of western civilization. Sigh.

Update — more here.

 

Comments (29)