A few years ago, with Jiahong Yuan and Chris Cieri, I took a look at variation in English word duration by phrasal position, using data from the Switchboard conversational-speech corpus ("The shape of a spoken phrase", LLOG 4/12/2006; Jiahong Yuan, Mark Liberman, and Chris Cieri, "Towards an Integrated Understanding of Speaking Rate in Conversation", InterSpeech 2006). As is often the case for simple-minded analysis of large speech datasets, this exercise showed a remarkably consistent pattern of variation — the plot below shows mean duration by position for phrases from 1 to 12 words long:
The Mandarin Broadcast News collection discussed in a recent post ("Consonant effects on F0 in Chinese", 6/12/2014) lends itself to a similar analysis of phrase-position effects on speech timing. So for this morning's Breakfast Experiment™, I ran a couple of scripts to take a first look.
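For readers who want to try this kind of tabulation themselves, here is a minimal sketch of computing mean word duration by phrase length and position. It is not the script used for these posts: the function name `mean_duration_by_position`, the input layout (one list of per-word durations in seconds for each phrase), and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict

def mean_duration_by_position(phrases):
    """Return {(phrase_length, position): mean duration}, positions 1-indexed.

    `phrases` is assumed to be a list of phrases, each a list of word
    durations in seconds (a hypothetical input format).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for durations in phrases:
        n = len(durations)
        for pos, dur in enumerate(durations, start=1):
            sums[(n, pos)] += dur
            counts[(n, pos)] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Toy durations, made up for illustration; NOT corpus measurements.
toy_phrases = [[0.20, 0.30], [0.10, 0.50], [0.25, 0.15, 0.40]]
means = mean_duration_by_position(toy_phrases)
```

Keeping phrase length in the key means that, say, the final word of a 2-word phrase is never averaged together with the second word of a 12-word phrase.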
You'll search Google News in vain for stories about most technical terms in phonetics — no recent coverage of lenition, for example — but "vocal fry" has been prominent in the popular press for several years. Despite all the coverage, many people seem to be unclear about what it is and where it comes from — so today I thought I'd spend a few minutes on the phenomenon from a phonetician's perspective.
[As I warned potential readers of those earlier posts, this is considerably more wonkish than most LLOG offerings.]
Why do people care about the effects of consonant features on F0? The main reason is that tonogenesis — the historical development of lexical tones — often arises from re-interpretation of "micromelodies" of this kind, typically driven by laryngeal features of consonants such as voiceless vs. voiced (e.g. p,t,k,s vs. b,d,g,z). So it's natural to wonder whether languages where this has already happened, like Mandarin Chinese, retain or suppress such effects.
[Warning: an unusually nerdy follow-up to an unusually nerdy post…] In the comments on yesterday's post "Consonant effects on F0 of following vowels", the question came up whether the effect of consonant voicing on vowel pitch is additive (e.g. plus or minus N Hz) or multiplicative (up or down by M percent). The fact that I calculated the effects in proportional terms indicates that I assumed, without checking, that the effects are multiplicative.
One easy way to check this assumption is to redo the calculations for female vs. male speakers independently, since we expect the overall F0 patterns of female speakers to be about 65-70% higher on average. So for this morning's Breakfast Experiment™ I did just that — it required changing just two characters in the scripts I wrote yesterday, so this was the easiest experiment ever…
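The logic of the check can be sketched in a few lines. This is only an illustration under stated assumptions: the function `compare_effects` and the F0 values are hypothetical, chosen so that the toy effect is exactly multiplicative, and they are not the measurements from the experiment.

```python
def compare_effects(groups):
    """For each group, express the voicing effect both ways.

    `groups` maps a group name to a pair of mean F0 values in Hz:
    (after voiceless consonants, after voiced consonants).
    """
    out = {}
    for name, (f0_voiceless, f0_voiced) in groups.items():
        out[name] = {
            "diff_hz": f0_voiceless - f0_voiced,  # additive view
            "ratio": f0_voiceless / f0_voiced,    # multiplicative view
        }
    return out

# Made-up values for illustration; NOT measured data.
toy = {"female": (210.0, 200.0), "male": (126.0, 120.0)}
result = compare_effects(toy)
```

If the effect is multiplicative, the ratios should be about the same for the two groups while the differences in Hz scale with overall pitch; if it is additive, the differences in Hz should match and the ratios should diverge.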
I spent the past couple of days at a workshop on lexical tone, organized by Kristine Yu at UMass. A topic that came up several times was the question of whether "segmental" influences on pitch — for instance, the fact that voiceless consonants are typically associated with a higher pitch in the first part of a following vowel — might be diminished or even eliminated in languages with lexical tone. Several participants observed that the evidence for this is not very strong: the classical paper on the subject studied a small number of utterances from one speaker in Thai, for example.
So for this morning's Breakfast Experiment™, I wrote a little script that calculates and displays (one way of looking at) these effects in the TIMIT dataset, which includes 10 English sentences spoken by each of 630 speakers. (Specifically, there are two sentences spoken by all 630 speakers; 450 sentences spoken by 7 speakers each; and 1890 sentences spoken by a single speaker.)
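As a quick sanity check on that breakdown, the three sentence types should add up to 630 speakers times 10 sentences each. The variable names below are just for illustration.

```python
speakers = 630
sentences_per_speaker = 10
total_utterances = speakers * sentences_per_speaker

# (number of distinct sentences, speakers per sentence), from the counts above
by_type = {
    "spoken by all speakers": (2, 630),
    "spoken by 7 speakers each": (450, 7),
    "spoken by a single speaker": (1890, 1),
}

utterances_from_types = sum(n * k for n, k in by_type.values())
distinct_sentences = sum(n for n, _ in by_type.values())
```

Both totals come to 6300 utterances, covering 2342 distinct sentences.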
I had to go to a meeting before I had a chance to write up the results, but the meeting ended early enough for me to find 15 minutes before lunch, so:
Reader Jean-Michel found an odd example of a Sinographic typo and it's got him stumped. This has to do with the Korean Blu-ray release of "As Tears Go By," the 1988 debut feature by Hong Kong director Wong Kar-wai.
In Chinese the film is known as Wàngjiǎo kǎmén 旺角卡門 ("Mongkok Carmen") after the Bizet opera (though the resemblances are very superficial). What is strange, however, is that the Korean Blu-ray art, as illustrated below, initially gave the characters as Wàngjiǎo xiàwèn 旺角下問.
In a post a couple of days ago ("PSDS", 3/30/2014), I observed that in English, "Syllable-final (and especially phrase-final) /z/ is usually voiceless". In a comment, Mark F. asked
[A]re "buzz" and "biz" just isolated counterexamples to the generalization about syllable-final /z/, or is it generally false for accented syllables? Or do I just think I pronounce the /z/?
Linguists are generally scornful of "eye dialect", in both of the common meanings of that term:
As an "unusual spelling intended to represent dialectal or colloquial idiosyncrasies of speech", like roight for right or yahd for yard;
As "the use of non-standard spellings such as enuff for enough or wuz for was, to indicate that the speaker is uneducated".
The first kind of eye-dialect is seen as inexact ("you should use IPA") and the second kind is seen as snobbish. I'm generally more curious than censorious about both of these practices; but in any case, I recently saw a case of the first kind that struck me as especially interesting.
The obituaries for the great comic Sid Caesar invariably mention his proficiency in "double-talk," mimicking the sounds (but not the sense) of foreign languages. (On the phenomenon of double-talk, see Mark Liberman's posts on yaourter here, here, here, and here.) It turns out that this was a talent Caesar had cultivated ever since he was a boy clearing tables at his father's restaurant in multi-ethnic Yonkers.
The words were being hastily shouted down a phone, with loud sounds of wind and waves in the background, and the emergency call center operator could make no sense of them. Attempts at conversing with the caller failed; he seemed not to understand English. Yet the tone was unmistakably urgent: someone was in danger of his life. But who? And where?
About 23 people died in the event that led to that desperate, unintelligible phone call. It happened in 2004, ten years ago today. My vagueness about the number of victims is because no one who knew all the facts wanted to talk about the circumstances (the skull of one victim was only found in 2010).