Like I said yesterday, the whole stress-timed-vs.-syllable-timed business is "a gigantic tangled intellectual thicket that’s easy to get into and hard to get out of". And one of the comments on my post asked a question that tempts me in further:
So then there’s a psychological perception of syllable-timed language that is not visible in the objective data?
Yes and no.
In the first place, there really are differences in the syllabic rhythms of languages. And in particular, syllables in (say) Spanish really are more nearly equal in duration than they are in (say) English. But the reasons for this are not the reasons that people who use the terminology of "syllable timed" vs. "stress timed" usually have in mind.
And in the second place, there are also psychological differences in the prosodic organization of languages, both in perception and in production, that are connected to the idea of languages being "stress timed" vs. "syllable timed". However, there's more than one dimension involved, and more than two categories, and thereby much confusion arises.
And in the third place, …
No, that's far enough into the thicket for today. But I'll contribute another Breakfast Experiment™ to illustrate my statement that there really is an objective difference in the syllabic durational patterns of different languages, and then I'll try to explain why this doesn't mean what people often think it means.
As a source of data, I took David Puente's ABC Exclusiva podcasts for 4/24/2008 in English and for 4/25/2008 in Spanish. Mr. Puente is bilingual — I gather that Spanish was his first language, but his English sounds pretty much like ordinary American.
I segmented the first 120 syllables of the two podcasts, starting in each case from the point where he identifies himself ("Hello everyone, I'm David Puente"; "Yo soy David Puente").
[Let me start by reposting the warning I gave yesterday:
Note that I’ve avoided two critical questions here: how to divide a phonetic transcription into syllables, and how to align a phonetic transcription with the stream of sound. Depending on the answers, the concept “syllable duration”, applied to the same recording, will yield somewhat different experimental measurements; and therefore numbers and graphs are hard to interpret in the absence of well-defined standards for such annotation, which I haven’t provided. So take my numbers and graphs as an example of how I claim such experiments are going to come out, and feel free to produce your own, though if we wanted to get serious about this, we should publish our annotation manuals, our audio recordings, and our raw segmentations.
So by all means try this at home -- but if you want to spend more than a few minutes on it, start by defining your terms and methods, so that you can identify and segment syllables (or speech units of whatever kind) in a consistent and well-defined way.]
Here are the numbers I got.
In English, David Puente's average syllable duration was 204 milliseconds, with a standard deviation of 107 msec, or 52% of the mean.
In Spanish, his average syllable duration was 189 msec, with a standard deviation of 59 msec, or 31% of the mean.
So his Spanish is closer to being objectively "syllable timed" than his English, in exactly the way that Rudy Giuliani's English was not closer to being objectively "syllable timed" than Mitt Romney's English.
Some other numbers… In English, the interquartile range of his syllable durations was 117 msec, while in Spanish it was 79 msec. In English, the average absolute value of the difference between adjacent syllable durations was 114 msec, while in Spanish it was 65 msec.
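For anyone who wants to try this at home, here is a minimal sketch of how summary numbers of this kind could be computed from a list of syllable durations. The function name `duration_stats` and the toy durations are hypothetical, and the choice of sample standard deviation and "inclusive" quartiles is an assumption; different conventions will give slightly different values, and this is not necessarily how the figures above were produced.

```python
import statistics


def duration_stats(durations_ms):
    """Summarize a sequence of syllable durations given in milliseconds."""
    mean = statistics.mean(durations_ms)
    # Sample standard deviation (n - 1 denominator); an assumption.
    sd = statistics.stdev(durations_ms)
    # Quartile conventions vary between tools; "inclusive" is one common choice.
    q1, _median, q3 = statistics.quantiles(durations_ms, n=4, method="inclusive")
    # Absolute difference between each syllable and the one that follows it.
    adjacent = [abs(b - a) for a, b in zip(durations_ms, durations_ms[1:])]
    return {
        "mean": mean,
        "sd": sd,
        "sd_percent_of_mean": 100 * sd / mean,
        "iqr": q3 - q1,
        "mean_abs_adjacent_diff": statistics.mean(adjacent),
    }


# Invented toy durations, NOT the measurements reported in this post.
toy = [100, 200, 300, 200, 100]
for name, value in duration_stats(toy).items():
    print(f"{name}: {value:.1f}")

# Sanity check on the percentages quoted above: SD as a share of the mean.
print(round(100 * 107 / 204), round(100 * 59 / 189))  # prints: 52 31
```

The adjacent-difference measure is the only one of these that depends on the order of the syllables; the others would come out the same if the durations were shuffled.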
In graphical terms, here's a boxplot:
Here are some histograms:
Here's the raw syllable-duration sequence, according to my segmentation:
OK, so what's wrong with the way that people often think about the difference that this little experiment illustrates?
Well, the main reasons for the difference are pretty obvious, and they're mostly differences in the nature of individual syllables, not in the way that speakers "beat time" in producing them. English syllables can be as simple as a single vowel, or (in the short passage in today's experiment) as complex as "church" or "priest", while the most complex Spanish syllables are substantially simpler. English unstressed vowels are often reduced in duration relative to stressed vowels, as in the middle syllable of the word "tragedy"; unstressed vowels are not reduced in the same way in Spanish. There's more, but this is enough to create a difference of the kind under discussion, namely a narrower range of syllable durations in Spanish.
So a difference in speakers' rhythmic intentions isn't needed to explain the basic observation. But absence of evidence isn't evidence of absence. There's a widespread impression that Spanish speakers (unconsciously) try to produce a sequence of equal-duration syllables, while English speakers try to produce a sequence of equal-duration stress groups (or prosodic feet, i.e. inter-stress intervals). We can test this idea directly; and if we do, we'll find that it's wrong. At least, it's wrong if we interpret "equal duration" as having something to do with the objective similarity of intervals of time. But that's an experiment for another day's breakfast.