Jaya Saxena, “Examples of Male Vocal Fry”, The Toast 7/22/2015, presents YouTube videos of a bunch of well-known males (human and otherwise) exhibiting so-called vocal fry. There’s no textual commentary — but the choice of examples, and the word “male” in the title, underlines the fact that young women are currently being criticized for a phenomenon that can be found to some degree in the speech of every human being who ever spoke, and indeed in the noises made by every creature that ever vocalized.

For example, here’s Bruce Willis:

As discussed in the posts linked below, what people call “vocal fry” is a mixture of related phenomena — mainly period-doubling in regions of low-pitched voice, and “chaotic” or aperiodic vocal-cord oscillation. (Technically, regular pitch whose frequency is below the flutter-fusion threshold is “creak”, and “fry” is irregular vocal-cord oscillation, in which the glottal pulses sound sort of like the irregular popping of moisture in frying-temperature oil; but never mind…)

Willis exhibits classic period-doubling in his uh filled pauses — here’s the first one, with the time waveform shown below it:

In the region marked with the blue arrow, his glottal pulses are a little less than 12 milliseconds apart, for a fundamental frequency of a bit more than 80 Hz. In the region marked with the red arrow, the period (the spacing between the pulses) has just about doubled, and so the fundamental frequency (the number of pulses per second) falls to around 40 Hz.
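The arithmetic behind these figures is just the reciprocal relation between period and frequency. Here is a minimal sketch in Python, taking 12 ms as a round stand-in for the "a little less than 12 milliseconds" spacing; the function name is illustrative and not from any library:

```python
def f0_from_period_ms(period_ms: float) -> float:
    """Fundamental frequency in Hz for a glottal period given in milliseconds."""
    return 1000.0 / period_ms

modal_period_ms = 12.0                    # assumed round figure for the blue-arrow region
doubled_period_ms = 2 * modal_period_ms   # period-doubling: pulses twice as far apart

print(round(f0_from_period_ms(modal_period_ms), 1))    # 83.3
print(round(f0_from_period_ms(doubled_period_ms), 1))  # 41.7
```

A period slightly under 12 ms gives a frequency slightly above 83 Hz, consistent with "a bit more than 80 Hz", and doubling the period halves the frequency to roughly 40 Hz.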

At the start of this uh, in the region marked with the yellow arrow, you can see that both frequencies are there at the same time, in a sense, with pulses alternately larger and smaller. This is called diplophonia — and like creak and fry, diplophonia can be a symptom of vocal cord pathology, but it’s also characteristic of some of the speech of every single human being who ever spoke, because these phenomena are inherent in the physics of non-linear dynamical systems.

Although Willis reliably exhibits period-doubling in his filled pauses, it’s also common on regular words at low-pitched ends of his phrases, e.g. on the word that in this clip (the waveform shows “in the same light that”):

The final that by itself:

And Willis is by no means the most fryful male in the collection.

The following things about vocal (creak and) fry are clear:

  1. Everybody does it.
  2. Everybody has always done it.
  3. There’s a widespread belief that young American women are now doing it more (than young women did in earlier decades, or than older women do now, or than men of any age do or did).
  4. No one has ever presented any non-anecdotal evidence that (3) is true.

The belief in (3) could well be true, at least in the sense that there might be a style of speech that’s somewhat more common among certain groups of young women now than in the past. I did find some evidence, for example, that Sarah Koenig’s radio personality changed in this direction over the course of 15 years — bringing her almost to the point of being as fryful as Ira Glass has always been (see “Freedom fries”, 2/3/2015; “You want fries with that?”, 2/3/2015; “Sarah Koenig”, 2/5/2015).

On the other hand, the whole thing could be a gigantic epidemic of stereotype formation and confirmation bias. Given the massive media-driven interest in the topic, it’s about time that someone looked into it.

Update — Counterbander, in the comments, points us to Ikuko Yuasa, “Creaky voice: A new feminine voice quality for young urban-oriented upwardly mobile American women?”, American Speech 85, no. 3 (2010): 315-337. This paper looks at about two minutes of speech (401 words) from each of 12 young females and 11 young males from California:

Ten female Japanese speakers in their 20s and 30s were similarly studied. The prevalence of creaky voice — measured as the proportion of words where the author detected it by visual inspection of waveforms and spectrograms — was about twice as great for the Californian females as for the Japanese females or the Californian males:


There’s something wonky in that table, since 27.5/401 is 6.9%, not 5.6%; and 22.4/401 is 5.6%, not 6.9% — it looks like either the counts or the percentages were swapped between the American Males and the Japanese Females.
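The check is easy to reproduce. Here is a small sketch using only the two counts quoted above (27.5 and 22.4 out of 401 words); the function and constant names are illustrative:

```python
WORDS_PER_SPEAKER = 401  # words sampled per speaker, as described above

def creak_percent(creaky_words: float, total_words: int = WORDS_PER_SPEAKER) -> float:
    """Percentage of words with creak, rounded to one decimal place."""
    return round(100.0 * creaky_words / total_words, 1)

print(creak_percent(27.5))  # 6.9
print(creak_percent(22.4))  # 5.6
```

So each count matches the percentage printed in the other row, which is what suggests a swap.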

Anyhow, in this rather small sample, everybody creaks, but the American women creak more often. So this counts as a bit of non-anecdotal evidence — thanks to the commenter who pointed it out.

One caveat, aside from the small sample, is the possibility that the experimenter(s) might have influenced the behavior of the subjects. As the paper explains:

All the American female informants conversed with female interlocutors, while eight American males held conversations with male interlocutors and three with female interlocutors.

So a relevant question would be whether the “interlocutors” (and for that matter the “informants”) knew the topic of the study; and also what the prevalence of creak in the contributions of the interlocutors was.

This is the right kind of study, and what we need is replication across a wider sample of people, places, topics, and times. A couple of other methodological points: (1) it would be good to use published datasets — such as oral histories — so that other aspects of the studied material can be investigated; and (2) it would be good to use automatic classification methods, or else classification by judges who are blind to the study’s goals, because judgments about individual regions are often uncertain, and we’d want to avoid unconscious bias. (And since everybody likely to be a competent judge knows about the topic and the hypothesis now, let me revise that to suggest that automatic classification methods should be used…)


  1. Counterbander said,

    July 23, 2015 @ 1:08 pm

    How about, for example, Yuasa, P.I. (2010), American Speech, 85(3):315-337: “An examination of creaky voice occurring in natural conversations among relatively young educated American and Japanese speakers revealed that female speakers of American English residing in California employed creaky voice much more frequently than comparable American male and Japanese female speakers.”

  2. Karen said,

    July 23, 2015 @ 3:55 pm

    I walked into a store that had a talk show on the radio a couple of weeks ago and the guest was excoriating young women for uptalk and then proceeded to do the same thing to young women for vocal fry. In the latter case she asserted that women were doing this on purpose to stop uptalking. But she asserted that both things were horrible horrible habits and that women needed to stop doing them right now. Not once did she indicate that anyone else did it – in fact, from the five minutes or so I heard, she was clearly implying, if not outright claiming, that ONLY young women did either of them.

    I wished afterwards that I had asked the owner what station it was.

  3. Patrick B said,

    July 23, 2015 @ 5:46 pm

    Extra points for anyone who can find an example of Stephen Fry exhibiting vocal fry.

  4. Breffni said,

    July 23, 2015 @ 6:13 pm

    I have nothing non-anecdotal to add, but when the evidence comes in my money will definitely be on the existence of a characteristic creaky-voice pattern (not just overall prevalence) among some groups of young (mainly American?) women, and against the epidemic-of-stereotype-formation hypothesis. What I have in mind (which may or may not be what others have in mind) is the pattern heard very consistently in the speech of the woman introducing Steven Pinker here. If I had to pinpoint what (I think) I hear as different about her creak compared to Willis, Depp, Diesel et al., my first guess would be the suddenness of the drop in pitch, e.g. (all within the first minute), in “popular science author”, “Department of Psychology”, “computational theory of mind”, “why violence has declined”, “that violence has declined”, “and in the short run”, “not just this year, but ever”, and so on. In all these cases, the shift from modal voice to creak strikes me as unusually abrupt.

    [(myl) But period-doubling is *always* abrupt, usually taking place between one glottal cycle and the next, or across a voicing gap between one vowel and the next, sometimes with a few milliseconds of chaotic transition.

    So your wager might well work out, but abruptness will not be why.

    One relevant perceptual factor is that the flutter-fusion threshold in fact extends over a range of frequencies. At the high-frequency end you perceive nothing but a pitched sound, and at the low-frequency end you perceive nothing but a sequence of separated pulses. In between, the pitched-sound perception gradually weakens and the pulse-sequence perception gradually strengthens.

    So period-doubling from a relatively high-pitched starting point may create the impression of non-creak turning into creak, whereas period-doubling from a lower-pitched starting point would create the impression of weaker creak turning into stronger creak.]

  5. Chris C. said,

    July 23, 2015 @ 10:55 pm

    @Patrick — I think we can hear a few examples here:


