Several distinct perceptual issues are involved in the phenomena that people call "vocal fry" and (less often but more appropriately) "vocal creak".
One perceptual issue is the auditory equivalent of the visual "flicker fusion threshold". If regular impulse-like oscillations in air pressure are fast enough, we hear them as a tone; as they get slower and slower, we can increasingly separate the individual pressure pulses as independent events. The threshold at which the pulses fuse into a tonal percept is called "auditory flutter fusion" or sometimes "auditory flicker fusion". The transition between separation and fusion is a gradual one, and in the boundary region, we can hear the pattern in both ways, sometimes as what is called a "creak" sound, because it sounds like the creaking of a sticky hinge.
A second issue is the perceptual effect of pressure oscillations that are irregular as well as relatively low in frequency. Large amounts of random local variation in period sound like frying food, as bubbles of steam randomly form and pop here and there.
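To make the distinction concrete, here is a minimal sketch in Python that builds three impulse trains: a fast regular one, a slow regular one, and a slow one with randomly varying periods. The function names, the 200 Hz and 30 Hz rates, the 50% jitter value, and the 16 kHz sample rate are all assumptions chosen for illustration; they are not measured perceptual thresholds, and real glottal pulses are not ideal impulses.

```python
import random

def pulse_times(rate_hz, duration_s, jitter=0.0, seed=0):
    """Return onset times (in seconds) of impulse-like pulses.

    jitter is the maximum fractional deviation of each period from the
    nominal period 1/rate_hz; jitter=0.0 gives a perfectly regular train.
    """
    rng = random.Random(seed)
    period = 1.0 / rate_hz
    times = []
    t = 0.0
    while t < duration_s:
        times.append(t)
        # Stretch or shrink each period by a random fraction.
        t += period * (1.0 + rng.uniform(-jitter, jitter))
    return times

def render(times, duration_s, sample_rate=16000):
    """Place a unit impulse at each onset time in a list of samples."""
    samples = [0.0] * int(duration_s * sample_rate)
    for t in times:
        i = int(round(t * sample_rate))
        if i < len(samples):
            samples[i] = 1.0
    return samples

def period_spread(times):
    """Ratio of the longest to the shortest inter-pulse period."""
    periods = [b - a for a, b in zip(times, times[1:])]
    return max(periods) / min(periods)

# Illustrative rates only (assumed, not perceptual thresholds).
tone = pulse_times(200, 1.0)             # fast and regular
creak = pulse_times(30, 1.0)             # slow and regular
fry = pulse_times(30, 1.0, jitter=0.5)   # slow and irregular
```

Writing the output of `render` to an audio file and listening to the three trains is one way to compare the percepts; `period_spread` stays close to 1 for the regular trains and grows with the jitter value.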
Both creak and fry can occur in the vocal-cord oscillation of human speech. But what people generally call "vocal fry" is, more often than not, actually "vocal creak".