"Superficial auditory (dis)fluency biases higher-level social judgment." Walter-Terrill, Robert, et al. Proceedings of the National Academy of Sciences 122, no. 13 (March 24, 2025): e2415254122
Significance
In recent years, tools such as videoconferencing have shifted many conversations online, with stark auditory ramifications—such that some voices sound clear and resonant while others sound hollow or tinny, based on microphone quality and characteristics. A series of experiments shows that such differences, while clearly not reflective of the speakers themselves, nevertheless have broad and powerful consequences for social evaluation, leading listeners to make lower judgments of speakers’ intelligence, hireability, credibility, and even romantic desirability. Such effects may be potential sources of unintentional bias and discrimination, given the likelihood that microphone quality is correlated with socioeconomic status. So, before joining your next videoconference, you may want to consider how much a cheap microphone may really be costing you.
Abstract
When talking to other people, we naturally form impressions based not only on what they say but also on how they say it—e.g., how confident they sound. In modern life, however, the sounds of voices are often determined not only by intrinsic qualities (such as vocal anatomy) but also by extrinsic properties (such as videoconferencing microphone quality). Here, we show that such superficial auditory properties can have surprisingly deep consequences for higher-level social judgments. Listeners heard short narrated passages (e.g., from job application essays) and then made various judgments about the speakers. Critically, the recordings were modified to simulate different microphone qualities, while carefully equating listeners’ comprehension of the words. Though the manipulations carried no implications about the speakers themselves, common disfluent auditory signals (as in “tinny” speech) led to decreased judgments of intelligence, hireability, credibility, and romantic desirability. These effects were robust across speaker gender and accent, and they occurred for both human and clearly artificial (computer-synthesized) speech. Thus, just as judgments from written text are influenced by factors such as font fluency, judgments from speech are not only based on its content but also biased by the superficial vehicle through which it is delivered. Such effects may become more relevant as daily communication via videoconferencing becomes increasingly widespread.
This reminds me of the hi-fi setup of a friend of mine who lives in Massachusetts and is very serious about listening to classical music. When I visited him around twenty-five years ago, I was astonished by the quality of the acoustic reproduction of his hi-fi equipment. Sitting in his living room, I felt as though I were in a concert hall.
I complimented him on the sound reproduction of his equipment and said that he must have spent a fortune on his turntable, amplifier, and speakers. Au contraire, he said. The cables that ran between his amplifier and his speakers cost more than all the other components put together. I hesitate to quote the figure lest readers think I'm making it up.
[Thanks to Ted McClure]
April 28, 2025 @ 12:45 pm
· Filed by Victor Mair under Acoustics, Language and music, Language and society
AntC said,
April 28, 2025 @ 4:06 pm
"very serious about listening to classical music."
I'm very serious about listening to classical music. Where you apprehend music is inside your head, not just with your ears. Famously Beethoven was deaf in later years; also Smetana, who depicted in his late works the high-pitched ringing that afflicted him. Your hi-fi enabling you to hear the second flute take a breath is the opposite of musical appreciation.
As to the findings of the paper, I agree a poor set-up can interfere with getting clear articulation across. That surely depends as much on bandwidth and equipment at the audience's end, over which the broadcaster has no control?
Jarek Weckwerth said,
April 28, 2025 @ 4:09 pm
This is gold, thank you! Finally some hard data to show to people who don't think this is a thing.
But (admittedly without reading the original article), I would say that claiming that "microphone quality is correlated with socioeconomic status" is a stretch. For example, good sound depends to a large extent on microphone placement, and that implies using an external microphone, and people of a certain socioeconomic inclination loathe an "ugly" external microphone, or god forbid headphones, in the shot.
ErikF said,
April 28, 2025 @ 4:23 pm
Good equipment is not necessarily expensive. A good cable should be judged by its physical characteristics, not its brand: I've had to troubleshoot expensive "boutique" cables that had terrible fake XLR connectors and aluminum(!) wire, where a standard cable would have done a better job for a quarter of the price.
Regarding microphones, frequency response is one component. A high-priced reference mic with a flat response curve will actually sound much worse than a cheap vocal mic. The preamp that is used can be as much of a factor as the mic, or even more!
On the digital side, bitrate and compression can make all the difference between acceptable and incomprehensible.
In short, you have to look at your entire audio chain, not just some parts of it, to see if it's working properly or not.
Jarek Weckwerth said,
April 28, 2025 @ 5:03 pm
@AntC "hear the second flute take a breath is the opposite of musical appreciation": It may be different from abstract musical appreciation, but opposite??? The sound of fingers on guitar strings is one of my favourites, and it certainly adds to the feeling of genuine human performance.