Language Log

Yanny vs. Laurel, pt. 2

May 25, 2018 @ 7:07 am · Filed by Victor Mair under Language on the internets, Phonetics and phonology, Pronunciation, Psycholinguistics, Speech technology


Just when you thought you'd never have to worry about this vexing acoustic phenomenon again, "Yanny vs. Laurel: an analysis by Benjamin Munson" (5/16/18) and the comments thereto having carried out such a probing, exhaustive investigation, a 3:44 video (5/15/18) has surfaced that attempts to explain it in a way that has not yet been mentioned:

Source: "This video explains why some hear Yanni [sic*] and some, Laurel, when they play the now famous audio clip: Don’t worry, you now have a chance to hear both", Scroll.in (5/18/18)

[VHM: It's only in the title of the Scroll.in article that the name is spelled this way]

The speaker is Doug Johnson of Doug Johnson Productions in Orem, Utah, an engineer who has nearly forty years' experience working with audio. Doug employs spectrum analysis tools to isolate the high frequency and low frequency portions of the Laurel | Yanny recording and play them separately.
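
The kind of band isolation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Doug Johnson's actual tooling: a synthetic two-tone signal stands in for the recording, the 1500 Hz cutoff is arbitrary, and the helper names (dft, idft, split_bands) are invented for the example. A naive DFT is used so that nothing beyond the standard library is needed.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform; adequate for a short demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def split_bands(signal, sample_rate, cutoff_hz):
    """Split signal into (low, high): content below / at-or-above cutoff_hz."""
    n = len(signal)
    spectrum = dft(signal)
    low_spec = [0j] * n
    high_spec = [0j] * n
    for k, coeff in enumerate(spectrum):
        # Bins k and n - k describe the same physical frequency.
        freq = min(k, n - k) * sample_rate / n
        if freq < cutoff_hz:
            low_spec[k] = coeff
        else:
            high_spec[k] = coeff
    return idft(low_spec), idft(high_spec)


# Assumed stand-in for the recording: a 500 Hz tone plus a quieter 3000 Hz tone.
SAMPLE_RATE = 8000
N = 256
low_tone = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE) for t in range(N)]
high_tone = [0.5 * math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE) for t in range(N)]
mixed = [a + b for a, b in zip(low_tone, high_tone)]

# Each part can now be "played" (here: inspected) on its own.
low_part, high_part = split_bands(mixed, SAMPLE_RATE, cutoff_hz=1500)
```

Because the two parts are complementary, adding them sample by sample gives back the mixed signal; real speech, unlike these two tones, has energy spread across the cutoff, so the split is never this clean.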

I'm confident that everyone with normal hearing will hear both "Laurel" and "Yanny" when Doug plays them separately. What's interesting is that when he plays the "Yanny" portion of the spectrum — the high frequencies — I hear absolutely nothing. As I explained in several comments to the first "Yanny vs. Laurel" post, that portion of my hearing range was destroyed by a powerful explosion close to my head. It also accounts for the loud, high-pitched tinnitus that I have been afflicted with for half a century. No Yanny for me.

From an anonymous colleague:

It looks as though people are simply predisposed to hear either the high-pitched or the low-pitched range, and the engineer who produced the Cloe Feldman tape was shooting for a mix that was as close as possible to a 50/50 split of the population. If this was in fact what had been done, quite a fancy trick!

I'm eager to hear from the audio engineers among Language Log readers whether they think Doug Johnson's spectrum analysis and demonstration of separate but overlaid low frequency "Laurel" and high frequency "Yanny" stands up to scrutiny.


27 Comments

Victor Mair said,

May 25, 2018 @ 7:12 am

I note that comments were being made on the first "Yanny vs. Laurel" post this morning and yesterday evening, so it seems that it's still live.
Laurence Whiteside said,

May 25, 2018 @ 7:18 am

Has anyone run this by a speaker of a language that doesn't have separate /l/ and /ɹ/ sounds?
rcalmy said,

May 25, 2018 @ 8:20 am

I'm not qualified to comment on the technical aspects of the frequency analysis. But it would help explain why for my son it sounded more like "Larry" than anything else.
David L said,

May 25, 2018 @ 9:12 am

When he plays the lower frequency part of the recording, I can barely hear anything. I had to turn the volume up to hear a faint 'laurel.' So I guess I am somehow attuned to the high frequencies, which is why I only hear 'yanny' in the full recording.
What I find puzzling is that I am in my sixties and everything I've read says that it's the higher frequency perception that diminishes with age. I would think, from this explanation, that young people would hear yanny and older people would hear laurel, but that doesn't appear to be a consistent pattern.
Trogluddite said,

May 25, 2018 @ 10:37 am

I've been an amateur sound engineer and audio signal processing programmer for several decades, and the explanation certainly rings true for me. I've spoken with several other people with the same interests, some of whom have carried out their own similar analysis, and they have all reached very similar conclusions to those suggested in the video.

The significant proportion of listeners who report different results depending upon the playback technology suggests the same explanation. The frequency response of the playback device and listening environment would be overlaid upon the frequency response of the "hearing predisposition" suggested by Victor's colleague. The "hearing predisposition" would be a combination of the physiological response of the ear and how our neurology has adapted to using that information for recognising speech as our brains developed. Indeed, the cochlea of the inner ear performs a very similar function to the Fourier algorithms used by the kind of analysis software demonstrated in the video; it essentially splits the incoming sound into many narrow frequency bands. This suggests to me that auditory perception relies on a neurological algorithm for finding patterns in an "inner spectrogram".
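
The "inner spectrogram" idea can be illustrated with a toy short-time analysis. The sketch below is an assumption-laden illustration, not a model of the cochlea: it uses evenly spaced DFT bins on non-overlapping frames, whereas the ear's frequency bands are not evenly spaced, and the test signal and function names (frame_spectrum, spectrogram, tone) are made up for the example.

```python
import cmath
import math


def frame_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2 for one frame (naive O(n^2) DFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def spectrogram(signal, frame_len):
    """Cut the signal into non-overlapping frames; one spectrum per frame."""
    return [frame_spectrum(signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def tone(freq_hz, n_samples, sample_rate):
    """A pure sine tone, used here as a stand-in for real speech."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]


SAMPLE_RATE = 8000
FRAME_LEN = 64  # gives bins spaced 8000 / 64 = 125 Hz apart

# Assumed test signal: a low tone followed by a high tone.
signal = tone(500, 128, SAMPLE_RATE) + tone(3000, 128, SAMPLE_RATE)
spec = spectrogram(signal, FRAME_LEN)

# For each frame, report the frequency of the strongest bin.
peak_hz = [max(range(len(row)), key=row.__getitem__) * SAMPLE_RATE / FRAME_LEN
           for row in spec]
print(peak_hz)
```

Each row of spec is one time slice split into narrow frequency bands, which is the raw material any pattern finder, neurological or otherwise, would have to work from.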

I do think that it is a stunningly effective demonstration of perceptual ambiguity, and I am fascinated to know whether the clip is the result of carefully applied psycholinguistics, an intended effect arrived at by empirical experimentation, or simply somebody noticing the curious result of a happy accident.

One of the biggest difficulties in sound engineering is to balance the musical elements of a song while maximising the intelligibility of the sung lyrics, so many of us are amateur psycholinguists to a certain extent, whether we recognise this or not. I can well imagine a sound engineer having the motive and means to try an experiment like this, or to have stumbled across it by accident while exploring the features of the incredible array of processing tools that are available to us these days.
Ellen K. said,

May 25, 2018 @ 10:51 am

I heard "laurel" for both the high and low frequencies. This was while listening with high quality headphones. And I'm someone who first heard Yanny.
Christopher Carr said,

May 25, 2018 @ 11:15 am

I still hear "Laurel" both in the lower and higher frequency portions of the recording.
Victor Mair said,

May 25, 2018 @ 11:31 am

@ Ellen K.

@ Christopher Carr

Do you hear both "Laurel" and "Yanny" equally well / easily / clearly at both frequencies?
John Roth said,

May 25, 2018 @ 11:45 am

Yeah, there's something wrong with the lower frequency. When I first heard this, it was definitely "Laurel". (I'm 74, by the way, so I suspect my upper frequency range is shot.)

On this, I can barely hear that there might be something there in the lower register, and I hear "Yanny" clearly.
bratschegirl said,

May 25, 2018 @ 1:14 pm

On the original post, I heard Yanny the first time and Laurel every subsequent time. Here, I hear Yanny twice when he first plays it. I hear Laurel when he separates out the lower portion, but at a reduced volume; I hear Yanny, at a somewhat louder volume, when he separates out the upper portion.

FWIW, I'm a classical musician who has some extremely mild lifelong tinnitus. I had my hearing checked after exposure to extremely loud fireworks at a 4th of July gig a couple of years ago, and according to that examination I have no degree of hearing loss at any frequency.
Ben Zimmer said,

May 25, 2018 @ 1:19 pm

Just to recap what we now know, the recording was not created by overlaying two different bits of audio, as Doug Johnson suggests. The original audio, a pronunciation of the word laurel on Vocabulary.com, was heard by a high school student as yanny, based on her picking up on the higher frequencies of the recording. That audio was then re-recorded (apparently from laptop to phone), which is what was shared by Cloe Feldman. The distortion from the re-recording emphasized the tinny higher frequencies, which happened to make it more likely for people to pick up on the yanny interpretation. There was no clever engineering involved. (More backstory from Wired here.)
Ellen K. said,

May 25, 2018 @ 1:26 pm

@Victor Mair.

No. As I said, I heard laurel at both. Only laurel.

I have never perceived both simultaneously any time listening to the clip. Though I have occasionally, after not hearing it for a while, had it, the first time, start as "yanny" and switch to "laurel" in the middle of the word. And then after that it's firmly laurel.
Jonathan Smith said,

May 25, 2018 @ 2:28 pm

I just pointed out in the other thread that you can reproduce the effect by clipping "Laurel" from the original voice actor's recent CNN appearance: removing low frequencies from the segment gives "Yanny". So it's actually something to do with his voice, which has a formant-rich "broadcaster" quality to it that I'm unable to describe in a technical way.
Victor Mair said,

May 25, 2018 @ 2:51 pm

The information provided by Ben helps me to understand better a perception about the Cloe Feldman recording and its manipulation with the NYT slider I've had from the very beginning of this Laurel vs. Yanny journey. Judging from many of the comments to the original post and also to this post, other readers also share this perception (or group of perceptions) to one degree or another. Namely, to the extent that we hear "Laurel" or "Yanny" or both "Laurel" and "Yanny", there's a certain ambiguity and "lack of full signal", if I can put it that way, to the recording.

As I characterized it in one of my comments to the first post, the sound that comes through is "slight, tinny, scratchy" and "wispy" — at least it is so at the end of the spectrum (the upper one) that is hardest for me to perceive. Another adjective for characterizing the sound of the Cloe Feldman recording that I can't get out of my mind is "anemic". This quality may well be a function of the process of transmission from the Vocabulary.com audio pronunciation through other devices (laptop? phone?) before it went viral via Cloe Feldman's tweet.

This convoluted transmission process may also help to account for some of the unusual features of the spectrogram pointed out by Doug Johnson.

Also contributing to the tenuity and ambiguity of that which we hear upon each iteration of the recording are the idiosyncrasies of our own physiological and neurological abilities, as well as the capabilities of the receivers and speakers that bring the recording to our ears, together with the ambient sounds in the environment when we register what we are hearing.

The most uncanny aspect of this entire experience for me is that I never can hear "Yanny" at all, and, at the higher frequencies where many other commenters say they can hear "Yanny", I hear absolutely nothing, not even the pale ghost of a "Laurel", let alone a lurking "Yanny". Just nada, zilch, zero — blank emptiness.
Sergey said,

May 25, 2018 @ 3:37 pm

With the adjustable version of the player (NYTimes?) I could hear both words overlapping each other if I tuned the player to just the transition point, and I could mentally shift to one or the other, with the other one becoming background noise.
M.N. said,

May 25, 2018 @ 4:17 pm

I might be deluding myself because I've heard the stimulus so many times now, but I think I can still hear "Laurel" when he isolates the upper frequencies. Isolating the lower frequencies definitely got rid of "Yanny", though. (Ordinarily, I hear both at once.)

This all reminds me of that thing about how bells are tuned, where the note you hear isn't really the fundamental frequency and it's just filled in by your brain based on the rest of the harmonic series.
Thaomas said,

May 25, 2018 @ 4:43 pm

Has there not been any discussion of who hears one and the other broken down by age, gender, etc?
CPC said,

May 25, 2018 @ 4:50 pm

The New York Times had a convenient tool based on a similar analysis: https://www.nytimes.com/interactive/2018/05/16/upshot/audio-clip-yanny-laurel-debate.html

They let you move a slider to increase or decrease the low and high frequencies in the sample. Everyone seems to have a different cut-off.
Aaron Toivo said,

May 25, 2018 @ 4:54 pm

The first time I heard the original recording, I heard "YEN-" for the split second it took to get from there to the "EL", but that second syllable gave me a sharp sensation of having misheard the first. That moment of cognitive dissonance resolved to "laurel" in my head and ever since I've been able to hear nothing else. Not even in the above video. Nor in others I've seen that altered the recording's whole frequency up and down.

Brains are so strange.
A Migdal said,

May 25, 2018 @ 5:37 pm

It is possible to do some simple experiments by loading the sample into a common audio processing program (Audacity is a free one). For me it indeed appears true that you can switch between yenny and laurel by adjusting the pitch — either via simple speed change or a length-preserving pitch change. In the original sample, I consistently hear "laurel", but shifting pitch down changes it to "yenny".

I tried filtering out frequencies above 3kHz in the equalizer — much below my apparent hearing cutoff — and the yanny/laurel switch when changing speed/pitch is not changed. The confusing part of the signal then may be at the low frequencies, and not as sensitive to hearing loss.

Filtering out frequencies above 1.5kHz makes the result sound only like "laurel" regardless of pitch shift. The same occurs if I filter out the part outside the window [1.5, 3] kHz: it sounds only like "larry" / "laurel".

So this looks more complicated than simple hearing loss.
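
The "simple speed change" mentioned in this comment can be sketched as plain resampling. The code below is a rough illustration under assumptions, not what Audacity does internally: it uses linear interpolation, a synthetic 400 Hz tone in place of the clip, and a hypothetical helper named change_speed. It covers only the speed-change case, where pitch and duration change together; a length-preserving pitch change needs a more involved method and is not shown.

```python
import math


def change_speed(signal, factor):
    """Resample by linear interpolation.

    factor < 1 slows playback (lower pitch, longer signal);
    factor > 1 speeds it up (higher pitch, shorter signal).
    """
    n_out = int((len(signal) - 1) / factor) + 1
    out = []
    for i in range(n_out):
        pos = i * factor          # fractional read position in the input
        j = int(pos)
        frac = pos - j
        nxt = signal[min(j + 1, len(signal) - 1)]
        out.append((1 - frac) * signal[j] + frac * nxt)
    return out


SAMPLE_RATE = 8000

# Assumed stand-in for the clip: 0.1 s of a 400 Hz tone.
original = [math.sin(2 * math.pi * 400 * t / SAMPLE_RATE) for t in range(800)]

# Half speed: played back at the same sample rate, this is one octave lower
# (about 200 Hz) and roughly twice as long.
slowed = change_speed(original, 0.5)

# Reference 200 Hz tone of the same length, for comparison.
reference = [math.sin(2 * math.pi * 200 * i / SAMPLE_RATE)
             for i in range(len(slowed))]
max_error = max(abs(a - b) for a, b in zip(slowed, reference))
```

The small residual in max_error comes from the linear interpolation; it grows quickly for content near half the sample rate, which is one reason audio editors use better resamplers.
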
JPL said,

May 25, 2018 @ 9:15 pm

@Victor Mair

On both the original Vocabulary.com recording and the twitter recording by Ms Feldman I can hear both "laurel" and "yanni" simultaneously. The onset of the second syllable "-ni" of "yanni" coincides with the articulation of the final /l/ of "laurel", while the two vowel sounds overlap (low back rounded and low front ae as in "cat"). My question toward the end of the other thread still stands: How general is this phenomenon: is it unique to this particular recording or this particular speaker, or can this pattern be found in any recording of human speech? (I listened to other recordings on Vocabulary.com, but I didn't hear anything similar. The fellow who recorded "laurel" and "audacity" does have a peculiarly resonant voice.) Like Benjamin Munson, Doug Johnson seems to think that there are two distinct signals overlaid in one recording, but Ben Zimmer disagrees ("… not created by overlaying …"). I want to know how it's possible to have concurrent higher frequency formants giving the impression of an articulation that is different from that of the normal formants of the word as intentionally pronounced. What is the explanation for the perceivable differences (as opposed to "reflection") in the higher frequency formants? Would I be able to hear both "laurel" and "yanni" if I heard that speaker say the word in "real life"?
M.N. said,

May 26, 2018 @ 12:52 am

Someone at vocabulary.com had a similar idea: here's a list of some of the words that were recorded by the same guy. https://www.vocabulary.com/lists/2365341

They do share a rather interesting timbre. Plus, it sounds to my ear like he's a big purveyor of the elusive male creaky voice:
https://www.vocabulary.com/dictionary/minatory
https://www.vocabulary.com/dictionary/suppuration
Martin said,

May 26, 2018 @ 8:03 am

Presumably this analogy’s been made elsewhere, but this feels strongly reminiscent of a colour-blindness test. Different eyes vary systematically in their responses to different frequencies, and we understand that well enough now that it’s relatively easy to generate colourblindness tests that can determine someone’s perception patterns. Presumably similar tests could be created or devised using the yanny/laurel effect to help pin down which frequencies are best perceived by different people – or to create audio that sounds different to different age cohorts, if there’s an age-correlated pattern in those differences.
B.Ma said,

May 27, 2018 @ 2:36 am

Is there a way to isolate the higher frequencies from the recording and then process them in some way to make them lower (I don't know the right terms) so that Victor would be able to hear "yanny"?
jjwk said,

May 27, 2018 @ 3:01 pm

In the unadjusted version, I heard Yanny (more like Yammy) on the first iteration, then Laurel on every repetition and ever since then … until listening to this version, when again I heard Yanny once and Laurel right after that.

Also, has anyone else tried recording themselves saying Laurel and comparing? I've done it, and the high-pitched versions sound different (more like lala), but never approach Yanny.
Robert said,

May 28, 2018 @ 7:30 am

@B Ma. That would involve high pass filtering of the recording and then lowering the pitch by pitch shifting down.
BZ said,

May 29, 2018 @ 11:54 am

I still mainly hear "laurel" in the vocabulary.com original, but if I listen carefully, I do hear what appears to be a faint recording artifact that, if I had to interpret it, sounds a lot like an electronic whispered high-pitch "yanni". I'm pretty sure it's not the guy's voice, but a combination of the voice and recording equipment. It's present in a few of his other recordings, such as "mugwump" and "cusp" as well, but I can't construe it as a spoken word there. His other recordings are entirely clear of this.
