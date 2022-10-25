

As part of my ongoing exploration of the many ways in which F0 is not pitch and pitch is not F0, I did a little demo/experiment with a sample of Anna-Maria Hefele's "Polyphonic Overtone Singing" video:

Here's an 8.85-second audio clip from that video, starting at 43.45 seconds:

Here's a spectrogram of that segment:

(The blue line at the top is Praat's attempt at pitch tracking, which follows the fundamental (at about 275 Hz) rather than the variable-strength harmonics which we hear as a time-varying melody.)
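For readers who want to see the same effect in miniature, here is a toy sketch in Python. It is not Praat's algorithm and does not use the actual recording: it synthesizes short frames with a fixed 275 Hz fundamental and one boosted harmonic, then compares a plain autocorrelation F0 estimate with the frequency of the strongest overtone. The sample rate, the harmonic amplitudes, the choice of boosted harmonics (6, 8, 10), and all function names are assumptions made for illustration.

```python
import numpy as np

SAMPLE_RATE = 16000   # Hz; assumed for this toy example
F0 = 275.0            # Hz; roughly the fundamental mentioned above

def overtone_frame(emphasized, seconds=0.04):
    """Harmonics 1..12 of F0 at amplitude 1/n, except that harmonic
    number `emphasized` is boosted to amplitude 1 (a crude stand-in
    for a vocal-tract resonance sitting on that harmonic)."""
    t = np.arange(int(round(SAMPLE_RATE * seconds))) / SAMPLE_RATE
    frame = np.zeros_like(t)
    for n in range(1, 13):
        amp = 1.0 if n == emphasized else 1.0 / n
        frame += amp * np.sin(2 * np.pi * n * F0 * t)
    return frame

def autocorr_f0(frame, fmin=100.0, fmax=500.0):
    """Simple autocorrelation F0 estimate: pick the lag with the largest
    autocorrelation among lags corresponding to fmin..fmax."""
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(SAMPLE_RATE / fmax)
    hi = int(SAMPLE_RATE / fmin)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return SAMPLE_RATE / lag

def strongest_overtone(frame, f0, n_fft=4096):
    """Frequency of the largest spectral peak above the fundamental."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / SAMPLE_RATE)
    spectrum[freqs < 1.5 * f0] = 0.0   # ignore the fundamental itself
    return freqs[np.argmax(spectrum)]

results = []
for k in (6, 8, 10):   # the "melody": which harmonic is boosted
    frame = overtone_frame(k)
    f0_est = autocorr_f0(frame)
    peak = strongest_overtone(frame, f0_est)
    results.append((k, f0_est, peak))
    print(f"boosted harmonic {k}: F0 estimate {f0_est:.1f} Hz, "
          f"strongest overtone {peak:.1f} Hz")
```

In this toy case the F0 estimate should stay close to 275 Hz while the strongest overtone moves from harmonic to harmonic, which is the same split between the tracked fundamental and the heard melody.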

And here's the fun part: an animation of sequential spectral slices from the same clip. I've slowed the performance by about a factor of three, since the spectral slices were calculated 100 times per second, in overlapping 15-millisecond windows (code available on request), whereas the animation runs at 30 frames per second. To match the duration of the video, I stretched the audio without changing the pitch, using sox's tempo effect (i.e. "sox Hefele1X1a.wav Hefele1X1b.wav tempo 0.3051724 50"):
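A minimal sketch of how slices like these could be computed, in Python with NumPy. This is not the code used for the animation. Only the 100-per-second rate and the 15-millisecond window come from the description above; the Hann window, the FFT size, the dB scaling, and the function name are assumptions. The demo input is a synthetic 275 Hz tone rather than the actual clip.

```python
import numpy as np

def spectral_slices(samples, sample_rate, slices_per_second=100,
                    window_seconds=0.015, n_fft=2048):
    """Return (freqs, slices): one dB magnitude spectrum per analysis frame.

    Frames start every 1/slices_per_second seconds and are window_seconds
    long, so with the defaults successive 15 ms frames overlap by 5 ms.
    """
    hop = int(round(sample_rate / slices_per_second))
    win = int(round(sample_rate * window_seconds))
    n_fft = max(n_fft, win)            # zero-pad, never truncate a frame
    window = np.hanning(win)           # assumed window shape
    starts = range(0, len(samples) - win + 1, hop)
    frames = np.stack([samples[s:s + win] * window for s in starts])
    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return freqs, 20 * np.log10(magnitude + 1e-12)

# Demo on one second of a synthetic 275 Hz tone (not the real recording).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
freqs, slices = spectral_slices(np.sin(2 * np.pi * 275.0 * t), sample_rate)
print(slices.shape)   # (number of frames, number of frequency bins)

# Timing arithmetic from the numbers quoted above.
slowdown = 100 / 30               # slices per second over frames per second
stretched = 8.85 / 0.3051724      # clip length over the sox tempo factor
print(f"slowdown about {slowdown:.2f}x, stretched audio about {stretched:.1f} s")
```

Each row of `slices` would correspond to one frame of an animation like the one described. The last lines only restate the arithmetic: 100 slices per second shown at 30 frames per second is a slowdown of a bit more than three, and dividing the 8.85-second clip by the tempo factor gives a stretched duration of about 29 seconds.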

More later on what this means (for those who don't get it already), and applications of the same kind of visualization to "creak", "throat singing", "voice quality", etc. etc.

