What's (still) wrong with text-to-speech?
Text-to-speech technology has improved enormously over the decades, but there's still some headroom, as a friend has recently underlined for me. He observes that when The Economist magazine first publishes a piece online, it appears with an AI-read audio version, and only later with a human-read one:
The rhythm/prosody/pitch (I'm not exactly sure which – all three?) is the same in nearly every sentence and even clause. This high-then-falling pattern is fine in one sentence, but repeated 50 times in a row is awful.
Later, those pieces that make it into the print edition get their own, human-read version. So voilà, you have a perfect before-and-after.


