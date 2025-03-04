

Over the years, we've documented various applications of voice morphing technology besides the malicious creation of "deep fake" audio clips. Here's a new one: Amrit Dillon, "AI erases call centre staff’s Indian accents", The Times 3/2/2025:

A French company which operates the largest number of call centres in the world is using artificial intelligence to soften Indian accents in real time to make customer conversations easier and shorter.

Teleperformance said that it was sometimes difficult for customers calling call centres in India — and the Philippines — to understand workers’ accents, leading to frustration and longer than necessary calls.

“When you have an Indian agent on the line, sometimes it’s hard to hear, to understand,” Thomas Mackenbrock, the company’s deputy chief executive, told Bloomberg News. “The technology can neutralise the accent of the Indian speaker with zero latency. This creates more intimacy, increases customer satisfaction, and reduces the average handling time. It is a win-win for both parties.”

The software, called “accent translation”, has been developed by Sanas, a start-up based in Palo Alto, California.

The article quotes an objection:

Akhilesh Agarwal, 28, worked in a call centre in Bangalore for two years and disliked every minute of it. He is not amused by the AI accent tool.

“It’s not really neutralisation is it? That’s just sugaring the pill. It’s favouring an American or British accent above an Indian one. I wonder if they’d neutralise a Scottish or Irish accent? I doubt it,” he said.

A more extensive critique can be found in Payne et al., "Real-time accent-altering technology: The message is clear, and it is dehumanizing", PsyArXiv 2023. FWIW, my own opinion is that a system of this type is meant to fool its users rather than to degrade its employees, though it probably has that effect as well. And the moral issues involved go back to Henry Higgins in Shaw's Pygmalion, who aimed to accomplish the same sort of accent transformation by non-computational means.

But sociolinguistic prejudices aside, I'm not yet convinced that the Sanas system actually works. From the Times article:

Some demo videos on YouTube from two years ago, when the technology was first developed, show strong Indian and African accents becoming instantly clearer and more American-sounding.

We learned more than 50 years ago that "evaluation by demo" can't be trusted, and if a credible third-party evaluation has been done, I can't find any documentation of it. The company's website is here, and includes some percentage-decorated claims of system performance:

However, they don't seem to offer any explanation of where these numbers came from.

Still, a reliable system of this type will certainly become possible within a few years, whether or not it works well now.

