Speech-To-Text not quite perfect yet….
Yesterday on YouTube, "Former White House chief strategist Steve Bannon sits down with Dasha Burns, POLITICO's White House bureau chief". At the end of the interview, there's a conventional exchange of thank-yous. From Dasha Burns:
All right Steve, I know you got a show to record,
thank you so much for- for beaming in here
and uh sorry for the technical difficulties everyone.
Steve thanks so much.
And Steve Bannon's response:
Dasha thank you,
and thank Politico for having me.
Pretty much as expected. But Google's transcription generation system (along with its usual failure to divide segments by speaker) hears "Politico" as "polio":
Polio has been in the news recently, because of RFK Jr.'s hostility to the polio vaccine, which may be why Google was primed to hear it. But no vaccine- or polio-related stuff is mentioned in this interview. And although Bannon's response has various (standard) kinds of phonetic reduction, like flapping-unto-extinction the intervocalic /t/ in "Politico", the /k/ in his performance of that word remains clear:
In IPA, I guess he said something like [ˈplɪ.kou] …
David Marjanović said,
January 15, 2025 @ 3:32 pm
More likely, polio was in Google's dictionary, Politico was not.
I've watched presentations from scientific conferences with automatic captions (which the presenters included in an attempt to be helpful). Invariably, almost all technical terms were replaced by words that were much more common but only vaguely similar. Without context, few sentences in the captions that contained any such terms would have remained comprehensible.
Mark Liberman said,
January 15, 2025 @ 4:15 pm
@David Marjanović: "More likely, polio was in Google's dictionary, Politico was not."
Nope — from earlier in the same YouTube transcript:
And there are several other correct recognitions of that word later in the same transcript.