Eggcorn of the month
YouTube's speech-to-text system is way behind the state of the art, or maybe has a good sense of humor. From its transcription of Donald Trump's 5/15/2025 speech in Qatar (the whitehouse.gov version):
A few other (meta-usage) examples of "Pulit suprise" are Out There, but even an old-fashioned bigram language model would know that the right answer is "Pulitzer Prize" — so it's a puzzle why Google's (presumably) LLM-based model screws this up so badly.
And it makes the same choice in other recordings of the same speech, for example this one from Bloomberg:
And that recording's transcript has the same word sequence, but divides the transcript into lines differently, though still in a way that makes no sense, either in terms of the message content or in terms of its prosodic delivery. The large variation in line length rules out the theory that the goal is just a certain number of words or characters per line. So again, why this application of Google's language model is so (variably) crappy is a puzzle.
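To make the bigram point concrete, here is a minimal Python sketch of how even an add-one-smoothed bigram model ranks the two candidate word sequences. The training sentences are invented for illustration (a real model would be estimated from a very large corpus), and the names `TOY_SENTENCES` and `bigram_logprob` are hypothetical. Nothing here is meant to describe what YouTube's system actually does.

```python
import math
from collections import Counter

# Toy training sentences, invented for illustration only.
TOY_SENTENCES = [
    "she won the pulitzer prize for fiction",
    "he was awarded a pulitzer prize last year",
    "the pulitzer prize committee met today",
    "what a nice surprise for the winner",
    "the prize was a surprise to everyone",
]

unigrams = Counter()
bigrams = Counter()
for sentence in TOY_SENTENCES:
    words = sentence.split()
    unigrams.update(words)
    # Count adjacent word pairs within each sentence only.
    bigrams.update(zip(words, words[1:]))

# Vocabulary size, plus one slot reserved for any unseen word.
VOCAB_SIZE = len(unigrams) + 1


def bigram_logprob(words):
    """Add-one smoothed log probability of a word sequence under the toy model."""
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + VOCAB_SIZE))
    return total


candidates = [
    "won the pulitzer prize".split(),
    "won the pulit suprise".split(),
]
best = max(candidates, key=bigram_logprob)
print(" ".join(best))
```

Under these assumptions the model prefers "pulitzer prize", simply because it has seen that word pair before and has never seen "pulit" or "suprise" at all.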
The word error rate is not especially large, but the system makes plenty of other weird choices as well. In that particular speech, Trump refers (in a somewhat rambling way) to Sean Duffy, in his role as Secretary of Transportation and also as a former lumberjacking champion. The YouTube transcription of the whitehouse.gov version has his name spelled "Sean" six times and "Shawn" three times. The YouTube transcription of the Bloomberg version uses each spelling five times. (I'm not clear why the totals are different, and don't have time to look into it further; a reader may figure it out for us…)
And here the spelling choices are also slightly different:
Random trawling through YouTube transcripts, as I've done over the years, turns up lots of weird stuff. As one other example, both of the cited transcripts render references to C.C. Wei as "Mr. weey", with a lower-case initial letter as well as a weird spelling, even though the context should make it clear to any Artificial (un)Intelligence that Trump is talking about the head of TSMC.
Maybe somebody from Google can explain what's going on.
Jerry Packard said,
June 18, 2025 @ 8:24 am
We had a roommate several years ago who would treat us to dinner dishes such as a fish dish called turbot surprise and turkey surprise but our favorite was her chicken dish she called pullet surprise.
Kate Bunting said,
June 18, 2025 @ 8:26 am
I laughed when BBC TV subtitles rendered "P & O Ferries" as "piano fairies".
Ralph J Hickok said,
June 18, 2025 @ 9:40 am
"Pullet Surprise" is misspelled. It's a delightful chicken dish!
Robert T McQuaid said,
June 18, 2025 @ 9:48 am
Sometimes the eggcorn could be legitimate. A Canadian juggler with three rubber chickens once claimed: "This act gets the Pouletzer Prize."
Robert Coren said,
June 18, 2025 @ 9:51 am
We often watch TV with captions turned on because of our advancing age and a growing tendency toward indistinct dialogue (people mumbling the way they do in real life, or mood music obscuring the dialogue). I don't know what mechanism is used for generating these captions, but whatever or whoever does it for the "Marple" series doesn't seem to understand some aspects of British English where it differs from US usage. The instance that stands out in my mind is a scene in which there was a bridge game going on in the background, and one of the players was captioned as saying "Low B", when it was clear that they had actually said "no bid", which is what British bridge players often say instead of "pass".
Rodger C said,
June 18, 2025 @ 12:20 pm
Our reasons for having the captions on are the same as Robert Coren's, and since we often watch British murder mysteries, we often see strange American renderings of this and that. "What?? … Oh, he said …"
tudza said,
June 18, 2025 @ 1:46 pm
Thought it was spelled poulet
Certainly Poulet Surprise is a good name for a chicken dish.
Mark Liberman said,
June 18, 2025 @ 3:10 pm
@tudza: "Thought it was spelled poulet"
In French, yes. In English it's pullet.
Bob Ladd said,
June 18, 2025 @ 3:34 pm
"Pullet Surprises" was the title of a book by Amsel Greene – a collection of malapropisms gathered from student essays – which according to Amazon Books first appeared in 1969.