Eggcorn of the month


YouTube's speech-to-text system is way behind the state of the art, or maybe has a good sense of humor. From its transcription of Donald Trump's 5/15/2025 speech in Qatar (the whitehouse.gov version):

A few other (meta-usage) examples of "Pulit suprise" are Out There, but even an old-fashioned bigram language model would know that the right answer is "Pulitzer Prize" — so it's a puzzle why Google's (presumably) LLM-based model screws this up so badly.
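To make the bigram point concrete, here is a minimal sketch of how even a tiny add-one-smoothed bigram model prefers "pulitzer prize" to "pulit suprise". The four-sentence training text, the context word "a", and the function names are all assumptions made up for illustration; nothing here reflects how YouTube's system actually works.

```python
import math
from collections import Counter

# Toy training text: an assumption for illustration, not real data.
CORPUS = (
    "she won a pulitzer prize for reporting . "
    "the pulitzer prize board met today . "
    "he got a surprise at the party . "
    "the prize was a surprise to everyone ."
).split()

unigrams = Counter(CORPUS)
bigrams = Counter(zip(CORPUS, CORPUS[1:]))
VOCAB_SIZE = len(unigrams) + 1  # +1 reserves probability mass for unseen words


def bigram_logprob(words):
    """Sum of add-one-smoothed log P(w_i | w_{i-1}) over the word sequence."""
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        numerator = bigrams[(prev, cur)] + 1
        denominator = unigrams[prev] + VOCAB_SIZE
        total += math.log(numerator / denominator)
    return total


def pick(context, candidates):
    """Return the candidate phrase that scores highest after the context word."""
    return max(candidates, key=lambda c: bigram_logprob([context] + c.split()))


best = pick("a", ["pulitzer prize", "pulit suprise"])
print(best)
```

In a real recognizer a score like this would be combined with an acoustic score; the sketch only shows that the language-model side of the decision is not a close call.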

And it makes the same choice in other recordings of the same speech, for example this one from Bloomberg:

And that recording's transcript has the same word sequence but divides it into lines differently, though still in a way that makes no sense in terms of either the message content or its prosodic delivery. The large variation in line length rules out the theory that the goal is just a certain number of words or characters per line. So again, why this application of Google's language model is so (variably) crappy is a puzzle.

The word error rate is not especially large, but the system makes plenty of other weird choices as well. In that speech, Trump refers (in a somewhat rambling way) to Sean Duffy, in his role as Secretary of Transportation and also as a former lumberjacking champion. The YouTube transcription of the whitehouse.gov version spells his name "Sean" six times and "Shawn" three times. The YouTube transcription of the Bloomberg version uses each spelling five times. (I'm not clear why the totals are different, and don't have time to look into it further; a reader may figure it out for us…)
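For any reader who wants to check spelling counts like these, a short script along the following lines would do it, given the transcript text copied from YouTube. The function name and the sample string are hypothetical; the sample is invented and is not taken from either transcript.

```python
import re
from collections import Counter


def count_variants(transcript, variants):
    """Count whole-word, case-insensitive occurrences of each spelling variant."""
    counts = Counter()
    for variant in variants:
        pattern = r"\b" + re.escape(variant) + r"\b"
        counts[variant] = len(re.findall(pattern, transcript, flags=re.IGNORECASE))
    return counts


# Invented sample text, standing in for a pasted transcript.
sample = "sean Duffy is here. Shawn was a lumberjack. Thank you, Sean."
counts = count_variants(sample, ["Sean", "Shawn"])
print(counts)
```

Running it over both transcripts and comparing the totals would show whether one version drops or adds a mention of the name.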

And here the spelling choices are also slightly different:

Random trawling through YouTube transcripts, as I've done over the years, turns up lots of weird stuff. As one other example, both of the cited transcripts render references to C.C. Wei as "Mr. weey", with a lower-case initial letter as well as a weird spelling, even though the context should make it clear to any Artificial (un)Intelligence that Trump is talking about the head of TSMC.

Maybe somebody from Google can explain what's going on.



28 Comments »

  1. Jerry Packard said,

    June 18, 2025 @ 8:24 am

    We had a roommate several years ago who would treat us to dinner dishes such as a fish dish called turbot surprise and turkey surprise but our favorite was her chicken dish she called pullet surprise.

  2. Kate Bunting said,

    June 18, 2025 @ 8:26 am

    I laughed when BBC TV subtitles rendered "P & O Ferries" as "piano fairies".

  3. Ralph J Hickok said,

    June 18, 2025 @ 9:40 am

    "Pullet Surprise" is misspelled. It's a delightful chicken dish!

  4. Robert T McQuaid said,

    June 18, 2025 @ 9:48 am

    Sometimes the eggcorn could be legitimate. A Canadian juggler with three rubber chickens once claimed: "This act gets the Pouletzer Prize."

  5. Robert Coren said,

    June 18, 2025 @ 9:51 am

    We often watch TV with captions turned on because of our advancing age and a growing tendency toward indistinct dialogue (people mumbling the way they do in real life, or mood music obscuring the dialogue). I don't know what mechanism is used for generating these captions, but whatever or whoever does it for the "Marple" series doesn't seem to understand some aspects of British English where it differs from US usage. The instance that stands out in my mind is a scene in which there was a bridge game going on in the background, and one of the players was captioned as saying "Low B", when it was clear that they had actually said "no bid", which is what British bridge players often say instead of "pass".

  6. Rodger C said,

    June 18, 2025 @ 12:20 pm

    Our reasons for having the captions on are the same as Robert Coren's, and since we often watch British murder mysteries, we often see strange American renderings of this and that. "What?? … Oh, he said …"

  7. tudza said,

    June 18, 2025 @ 1:46 pm

    Thought it was spelled poulet

    Certainly Poulet Surprise is a good name for a chicken dish.

  8. Mark Liberman said,

    June 18, 2025 @ 3:10 pm

    @tudza: "Thought it was spelled poulet"

    In French, yes. In English it's pullet.

  9. Bob Ladd said,

    June 18, 2025 @ 3:34 pm

    "Pullet Surprises" was the title of a book by Amsel Greene – a collection of malapropisms gathered from student essays – which according to Amazon Books first appeared in 1969.

  10. anonymous said,

    June 18, 2025 @ 10:37 pm

    I somewhat doubt Google is using an LLM for this. Their automatic captioning has been around for years and years, and it wouldn't surprise me if it used an older system.

  11. Arthur Baker said,

    June 19, 2025 @ 1:10 am

    Kate Bunting, my Australian wife has always pronounced ferry as fairy, e.g. "I'll be taking the fairy across Sydney Harbour". Hmm, will that be the Sugar Plum, an elf, a pixie, or what?

  12. Bob Ladd said,

    June 19, 2025 @ 1:30 am

    @Arthur Baker: I think most speakers of American English would assume that EVERYONE, not just your Australian wife, pronounces ferry as fairy. The 3-way merry/Mary/marry distinction beloved of mid-20th-century phonemicists and American dialectologists is not widely observed in North America.

  13. Arthur Baker said,

    June 19, 2025 @ 2:01 am

    Bob Ladd, thanks for that. Coincidentally my wife has a sister named Mary and a brother-in-law who married Mary, and manages to wish both of them "Merry Christmas" each year, so the 3-way distinction you mention does exist here. Why the merry/Mary distinction doesn't carry over to ferry/fairy, I have no idea. Just one of the linguistic oddities of the Antipodes.

  14. Jarek Weckwerth said,

    June 19, 2025 @ 2:58 am

    @anonymous: "I somewhat doubt Google is using an LLM for this." I second that. Just last night, I watched a video about German Panzer III tanks with a British narrator that stubbornly captioned Panzer as Panza. In a video about tanks. What kind of LLM would do that? And this kind of thing is really quite typical for YT.

  15. Mark Liberman said,

    June 19, 2025 @ 6:21 am

    @Jarek Weckwerth: "What kind of LLM would do that?"

    A really stupid one? The thing is, an old-fashioned ngram model will trivially pick Panzer over Panza, since the latter is very much less likely in the relevant contexts (e.g. not following "Sancho").

    My guess is that the system is starting from letter-substring "tokens", as modern LLMs do, and then makes some unfortunate choices (e.g. about "temperature") in its role guiding an end-to-end speech to text system.

  16. David Marjanović said,

    June 19, 2025 @ 8:46 am

    Why the merry/Mary distinction doesn't carry over to ferry/fairy, I have no idea. Just one of the linguistic oddities of the Antipodes.

    I wonder if you're hearing something different: she has the distinction and says ferry with her BED vowel… but that's [e], as in New Zealand, South Africa and 1950s Advanced RP, instead of [ɛ].

  17. Robert Coren said,

    June 19, 2025 @ 9:15 am

    @Bob Ladd: I think you're exaggerating the extent of the merry/marry/Mary blend in the US. I definitely say and hear the difference, and I think that's true of most Northeasterners.

  18. Jonathan Smith said,

    June 19, 2025 @ 1:27 pm

    FWIW Harvard Dialect Survey map (2003) on Mary/merry/marry — no regional breakdown but at any rate the distinction(s) are fading fast everywhere… plus given different spellings, the naive tendency is to OVERestimate the extent to which one differentiates… I have seen people swear they pronounce e.g. knight and night differently :/

  19. Bob Ladd said,

    June 19, 2025 @ 2:48 pm

    On marry/merry/Mary again: I deliberately didn't say it was dead in North America, just that it's not widely observed. (My pronunciation is Northeastern US if it's anything, and I have a rock-solid distinction between marry and the other two. The Mary/merry distinction is certainly less solid for me, but it's kind of there sometimes.)

    One of my sons was once in a conversation with an American friend about given names that are often spelled differently for male and female bearers of the name – pairs like Lee/Leigh or Leslie/Lesley. Much to my son's puzzlement, the friend cited Kerry/Carrie as another example.

  20. Chas Belov said,

    June 19, 2025 @ 3:33 pm

    I think most of my personal accent comes from Southwestern Pennsylvania, with a few other places mixed in. I don't have the Mary/marry distinction and, as best I can tell as someone who is not a trained phonetician, I kind of half-have the merry distinction. I think I also half-have the fairy/ferry distinction.

    By half-have, I mean if I listen carefully, I can sort of hear the distinction, but I have to work at hearing it and can't guarantee I always do it. That is, I may not be consistent in how I produce merry or ferry.

    I am very aware that I have the internal rowt/root distinction for route and the crick/creek distinction for creek, based on whether or not it is a proper noun. That is:

    down by the crick
    Jones Creek

    take that rowt
    Root 66

    and I remember a discussion with a Southwestern Pennsylvania friend when I was living there that they had that distinction as well, so it wasn't just me.

    And I pronounce color as "keller."

  21. Jarek Weckwerth said,

    June 19, 2025 @ 4:13 pm

    @Mark Liberman "What kind of LLM would do that?" — A really stupid one? Isn't that a bit harsh? ;)

    I don't have ready examples at hand, but I regularly see YT captions spit out complete gibberish, i.e. letter sequences that aren't English words at all but sort-of fit the phonetics of what is being said. That would point in the direction of a system in which phoneme-to-grapheme mappings override any "language model" outputs for at least out-of-dictionary material. (And the dictionary seems to lag behind life quite badly; I was astonished, for example, by how long it took them to start getting Covid right…)

    I've made a reminder to log some examples.

  22. Jarek Weckwerth said,

    June 19, 2025 @ 5:53 pm

    OK, here comes: In a podcast, Kamala Harris is CCed as Kla Harris. I'm not a native speaker but Kla does not strike me as a real English word.

  23. Jarek Weckwerth said,

    June 19, 2025 @ 5:56 pm

    And a few seconds later, Kamla: https://youtu.be/hQRFcg7CKcY?si=QAmW00idqpqg2b2i&t=1736.

  24. Philip Taylor said,

    June 20, 2025 @ 3:52 am

    Sadly the days when personal names had to be "real English words" are long since past; any attempt to transcribe such names from audio would require an ever-growing database, but there could still be no guarantee that the spelling guessed at (no stronger wording would be justified) is the correct one.

  25. Jarek Weckwerth said,

    June 20, 2025 @ 11:40 am

    @ Philip Taylor: That's a general truth of speech recognition. There will always be strings that are not in the dictionary, be it personal names or new words etc. Here, however, the question is what happens under such circumstances. If there was a preference for what Mark Liberman calls "old-fashioned ngram models", then you would expect real words that are similar phonetically to the input but make semantic sense in the context. But here, it's evident that the phonetic form takes precedence. Likewise in the Panza/Panzer case; the form is selected on phonetic grounds, ignoring the context. My point is that that is not what I would expect from an LLM.

    I'm familiar with similar things at the "other end of speech tech", in text-to-speech. Many systems at the moment still don't do any "language modelling", which manifests e.g. in an inability to do normalization (i.e. conversion of things such as digits or acronyms into actual phonemic strings). Polish numerals are an excellent example; the form of the numeral depends on so many factors that even top-tier systems can't cope.

    For example spotkamy się po 25 is reasonably interpretable for a human as 'we will meet after the 25th [day of this month]', and in the context of a text-based conversation would not cause any trouble. But you need to know that it can only reasonably refer to a month date, and if so, it will use the ordinal in the correct case as decided by the preposition. Try it on ElevenLabs. Doesn't work.

  26. Chas Belov said,

    June 20, 2025 @ 2:09 pm

    @Philip Taylor:

    Sadly the days when personal names had to be "real English words" are long since past

    I don't think of English given names as English words, in that I could use them in a sentence where they weren't a name, aside from a limited number of word-names such as Charity, Hope, or Felicity. Unlike Chinese or Japanese or some indigenous American languages where names commonly are made up of words from that language.

  27. Chas Belov said,

    June 20, 2025 @ 7:25 pm

    And actually, now that I think about it, when I use Google Translate on text that includes someone's name, it will often literally translate the name as English words when what I really want is a Latin-character transcription of the name.

  28. stephen said,

    June 20, 2025 @ 9:16 pm

    i24NEWS is an Israeli news channel. Their closed captioning used to be complete gibberish, just random words. Now they don't have any closed captioning.

    For what it's worth…regarding my own accent, Upper Midwest US….
    I pronounce Campbell as Camel
    Crayon as Cran
    I'll as All
    Well as Wool or sometimes like Wall
