MT story of the month


Arika Okrent, "Translation Error Announces 'Clitoris Festival' in Spain", Mental Floss 11/2/2015:

The town of As Pontes in northwestern Spain has held a festival to celebrate the local leafy green delicacy of grelo, or broccoli rabe, since 1981. This year, visitors who went to the festival website hoping to find useful information were surprised by the announcement of a "Clitoris Festival" and the claim that "the clitoris is one of the typical products of Galician cuisine."

Municipal spokesman Monserrat García explained that the mistake was the result of automatic Google translation from the local language of Galician into Castilian Spanish.

The source of the story: "Cuando «o grelo» se convierte en «el clítoris»", La Voz de Galicia 10/31/2015.

Arika's Mental Floss column offers screen shots of the translation, and explains why the translation is perfectly logical (for an algorithm with no common sense…).

The Spanish-language web page for La Feria del Grelo of As Pontes now simply uses the term "grelo", and Google Translate no longer translates grelo as "clitoris" in the cited context.

But can it really be true that in Galicia, which after all is part of Spain, the people responsible for creating that web page needed to use Google Translate to go from Galician to Spanish? And then didn't recognize that "La Feria del Clitoris" was problematic?

Good news for the health of minority languages in Spain, if so.

But it seems improbable. I haven't seen a screenshot of the alleged festival web page (which anyhow could be faked) — my guess is that it's one of those "wouldn't it be funny if they did this" kind of jokes…

Arika Okrent quotes a named "municipal spokesman", though only via the link to La Voz de Galicia. The Guardian quotes the same official, and provides a few direct quotes (Ashifa Kassam, "Google Translate error sees Spanish town advertise clitoris festival", The Guardian 11/3/2015):

“It was quite a surprise,” Montserrat García, the town’s spokesperson, told the Guardian. “At first, we didn’t believe what we were seeing.” […]

Officials in As Pontes are considering filing a formal complaint with Google. García said: “They should recognise Galician and translate it accurately.”

Google Translate has since changed the translation, with grelo now said to mean “brote” or sprout. But García remains dissatisfied: “It’s still not the best way of describing grelo, as it is a vegetable from the turnip family.”

Town officials have turned their backs on Google Translate, but there has been an upside to the embarrassing error, García said, as it caused a surge of interest in this year’s festival. “It’s become a means – albeit a very odd means – of promoting our festival.”

But the Guardian reporter was apparently working from Madrid. So I'm reserving judgment — it's either a real-life funny mistranslation, or a well-executed prank based on a potential mistranslation.




  1. Matt Heath said,

    November 3, 2015 @ 4:33 pm

    "Grelos" as "clitoris" as a joke appears in this (in places very funny) spoof Portuguese/English that I recall taking in some British bloggers a few years back

  2. Graeme said,

    November 3, 2015 @ 5:10 pm

    Galician is poised between Portuguese and Castilian Spanish. In an online translate tool, "grelo" did come out as "clitoris" – a kind of slang extension of bud or sprout. Maybe a warning about not checking the back-translation…;)

    [(myl) Yes, that's clear — Arika Okrent gives screenshots of Google Translate doing it. The question is, did the As Pontes tourism board REALLY use Google Translate to create the Spanish version of their Grelo Festival web page, and not notice the problem???]

  3. Rubrick said,

    November 3, 2015 @ 6:10 pm

    The festival may be difficult to find, but it's totally worth it.

  4. Belial Issimo said,

    November 3, 2015 @ 6:57 pm

    ^ Yes, and delicious.

  5. Matt said,

    November 3, 2015 @ 7:56 pm

    I can 100% believe that someone tasked with creating the Spanish version of the page by a boss who didn't really care how it was done just ran the text through Google Translate and put it up without even looking at it. Looking at the page itself, it's only one sub-page among many, and it doesn't seem to have any elements that would have required attention by a designer, etc. — it's quite plausible that there was no stage in the process where a human being involved in production looked at the finished result.

    (I was wondering if perhaps they had used that automatic Google Translate toolbar thingie which sits at the top of the page and "translates" into the user's preferred language automatically, but unless they've entirely designed it out of the site since then, that doesn't seem likely.)

  6. Adam F said,

    November 4, 2015 @ 8:33 am

    @Matt Heath

    That reminds me, of course, of the famous Hungarian phrase book.

  7. Graeme said,

    November 4, 2015 @ 10:48 am

    Because of the tourist industry between Spain and the UK, many if not most Spanish restaurants offer menus translated into English, but the translations are frequently unintelligible – for example, translating "pasas" as "steps" instead of "raisins", or "pata negra" as "black leg" rather than as "ham". I find it easy to believe that they just used Google Translate. You would need a fairly profound or specialised knowledge of English to discern that clitoris is an odd word to employ. I have a reasonable knowledge of Spanish, but offhand I could not tell you what the Spanish word for "clitoris" is; at least I now know a slangy Portuguese translation.

    [(myl) But the allegedly problematic page was in SPANISH, not in English. And the Spanish word for clitoris is apparently "clitoris", so we need to assume that
    (1) Someone at the tourism agency wrote the page in Galician.
    (2) Someone else (in Spain, mind you), tasked with translating the page into Spanish, felt the need to use Google Translate.
    (3) Either no one looked at the result, or they somehow didn't notice that "clitoris" is not a good Spanish approximation of "broccoli rabe"…]

  8. Terry Hunt said,

    November 4, 2015 @ 10:58 am

    I now expect a Clitoris Festival to figure in the next volume of Ann Leckie's 'Imperial Radch' series (which is not without linguistic interest, by the way). Readers of Ancillary Sword will understand why.
    (And yes, I do know really that Ann doesn't plan a further volume.)

  9. un_malpaso said,

    November 4, 2015 @ 11:42 am

    My first thought upon reading the original translation was, "Gosh, I think this nose-to-tail cuisine trend has gone a bit too far."

  10. Rod Johnson said,

    November 4, 2015 @ 12:02 pm

    (Leckie does plan further volumes set in the same universe, just not in the story of Breq.)

  11. J. W. Brewer said,

    November 4, 2015 @ 12:32 pm

    In terms of improbability, one empirical fact that seems important is whether there is any significant number of monolingual L1 Galiciaphones (if that's the word), or whether due to the effect of education/media/etc we should expect approximately 100% of Galician speakers of the age and social background likely to be working on a website to be fully literate in Castilian. Possible UK parallel: I think there have been instances in Wales of Welsh text that came out comically wrong because it was lousy MT from an English original not subject to quality checking by someone with actual Welsh fluency, but I would be much more surprised if there were non-hoax texts in English that were the result of lousy MT from a Welsh original, because approximately 100% of adult L1 Welsh-speakers in Wales have good enough English to notice the problem (indeed, quite probably good enough English as to make resort to MT inefficient).

  12. J. W. Brewer said,

    November 4, 2015 @ 12:40 pm

    FWIW, Google Translate claims that the word for "clitoris" in Galician (and also in Portuguese) is "clitóris." Obviously there must be synonyms used in more informal registers (which might or might not be vulgar). I suppose it is possible that in a given language community the polite/medical/scientific word for the organ is so high-register that it's not universally known, with a non-vulgar-but-less-formal synonym being the lexeme in common use, a la the way AmEng-speaking elementary school kids used to try to trick each other into embarrassment on the playground or school bus (maybe they still do?) by saying "ooh, your epidermis is showing."

  13. Jerry Friedman said,

    November 4, 2015 @ 1:54 pm

    I wonder why the La Voz article has o grelo (masculine, I take it) while the screen shot at Mental Floss has a grelo (feminine)?

  14. Arika Okrent said,

    November 4, 2015 @ 3:35 pm

    That might matter. I don't speak Galician (though I do have passing Portuguese) and it actually took me a while to come up with a phrase that would translate grelo as clitoris. Plain "grelo" didn't do it, nor did "feira do grelo" or "feira dos grelos." I don't remember if I tried "a grelo" and "o grelo" but eventually I just took the sentence they mentioned in the article, translated from English to Galician, and then fed the product into Galician to Spanish (it gave me "a grelo" from Galician to English) and that worked.

    Interestingly, one other thing that worked was if the number was mismatched. It would translate "dos grelos" (de los grelos) as greens, but "dos grelo" as clitoris (or something like that. I didn't take a screenshot but I know it worked when there was a plural-singular mismatch between determiner and noun). This reveals something interesting about machine learning, I think. If the surrounding text has been part of the training it will return a good result, but mismatches force it to draw from disparate parts of training. Maybe it's important to have a noun gender mismatch too and that's why it worked in this case.

    I also wouldn't expect Galician speakers to not also be fully fluent in Spanish, but I don't think it was a stunt. The fact that the very sentence they mention is the only one that really forced that translation tells me they found it by accident and just didn't notice it.

    Now that translation glitch doesn't work any more. It's funny to think that Google has a team of engineers ready to do hand-crafted adjustments in these situations. I wonder how often that happens.

    Remember to take screenshots when you find something weird!

  15. Jerry Friedman said,

    November 4, 2015 @ 6:28 pm

    Arika Okrent: Thanks for replying and for sharing more information. If Google Translate doesn't take grelo clitorally* unless there's a mismatch that a native speaker is unlikely to make, I'd think an accident would be less likely. So it might be a wash.

    *D. Barton Johnson

  16. Prado said,

    November 5, 2015 @ 10:26 am

    There really is no story here. The word for "clitoris" in both Spanish and Galician is also "clitoris". Nobody "felt the need" to use Google Translate because they were not fluent in Spanish, they were just lazy and didn't even read the translated text.

  17. Gilson de Azevedo said,

    November 5, 2015 @ 11:35 am

    Grelo is always masculine in Portuguese (o grelo). And it has many meanings.

    Dicionário Houaiss

    grelo \ê\

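    substantivo masculino (1563) [masculine noun]
    1 embrião quando surge da semente [embryo as it emerges from the seed]
    2 germe dos bulbos, rizomas e tubérculos, ao aparecer fora da terra; broto [sprout of bulbs, rhizomes and tubers as it appears above ground; shoot]
    ‹ g. da batata › [potato sprout]
    3 m.q. broto ('início do desenvolvimento de planta') [same as "broto", 'the start of a plant's development']
    4 P haste das crucíferas antes do desabrochar completo das flores [in Portugal: stalk of crucifers before the flowers have fully opened]
    ‹ g. da couve › ‹ g. de nabo › [cabbage shoot; turnip shoot]
    5 tab. m.q. clitóris [taboo: same as "clitoris"]
    6 P fita que orna a pasta dos estudantes do 4º ano e que se queima, em antiga praxe na Universidade de Coimbra [in Portugal: ribbon that decorates the briefcase of fourth-year students and is burned, in an old custom at the University of Coimbra]

    grelos: substantivo masculino plural BEI Algodres [masculine plural noun]
    1 excremento(s) [excrement]

    Sinonímia e Variantes [Synonyms and Variants]

    como ver sinonímia de excremento [see the synonyms listed under "excremento"]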



    Gilson de Azevedo
    Porto Alegre, Brazil

  18. Peter Taylor said,

    November 5, 2015 @ 5:35 pm

    @J.W. Brewer, the minority languages of Spain were repressed during the dictatorship, and although the constitution of 1978 grants regional official status to three of them* it also states that it is a duty of every Spaniard to be able to speak Castilian (Article 3). In rural areas I wouldn't be surprised to find towns where everyone speaks Gallego ten times as frequently as Castilian, but I would be very surprised to find a monolingual Gallego speaker.

    * Or four, if you go by political definition of languages rather than linguistic.

  19. J. W. Brewer said,

    November 6, 2015 @ 10:40 am

    Peter Taylor, I don't mean to minimize the role of government policy in Spain, but my impression is that by this point it is fairly unusual in any European country that has a single dominant/official language for native-born L1 speakers of regional/minority languages not to be fully bilingual in the dominant language (maybe a few exceptions where national borders have changed comparatively recently), even if the historical processes by which that has occurred were not always quite so heavy-handed as those in Spain (although I'm sure each and every minority language group has a story to tell about official government policy that seemed heavy-handed from their perspective).

  20. The Same Other Eric said,

    November 7, 2015 @ 4:57 pm

    FWIW, I believe Google Translate only translates to and from English, so that every translation will pass through English at some point. That is to say, GT has e.g. no German-Dutch translation feature; behind the scenes, it's actually German-English-Dutch.

    [(myl) This is false.]

  21. The Same Other Eric said,

    November 7, 2015 @ 5:28 pm

    @Arika Okrent:

    (wait… zomg, really??! Aaaaahh, my hero! huge fan! Anyway)

    "It's funny to think that Google has a team of engineers ready to do hand-crafted adjustments in these situations."

    While I don't doubt they have that capability, Google Translate is crowdsourced, so that users such as you or I can click "Edit" and fix mistaken translations. If some attention was brought to this quirky news item, it wouldn't surprise me that a couple websurfers have already offered corrections that the AI has incorporated.

    "If the surrounding text has been part of the training it will return a good result"

    Google Translate works on a probability-based model, where it basically looks at a ton of texts and then says, "OK, what's that word most likely to mean in this context?" – it doesn't know anything about grammar, unlike some other MT. This does work fairly well most of the time (the exceptions being conspicuous, just like when a young child embarrassedly realizes s/he's inferred the meaning of a word incorrectly), but words in a context it doesn't recognize can really trip it up.

    [(myl) While it's true that Google Translate is based on statistical analysis of very large quantities of parallel text, it's false that it "doesn't know anything about grammar" — or at least, there's an increasing use of parsing inside the algorithm.]
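
The "most likely meaning in this context" idea discussed in this thread can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the counts are invented, the one-word context window and the back-off rule are simplifications, and none of it is meant to describe how Google Translate actually works. It only shows how a purely count-based chooser can give a sensible answer for a familiar phrase and a surprising one when the determiner and noun do not match anything it has seen together.

```python
from collections import Counter, defaultdict

# Invented toy observations: (previous source word, source word, target word).
# These counts are hypothetical and are not drawn from any real corpus.
OBSERVATIONS = [
    ("dos", "grelos", "grelos"),
    ("dos", "grelos", "grelos"),
    ("de", "grelos", "grelos"),
    ("o", "grelo", "clítoris"),
    ("o", "grelo", "clítoris"),
    ("o", "grelo", "brote"),
]


def train(observations):
    """Count target words per (previous word, word) pair and per word alone."""
    with_context = defaultdict(Counter)
    without_context = defaultdict(Counter)
    for prev, src, tgt in observations:
        with_context[(prev, src)][tgt] += 1
        without_context[src][tgt] += 1
    return with_context, without_context


def translate_word(prev, src, with_context, without_context):
    """Pick the most frequent target, preferring counts that match the context."""
    if (prev, src) in with_context:
        # Seen this word after this exact previous word: use those counts.
        return with_context[(prev, src)].most_common(1)[0][0]
    if src in without_context:
        # Unseen context: back off to counts for the word in any context.
        return without_context[src].most_common(1)[0][0]
    # Unknown word: pass it through unchanged.
    return src


WITH_CONTEXT, WITHOUT_CONTEXT = train(OBSERVATIONS)

if __name__ == "__main__":
    for prev, src in [("dos", "grelos"), ("dos", "grelo"), ("a", "xyz")]:
        print(prev, src, "->", translate_word(prev, src, WITH_CONTEXT, WITHOUT_CONTEXT))
```

With these invented counts, the matched phrase "dos grelos" is resolved from its own context, while the mismatched "dos grelo" has no context-specific counts and falls back to whatever sense of "grelo" is most frequent overall.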
