Google Translate sabotage

Somehow it happened:

 

That was yesterday, Thursday, June 13, 2019.

I just checked GT 5 minutes ago, and it has been fixed now:

Hěn gāoxìng kàn dào Xiānggǎng chéngwéi Zhōngguó de yībùfèn

很高興看到香港成為中國的一部分

I am very happy to see Hong Kong become a part of China.

[h.t. Geoff Wade]



13 Comments

  1. Waldron said,

    June 14, 2019 @ 3:03 pm

    Same thing with Wiki. Only the future is certain. The past is constantly changing.

  2. PeterL said,

    June 14, 2019 @ 3:49 pm

    I'm amused that the tweet was in simplified and you checked in traditional …
    (and, yes, this gives the same result: 很高兴看到香港成为中国的一部分)

    I wonder how this would be written in Cantonese (which doesn't seem to be an option in GT)?

  3. Bathrobe said,

    June 14, 2019 @ 5:33 pm

    I got:

    所以遗憾地看到香港成为中国的一部分。

    Which is correct. Add an 'I'm' at the start and you get:

    我很伤心地看到香港成为中国的一部分。

    Google Translate no longer seems to operate on word-by-word translations (if it ever did). It seems to operate on what it actually finds. Perhaps it found so many affirmations of happiness in Chinese that it decided this was the appropriate translation. Google Translate needs a LOT of tweaks to get it to work properly.

  4. Andrew Usher said,

    June 14, 2019 @ 9:32 pm

    What? If 'happy' is the correct translation, that explanation doesn't make sense. And how is GT 'sabotaged', anyway? I'm pretty sure no one but a Google employee could do that kind of thing on purpose.

    (No politics here. No comment about whether people should be happy, or sad, or utterly indifferent …)

    k_over_hbarc at yahoo.com

  5. Scott said,

    June 15, 2019 @ 1:22 am

    @Andrew U,

    Since Google parses webpages and solicits user corrections to improve its translations, it seems that it could be manipulated by spamming incorrect translations into crawled webpages or by submitting incorrect "corrections" to Translate.

  6. Mark Liberman said,

    June 15, 2019 @ 5:53 am

    As noted here and elsewhere, modern machine translation includes a strong influence from a target-language "language model" based on frequency of sequences in the training material, independent of any source language considerations.

    This is most obvious when the input is a sequence of random characters, but it can result in other strange things. I suspect that this is one of them.

  7. liuyao said,

    June 15, 2019 @ 7:37 am

    Google reading too much sarcasm?

    Serious question: are there instances of correct translation (most natural to natives) that have, at face value, the opposite meaning? It's observed that in a given language, the meaning of a word could evolve to its complete opposite.

  8. Victor Mair said,

    June 15, 2019 @ 7:45 am

    This is not a word evolving into its complete opposite.

    This is a whole complex sentence coming out with the opposite meaning.

    Moreover, the translation is smooth and proper, not a mishmash.

  9. ktschwarz said,

    June 15, 2019 @ 9:41 am

    Machine hallucinations can produce normal-appearing sentences; there are lots of them in the elephant semifics posts here. I've also seen neural-net translations reverse the meaning of a sentence by leaving out a "no" or "not". For example, I asked Google to translate this misnegation:

    Barring no major revelations from the FBI, the Senate could vote on confirming Kavanaugh next weekend.

    from English into French, and got

    Sauf révélation majeure du FBI, le Sénat pourrait voter sur la confirmation de Kavanaugh le week-end prochain.

    which is what the original sentence should have said, but didn't!

    It's hard to fix a hallucination. I wonder what Google did; did they just add in a check for that exact phrase, as (we speculated) Baidu Fanyi did for the embarrassing Taiwan mistranslation?

  10. Michael Watts said,

    June 16, 2019 @ 5:00 am

    I got:

    所以遗憾地看到香港成为中国的一部分。

    Which is correct. Add an 'I'm' at the start and you get:

    我很伤心地看到香港成为中国的一部分。

    …that first attempt isn't correct. 所以 is an entirely different "so".

    I don't think it's possible to interpret the English expression "So sad to see Hong Kong become part of China" as being elided from "So [it's] sad to see Hong Kong become part of China", as that translation implies, rather than "[It's] so sad to see Hong Kong become part of China".

  11. John Swindle said,

    June 16, 2019 @ 10:55 pm

    The one with 所以 is a different kind of translation. Sure, it's from the same process, and it's incorrect too, but it's more analytical than the happy version. The happy version may have been something the machine actually saw somewhere in a context it didn't properly distinguish from the present one. On the other hand, National Public Radio journalists in the USA are increasingly starting their reports with "So" when they hand off between the studio and the field.

  12. Victor Mair said,

    June 19, 2019 @ 2:06 pm

    "Freshman senator and big-tech hawk Josh Hawley is questioning Google about its relationship with China after a weird translation 'mistake'"

    https://www.insider.com/sen-hawley-letter-to-google-ceo-about-china-2019-6

  13. Oop said,

    June 20, 2019 @ 6:15 pm

    "And how is GT 'sabotaged', anyway? I'm pretty sure no one but a Google employee could do that kind of thing on purpose."
    Actually, when one uses Google Translate, means are offered to "improve" the translation, which can then be easily abused. I suspect it wouldn't work after a single case, but repeated "improvements" would probably influence the AI.
    Also, there is a crowdsourcing translation program for volunteers, where you can collect points and climb in levels by evaluating existing translations and translating phrases offered by GT.
