Drawing a line in the noodles


The following photograph was found on the internet by Charles Mok and was shared by Rebecca MacKinnon (of the Berkman Center) on Facebook:

Just make sure that you don't slip on the pasta! Seriously, though, what is a traveler supposed to do when instructed to "wait outside rice-flour noodle"?


This is what the Chinese sign really says:

Qǐng zài yī mǐ xiàn wài děnghòu
请在一米线外等候
"Please wait behind the one meter line"

Unlike several recent Chinglish signs that we've examined (e.g. "Difficult to find the translation", "Google me with a fire spoon"), the English in this case was not taken directly from Google Translate, which gives "Waiting outside in a noodle". But Babel Fish gives "Please wait outside rice-flour noodle." Bingo!!

So how did Google Translate and Babel Fish bring noodles into the picture?

The answer is simple: both Google Translate and Babel Fish failed to parse the sentence correctly. They treated yī mǐ xiàn 一米线 as if it were [yī [mǐ xiàn]] "one rice-noodle" instead of [[yī mǐ] xiàn] "one-meter line", where mǐ 米 is an abbreviated transcriptional loan for English "meter", rather than a morpheme meaning "rice".
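To make the ambiguity concrete, here is a minimal sketch that enumerates every way to split 一米线 into words from a tiny lexicon. The lexicon and the function name `segmentations` are invented for illustration; this is not how Google Translate or Babel Fish work internally, only a picture of the choice any segmenter faces.

```python
# Toy lexicon (an assumption for illustration, not a real MT dictionary).
TOY_LEXICON = {
    "一": "one",
    "米": "rice / meter",
    "线": "line",
    "一米": "one meter",
    "米线": "rice noodle",
}


def segmentations(text, lexicon):
    """Return every way to split `text` into consecutive lexicon entries."""
    if not text:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            # Keep this word and segment whatever follows it.
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


for parse in segmentations("一米线", TOY_LEXICON):
    print(" | ".join(parse))
```

With this toy lexicon there are three candidate splits, including both 一 | 米线 (the noodle reading) and 一米 | 线 (the meter reading). Nothing in the lexicon alone says which one is right; that decision has to come from context.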

In overall statistical terms, mǐ 米 = "rice" is no doubt much more common than mǐ 米 = "meter". And it doesn't really help us to note that in Modern Standard Mandarin (MSM), mǐ 米 is a bound morpheme and only means "rice" in specific combinations such as cāomǐ 糙米 ("brown rice"), mǐfàn 米饭 ("[cooked] rice"), dàomǐ 稻米 ("[paddy] rice"), because in this case, the sign does have the sequence mǐxiàn 米线, which really can mean "rice noodles".

How does Baidu Fanyi handle Qǐng zài yī mǐ xiàn wài děnghòu 请在一米线外等候? The output is perfect:
"Please wait behind the 1 metre line."

It's possible that this translation is perfect because exactly the same pair of Chinese and English phrases was found in the training corpus, perhaps more than once. Still, Baidu's treatment of the problematic part of this sentence (taken in different chunks) is:

在一米线外等候 "in line waiting outside" (no coverage of the 一米 in this or the next two iterations)

在一米线外 "in a line outside"

一米线外 "line outside"

米线外 "rice noodle and"

米线 "rice noodle"

At least in this sample, so long as the numeral 一 appears before the 米, Baidu Fanyi does not make the mistake of translating the sequence 米线 in this collocation as "rice noodle". This is probably a safe heuristic, at least in translating signs, but it's not clear how to ensure that this result emerges from a standard statistical MT algorithm. Perhaps a better method would be to adjust the language-model priors to take note of the fact that rice-flour noodles are less likely than one-meter lines in the context of airport crowd-control signage.
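The numeral heuristic described above can be sketched in a few lines. This is a hedged illustration only: the function name `reading_of_mi` and the `NUMERALS` set are hypothetical, and nothing here is claimed to reflect what Baidu Fanyi actually does.

```python
# Characters treated as numerals for this sketch (an illustrative, incomplete set).
NUMERALS = set("一二三四五六七八九十百千万两半几0123456789")


def reading_of_mi(text, index):
    """Guess the reading of the 米 at text[index].

    Returns "meter" when the character directly before it is a numeral,
    and "rice" otherwise. This is a crude rule, not a parser.
    """
    if text[index] != "米":
        raise ValueError("text[index] must be 米")
    if index > 0 and text[index - 1] in NUMERALS:
        return "meter"
    return "rice"


sign = "请在一米线外等候"
print(reading_of_mi(sign, sign.index("米")))  # numeral 一 precedes 米
print(reading_of_mi("米线", 0))               # no numeral precedes 米
```

On the sign's sentence the rule picks "meter", and on bare 米线 it picks "rice", which matches the pattern in the Baidu chunks listed above. A hard-coded rule like this is brittle, though, which is why reweighting the priors for the signage domain may be the more general fix.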

Or, more radically, you could pay someone who knows both Chinese and English to translate your signs.

[A tip of the hat to Joel Dietz]

11 Comments »

  1. Kylopod said,

    August 14, 2011 @ 2:21 pm

    They treated yī mǐ xiàn 一米线 as if it were [yī [mǐ xiàn]] "one rice-noodle" instead of [[yī mǐ] xiàn] "one-meter line", where mǐ 米 is an abbreviated transcriptional loan for English "meter", rather than a morpheme meaning "rice".

    This reminds me of an English-to-English error on Garmin Nuvi GPS, which reads "Delaware Memorial Bridge" as "Delaware Mem Brother," because the abbreviation is "Delaware MEM BR." This error has puzzled me (why would a GPS machine be programmed to read BR as "brother"?), but that's what it says.

  2. Paul Walters said,

    August 14, 2011 @ 2:31 pm

    In Google Translate if one places a space between 米线, then the translation is correct: "please wait one meter outside the line". Perhaps the parsing error is mechanical rather than logical.

  3. John Henry said,

    August 14, 2011 @ 5:22 pm

    Paul Walters:

    do you mean "between 米 and 线"? Anyway, I believe that translation is not correct, as the sign is saying to [wait [outside the [[one meter] line]]], rather than wait one meter outside the line.

  4. Eric TF Bat said,

    August 14, 2011 @ 6:43 pm

    Or, more radically, you could pay someone who knows both Chinese and English to translate your signs.

    Oh, that's crazy talk!

  5. Jake Nelson said,

    August 14, 2011 @ 8:38 pm

    Or, more radically yet, one could use an alphabet or syllabary to represent sounds and reserve logograms for actual meanings…

    I keep feeling that using characters for their sound instead of meaning is destroying the one useful trait of written Chinese.

    [(myl) "Is destroying"? To change this, you'd have to take your time machine back to the periods in which the world's various logographic systems were developed. As I understand it, the "rebus principle" has always been an essential part of turning a limited system of icons into a full writing system.]

  6. ~flow said,

    August 15, 2011 @ 9:17 am

    i want to second the above commentary about the rebus principle being an essential part of the chinese written language; when you look up historical uses of characters you will find uncounted numbers of 通用字, 借用字, 假借字 and so on (it can be difficult to clearly draw lines between these and other concepts; an example for the last is the character 自 which originated as meaning 'nose', but then was borrowed to mean 'self'). that said, i feel that many sentences would be much clearer if they had fewer ambiguous characters in them, characters like 米 meaning both 'rice' and 'meter' or 美 meaning both 'beautiful' and 'america'. to demonstrate this, try to read this sentence:

    请在一咪线外等候。

    already much clearer, no? historically, 咪 has in fact been used for meter, but that sadly fell out of use.

  7. slobone said,

    August 15, 2011 @ 1:39 pm

    Sounds like Chinese needs hyphens…

  8. GQ said,

    August 15, 2011 @ 8:20 pm

    @~flow

    I still see 咪 used for meter in some places in Hong Kong (generally older signs). But in Cantonese 咪 has another meaning, an emphatic imperative "don't".

    The sentence 请在一咪线外等候 sounds quite unnatural to me, it sounds a lot better if they paint the line yellow and say 請在黃綫後等候

    As for 美, I think it is only used on its own (without 國) in headlines, but we all know how difficult headlines can be to read

    @slobone

    Or maybe spaces between semantic units like Korean. In the UK nobody seems to use hyphens even where they are really needed

  9. Jake Nelson said,

    August 15, 2011 @ 11:50 pm

    'Destroying' is perhaps hyperbole. I just get frustrated by the especially opaque nature of rebus uses of logograms, particularly when it renders the characters devoid of interlingual meaning. Given the ongoing debates about utility of, official policy toward, and technical support of logograms going forward, it presents a serious challenge for their proponents. I suppose my background leads me to look at things from perspectives of technological implementation and governmental policies more than the natural linguistic aspects.
    Also, my Chinese acquaintances are not a representative sample- they're mostly of a certain group of Cantonese speakers, and have expressed a sense that more widespread and formal uses of the rebus readings are a form of imposing Mandarin-reading orthodoxy on speakers of other languages that use hanzi. (That might just be the perspective of this small group, and not representative- I wouldn't know.)

  10. The Stupidest Airport Passenger Sign, Oddly Pasta-Based | stupidest.com said,

    August 16, 2011 @ 9:07 am

    [...] language log This entry was posted in stupid mistranslations, stupid signs and tagged chinglish, lost in translation. Bookmark the permalink. ← The Stupidest "Ha ha, I So Funny" Diplomatic "Joke" [...]

  11. michaelyus said,

    October 19, 2011 @ 6:41 pm

    米 doesn't have to be a bound morpheme to be referring to uncooked rice. E.g. 快点把米端来,水要潽出来了!Quickly bring the rice here, the water's about to boil over!

    I would doubt that it's an abbreviation in this case, short for 稻米 or something, as 米 itself is really the basic term.
