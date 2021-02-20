

Ben Hull writes:

In our Computational Linguistics class we were discussing different methods of segmenting Chinese character texts. Today I came across a terrific example of the problems of segmenting left to right, in the first sentence of the attached image. I hope you find it as amusing as I did.

The story is about Ted Cruz going on vacation to warm Cancun when his state was freezing from the Big Chill and suffering from rolling blackouts.

The title in Chinese characters — without spaces between words — is:

度假哥有大麻煩

If you convert that to Pinyin, you need to put spaces between words, which leaves you with two options:

1. "Dùjià gē yǒu dà máfan" ("Vacationing brother has big trouble")

2. "Dùjià gē yǒu dàmá fán" ("Vacationing brother has marijuana trouble")

One of the many good things about Pinyin is that it forces the writer and reader to be clear about what is being said.
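
For readers curious how a left-to-right segmenter might stumble here, the following is a minimal sketch of greedy forward maximum matching, one common left-to-right method (the post does not say which method Ben's class was discussing). The lexicon is a toy assumption made up for this example, and the function names `forward_max_match` and `all_segmentations` are hypothetical, not taken from any library.

```python
# Toy lexicon: an assumption for illustration, not a real dictionary.
LEXICON = {"度假", "哥", "有", "大", "大麻", "麻煩", "煩"}


def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest lexicon word that starts there.
    An unknown single character is emitted on its own as a fallback.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def all_segmentations(text, lexicon):
    """Enumerate every way to split text entirely into lexicon words."""
    if not text:
        return [[]]
    results = []
    for size in range(1, len(text) + 1):
        head = text[:size]
        if head in lexicon:
            for rest in all_segmentations(text[size:], lexicon):
                results.append([head] + rest)
    return results


title = "度假哥有大麻煩"

# The greedy pass grabs 大麻 ("marijuana") before it ever sees 麻煩 ("trouble").
print(" / ".join(forward_max_match(title, LEXICON)))

# Exhaustive enumeration shows both readings are available in this lexicon.
for segmentation in all_segmentations(title, LEXICON):
    print(" / ".join(segmentation))
```

With this toy lexicon, the greedy pass commits to 大麻 as soon as it reaches 大, leaving 煩 stranded, which is the "marijuana trouble" reading. Enumerating all segmentations recovers the intended 大 / 麻煩 as well, but choosing between the two takes something more than the lexicon, such as word frequencies or context.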
