Ask Language Log: Looking up hanzi for ignoramuses
From Mark Meckes:
I'm a regular Language Log reader, completely ignorant of Chinese languages. I was just wondering whether there exist worthwhile online tools to help someone like me figure out the meaning of something written only in hanzi. (The question is occasioned by my looking at a package of tea given to me by a Chinese student; the writing on the package is mostly hanzi, with a little English and no pinyin.) I'm perfectly competent to use Google Translate and similar tools (and know how much skepticism to approach the results with) for the last stage of the process. But starting from written hanzi on a physical object, I first need some way to translate that image into either pinyin, Unicode, English, or something equivalent to one of the above — and something that relies on no knowledge of the meaning or pronunciation of the characters, or knowledge of the structure of Chinese characters in general. Do you have any suggestions?
You're right. For someone who is not already familiar with characters, there's no easy way to look up one that you don't know in a dictionary. For those who have a basic understanding of how characters are constructed, there are many different approaches — total stroke count, radical plus residual strokes, shapes of the individual strokes, guessing at the sound if there's a more or less obvious phonophore, etc. — but none of these are reliable or easy, even for someone who already knows a lot of characters. In fact, looking up characters is so excruciatingly difficult and frustrating that most people don't even bother (they just guess at or skip the ones they don't know). For compulsive, responsible Sinologists and China specialists, looking up unknown characters is one of the banes of their life.
"Sinological suffering" (3/31/17)
With the advent of electronic devices for processing hanzi / kanji / hanja, the situation has greatly improved. I have seen people who are literate in Chinese write the shape of an unknown character on the glass of their iPhone, Android, iPad or other tablet, etc. and find a dictionary entry that way. However, judging from the fact that even such literate folks often fail to locate the character they're after using this method, I wouldn't recommend it for the neophyte, who would be clueless about what actually counts as a stroke, stroke direction, stroke order, etc. — all of which are crucial for identifying a particular character.
A better bet would be to install an app that will let you scan / photograph the character in question and have your device find the specific character you're after. I'm confident that other Language Log readers will be able to tell you which programs have these capabilities. One that I know of off the top of my head is Pleco, an extremely powerful and versatile tool for anyone interested in Chinese languages, from complete tyros to the most advanced Chinawallahs. One reason I know about Pleco is that the ABC Chinese dictionaries, of which I am the series editor, from University of Hawaii Press, are a core component of this software.
And Pleco offers much more, including the brand new (as of November 21, 2017) sine qua non vade mecum for research on Chinese history and culture, that monument of erudition, Chinese History: A New Manual (5th Edition), by Endymion Wilkinson.
Mark, if the writing on your package of tea is typeset, electronic aids such as those described above should work pretty well. But just treat the parts that are calligraphic as decoration. The machines most likely won't be able to read them.
Bruce said,
November 29, 2017 @ 1:55 pm
As an example of what the OCR mentioned above can do on a smartphone: in Google Translate, you can select translation from Chinese to English and then select the Photo icon for input.
The Translate app works in one of two ways: (1) by sending data to Google to translate, or (2) by using a local database. The second option takes about 200 MB of space.
Assuming the second option is enabled, and after the photo input has been selected but before a photo has been taken, if you point the camera lens at Chinese text, it will show you preliminary translations of what it finds, even before you take the picture.
After the picture is taken, you need to select the Chinese text and exclude everything else (English and nonverbal marks). For instance, I asked it to translate an old bilingual Chinese/English sign on my front door saying "Please close the door", which also carries decorative marks that are not Chinese characters.
This is by far the fastest way to translate Chinese characters more or less instantly.
Philip Taylor said,
November 29, 2017 @ 6:31 pm
Is there any equivalent to Pleco for those of us who still (by choice) inhabit the dark ages and have powerful PCs and decent digital SLR cameras, but no smartphones, tablets, touch-sensitive monitors or similar devices?
Dave Cragin said,
November 29, 2017 @ 7:11 pm
To follow on Bruce's comments: I've found Google Translate works even for somewhat complex situations, e.g., Chinese subtitles in movies. When I'm watching a movie and I can't pick out the words they are saying, I turn on Chinese subtitles and take a picture.
Then I have Google Translate scan the picture of the subtitles. Normally, it is successful, even when the characters blend in somewhat with the background of the movie.
Mark Meckes said,
November 29, 2017 @ 9:15 pm
To add some more context to my original question: I personally inhabit the same dark ages mentioned by Philip Taylor. And as far as I can see, the non-mobile version of Google Translate doesn't have the photo input feature.
Dave Cragin said,
November 29, 2017 @ 11:13 pm
My iPhone is mainly a Chinese language learning device (and it also happens to make phone calls). On it, I have 2 dictionaries, 2 translators and WeChat, which has countless messages to look at. Technology makes learning characters a much more enjoyable process.
Using a paper dictionary for characters is a painful (and often fruitless) process, particularly for a new learner (as noted by Victor). In addition, even if you find the right character, you still need to know whether it should be read by itself or combined with the characters next to it, and you generally only know this if you know lots of Chinese.
If I didn't have my phone to scan characters, I wouldn't even try to use a paper dictionary. I learned early on that it's too time consuming with too little return.
(Last year, the Wall Street Journal noted that 1 in 7 Americans still used a flip phone and that sales of them were actually increasing, I believe by a few million units over the previous year.)
Keith said,
November 30, 2017 @ 2:52 am
@Dave Cragin
Wall St J noted that 1 in 7 Americans still used a flip phone and sales of them actually were increasing
Those phones are cheap and have no GPS. I believe that a phone like that is known as a "burner".
Mark Meckes said,
November 30, 2017 @ 10:28 am
Rebecca Starr, in email, pointed out both the photo input option for Google Translate on smartphones, and the handwriting input option which works on desktops. (In fact, the writing on my tea package is so small and on such a dark and crinkly background, that I have my doubts that the photo input would work well, if at all.) It's a bit time-consuming to trace out unfamiliar characters with a mouse or trackpad, but on the other hand the interface doesn't care about things like stroke order — it appears to be applying OCR to the image you create.
So far I've successfully entered and translated 友情提示.
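For readers wondering what "Unicode" amounts to once characters like these have been entered, here is a minimal Python sketch. It is only an illustration, not part of any tool mentioned in this thread: the helper name `describe` is hypothetical, and it relies solely on Python's standard `unicodedata` module to list each character's code point and official Unicode name.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character in text.

    `describe` is a hypothetical helper written for this illustration.
    """
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


# The four characters entered via handwriting input in the comment above.
for ch, code_point, name in describe("友情提示"):
    print(ch, code_point, name)
```

A code point found this way can be pasted into an online dictionary or a Unicode character database to look up a single character, though, as noted elsewhere in the thread, it says nothing about whether the character should be read alone or as part of a compound.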
Markonsea said,
November 30, 2017 @ 1:26 pm
Apple's App Store seems to have no shortage of free apps which will identify text in an image, run OCR and translate it.
The one I went for has the original name of Scanner & Translator, and has worked well on both Chinese and Japanese. (Takes pix from your Gallery, so it matters not whether they're camera-originated or downloads.) Not always 100%, but how many of us are?
Markonsea said,
November 30, 2017 @ 1:32 pm
Mind you, I prefer the challenge of working through the radicals in the Pinyin Dictionary; but when this defeats me, as so often happens …