Cantonese input methods

Despite the efforts of the central government to clamp down on and diminish the role of Cantonese in education and in public life generally, the language has been experiencing a heady resurgence, especially in connection with the prolonged Umbrella Movement last fall.

Of course, if you write by hand, you're not constrained by electronic fonts, but can use any special characters you wish.  But if you want to enter Cantonese into electronic devices, then you are subject to various constraints, including your own ability to interface with the software.  From what I've personally witnessed and from what my informants tell me, people rely on one or more of the following methods to input Cantonese, usually in combination:

Tsang-chieh and other shape/component-based systems

Hanyu Pinyin (the official romanization of the PRC)

Jyutping (romanization developed by the Linguistic Society of Hong Kong and favored by linguists, but not official)

Yale (widely used for teaching Cantonese) or other romanization

Touch pads (finger tracing / writing) — often users switch to this only when they have to write a Cantonese character that is not in normal fonts

And, believe it or not, English!  Many of my HK friends use English (words) — at least part of the time — to input Cantonese.  For example, if you type in "umbrella", you can select from saan3 傘, jyu5saan3 雨傘, or ze1 遮.
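The English-keyword method in the last item can be pictured as a simple dictionary lookup. The sketch below is only an illustration, not how any actual input method is built: the `LEXICON` table and the `candidates` function are hypothetical names, and the only entry is the "umbrella" example given above.

```python
# Hypothetical mini-lexicon: English keyword -> list of (Jyutping, characters).
# Only the "umbrella" entry comes from the post; the data layout is an assumption.
LEXICON = {
    "umbrella": [("saan3", "傘"), ("jyu5saan3", "雨傘"), ("ze1", "遮")],
}


def candidates(keyword):
    """Return the candidate list for an English keyword, or [] if it is unknown."""
    return LEXICON.get(keyword.strip().lower(), [])


# Show a numbered candidate menu, as an input method would before the user picks one.
for number, (jyutping, chars) in enumerate(candidates("Umbrella"), start=1):
    print(f"{number}. {chars} ({jyutping})")
```

The user then picks a candidate by number, just as with a romanization-based method.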

"Language notes from Macao and Hong Kong" (6/22/14)

Bob Bauer polled the 14 students in his class at the University of Hong Kong on what input methods they use for Chinese characters.  Here are the results:

Method                                            Computer   Other devices
速成 ("rapid") [modified form of Tsang-chieh]      6          4
倉頡 (Tsang-chieh)                                 3          2
拼音 (pinyin)                                      3          3
筆劃 (strokes)                                     0          8
*iPhone 手指 (finger)                              (1)        1
注音符號 (bopomofo)                                1          0
(relies on scanning)                              1          0

*This student said she input Chinese characters into her computer indirectly: she wrote the text on her iPhone with her finger, and the iPhone automatically emailed it to her computer.

As a friend from Hong Kong puts it:

…people use all kinds of methods (倉頡 [Tsang-chieh]、速成 ["rapid"]、粵語拼音 [romanization for Cantonese]、漢語拼音 [Hanyu Pinyin]、九方 [Q9]).  I guess it depends on the age of the person?  A lot of the pre-1997 generation use touch pads (finger writing), this is because they don't know any other writing systems without exerting a lot of effort in learning.

As another friend put it, for "casual, brief" writing of actual Cantonese (as opposed to "Chinese"), people will often write on a touch pad with a finger, but that method doesn't seem to be much favored for longer and more "formal" (i.e., "Chinese" [zung1man4 中文]) texts.

Up to now, the situation has been fairly messy and complicated, with most people relying on multiple methods, for the following reasons:

1. lack of an official Cantonese romanization that is taught in the schools to all students

2. an abundance of special characters for writing Cantonese that are not in usual fonts (to write Cantonese, you need a thousand or more of them)

3. strong discouragement by teachers and educational authorities of students' writing in Cantonese, hence a lack of familiarity with writing Cantonese and a failure to sanction input methods for Cantonese in schools and universities

4. minimal commitment of software companies to develop input methods designed especially for Cantonese (as opposed to zung1man4 中文 ["written Chinese"])

The good news is that Google recently introduced a powerful method for inputting Cantonese that is succinctly described in this short video.  Even if your spelling is not perfect, the system is "intelligent" enough to guess at what you're trying to type.  I suspect that, with the advent of Google's Cantonese input method, people will be further encouraged to write in Cantonese, since the burden of switching from one imperfect system to another will be obviated.
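To give a rough sense of what guessing from imperfect spelling involves, here is a minimal sketch that uses approximate string matching from Python's standard `difflib` module. It is not Google's algorithm, which is not described here. The tiny `LEXICON`, its toneless Jyutping keys, and the `fuzzy_lookup` function are all assumptions made for illustration.

```python
import difflib

# Hypothetical lexicon: toneless Jyutping spelling -> characters.
# Dropping the tone numbers is an assumption made to keep the sketch small.
LEXICON = {
    "saan": ["傘"],
    "jyusaan": ["雨傘"],
    "ze": ["遮"],
    "zungman": ["中文"],
}


def fuzzy_lookup(typed, cutoff=0.6):
    """Return characters whose spelling is close to what was typed, best match first."""
    close_keys = difflib.get_close_matches(
        typed.lower(), list(LEXICON), n=3, cutoff=cutoff
    )
    results = []
    for key in close_keys:
        results.extend(LEXICON[key])
    return results


# "zungmen" is a misspelling of "zungman", yet 中文 is still offered.
print(fuzzy_lookup("zungmen"))
```

A real input method would presumably also weigh word frequency and context; this sketch ranks candidates by spelling similarity alone.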

For many additional posts on Cantonese and related issues, see here.

[Hat tip James Dew; thanks to Bob Bauer, Mandy Chan, Dehuai Yao, and Norman Leung]


  1. Eidolon said,

    January 20, 2015 @ 7:28 pm

    How popular is Cantonese in mainland China, exactly? I am aware of the fact that it is vibrant in Hong Kong, but Hong Kong is, after all, governed differently than the rest of China. With the retreat of other Sinitic languages, e.g., Shanghainese, especially in the newly developed urban centers, and the influx of people who do not speak Cantonese into Guangdong, I'm not sure whether the Cantonese situation in HK reflects that of other Cantonese regions.

  2. tsts said,

    January 21, 2015 @ 1:05 am

    @eidolon: My (very non-expert) impression from my last trip to Guangzhou and Shenzhen 3 years ago is that Cantonese is doing quite well. Especially in GZ, but I was surprised that even in SZ there was a lot of Cantonese spoken. There were also several cases where people were conversing in both languages, one person talking in Cantonese and the other replying in Mandarin, with few problems.

    My impression was that while there are many newcomers from other parts of China in GZ and SZ, their own kids often learn, and in some cases prefer, Cantonese (in cases where the kids are with them and not back in their home village). I would not be surprised if over time SZ will become more Cantonese. (There is of course also the countervailing trend towards Mandarin nationwide, so hard to tell what will happen, but the influx of newcomers is not the main issue IMO.)

  3. Peter Dirix said,

    January 21, 2015 @ 5:54 am

    Obviously, one could also use speech recognition as an input method. Both Nuance and Google at least have products for Cantonese on mobile devices.

  4. DMT said,

    January 21, 2015 @ 7:09 am

    I was wondering why the video introducing Google's Cantonese input system was recorded in English – but then I found the Cantonese version here.

    The crucial feature of this inputting system is its ability to guess based on "approximate" spellings. Does any pinyin input system have this feature? All the ones I know of seem to require exact input.

  5. Tom said,

    January 21, 2015 @ 10:06 am

    For day-to-day communication, the surging popularity of WeChat and other hybrid text/voice messaging apps makes this problem almost irrelevant. My Cantonese "second family" has a WeChat group in which everyone communicates using a strange mix of Cantonese, Mandarin, English, and emojis!

  6. Victor Mair said,

    January 21, 2015 @ 11:27 am


    Do you think that mixture will develop into some sort of language?

  7. Martin said,

    January 21, 2015 @ 6:07 pm

    That's implementation-dependent and often called fuzzy pinyin or the like. I know that it's available in Microsoft Pinyin in Windows 8 and ibus-pinyin on Linux.


  8. Tom said,

    January 22, 2015 @ 10:11 pm

    @Victor Mair
    I wonder – It's certainly a lively and effective communication tool! I think part of the linguistic mixture might be due to the fact that the family is spread out over Guangzhou, Hong Kong, San Francisco, and Toronto. What's curious to me is how often they communicate in written Sinitic (sometimes Mandarin, sometimes Cantonese) when the voice-messaging option is so simple to use and circumvents the problem of the written language entirely.

  9. Victor Mair said,

    January 24, 2015 @ 12:02 pm


    Just yesterday at lunch I was talking with a graduate student from Fuzhou whose mom is Hakka and dad from somewhere else in the south of China (they both went to university in Amoy / Xiamen) and whose boyfriend was born in America of Taiwanese parents, both of whom are now living in southern Taiwan. Spoken communication between the boyfriend and his parents is a mixture of Taiwanese, Mandarin, and English, but written communication is entirely in English.

    My observation of global families (with home bases in Japan, Korea, Vietnam, China, Taiwan, Singapore, etc.) is that communication is a hodge-podge of languages such as you described in your first comment, but with English becoming increasingly dominant in the mix, especially for written communication.

