Furigana-like glossing in Mandarin

On Language Log, we have often touched upon the use of furigana ruby to gloss kanji (Chinese characters) for various purposes, most recently in the comments to "Roman-letter Mandarin pronoun of indeterminate gender" (8/9/16).

Indeed, the use of furigana in Japanese is a truly interesting mechanism for presenting layers of meaning of a word or character. So far as I know, this usage only occurs for kanji phrases. The meaning originally signified by the character(s) serves as one layer, and the furigana, be it hiragana or katakana, adds another layer of meaning. Although this particular use of furigana normally appears only in informal writings, it often serves as tongue-in-cheek wordplay that adds subtlety and nuance to the composition.

Current Mandarin is also influenced by, or may be said to have adopted / adapted, this use of in-text character glossing à la furigana. While ruby annotation is not yet typographically widespread for Chinese, it is coming, especially for phonetic glosses, and occasionally for explication, irony, and so forth. Meanwhile, other devices have emerged that permit Chinese writers to achieve a similar effect.

There is a saying, “xiě zuò 写作…, dú zuò 读作…” ("write as…, read as…"), which basically refers to this mechanism for conveying double meaning.

Here's an example:

xiě zuò shēnshì, dú zuò biàntài 写作绅士,读作变态 ("write as 'gentleman', read as 'abnormal'")

Or in a different form, but with the same meaning:

shēn (biàn) shì (tài) 绅(变)士(态) (lit., character by character, "gentry [transformed] scholar [condition]")

If the same double meaning were expressed in ruby, it would look something like this:

biàntài 变态  ("abnormal")
shēnshì 绅士 ("gentleman")

Of course, the biàntài 变态 on top would be much smaller than the shēnshì 绅士 on the bottom (I've tried to suggest that differential effect with the bold and italics).
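
For readers who publish on the web, the parenthetical form can be converted mechanically into HTML ruby markup, which browsers render with the small gloss above the base text. What follows is only a minimal sketch in Python, not a description of how any Chinese site actually does it. It assumes the input consists solely of single base characters, each followed by a parenthesized gloss; the function name `parenthetical_to_ruby` is hypothetical. The `<ruby>`, `<rt>`, and `<rp>` elements are standard HTML, and `<rp>` supplies fallback parentheses where ruby is not supported.

```python
import re
from html import escape

# One base character followed by a parenthesized gloss, e.g. 绅(变).
PAIR = re.compile(r"(.)\(([^()]+)\)")


def parenthetical_to_ruby(text: str) -> str:
    """Merge consecutive base(gloss) pairs into one HTML <ruby> element.

    Assumption: `text` contains nothing but base(gloss) pairs.
    """
    pairs = PAIR.findall(text)
    base = "".join(b for b, _ in pairs)    # e.g. 绅士
    gloss = "".join(g for _, g in pairs)   # e.g. 变态
    return (
        f"<ruby>{escape(base)}<rp>(</rp>"
        f"<rt>{escape(gloss)}</rt><rp>)</rp></ruby>"
    )


print(parenthetical_to_ruby("绅(变)士(态)"))
# <ruby>绅士<rp>(</rp><rt>变态</rt><rp>)</rp></ruby>
```

Joining the pairs before emitting markup yields one word-level annotation (变态 over 绅士) rather than a separate gloss on each character.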

The above example exists in both Mandarin and Japanese, and I believe this 绅士/变态 double meaning originated from the otaku* culture of Japan, where it would be shinshi 紳士 / hentai 変態 ("gentleman / abnormal").

*For otaku, see "Nerd, geek, PK: Creeping Romanization (and Englishization), part 2" (3/5/13).

Other examples of double layered glossing in Mandarin include xǐ (sàng) wén (xīn) lè (bìng) jiàn (kuáng) 喜(丧)闻(心)乐(病)见(狂) (lit., character by character:  "happy loss hear heart glad sick see mad"), which is an expression now very popular among the younger generations.  What it really boils down to is that they're saying, superficially one is "xǐwénlèjiàn 喜闻乐见" ("happy to hear, glad to see [somebody / something / you]"), but inside, deep down, one is "sàngxīnbìngkuáng 丧心病狂" ("distracted and frenzied").

I feel that in the Chinese version of this kind of double-layered expression, the second (added, glossed) layer is often the more emphasized one and carries a sense of being the hidden, "true" meaning of the overall, combined expression.

We have occasionally touched upon this method of parenthetical glossing in Chinese on Language Log before and have also described the use of "English as ruby annotation for Chinese" (11/16/14).

It's interesting that Chinese characters, which are supposedly semantically pregnant, so readily invite additional layers of meaning, some of which are quite at odds with the original signification.

[Thanks to Tianran Hang]


  1. leoboiko said,

    August 11, 2016 @ 7:51 am

    It's interesting that this is spreading to Chinese. The examples with parentheses, in particular, seem to be trying to hint at the same effect of "simultaneity" of furigana glosses – indeed, parentheses, as argued by Nunberg in Linguistics of Punctuation, are a tool used to interweave two alternative texts, which is why, in Japan, they're the fallback of choice when one cannot type furigana. The only difference with the above notation is that they parenthesis-gloss word-by-word, not character-by-character.

    > So far as I know, this usage only occurs for kanji phrases.

    While kana-over-kanji is the prototype structure for gloss-play, and definitely the dominant usage, the technique does happen with other scripts in the base position. Particularly common is the use of a single Latin capital initialism, typeset as a "double-width" or "full-width" character, as an abbreviation for some foreign word, with the furigana spelling its pronunciation in full; for example, "Jump" (the name of the famous magazine) written as J(ジャンプ), with the katakana above the letter. I figure this is less a literary than a typographic technique; by setting a single letter in a standard kanji square, one avoids the awkwardness of mixing top-down, regular-spaced kanji with left-right, variable-spaced Latin letters.

    I don't have examples at hand, but I've also seen kana-over-kana, kana-over-numbers, kana-over-punctuation (%(パーセント)), and even kanji-over-kana. Except in the case of alphabetic abbreviations, the common point seems to be that the furigana position stands for the surface-level word to be pronounced, and the base position stands for its deeper meaning or implication. An important subset are deictics; e.g. チョムスキー派(奴ら), "those guys" as a gloss over the word "Chomskians", for a context where you say “those guys” referring to Chomskians.

    What I find interesting about this technique is that it's non-linear. Aesthetically, it feels different than saying one word, then adding a parenthetical like "by which I mean X" (even if it, arguably, functionally means the same thing, the subjective impression is different). Somehow there's a feeling of simultaneity, which I think arises from the fact that a skilled Japanese reader is conditioned to grasp "sound" from gloss and "meaning" from base simultaneously (pure conjecture: could this be related to the "dual-route" model of writing in neuroscience?). The fact that non-linear expression is impossible to do in speech is, in my opinion, another point in favor of Vachek and Halliday's proposals about writing being able to access language in its own particular ways; or for Nunberg's (op.cit.) brilliant formulation of writing as "an application of the principles of language". In this case, it's an application of the paradigmatic axis, taking advantage of writing's bidimensional medium.

  2. unekdoud said,

    August 11, 2016 @ 8:53 am

    The one thing I find special about this usage is that I've never seen these parenthesized expressions used for phonetic glossing in Mandarin, only for the "double meaning" sense. I'm certain that this would also confuse readers who are entirely unfamiliar with furigana.

    The double meanings given in the examples above are different, even opposing, in a way that doesn't commonly occur in English, even the online slang. There are some variants on parenthesized expressions and "accidental" self-corrections, but no compact "this is a white lie" signifier, and there is a certain sarcastic tone that would get lost in translation.

  3. leoboiko said,

    August 11, 2016 @ 9:02 am

    @unekdoud: Tongue-in-cheek self-corrections can get close; one comparable resource in informal English (and other languages) is the joking use of strike-through – "Trump told his supporters to murder stop Hillary…"

  4. liuyao said,

    August 11, 2016 @ 10:21 am

    Victor would be happy to know that the parenthetical words often appear in pinyin (without tone marks), which makes you stop and spell it out to see what the second layer of meaning is. (I find it a little annoying though.)

  5. Victor Mair said,

    August 11, 2016 @ 11:02 am

    Yes, I've seen those parenthetical pinyin glosses, which are not meant as pronunciation annotations, but to convey meanings. They don't slow me down any more than parenthetical character glosses do, and I don't need tone marks either. I've written about how reading pinyin texts is as easy as reading character texts for me, probably easier. I suppose this is because of my philological and linguistic training, and my entire orientation to Sinitic languages, which is based more on sounds than on symbols. (There's an old Karlgren book about that.)

    By chance, during the last couple of weeks, I had already drafted two posts on this very issue of reading pinyin texts. I'll probably put them up within the next few days.

    Meanwhile, I am indeed happy that liuyao pointed out this usage of pinyin for parenthetical glosses. Yet another instance of emerging digraphia.

  6. Michael Watts said,

    August 11, 2016 @ 12:48 pm

    Or in a different form, but with the same meaning:

    shēn (biàn) shì (tài) 绅(变)士(态) (lit., character by character, "gentry [transformed] scholar [condition]")

    Where I see this kind of thing, it takes this form:


    That is, you have characters with readings provided in pinyin, and no characters are provided for the readings. I don't know how common it is to include tone markings.

  7. Eric said,

    August 11, 2016 @ 4:17 pm

    "Although this particular use of furigana normally appears only in informal writings"

    Not sure if fiction is excluded here, but I distinctly remember the Japanese translations of Harry Potter write "unicorn" as 一角 with the furigana ユニコーン yunikoon, for whatever that's worth. In this case, my admittedly non-native impression was that Japanese people know the creature in general as a ユニコーン (after all, the canonical reading of 一角 is いっかく ikkaku), but a surface Sino-Japanese spelling was used to provide the impression the creature wasn't "foreign" to the characters or the world they inhabit. I don't know if this is considered the same kind of furigana use discussed before.

  8. David Marjanović said,

    August 11, 2016 @ 5:07 pm

    the joking use of strike-through –

    When such HTML is unavailable, there's a certain tradition of spelling out backspace: murder^H^H^H^H^H^H stop

  9. Michael Rank said,

    August 11, 2016 @ 5:22 pm

    What’s the origin of the expression “ruby (annotation)”? Why ruby? Is this used by anyone apart from encoders and typographers (mainly dealing with Japanese)? Is it quite new? I have not seen it anywhere except on LL.

  10. Rubrick said,

    August 11, 2016 @ 5:23 pm

    As a word-puzzler and wit, I envy this aspect of Japanese and other Sinitic languages. I feel a bit like a painter who doesn't have access to a whole subrange of colors.

  11. leoboiko said,

    August 11, 2016 @ 5:56 pm

    @Michael: It comes from a kind of typographer's jargon, in English, where each point size is named after a precious stone. Ruby is 5.5—quite small—and in Japanese it became a common word for the small-type glosses (not necessarily 5.5). It's at least as widespread as "furigana", if not more.

  12. Michael Rank said,

    August 11, 2016 @ 6:12 pm

    Oxford English Dictionary says under ruby: 7. Printing. A size of type (equal to 5½ points) larger than pearl and smaller than nonpareil. Cf. agate n. 5. _Now chiefly hist._
    I have added the _ _. It makes no mention of the highly technical usage in connection with Japanese kana, etc. If you hover over “Cf. agate n. 5.” it says agate is (also) “larger than pearl and smaller than nonpareil (in Britain called ruby…”

  13. leoboiko said,

    August 11, 2016 @ 6:37 pm

    The word "Ruby" ルビ is widely used in the present day to mean "typographic gloss" in Japanese, not in English, just like the word "my-car" means "personal car" in Japanese. Just because these Japanese words came from English doesn't mean that they're English. You'll find this sense of "ruby" in any monolingual Japanese dictionary.

  14. JS said,

    August 12, 2016 @ 12:53 am

    The comparison to strike-through is great!

    The Chinese joke annotations indeed reflect real annotative practices that employ Romanization or, originally, more familiar characters that write homophonous words/syllables. The practice may well (also) derive from the Japanese practice, but I would want to see some concrete indications.

    I had earlier found the Japanese stuff "good fun" but admit to not having taken it terrifically seriously before. But on reflection, one way to think about it is as a deconstruction of the written word as sign: the two poles of "written symbol" and "spoken word", within a standardized script so seamlessly unified, are de-coupled in the context of what are in essence ad hoc proposals for entirely new written words; e.g., Ruby ユニコーン over 一角 constitutes a spontaneous performative proposal for writing the word "yunikoon" with the symbol(s) . But as leoboiko in effect points out towards the end of his first comment, the reader brings along "baggage" in the form of the conventional mapping(s) of the involved written symbol(s) to other spoken word(s) (here, 'section', etc.?), baggage that is not so easy to put down.

    In this light it is possible to draw connections to more familiar phenomena, like alternate spellings: and are alternate ways to write the same word (/ˈdouˌnət/ 'sweet dough ring'), but perhaps the presence of makes reading a measurably different cognitive experience? Etc.

    This is all interesting to me because the single greatest obstacle to real progress in the study of the early Chinese script remains the failure of many scholars to appreciate the ontological separateness of word and glyph. Explanation of this simple but crucial point should for me only require saying things like "well, the question is what word this glyph was first invented to write…" or "this glyph was apparently used to write several different but etymologically related words…", etc., but it is precisely this sort of thing that, given the nature of the script, Chinese does not yet provide good terms for saying.

  15. JS said,

    August 12, 2016 @ 12:56 am

    oops, I used brackets.
    par. 3 l. 8 …with the symbol(s) "一角"…
    par. 4 l. 3…perhaps the presence of "dough"…

  16. JS said,

    August 12, 2016 @ 12:57 am

    "dough" and "doughnut", etc. You get the point.

  17. Michael Watts said,

    August 12, 2016 @ 1:34 am

    As a word-puzzler and wit, I envy this aspect of Japanese and other Sinitic languages. I feel a bit like a painter who doesn't have access to a whole subrange of colors.

    We've got plenty of ways to achieve the same effect. Strikethrough and ^H have been mentioned, but you could also take the approach seen in Mike's panel 4 response here.

  18. Michael Rank said,

    August 14, 2016 @ 9:01 am

    Ruby would be useful in English to give the pronunciation of British Olympic diving gold medallist (Jack Laugher) (/lɔː/, according to Wikipedia). Who wudda guessed it?

  19. Rodger C said,

    August 14, 2016 @ 12:00 pm

    @Michael Rank: Apparently Laugher's name should be pronounced the same as "lore," however one pronounces "lore."

  20. Alyssa said,

    August 14, 2016 @ 9:41 pm

    The "Chomskians" example is really interesting to me, because compared with parentheticals in English (and, if I'm understanding correctly, the gentleman/abnormal example in Chinese), it's flipped around – the word that is *meant* is the one that is written as part of the main sentence, and what is *said* is added as a supplement. In English you would do the opposite: you'd write "those guys (Chomskians)", not "Chomskians (those guys)".

  21. leoboiko said,

    August 15, 2016 @ 8:28 am

    @Alyssa: that's because the comparison with parentheses is limited. The parentheses in my comments are a typographical fallback, because we can't type ruby-style glosses here. Please imagine that the word in parentheses is typeset above the immediately preceding word.

    In Japanese practice, it's not quite the case that the gloss is a supplement to what's being said. It's not an alternative text, the way parentheses are. Rather, it's usually a reading aid for the very same word (an aid especially useful in texts for young audiences). Imagine annotating English spelling with its pronunciation, for the benefit of foreign learners (something I've missed dearly). The somewhat artificial sentence "I see one eye which was won by the Witch of Greenwich" might be annotated thus:

    (Sorry if I got something wrong, and again, I wish someone would publish books like this!) Notice how the glosses follow the sound, while the spelling varies with meaning? And still, each base/gloss pair represents the very same word? Japanese writing is like this, except that English spelling is related to meaning only incidentally, while in Japanese the relationship is systematic; and that literally everyone learns to read through texts glossed in this way.

    Now the large majority of Japanese glosses are of the above kind; but sometimes a writer will use distinct words in the base and the gloss, as a stylistic tool. This is unusual enough to be felt as clever, though it's not exactly rare, and not at all limited to informal texts. By now it should be clear that base/gloss don't code for "utterance/supplement" but for "meaning/pronunciation", implying that whatever's in the base is the reference, implication, or suggestion of what's actually said on the surface, which goes in the gloss. Therefore, if I say "those guys" this goes in the gloss, and if by that I mean Chomskians, that goes in the base.

    @Michael Watts: For the same reason, I maintain that sequential resources like strike-through can reproduce e.g. the kind of humor with parentheses we saw in the Chinese examples; but not the aesthetic effect of true gloss-play, the parallel "reading with subtitles" sensation (thanks David Eddyshaw for the spot-on metaphor), which needs both digraphia and a habit of gloss-reading to reach its full effect.

  22. leoboiko said,

    August 15, 2016 @ 8:30 am

    Er, it seems we can't insert images either. Just click the word "thus" to see the example, or click here if your system can't load that image format.

    I add that I don't quite get the same parallel sensation from the English-plus-IPA example as I do with Japanese kanji-plus-kana; possibly due to not being used to it.

  23. Alyssa said,

    August 16, 2016 @ 12:02 am

    leoboiko: Sorry, I think I was unclear – when I said "supplement", I meant *visually*. The gloss is a supplement, visually speaking – if you removed it, the base would look like a normal word. The gloss is something *added* to the base.

    So, starting from the understanding that with furigana base/gloss imply meaning/pronunciation, what I found remarkable is that in the Chinese example above, this appears to be reversed – the base is the pronunciation (gentleman), and the gloss is the meaning (abnormal). This is analogous to how parentheticals are used in English to convey the same type of irony – the "base" of the sentence is what is implied to have been pronounced, and the "gloss" (the part in parentheses) is what is meant. Neither system is more logical than the other, but it does mean that as an English speaker, the Japanese style sometimes feels upside down.

