Language Log

Mix and match Japanese orthography

April 17, 2024 @ 9:24 am · Filed by Victor Mair under Gender, Usage, Variation, Writing systems

Most Language Log readers are aware that the Japanese writing system consists of three major components: kanji (sinoglyphs), hiragana (cursive syllabary), and katakana (block syllabary). I would argue that rōmaji (roman letters) are a fourth component, as they are in the Chinese writing system.

How do people decide when to switch among the different components of the Japanese writing system? Of course, custom and usage determine when to use one and when to use another. (It's a bit like masculine, feminine, and neuter in gender-based languages [a frequent and recent topic on Language Log]: you don't ask why, you just do it.) In most cases, convention has fixed which of the three main components of the writing system is used for a particular purpose. On the other hand, since I began learning Japanese half a century ago, I have noticed a fairly conspicuous slippage regarding what I had been led to believe were predetermined practices.

Sanae Heist, a senior studying linguistics at Columbia University, brought this whole matter to the surface when she wrote to me as follows:

I understand you study Chinese, but I was wondering if you happened to know any information about the contexts in which Japanese people switch the writing systems, i.e. hiragana or katakana for words that would traditionally be written in kanji or katakana for words typically in hiragana. I have caught my mother (Japanese), myself (half Japanese heritage speaker), and some Japanese friends doing this, and although I've yet to grasp the exact context for when this occurs (I usually just mimic how the word was written by the other person), I am curious to know whether there's a larger pragmatic role than just for emphasis or for convenience.

Interesting questions.

My observations are that there is a lot of variability in the way people mix and match hiragana, katakana, and kanji.

Here's a response from a specialist on Japanese orthography, J. Marshall Unger, emeritus professor at Ohio State University:

After 1945, the Ministry of Education established rules about kanji and kana to be taught in school. Briefly, only approved kanji and their approved readings* should be used (some exceptions for kanji in personal names had to be permitted); only simplified kanji were approved (again, some exceptions had to be tolerated); hiragana were to be the default syllabary for Japanese words not written with kanji; katakana were to be used for non-Japanese words. One was not to use the spelling of long vowels in hiragana when using katakana, nor vice versa.

*This subsumes cases of a kanji followed by okurigana; e.g. 分かる was specified for wakaru, but one finds 分る for the same word in pre-war writing.

Prior to 1945, there was much greater variation in how people wrote. I recall that there's a novel by Tanizaki in which each chapter is supposed to be the diary entry of a husband or his wife: he uses katakana as the default and rather formal language; she uses hiragana and a more vernacular style.

I don't think what the student is asking about goes much beyond words such as taihen, which can be written either 大変 or たいへん, unless (s)he is including the wild variations one sees in manga balloons, etc. As the example of taihen shows, common SJ words may be written in hiragana; obscure SJ words that use kanji not on the approved list have to be written, in whole or in part, using hiragana, or else be replaced with synonyms. Of course, when the LDP introduced the joyo kanji, the list of approved kanji was demoted to the status of a 'guide' (meyasu): in theory, any damned kanji can now be used, and word-processor makers have been adding kanji of extremely limited utility to their inventories ever since. Now that people often rely on conversion routines to produce text rather than writing by hand, all kinds of weird errors show up in what one sees.

It sounds like the rules are not written in stone, that there's a lot of flexibility and informality in how hiragana, katakana, and kanji usage plays out.

And here's a response from Nathan Hopson (formerly at Nagoya University), who teaches at the University of Bergen in Norway:

I've actually had this conversation with a colleague in Japanese linguistics. She thinks it's a viable research topic, i.e., that it's not fully explained as a phenomenon.

There are some cases of words written in all three scripts in common usage:

癌　がん　ガン (cancer)

There is a different feel to each. I can't confidently articulate a full explanation for all three, but がん is definitely trying to soften the blow or make the whole thing less terrifying (e.g. がん保険, gan hoken for insurance).

拉麺　ラーメン　らーめん (ramen)

ガン is also an example of katakana being used for a "native" Japanese word because long strings of hiragana can be hard to parse — we expect them to be grammatical markers such as verb conjugations, etc.:

ごみ　ゴミ (waste/trash/garbage)

Another type would be animal names, where the katakana is sometimes acting, explicitly or implicitly, as the Latin scientific name and sometimes as ゴミ above:

犬　いぬ　イヌ (where the last can mean Canis familiaris)

人　ひと　ヒト (where the last can mean Homo sapiens)

All this is further complicated by the choice to add the annotative glosses called ルビ (rubi, from the 5.5-pt font size used for interlinear annotations). The most common type is furigana, which gloss the readings of kanji. They are often seen with characters beyond the 2000-ish basic kanji or with proper nouns, etc., for which the reading is obscure. That obviously depends a great deal on audience. But you can also use, for instance, the readings of synonymous kanji (or even antonyms, etc.) to add what I guess we'd call "color" or "flavor": glossing 迷宮 (meikyū) with ラビリンス might mean "The Labyrinth" instead of "labyrinth," while 胎児 (taiji) with こども makes a fetus an (unborn) child.*

* h/t Aya Homei for those two examples

If I want you to read 日本 as ニッポン instead of にほん, I could either do that by just writing out the katakana or glossing the kanji.

In any case, in the majority of instances, orthographic choice reflects some assumptions about audience (what do they know, expect, etc?). There are conventions, which we use without a lot of intentionality, but the thing about conventions is, breaking them deliberately is fun. It's part of the word play we all engage in, whatever the language.

After I shared these observations with Sanae, she replied:

Thank you so much for such a generous response and for extending my question to others you know. I get the general consensus that the orthography variations are an individual stylistic decision, which I agree is a feature of all languages (perhaps not in orthography, but the general concept of variations and flexibility in language). This just happened to catch my eye in recent months so I really appreciate the extra background knowledge and examples that were provided.

As one can readily see, Sanae's questions about variation in Japanese orthography lead to larger issues of deviation from standards (not standard deviations!) in language in general.

Selected readings

"Katakana nightmare" (6/20/19)
"The esthetics of East Asian writing" (4/7/12)
"Ye Olde English katakana" (8/11/14)
"More katakana, fewer kanji" (4/4/16)
"Kanji as commodity" (4/30/18)
"The economics of Chinese character usage" (9/2/11)
Mark Hansell, "The Sino-Alphabet: The Assimilation of Roman Letters into the Chinese Writing System," Sino-Platonic Papers, 45 (May, 1994), 1-28 (pdf)
Helena Riha, "Lettered Words in Chinese: Roman Letters as Morpheme-syllables" (pdf)
"Zhao C: a Man Who Lost His Name" (2/27/09)
"Creeping Romanization in Chinese, part 3" (11/25/18)
"The actuality of emerging digraphia" (3/10/19)
"Sememic spelling" (3/27/19)
"Polyscriptal Taiwanese" (7/24/10)
"The Roman Alphabet in Cantonese" (3/23/11)
"Love those letters" (11/3/18)
"Acronyms in China" (11/2/19)
"Ask Language Log: The alphabet in China" (11/6/19)

12 Comments

Chris Button said,

April 17, 2024 @ 12:07 pm

the katakana is sometimes acting, explicitly or implicitly, as the Latin scientific name

I've always wondered why animal names were often written in katakana! Now I know!
Jason said,

April 17, 2024 @ 7:56 pm

I'm sorry, so in the case of 犬 written as イヌ ("inu") there is a convention that this should sometimes be read as the Latin "Canis familiaris"?

That's completely insane. For Latin names, why not use romaji, i.e. *Latin letters*, for this purpose? It's not like there's any reluctance to use Roman letters for all kinds of other purposes in Japan.
John J Chew said,

April 17, 2024 @ 8:29 pm

If you're talking about your pet dog, it's an 犬 (or if you're in kindergarten, いぬ). If you're talking scientifically about the family Canidae, that's イヌ科, the genus Canis イヌ属, or the domestic dog Canis lupus familiaris, イエイヌ or イヌ　for short. If you're talking about the zodiacal animal, it's 戌. If you're old, or like old words, you might use 狗 instead to refer either to small dogs, or canids in general, but you run the risk of sending people to the dictionary trying to figure out what you mean.

In any case, this being Japanese, none of what is written has anything to do with the intended pronunciation. Most of the time, you'd expect any of these to be read "inu", but "doggu" or "dog" could easily appear in ruby, and it would not surprise me to see either イヌ or "Canis lupus familiaris" glossed in terms of the other in Japanese.
John J Chew said,

April 17, 2024 @ 8:35 pm

Also, I think what Victor was trying to say about Latin scientific terms for plants and animals is that Japanese has its own scientific terms for plants and animals, corresponding to but unrelated to the Latin terms. The use of katakana to represent them in Japanese is a convention that distinguishes imprecise colloquial terminology from precise scientific terminology; this is necessary because the words are often otherwise identical.
Jim Breen said,

April 18, 2024 @ 1:45 am

The flexibility of Japanese orthography presents quite a challenge for lexicographers, especially those working on decoding dictionaries aimed at non-Japanese speakers, and also for lexicons supporting text analysis and glossing software.

Take, for example, the verb tekozuru (to have a hard time, etc.). In the major published Japanese dictionaries this is recorded as 手古摺る, 梃摺る, 手子摺る and 梃子摺る, depending on which one you consult. Analysis of Japanese corpora reveals that most people write it as 手こずる or in hiragana alone (てこずる). Who can blame them? Pity the poor learner who encounters 手こずる and tries to look it up in 広辞苑 or 大辞林.

I was a little puzzled by Jim Unger's comment: "when the LDP introduced the joyo kanji, the list of approved kanji was demoted to the status of a 'guide'". The jōyō list of 1,945 kanji established in 1981 replaced the earlier tōyō list of 1,850 kanji established in 1946. I don't think there was really any demotion or other significant change of status. Kanji standardization and use has often been a bit of a political issue. I recommend Nanette Gottlieb's excellent "Kanji Politics: Language Policy and Japanese Script" (1995), and of course Jim Unger's "Literacy and Script Reform in Occupation Japan".
AG said,

April 18, 2024 @ 5:58 am

I feel like I see "neko" spelled out in kana a lot, but I don't understand the latin name point at all. I mostly play video games, and no one is using scientific species names in these contexts.

I have also noticed the kanji for apples, grapes, and strawberries are hardly ever used, and some of my students admitted to not knowing how to write them.
Victor Mair said,

April 18, 2024 @ 6:08 am

Already three decades ago, I noticed a growing tendency for people to use fewer kanji, in favor of more kana, and even rōmaji. This occurred in all sorts of circumstances, public and private.
ohwilleke said,

April 18, 2024 @ 9:43 am

Another nuance in addition to these four writing systems is that there are certain words, like the words for numbers, that are written in a special way in legal documents in order to make them resistant to being forged or modified with a few strokes of a pen. It is a custom somewhat similar to writing numbers in both words and numerals on a check or in a legal contract.

Scripts aren't the only context in which the Japanese mix and match. It strikes me as deeply parallel to the syncretic nature of Japanese religious practice, with the average Japanese person invoking a mix of Confucian philosophy (which is pervasive), Shinto religious practices (with shrine observances especially on certain holidays and in home shrines for deceased family members), Buddhist religious practices (especially with respect to funerals), and even some Western Christian religious/cultural practices (such as Christmas celebrations heavy on Santa and light on Jesus, and aspects of Christian wedding practices). Similarly, in Japanese popular fiction, it is almost cliche for supernatural threats to be challenged by mixed religion teams of priests from Shinto, Buddhist, and Christian perspectives, with few tensions between them, much like a party of warriors who are each trained in a different martial art.
Bathrobe said,

April 19, 2024 @ 4:20 am

I never found the Japanese writing system to be difficult (although it was memory-intensive) and I developed a reasonable feel for how and when words could be written in different ways.

Yes, scientific plant and animal names are written in katakana to distinguish them from ordinary usage. I suspect that this dates back to the time when Japan was developing the scientific naming of plants, animals, and birds on the model of the West, and the traditional writing of names (often in complex and multiple ways using kanji and katakana/hiragana) was felt to be cumbersome and complicated. Katakana is cleaner, more "scientific", and cuts out disputes over the correct way to write words.

At that time (late 19th century), Japan was the first Asian country to develop standard "common names" for plant, animal, and avian species on the model of Western languages. The names, as written in Chinese characters, were then often reimported back into Chinese.

For example the Japanese called the thrushes by the common name ツグミ tsugumi, which was written 鶫 in Chinese characters. In Chinese the character 鶫 / 鸫 dōng was not actually used for the thrushes. It is found, for instance, in the obscure old word 鶫鵛 dōng-jīng, which refers to the Wryneck woodpecker. In the meaning 'thrush', 鶫 is a direct reimport from Japanese, thanks to a "terminological dictionary" published in China at the time that simply listed Japanese terminology in kanji form. Eventually the ornithological list Chinese Birds of 1927 adopted 鶫 dōng as the Chinese equivalent to English 'thrush'. The imported terms were then further developed into complex and at times artificial naming systems.

Nanette Gottlieb (now Emeritus Professor of Japanese) is the one who set me on the path to learning Japanese. When I first met her (1972) she was Nanette Twine, a postgraduate student working as secretary of the Japanese department over the summer break. When I expressed my hesitancy about learning a language with complicated ideographs, she brushed my concerns off with the comment that "Japanese is just an ordinary language". With my fears allayed, I went on to enrol in Japanese and ultimately spend most of my life studying Japanese and Chinese while living in those countries. I'm eternally grateful to Nanette for what she did that day. It literally changed my life.
Victor Mair said,

April 19, 2024 @ 6:03 am

Bathrobe says that "memory intensive" is "not difficult".
Alan Kennedy said,

April 19, 2024 @ 11:04 am

The four Japanese 'alphabets' allow more creative possibilities for Japanese graphic designers, and they are used vertically (top to bottom) and horizontally (right to left and vice versa). That's why I am fascinated by Japanese posters, signage, advertising, etc.
KIRINPUTRA said,

April 28, 2024 @ 8:30 am

I've often wondered

(1) why some pre-WW2 texts use just kanji & katakana; and

(2) why hiragana are used to indicate kanji readings sometimes, and katakana other times.
