Phonosymbolism and Phonosemantics in Chinese

Since Westerners first encountered Chinese characters centuries ago, they have been confused over how the characters convey meaning.  It was obvious from the beginning that the characters are very different from a simple syllabary in that they do not directly and unmistakably signify the sounds of whole syllables on a one-for-one basis; all the more, they are unlike alphabets in not indicating phonemes.  The earliest Western interpreters tended to think of the characters as pictographs and ideographs that somehow indicated meanings directly without the intervention of sounds.  In time, however, as scholars came to better understand how the characters are constructed, many of them realized that sound plays an important role in conveying meaning, as it does in all other full writing systems ("full" in the sense of being able to convey all the main aspects of living and dead languages, including morphology and grammar).  John DeFrancis wrote two wonderful books that grappled successfully with the explication of these thorny issues:  The Chinese Language:  Fact and Fantasy (Honolulu: University of Hawai'i Press, 1984) and Visible Speech: The Diverse Oneness of Writing (Honolulu: University of Hawai'i Press, 1989).

Still today, however, people are perplexed by the principles whereby characters indicate meaning.  It is now generally recognized that phonology is a key component in the process by means of which the Chinese writing system represents linguistic utterances.  In grappling with this baffling conundrum, researchers have devised various theories and mechanisms to explain how the tens of thousands of characters mediate between the sounds of spoken language and its written expression on a two-dimensional surface (including now computer screens and the displays of other electronic devices for processing and transmitting language).

One such approach is that of phonosymbolism.  William Rozycki has written a stimulating article ("Phonosymbolism and the Verb cop") in which he attempts to show that various presumably unrelated languages around the world have independently chosen the syllable kap, or some close variant thereof, to convey the following meanings: "take, grasp, grab, seize, capture". He is able to cite an impressive amount of evidence in favor of his contention.

Rozycki explicitly states that he makes no claim for the universality of phonosymbolism, yet the manner in which he presents his argument leads him to come dangerously close to making such an assertion. Here is the distillation of his thesis:

…I will present both historic and areal evidence that a tendency or force is at work in the connection of the phonetic shape [kap] and the semantic range of 'catch, seize, snatch.' Like suprasegmentals in relation to the workings of phonology, this phonosymbolic force is another dimension, not yet clearly understood, that exerts influence on the process of word formation.

After reading and rereading Rozycki's paper several times, I have come to the conclusion that phonosymbolism is not at all like suprasegmentals, that it is a mystical concept, that it never will be clearly understood, and that it has no effective or discernible influence on the process of word formation, except for onomatopoeia (and there only to a limited degree because it is well known that people in different cultures come up with radically different imitative sounds for dogs barking [e.g., English "bow-wow" is wang-wang in Mandarin], trees crashing, rain falling, squeals of delight or pain, and so forth).  "At best, phonosymbolism can be used to explain merely a tiny portion of the vast lexicons belonging to human languages."

The above four paragraphs are based on the beginning of my article entitled "Phonosymbolism or Etymology:  the Case of the Verb 'Cop'", Sino-Platonic Papers, 91 (January, 1999), 1-28, except for the last sentence, which comes from p. 17.  My paper is dedicated to explaining the root sound-meaning of two dozen Chinese characters (listed on pp. 27-28) that contain the phonophore 夾 ("press from both sides") and are pronounced as follows in Modern Standard Mandarin (MSM):  jiā, jiá, jià, xiá, xiǎ, xià, xié, xiē, jiē, qiè, yì, jié, zhǎ, shè, yà; there are also other variant and topolectal pronunciations of characters containing this phonophore.  In Middle Sinitic and Old Sinitic, the 夾 phonophore would have been pronounced roughly as *kāp.

For all of the reasons spelled out above and at greater length in "Phonosymbolism or Etymology", I do not believe that phonosymbolism can account for the sounds and shapes of the Sinitic lexicon and the entire inventory of Chinese characters.

Another theory that is more sophisticated than pictography or ideography in attempting to account for how Chinese characters convey meaning is that of phonosemantics.  Here I would like to call attention to the interpretations of Chinese characters presented at  The site is based on collaborative research by Lawrence J. Howell and the late Hikaru Morimoto. Asked about their academic credentials for explicating the characters, Howell replies, “Nada. Zilch. We're autodidactic glossographers.  In academe, projects like this never have a chance.”

The site specifies which notions originated with which collaborator. For shorthand, the following description is ascribed to the active contributor.

Howell's research centers on phonosemantic patterns he identifies within the early Han language (aka Proto-Chinese [Howell's term]). Specifically, he believes that particular initials and particular finals in Proto-Chinese correspond to particular conceptual indicators (details below). He also avers that a vowel the pronunciation of which required rounding of the lips (similar to English O or U) characterizes ancient terms related to curved / circular lines, objects, or motion.

Howell's transcriptions of Proto-Chinese are simplified forms of Bernhard Karlgren's old reconstructions. He claims these transcriptions allow what he calls “the big picture” and “the essential simplicity of Proto-Chinese” to emerge with greater clarity than they do via renderings of contemporaries Laurent Sagart, William Baxter, or Axel Schuessler.

Below are Howell's “big picture” conclusions about Proto-Chinese.

The language is phonosemantic in nature.

Seven concepts (Frame, Continuum, Concealment, Supple, Spread, Small / Slender, Straight) generated all its terms excepting onomatopoeia and a handful of loan words. Each concept corresponds to an initial consonant (K, L, M, N, P, S, and T, respectively). When secondary concepts (Extend; Encompass; Adhere / be proximate; Press; Continuum; Cut / Divide / Reduce) were to be conveyed, this function was performed by the consonant within the final (-NG, -M, -N, -P, -R, and -T, respectively).

The yīnfú 音符 ("sound note") in xíngshēng zì 形聲字 ("phono-semantic compounds") was intended to suggest not only the character's pronunciation but also its meaning, again with the exception of onomatopoeia and loan words.

All compound characters created in Proto-Chinese that traditionally have been assigned to the huìyì zì 會意字 (ideogrammic compounds) category were devised as phono-semantic compounds (形聲字). Apparent anomalies in compound characters owe to 1) transposed, abbreviated or otherwise altered elements, 2) sound notes the independent character forms of which dropped out of use, and 3) pronunciation changes owing to consonant shifts in either the initial or the final.

Consonant shifts in derived terms, occurring in both the initials and the finals, correspond to shifts in meaning, and these follow the conceptual associations noted above.
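For readers who find a table easier to scan than a parenthetical list, the consonant-to-concept pairing in the second point above can be sketched as two lookup tables. This is only a rough illustration under stated assumptions: the names INITIAL_CONCEPTS, FINAL_CONCEPTS, and gloss are invented here, the input is assumed to be a simplified transcription written with a single-letter initial, and "Encompass" and "Adhere / be proximate" are treated as separate concepts so that six concepts line up with six finals. It shows what the pairing yields mechanically, not how Howell himself analyzes any particular character.

```python
# Illustrative lookup tables for the scheme summarized above.
# The names and the parsing rule are assumptions made for this sketch;
# they are not taken from Howell's site or book.
INITIAL_CONCEPTS = {
    "K": "Frame",
    "L": "Continuum",
    "M": "Concealment",
    "N": "Supple",
    "P": "Spread",
    "S": "Small / Slender",
    "T": "Straight",
}

FINAL_CONCEPTS = {
    "NG": "Extend",
    "M": "Encompass",
    "N": "Adhere / be proximate",
    "P": "Press",
    "R": "Continuum",
    "T": "Cut / Divide / Reduce",
}


def gloss(syllable):
    """Return (primary concept, secondary concept) for a simplified transcription.

    Either element is None when the syllable has no matching consonant.
    """
    s = syllable.upper()
    # The first letter is looked up as the initial consonant.
    primary = INITIAL_CONCEPTS.get(s[:1])
    # Check the two-letter final -NG before the single-letter finals.
    if len(s) > 2 and s.endswith("NG"):
        secondary = FINAL_CONCEPTS["NG"]
    elif len(s) > 1:
        secondary = FINAL_CONCEPTS.get(s[-1])
    else:
        secondary = None
    return primary, secondary


print(gloss("kap"))  # ('Frame', 'Press')
```

The syllable "kap" is used only because it figures earlier in this post; a syllable that ends in a vowel, such as "ma", simply returns None for the secondary concept.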

The last three points are of interest only to specialists, so I'll move along without speaking to their merits, or lack thereof.

The first two assertions, in contrast, require rebuttal because they would (if tenable) completely transfigure our understanding of the origin and lexicon of the Sinitic family of languages. Here are my issues with Howell's interpretive schema.  First, his concept groups are not historically accurate.  Second, it is hard to fit his explanations with what we know from the bones and the bronzes and the seal forms of the characters. Perhaps most critically for Howell's findings, I do not believe that the shapes of the characters have any necessary or intrinsic relationship to the sounds of the characters. The shapes of the characters are often highly arbitrary, and even for the early period (viz., around 1200 BC), they have been borrowed to represent different ideas, and the borrowings are not always phonologically rigorous. Howell, on the other hand, believes his data account for such borrowings and arbitrary usages. Someone with abundant spare time is welcome to corroborate or undermine that claim.

Howell considers his phonosemantics to be a type of phonosymbolism, but I believe that his system is far more comprehensive in its scope and has been developed with greater attention to the specifics of the Chinese writing system.  Nonetheless, for the reasons outlined above, I am not convinced that the Howell-Morimoto scheme can explain the origins and development of the Old Sinitic lexicon.

For those who might wish to judge for themselves, Howell's data (as noted at the outset) may be accessed online (no charge); they are also available through the site in book form as Kanji Etymology. My remarks above apply to the paper publication's contents as well. I might add that, for a book produced in Japan, Kanji Etymology is not particularly expensive (5,000 yen [about $65] plus shipping / handling), yet the printing is clear and the paper of good quality.  My main complaint about the paper version of the data base is that there are no indices, which makes it hard to navigate.  In response, Howell writes, "Traditional indexing covering both Chinese and Japanese usages would have doubled the bulk and inflated the price. The book, Kanji Etymology, is intended to rest on a desk and be worked through page by page, group by group and down through all related sub-groups. The aim is to facilitate the reader's visualization of the conceptual world of the characters. The data begs for illustration, which will bring this conceptual world into much sharper focus. I'll see to that following my reincarnation as a manga artist."

Before closing, I'll make a slight digression, one prompted by the title Kanji Etymology (sigh!). A year ago, I posted here about Richard Sears' site, While conceding that specialists and others alike are in the habit of referring to the analysis of character structure as "etymology", I stressed that, properly speaking, written symbols do not have etymologies. I also maintained that true Chinese etymology has to take into account the development of sounds and meanings through time. Here I'll reiterate: Sinitic etymology should ideally be done without reference to the characters; the characters would be connected to the etyma in a secondary fashion.  In a forthcoming post, I will turn to the question of genuine Sinitic etyma (which are defined by sound and meaning, NOT by character structure), with some surprising results in comparison with the number of Indo-European, Semitic, and other roots.

[Thanks to Lawrence J. Howell for help in preparing this post]



  1. Minivet said,

    January 13, 2012 @ 7:02 pm

    You might be interested in this webpage showing how chance correspondences between languages are actually quite likely, especially when you loosen your criteria for "similar sound" and "similar meaning."

  2. mondain said,

    January 13, 2012 @ 8:23 pm

    In the traditional study of exegesis and philology (训诂), the principle of 因聲求義 (‘to search the meaning through the sound’) seems to share the very similar phonosymbolic or phonosemantic idea that the sound is suggestive of its meaning. It has been developed by Qing scholars and used in their practice to solve some conundrums in the interpretation of ancient texts. Based on historical phonology, it usually attempts first to search for clues in the sound or words and identify the effect of 假借 phonetic borrowing (hence ‘因聲’), then recover its original meaning (hence ‘求義’). In Duan Yucai's commentary on Shuowen Jiezi, there are also notes that ‘the characters of the sound X usually have the meaning “Y”’.

  3. Peter said,

    January 14, 2012 @ 12:35 am

    A very interesting article considering the 因聲求義 approach alongside other (more ancient and more modern) approaches. Probably the best article I've ever read when it comes to picking apart when phonophores are meaningful from when a sound is just a sound.

    [ Credit to this excellent comment thread, interesting on its own merits: ]

  4. Colin Z said,

    January 14, 2012 @ 1:09 am

    "The yīnfú 音符 ("sound note") in xíngshēng zì 形聲字("phono-semantic compounds") was intended to suggest not only the character's pronunciation but also its meaning, again with the exception of onomatopoeia and loan words."

    As you point out this claim is wrong– there is nothing like a systematic correspondence between yinfu meanings and character meanings. That said, every so often I come across a character and wonder whether its yinfu might have been sneakily borrowed for meaning also.

    for instance: 里\俚,乏\貶,令\命
    disclaimer: i'm no expert on character origins, and i find it's always hard to tell apart this putative category of characters from "semantic extension" characters like (fingers crossed for accuracy) 反/返, 回/迴,復/覆/複

  5. LDavidH said,

    January 14, 2012 @ 3:13 am

    FWIW: In modern Albanian, the verb "kap" means "to grab, catch, seize" – did this William Rozycki by any chance speak Albanian??

  6. Leonardo Boiko said,

    January 14, 2012 @ 8:53 am

    I’m trying to build a picture of various methods or “schools” of character analysis; could you guys check whether it makes sense?

    My knowledge is very limited due to the language barrier, but the way I see it at the moment is:

    1. First, we have traditional analysis based on classical glossaries and rhyme-tables, primarily the Shuōwén Jiězì, of the kind we usually see on standard dictionaries;
    2. Then, modern scholars who leverage this tradition but add to it a) the knowledge acquired from oracle-bone and bronze script, and b) the methods of historical linguistics;
    3. Shizuka Shirakawa’s analyses, which are somewhat influential in Japan (though not without criticism), based on manually retracing oracle bone characters and reconstructing (?) religious/symbolic significance;
    4. This phonosemantic approach, including Howell/Morimoto and 因聲求義 ?

    By the way, if you forgive me the plug, I have a simple tool to compare various character “etymology ” (sorry Mr Mair!) sites—currently zhongwen, kanjinetworks, ja.wikitionary, and

  7. Victor Mair said,

    January 14, 2012 @ 9:21 am

    @Leonardo Boiko

    I just used your "simple tool" to search for 夾, and it only returned the entry from Have I used your "simple tool" incorrectly, or is the only one of these four sites that has an entry for 夾?

    In any event, your "simple tool" seems very handy, and I shall undoubtedly introduce it to my students at Penn and at Peking University, though, of course, with my usual caveats about "character etymology".

  8. Victor Mair said,

    January 14, 2012 @ 9:30 am

    @Leonardo Boiko

    Hmmmmm…. I just did the same thing with 來 and got the same results. Surely such a common character as 來 would have an entry at all these sites, so I suspect one of the following:

    1. I'm using your "simple tool" incorrectly

    2. your site is incomplete

    3. your site has a glitch

  9. Ethan said,

    January 14, 2012 @ 1:55 pm

    @Victor Mair
    I just tried 夾 and it opened 4 new tabs in the browser (chrome), one from each responding site. So perhaps

    4. Some browser setting

  10. TB said,

    January 14, 2012 @ 3:23 pm

    What's a better term for the origin and evolution of the characters?

  11. Victor Mair said,

    January 14, 2012 @ 4:46 pm

    Thanks, Ethan. It works for me now. Leonardo Boiko, your "simple tool" is handy and useful.

    David Prager Branner wrote a response to this post on his blog; it is entitled "Phonosymbolism, etymology, and the nebulous Chinese word family"

  12. Leonardo Boiko said,

    January 14, 2012 @ 5:36 pm

    Mr Mair: Thanks! The tool is biased for Japanese students, which I admit is a shame; I should add buttons to convert to and from traditional characters, Chinese simplified and Japanese simplified. Unfortunately sites like kanjinetworks tend to only work with Japanese simplified, which doesn't work on sites like zhongwen.

    I figure that, as a reader and fan, I owe you an apology as to why I decided to go with the flow and use "hanzi etymology". One reason is for search engines—I figured students looking for character history and analysis would be likely to search for "hanzi etymology" or "kanji etymology", so I put both words in the title (Google likes titles). But the real reason is mischief. If we look at the etymology of the word "etymology", we find Greek etymon “true, real [sense]”; in the prescriptivist dark ages, the goal of etymology was to find the "true meaning" of words, presumed to be lost due to the loose morals of linguistic decadence. But then the paradigm shifted, and now "etymology" has nothing to do with searching for the proper original Truth; since we recognize that the present, living language is as good as any other, etymology became about tracing the history of words and morphemes. In other words, the meaning of the word “etymology” is itself “etymologically incorrect” (if we assume the old meaning of etymology = “study of true meanings”, which is the only way to make sense of the phrase “etymologically incorrect”). This fact amuses me immensely, & I could not resist the temptation of making the word even more “incorrect” by applying it to characters.

    I'm not making this up right now, and I have a comment somewhere on languagehat or no-sword to prove it :)

    But I understand very well the problem you're fighting against (the all-too-common mistake of presenting character analysis as if it could explain the etymology of words/morphemes), so I'll at least put up a note or something. If you know of any other character analysis website that you find academically useful, just ping me anytime and I'll add it.

  13. mark said,

    January 23, 2012 @ 5:38 am

    Manual pingback: I just posted something on these issues over at The Ideophone.

  14. Victor Mair said,

    January 24, 2012 @ 9:34 am

    It has been insightful to observe the angles from which folks approach the points I made in my post. There's Mark Dingemanse, at

    expressing aversion to systematic approaches to language, saying, "Often a certain amount of explanatory leakage is more exciting than a neat account." Or this idea from Joe Perez at

    "… OF COURSE there must be phonosemantic connections in Chinese and in every language under the sun because God/Spirit emanates through all languages in patterns that are inherently meaningful and evolving (not random chance)."

  15. mark said,

    January 28, 2012 @ 1:20 pm

    Hi Victor,

    I agree that my approach is almost diametrically opposed to what Joe Perez (a self-proclaimed language mystic) writes there. I would resist your characterization of my position as "an aversion to systematic approaches to language" though.

    I'm all for systematic approaches to language; it is just wishful assumptions of perfect systematicity that I warn against. As I wrote, "Seeking regularity all the way leads to oversimplification. In some possible world, all Chinese characters are neat pictograms, the Chinese language is phonosemantic in nature, and all ideophones are nice imitative words. This world is not ours however…"

  16. Jess Tauber said,

    February 2, 2012 @ 7:12 pm

    Some years ago I took a long look at phonosemantics in Korean. While the language has a great many expressives/ideophones, and vocalic and consonantal alternations of the augmentative/diminutive kind, which are interesting for their own sake (and we also know the degree to which the written language has been iconized under Sejong), it occurred to me to examine the Chinese borrowings to see if they differed in any significant way phonosemantically from the ideophones. What I found was that there seemed to be hidden, in the forms, a sort of social and material-mechanical hierarchical mapping system. Social hierarchies in human societies allow for a structured distribution of goods and services. Different phonemes associated with different operations and positions within this system. I don't remember the specifics anymore, this was quite some time ago. I never had the chance to look at the entire Chinese lexicon from this perspective, but have seen that there are, in the reconstructed protolanguage, certain recurrent phonosemantic mappings. The one involving terminal -p is relevant here in the discussion of forms like kap. Many of the Chinese forms ending in -p refer to bringing materials together in one place, preventing their escape, and so forth. This is in line with what is found in many language families, but not all. Comparison with Thai dialects shows a similar mapping, though with a twist. The Chinese forms often involve the evaluative notion of impropriety, while the non-Chinese Thai forms see the capture and holding of materials as a good thing, more often than not. Same act, different moral evaluation.

