Chinese "Etymology"


My previous post was about "dialects" that are often not really dialects, but bona fide languages, and the efforts of the Chinese government to phase them out.  In this post, I'll be talking about "etymology" that is not really etymology, but character analysis.

The occasion for these ruminations (see especially the last two paragraphs below) is this brief news item that appeared in the Beijing Morning Post on January 13 (pardon the somewhat peculiar English of the following paraphrase, which is taken from a daily Chinese newspaper digest [so far as I know, the BMP is published only in Chinese]; it conveys the sense and tenor of the original in a serviceable, though abridged, fashion):

A 04: "An American guy makes China's sinologues embarrassed"

Richard Sears might be a nobody in the US, but he has certainly made a name for himself among the Chinese netizens. The guy spent 20 years creating a website that allows users to trace Chinese characters to their ancient shapes, helping users to see what a given character looked like when they were carved on animal bones and oracles or written on silk two or three thousand years ago.

Nobody in China, not even the professors who wrote so many books and made so much money, had created anything as remotely convenient for ancient Chinese researchers as he did.

This brief article also was published in a number of other newspapers in China, some with a picture of Richard Sears and sample illustrations from his website.

The website referred to by the article is Sears' "Chinese Etymology".

The Chinese article about Sears elicited a huge response on the Internet.  Sears told me, in a phone interview, that shortly after the article appeared his site received 600,000 page views in 24 hours, whereas before that he had been getting about 15,000 per day, half of them from Taiwan and China.  By yesterday, the page views had leveled off at around 150,000 per day.  After the Chinese article appeared, his e-mail spiked from a mere trickle to over a thousand in the last few days.  The comments on Chinese blogs that I have seen are spirited, with many of them expressing astonishment and shame at what Sears has accomplished ("How could a foreigner do all of this??!!"  "We Chinese are only interested in making money."  And so forth and so on.)

Sears' work, both in China and abroad, is widely recognized as being very useful, and he has invested an enormous amount of time and effort in it.  Indeed, Sears has labored for more than two decades to assemble and present the massive amount of data that is available on his site.  It is truly remarkable that one man could have done nearly all of this by himself.  The only help he received was from someone whom he hired to scan thousands of pages for him.  The conceptualization, design, programming, entry, and everything else, including much of the scanning, is entirely Sears' own handiwork.  For about 10-15 years, Sears had a good job in Silicon Valley, and that is how he could afford to pay for the scanning.

Sears' B.A. was in nuclear physics, but in 1985 he received an M.A. in computer science, with OCR of Chinese a particular interest of his.  I suppose that is what steered him into his preoccupation with the Chinese script.

Since Richard Sears is not a well-known figure among academics (although scholars do utilize his website to find the Shuowen and other seal script forms, bronze inscriptional forms, and oracle bone inscriptional forms), I copy here his own self-introduction (from the first paragraph of the Chinese characters and etymology link of his website):

When I was a young man of 22 in Taiwan in 1972 trying to become fluent and literate in Chinese, I was faced with the prospect of learning to write about 5,000 characters and 60,000 character combinations. The characters were complex with many strokes and almost no apparent logic. I found on the rare occasions when I could get a step by step evolution of the character from its original form, with an explanation of its original meaning and an interpretation of its original form, suddenly it would become apparent how all the strokes had come to be. The problem is that there is no book in English that adequately explains this etymology and even if you read Chinese there is no single book in Chinese that explains it all. In short it is a research project to understand each character. To have this information at my fingertips in English would have been a great help.

The first advantage of a computerized etymology is that you can do all kinds of analysis which would be limited by the linear nature of books. The second advantage is that etymology is an ongoing research project. We do not know all the answers when it comes to character etymology. If errors or discrepancies are discovered in a computerized system, they can be corrected. They can not be corrected in a book that has already been published.

There are literally thousands of references on this subject, most of them in Chinese. Most of them having something new, unique or interesting to say. I only list what I have found to be the top references.

Sears tells me that he is an American who lives in Knoxville, Tennessee as "an unemployed computer programmer" and that his "regular CV is oriented toward getting a programming job."  He also says that what he does is probably "not very interesting to linguists."

Although Sears is not an academic, nor does he consider himself a linguist, his website is without equal for its convenience and comprehensiveness in providing early forms of the sinographs.

Here, in his own words, is what Sears has accomplished in the last 20 years:

1. I have compiled a database of over 96,000 ancient and archaic Chinese characters.

2. Shang dynasty 1500 BC – 1000 BC 31,876 oracle bone characters XuJiaGuWenBian 續甲骨文编

3. Zhou dynasty 1000 BC – 200 BC 24,223 bronze characters JinWenBian 金文编

4. Qin-Han Dynasty 200 BC – 200 AD 11,109 seal characters and ShuoWenJieZi 說文解字

5. Full text Chinese source from the ShuoWenJieZi 說文解字

6. Qin-Tang dynasty 200 BC – 1000 AD 38,596 alternate seal characters from LiuShuTong 六書通

7. Mandarin Speech and phonetic database

8. Taiwanese Speech and phonetic database

9. Cantonese phonetic database

10. Shanghai dialect phonetic database

11. English translation

12. Phonetic separation and analysis

13. Cognate separation and analysis

14. Seal character to traditional character transition mechanisms

15. Traditional character to simplified character transition mechanisms

16. Complete etymological analysis of 6,552 most common modern Chinese characters

Sears is not resting on his laurels, but has further ambitious plans for his website and will continue to refine it.  For example, he wrote to me, "I have also scanned a couple of thousand pages of cursive 行书 and super-cursive 草书 Chinese characters. If any of your readers would like to volunteer to help me cut the images out and index them, please let me know."  "I also have a Shanghaiese and Cantonese phonetic list, but no Shanghaiese or Cantonese speech database. If any of your readers would like to volunteer to record the speech data I would appreciate it."

I have heard few serious complaints about Sears' site as an initial stop for early forms of the characters.  Specialists, however, do have various quibbles and reservations.  Since these are highly technical and require an advanced level of understanding about the nature and history of the Chinese script, I have decided not to include them in this post (which is already growing too long), but will send them separately to anyone who writes to request them.

The only major problem I myself have with the site is its title, "Chinese Etymology."  I don't consider what Sears does to be "etymology" per se.  Written symbols (characters, letters, graphs, etc.) do not have etymologies.  Rather, they undergo evolution and development.  Thus, Sears' work has to do with Chinese character structure, analysis, and evolution, not etymology.  True Chinese etymology has to take into account the development of sounds and meanings through time (roots, derivatives, cognates, etc.).  For that, the most convenient, reliable, and authoritative source for the early period is Axel Schuessler's ABC Etymological Dictionary of Old Chinese.

I should point out, however, that it is very common, both among specialists in the field of Chinese Studies and among the lay public, to refer to the analysis of character structure as "etymology," so Sears is certainly not alone in doing so.   Still, I consider this a serious issue in Sinology and in Chinese linguistics, just as serious an issue as calling Sinitic languages like Cantonese "dialects."  Chinese linguistics has long been bedeviled by deep confusion between the writing system and language, and it is this confusion that leads people to mistakenly speak of characters as having etymology.

[With thanks to Ken Takashima, Wolfgang Behr, Richard Sears, Julie Wei, Roger Olesen, Alexa Olesen, Matt Anderson, and Jonathan Smith]


  1. mondain said,

    January 18, 2011 @ 12:00 am

    Maybe 'paleography' is the proper term for the study, which is used by centres at Chicago and Fudan (the logo is barely legible).

  2. Carl said,

    January 18, 2011 @ 1:27 am

    This error is indeed prevalent, but what do you suggest as an alternative term for the evolution of Chinese characters? Just "the evolution of Chinese characters"?

  3. carat said,

    January 18, 2011 @ 2:16 am

    I have been using Sears's site for a few years now and am immensely grateful for having it as a resource. However, I find myself doubting it on occasion when the analysis it states is particularly tenuous. Does anyone know of good up-to-date print sources for Chinese character analysis?

  4. Erik Zyman Carrasco said,

    January 18, 2011 @ 4:22 am

    Off topic, but I was surprised by "anything as remotely convenient for ancient Chinese researchers…" (instead of remotely as). And yet "anything as remotely" has >27,000 ghits. I wonder if it's a processing error, since it seems problematic semantically—and, if so, what enables it.

  5. Amos said,

    January 18, 2011 @ 4:39 am

    One reason I might suggest for the way the word 'etymology' is used is that 'etymology' is typically defined as the study of the history of 'words'. The concept of the 'word' in Chinese is often a subject of debate, with Mandarin Chinese zi 字 often being translated as 'word'. (People familiar with Mandarin Chinese are probably aware that another concept called ci 词, which is often translated as 'phrase', often better corresponds to the English notion of 'word'.) More precisely, 字 actually refers to a single Chinese character. It's therefore no big leap to see how the word 'etymology' has come to be used to describe the history of such 'words' in Chinese.

    On a slightly related note, I was recently asked to help with a dictionary for a Tibeto-Burman language of Nagaland. The language displays a fair amount of agglutinative morphology and most words are morphologically transparent. One request was that the dictionary also explain the 'etymology' of each word. Given that there are no historical records of the language (or proto-language) before the 1900s, I found it a rather difficult request. However, I suspect that many speakers simply wish for me to analyse such words morphologically, as opposed to providing an etymology for the words.

  6. Amos said,

    January 18, 2011 @ 5:00 am

    Just to be clear, I only mentioned 'Mandarin Chinese' in my comment because I gave the pinyin for those characters according to their pronunciation in Mandarin.

  7. John Hill said,

    January 18, 2011 @ 6:16 am

    Thank you, Victor, for pointing me once again to really useful sources which I quite possibly would have missed for ever. Also, many thanks for the brief but valuable assessment of this resource. Sears' achievement is truly astonishing and should prove helpful to my ongoing research. So, three cheers and a hurray for Richard Sears and Victor Mair!

  8. Charlie C said,

    January 18, 2011 @ 8:10 am

    Are any of the 16 accomplishments listed available separately as databases or analyses?

  9. marie-lucie said,

    January 18, 2011 @ 10:34 am

    It sounds like Sears should get some sort of award from linguists, at least from

  10. not someone else said,

    January 18, 2011 @ 11:29 am

    Never commented here, but this is fascinating, and I wonder if that help he needs is skilled or unskilled. I did a double-take when I realized this was being done by a local. Good for him.

  11. Fresh Sawdust said,

    January 20, 2011 @ 7:21 pm

    It would be nice if Sears' site had a means (clickable menus, or Pinyin input etc) contained within it for selecting search items; as it is one has to copy and paste individual characters in from other sources, which is a bit cumbersome.

  12. Ryan said,

    January 21, 2011 @ 5:09 pm

    I'd love to hear Victor's opinion on how Sears' resource stacks up against Chinese Characters: A Genealogy and Dictionary by Rick Harbaugh, which also has an online version. The Sears resource looks more extensive, but perhaps there's also a difference in the quality of scholarship.

  13. Elizabeth Braun said,

    January 24, 2011 @ 10:03 am

    Thanks for this resource, I'll be book-marking it!

    Fascinating that the chap who runs it got a Bachelor of *Arts* in nuclear physics and a Master of *Arts* in Computer Science!!! What an interesting HE system you have in the States!!!

  14. fs said,

    January 25, 2011 @ 3:50 am

    Sadly it seems that someone has hacked the website and it has been taken down.

  15. Sami C said,

    March 8, 2011 @ 9:50 pm

    I stumbled upon Sears' extensive website, and as a native Cantonese speaker and writer of traditional Chinese, I was impressed by the amount of work he has put into something that most of us Chinese take for granted.

    I'm not really sure if there are parallel words in English for every Chinese word (or character), including the word etymology. For one thing, English was built upon an entirely different system than ours (the use of an alphabet, using roots from different older languages, spelling, etc.). As defined in a dictionary, "etymology" is the history of a word. However, the concept of "word" in English may be different from that in Chinese. We don't have an alphabet per se, only the so-called characters. But sometimes one character can be a word by itself, whereas some characters can hardly mean anything when used alone. So characters can be words, and vice versa.

    I think this is what is so interesting about languages, and it is such a difficult task to try blending the two spaces together by means of translation. Good luck finding a new word for "Chinese etymology."


  16. Eugenio Llorente said,

    February 1, 2012 @ 8:39 pm

    I entirely agree with Ryan. When it comes to comparing Richard Sears' site with Chinese Characters: A Genealogy and Dictionary by Rick Harbaugh, in my view Rick Harbaugh's is vastly superior. It really shows a deep understanding of the Chinese character system. His work is really a breakthrough in Chinese linguistics. Sears' efforts, by contrast, although very useful, provide an enormous amount of data without contributing much to the understanding of Chinese characters or the Chinese script.

  17. Pai Y Soo said,

    April 13, 2013 @ 5:31 am

    I came across Richard Sears by accident while searching for Chinese script etymology in order to write a book, Chinese in God's Land. I had happened to notice that 神 福 禄 寿 (shen fu lu shou) have pictograms embedded in them on which the book of Genesis could shed some light. It was also the discovery, so to speak, by Gong Yu Hai (宫玉海) in his research on the Shanhaijing (山海经) that Eden was located in Yunnan, China, that gave me the idea that the Bible may be linked to Chinese scripts and could unlock the pictograms embedded in the above four characters. I have also included some other scripts to complete the book. The book is available at rosedogbookstore and Amazon. I have given my thanks to Richard Sears, who allowed me to use his materials to show the etymology of those scripts. By the way, I also expressed my appreciation to the eSword website for allowing me to quote sufficient material from the KJV Bible, especially the Genesis stories of the Creation, to explain the pictograms in those scripts. I came to this post on Sears by accident too, and I am glad to have learned much from it.
