My previous post was about "dialects" that are often not really dialects, but bona fide languages, and the efforts of the Chinese government to phase them out. In this post, I'll be talking about "etymology" that is not really etymology, but character analysis.
The occasion for these ruminations (see especially the last two paragraphs below) is this brief news item that appeared in the Beijing Morning Post on January 13 (pardon the somewhat peculiar English of the following paraphrase, which is taken from a daily Chinese newspaper digest [so far as I know, the BMP is published only in Chinese]; it conveys the sense and tenor of the original in a serviceable, though abridged, fashion):
A 04: "An American guy makes China's sinologues embarrassed"
Richard Sears might be a nobody in the US, but he has certainly made a name for himself among the Chinese netizens. The guy spent 20 years creating a website that allows users to trace Chinese characters to their ancient shapes, helping users to see what a given character looked like when they were carved on animal bones and oracles or written on silk two or three thousand years ago.
Nobody in China, not even the professors who wrote so many books and made so much money, had created anything as remotely convenient for ancient Chinese researchers as he did.
This brief article also was published in a number of other newspapers in China, some with a picture of Richard Sears and sample illustrations from his website.
The website referred to by the article is Sears' "Chinese Etymology".
The Chinese article about Sears elicited a huge response on the Internet. Sears told me, in a phone interview, that shortly after the article appeared his site received 600,000 page views in 24 hours, whereas before that he had been getting about 15,000 per day, half of them from Taiwan and China. By yesterday, the page views had leveled off at around 150,000 per day. After the Chinese article appeared, his e-mail spiked from a mere trickle to over a thousand in the last few days. The comments on Chinese blogs that I have seen are spirited, with many of them expressing astonishment and shame at what Sears has accomplished ("How could a foreigner do all of this??!!" "We Chinese are only interested in making money." And so forth and so on.)
Sears' work is widely recognized, both in China and abroad, as being very useful, and he has invested an enormous amount of time and effort in it. Indeed, Sears has labored for more than two decades to assemble and present the massive amount of data that is available on his site. It is truly remarkable that one man could have done nearly all of this by himself. The only help he received was from someone whom he hired to scan thousands of pages for him. The conceptualization, design, programming, data entry, and everything else, including much of the scanning, are entirely Sears' own handiwork. For about 10-15 years, Sears had a good job in Silicon Valley, and that is how he could afford to pay for the scanning.
Sears' B.A. was in nuclear physics, but in 1985 he received an M.A. in computer science, with OCR of Chinese a particular interest of his. I suppose that is what steered him into his preoccupation with the Chinese script.
Since Richard Sears is not a well-known figure among academics (although scholars do utilize his website to find the Shuowen and other seal script forms, bronze inscriptional forms, and oracle bone inscriptional forms), I copy here his own self-introduction (from the first paragraph of the Chinese characters and etymology link of his website):
When I was a young man of 22 in Taiwan in 1972 trying to become fluent and literate in Chinese, I was faced with the prospect of learning to write about 5,000 characters and 60,000 character combinations. The characters were complex with many strokes and almost no apparent logic. I found on the rare occasions when I could get a step by step evolution of the character from its original form, with an explanation of its original meaning and an interpretation of its original form, suddenly it would become apparent how all the strokes had come to be. The problem is that there is no book in English that adequately explains this etymology and even if you read Chinese there is no single book in Chinese that explains it all. In short it is a research project to understand each character. To have this information at my fingertips in English would have been a great help.
The first advantage of a computerized etymology is that you can do all kinds of analysis which would be limited by the linear nature of books. The second advantage is that etymology is an ongoing research project. We do not know all the answers when it comes to character etymology. If errors or discrepancies are discovered in a computerized system, they can be corrected. They can not be corrected in a book that has already been published.
There are literally thousands of references on this subject, most of them in Chinese. Most of them having something new, unique or interesting to say. I only list what I have found to be the top references.
Sears tells me that he is an American who lives in Knoxville, Tennessee as "an unemployed computer programmer" and that his "regular CV is oriented toward getting a programming job." He also says that what he does is probably "not very interesting to linguists."
Although Sears is not an academic and does not consider himself a linguist, his website is without equal for its convenience and comprehensiveness in providing early forms of the sinographs.
Here, in his own words, is what Sears has accomplished in the last 20 years:
1. I have compiled a database of over 96,000 ancient and archaic Chinese characters.
2. Shang dynasty 1500 BC – 1000 BC 31,876 oracle bone characters XuJiaGuWenBian 續甲骨文编
3. Zhou dynasty 1000 BC – 200 BC 24,223 bronze characters JinWenBian 金文编
4. Qin-Han Dynasty 200 BC – 200 AD 11,109 seal characters and ShuoWenJieZi 說文解字
5. Full text Chinese source from the ShuoWenJieZi 說文解字
6. Qin-Tang dynasty 200 BC – 1000 AD 38,596 alternate seal characters from LiuShuTong 六書通
7. Mandarin Speech and phonetic database
8. Taiwanese Speech and phonetic database
9. Cantonese phonetic database
10. Shanghai dialect phonetic database
11. English translation
12. Phonetic separation and analysis
13. Cognate separation and analysis
14. Seal character to traditional character transition mechanisms
15. Traditional character to simplified character transition mechanisms
16. Complete etymological analysis of 6,552 most common modern Chinese characters
Sears is not resting on his laurels, but has further ambitious plans for his website and will continue to refine it. For example, he wrote to me, "I have also scanned a couple of thousand pages of cursive 行书 and super-cursive 草书 Chinese characters. If any of your readers would like to volunteer to help me cut the images out and index them, please let me know." "I also have a Shanghaiese and Cantonese phonetic list, but no Shanghaiese or Cantonese speech database. If any of your readers would like to volunteer to record the speech data I would appreciate it."
I have heard few serious complaints about Sears' site as an initial stop for early forms of the characters. Specialists, however, do have various quibbles and reservations. Since these are highly technical and require an advanced level of understanding about the nature and history of the Chinese script, I have decided not to include them in this post (which is already growing too long), but will send them separately to anyone who writes to request them.
The only major problem I myself have with the site is its title, "Chinese Etymology." I don't consider what Sears does to be "etymology" per se. Written symbols (characters, letters, graphs, etc.) do not have etymologies. Rather, they undergo evolution and development. Thus, Sears' work has to do with Chinese character structure, analysis, and evolution, not etymology. True Chinese etymology has to take into account the development of sounds and meanings through time (roots, derivatives, cognates, etc.). For that, the most convenient, reliable, and authoritative source for the early period is Axel Schuessler's ABC Etymological Dictionary of Old Chinese.
I should point out, however, that it is very common, both among specialists in the field of Chinese Studies and among the lay public, to refer to the analysis of character structure as "etymology," so Sears is certainly not alone in doing so. Still, I consider this a serious issue in Sinology and in Chinese linguistics, just as serious an issue as calling Sinitic languages like Cantonese "dialects." Chinese linguistics has long been bedeviled by deep confusion between the writing system and language, and it is this confusion that leads people to mistakenly speak of characters as having etymology.
[With thanks to Ken Takashima, Wolfgang Behr, Richard Sears, Julie Wei, Roger Olesen, Alexa Olesen, Matt Anderson, and Jonathan Smith]