Peter Stephen Du Ponceau and Vietnamese dictionaries

From Connected (2/4/22), a publication of the Peabody Essex Museum:

"Phillips Library digitizes dictionaries from Vietnam and unlocks stories of museum founders and their travels", by Kathlene Baldanza

The blog post is accompanied by beautiful images of pages from the dictionaries.  Here are the first three paragraphs:

Two recently digitized manuscript dictionaries in the Phillips Library collection are once again sparking conversation. In 1819, John White, a lieutenant in the US Navy, received dictionaries from an Italian Catholic priest named Joseph Morrone in Saigon and deposited them with the East India Marine Society in Salem. The members of the East India Marine Society were the founders of what is today the Peabody Essex Museum. Published in the US in 1838, the dictionaries fueled a trans-Atlantic debate about the nature of Asian languages. Catholic missionaries, their Vietnamese interlocutors, and Salem mariners made the initial connections that allowed for the scholarly conversation that played out in the pages of journals including The North American Review, The Foreign Quarterly Review, and The Canton Register.

Peter Stephen Du Ponceau, the President of the American Philosophical Society in Philadelphia, borrowed the manuscripts from the East India Marine Society and published them as an appendix to his A Dissertation on the Nature and Character of the Chinese Writing System. Introducing the dictionaries, Du Ponceau wrote, “The United States, therefore, will have the honour of being the first to publish authentic documents respecting the language of Cochinchina, and to introduce that curious idiom to the literary world.” The digitization of the source manuscripts allows us to revisit this early engagement of the United States with Vietnam.

The digitization of the two manuscripts — made possible with funding from James T. Lap, in memory of his mother Anna Nguyễn Thị Diệc (1909-1958) — allows researchers to appreciate anew these important sources and the conditions of their creation. Unlike du Ponceau’s published account, the manuscripts preserve the character-based Vietnamese demotic script, Nôm, as well as the marks left by their creators and users. Du Ponceau published his Dissertation to disprove the theory that Chinese was a universal language written in an ideographic script. He used Nôm as an example to make the point that Chinese characters could not be adopted universally by speakers of other languages. In contrast, the dictionaries were meant as language-learning tools to aid in the proselytization of Catholicism. The manuscripts preserve faint traces of the communication between missionaries and their Vietnamese converts.

The first of the two dictionaries introduced here is titled Lexicon Cochin-Sinense Latinum ad usum missionarium.  It is a version of a dictionary that had been circulating in Vietnam for two centuries.

New European arrivals would copy the dictionary for their own use, and annotate it as needed. Arranged alphabetically in two columns with Romanized Vietnamese headwords, the dictionary fills 139 pages. The wastepaper binding of the manuscript is made of calligraphy practice sheets, possibly of a European priest following the model of a Vietnamese teacher.

The article includes a fascinating photograph of a piece of the wastepaper used in the binding of the Lexicon Cochin-Sinense Latinum ad usum missionarium.  Another intriguing photograph shows a sheet of musical notation, a scale denoting the six tones.

The second dictionary featured in the article is the Vocabulaire domestique Cochinchinois Français, which may be the earliest surviving Vietnamese-French dictionary.

Remarkably, it contains both Romanized Vietnamese (quốc ngữ) and in the character-based Nôm. The use of Nôm attests to the missionaries’ commitment to learning and preaching in vernacular Vietnamese.

A few words about Peter Stephen Du Ponceau (1760-1844):

…a French-American linguist, philosopher, and jurist. After emigrating to the colonies in 1777, he served in the American Revolutionary War. Afterward, he settled in Philadelphia, where he lived the remainder of his years. He contributed significantly to work on the indigenous languages of the Americas, as well as advancing the understanding of written Chinese.

Du Ponceau… was one of the first Western linguists to reject the axiomatic classification of Chinese writing as ideographic. Du Ponceau stated:

    1. That the Chinese system of writing is not, as has been supposed, ideographic; that its characters do not represent ideas, but words, and therefore I have called it lexigraphic.
    2. That ideographic writing is a creature of the imagination, and cannot exist, but for very limited purposes, which do not entitle it to the name of writing.
    3. That among men endowed with the gift of speech, all writing must be a direct representation of the spoken language, and cannot present ideas to the mind abstracted from it.
    4. That all writing, as far as we know, represents language in some of its elements, which are words, syllables, and simple sounds. In the first case it is lexigraphic, in the second syllabic, and in the third alphabetical or elementary.

He used the example of Vietnamese, then called "Cochinchinese," which used Chữ Nôm, a modified form of Chinese characters. He showed that Vietnamese used the Chinese characters to represent sound, not meaning. A hundred years later, his theory was still a source of controversy.


In my estimation, although Du Ponceau wrote his magnum opus on the Chinese writing system nearly two centuries ago, he displayed extraordinary prescience and insight in describing and analyzing it.

Du Ponceau, Peter Stephen, A Dissertation on the Nature and Character of the Chinese System of Writing, in a Letter to John Vaughan, Esq. (Philadelphia: The American Philosophical Society, by M'Carty and Davis, 1838).

His acuity regarding the nature of Chinese writing was not to be matched until the second half of the 20th century with the works of John DeFrancis (1911-2011):

Visible Speech: The Diverse Oneness of Writing (Honolulu: University of Hawai'i Press, 1989)

The Chinese Language: Fact and Fantasy (Honolulu: University of Hawai'i Press, 1984)

See also:

J. Marshall Unger, Ideogram: Chinese Characters and the Myth of Disembodied Meaning (University of Hawai'i Press, 2004)

It is noteworthy that both Du Ponceau and DeFrancis paid great attention to the development of writing in Vietnam through several crucial stages.

John DeFrancis, Colonialism and Language Policy in Viet Nam (The Hague: Mouton, 1977)

Although gross obfuscation concerning the Sinographic writing system has been the order of the day for more than two millennia, the clear-headed ratiocination of Peter Stephen Du Ponceau and his intellectual heirs, inspired by the evolution of Vietnamese writing, ensures that eventually those who care to do so will begin to understand how the characters are composed, how they function, and that we don't need tens of thousands of them to clog up our brains and information technology.



There is probably no subject on earth concerning which more misinformation is purveyed and more misunderstandings circulated than Chinese characters (漢字, Chinese hanzi, Japanese kanji, Korean hanja) or sinograms.

–Victor Mair
from the foreword to Ideogram, by J. Marshall Unger




  1. Jerry Packard said,

    February 7, 2022 @ 5:23 pm

    As John DeFrancis would have been quick to point out, UC Berkeley professor Peter Boodberg (1903-1972) was one of many who followed Du Ponceau and preceded DeFrancis in emphasizing that Chinese characters are not ideographic, representing phonetic words rather than ideas.

    Boodberg, P. (1940). "'Ideography' or Iconolatry?". T'oung Pao. 35 (4): 266–88. doi:10.1163/156853239X00062

  2. Victor Mair said,

    February 7, 2022 @ 7:36 pm

    "His acuity regarding the nature of Chinese writing was not to be matched until the second half of the 20th century with the works of John DeFrancis (1911-2011)."

  3. Jonathan Smith said,

    February 7, 2022 @ 10:02 pm

    Thanks; striking that, in the work of Du Ponceau and other serious-minded European learners through the ages, dumb-old, common-sense facts about East Asian languages and scripts are again and again plainly recognized — this beneath the persistent din of various kinds of ideologically- or otherwise-motivated silliness… :D

    A good recent-ish statement is actually say Sampson (1994) in Linguistics (32.117–32) — as unfortunately DeFrancis 1989 itself degenerates into curious polemic against dumb-old facts like logographic writing…

  4. Peter Grubtal said,

    February 8, 2022 @ 3:51 am

    "various kinds of ideologically- or otherwise-motivated silliness" can come from any quarter. That they in Chinese are logograms has been the standard narrative in most textbooks for a long time now, and it seems PC nowadays to deny any semantic value to the characters.

    I don't know about Chinese, but certainly in Japanese it seems undeniable that the characters have some semantic value: each character usually has very different on- and kun- pronunciations, and the characters are used in jukugo (compound words) to carry semantic information.

    Even in Chinese, surely the character for nose conveys that concept in many (most?) contexts?

  5. J.W. Brewer said,

    February 8, 2022 @ 12:54 pm

    Du Ponceau was writing before Commodore Perry et al. had gone to Japan, which means that the "ideogram" misconception he was condemning presumably was not rooted in the Western fascination with things Japanese that blossomed later in the 19th century.

    That being said, I do agree with Peter Grubtal that the "ideogram" concept seems rather more plausible in a Japanese context. To take a simple example I may have used before in other comment threads, the character 中 is used in Japanese to write both "naka" and "chū." It is exactly as if in English we used a single character to write both "middle" and "center/central." When the same character is used to write two different words that have nothing in common phonetically or etymologically but are synonyms when it comes to semantics, it is hard to fend off the notion that the character is somehow denoting what the two words have in common, i.e. the common meaning or referent (although maybe that's subtly different from "idea"?) that both words carry.

    If 中 as used in written Japanese is a "logogram," it's a -gram for multiple logoi in the Japanese lexicon that are entirely unrelated other than semantically, and maybe we need a different technical term to explain that phenomenon. By all means coin some scientific-sounding term other than "ideogram" if that is overburdened with negative and exoticizing baggage. But shouldn't we call it something? Wikipedia advises me that "morphogram" has been proposed to describe this aspect of kanji, as a term contrasting with "logogram," but I don't particularly like it because it seems to imply that e.g. while "naka" and "chū" are different "words" they are somehow the same "morpheme," which in turn implies what seems a rather non-standard meaning of "morpheme." Can't we do better?

  6. Philip Taylor said,

    February 8, 2022 @ 1:02 pm

    I would (seriously) suggest that we continue to use the long-established and universally understood term "ideogram" and thereby reclaim it from those who would seek to overburden it with negative and exoticizing baggage.

  7. Jerry Packard said,

    February 8, 2022 @ 8:04 pm

    Following Peter Grubtal and J.W. Brewer, the characters certainly do represent semantic information in both Chinese and Japanese. Although characters represent phonetic words rather than ideas, reading characters can directly activate meaning in our neural circuitry, just as written words do in reading English when we ‘speed read’ or otherwise read without activating phonetic form. Because Chinese characters actually contain semantic information in the form of semantic radicals, the retrieval of semantic information occurs more directly and quickly when reading Chinese than it does when reading alphabetic orthographies like English.

  8. Jonathan Smith said,

    February 8, 2022 @ 8:09 pm

    Commenters above are expressing that common graphic representation of multiple words or morphemes exhibiting some degree of semantic closeness is unlikely to be a psychological nothing in the minds of writing system users? Sure, this seems prima facie obvious. The point is that such a fact is an instrumental nothing in the sense of being useless when it comes to doing actual things with language in the actual world — i.e., "that dog won't hunt," and Du Ponceau and his ilk are hunters, not philosophers.

    One actually finds, in a practice such as teaching, that undue foregrounding of the "ideographic" notion above is not merely nothing in practical terms but actively destructive. There is no really apt English parallel, but if we take say /kləʊz/ and /kləʊs/, then perhaps you can agree that to guide learners by drawing their attention to the shared written form "close" (via, say, endless rote copying), to the plausible psychic significance of this shared form (underdetermined 'proximity' or the like?), and to texts in which one or the other "reading" of this form was to be recognized as apt — as opposed to, say, listening to and trying to speak English — would approach the Literally Dumbest F***** Idea Ever? Multiply this doing-it-bass-ackwards by close to the whole lexicon in, say, Japanese, or worse, Taiwanese, and you get a sense for the scale of the problem given traditional pedagogical approaches.

    @J.W. Brewer & "中" in Japanese, yes, but why not a character which, as is far more typical, is not on its face plausibly iconic in origin (although… is the "line" here in the middle of the "box" or the "box" in the middle of the "line"?). Let us say "腸" for ちょう as well as はらわた (etc.), expressive of ineluctable intestinality…?

    On the other hand, "中" is nice as it also writes two Mandarin words, zhōng 'middle, etc.' and zhòng 'to hit the mark'… for whether or not there is ever any confusion among students regarding how to "read" this character given foregrounding of written form in pedagogical materials, see on "close" above.

    @Peter Grubtal & "PC narrative", what are you reading sir :D so… there's languages, ways to write them that exploit mappings from script to spoken language at various levels, various ways of generating novel symbol-to-language-unit associations based on principles of various kinds, various part-to-whole, etc., relationships that obtain among the symbols of a mature set, and so on… it's science, or so we tell ourselves :D

  9. J.W. Brewer said,

    February 8, 2022 @ 11:07 pm

    I don't want to mix up the "ideogram" notion with the "pictogram" notion and certainly wasn't picking 中 because its visual form somehow cued the semantics. I'm happy for the relation between signifier and signified to be arbitrary in that regard — it's the non-arbitrary relation between the different words written with the same character that I'm interested in. I just have a limited inventory of kanji I know, and 中 was a convenient example. But do the same with e.g. 豚, where if there's a Just So Story about how it kinda looks like an abstracted pig or piglet if you squint at it at just the right angle I don't recall having ever heard it. But it's the character with which you write multiple Japanese lexemes (e.g. ton and buta) which are unrelated except for their semantics, just as if you used it in English to write both "swine" and "pork." (My understanding of early Japanese lexical history is a little shaky, but I believe they inherited "buta" from Anglo-Saxon and borrowed "ton" from Norman French.)

  10. Rodger C said,

    February 9, 2022 @ 9:54 am

    The right-hand element of 豚 is in fact a pig.

  11. J.W. Brewer said,

    February 9, 2022 @ 4:27 pm

    @Rodger: That well may be what the script-historians say, and they may be right, but I daresay it would be easy to construct a lineup of three or five different radicals including that one from which a Westerner or other illiterate person would have no better-than-chance ability to guess which was the one that was supposed to look like a pig.

  12. Chris Button said,

    February 10, 2022 @ 8:08 am

    @ J.W. Brewer

    My understanding of early Japanese lexical history is a little shaky, but I believe they inherited "buta" from Anglo-Saxon and borrowed "ton" from Norman French.

    Is this a tongue in cheek comment? Perhaps I’m missing something? Don’t get me wrong, I’m curious.

    “Ton” is the Sino-Japanese on-yomi coming from Chinese.

  13. ~flow said,

    February 10, 2022 @ 8:13 am

    > The right-hand element of 豚 is in fact a pig.

    Ce 豕-ci n'est pas un 'pig'. But to me it works the same, 馬 *is* a horse, 石 *is* a stone.

    In my experience as a reader of Chinese and Japanese texts, it's largely immaterial whether 豕 looks much like a pig when you squint at it; it has become one of many hundred rather arbitrary signs, similar to Arabic digits and stuff like the ampersand. The same is true of many, many compound signs; they may or may not have a signifier and a phonatory part which may or may not be helpful, respectively; in the end, the trained reader just recognizes them and associates them in their contexts with spoken words and meanings, and both may be ambiguous. Alphabetic scripts sometimes display a similar phenomenon. When you see 'Bootshop' you'd think of a place to buy shoes, right? Well, as a plausible shop sign in Germany it just could be that, or else a place to buy *boats* and accessories. Similarly "Backshop" has become very common, and it's not what you'd think.

    @Jonathan Smith I find it somewhat beside the point what you're saying, although not wrong. Sure, when you teach the spoken language you probably shouldn't recurse too much on the orthography. But reading and writing has to be taught, too, and in my experience it's sometimes truly helpful for a learner to know that two homophones are in fact written differently.

    As for the writings of John DeFrancis that I have consumed, I always missed a reference to the Japanese writing system. In my memory DeFrancis tries to convince me that Chinese is in fact not the 'ideographic' system it's often claimed to be, and, as far as the *Chinese* language is concerned, his arguments are rather convincing. However, right next to Chinese there's Japanese, and, as has been pointed out above, that language's orthography is rather more 'ideographic' than the Chinese one, and so one should not just treat DeFrancis' claims to be necessarily true across all writing systems. Here we have not even discussed some Cuneiform orthographies where similar observations seem to apply.

    Lastly I want to congratulate Du Ponceau on his coinage, "Lexicographic". This is an ingenious term and it meshes well with what modern linguistics call 'lexicon', 'lexemes' and so on. One could say the English orthography is one that is basically alphabetic but with an unusually strong lexicographic tendency, while Chinese is a primarily lexicographic orthography with some (important) phonetic traits (like phonetic hints in compound characters and re-use of the same character for unrelated but homophone morphemes). Japanese, then, has roughly speaking a split lexicographic and (syllabo-) phonetic spelling.

  14. J.W. Brewer said,

    February 10, 2022 @ 10:34 am

    @chris button: It was an analogy disguised as an apparently unsuccessful joke. I'm not sure to what extent the analogy breaks down (in a Japanese context) when it comes to this famous distinction in the post-1066 English lexicon:

  15. Chris Button said,

    February 10, 2022 @ 12:13 pm

    @ J.W. Brewer

    Ah, that’s actually very funny. Now I’m feeling stupid for not getting it the first time.

  16. Jonathan Smith said,

    February 10, 2022 @ 8:49 pm

    @flow of course re: reading/writing, but the specific thing one must say as it relates to the discussion above as I understood it is rather, say, "observe that these different words we know, X and Y, are written with the same character (+ add reasons here if pressed but generally who cares in the basic language learning context.)" One should under no circumstances encourage learning of disembodied "characters" and then treat reading as an exercise in selecting "readings" as opposed to an exercise in associating text with one's prior knowledge of the language. This is actually done better and better in East Asian language classrooms in the U.S. IMO (and Japanese generally better than Chinese in my experience), but students often seem capable of approaching the issue in the wrong way entirely of their own accord :D
    @J.W. Brewer I see exactly what you mean but would assert simply that the "psychic something" doubtless generated by homographic (approximate-)synonyms does not seem to be characterizable in any specific, useful terms and thus is not something one would wish to teach, or learn, or generally attempt to say much of anything about at least given the current state of our understanding of the neural basis of "ideas" :)

  17. Jonathan Smith said,

    February 10, 2022 @ 8:58 pm

    @~flow re "One could say the English orthography is one that is basically alphabetic but with an unusually strong lexicographic tendency, while Chinese is a primarily lexicographic orthography with some (important) phonetic traits (like phonetic hints in compound characters and re-use of the same character for unrelated but homophone morphemes). Japanese, then, has roughly speaking a split lexicographic and (syllabo-) phonetic spelling."
    Minor quibbles aside this particular statement seems about right (top of mind quibble is maybe that the "phonetic hints" pertain to what I above called "part-to-whole relationships that obtain among the symbols" and are crucially an effect of *coinage mechanisms* as opposed to, as DeFrancis 1989 believes, a fundamental feature of the script). DeFrancis 1989 in particular goes beyond your characterization and almost forgets that these scripts are "primarily lexicographic/logographic" at all.

  18. Peter Grubtal said,

    February 11, 2022 @ 2:09 am

    What seems indisputable is that there is no such thing as an ideographic writing system, which can, solely using ideographs, represent all (or even many) possible utterances in a language.
    But that many Chinese characters have a clear semantic, ideographic (sorry!) value is equally clear (to me at least).

    Japanese people can often get some sense from written Chinese without knowing a word of the language. And I, also ignorant of everything Chinese but after my Kanji labours, have no trouble working out that the sign on the building means "Shanghai International Banking Corp." I could find less trivial examples.

    I can understand teachers wanting to eliminate in their students misconceptions about written Chinese, but to try to suppress any notion that meanings divorced from language attach to some characters seems wrong-headed.

    It's amusing to read: "students often seem capable of approaching the issue in the wrong way entirely of their own accord" (Jonathan Smith). Strict ideological drilling and suppression of some obvious facts don't always produce the desired result.

  19. Jonathan Smith said,

    February 12, 2022 @ 10:03 pm

    I *think* I see the (an) issue — take the Chinese gentleman I knew who learned an impressively large number of words of (written) English *by sight* (i.e., associated them with particular Chinese words and expressions) but could not speak them or understand them when heard. Such a learner could grok the meanings of say (written) Spanish "final", "error", etc. (just as a Japanese reader can grok "銀行", "國際", etc.), but would be off on "actual", "tuna", etc. (and the Japanese reader off on "結束", "大家", etc.)… if such a trivial phenomenon is the "ideography" Peter Grubtal has in mind, he is welcome to it… No surprise, though, that folks like this Chinese gentleman feel strongly that they have "learned" a great deal and defensively characterize gentle attempts to prod them in more productive directions as intelekshl gezpacho tactics…

  20. Peter Grubtal said,

    February 13, 2022 @ 10:03 am

    Jonathan –
    my standpoint is that of the adult learner, with no long residency in the country concerned, and – ars longa, vita brevis – finite time to get to a working competence in the language. Anything that helps with vocabulary acquisition is welcome. Yes, you're right with the "falsos amigos", of course, and it would apply to Japanese speakers trying to make sense of a Chinese text. But language learners have to be aware of this, and watch out for the traps.

    English speakers learning French or Spanish (or even German) are particularly at risk.
    Spanish I find particularly tricky in this regard. Words with a close English match sometimes seem to have two or more distinct meanings in Spanish, some of which map fairly close to the meaning of the English word, but the others can be quite different. It struck me today with "suspenso" which besides the English meaning, can be a mistake or a fail in an examination. The very common "ilusión" falls into this category for me, as well.

