Sino-Semitica: of cinnamon, cassia, and katsura and Old Sinitic reconstructions, part 2

If you stroll through the grounds of the Morris Arboretum of the University of Pennsylvania, you may come upon this phenomenal tree:

It is the champion katsura tree of the state of Pennsylvania, truly an astonishing specimen.

From reading Japanese literature, I was familiar with the name "katsura", but I didn't know the equivalent in Chinese or English, and I didn't know what it looked like.  When Zihan Guo told me that the kanji for katsura is 桂, I was stunned, because it immediately called to mind some exciting discussions about that plant that we had had on Language Log over the years (see below).


かつら / カツラ 桂


Cinnamomum cassia


… Chinese cassia or Chinese cinnamon, is an evergreen tree originating in southern China, and widely cultivated there and elsewhere in South and Southeast Asia (India, Indonesia, Laos, Malaysia, Thailand, and Vietnam). It is one of several species of Cinnamomum used primarily for their aromatic bark, which is used as a spice. The buds are also used as a spice, especially in India, and were once used by the ancient Romans.

See "Sino-Semitica: of gourds, cassia, and hemp and Old Sinitic reconstructions" (2/1/20), where I introduced an old, learned German friend named Elfriede Regina (Kezia) Knauer (1926-2010) who was very much aware of the Semitic origins of her nickname (Kezia is a biblical name of Hebrew origin; she was one of Job's three beautiful daughters) and often asked me about its Sinitic parallels (see here, here, here, here, and here).  Hebrew קְצִיעָה‎ (“cassia tree”). Compare cassia. From Latin cassia (“cinnamon”), from Ancient Greek κασσία, κασία, κάσια (kassía, kasía, kásia), from Hebrew קְצִיעָה‎ (qəṣīʿā), from Aramaic קְצִיעֲתָא‎ (qəṣīʿătā), from קְצַע‎ (qṣaʿ, “to cut off”) (source).

That "Sino-Semitica" post provides many other references, including this addendum by John Huehnergard [2/17/20]:

Hebrew qǝṣiʕā is unlikely to refer to Cinnamomum cassia, but rather to a plant found in Ethiopia or Arabia; a recent discussion of the Hebrew word is Benjamin J. Noonan, Non-Semitic Loanwords in the Hebrew Bible (2019) 196-97 (who, along the way, refutes a proposed Chinese etymology for qǝṣiʕā). As Noonan also rightly notes, the word is not found in Aramaic apart from Jewish sources referring to the Hebrew word, so it's not a real Aramaic word.

Also see this key comment by Chris Button (cf. here, here, and here):

My take is that labiovelars alternating with velars provides evidence for uvulars in Old Chinese. A nice xiesheng series in that regard is that of 圭 EMC kwɛj < OC *qáj which includes unrounded words like 街 kaɨj < *kráj and also has 桂 kwɛjʰ < *qájs which quite possibly came from earlier *qjáts via an association with Hebrew qetsia "cassia" with its uvular onset.

To which I replied:

I like what you say about Old Sinitic "桂 kwɛjʰ < *qájs which quite possibly came from earlier *qjáts via an association with Hebrew qetsia 'cassia'". I had a learned old German friend, Elfriede Regina Knauer, whose nickname was Kezia (pronounced "Ketzia"), and she always suspected that it had something to do with Sinitic guì 桂:

1. osmanthus; sweet osmanthus
2. cassia; Chinese cinnamon
   蒙舍蠻以椒、薑、桂和烹而飲之。[Classical Chinese, trad.]
   蒙舍蛮以椒、姜、桂和烹而饮之。[Classical Chinese, simp.]
   From: Tang Dynasty, Fan Chuo, Manshu, chapter 7, part 7
   Méngshè Mán yǐ jiāo, jiāng, guì hé pēng ér yǐn zhī. [Pinyin]
   The Mengshe Barbarians use pepper, ginger, and cinnamon to boil as a drink.
3. true cinnamon; Saigon cinnamon; Indonesian cinnamon
4. laurel; bay laurel
5. of or relating to Guilin, Guangxi, or the region of the Gui River
6. A surname.

(source)

The thrill of seeing that enormous, grand, spreading katsura tree in Morris Arboretum — a type of tree that thitherto I had imagined to be a rather delicate, scraggly bush — was one that I shall never forget.

Selected readings

Update (July 7, 2022)

From Ross Bender (6/6/22):

Orth: katura

Part-of-speech: noun

Sense1: katsura tree (Cercidiphyllum japonicum)
me ni pa mite te ni pa torayenu tukwi no uti no katura no gotoki imo wo ikani semu
muka tu wo no wakakatura no kwi siduye tori pana matu ima ni nagekituru kamo
momiti suru toki ni naru rasi tukwipito no katura no yeda no iroduku mireba
ame no umi ni tukwi no pune uke katurakadi kakete kogu miyu tukwipitowotokwo

From John Whitman (6/6/22):

Here are the two etymologies given by Wiktionary:


From Old Japanese. Originally a compound of (ka, “fragrance, good smell”) +‎ (zu, “to come out, to put something out”) + (ra, nominalizing suffix): "that which puts out a good smell", from the way the wood smells good.

This is a total folk etymology. The verb “zu” is historically (that is, both in OJ and pJ on the basis of JR comparative evidence) idA- < *ind(A)-. No way that would give -tura.

From Old Japanese. Alternate spelling for 女桂 (mekatsura, “female katsura”), an archaic name for the cinnamon tree. Compare 男桂 (okatsura, “male katsura: the katsura tree”). Appears with this reading in the 和名類聚抄 (Wamyō Ruijushō), a Japanese dictionary of Chinese characters completed in 938.

These forms simply apply the standard male and female prefixes to a root katura. So they don’t solve the problem.

Japanese lexicographers seem to have a consensus that the first syllable ka- is ‘fragrance’, historically written 香 (although this can also be a phonogram!). There aren’t a lot of OJ examples with ka- as the first member in a subcompound, and at least one of them triggers rendaku, though, so why katura is not kadura is a problem; as is the fact that the modern tree called katura is not fragrant. It could be that one of the other two species referred to as katura, particularly cinnamon, is the original word. In OJ tura (usually 鉉) appears to refer not just to strings of an instrument but to vines, but the modern tree katura does not have vines.

From John Bentley (6/7/22):

In response to your first question: the first etymology is untenable. Old Japanese (OJ) id- (idu) “come out, go out” is vowel-initial, and that is not accounted for in the first etymology. Also, it is phonetically /ntu/ and not zu. The second example is not really an etymology, simply a personification, apparently to reflect the strength of the scent of two different trees. In Wamyōshō mekatura is written with the Chinese character 桂, while wokatura is written as 楓. Consider, however, that in the late 9th-century character dictionary Shinsen jikyō both Chinese characters are read simply as katura (加豆良). The final interesting piece is that both words have the same pitch accent, aside from the gender designator. In essence, I think there may have been an attempt by the Japanese to distinguish between two species of tree, where only one was indigenous to Japan.

Regarding the final question about Sino-Japanese, both Schuessler's and Baxter/Sagart’s reconstructions of early Chinese forms of the character 桂 are something close to *ke or *kwe, which doesn’t match the OJ name.

From Nathan Hopson (6/7/22):

日本国大辞典 lists the following etymological hypotheses:
All but #5 agree that 香 gives us か. Other than that, they appear to be all over the place.
To the extent that 桂 is a 香木 (こうぼく), this at least passes the smell test, if you know what I mean.


  1. Chris Button said,

    June 25, 2022 @ 6:52 pm

    My take is that labiovelars alternating with velars provides evidence for uvulars in Old Chinese. A nice xiesheng series in that regard is that of 圭 EMC kwɛj < OC *qáj … which includes 桂 kwɛjʰ < *qájs which quite possibly came from earlier *qjáts via an association with Hebrew qetsia "cassia" with its uvular onset.

    Thanks for posting that old comment of mine. On reflection, I think the last part with the -ts is probably unnecessary in terms of loanword phonology and the common s ~ ts alternation. I think we could just go with *qájs, or as I tend to write it now, *qajːs.

    And yes, Japanese katsura (*katura) is surely ultimately coming from the same source.

  2. David Marjanović said,

    June 26, 2022 @ 5:48 am

    Also see this key comment by Chris Button (cf. here, here, and here):
    Hebrew qetsia "cassia" with its uvular onset

    I don't think it's clear when that changed from a velar ejective to a uvular non-ejective. Is it even clear that this happened before Hebrew died out as a spoken language?


    If that's meant literally, it would be very strange. A /j/-/jː/ contrast is rare worldwide; distinguishing length within consonant clusters is also rare; I'm not sure if distinguishing length in the first part of a consonant cluster, at least unless morpheme boundaries are involved, is even attested; and the combination has to be super-rare at best.

  3. Chris Button said,

    June 26, 2022 @ 8:21 am

    @ David Marjanović

    Regarding 桂 *qajːs, it's an individual loanword from Semitic. So in this individual comparison we're working in terms of phonetic likelihood rather than rigid phonological sound laws.

    Regarding surface "sonorant" length (which includes "vowels") as part of the underlying phonological structure, I admit that it is a surface reflex of moraic weight, which Pulleyblank marks with acute and grave accent markers (Ferlus' analyses are instructive here too since he is thinking along similar typological lines).

    So I'm admittedly straying too far onto the surface (along the lines of the front/rounded vowel hypotheses, which are only valid for surface rhyming choices but unfortunately misused for phonological reconstruction). However, recently I've opted for the length markers as a convenient way instead of Pulleyblank's accents to denote the moraic weight. If it helps, you could think of it in terms of rising versus falling diphthongs on the surface and the distinctions between glides and vowels, but at that point we're way too far up on the surface.

  4. Chris Button said,

    June 26, 2022 @ 8:31 am

    I should also add that a distinction like -aːj and -ajː works in terms of diphthongs, but a distinction like -aːn and -anː can't be thought of in that way.

  5. Wolfgang Behr said,

    June 26, 2022 @ 3:19 pm

    Just two quick notes on 桂: a Near Eastern connection to Sumerian via Semitic had already been mentioned in passing long ago by Ulrich Unger, see "Aspekte der Schrifterfindung. Das Beispiel China", in: Frühe Schriftzeugnisse der Menschheit. Vorträge gehalten auf der Tagung der Joachim Jungius-Gesellschaft der Wissenschaften, Hamburg am 9. und 10. Oktober, Göttingen: Vandenhoeck & Ruprecht 1969, p. 27, now also available in his 雜集. Given the presumed origin of the plant(s) in SE Asia, a loanword _into_ Semitic seems prima facie more likely than the other way round.

    Nina Zhao-Seiler, a TCM and Chinese materia medica specialist, just finished a wonderful MA thesis "Travelling Plants and their Names: Tracing Guì 桂 Cinnamomum" (259 pp.) here in Zurich, in which she looks into all types of philological and botanical evidence related to your question. She will probably be happy to share it, I trust.

  6. Chris Button said,

    June 26, 2022 @ 7:06 pm

    @ David Marjanović

    One further thought. Perhaps you’re thinking of Swedish, where the emergence of such a contrast is rare in Germanic? Superficially there is a surface correlation, but the origin is entirely different and notably you can have “long” obstruent codas there, which aren’t permissible here.

    Here instead, I’m talking about what are essentially tense/lax~fortis/lenis distinctions on sonorants (which includes vowels). I recall posting a comment to you on LLog about this topic before.

  7. Chris Button said,

    June 26, 2022 @ 8:36 pm

    @ Wolfgang Behr

    Thanks for the references. I’ll try to get hold of them.

    Given the presumed origin of the plant(s) in SE Asia, a loanword _into_ Semitic seems prima facie more likely than the other way round.

    That’s a good point. It would then be a SEA loanword into Chinese as well as Semitic. I certainly don’t think the term is native to Chinese. The uvular onset from the phonetic series still accounts for the otherwise random labialization in Chinese. I’m looking forward to seeing what Nina Zhao-Seiler says!

  8. john v burke said,

    June 27, 2022 @ 1:54 pm

    The three-cushion billiards player Danny McGoorty mentioned in his memoir ("McGoorty") that a Miss Katsura–Masako?–was an outstanding player of that very difficult game in San Francisco during the 1950s and 60s. One quiet afternoon at the Palace, on Market Street, I asked if she would play to 21 points with me, and she agreed. Of course I made clear I would pay for the table time since I was essentially taking a lesson, not offering actual competition; sure enough, although her small stature made it hard for her to reach some parts of the table, she beat me handily–21 to 3, as I recall. Every time I missed (i.e. almost every time I came to the table) she gave me an apologetic look before then running three or four points, and when she scored her 21st point she bowed politely. Graciousness was in short supply in the old-time poolhalls so that was an especially welcome experience.

  9. TKMair said,

    June 28, 2022 @ 10:48 am

    Now you linguists here on LL must add some botanist friends!

  10. David Marjanović said,

    June 28, 2022 @ 4:10 pm

    Perhaps you’re thinking of Swedish, where the emergence of such a contrast is rare in Germanic?

    No, I don't mean consonant length in general, I mean specifically glide length, which is very rare, and a length distinction in the first part of a consonant cluster, which may not be attested at all (…at least not in languages without a triple consonant length contrast).

    Proto-Germanic actually had /jː/ and /wː/ distinct from /j/* and /w/, but they were lost in three different ways in East, North and West Germanic. Standard Arabic has both; I suppose the function of consonant length in the grammar would quickly restore them if they were lost.

    *…Or not, because intervocalic /j/ had been lost, except after /i/ (where it may well have been reinterpreted as epenthetic) and in the causative morpheme *-/ja/- which sometimes followed long vowels.

  11. Chris Button said,

    June 28, 2022 @ 5:07 pm

    @ David Marjanović

    Then just take them as rising vs falling diphthongs. Otherwise you might end up importing a false phonetic distinction onto a purely phonological distinction between “vowel” and “glide”.

  12. David Marjanović said,

    July 3, 2022 @ 7:27 pm

    Fair enough, though falling diphthongs that begin with /a/ are still extremely rare. More so than /jː/ actually, I think.

  13. David Marjanović said,

    July 3, 2022 @ 7:28 pm

    …or maybe I confused falling & rising. Anyway, diphthongs that begin with a subsyllabic /a/ are extremely rare.

  14. Chris Button said,

    July 3, 2022 @ 10:21 pm

    Good luck finding the semivowel symbol correlating with “a”. You may as well look for the semi-sonorant symbol correlating with “n”.

    I’m talking about moraic weight rather than diphthongs. I only brought up diphthongs as a concept because you were focusing on glide codas, and that’s the easiest way to relate there because of the handy i~j and u~w symbol interchange.

    Perhaps using syllabic/non-syllabic diacritics would work better than using the length symbol?

    Incidentally, Pulleyblank always argued for a pharyngeal glide to correlate with “a”.

  15. Chris Button said,

    July 3, 2022 @ 10:39 pm

    Or perhaps I should just follow Theodore Stern in his “A provisional sketch of Sizang (Siyin) Chin” (p.228) where he says:

    “Since peaking is attended by relative length, it is here conveniently expressed by the mora [·] …”

    So maybe an interpunct dot instead of a length symbol could work.

  16. Chris Button said,

    July 4, 2022 @ 7:34 am

    Just to add a little more…

    I chose to transcribe Stern’s Sizang description in terms of tense/lax (fortis/lenis) since that is essentially a length distinction with various other associated acoustic phenomena.

    So I had ɛn and en for enː and eːn. I could also use ɐn and an in a similar way for anː and aːn in Sizang.

    But in Old Chinese I have but ə and a as the nucleus and no other “vowels” to play with. A ɐ~a distinction works, but what to do with schwa ə? I could perhaps take an alternation with zero as Ø~ə.

    Pulleyblank’s Middle Chinese pharyngeal off-glide is interesting in that regard. He treats it as a “non-syllabic a” but then transcribes it as a non-syllabic schwa in his Lexicon because “non-syllabic a” would have essentially sounded like schwa (e.g. a non-rhotic pronunciation of the end of “fear”). From an acoustic perspective, it works brilliantly with Sino-Japanese correlations.

    What’s interesting is that if we take this schwa-like off-glide and place it at the nucleus with moraic weight then we can have a schwa nucleus as the phonetic correlate as opposed to a syllabic coda with a zero vowel. But the phonological—as opposed to phonetic—correlate of the schwa-like glide would be an “a” nucleus rather than a “ə” nucleus …

    I wonder what ramifications this has for the emergence of a ə/a distinction and hence a vertical vowel system out of vowellessness (i.e., the syllable as the basic building block).

  17. Chris Button said,

    July 4, 2022 @ 7:59 am

    But the zero~ə alternation only works with sonorant codas (n, j, etc.) but not obstruents (t, k), so the vowel length marker or Stern’s interpunct is really my only option.

    Regarding Sino-Japanese, take a look at how something like 力 LMC *liək (the schwa needs a non-syllabic mark) corresponds with Japanese kan-on *rjəku (modern “ryoku”) in which the schwa is at the nucleus and corresponds to modern “o”.

  18. David Marjanović said,

    July 4, 2022 @ 3:15 pm

    Perhaps using syllabic/non-syllabic diacritics would work better than using the length symbol?

    Yes! That's pretty standard in IPA usage, too.

    Incidentally, Pulleyblank always argued for a pharyngeal glide to correlate with “a”.

    This has some justification.

  19. Chris Button said,

    July 4, 2022 @ 5:06 pm

    The problem with using syllabic/non-syllabic diacritics is that it glosses over the fact that what we're really dealing with is the weight or "peaking" (to borrow Stern's term) on moras within the syllable rather than syllabicity per se.

    So these work (where moraic length for weight/peaking is treated as a distinction between tense/lax or fortis/lenis):

    pəːn vs pənː
    pəːt vs pət

    But the below with -n is rather confusing:

    pən vs pən̩ (or pn̩ or pə̯n)

    And the second form below with -t is just nonsensical because the non-syllabic schwa means there is no "syllable" left:

    pət vs pə̯t

