Farsi shekar ast

This is a quiz.  It's a short, pop quiz, but the post is going to be very long.

1. In what language is the title of this post written?

2. What does the title mean?

Let me say that I have never formally studied this language, but I took one look at the sentence and I knew immediately what it meant and what language it is.  It's not a European language, but all three of the words in the sentence are familiar to me because I've studied so many relevant languages that the cognates just leapt out at me.

I wonder how many Language Log readers who are not already familiar with this language had a similar reaction looking at the title.

The last word reminded me of German "ist" and Latin "est".  The second word is very close to Hindi and Sanskrit words I knew for "sugar", and even to "sugar" itself.  The first word is undoubtedly known to everyone who is news literate.

I was intrigued that the word for "sugar" appears to be the same in so many languages, so that got me wondering where it originally came from.

Back in February, I read this article by Elon Gilad in Haaretz (2/12/15):

"Arabic words in English you didn’t even know you knew"
Algorithm and sofa, candy and apricot, algebra and zero, and so many more are actually Arab migrants.

It's precisely because there are so many words of Arabic (and Hebrew) origin in English that the editors of The American Heritage Dictionary of the English Language decided to include an appendix of Semitic Roots to complement their famous appendix of Indo-European Roots:

"The American Heritage Dictionary of the English Language, 5th edition"  (11/14/12)

"Ur-etyma: how many are there?"  (7/6/14)

Gilad is right to stress how many English words of Arabic and Persian derivation go back to Sanskrit, but he misses some of them, including "sugar":

late 13c., sugre, from Old French sucre "sugar" (12c.), from Medieval Latin succarum, from Arabic sukkar, from Persian shakar, from Sanskrit sharkara "ground or candied sugar," originally "grit, gravel" (cognate with Greek kroke "pebble"). The Arabic word also was borrowed in Italian (zucchero), Spanish (azucar, with the Arabic article), and German (Old High German zucura, German Zucker), and its forms are represented in most European languages (such as Serbian cukar, Polish cukier, Russian sakhar).
Online Etymology Dictionary

The intricate intertwining of Sanskrit, Persian, and Arabic led me to contemplate how great a contribution each of these languages had made to each other and to Central Asian, European, and other languages.  I was especially interested in Persian, because, when I was studying Uyghur, I was amazed at how many words of Persian derivation it contained, and I had the same experience when I was learning Hindi-Urdu.

Above all, these reflections took me back to the first four sentences of Nepali that I learned during Peace Corps training.  I'll never forget them.  During the whole summer in Columbia, Missouri, as we were learning Nepali, not a single word of English was ever used in the classroom.  In fact, most of our teachers didn't know any English.  This was truly total immersion, right from the beginning.

When we learned these first four sentences, the teacher held up the appropriate objects:

1. (showing a pen)
a. Yo ke ho?
b. Yo kalam ho.

2. (showing a book)
a. Yo ke ho?
b. Yo kitab ho.

3. (showing a piece of paper)
a. Yo ke ho?
b. Yo kagazh ho.   (not sure of the spelling)

By the second sentence, I already knew that "ke" must be an interrogative word cognate with "que", which I knew from Spanish, French, and other languages, and "yo" must be some sort of demonstrative pronoun.

During those lessons, I wasn't even thinking in English, so didn't translate the sentences at that time, but will do so now:

a. What is this?
b. This is a pen.

a. What is this?
b. This is a book.

a. What is this?
b. This is paper.

What was striking about all three of those substantive Nepali nouns is that they came from Arabic and Persian (must have something to do with technologies of literacy).  In an effort to get a rough idea of how many words from Persian there are in various languages, I asked several of my colleagues who are specialists in these languages whether they could tell me approximately the proportion of words in them that were borrowed from Persian.

From Richard Foltz, a specialist on Iranian religion:

The percentages are very rough because they differ according to the register: spoken or written, scholarly or popular, etc. They say 70 per cent of Persian vocabulary is Arabic (and similar for Perso-Arabic in Turkish, Urdu, Uzbek, etc.), but whose vocabulary? A pompous mullah's speech might be 90%, while an ordinary conversation among friends could be 80% pure Persian. Ferdowsi (940-1020) made a deliberate effort to eschew Arabic vocabulary from the Shah-nameh, and that is still considered the model for Persian speakers. Some nationalists today insist on saying "Parsi" instead of "Farsi", etc. In short, it varies dramatically from one speaker and one situation to the next. Hindu nationalists in the 19th century made a forced effort to replace Perso-Arabic vocabulary in Hindustani with contrived Sanskritic roots (thus creating Hindi and Urdu out of what had previously been a single language), Ataturk did the same with Turkish and Reza Shah attempted it with Persian too.

Kalam and kitab are Arabic and kagaz I believe is Turkish (paper having arrived from the East).

From Brian Spooner, specialist on Persian:

1. [Persian borrowings in Arabic] Can't give you the percentage, but it's low, and mostly very early borrowings, pre-Islamic, like divan, which eventually became the French douane.

2. [Central Asian Turkic languages] Uzbek is full of Persian.  Uyghur and others I think less, pretty certain, but don't know for sure.  I think the reason is that the more a language is written the more Persian it borrowed; the more colloquial, the less it borrowed.

3. [Hindi-Urdu] Again I can't give you a percentage.  But written Urdu is basically Hindi syntax with Persian vocabulary.  Again written has more Persian than spoken.  But even spoken (Hindustani) has an amazing amount of Persian in it, sometimes difficult to recognise.  For example, in Hyderabad (AP) a few years ago I found that in order to get my tea without sugar from a stall I had to say ba-gar shakar ka, a mixture of Persian, Arabic and Hindi!

4. qalam and kitab are Arabic; kaghaz is Persian.

From Jamal Elias, a specialist on South Asian religion:

I don’t know whether Persian took “shakar/shakr” from another Indo-Iranian language or whether it borrowed it backwards from the loan word used in Arabic, “sukar”. My guess is the former. Persian has another word for sugar/sweet, “qand” which is cognate with “candy”.

1 – 3. I can’t say about percentages. There is very little recognizable Persian vocabulary in Arabic. The bulk of what has been borrowed in the classical/literary language is from the early Persian of Islamization and perhaps earlier, so it’s Avestan mostly, with some Soghdian, etc although I don’t know the trajectory by which those words would have entered (i.e., through Avestan or through vernaculars). Examples would be the word “kaghaz” for “paper” which you have below. It’s used in Arabic sometimes, and it’s spelled in Persian using a peculiarly Arabic “z”. There are more colloquial loan words in Arabic which mostly come during the Ottoman period; as such, they are borrowings from Turkish, which happens to have Persian vocabulary.

The proportion of Persian is much higher in Central Asian Turkic languages, Urdu and western Turkish. In the case of Urdu, the majority of compound verbs are Persian and there tend to be two words for most things, a “high” word which is Persian and an “ordinary” word which is Indic vernacular. It’s very much like English and the Norman-Saxon division. The same holds more or less true for Turkish, or, to be more accurate, did so until language reform in the 20th century.

Three things to bear in mind:

(a), the Persian in question is the literary language of the 12th-16th centuries which also served as the lingua franca during that period in the cultural zones you mention. This was/is the language of Khurasan (Asiatic Persian of northeastern Iran, Afghanistan, northern Pakistan, Central Asia, etc), not the language of the Persian Gulf and the Caucasus (Shiraz, Isfahan, Hamadan, etc). The latter dialect came to dominate Iran in the modern period, such that the meanings of words can be different between modern Iranian Persian and the loan words in the other languages, which often share the premodern meaning of the word.

(b) Something similar holds true concerning Arabic vocabulary in Persian vs in Arabic; and other languages often borrowed Arabic vocabulary through Persian as Persian loan words. Things are different in the modern period.

(c) There has been conscious and unconscious language engineering in all the languages in question. Ottoman Turkish tried to replace Persian vocabulary with Arabic in the 17th-19th centuries, and modern Turkish has “Turkicized” by trying to replace Arabic and Persian vocabulary. Modern Central Asian languages are chock full of Russian vocabulary, to the point that the Tajik speakers I’ve met cannot speak Persian without using Russian. Modern loan words in Urdu tend to be English, and national “Islamization” has resulted in many Persian words being replaced by Arabic equivalents.

Among other things, the difference between Hindi and Urdu is one of political consciousness. What makes them different from each other (aside from script) is the choice of Hindi to emphasize a Sanskritic Hindu vocabulary and of Urdu an Arabo-Persian Islamic one. As such, Hindi has much less Persian than Urdu, and it gets less and less every day as Hindi rids itself of remnants of its pre-modern Islamic connection.

4. Of those three nouns beginning with "k" that you learned in Peace Corps training, “kaghaz/kagit etc” is Persian, the other two are Arabic.

From Roger Allen, specialist on Arabic literature:

Of those three words, kalam and kitab are unequivocally Arabic in origin.  Kagach is Persian.

There are indeed some Persian words in Arabic, but there are many more Arabic words in Persian.  Actually–as in Turkish during the regime of "Ataturk"–there was a move to eradicate the strength of the Arabic presence in Persian during the reign of the Shah–who was anxious to downplay the Islamic element and emphasize his (completely phony) ancient Iranian ancestry, but, with the return of Imam Khomeini in 1979, the Arabic element was again enhanced, as an "Islamic" gesture.

From Leopold Eisenlohr, a specialist on Persian, Arabic, Turkic, and Chinese:

Sugar as a word looks like it enjoys a long history of popular etymological theories, even including that the word came originally from Chinese as shazhe or shache (砂?) and from there became the Indic word śarkara. I have no idea where that comes from, but I've seen it a few times.

As for sarkara itself, in the meaning of 'gravel, sand particles' it is attested by Monier-Williams in the Śatapaṭha Brāhmaṇa, which represents the language of the 8th-6th centuries BC, and the Kātyāyana Śrauta Sūtra which is much later, Kātyāyana having lived sometime between the 4th and 6th centuries CE. In the meaning of 'sugar,' the attestations are Kāvya literature in general (this style began in the 1st-2nd centuries CE) and the 6th century polymath Varāhamihira's encyclopedic work the Bṛhat-Saṃhitā. I am including Monier-Williams's entry for the word śarkara as well as the entry which includes the word 'sugar' from Ali Nourai's Etymological Dictionary of Persian, English and other Indo-European Languages.  (see images copied below)

As you can see, Ali Nourai derives the word from IE korkâ. I did not know that crocodile means "pebble worm." I think some more specific information about what kind of sugar this could have been would be needed. I know that the Bengali word for sugar is chini, which I always assumed means "Chinese," but apparently it refers to white Chinese porcelain, and is used for refined white sugar. I don't know if that word is used in other Indic languages.

The other questions are complicated and I will try to answer them in a separate email to come a bit later today. Quickly, I will say that qaghaz is Turkic, meaning something like 'treebark,' and has been described as a borrowing from Chinese guzhi (I don't know the characters), apparently even by the great Annemarie von Gabain. I am still trying to locate this citation. Forms of the word appear in Turkic as kaġad, kaġïd, kägädä, kägdä, qaġat, qaġaz. Qalam is related to Greek kalamos, Latin calamus, and Sanskrit kalama, all from IE, originally meaning a kind of reed which was used for writing.

For the record, Nourai specifically notes that qaghaz does not come from guzhi. We need an Indo-Europeanist to give a definitive opinion on Nourai.

I'll just add a few statistics here about vocabulary.

For Uyghur, one source I have seen reference to is E.N. Nadjip's 1971 Modern Uigur. Abdurishid Yakup in The Turfan Dialect of Uyghur used Nadjip's data, collected in newspapers in 1944, for analyzing the makeup of the Uyghur lexicon. Essentially they say that about 41% of the sample were Arabic and Persian words, with 33.5% being "Arabic" and 7.5% being "Persian." However, as Reinhard Hahn writes in his entry for Uyghur in The Turkic Languages, most Persian words reached Uyghur through Chaghatay and Iranic literature as well as in urban trading centers of Central Asia; a lot of them were loanwords from Arabic and they had already been subjected to all the phonological mutation of adapting Arabic to Persian. Whether they can be called Persian or Arabic at that point would, I think, be a philosophical decision, since speakers probably didn't think about their word histories any more than modern English speakers do actively during conversation. I have had the experience many times of belatedly realizing an Uyghur word is ultimately from Arabic, since by this point it has been spit out of the Inner Asian word processor many times. Like the word vapat, "death" – took me a while to make the connection with Arabic wafāh. There is also a layer of words in Uyghur that come directly from Arabic religious texts. I think it makes sense that Uyghur speakers were more in contact with Persian speakers and therefore adopted what they would have thought of as Persian words. There may also be a more trustworthy source than Yakup and Nadjip.

I wrote a paper for Dr. Lowry about the treatment of foreign vocabulary in Sibawayhi's Kitab, one of the earliest Arabic grammatical works (which I suspect has a Sanskrit influence in its composition), where Sibawayh gives an excruciatingly detailed method for arabicizing foreign words so they can inflect using Arabic rules. It was a controversial issue whether there was any non-Arabic vocabulary in the Qur'an; of course, there is, including many of the names. As for the percentage, I haven't found a definitive answer. It's difficult to prove whether Arabic words are ultimately from Persian unless they are also attested as borrowings in Aramaic which was the lingua franca of the Lakhmids, who were important early intermediaries between Persian and Arab culture. After the Islamic conquests, Arabs employed Persians in government since they had no experience with complex state structures, and many Persian words entered Arabic at that time (or became mu'arrabah, arabicized). One of the extremely interesting cultural movements from this time was called shu'ubiyyah – it was a Persian ethnolinguistic movement to assert the superiority of their cultural heritage over the "lizard-eating, camel-driving-song-wailing" Arabs who suddenly controlled them.

I'm not sure about Urdu/Hindi, but Urdu is intensely Persianate. It even adopted the Persian morphological/syntactical structure for subordination called ezafe. In Hindi, you use kaa/ke/ki depending on the gender. It works like this: punjab kaa sher (Punjab / [subordination particle] / lion) = "the Lion of Punjab." But in Urdu they tack on -e or -ye, as in Persian, and say sher-e punjab, also meaning "the lion of Punjab."

From Hiroshi Kumamoto, specialist on Khotanese:

The etymology of Skt. śárkarā- is "schwierig" according to Mayrhofer, EWA II 618.

قلم qalam is from Arabic (verbal root "to cut", presumably "reed cut for writing") as well as the well-known کتاب kitāb from the verbal root "to write".  For کاغذ kāγaẕ "paper", the thing being of Chinese origin the etymology of Laufer (Sino-Iranica 557ff.) sounds right to me (although there may be other proposals that I'm unaware of). It is certainly not Arabic (whose normal word for "paper" is ورق waraq), nor is it found in Pre-Islamic Middle Persian, although it is found in the Shahnameh (via one of the Central Asian Turkic languages?) which is said to have purged the Arabic elements in the vocabulary as much as possible.

k’γδy' ("paper"), with the final aleph after -y-. The -y' part was later dropped while the word traveled through languages, and a vowel was inserted between γ and δ. In any case there is no good Iranian etymology for such a form. It occurs in a Turco-Sogdian document from Dunhuang (9-10c.) which is nearly contemporaneous with Ferdowsi; see Hamilton/Sims-Williams 1990, 48, for other examples in Sogdian as well as Iranological speculations.

Going back to the beginning of the post, the meaning of the title, which must be painfully clear by now, is "Persian is sugar" and the language, of course, is Persian.  This sentence caught my eye as the heading of a flyer for an Elementary Persian course to be taught at Penn in the fall.  The poster had this explanatory paragraph:

An Indo-European language, Persian (aka Farsi, Dari, and Tajiki) is spoken by over 100 million people in Iran, as well as Afghanistan and Tajikistan.  Proficiency in this critical language is the key to unlocking access not only to an emerging market, but also to a cultural treasure trove of literature, cinema, music, and much more.

Here's the sentence in the current orthography of the language:  فارسی شکر است .  It comes from a short story writer from the middle of the last century, Mohammad-Ali Jamalzada (1892-1997).

[Thanks to Wilma Heston]


  1. Pat Barrett said,

    April 11, 2015 @ 12:12 am

    Urdu has both cini and shakar for sugar, cini the everyday word. Is the spelling of an Urdu word with an "unusual" Arabic consonant a sure sign of its Arabic origin?

  2. John Thayer Jensen said,

    April 11, 2015 @ 12:24 am

    When I saw the question, I didn't catch the 'shekar' but I did the 'Farsi' and the 'ast.' Reading the article quickly saw the 'sugar.' But I thought I must be mistaken – what in the world could "Persian is sugar?" mean??


  3. James Bradbury said,

    April 11, 2015 @ 3:26 am

    Where does the flowchart-like etymology diagram in the middle of the post come from?

  4. Bob Ladd said,

    April 11, 2015 @ 3:36 am

    What John Jensen said.

  5. John Walden said,

    April 11, 2015 @ 3:49 am

    Persian is sweet?

    If Persian is rigorously SOV it can't be "Sugar (the word) is Persian".

  6. David Donnell said,

    April 11, 2015 @ 4:09 am

    From my minimal familiarity with Persian word order–hearing my friends speaking it for 3 decades–"Persian is sweet" sounds correct to me, John Walden.

    And I'm happy we are speaking of the language as "Persian". For me, hearing English-speakers speak of "Farsi" is as irritating as hearing some say, "Do you speak Italiano?" etc etc…


  7. AG said,

    April 11, 2015 @ 4:47 am

    Great post, with a lot to look over at leisure and think about.

    Side note:

    "Seersucker" might be the only English word that contains the modern, not indo-European-era, version of the second word in the post title. Oder?

  8. AG said,

    April 11, 2015 @ 4:54 am

    … A few minutes' googling shows me that my previous comment was raving idiocy. For some reason I'd thought that English "sugar" had a Greek or Latin origin, (like "sucro-" something), not medieval Arabic. Never mind and please forgive me. I still recommend that everyone contemplate the word "seersucker", however, because it's a fun word and a fun fabric.

  9. GeorgeW said,

    April 11, 2015 @ 6:54 am

    FWIW, the Arabic s-k-r root occurs a couple of times in the Qur'an with an 'intoxication' meaning.

    Does Farsi have such a meaning? If so, could one translate the sentence as "Farsi is drunk, or maybe intoxicating?"

  10. AW said,

    April 11, 2015 @ 8:26 am

    James Bradbury, it looks like it comes from An Etymological Dictionary of Persian, English and other Indo-European Languages.

  11. Victor Mair said,

    April 11, 2015 @ 8:31 am

    From Richard Foltz, who is on the road somewhere between Armenia and Iran:


    Thanks for sharing the log! I was intrigued that so many colleagues failed to note the Turkic origin of kaghaz. (It even sounds Turkic!) I don't know if I previously suggested you consult Steingass' Persian-English dictionary (or whether you already know it), but it identifies the origins of foreign words via letter codes (A, T, M, etc.).

    There is also a very good article on Persian words in the Qur'an but I can't remember what it is right now and I am in a Tbilisi B&B without ready access to reference sources. Maybe next week once we're settled "at home" in Tehran….


    Upon reading Richard's message, I dutifully went to my downstairs library and pulled down the big Steingass Persian-English Dictionary from its shelf. On p. 1066a, I find:


    کاغذ kāghaz, kāghiz, [VHM: "gh" and "z" underlined in both cases] Paper; a letter; a charter or patent, presented by the kings of Persia to those whom they mean to honour, and by virtue of which the governors of every district through which the holder of it travels must supply him, the moment he presents it, with carriages and every necessary to which his rank is entitled….


    There is no "T" for Turkish at the beginning of the entry.

  12. Victor Mair said,

    April 11, 2015 @ 8:43 am

    After reading Leo's long contribution, I told him that I loved the line about "lizard-eating, camel-driving-song-wailing" Arabs. He kindly replied:


    Here's the poem where that specific idea can be found, but there are plenty of others like it in other Shu'ubiyya works. The Shahnameh also mentions the Arabs as lizard-eaters. This poem is by the Iraqi Bashshar ibn Burd (714-783).


    Is there a messenger, who will carry my message to all the Arabs,

    to him among them who is alive and to him that lies hid in the dust?

    To say that I am a man of lineage, lofty above any other one of lineage:

    the grandfather in whom I glory was Chosroes, and Sāsān was my father,

    Caesar was my uncle, if you ever reckon my ancestry.

    How many, how many a forebear I have, whose brow was encircled by his diadem,

    Haughty in his court, to whom knees were bowed,

    Coming in the morning to his court, clothed in blazing gems,

    One splendidly attired in ermine, standing within the curtains,

    The servitors hastening to him with golden vessels:

    He was not given to drink the thin milk of a goatskin, or to sup it in leather vessels;

    Never did my father sing a camel-song, trailing along behind a scabby camel.

    Nor approach the colocynth, to pierce it for very hunger;

    Nor approach the mimosa, to beat down its fruits with a stave;

    Nor did we roast a skink, with its quivering tail,

    Nor did I dig for and eat the lizard of the stony ground;

    Nor did my father warm himself standing astraddle to the flame;

    No, nor did my father use to ride the twin supports of a camel-saddle.

    We are kings, who have always been so through long ages past…


    To which Leo added: "This poem was written in Arabic, by the way!"

  13. Victor Mair said,

    April 11, 2015 @ 8:47 am

    @James Bradbury

    AW is right about the Etymological Dictionary of Persian, English and other Indo-European Languages, which is by Ali Nourai.

  14. Jerry Friedman said,

    April 11, 2015 @ 9:20 am

    Leopold Eisenlohr wrote: "Qalam is related to Greek kalamos, Latin calamus, and Sanskrit kalama, all from IE, originally meaning a kind of reed which was used for writing."

    Hiroshi Kumamoto wrote: " قلم qalam is from Arabic (verbal root 'to cut', presumably 'reed cut for writing')"

    Is there any way to tell which is right? Could both origins have contributed?

  15. Peter B. Golden said,

    April 11, 2015 @ 9:21 am

    Most interesting post. In Turkish kâğıt is considered a Persian loanword, going back to kâğız/kâġaz (see Yaşar Çağbayır, Orhun Yazıtlarından Günümüze Türkiye Türkçesinin Söz Varlığı Ötüken Türkçe Sözlük (İstanbul: Ötüken Yayınevi, 2007), III:2332). In spoken Turkish, the Arabic and Persian element varied from speaker to speaker, depending on education and situation. Politics also played a role. More conservative individuals (one could observe this in newspaper columnists in particular) tended to use more "Ottoman" elements (Arabic and Persian). Those who were left of center (ortanın solu – a commonly used term in the 1960s, when I was doing graduate studies in Ankara) used neologisms "revived" or created by the Türk Dil Kurumu. Many of the neologisms have remained (and quite a few have been replaced by even newer neologisms). This produced a language division within families. Parents complained that they could not fully understand their children's school textbooks. Another interesting (at least for me) aside: a number of the older newspaper writers in the 60s were people whose initial education had been in the pre-language/alphabet reform era. They wrote Turkish in Arabo-Ottoman script (usually called the "old letters," eski harfler) and had underlings transcribe their articles into the Modern Latin-Turkish alphabet, often with amazing results. Foreign names were butchered. "Khrushchev" (which posed enough problems for Americans: Хрущёв – Khrushchiov, usually becoming "Krushev" in American English) appeared in all manner of forms. Some neologisms have taken strong root, thus "cevap" "answer" (from Arabic) is now commonly "yanıt."

    "Central Asian Turkic" is too general a term here. Uzbek and Uyghur, two closely related languages, derive from Southeastern Türkî. Both groupings absorbed substantial Iranian urban elements (some still speaking Tajik in Uzbekistan, bilingualism in Bukhara and Samarqand is common) and this accounts for the high percentage of Persian (and Arabo-Persian) terms. Modern Uzbek is based on urban dialects (esp. of Tashkent) that were heavily Iranized, to the extent that vowel harmony, so typical of Turkic languages, was broken. In the north, where one finds the actual Uzbeks deriving from the same grouping of eastern Qıpchaq Turkic-speakers that form the foundation of the Qazaq people, vowel harmony remains and the Perso-Arabic element (going back to the Chaghatay literary language) is far less. The language is more like Qazaq. In western Uzbekistan (also retaining vowel harmony) there is a strong Oghuz influence in vocabulary and grammar.

  16. Leo E said,

    April 11, 2015 @ 10:24 am

    Yi Jing (635–713), in his Sanskrit-Chinese dictionary Fànyǔ qiān zì wén 梵語千字文 (One Thousand Written Words from the Sanskrit Language) gives kākali काकलि as the word for zhǐ 紙, paper. There is no such definition in the Monier-Williams Sanskrit dictionary, which only has "a soft sweet sound" and "name of an apsara." Just throwing this out there since kākali could presumably be related to kāghit/qaghaz. Not sure about the version of this text but it has the Sanskrit written in: http://www.jingdianguoxue.com/kaishu/6802_19.html (top right, third from right). The Chinese transcription is jiā (yǐn) jiā lī 迦(引)迦哩, long ka + ka + li (li can transcribe Sanskrit ṛ). There are images from another edition on the SAT Daizokyo database, but I don't know how to link it.

  17. Rodger C said,

    April 11, 2015 @ 10:38 am

    I thought "crocodile" meant "yellow/light-green lizard." Cf. "crocus."

  18. Jonathan Wright said,

    April 11, 2015 @ 1:41 pm

    I'm surprised that those you consulted were not unanimous that qalam comes from the Greek or Latin words kalamos/calamus. Some apparently didn't even consider it or perhaps hadn't even heard of it. Given the status of Hellenic culture in Syria and the north of the Arabian peninsula in the relevant period, I would think the Greek/Latin etymology would be the overwhelming favourite. Somehow I don't find the derivation from qalama – to clip or pare – very convincing. The form is not what one would extended for 'something pared'.

  19. Jonathan Wright said,

    April 11, 2015 @ 1:42 pm

    The form is not what one would EXPECT for 'something pared', I mean

  20. Leo E said,

    April 11, 2015 @ 1:52 pm

    Adding to the note about Yi Jing, I just noticed that the word for pen cognate with qalam is right there too: कलम (kalama), Jiā luó (yǐn) mó =bǐ 迦攞(引)摩=筆, "*ka+(long) la+ma = pen." (using Charles Müller and Soothill's entries for what Chinese characters are used to transcribe Sanskrit sounds). No mention of kitab, of course, though that would be interesting; shū 書, book, translates the words lekha लेख and rikh रिख (which means "to scratch, scrape"…could he mean ṛc ऋच्, "sacred writing"?).

  21. J. W. Brewer said,

    April 11, 2015 @ 2:00 pm

    Is there something about Western-style classroom instruction, even in a full-immersion approach, that focuses on the technology of literacy? I mention this because as best as I recall pretty much the first example sentences in Japanese I learned way back when (as a 3d grader at the American School on the outskirts of Tokyo) were "Kore wa hon desu" (glossed for us as "this is a book," although really "proximal-demonstrative topic-marker-particle book polite-copula") and "Kore wa empitsu desu" (the same but swapping in "pencil" for "book").

  22. Q. Q. Switcheroo said,

    April 11, 2015 @ 2:28 pm

    "FWIW, the Arabic s-k-r root occurs a couple of times in the Qur'an with an 'intoxication' meaning."

    Any relation to "shikker" – Yiddish for "drunkard"?

  23. GeorgeW said,

    April 11, 2015 @ 2:44 pm

    "Any relation to "shikker" – Yiddish for "drunkard"?"

    The Hebrew root sh-k-r means to be intoxicated. So, the Yiddish must be related.

  24. Jake Nelson said,

    April 11, 2015 @ 3:04 pm

    I've suspected for a while that "sarkarah" and "calculus" derive from the same word for gravel. I tend to suspect a lot of cognates that get classed as "possible but unproven", though.

  25. P'i-kou said,

    April 11, 2015 @ 3:12 pm

    Not sure what to think about this Sanskrit kākali/kakari. After some googling, all references to it seem to lead to two Chinese glossaries studied by Prabodh Chandra Bagchi, the Fànyǔ qiān zì wén 梵語千字文 (T. 2133), which Leo E discusses and links to, that has kākali, and the Fànyǔ zámíng 梵語雜名 by Lǐyán 禮言 (T. 2135) with what the CBETA edition renders as kakari, transcribed 迦迦里. (The Sanskrit transcriptions in the CBETA online edition aren't entirely consistent with the Siddham in the text linked to by Leo E, but they do agree on both kākali and kalāma.)

    Likh- means 'to write'.

    I'm not sure to what extent these Chinese transcriptions can attest a generalised Medieval Sanskrit usage beyond the medium in which they were transmitted to Chinese pilgrims. Leo E's link has hoti in Siddham, glossed as the vernacular copula 是 shì. Hoti is 'to be' in Pali and various Prakrits, but not in Sanskrit. This suggests that the form of 'Sanskrit' Yijing was writing down was not a particularly classical form of the language and had at least a component of spoken Indic. These words for 'paper' could also belong to such spoken varieties.

    As to what vernacular languages Yijing's informants or teachers might have spoken, it's anyone's guess since he was pretty much everywhere, from Nalanda to Sumatra.

    Monier Williams has कागद kāgada 'paper', apparently the word for it in e.g. Marathi.

  26. Leo E said,

    April 11, 2015 @ 3:45 pm

    @ P'i-kou
    Yes, absolutely agree with your statement about the vernacular form of (possibly Inner Asian-infused) Sanskrit with which Chinese pilgrims would have come in contact. For now let me just add that in the Fanyu za ming 梵語雜名 the three definitions of shi 是 are, as you said, hoti 戸底, as well as bhavati, ba+va+ti, 婆縛底 "it is," and iva, yi wa, 翳嚩, "like, too." Yi Jing also has evam, "thus," for shi 是, transcribed as yi zong (?) 曀鍐. That's very interesting about the use of hoti, and it really makes me wonder what would have fallen under the category of Fanyu 梵語. Would the Turkic Buddhist texts (oral or written), as extensions of the larger realm of Buddhistese, have been able to be called Fanyu? It would be extremely interesting to go through these glossaries and examine which words are chosen and see what can be inferred from that.

  27. Slava Kozlov said,

    April 11, 2015 @ 4:03 pm

    As you wrote, in Russian sugar is indeed 'сахар' (sakhar), but interestingly, in Kazakh it's 'қант' (kant); or at least it was translated as such when I was living there (some 20 years ago). Later I learned that there is another word in Kazakh for sugar, 'шекер' (sheker), but it wasn't in use in my time, at least on packaging (it was always 'қант').

  28. P'i-kou said,

    April 11, 2015 @ 4:41 pm

    I think the second character in the transcription of evam must be a (Japanese? doesn't the Yijing text come through a Japanese transmission?) variant of wàn, Middle Chinese (Baxter) mjomX which would be a more credible transcription of vam. The meaning would seem to be the same as that of pronounced cong or zong.

    This restaurant in Japan is called "鍐 – Van" (i.e. using the character that is pronounced zong in Chinese), and their logo has a prominent Siddham syllable vam.

  29. Leo E said,

    April 11, 2015 @ 4:49 pm

    @Slava Kozlov
    Glad you mentioned qand, since it reminded me of one definite example of a Persian loan used in Arabic – there's a 12th century book of the history of Samarqand called al-Qand fi dhikr `Ulama' Samarqand, or "The Sugarlump of Remembrance of the Samarqand Clerics." There's also a Muntakhab al-Qand fi Tarikh Samarqand, "Selections of Sugarlumps in the History of Samarqand." It's a happy coincidence between qand, city, and qand, sugarlump. I think city-qand is Soghdian, but I wonder about sugar-qand.

  30. Victor Mair said,

    April 11, 2015 @ 4:58 pm

    From Peter B. Golden:

    The extent of the Arabic and Persian (or Arabo-Persian) elements (some of the Arabic words in Turkish come directly from Arabic, others are Arabic via Persian) depends, I suspect, on who is doing the counting. Some words, like şey (from Arabic شى shay) “thing” and an all-purpose word used when you are searching for another term, or simply uhm-ing and oh-ing, are so common in everyday speech that no one (except for the educated) would think of them as foreign loanwords. The changes wrought by the Türk Dil Kurumu (Turkish Language Society, founded with the support of Atatürk, who wanted to de-Ottomanize Republican Turkish society and language) do, indeed, surface in the different generations. When I speak with younger people, they smile indulgently because I sound like their grandfather and use dated terms and expressions. Geoffrey Lewis wrote a wonderful (and witty) book on the subject: The Turkish Language Reform. A Catastrophic Success (Oxford: OUP, 1999). In my reading and correspondence, I daily encounter neologisms, which are increasingly expected to be used in dissertations and scholarly articles etc. At the same time, the would-be tyrant, Erdoğan, has pushed through the teaching of Ottoman in schools. I have mixed feelings about this. Yes, it is good to be able to reconnect with an older (and important) tradition, but Erdoğan is pursuing a very strong political (and ultraconservative) agenda that is a deliberate step back (or several steps back) from Atatürkist secularism.

  31. Victor Mair said,

    April 11, 2015 @ 5:32 pm

    I was going to include the etymology of "candy" in my original post, but was stymied when I found two quite different proposals:

    1. Online Etymology Dictionary

    late 13c., "crystalized sugar," from Old French çucre candi "sugar candy," ultimately from Arabic qandi, from Persian qand "cane sugar," probably from Sanskrit khanda "piece (of sugar)," perhaps from Dravidian (compare Tamil kantu "candy," kattu "to harden, condense").

    2. American Heritage Dictionary of the English Language, 5th ed.

    Middle English candi, crystallized cane sugar, short for sugre-candi, partial translation of Old French sucre candi, ultimately from Arabic sukkar qandī : sukkar, sugar + qandī, consisting of sugar lumps (from qand, lump of crystallized sugar, from an Indic source akin to Pali kaṇḍa-, from Sanskrit khaṇḍakaḥ, from khaṇḍaḥ, piece, fragment, perhaps of Munda origin).

    I should note that my printed 5th ed. of the AHD is almost the same as the online ed. up to sucre candi, after which it diverges thus: and Old Italian zucchero candi, both from Arabic sukkar qandī : sukkar, sugar + qandī, candied (from qand, cane sugar, probably from Dravidian kantu, lump).

    N.B.: In these comments, a left-facing arrowhead is replaced by "from". In the penultimate word, kantu, the "n" and "t" have underdots.

  32. Victor Mair said,

    April 11, 2015 @ 6:56 pm

    From an anonymous reader:

    'Farsi shekar ast' is an iconic saying the Iranian government has used over many decades to encourage learning Farsi. In the Azerbaijani region, where everybody speaks Azerbaijani and not many speak Farsi, it was part of the campaign to encourage people to speak Farsi. My mum was telling me about that when I was a kid because my grandmother did not speak any Persian.

  33. Rodger C said,

    April 12, 2015 @ 11:35 am

    @J. W. Brewer: US language instruction does seem obsessed with common classroom objects. I have a doctoral minor in Spanish, but I sometimes reflect on the fact that I don't think I've ever seen the word pupitre ('schooldesk'), one of my first Spanish words, since I was fourteen. (To be sure, I've never taught elementary Spanish.)

  34. Coby Lubliner said,

    April 12, 2015 @ 2:59 pm

    Another word of Indic (originally Dravidian?) origin that migrated westward via Turkish and Arabic is the common ancestor of patlıcan/melanzana/berenjena/aubergine (eggplant for North Americans).

  35. cameron said,

    April 12, 2015 @ 4:57 pm

    The Indic sugar word was re-borrowed into English in a different form as "jaggery" – which refers to a specific variety of unrefined sugar.

  36. Andrew (not the same one) said,

    April 14, 2015 @ 8:27 am

    J.W. Brewer and Rodger C: My father, as part of his studies in chemistry, had to learn chemical German. This, instead of classroom objects, used laboratory objects. 'Ich bin ein Chemiker. Diese ist das Laboratorium. Das Laboratorium ist hell. Wo ist mein Spritzflasche?'

  37. January First-of-May said,

    April 16, 2015 @ 5:13 am

    It took me almost a minute to realize that the diagram's "Russian "suxar: sweet bread"" is the word сухарь (sukhar') "dry bread", and not anything to do with sweetness or sugar.
    For what it's worth, that word comes from сухой (sukhoy) "dry"; to the best of my knowledge, there's no etymological link to "sugar".

