Are all writing systems equally easy / hard?

Some folks seem to think so, but not Benjamin James, who wrote this letter to the London Review of Books, 47.6 (April 3, 2025), p. 4:

Simple Script

In his fascinating article on the recent decipherment of Linear Elamite, Tom Stevenson finds it difficult to accept that 'the Latin or Greek writing systems are simpler or "more precise" than mostly logographic writing systems like written Chinese' (LRB, 6 March). Does he really believe Chinese script is just as suited as Latin to the rendering of foreign words? 'Tom Stevenson' is far simpler and more phonetically precise than 汤姆•史帝文森, 'Tāngmǔ Shǐdìwénsēn', which adds two syllables, six tones and six individual character meanings. The Committee for Language Reform in China acknowledged the relative simplicity of the Latin script as one of the factors behind its abandonment in 1956 of the attempt to develop a phonetic script based on Chinese characters.

If Stevenson says this because he thinks the idea of one script being simpler than another is somehow discriminatory (as well as untrue), then he might prefer to consider the example of the long-vanished Tangut people, who possessed, according to the Tangutologist Gerard Clauson, 

one of the most inconvenient of all scripts, a collection of nearly 5800 characters of the same kind as Chinese characters but rather more complicated . . . It is extremely difficult to remember them, since there are few recognisable indications of sound and meaning in the constituent parts of a character, and in some cases characters which differ from one another only in minor details of shape or by one or two strokes have completely different sounds and meanings. Imagination boggles at the thought of teaching typesetters to set it up.

The Tangut script, supposedly created by a single Chinese bureaucrat in 1038, died out at the start of the 16th century – to the probable relief of future generations, who were free to write in Chinese, Tibetan, Mongolian or some other comparatively simple script.

Here I would like to pay tribute to two Tangutologists:

Sir Gerard Leslie Makins Clauson (1891-1974): larger than life, he was "an English civil servant, businessman, and Orientalist best known for his studies of the Turkic languages. He was born in Malta."  Here I am celebrating his achievements in Tangut studies, but his accomplishments in Turkology were, to me, supernatural.  When I behold his An Etymological Dictionary of Pre-Thirteenth-Century Turkish (Oxford: Clarendon Press [1972]), which was almost like a bible for me when I was studying Old Turkish and Tocharian (!!), I think that alone would be enough for any scholar to produce in one lifetime, but there is much more:

Clauson attended Eton College, where he was Captain of School, and where, at age 15 or 16, he published a critical edition of a short Pali text, "A New Kammavācā" in the Journal of the Pali Text Society. In 1906, when his father was named Chief Secretary for Cyprus, he taught himself Turkish to complement his school Greek. He studied at Corpus Christi College, Oxford, in classics, receiving his degree in Greats, then became Boden Scholar in Sanskrit, 1911; Hall-Houghtman Syriac Prizeman, 1913; and James Mew Arabic Scholar, 1920. During World War I, he fought in the battle of Gallipoli but spent the majority of his effort in signals intelligence, concerned with German and Ottoman army codes.

These were the years in which the great Central Asian expeditions of Sven Hedin, Sir Aurel Stein and others were unearthing new texts in a variety of languages including Tocharian and Saka (both Khotanese and Tumshuqese). Clauson actively engaged in unraveling their philologies, as well as Chinese Buddhist texts in the Tibetan script.

Clauson also worked on the Tangut language, and in 1938–1939 wrote a Skeleton dictionary of the Hsi-hsia language. The manuscript copy is held at the School of Oriental and African Studies in London, and was published as a facsimile edition in 2016.

In 1919 he began work in the British Civil Service, which was to culminate in serving as the Assistant Under-Secretary of State in the Colonial Office, 1940–1951, in which capacity he chaired the International Wheat Conference, 1947, and International Rubber Conference, 1951. After his mandatory retirement at age 60, he switched to a business career and in time served as chairman of Pirelli, 1960–1969.

A partially filled notebook containing Sir Gerard Clauson's Notes on Kashgari's Divan lugat at-Turk [VHM:   the first comprehensive dictionary of Turkic languages] and other cognate subjects (1072-74) is held at the Cadbury Research Library, University of Birmingham.

(Wikipedia)

If we're talking about someone who knew difficult languages and mind-numbing scripts, it was Clauson.  He authoritatively knew whereof he spoke.

Nikita Kuzmin

He completed his PhD at the University of Pennsylvania in 2023 on the following subject:

"Pilgrimage in Tangut Xia:  Study of Tangut Epigraphy from Dunhuang and Tangut Woodblock Prints from Bezeklik".  (free PDF)

Abstract

This dissertation aims to examine the pilgrimage activities of the Tanguts in the 11th–13th centuries in the Hexi Corridor, based on the research of the two corpora of Tangut received textual materials – Buddhist inscriptions that pilgrims left on the walls of the Buddhist cave complexes of Mogao and Yulin and the fragments of Tangut Buddhist texts excavated from Bezeklik. Chapter 1 introduces various manifestations of pilgrimage and articulates features of Buddhist pilgrimage in multiple regions in Asia. Chapter 2 displays the historical and religious characteristics of Mount Wutai and the greater Dunhuang area, which played a crucial role in the establishment and development of Tangut Buddhism. It also discusses various external factors (Uyghur monks) that influenced the propagation of Buddhism among the Tanguts. In Chapter 3, I analyze the remained Tangut inscriptions from Mogao and Yulin caves and interpret them within corresponding historical and religious contexts. Based on the comparative research of the inscriptions, I argue the existence of a unified “inscriptional discourse” in the greater Dunhuang area in the 10th to 13th centuries. Chapter 4 discusses codicological and contextual features of a corpus of Tangut Buddhist woodblock prints from Bezeklik caves. In the end, the dissertation provides an English translation of 22 inscriptions and 12 pieces of Tangut woodblock prints.

To accomplish this arduous task, one of the first things I had Nikita do was go off to Kathmandu to study Classical Tibetan for a summer.  He already knew Mandarin (with virtually native fluency), Classical Chinese, Japanese, German, and his native Russian.  Oh, yes, and he was fluent in English.

No, to all those doubting Thomases out there who think that mostly logographic scripts like Tangut and Chinese are as simple and precise as the Latin or Greek alphabet, they are not.

If you only have one year to learn a new script, don't try Tangut or Chinese.

[Thanks to Leslie Katz]



18 Comments »

  1. jin defang said,

    April 17, 2025 @ 1:02 pm

    Re Sir Gerard: how does anyone become proficient in Pali at age 15 or 16, much less be able to write a critical essay on it? Surely Pali isn't part of the curriculum at Eton.

  2. Ron Vara said,

    April 17, 2025 @ 1:10 pm

    ᱮᱞᱚᱱ ᱨᱤᱣ ᱢᱩᱥᱠ ( /ˈiːlɒn/ EE-lon; ᱡᱟᱱᱟᱢ ᱡᱩᱱ ᱒᱘, ᱑᱙᱗᱑) ᱩᱱᱤ ᱫᱚ ᱢᱤᱫ ᱵᱮᱯᱟᱨᱤᱭᱟᱹ ᱠᱟᱱᱟᱭ ᱚᱠᱚᱭ ᱫᱚ ᱩᱱᱤᱭᱟᱜ ᱢᱩᱲᱩᱫ ᱵᱷᱩᱢᱤᱠᱟ Tesla, Inc., SpaceX, ᱟᱨ ᱴᱩᱭᱴᱚᱨ (ᱡᱟᱦᱟᱸ ᱩᱱᱤ ᱮᱠᱥ ᱞᱮᱠᱟᱛᱮ ᱧᱩᱛᱩᱢ ᱵᱚᱫᱚᱞ ᱮᱱᱟ) ᱨᱮ ᱵᱟᱰᱟᱭᱚᱜ ᱠᱟᱱᱟ᱾

    This is the first sentence in the article Elon Musk in the Santali alphabet (Ol Chiki). Yes, it's an alphabetic writing system, not an abugida. What makes the Santali alphabet really elusive is that it resembles the shapes of the undeciphered Indus Valley script. Soviet archaeologists once tried to decipher IVC seals using the Santali alphabet. Sounds ridiculous. But it's a sad truth that Santali is a unique language to which little to no academic attention has been paid.

    What makes the Santali language itself stand out is the lack of "indianization" as compared to even Southeast Asian Austroasiatic languages like Mon and Khmer. Its phonology is still archetypically East Asian with a superimposed South Asian areal layer. Its morphology is somewhat mixed between SEA Austroasiatic prefixing and a (self-innovative) highly polysynthetic verb akin to that of the Kiranti languages (spoken ~20 kilometres away in Nepal), so Santali is not remotely influenced by Sanskrit, except for a few Austroasiatic loanwords in early Vedic Sanskrit.

  3. Jonathan Smith said,

    April 17, 2025 @ 2:14 pm

    Well, "Tom Stevenson" is also "simpler and more precise" than "Tāng​mǔ Shǐ​dì​wén​sēn", the latter still featuring an additional "two syllables and six tones" etc: are we thus to conclude that the Latin alphabet as applied to English is "simpler and more precise" than the same applied to Chinese? Or indeed that English is just "simpler and more precise" than Chinese?

  4. Scott P. said,

    April 17, 2025 @ 4:20 pm

    Well, "Tom Stevenson" is also "simpler and more precise" than "Tāng​mǔ Shǐ​dì​wén​sēn", the latter still featuring an additional "two syllables and six tones" etc: are we thus to conclude that the Latin alphabet as applied to English is "simpler and more precise" than the same applied to Chinese? Or indeed that English is just "simpler and more precise" than Chinese?

    "Tom" is Aramaic in origin, "Steven" is Greek, only the suffix is Germanic, so I would aver "Tom Stevenson" is only 20% English to start with.

  5. Yves Rehbein said,

    April 17, 2025 @ 5:46 pm

    > Elam, like Sumeria, is an exochoronym, never used in the region itself

    Different letter from the same article. This is at best uncertain and more likely wrong.

    That's equivalent to the six individual character meanings of 汤姆•史帝文森 trans-something from "Tom Stevenson", e.g. 汤 Tang of Shang.

    VHM speaks of "[S]emantic interference, readers frequently misinterpret such expressions as 特納廣播電台 Tènà Guǎngbō Diàntái as 'Special Acceptance Broadcasting Station' instead of as 'Turner Broadcasting Station'" (V. H. Mair, "Modern Chinese Writing", in: The World's Writing Systems, ed. Peter T. Daniels and William Bright, 1996).

    As a Steve, Ivan, Ilan (that one was new), I can confirm that this also happens in closely related languages.

  6. Chris Button said,

    April 17, 2025 @ 6:18 pm

    Chữ Nôm or Tangut script. Which is harder?

  7. Jonathan Smith said,

    April 17, 2025 @ 7:51 pm

    Assuming one knows the relevant language well, no reason (a standardized) Chữ Nôm should be harder than the modern standard Chinese script or than (a standardized) Sinographic Taiwanese script or whatever — same productive principles married to typologically similar languages. Tangut on the other hand is insane.

  8. Victor Mair said,

    April 17, 2025 @ 8:16 pm

    M. V. Sofronov, "Chinese Philology and the Scripts of Central Asia", Sino-Platonic Papers, 30 (October, 1991), 10 pp.

  9. Chris Button said,

    April 17, 2025 @ 8:19 pm

    The lack of standardization of Chữ Nôm is precisely the problem.

  10. Edith said,

    April 18, 2025 @ 4:04 am

    @jin defang

    I can't answer your question, but I can suggest how it might be possible.

    The Pali Text Society has existed since the 1880s, as likely every Eton schoolboy knew :)

    A friend a few years ago taught themselves Pali, just for fun, using resources from the PTS.

    Some people are just gifted in that way.

  11. András Róna-Tas said,

    April 18, 2025 @ 5:29 am

    I do not think that one language may be more difficult than another. It may be, however, more difficult to learn, depending on your own language and learning capacities. The script rendering a language is a code system, designed, until modern times, for a small group of people. Working on the decipherment of the two Khitan scripts, it becomes clear that the models are decisive. The Khitan large or linear script is a Sinitic one, where the pattern is the same as that of Chinese, though neither the formal elements nor the meanings of the units are simple copies of any Chinese ones. Nevertheless, the Khitan Large Script is as complicated as Chinese or Tangut, and has its antecedents in the Sinitic scripts of the Tabgach/Topa and the Tuyuhun. The Khitan Small Script or, as G. Kara called it, Assembled Script works with syllabic characters, of which more than 400 have been identified. The name Tom Stevenson may be transcribed as

    t-om s-t-eu-en-s-on
    (247) (092) (247) (244) (067) (324) (244) (154)
    the Khitan fonts do not work here

    For us the transcription is relatively easy; only ev has to be replaced by eu. From this point of view KSS is an "easy" one, yet a third of the graphs are not deciphered, and those who used it had to memorize more than 400 signs.

    I had the privilege of friendship with Sir Gerard; we discussed many problems, among them the alphabetical order in his EtymDic., over a lunch in his house. He persuaded me that the unusual alphabetic sequence is the most ideal and useful – and he was right. I have a long correspondence with letters handwritten by him, which the communist censorship had great problems in understanding (Tibetan letters, Uigur words written in the Uigur alphabet and the like were suspected of being coded spy information). Many of them were copied by the secret police, and after 1990 I got back copies of a few letters by Sir Gerard, the originals of which I have lost.

  12. Milan said,

    April 18, 2025 @ 1:42 pm

    @Ron Vara

    "What makes the Santali alphabet really elusive is that it resembles the shapes of the undeciphered Indus Valley script. Soviet archaeologists once tried to decipher IVC seals using the Santali alphabet. Sounds ridiculous."

    Santali seems like a fascinating language, and I am happy to believe that it warrants more study. However, Wikipedia says that the Santali (or "Ol Chiki") alphabet was invented in 1925 by a man known as Pandit Raghunath Murmu. If the suggestion is that knowledge of the Indus Valley script has been transmitted for over 3000 years without any material evidence in between, this seems indeed ridiculous.

  13. KIRINPUTRA said,

    April 18, 2025 @ 8:27 pm

    Phonetic sinographs don't necessarily "add" tones & "character meanings". People overgeneralise faint-heartedly from How Kanji Work in Mandarin to How Kanji Work.

    And we forget: Katakana & hiragana are sinographs too.

  14. Chris Button said,

    April 18, 2025 @ 11:22 pm

    @ András Róna-Tas

    I have a long correspondence with letters handwritten by him, which the communist censorship had great problems in understanding (Tibetan letters, Uigur words written in the Uigur alphabet and the like were suspected of being coded spy information). Many of them were copied by the secret police, and after 1990 I got back copies of a few letters by Sir Gerard, the originals of which I have lost.

    That's so interesting! What an unexpected pleasure from a not so pleasant cause.

    I had the privilege of friendship with Sir Gerard; we discussed many problems, among them the alphabetical order in his EtymDic., over a lunch in his house. He persuaded me that the unusual alphabetic sequence is the most ideal and useful – and he was right.

    I have much empathy for Clauson based on his comments in the preface:

    "… the Turkish texts quoted in this book are written in a variety of alphabets, all more or less ambiguous, and it is often impossible to determine the correct transcription of a number of words; moreover, some words were pronounced slightly differently in different languages. It would, therefore, not be sensible to arrange the words in the strict alphabetical order to which we are accustomed in the dictionaries of European languages, since this would involve a great many double or multiple entries and greatly add to the difficulty of finding individual words."

    For the etymological dictionary I am slowly compiling, I have to wrestle with traditional vs simplified Japanese/Chinese forms of the characters, modern Japanese/Chinese pronunciations, reconstructed Sino-Japanese/Chinese pronunciations of various periods …

    In the end, I'm just opting for the sequence of the "22 heavenly stems and earthly branches" since they represent the basic onsets of Old Chinese according to my reconstruction (inspired by Edwin Pulleyblank, who incidentally gets a dozen mentions in Clauson's dictionary):

    甲 k, 乙 Ɂ, 丙 p, 丁 t, 戊 b, 己 ɣ, 庚 ᵏl, 辛 s, 壬 n, 癸 q, 子 ʦ, 丑 x, 寅 l, 卯 ʁ, 辰 d, 巳 ʣ, 午 ŋ, 未 m, 申 ɬ, 酉 r, 戌 χ, 亥 g

  15. ~flow said,

    April 19, 2025 @ 12:35 am

    @Chris Button

    If the heavenly stems and earthly branches were (perhaps even intentionally and not by chance) representing a catalog of onsets of Old Chinese, shouldn't we expect that fact to have left *some* trace in the written record? Like some way of speaking, some remark in a memoir, a page from a textbook?

    On a related note, I believe I have read that there are papyrus fragments indicating that at some (late) point in Egyptian history there was a phonetically oriented way to teach at least some part of the hieroglyphic writing system (I would like to pull out a citation but sadly can't; maybe someone else can fill in).

  16. Gokul Madhavan said,

    April 19, 2025 @ 7:37 am

    Another way to approach this question is by looking at languages that are written in different scripts: BCS in Cyrillic and Latin; Hindustani in Devanagari and Nastaliq. (Turkish is a more complex example, as it consciously got rid of so many Arabic and Persian lexical items at the same time that it moved away from the Ottoman script.)

    When I TFed Hindi-Urdu at Harvard a long time ago, the program began with the Urdu script and transitioned to Devanagari much later in the first semester. The only students who had a harder time with Devanagari than with Nastaliq were the ones who came in knowing Arabic and/or Persian, and even they recognized that it was generally easier to sound out a new word written in Devanagari than in Nastaliq. (This is not to say that Devanagari holds all the advantages though: Nastaliq of course preserves the Arabic/Persian etymological spellings of words borrowed from those languages. Nastaliq also explicitly indicates aspiration and retroflexion for consonants, though this is likely only useful in the beginning before people internalize the Devanagari glyphs.)

  17. Chris Button said,

    April 19, 2025 @ 8:41 am

    @ ~flow

    Two main reasons.

    One practical:

    The signs were used to date inscriptions ("On the XY day, it was tested/divined: …."). To use them for something else in the inscriptions at the same time would have been confusing.

    One societal/cultural:

    The concept of writing as something to educate the illiterate masses comes far later in history. This early writing is something sacred for divinatory purposes that is reserved for the elites, who (in a cynical interpretation) can also use it to justify and maintain their grip on power. You mention Egyptian hieroglyphics. It is worth noting that they never went fully alphabetic.

  18. Chris Button said,

    April 19, 2025 @ 12:48 pm

    @ ~flow

    I should confess that I was for a long time skeptical of Pulleyblank's proposal (in all of its 3, or 4, formulations) too. However, I became convinced of the validity of his proposal after applying my reconstructions of the Old Chinese onsets to the 22 ganzhi and being surprised at how they aligned almost perfectly.
