Language Log

Mongolian transliterations of Donald Trump's name

April 13, 2017 @ 8:53 am · Filed by Victor Mair under Transcription

We've looked fairly intensively at transcriptions of our new President's name in Chinese and, en passant, in Japanese, Korean, and other languages:

"Trump translated" (8/31/16) — about halfway down in the o.p.

"Transcription of "Barack Obama", "Hillary Clinton", and "Donald Trump" in the Sinosphere" (10/2/16)

"Chinese transcriptions of Donald Trump's surname" (11/23/16)

For those who are interested in how the POTUS's name and surname are rendered in Mongolian scripts, both Cyrillic and traditional Mongolian writing, we now have Bathrobe's post at Spicks & Specks:

"'Donald Trump' in Mongolian" (4/13/17)

One of the more interesting cases cited by Bathrobe is when, in Inner Mongolia, the Mongolian rendering of Donald Trump's name is based on the Chinese transcription. We've seen a similar deplorable practice with Tibetan:

"Tibetan –> Chinese –> Chinglish" (11/11/15)

"Tibetan –> Chinese –> Chinglish, ch. 2" (11/20/15)

The incidence of this happening with other Chinese "minority" languages is so great that the situation seems hopeless. I don't think that I will ever get used to Kashgar turning into Kashi, Ürümchi becoming Wulumuqi, Sibsongbanna or Sipsong Panna transforming into Xishuangbanna, and so on ad nauseam. For that matter, I still can't bring myself to say Xiamen for Amoy without wincing.

21 Comments

John said,

April 13, 2017 @ 10:54 am

I find the last sentence very interesting. What do you think about the ROC government changing the official romanization of 淡水 from Danshui (back) to Tamsui? Granted, the stated reason was that the town was known as Tamsui when the Western traders arrived in the late 19th century, and not a desire for a wholesale return to Minnan place names in romanization.
Victor Mair said,

April 13, 2017 @ 3:22 pm

From Juha Janhunen:

Thanks for this interesting information. Just a minor addition: since the Mongolian "old" script uses its alphabetic resources in a rather idiosyncratic way, it has always been a challenge to Romanization. The question is whether we should apply transliteration, transcription, or a combination of both. The conventional Romanization in Mongolian studies applies a combination of transliteration and transcription, which is why it uses information from both the written image and the underlying spoken language (or its modern forms). Unfortunately, it is not consistent in either respect: it ignores several details of both phonemic and graphic information and yields a result that is neither correctly pronounceable nor reconvertible to the original script. Because of these problems, Michael Balk (Berlin) and myself (Helsinki) have developed a transliteration system (BJR) which we propagate for scientific use, one reason being that our system, unlike the conventional Romanization, is fully reconvertible to the Mongolian script. To illustrate this I quote the Romanized shapes of the three different Mongolian renderings of Donald Trump's name:

c) dunaldx trampe

d) dunaldx juv truimpe

e) duvnadex terampux (based on the Chinese syllabification of the name)
David Marjanović said,

April 13, 2017 @ 5:36 pm

Throughout Russia and former Soviet Central Asia, all transcriptions of foreign names appear to be lifted wholesale from Russian, no matter whether the sound system of the language in question would make a more precise rendering possible than Russian does.
Bathrobe said,

April 13, 2017 @ 8:35 pm

Fedor Manin left a comment at the article making the important clarification that Дональд is a historical Russian spelling in which the soft sign ь is meant to indicate a light 'l', à la française. The fact that English uses a dark 'l' here is beside the point — the Russians took French as their model.

The Mongolian lateral is neither light nor dark. In Mongolia itself, it is a distinctive affricated lateral. In Mongolian words ь does serve to modify the pronunciation of the preceding vowel, resulting in something like /dɔnæld/, but this is irrelevant here. The sole basis for adopting the spelling Дональд is imitation of the Russian model. I've added a note correcting the earlier analysis.
Adrian said,

April 14, 2017 @ 10:58 am

As so often, a LL post sent me off on a fascinating trip around the Web. So reading about Amoy led me to Swatow (via postal romanisation https://en.wikipedia.org/wiki/Chinese_postal_romanization ), thence to this envelope http://www.chinaoverprints.com/china/Images/Agencies/Swatow%20I%20-D-/Swatow%20I%20-D-%201922%2007%2015%20(Cavendish%20636).jpg , thence to this book… https://missiology.org.uk/pdf/e-books/moody-c-n/abundance-of-rain_moody.pdf It seems that the Miss Burt who was there in 1922 was still there 15 years later.
Bathrobe said,

April 15, 2017 @ 4:58 am

Before studying Mongolian the most frustrating script I had come across was the Sinitic (Chinese characters). This is well described in recent posts on Sinological suffering.

I am not a Mongolist, but as a sporadic learner of the old script for almost a decade, I can honestly say it is one of the most maddening, frustrating, time-wasting, hair-pull-outing scripts I've come across (as well as being one of the most fascinating and beautiful).

In essence, you need to know Mongolian in order to be able to read it comfortably. The problem with the transcription is what Juha Janhunen described: "it uses information from both the written image and the underlying spoken language (or its modern forms)". Unfortunately that is also how dictionaries are arranged. If you don't know a word and its pronunciation, you can waste inordinate amounts of time looking it up in the dictionary because often there are several places where you could look.

The BJR system appears to solve this problem by ignoring the underlying pronunciation. Instead, it represents only the shape of the glyph, regardless of the pronunciation. Of course, this leads to another problem — the same problem as the reader experiences when reading the old script: it's not possible to know for sure how letters should be read. For example, juv above could be read either jon or jun. You only know it's jon from real-world knowledge of how it's supposed to be read.
Jichang Lulu said,

April 15, 2017 @ 9:05 am

A very interesting post indeed. As Prof. Janhunen says, this is one of the situations that illustrate the need for a true transliteration system for Mongolian. The transcription system common in the Mongolian studies literature, already one-to-many when restricted to the native vocabulary in modern, standard spelling, becomes a complete shambles when different periods, loanwords and foreign words are included: a string in traditional script can correspond to multiple Romanised strings, which in turn can correspond to multiple Mongol script strings. I'm having a hard time trying to think of an established transcription system in some other field of study that fails so spectacularly at one of its basic tasks.

For example, the last of Bathrobe's quoted transcriptions is ᠲᠧᠷᠠᠮᠫᠦ᠋ , which he transcribes terampü using the 'Mongolist' system (sometimes named after some of the scholars who created its more systematic versions). That transcription is correct as far as I can tell, but it requires the transcriber to choose the values of the two last vowels, underdetermined in the traditional script. Bathrobe's choice is informed by the knowledge that the name follows a Chinese model, even though it contains elements not present in the Chinese source. That background knowledge can be missing when dealing with more obscure names than Trump's, warranting such transcriptions as terampu or perhaps even terempü. At the same time, all these potentially legitimate transcriptions of the same string ᠲᠧᠷᠠᠮᠫᠦ᠋ would be compatible with other strings in the traditional script, such as ᠲᠠᠷᠠᠮᠫᠤ. Imagine a situation where such a string refers to a more obscure entity than Donald Trump occurring e.g. in the title of a book and you need to find it in a library catalogue. The Sinologist's blues in a recent LL post by Brendan O'Kane are nothing in comparison. After all, that involved just Pinyin and (broken) Wade-Giles. Although many Romanisations of (Mandarin) Chinese exist, I'd wager you can navigate most 20th-century Sinological literature knowing just Pinyin, Wade, EFEO, Pallady and maybe a few others, like E. Bruce Brooks' system, or the one that gave us Ts[']ingtao. At least those systems are individually one-to-one. On the Mongolian side of things, you have a literary tradition that involves multiple scripts to begin with, such as various stages of the traditional Uyghur-derived script, Cyrillic, Latin at some point, two scripts named 'square' ('Phags-pa and Zanabazar's), each with a number (possibly zero) of Romanisations. It doesn't help that the most popular Romanisation of the longest-lived of those scripts is a many-to-many relation.

As Janhunen says, a solution in the form of a decent transliteration exists, BJR (Balk-Janhunen Romanisation). It's described here by its creators, and shown in action in Balk's transcription of a few stanzas from the Mongolian Udānavarga 法句经. At least when restricted to attested Mongol script forms of Trump's name, BJR establishes a one-to-one correspondence between Mongolian and Romanised strings.

Perhaps advances in Mongolian script display and input will make transliteration systems such as BJR or Wylie obsolete, but I don't see that happening yet. I think Bathrobe has written about issues with Mongolian computer input on his blog. It's not just that several environments still can't properly handle traditional Mongolian text. The very paradigm chosen for encoding Mongolian Unicode shares many of the flaws of the 'Mongolist' Romanisation, which seems to have inspired it. In particular, I'm not sure if the Mongol script forms I'm using in this comment will display correctly. Some are copy-pasted, others written with my home-made input method. So, for the time being, the need for BJR is still there.

As a modest contribution to the Trump name collection, here's a minor variation on the Chinese-inspired item in Bathrobe's list: ᠲᠧᠷᠠᠮᠫ (BJR teramp – any errors in the BJR are mine, rather than B or J's). It comes from the Mongolian-language Xinhua outlet news.cn (新华网 Xīnhuá wǎng ᠰᠢᠨᠬᠤᠸᠠ ᠨᠸᠲ sivquwa net).

Even counting the Mongolian Cyrillic and traditional forms together, this is still fewer than the seven Tibetan names of Trump I quoted in comments to posts in this series:

ཊི་རུམ་ཕི་ Ti rum phi
ཐི་རུམ་ཕུ་ thi rum phu
ཁྲོང་ཕུ་ khrong phu
ཁྲོན་ཕུ་ khron phu
ཊོམ་ཕུ་ Tom phu
ཊོམ་ Tom
ཊམ་ Tam
Jichang Lulu said,

April 15, 2017 @ 9:36 am

As a minor addition to Bathrobe's comment on the role of the soft sign ь in the Mongolian Cyrillic script, it indicates palatalisation, which in Khalkha can show an effect on both preceding vowels and consonants. In particular, in certain positions the spellings л and ль can reflect different Khalkha pronunciations of the consonant, corresponding to contrasting phonemes: Khalkha (unlike perhaps other varieties) does have two laterals, one plain and one palatalised. The full picture of Mongolian palatalisation is too complex to describe in a few sentences, but an important factor is dialectal variation. The Khalkha situation (for consonants and vowels) is explained in papers by Svantesson and in the Svantesson et al. Mongolian phonology (in the Oxford phonology-of-stuff series). Janhunen's book Mongolian has a cross-dialectal treatment of palatalisation.

This doesn't seem to contradict Bathrobe's analysis of the ь in Дональд in the revised form of his post or in his comment above.
Bathrobe said,

April 15, 2017 @ 6:52 pm

@ Jichang Lulu

Thanks for the addition (ᠲᠧᠷᠠᠮᠫ)! Searching for terms in the old script is hindered by the fact that many sites don't use Unicode. BTW, all your Mongolian script renders fine on a Mac, but might not on some other systems (e.g., some versions of Android).

I don't think that ᠲᠧᠷᠠᠮᠫᠦ᠋ could be interpreted as terempü. Firstly, there is a standard list of transliterations from Chinese, and ᠧ is used for pinyin 'e' while a single tooth is used for pinyin 'a'. Moreover, ᠧ for 'e' is used only in foreign words, and it could be expected that two 'e' would be uniformly written as ᠧ. Since the second vowel is not written ᠧ, then I think it can only be interpreted as ᠠ 'a'. Even though the representation of Trump is a hybrid form, mixing transliteration conventions for Chinese and English, I don't think that kind of inconsistency in representing vowels would be tolerated.

With regard to the point that "in certain positions the spellings л and ль can reflect different Khalkha pronunciations of the consonant, corresponding to contrasting phonemes", I'm not totally sure of the phonemics here. My gut feeling is that the palatalisation occurs as a 'bundle', that is, both the vowel and the consonant are affected — one would not occur without the other — but I'll have to listen more closely.
Bathrobe said,

April 15, 2017 @ 7:50 pm

@ Jichang Lulu

"The very paradigm chosen for encoding Mongolian Unicode shares many of the flaws of the 'Mongolist' Romanisation, which seems to have inspired it."

I'm not sure of the provenance of the current Unicode encoding, but it is cumbersome and hard to implement. One of the biggest problems is that it encodes individual letters. The Mongolian script is traditionally studied as a syllabary, and I think the Menksoft proprietary input system in China treats it that way, with code points for each syllable. Despite this, whoever came up with the Unicode encoding (and I think it was the Mongolian side) decided to treat it as an alphabet. This opens a huge can of worms when combining letters. In order to get the correct forms, it's necessary to use a number of special symbols like MVS, FVS1, FVS2, FVS3, and non-breaking spaces. (For instance, MVS must be used to separate ᠠ and ᠡ when they are written as a 'tail' separate to the word. Getting ᠲ and ᠳ to render properly in foreign words can also require the use of special symbols.) If the Mongolians got their idea that the Mongolian script could be encoded as a simple alphabet from the Mongolists, then the current mess is indeed due to them.
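
To make the special symbols mentioned above visible, here is a minimal Python sketch that lists the code points in a Mongolian string. The code point values are those of the Unicode Mongolian block; reading "non-breaking space" as NARROW NO-BREAK SPACE (U+202F) is an assumption, and the names `CONTROLS`, `describe`, `count_controls` and the sample string are hypothetical, chosen only for illustration.

```python
import unicodedata

# Invisible format characters named in the comment above.
# Assumption: "non-breaking space" is taken to mean U+202F (NNBSP).
CONTROLS = {
    "\u180B": "FVS1",   # Mongolian free variation selector one
    "\u180C": "FVS2",   # Mongolian free variation selector two
    "\u180D": "FVS3",   # Mongolian free variation selector three
    "\u180E": "MVS",    # Mongolian vowel separator
    "\u202F": "NNBSP",  # narrow no-break space
}

def describe(text):
    """Return a (code point, label) pair for every character in text."""
    result = []
    for ch in text:
        label = CONTROLS.get(ch) or unicodedata.name(ch, "UNKNOWN")
        result.append((f"U+{ord(ch):04X}", label))
    return result

def count_controls(text):
    """Count the invisible format characters hidden in text."""
    return sum(1 for ch in text if ch in CONTROLS)

# Hypothetical sample: TA, EE, RA, A, MA, PA, UE followed by FVS1.
sample = "\u1832\u1827\u1837\u1820\u182E\u182B\u1826\u180B"

for code_point, label in describe(sample):
    print(code_point, label)
print("format characters:", count_controls(sample))
```

Running something like this over text copied from a website is one way to see which invisible characters were typed to obtain a given shape.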

Another problem with Unicode is that certain letters not distinguished in the old script are distinguished in Unicode. The letters о, у, ө, and ү are distinguished as ᠣ, ᠤ, ᠥ, and ᠦ respectively (although the difference is not showing up here), even though the differences between о and у, and between ө and ү, never show up in the graphic surface form and a choice between them is determined solely by the pronunciation. Given that there are words that are in free variation between ө and ү (in fact, Mongolia has generally standardised one form, but this is a result of the introduction of the Cyrillic script and the desire for standardisation), the requirement that ө and ү should be distinguished in the Unicode encoding of the traditional script makes no sense at all. In actuality, most websites using Unicode, which are virtually all in China, ignore the difference and use them interchangeably.
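
A small Python sketch of the point about о/у and ө/ү: the four letters have separate code points, so two spellings of the same written word compare as unequal unless the pairs are folded together first. The folding table and the function name are assumptions made for illustration, not part of any standard, and the test strings are arbitrary letter sequences rather than real words.

```python
import unicodedata

# The pairs as encoded in the Unicode Mongolian block:
# U+1823 / U+1824 and U+1825 / U+1826.
PAIRS = [("\u1823", "\u1824"), ("\u1825", "\u1826")]

# Hypothetical folding: map the second letter of each pair onto the first.
FOLD_ROUNDED = str.maketrans({second: first for first, second in PAIRS})

def equal_ignoring_rounded_vowel_choice(x, y):
    """True if x and y differ at most in which letter of a pair was typed."""
    return x.translate(FOLD_ROUNDED) == y.translate(FOLD_ROUNDED)

# Official character names, to show that the encoding keeps all four apart.
names = [unicodedata.name(ch) for pair in PAIRS for ch in pair]
print(names)

# Arbitrary strings: TA + OE + RA versus TA + UE + RA.
with_oe = "\u1832\u1825\u1837"
with_ue = "\u1832\u1826\u1837"
print(with_oe == with_ue)
print(equal_ignoring_rounded_vowel_choice(with_oe, with_ue))
```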
Jichang Lulu said,

April 16, 2017 @ 7:06 am

@Bathrobe

Yes, terempü would have been hardly plausible, as it would map (BJR) and to the same transcribed letter. That's why I qualified that option with 'perhaps even'. My point was that the choice of the best 'Mongolist' Romanisation requires a lot of knowledge about the item being Romanised, in this case the informed assumption that the writer knows an English source word, a Chinese source word, and is likely to follow usual Chinese > Mongolian transcription rules, but only for the vowels. Your transcription certainly uses that background information to produce an optimal Romanisation, but such knowledge may not be available when dealing with more obscure items.

I think the general idea is that, in contexts with 'back'/'masculine' harmony, Khalkha has a number of plain-palatalised consonant phoneme pairs, including /l, lj/, while in Khorchin the phonological palatalisation is taken over by the vowels, creating new vowel phonemes. (This doesn't necessarily imply that every Khorchin dialect will have more vowels than Khalkha, because other mergers can occur – e.g. between palatalised vowels and diphthongs.) I'd say the Cyrillic spelling generally reflects this type of analysis for Khalkha, in that the palatalisation is marked on the consonant, using ь and other devices (but perhaps unmarked if non-contrastive, after a diphthong). In that sense, л and ль represent different consonants. While that phonological analysis could be questioned (especially in variants that are neither the Khalkha standard nor Eastern Inner Mongolian dialects), at a phonetic level the distinction is also there, specifically between the plain and palatalised l, which are pronounced differently. On the other hand, as you say, palatalised consonants do affect (especially preceding) vowels (which is also the case in Russian, although the details are different), so at the phonetic level it's not possible to say palatalisation is a phenomenon restricted to either consonants or vowels. My best attempt at an example of purely consonantal palatalisation would be a case where /lj/ is preceded by a neutral vowel, as in байгаль, as opposed to e.g. байдал. That's an 'attempt', in that I don't really know what the conclusion would be. In a Khalkha pronunciation, I don't think there's any difference in the vowels, and I should also listen more closely, should the chance arise, to see if the final consonants are actually different, or if that distinction is somehow neutralised there. (The difference certainly is there when applying case endings.) The example doesn't work in an Inner Mongolian pronunciation, where neither word has palatalisation.

The байгаль/байдал example could be relevant to the question of whether there would be any audible difference between your Дональд and *Доналд, to the extent there's any between the final bits of байгальд and байдалд. At any rate, of course such a hypothetical difference isn't the reason the transcription was chosen in the first place.

When I referred to 'flaws' in Mongolian Unicode, I was thinking about the artificial distinctions between o and u, t and d etc, which you also complain about. They mean that visually identical strings, in the same language and standard script, are composed with different characters. Perhaps I was mistaken in blaming the 'Mongolist' transcription for inspiring such a decision, which, as you say, could have involved a background in the Cyrillic script or the traditional study of the Mongol script, that indeed involves reading words aloud pretty much as spelt in Mongolist Romanisation. At any rate, I think the decision was hardly the wisest, and it's already generating inconsistent use in the emerging Mongolian Unicode Internet. Here too, BJR provided, in my opinion, an ideal starting point.

A number of (especially older) Mongolian-language sites in China use other encodings which I see as walls of 乱码. Are those all produced with Menksoft input? Do you know if there's a way to decipher them?
Bathrobe said,

April 16, 2017 @ 8:53 am

I'm not tech-savvy enough to figure out the technology behind trad Mongolian websites. Frankly, I find the whole thing completely mystifying. Generally it's a good idea to try different browsers or a different OS (Windows, iOS, Android, maybe Linux), but even then some sites simply don't render properly.

I originally attached a few examples of the mystifying behaviour of sites but the post failed to appear. Possibly there were too many links.
Bathrobe said,

April 16, 2017 @ 9:01 am

You can get an idea of what is going on by viewing the page source. At times they will tell you that they use Menksoft. Many websites that use Menksoft (such as Xinjiang Broadcasting) render fine on a Mac. Others (such as Hailasi high school) use Menksoft and do not.

Burgud uses something that looks like Unicode. It works fine on Safari (on a Mac) but shows up as disjointed characters on Chrome.

The Khumuun Bichig site (Monsame) uses completely unfathomable strings of letters and numbers, but through some kind of magic they render fine on the page.
David Dettmann said,

April 16, 2017 @ 11:20 am

I share your frustration and fascination with Mongolian encoding/transcription issues! I happened upon some information just yesterday that may be of interest to you… it is a page of advice on including Unicode Mongolian text in web pages, and issues pertaining to embedding fonts and maintaining "correct" indexing. Apparently text can be displayed properly with "incorrect" spelling, thus making search engine queries difficult or impossible. Here is that website, FYI: http://www.cjvlang.com/Writing/writmongol/tradmononsite.html
Bathrobe said,

April 16, 2017 @ 6:07 pm

@ David

Yes, that is from my website, and that particular problem is a direct result of what Jichang Lulu mentioned above: operating within Unicode, you are supposed to input characters according to their actual pronunciation. In indexing pages Google goes strictly by that requirement.

For instance, if you Google ᠲᠧᠷᠠᠮᠫᠦ (tErampü) you will get one set of results. If you Google ᠲᠧᠷᠡᠮᠫᠦ (tErempü) you will get a different set of results. If you Google ᠳᠧᠷᠠᠮᠫᠦ (dErampü) you will get no results at all.

Whether you use ᠡ (e) or ᠠ (a), what you see on the page is exactly the same (although for vowel harmony reasons the choice of vowel can impact on the shape of the preceding or following consonant). Whether you use ᠳ (d) or ᠲ (t), what you see on the page is exactly the same. But for the search engine, indexing is carried out based on the input used to achieve the graphic shape.

Quite clearly, many of the people who are actually inputting the content on these websites are not following this requirement, with the result that ᠲᠧᠷᠠᠮᠫᠦ, ᠲᠧᠷᠡᠮᠫᠦ, and ᠳᠧᠷᠠᠮᠫᠦ appear in different sets of search results.
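
The indexing problem can be sketched in a few lines of Python. The three spellings are built from the code points for the letters named above; the folding table (e onto a, d onto t) and the helper names are hypothetical, and this is only an illustration of why a search engine that compares code points keeps the variants apart, not a description of how Google indexes pages. Folding is also lossy, since the choice of vowel can change the shape of neighbouring consonants.

```python
from collections import defaultdict

# Hypothetical folding of letters that the comment describes as
# looking the same on the page in this word.
FOLD = str.maketrans({
    "\u1821": "\u1820",  # MONGOLIAN LETTER E  -> MONGOLIAN LETTER A
    "\u1833": "\u1832",  # MONGOLIAN LETTER DA -> MONGOLIAN LETTER TA
})

def search_key(word):
    """Return a key under which look-alike spellings collide."""
    return word.translate(FOLD)

def group_variants(words):
    """Group spellings that share the same folded key."""
    groups = defaultdict(list)
    for word in words:
        groups[search_key(word)].append(word)
    return dict(groups)

# The three inputs discussed above, written as escapes to avoid font issues.
terampu = "\u1832\u1827\u1837\u1820\u182E\u182B\u1826"  # t-E-r-a-m-p-ü
terempu = "\u1832\u1827\u1837\u1821\u182E\u182B\u1826"  # t-E-r-e-m-p-ü
derampu = "\u1833\u1827\u1837\u1820\u182E\u182B\u1826"  # d-E-r-a-m-p-ü

# Compared code point by code point, all three are different strings.
print(len({terampu, terempu, derampu}))

# Under the folded key they fall into a single group.
groups = group_variants([terampu, terempu, derampu])
print(len(groups))
```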
Bathrobe said,

April 16, 2017 @ 6:37 pm

I neglected to mention when replying to Jichang Lulu above: I've contacted Menksoft in the past and was told that there was absolutely no prospect of the Menksoft input system being ported to Macs since it uses proprietary Windows technology. However, Menksoft do have technology to render Menksoft-input pages on different platforms. Major sites appear to be using (and paying for) that technology. Presumably high schools are not.
Jichang Lulu said,

April 17, 2017 @ 2:38 pm

Dettmann's link feeds back into Bathrobe's site, which the OP was all about linking to, but is no less useful. I second the recommendation. Besides Unicode input and display, Bathrobe's blog has covered many interesting issues related to Mongolian, such as names of international organisations in Mongolian, where to buy Mongolian books or Inner-Outer vocabulary differences. Certainly worth following.

@Bathrobe
Wordpress will sometimes swallow comments with too many links (more than two?). Sometimes it won't. I don't think I understand the criterion, which is unsurprising, since the spam filter is designed to confound Markov algorithms far smarter than me. If you keep the links to traditional-script sites you meant to post, maybe you could put them somewhere else, such as in a comment to your Trump post?
Bathrobe said,

April 17, 2017 @ 5:18 pm

I rewrote my comment here and at my quasi-blog but it keeps getting rejected. The problem is not links; it is content that I quote from the page source.

At any rate, since the comment has now been completely lost, I can only urge you to try some of the websites at my page on Websites using the Traditional Mongolian Script. To attempt to figure out what is going on, go to the page source and look at the underlying code (some browsers make this easier than others).

Some pages (like the Xinjiang radio page) make it clear in the source that they are using the Menksoft content management system (see the Menksoft link in an earlier comment). Others don't but are probably using it.

Some pages, like the Beijing Mongolian language and culture classes, don't seem to render properly on any kind of system. They just seem to be broken.

Some pages like Burgud are in Unicode but don't render properly in Chrome on a Mac. They do, however, render fine on Safari. This is a browser issue and changes from time to time (sometimes it's Safari that does the poor job).

Some pages (I don't remember which) render on Windows and iOS but not on a Mac.

The Khumuun Bichig site and the Mongolian president's site use some kind of software to convert a strange mixture of numbers and letters into Mongolian script. This is most likely a technology provided by a Mongolian company.

Basically, you simply have to try the page on different browsers and different platforms.
Bathrobe said,

April 17, 2017 @ 5:33 pm

I neglected to mention that the page on the Menksoft content management system features a graphic demonstrating that the system can convert Menksoft encoding to Unicode. That is the secret behind these sites rendering properly on Macs (and possibly Linux).
Jichang Lulu said,

April 18, 2017 @ 1:27 pm

Thanks for the link to your link list. It's a very useful resource.

Some Mongolian-language newspapers are available on the web, e.g. on local subdomains of 蒙古新闻网 (which your link includes). I remember managing to read such local papers by locating pdf's in what was otherwise displaying as cascades of mojibake.
Bathrobe said,

April 18, 2017 @ 4:39 pm

It's far from complete. I'm still adding (and deleting) pages.
