Transletteration


A friend in Taiwan sent me the following inquiry:

From an article in the NYTimes:

"Early Thursday, the attackers sent out a wave of spam under the name Cyxymu, which is a Latin transliteration of the Cyrillic name of the capital of Abkhazia, Sukhumi."

By which is meant that Latin Cyxymu is a "transliteration" of Cyrillic Сухуми (in italics, С у х у м u).

I think that this is an improper use of the word "transliteration" (to refer to "Sukhumi" as a transliteration of Сухуми, however, would be correct), but I don't know what to call this rendering of Cyrillic Сухуми as Latin "Cyxymu".


When I asked my LLogger colleagues their opinions, I received the following suggestions:

Arnold Zwicky:  maybe it can be called "cross-alphabet transfer"

David Beaver:  transletteration

Benjamin Zimmer:  "Volapuk encoding," via The Tensor

And Mark Swofford weighed in with this:  There are probably discussions out there already on how the Russian word for restaurant appears (when written in all caps: "PECTOPAH" — sorry, not typed in real Cyrillic) to those who know the Latin but not Cyrillic alphabet as if it were a word pronounced "pectopah". Of course, that's just coincidence, not any sort of intentional letter play (at least on the part of the Russians).

This reminds me of a couple of faux amis that I have encountered in Pinyin Mandarin:

1. A Chinese friend of mine saw a sign near our home that depicted the tracks of a railroad followed by this word XING ("crossing"), which he read as 行  ("go")!

2. When I first went to Beijing in 1981, some naughty female clerks in a curio shop at the Temple of Heaven tried to embarrass me by showing me the word FUXING written on a piece of paper and asking me what it meant in English.  I played dumb and insisted that it was a Chinese word (復興) meaning "rebirth, renaissance, resurgence," etc.

My examples are not the same as the Cyxymu or Volapuk phenomenon, but all of these things seem like faux amis to me.



55 Comments

  1. dr pepper said,

    August 11, 2009 @ 2:56 pm

    there's a longstanding practice of using `u' and `v' to stand in for mu and nu. Is there a word for that?

  2. Mark P said,

    August 11, 2009 @ 3:06 pm

    @dr pepper – lack of the correct font? Not so much a problem these days.

  3. J. W. Brewer said,

    August 11, 2009 @ 3:21 pm

    A traditional way of representing Greek in our own alphabet from the dark ages before it was easy to use Greek letters on the internet is described here: http://www.ibiblio.org/bgreek/transliteration.txt. This is the weird-to-the-eye one that uses W for omega & Q for theta, so that there's character to character equivalence but complete arbitrariness in terms of suggesting pronunciation to anyone who doesn't know the code. But it seems fair to call it a transliteration system.

  4. John Chu said,

    August 11, 2009 @ 3:23 pm

    romanization?

  5. Faldone said,

    August 11, 2009 @ 3:24 pm

    You'll still see u for μ and probably v for ν, although I don't have much call for that one myself. It's a lot easier than whatever hoops you have to jump through to get the Greek letters. One problem I would have for Cyxymu for Сухуми is that, as I remember, the m in Cyrillic cursive is used for our t.

  6. Jennifer Ede said,

    August 11, 2009 @ 3:41 pm

    Funny!

    Many of my Russian friends would "transliterate" by finding the letters on their keyboard most resembling Cyrillic. And when they couldn't find a letter resembling it, they'd choose a number – 4 for the Russian "ch".

  7. michael farris said,

    August 11, 2009 @ 4:02 pm

    I'd call it script mimicry.

    4 is also used in a lot of ad hoc internet romanization for Bulgarian 'ch' [tS], w is sometimes used for 'sh' [S], q for 'ja' [ja], and 6t for 'sht' [St], a single letter in the Bulgarian alphabet.

  8. Fluxor said,

    August 11, 2009 @ 4:05 pm

    There are some that insist that the Chinese word 餐廳 (cāntīng) meaning 'restaurant', 'cafeteria', or 'dining room', comes from the English word 'canteen'.

  9. Nathan Myers said,

    August 11, 2009 @ 4:16 pm

    I guess from the title of the post that David Beaver's suggestion wins.

  10. Philipp Angermeyer said,

    August 11, 2009 @ 4:17 pm

    Russian-English bilinguals in the US love to use this technique to create personalized "Cyrillic" license plates. It's the subject of multiple webpages, here are two:
    http://www.tanyakhovanova.com/RussianPlates.html
    http://www.flickr.com/groups/homepa/pool/
    Another environment where this occurs is 1-800 numbers, such as 1-877-KPACOTA (krasota = beauty, for a plastic surgeon; BTW I've collected and discussed such examples in my 2005 paper in Language in Society on script choice in written Russian-English codeswitching)

  11. Dan T. said,

    August 11, 2009 @ 4:33 pm

    And there's the fake-Roman initials for the old Soviet Union, "CCCP". I seem to recall that the Commodore 64 BBS community at one point created a protocol called something like "Commodore-to-Commodore Control Protocol" so they could abbreviate it CCCP, since Commodore buffs were sometimes known as "Commies".

  12. michael farris said,

    August 11, 2009 @ 4:40 pm

    Outing my horribly lowbrow tastes … I remember sometime in the 80's watching Professional Wrestling and a couple of "Russian" heels were baiting the crowd with the usual coldwar rhetoric (designed to unite the crowd in chanting "USA! USA!"). Anyway, at one point one of them made a reference to the "see see see pee".
    I don't know how many in the audience got that but it warmed my heart for some reason.

  13. Lazar said,

    August 11, 2009 @ 4:45 pm

    Ooh, does this give me another venue to complain about "MY BIG FAT GRSSK WEDDING"?

  14. michael farris said,

    August 11, 2009 @ 4:52 pm

    vai

  15. st_martyne said,

    August 11, 2009 @ 5:01 pm

    It was, in fact, Peter the Great who (at the beginning of the 18th century) sanctioned the change that made the Cyrillic alphabet resemble Greek and Latin even more. Had it retained the more archaic form used previously, this kind of confusion could have been avoided. But the world would have become a duller place then too, wouldn't it? ;-)

    As for the term, I think "transliteration" is fine enough. The system for the conversion can be of any kind. So let's call it "official transliteration" for the specific type of phonetic conversion, and use "transliteration" as a more general term. "Volapuk" is more of a "l33t" than a "transliteration", although the possibility of using punctuation marks and other common symbols for transliteration purposes can also be considered.

  16. greenlight said,

    August 11, 2009 @ 5:37 pm

    I've seen the same thing done for arabic. I don't read arabic myself, but I've seen arabic speakers input an unreadable mess of letters and numbers on western mobile phones in some kind of "transletteration". Some cursory googling doesn't turn up any examples but I'm sure I've seen it online too.

  17. ø said,

    August 11, 2009 @ 6:04 pm

    Ooh, does this give me another venue to complain about "MY BIG FAT GRSSK WEDDING"?

    Oh, no! Pet peeve! The word "venue"!

  18. Richard M Buck said,

    August 11, 2009 @ 6:11 pm

    If I'm sending email in Greek, I usually use the Greek alphabet, but even in this day and age that can come out as garbage due to different software using different encodings; my Greek teacher said that she, and most people she knows, still just use 'Greeklish'. People use a variety of different compromises between transliteration and appearance: Εξηγησή 'explanation' could wind up as exiyisi or exigisi or e3hghsh. Or pretty much anything that you think the other person will understand, to be honest, but the last one of those is closest to the Cyxymu example.

    Incidentally, it's an interesting implicit nod to the dominance of English that writing Greek in Latin letters should be known as Greeklish…

  19. Mr Fnortner said,

    August 11, 2009 @ 6:21 pm

    Two decades ago (or more) the late humorist and Atlanta Journal columnist Lewis Grizzard visited Moscow and wrote in his column that everywhere he turned all the signs seemed to say KAOPECTATE. An anxious traveler could easily draw this association, especially in Cyrillic.

  20. fs said,

    August 11, 2009 @ 6:49 pm

    I'd use "approximation" here, since it is not really intended as an accurate transliteration or romanization.

    Early Thursday, the attackers sent out a wave of spam under the name Cyxymu, which is an approximation in the Latin alphabet of the Cyrillic Сухуми (Sukhumi), the name of the capital of Abkhazia.

    Or if the typesetter didn't have Cyrillic at their disposal,

    Early Thursday, the attackers sent out a wave of spam under the name Cyxymu, which is an approximation in the Latin alphabet of the Cyrillic spelling of the capital of Abkhazia, Sukhumi.

  21. tisoi said,

    August 11, 2009 @ 6:52 pm

    I asked this question in the linguaphiles community on LiveJournal. A Ukrainian and a Russian call it "visual transliteration."

    Link: http://community.livejournal.com/linguaphiles/4655115.html

  22. codeman38 said,

    August 11, 2009 @ 8:09 pm

    Similar to the Grizzard example, humor columnist Dave Barry, when visiting Greece, thought that all the signs said TIPIYOTKI.

  23. Q. Pheevr said,

    August 11, 2009 @ 8:38 pm

    Well, we could call it TPAHC-literation (at least for the specific case of Cyrillic-to-Roman).
    More seriously, though, I'd be inclined to see it as a specific type of transliteration—one out of (at least) three:

    sound-based transliteration (the usual sort, as in Сухуми → Sukhumi)
    shape-based transliteration (the kind discussed in this post, as in Сухуми → Cyxymu)
    position-based transliteration (the kind discussed by Barbara Partee in an earlier Language Log post, as in дневник → lytdybr)
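The shape-based kind in the list above can be sketched in a few lines of Python. This is only an illustration: the `LOOKALIKE` table and the function name `shape_transliterate` are made up for the example, and the letter pairings are one person's guess at what looks alike, not any established standard.

```python
# Hypothetical look-alike table: lowercase Cyrillic letter -> Latin letter
# of similar shape. The pairings are illustrative assumptions only.
LOOKALIKE = {
    "а": "a", "е": "e", "к": "k", "м": "m", "н": "h", "о": "o",
    "р": "p", "с": "c", "т": "t", "у": "y", "х": "x",
    "и": "u",  # italic/cursive и resembles Latin u
}


def shape_transliterate(text: str) -> str:
    """Replace each Cyrillic letter with a similar-looking Latin one, keeping case."""
    out = []
    for ch in text:
        latin = LOOKALIKE.get(ch.lower())
        if latin is None:
            out.append(ch)  # no look-alike in the table: leave unchanged
        elif ch.isupper():
            out.append(latin.upper())
        else:
            out.append(latin)
    return "".join(out)


print(shape_transliterate("Сухуми"))    # Cyxymu
print(shape_transliterate("РЕСТОРАН"))  # PECTOPAH
```

Letters with no entry in the table pass through untouched, which is roughly where the digits (4 for ч, 6 for ш) mentioned in other comments would come in.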

  24. Ran Ari-Gur said,

    August 11, 2009 @ 10:16 pm

    I've also heard the Israeli lottery logo:

    http://www.global-lottery-review.com/images/IsraelLotteryLogo.jpg

    described as "iGiS". (It's actually לוֹטוֹ, /loto/, in stylized cursive.)

  25. Ray Girvan said,

    August 11, 2009 @ 10:21 pm

    And I guess we mustn't forget "L-ya-ee-old Ssnwlyaze-ee-egge-ya" in the credits at 3:48 here for the movie Ya-ed Nelt.

  26. Dan T. said,

    August 11, 2009 @ 11:12 pm

    A book up on my shelf is titled "Some of my best jokes are Jewish!", written in a quasi-Hebrew-ish font in which the "w" of "Jewish" resembles the Hebrew letter "shin".

  27. John Cowan said,

    August 11, 2009 @ 11:20 pm

    Richard M Buck: It can also be called "frangovlakhika".

  28. Franz Bebop said,

    August 12, 2009 @ 1:40 am

    Cyxymu is a transliteration, it's just a transliteration using different rules.

    There's more than one way to translate — you can translate using different styles. Even a bad translation is still a translation. There's more than one way to transliterate, too.

  29. bocaj said,

    August 12, 2009 @ 4:26 am

    @Fluxor

    you'll also get people who insist that 'casino' comes from the Chinese 开始咯

  30. Tom Saylor said,

    August 12, 2009 @ 4:26 am

    A variation on the PECTOPAH example:

    My old classics professor (who specialized in ancient Greek philosophy) had vanity license plates that read APETH. People were always asking him what "apeth" meant.

  31. Troy S. said,

    August 12, 2009 @ 4:48 am

    We might as well bring out the old classics Toys Ya Us and KLA automobiles, and the unfortunately titled Billy Joel album Kohuept.
    I also recently noticed Tribe brand Hummus' logo has the initial T stylized to have the shape of a final ح as if it were being read right to left.
    This is probably more a marketing gimmick to make it seem more exotic, but I'm not sure how many people actually appreciate it.

  32. Ginger Yellow said,

    August 12, 2009 @ 4:51 am

    " There are probably discussions out there already on how the Russian word for restaurant appears (when written in all caps: "PECTOPAH")"

    My first Russian teacher named his dog Pectopah.

    The '4' for 'ch' thing is new to me, but I like it. Not only does 4 look a bit like 'ch', at least how it's sometimes written, but 'ch' is also the first letter of 4 in Russian.

  33. Jon said,

    August 12, 2009 @ 5:29 am

    When I was in Athens, I was surprised to see that all the car number-plates used the Roman alphabet. I asked a Greek friend about this, and he replied "They don't – they use the Greek alphabet". I then realised that what the authorities had done was limit the number-plates to those letters identical in Greek and Roman capitals.

  34. Rolig said,

    August 12, 2009 @ 6:21 am

    Chekhov famously played with such transalphabetic similarities in "Three Sisters" when he introduced (and perhaps coined) the word реникса (reniksa). The character Kulygin tells a story about how a teacher wrote the word "чепуха" (chepukha, "nonsense") on a student's paper, but the student, thinking it was Latin, read this as "renyxa". Since then, the word "реникса" seems to have entered Russian popular culture at least to some degree, judging by the Google hits (around 2300), one of which is a website devoted to debunking religion and superstition. Now we just need to start using the word pehukca to complete the circle. (In fact, a Google search shows that "pehukca" is already being used by someone as a user name. No big surprise.)

  35. Rolig said,

    August 12, 2009 @ 6:27 am

    Actually, I should clarify that the "Реникса" website I mentioned is the transcript of a book by A.I. Kitaygorodsky published in Soviet times (1973).

  36. MBM said,

    August 12, 2009 @ 7:19 am

    Transliteration is not a good term for this and neither is romanisation.

    One term which is used in the Unicode community is "script spoofing", the abuse of visual similarity between what logically are separate characters. It happens when, for example, a spammer fools you into navigating to a spoof website by making you click on "citibank.com" when the first character is not the Latin 'C' but the Cyrillic 'С' (es). The difference is invisible to people but visible to computers because the two characters are represented by totally different numbers.

    More about script spoofing here:
    http://unicode.org/reports/tr36/
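To make the "invisible to people but visible to computers" point concrete, here is a rough Python sketch that flags a label mixing letters from more than one script. It is a simple heuristic of my own, not the mechanism the Unicode report describes: it takes the first word of each character's Unicode name as a stand-in for its script, and the function names `scripts_used` and `looks_spoofed` are invented for the example.

```python
import unicodedata


def scripts_used(text: str) -> set:
    """Return a rough set of script names for the letters in text.

    Heuristic: the first word of a character's Unicode name
    (e.g. "LATIN", "CYRILLIC") is treated as its script.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_spoofed(label: str) -> bool:
    """Flag a label whose letters come from more than one script."""
    return len(scripts_used(label)) > 1


genuine = "citibank.com"
spoofed = "\u0441itibank.com"  # first letter is CYRILLIC SMALL LETTER ES

print(genuine == spoofed)      # False, although they look the same on screen
print(looks_spoofed(genuine))  # False
print(looks_spoofed(spoofed))  # True
```

A check like this would also flag perfectly innocent mixed-script text, so it is a starting point for illustration rather than a real defence.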

  37. Aaron Davies said,

    August 12, 2009 @ 8:57 am

    @Dan T.: these days, CCCP often refers to the Combined Community Codec Pack, a set of video codecs. Given that they're mostly used to play pirated content, I imagine the allusion to communism is deliberate.

  38. Boris said,

    August 12, 2009 @ 9:37 am

    I am not sure pirating was the motive for this being called CCCP. It may have something to do with the fact that supporting the "Matryoshka" format is one of the main reasons to download it.

    I wonder if the banking institution "West Bank", where my brother has an account, is intentional.

  39. language hat said,

    August 12, 2009 @ 10:55 am

    I came here to say what Rolig did about чепуха/renyxa, but since he got there first, I'll mention that a parallel phenomenon is лытдыбр or lytdybr, from дневник (dnevnik, 'journal, diary'), as if typed on a QWERTY keyboard rather than a Cyrillic one, which is (or was, it may have fallen out of fashion) a common bit of slang on Russian blogs.

  40. Bloix said,

    August 12, 2009 @ 11:56 am

    I'm aware of two Israeli corporate logos that are designed to work in both the Hebrew and Latin alphabets. One is the logo of the Israeli shipping line, ZIM, which looks like the letter Z in the center of a rectangle, and is also a stylized spelling in Hebrew of the full word "Zim" – the logo is visible on this shipping container –
    http://www.matts-place.com/intermodal/part1/images/zim40white.jpg

    The other is the logo for the Lily brand of consumer paper products manufactured by Hegla-Kimberly Ltd – the word Lily is left-to-right in English and right-to-left in Hebrew.
    http://www.gabbay.org.uk/photos-israel/israel-typography-photodir/lily-(andrex).jpg

  41. Kate G said,

    August 12, 2009 @ 12:01 pm

    Isn't Xmas a (very old) example of this, with X standing in for Chi?

  42. language hat said,

    August 12, 2009 @ 1:01 pm

    Excellent example!

  43. J. W. Brewer said,

    August 12, 2009 @ 1:03 pm

    For transliteration-by-coincidental-shape, there's also the upside-down-calculator script (e.g. 7734 upside down reads "hELL"), which I remember from junior high school three decades ago. A state-of-the-art summation (including a revival of the lost letter eth!) can be found here: http://www.langmaker.com/calculatorwords.htm. I suppose this might be an ancestor of l33t.
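The upside-down-calculator trick is easy to mimic in Python: turning the display over reverses the order of the digits and swaps each digit for the letter it resembles when inverted. The digit-to-letter table below is a partial, assumed one (people differ on several digits), and `calculator_word` is just a name chosen for the sketch.

```python
# Assumed digit -> letter readings for an inverted seven-segment display.
# Only the less disputed digits are included.
UPSIDE_DOWN = {"0": "O", "1": "I", "3": "E", "4": "h", "5": "S", "7": "L", "8": "B"}


def calculator_word(digits: str) -> str:
    """Read a number as it would appear with the calculator turned upside down."""
    # Flipping the display reverses the reading order, so walk the digits backwards.
    return "".join(UPSIDE_DOWN.get(d, d) for d in reversed(digits))


print(calculator_word("7734"))   # hELL
print(calculator_word("35007"))  # LOOSE
```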

  44. Maria said,

    August 12, 2009 @ 1:11 pm

    In general, the rendition of Russian words using the Latin alphabet is called "translit." An example of that would be "Sukhumi" for Сухуми. The mapping rules for this are a bit of a mess, which makes reading translit difficult for fluent readers of Russian Cyrillic script. The practice is even frowned upon on certain websites, where discussion posts in translit are banned.

    Using a mix of Roman letters and numbers to render Cyrillic-looking text is sometimes known as "gejmerskij jazyk", or "геймерский язык" (gamer language). It is the Russian equivalent of l33t speak. When used on license plates, it's a kind of immigrant in-joke.

  45. Lph said,

    August 12, 2009 @ 4:28 pm

    The "wrong keyboard layout" idea mentioned by language hat ("лытдыбр") reminded me of the captions on the "Borat" segments of Da Ali G Show, which are Cyrillic gibberish created by typing English into a Cyrillic keyboard layout. More info here and here, if anyone's interested.

  46. mollymooly said,

    August 12, 2009 @ 6:43 pm

    Linguists will know SAMPA and Kirshenbaum, ASCII transliterations of the International Phonetic Alphabet where the correspondences are based on [often tenuous] glyph similarity.

    Actually, that inspires me to coin "transglypheration".

  47. mollymooly said,

    August 12, 2009 @ 6:45 pm

    I'm nearly sure I heard Sky News' first Moscow correspondent, c.1990, actually saying "Russians call it Mockba".

  48. codeman38 said,

    August 12, 2009 @ 6:55 pm

    Also related to "lytdybr": In the movie version of The Bourne Identity, Jason Bourne gets a fake Russian passport with the name Foma Kiniaev. Or, as it's written in Cyrillic on the passport, "Ащьф Лштшфум" (Ashchf Lshtshfum). Yep, that's right: they just typed "Foma Kiniaev" on a QWERTY keyboard with the system keyboard layout set to Russian.
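Both the passport name and "lytdybr" fall out of a plain key-position mapping, which can be sketched in Python as below. The sketch assumes the standard US QWERTY and Russian ЙЦУКЕН layouts and covers only the main letter and punctuation keys; the table and variable names are invented for the example.

```python
# Keys in the same physical positions on a US QWERTY keyboard and on the
# standard Russian layout (assumed; only the three main rows are covered).
QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
JCUKEN = "йцукенгшщзхъфывапролджэячсмитьбю"

# Lowercase pairs first, then uppercase pairs for the keys that are letters.
pairs = dict(zip(QWERTY, JCUKEN))
pairs.update({q.upper(): c.upper() for q, c in zip(QWERTY, JCUKEN) if q.isalpha()})

LATIN_TO_CYRILLIC = str.maketrans(pairs)
CYRILLIC_TO_LATIN = str.maketrans({c: q for q, c in pairs.items()})

# Typing English with the layout set to Russian:
print("Foma Kiniaev".translate(LATIN_TO_CYRILLIC))  # Ащьф Лштшфум

# Typing Russian with the layout set to English:
print("дневник".translate(CYRILLIC_TO_LATIN))       # lytdybr
```

Characters that have no key in the table, such as the space, are left as they are.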

  49. Liz said,

    August 12, 2009 @ 9:21 pm

    I would add Saudi Basic Industries Corporation (SABIC) to this collection. See the logo at http://www.sabic.com.

  50. Craig Russell said,

    August 13, 2009 @ 9:36 am

    Every time I see Judd Apatow's name in print, the Ancient Greek recognizing part of my brain thinks for a moment that it's the Greek verb απατάω (although this involves substituting a for o, and accepting the equivalence of w and similar looking omega). Anyway, the verb means deceive, so at first it seemed a little humorous to me.

  51. Karen said,

    August 13, 2009 @ 1:27 pm

    Mohymehtajibhar Nponarahia

    That's how the title of a Russian novel translated into English was rendered by the publisher (Knopf) on the copyright page, as "originally published as…".

    The book? Монументальная пропаганда, more usually transliterated as Monumental'naya Propaganda.

    The Cyrillic Н (N) becoming H is simple, and the Я (YA or JA) becoming R is commonplace, but the Г (G) also becoming R is neat. I think my favorite touch of it is the Cyrillic Л (L) becoming JI, but the Д (D) losing its curved downstroke (which can be quite thin in some fonts) to become I is also nice.

  52. mossy said,

    August 14, 2009 @ 2:41 pm

    I vote for script mimicry or script spoofing or transletteration.

    May I offer an obscene transliteration example? Years ago I worked in an organization with an old fashioned telex machine, the kind that clucked and spewed out tape (which I loved madly; I always felt like I was in a 1940s movie). We finally got a computerized machine, which had very complex and non-intuitive commands to save, send, clear screen, delete, etc. To totally delete a letter you had to hit control and type K H U Y. It didn't make any sense until we realized that the programmers were Russian emigres and had made a little joke — to get rid of something forever, you were sending it [na] khuy (an extremely rude version of "to hell" involving the interesting part of the male anatomy).

    Sorry if I've offended anyone, but it's for the sake of scholarly discussion.

  53. Rich Cheng said,

    August 18, 2009 @ 12:03 pm

    A couple of related links.

    uʍop-ǝpısdn ǝʇıɹʍ oʇ ʍoɥ: http://www.pacificit.ca/article/358

    And another: http://www.moillusions.com/2009/04/russian-cyrillic-optical-illusion.html

  54. language hat said,

    August 19, 2009 @ 9:40 am

    mossy: Thanks very much, I loved that story!

  55. Shii said,

    August 20, 2009 @ 5:49 pm

    When used in Greek this is called Greeklish.

    http://en.wikipedia.org/wiki/Greeklish
