Terrorist speech recognition?


According to Praveen Swami, "Terror mail analysis supports claim of Lashkar authorship", The Hindu, 12/1/2008:

Close textual analysis of a document issued by an until-now unknown terrorist group just after the recent massacre in Mumbai appears to vindicate claims by Indian intelligence experts that the document was generated by a non-Hindi speaker, using voice-recognition software.

For one, a series of spelling errors mar the Hindi-language text, typed in the Devnagari script, which was issued by a group calling itself the Mujahideen Hyderabad Deccan — a fictitious group, investigators now say, invented to distance the Pakistan-based Lashkar-e-Taiba from the attacks.

Hindi-language voice-recognition software, though commercially available, is at a development stage and often registers incorrect spellings. In the document, the word silsila, or incidents, is spelled with the wrong matras, or vowel markings. The word chetaavani, or warnings, and zindagi, or life, are again spelt with incorrect matras.

Moreover, the name of the organisation Mujahideen Hyderabad Deccan. The phrase “Hyderabad Deccan” is frequently used in Pakistani comment to identify India’s southern plateau. It is, however, rarely used in this country.

The context here, as I understand it, is that Hindi and Urdu are different versions of the same language, more or less. Urdu, the national language of Pakistan, uses a writing system based on Arabic letters, whereas Hindi uses the Devanagari writing system. (For more discussion, see "Language in Pakistan"; "Camp language"; "Scripts, scriptures, and scribes". In the last-cited post, I quoted Bob King to the effect that "It is rare, except for scholars, for Hindi speakers to learn to read Urdu script or for Urdu speakers to learn to read Devanagari".)

So the idea seems to be that a literate Pakistani Urdu-speaker might be able to compose a manifesto in Hindi, but would not be able to write it accurately in Devanagari, and might therefore make use of speech recognition software, whose output he would not be able to proofread.

I have no idea whether this is a plausible argument or not. I'd expect that a responsible journalist, not to speak of a responsible intelligence analyst, would be able to cite more details about the behavior of particular releases of particular speech-recognition software packages in spelling silsila, chetaavani, and zindagi, rather than speaking in generalities about how "Hindi-language voice-recognition software, though commercially available, is at a development stage and often registers incorrect spellings". At least, it would be nice to know which vowel matras were wrong, and why it's hard for software to get them right.

Perhaps these details are available in the Hindi-language press in India — if you can find a discussion, or if you can find the original text of the cited document, please let us know in the comments.



39 Comments

  1. Oskar said,

    December 1, 2008 @ 10:33 am

    Seems implausible to me. This was a hugely complicated operation, probably months or even years in the planning, with lots and lots of people. Are you telling me that they couldn't chase down someone who could write in Devanagari? Or at least borrow an Arabic-Devanagari dictionary (or whatever a book is called that helps translate one script into another) from the library? I mean, this is their manifesto after all; you'd think that it would be a crucial document. After all, this would be the document that tells the world why they're doing what they're doing.

    To just use voice-recognition to get the text down, which you can't proofread at all? This wasn't done in an afternoon, it seems incredibly unlikely to me that this is how they did it.

  2. Morten Jonsson said,

    December 1, 2008 @ 10:46 am

    Oskar, are you saying you can't believe terrorists wouldn't take time to do a good proofread? I work at a publisher. If you're right, I only wish more of our authors were terrorists.

  3. Mark Liberman said,

    December 1, 2008 @ 10:49 am

    Oskar: To just use voice-recognition to get the text down, which you can't proofread at all?

    Well, "can't proofread at all" would be putting it too strongly.

    Devanagari is a pretty transparent writing system — without knowing any Hindi/Urdu, I can puzzle out the pronunciation of a passage, I think, modulo some issues with deleted vowels. So a native speaker of Urdu should be able to learn to read Hindi text fairly easily, though slowly in the absence of practice. Someone with that background might possibly believe, or assert to superiors, that he can "read (and write) Hindi".

    If the stories about the responsibility of Lashkar-e-Taiba are true, the author (or translator, or proofreader) might also have been (say) a Punjabi native speaker who had learned Hindi in school.

    It would be easier to evaluate the plausibility of this whole story if we knew what the misspellings were, and which software systems (if any) are likely to commit them, and why. As I explained, this may all be set out in the Hindi-language press; on the other hand, this may be one of the many rumors apparently reported in the press as fact during the fog of the past few days, like the two (or four or eight) British subjects said to be among the arrested terrorists.

  4. dw said,

    December 1, 2008 @ 12:00 pm

    A couple of points to bear in mind:

    * If the terrorists really were from the "Deccan" in the south of India, they would probably not be native Hindi speakers.

    * Even among native Hindi speakers, a significant percentage are illiterate. Also, based on my own experiences in India, very many signs in commercial stores, markets, etc. are nonstandard Hindi (i.e. they contain "spelling errors")

    Given these caveats, there is a possible explanation. In standard Urdu script (which as you say is derived via Persian from Arabic script), the three short vowels "i" "a" "u" are not represented at all, and the three corresponding long vowels are all represented by the same character (there are disambiguation symbols available, but they are usually not used except in dictionaries and texts intended for children). In the Devanagari script used by Hindi, however, all six vowels are represented differently.

    If the text were first composed in Urdu script and then transliterated to Hindi (which seems to me a more likely option than using voice-recognition software), then errors in the three words quoted would all be possible. For example, in "silsila" the two "i"s are both short. Thus it is possible that the result of transliterating from Urdu could be something like "salsala" or "sulsala" (which reminds me, the Bollywood movie "Silsila" featuring Amitabh Bachchan and Rekha is one of my favorites!)
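A minimal sketch of the ambiguity described above, working in Latin romanization rather than in either script. It assumes (hypothetically) that each unwritten short vowel is marked with `_` and that a transliterator must guess one of "a", "i", "u" for every gap; the function name and the template notation are illustrative only and do not describe any real transliteration tool.

```python
from itertools import product

# The three short vowels that standard Urdu spelling leaves unwritten.
SHORT_VOWELS = ("a", "i", "u")


def candidate_readings(template, vowels=SHORT_VOWELS, gap="_"):
    """Return every romanized reading obtained by filling each gap
    in the template with one of the short vowels.

    The template notation (e.g. "s_ls_la") is a hypothetical
    convention for this sketch: "_" marks an unwritten short vowel.
    """
    parts = template.split(gap)
    gaps = len(parts) - 1
    readings = []
    for combo in product(vowels, repeat=gaps):
        word = parts[0]
        for vowel, rest in zip(combo, parts[1:]):
            word += vowel + rest
        readings.append(word)
    return readings


if __name__ == "__main__":
    # Two unwritten vowels give 3 * 3 = 9 candidate spellings.
    for reading in candidate_readings("s_ls_la"):
        print(reading)
```

With two gaps there are nine candidates, only one of which is the intended "silsila"; "salsala" and "sulsala" are among the wrong guesses. A transliterator without a dictionary of Hindi words would have no way to choose among them.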

  5. dw said,

    December 1, 2008 @ 12:13 pm

    On another point: Hyderabad, on the Deccan Plateau in the southern Indian province of Andhra Pradesh, was historically a center of the Muslim faith, and it still retains many Urdu (as opposed to Hindi) speakers. (In fact it has its own subvariety, known as "Dakkhini" or "Deccani": http://en.wikipedia.org/wiki/Dakhni). Thus it is possible that the terrorists were native Urdu/Dakhni speakers, and that they were from the Deccan region of India rather than from Pakistan

  6. Jonathan Badger said,

    December 1, 2008 @ 12:17 pm

    "It is rare, except for scholars, for Hindi speakers to learn to read Urdu script or for Urdu speakers to learn to read Devanagari".

    Is it really that hard to master two writing systems? Plenty of ex-Soviet people such as Moldavians can read and write their language in the old official Cyrillic as well as in the reintroduced Latin alphabet.

  7. Mark Liberman said,

    December 1, 2008 @ 1:12 pm

    Jonathan Badger: Is it really that hard to master two writing systems?

    How about ˈtɹɐ.jɪŋ.jɚ.səɡˈdʒɛs.tʃɪn.bɐjˈɹi.ɾɪŋ.ˈɪŋɡ.lɪʃ.ɪnˌɐjˌpiˈej ?

    Learning to read and write the International Phonetic Alphabet is an exercise that phonetics students around the world (as well as opera singers and others) work on mastering, every year. And anyone who teaches it can vouch for the fact that it is not always perfectly learned — and that even people with quite a bit of experience often make mistakes in writing it quickly, and find it hard to proofread, and remain completely unable to read fluently an essay or story written even in dictionary-pronunciation IPA for their native language.

    To move from simple memorization of letter-sounds and spelling rules to fluent reading and writing takes many, many hours of practice, typically spread over months or years. There's no mysterious trick to it; it just takes time and diligent application. The claim made by Bob King and others (I can't personally vouch for it, but I trust his testimony) is that relatively few speakers of Hindi or Urdu invest the time and effort needed to become fluent readers and writers of the other community's language variety and orthographic system.

    An added complication, in this case, is that there are many other languages in the area under discussion, and a majority of the population is a native speaker of neither Urdu nor Hindi, but rather has learned Urdu or Hindi as a second or third language.

  8. John Lawler said,

    December 1, 2008 @ 1:16 pm

    @ Jonathan Badger:

    Latin and Cyrillic are much more similar than Devanagari and Urdu. In Serbo-Croatian, for instance, both alphabets are in use and everybody knows them both. They're simply two different sets of symbols that can be interchanged at will, as a purely political act. One can even do a crossword puzzle in either alphabet.

    By contrast, Hindi and most Indian languages are written in an abugida, not an alphabet. Each "letter" represents a whole syllable, with special symbols included for each part of consonant clusters and vowels. These are written together in idiosyncratic ways, leading to a list of hundreds of special symbols, each for a different cluster. Some are transparent, but plenty aren't.

    Urdu, by comparison, is based on the Persian modification of the Arabic alphabet, in which vowels are often not marked at all, and vowel length is very iffy. That's what Mark was talking about; long and short vowels are very prominent in Hindi and it's easy to see that getting the Hindi vowel spellings right could be a nightmare when coming from an Urdu text.

  9. Daniel von Brighoff said,

    December 1, 2008 @ 1:16 pm

    No one said it's a question of difficulty, Mr Badger. Rather, it's a question of opportunity and motivation. What would the typical Hindi-speaker use Urdu script for even if she knew it? Moreover, given the nature of sectarianism in modern India, I wouldn't be surprised to find that many Hindi-speakers are proud of their ignorance of Urdu script and vice-versa for Urdu-speakers.

  10. Nigel Greenwood said,

    December 1, 2008 @ 1:47 pm

    @ dw: the three short vowels "i" "a" "u" are not represented at all, and the three corresponding long vowels are all represented by the same character.
    The first part of this statement is correct, but not the second: long i, a and u are all different characters in Urdu. I don't know the word chetaavani (presumably it's Hindi) but silsila and zindagi are both Persian (the former being an Arabic loan). It is indeed true that all the vowels in these two words apart from the final -i of zindagi are omitted in the Urdu script (i.e. they're written slslh and zndgi respectively).

  11. Geoffrey K. Pullum said,

    December 1, 2008 @ 2:58 pm

    @ Daniel von Brighoff: Quite right. I have known Urdu speakers simply tell me they didn't know any Hindi at all. The two names denote essentially the same language, but not only are any differences emphasized, and enhanced through deliberate use of Persian and Arabic loan words in Urdu and Sanskritic ones in Hindi, but more than that, the people in question would prefer to be thought of as not knowing each other's languages. Bob King didn't say it was hard to learn Devanagari if you were an Urdu speaker: Devanagari is beautifully designed and fitted to the language, and quite easy to learn. Anyone could learn it if they spoke the language. (I learned it easily without speaking the language well at all.) What King said was that it is rare to find Urdu and Hindi users learning each other's scripts. The cultural and religious baggage is such that they are ignorant of each other's writing systems and proud to be ignorant. Which would turn out to be a problem if you were a Pakistani wanting to release a fake apparently-Indian manifesto document.

  12. bulbul said,

    December 1, 2008 @ 3:10 pm

    Nigel,

    chetaavani – चेतावनी – is Hindi for "warning". If I'm not mistaken, Urdu uses a Perso-Arabic loan, تحذير [taẖḏīr] or something like that.

  13. Dan Everett said,

    December 1, 2008 @ 4:45 pm

    When I taught at the University of Manchester, one assignment that I regularly gave students in my introduction to linguistics class was to walk down Oxford Street, home to dozens of Indian and Pakistani restaurants and get word lists in Urdu and Hindi from the waiters. Some waiters would give both, throwing in the occasional Arabic for Urdu. Some said they only spoke Hindi or Urdu, not the other. Some said that they were the same language, just different scripts and a few different words. The attitudes were quite mixed. But students were able to recognize immediately that they are minimally variant dialects of the same language. What interested the students most after they figured this out were the speaker attitudes about differences, similarities, and how to tell the two apart. Nothing systematic or even very sensible was ever said so far as I recall.

  14. Faith said,

    December 1, 2008 @ 4:58 pm

    I'm having trouble deciding if this comment is on-topic. If not, delete at will.

    Mark's comment @1:12 reminds me of a typical issue that comes up for non-native Yiddish/Hebrew speakers such as myself. In American texts in Yiddish or Hebrew, you often find English words spelled out phonetically in the Hebrew alphabet. These words are among the hardest for non-native speakers to read. For a native speaker, those words jump out as foreign; for an English speaker, these English words are baffling. You struggle to make sense of it as a Yiddish/Hebrew term, you work it over a few times trying to find a root word within it that would anchor it in the language you believe yourself to be reading. When you finally figure it out, it's infuriating. It seems that English words are attached in our minds to the roman alphabet and don't easily move into another alphabet.

  15. Killer said,

    December 1, 2008 @ 6:09 pm

    This is further off-topic, so again, delete at will.

    @Faith: Your comment reminds me of when I was studying basic German. I'd be reading a passage and spot an unfamiliar word. But it wouldn't appear in the glossary or even in a dictionary. Eventually I'd figure out that it was someone's name — all nouns are capitalized in German, so I assumed the name was just another noun. Exasperating!

  16. Stephen Jones said,

    December 1, 2008 @ 6:15 pm

    —-"And anyone who teaches it can vouch for the fact that it is not always perfectly learned — "——-

    I have a copy posted next to the blackboard, so I can take a peek if I have to transcribe a word.

    Learning another script is incredibly difficult for many people. And, as reading involves much more than simple phonetic decoding, even if you know the script perfectly, as I do Arabic script, you still sound like a seven year old trying to read it until you have spent hundreds of hours mastering it.

    And it makes a hell of a difference regarding learning the language. Vast amounts of linguistic input are denied to you if they're written in another script.

  17. Ahmed Jawad said,

    December 1, 2008 @ 6:22 pm

    You will always find a pattern when you look for one. (Leibnitz). I am doing a PhD in computer science and I don't think this story is plausible. This is a really dangerous situation, where correlations and police-style interrogative thinking are being used in the Indian press to escalate tensions between two nuclear-armed states. Is this story plausible, or is everybody in a hurry to find a pattern, a specific one? Does everybody want to mimic what the US did with Saddam and the Taliban, attacking without any proof? Do these people want India to lose all that she has because of one terror attack? I tell you all, friends, this is what terrorists want. Let us defeat it by showing sense.

  18. Thought Or Two said,

    December 1, 2008 @ 6:23 pm

    It depends on ethnic groups, not national identities. I'm a Pakistani who grew up around Punjabi and Urdu. I bet I can understand Punjabi Sikhs in India better than Tamil Hindus can understand them.

    I'd point out that Urdu speakers are at an advantage compared to Hindi speakers, since they can more readily grasp Arabic, Persian, Kurdish, Pashto, and any other language that employs Arabic characters.

  19. Nathan Myers said,

    December 1, 2008 @ 6:24 pm

    I think dw is onto something: most likely the "voice recognition software" bit of the report is faulty interpolation for "automatic transcription software", which would be applied not to the spoken words of the manifesto, but to a Microsoft Word document originally keyed in Urdu.

  20. Bill Poser said,

    December 1, 2008 @ 6:44 pm

    Although it is true that in many situations many Hindi speakers and Urdu speakers try to distance themselves from the other language, this is not always the case. The Hindi used in Bollywood films is actually rather close to Urdu because the producers want to attract both audiences and, while most Urdu speakers do not understand the Sanskritic vocabulary introduced in India in order to de-Islamicize Hindi, Hindi speakers do understand the Arabic and Persian vocabulary of Urdu so long as it isn't too high-falutin or from the technical vocabulary of Islam. The influence of Muslim speakers on "Hindustani" was sufficiently great that even non-Muslim speakers of Hindi use a considerable amount of "Islamic" vocabulary.

  21. Devon Strolovitch said,

    December 1, 2008 @ 7:24 pm

    This all brings to mind a comment I just saw in an article on Slate, that "some eyewitnesses said the gunmen spoke Hindi, which could mean that they were of Indian origin." I'm a trained linguist with the basic professional understanding of the relationship between Hindi and Urdu, scripts and all (my own work was on medieval Portuguese written in Hebrew script). So I'm having a linguist's gut-skeptical reaction to that statement: specifically, that amidst all the chaos there was enough "data" to judge the attackers' speech as Hindi and not Urdu, and, of course, that this fact entails their Indian origin. I'm sure a lot of that skepticism can be chalked up to my Indic ignorance; nonetheless I'm bothered by the off-handedness with which someone seems able to judge (or at least blithely report) fragments of chaotic speech as categorically one language or the other, as though that's not actually a judgment.

  22. bulbul said,

    December 1, 2008 @ 8:32 pm

    What Bill Poser said. My first encounter with Hindi was Vincent Pořízka's classic "Hindi Language Course". It's an excellent textbook, no doubt, but as it turns out, it is a little heavy on the Sanskrit side. Imagine therefore my surprise when some years later I got to see my first Bollywood movie ("Main Hoon Na") and it was "agar" here and "lēkin" there, and "dōst" (instead of "mitra"), "qadam", "momken nahī̃" (instead of "sambhava nahī̃"), "doshman" (instead of "śatru"), "ǧasūs", "intizār" (instead of "pratīkṣā") and so forth.
    What follows is a public service announcement: Go see "Main Hoon Na". You won't regret it. Thank you.

  23. Mark said,

    December 1, 2008 @ 8:45 pm

    Praveen Swami, in his article, appears to be putting forth his evidence in order to support a popular political view that Pakistan is to blame for what's just occurred in Mumbai. I've got three problems with his evidence.

    Firstly, I balk at the idea that the relatively common word 'zindagi' would get transliterated incorrectly in any sort of software conversion.

    Secondly and more generally, with relatively high levels of illiteracy and subliteracy in India, spelling errors aren't particularly significant either way. When I see a Greengrocer's Apostrophe on a sign, I don't immediately suppose that the signmaker acquired English only as a second language.

    Thirdly, the use of a Pakistani political term ("Hyderabad Deccan") probably shows the Mujahideen's ideological orientation towards Pakistan-based groupings. But that doesn't rule out their being composed of home-grown extremists, either from Andhra Pradesh, Kashmir, or elsewhere in India. A fair number of New Leftists, in the '60s, affected the rhetoric and even accents of Huey Newton or Bobby Seale without being Black, after all; sported little goatees without being Russians named Lenin or Trotsky; donned black berets and smoked cigars without having been guerrilla fighters in Cuba's Sierra Maestra.

  24. Akshay said,

    December 2, 2008 @ 3:49 am

    Actually, the very fact that they seem to have used _Hindi_ in itself seems suspect, transliteration errors or not.

    I'm from Hyderabad, and I can say this: usage of Urdu over Hindi is quite a big thing for Muslim groups out here. The Muslim Ittehad-ul Muslimeen, for example, often 'fights' for the usage of the language on notice-boards and such. And yes, Urdu and Hindi are considered quite different; you're bound to see shops festooned with four scripts, for example, English, Hindi, Telugu _and_ Urdu. I, for instance, consider myself to be more of a Dakhni (Urdu) speaker than a mainstream Hindi speaker; I am more at home with the local expressions and idiosyncrasies than with what's considered acceptable up north.

    Additionally, while written Urdu probably has no difference (the script, Nastaliq, is the same), spoken Urdu in Hyderabad is quite different from spoken Urdu in Pakistan; it's very, very easy for a native speaker to make out the difference. Not just pronunciation, but also idioms, expressions and, most importantly, slang; I'm no expert on dialects, and quite evidently, I may be emotionally biased, but to my ear, there's no way those recorded statements from terrorists holed up in the Oberoi can be Hyderabadi. There's a certain sing-a-long quality to the dialect that was missing; it definitely sounded more Punjabi-Pakistani than anything else.

    To me, it's quite unimaginable for a local group 'fighting' for Muslim rights to not issue manifestos and declarations in Urdu. It may not _clinch_ involvement of Pakistan-based groups, and I don't think even Praveen Swami would argue that, but the very fact that they presumed nobody would understand Urdu and thought it was necessary to 'translate' their document into Hindi sounds very strange indeed.

  25. [links] Link salad is woozy with the heady smell of December | jlake.com said,

    December 2, 2008 @ 9:21 am

    […] new word for the day: Abugida — Picked it up while reading this fascinating piece on textual analysis being used to investigate the Mumbai terrorist attacks. (Specifically, the […]

  26. Aaron Davies said,

    December 2, 2008 @ 10:59 am

    @faith, @killer: reminds me of an amusing incident in my first (and only) semester of japanese: during the conversation-with-the-instructor section of one of the exams, he asked me if i'd seen the movie スパイダマン (su-pa-i-da-ma-n). it took me most of a minute, and a repetition on his part without the Japanicization, to figure out that he was talking about Spiderman.

  27. dr pepper said,

    December 2, 2008 @ 4:40 pm

    Could someone please offer me an analogy? I'm californian. I speak the majority dialect. In relative terms, what dialect would differ as much from mine as urdu differs from hindi?

    Brooklyn?
    Georgia?
    Gulla?
    Jamaica?
    Orkney?

  28. dw said,

    December 2, 2008 @ 5:05 pm

    dr pepper: there is no real equivalent of the Urdu-Hindi divide in English. The closest analogy would be someone also from California who speaks the same dialect with pretty much the same accent but is of a different religion, writes their language in a different script, and uses different higher-level vocabulary.

  29. dr pepper said,

    December 2, 2008 @ 7:06 pm

    Like say, standard speech vs geekspeak with l337 spelling?

  30. Michael Norrish said,

    December 2, 2008 @ 8:05 pm

    I can vouch that it’s hard to learn to read a foreign script. I’m fairly good at Japanese hiragana and katakana when they make up familiar words, but can be reduced to having to spell words out when the word is unfamiliar (common with katakana renderings of English, like the Spiderman example above). Similarly, as a theoretical computer scientist, I’m very familiar with the individual letters of the Greek alphabet, but because I never see these characters making up real words, I have to slowly spell out Greek when I see it.

  31. john riemann soong said,

    December 3, 2008 @ 4:36 am

    "Learning to read and write the International Phonetic Alphabet is an exercise that phonetics students around the world (as well as opera singers and others) work on mastering, every year."

    Well, it depends on interpretation of the sound system too. At some points I would have used different transcription practices, marking the /j/ part of diphthongs differently, etc.. I mean, you split the diphthong /a^j/ into two different syllables! Often I think that's why it's so difficult to read. I mean I can read *my* own transcriptions pretty fluently. ;-)

  32. Nigel Greenwood said,

    December 3, 2008 @ 6:33 am

    @ dr pepper & dw:

    An analogy in the Slavic languages might be Serbian (Cyrillic) vs. Croatian (Latin). The existence of a single language, formerly called Serbo-Croat, is hugely controversial.

    Another analogy might be Turkish in the period 1930-1950, say. The older generation had grown up with the Ottoman language (with a large Perso-Arabic element) written in the Arabic script; while the younger generation knew only the Latin script & a form of Turkish moulded by a vigorous language reform aimed at "purging" the non-Turkish elements of the language.

  33. Quote Of The Month, December 2008 | Social Services for Feral Children said,

    December 3, 2008 @ 7:33 pm

    […] Jonsson, at an in-and-of-itself fascinating post at Language Log, comments: Oskar, are you saying you can't believe terrorists wouldn't take time to do […]

  34. Akshay said,

    December 4, 2008 @ 2:38 am

    @dr pepper: In fact, I'd say it's a bit like American-English versus British-English but with different scripts. Polite diction is quite different in Hindi and Urdu, as are words for many common things, although those words are often used interchangeably. An area where there's considerable difference now is in official or technical terms; most governmental terms

    To take a South-East Asian example, it's perhaps like the difference between modern Bahasa Indonesia and Malay, if it's written in Jawi exclusively.

  35. Akshay said,

    December 4, 2008 @ 2:51 am

    @dw: Actually, and I've been trying to highlight this here, there's a considerable amount of geographic specificity and pronunciation difference. At least in Hyderabad, there's a fair amount of difference in the specific words used; a hospital may be called 'अस्पताल' ('aspataal') in Hindi, but a 'dawa-khaana' in Urdu on the same notice-board. There's a noticeable amount of difference in pronunciation as well, especially if there's a 'kh' sound involved.

    Finally, while religious extremists on either side like to point out a religion-based distinction between Urdu and Hindi, the reality is that it's a lot more complicated than that; what's considered as Hindi or Urdu literature, for example, is often more a result of where that work was written, than what the religion of the writer is. I've seen Urdu notice-boards in Hindu temples for instance, even those in such far-flung places as Singapore and beyond.

  36. Karan said,

    December 4, 2008 @ 7:30 am

    Here's the letter itself:

    http://www.scribd.com/doc/8537000/Deccan-Mujahideen-Letter

    I can second what Akshay says about the accent of the voice in the recorded statement to NewsX being Pakistani-Punjabi and not Indian-Hyderabadi (I'm from the Indian Punjab), although I'm not sure that the voice has been confirmed as being of one of the terrorists during the attack. There was a line or two that the person on the phone addressed to someone in the background that was entirely in Punjabi, roughly translated: "What are our demands?"

  37. David Marjanović said,

    December 4, 2008 @ 8:54 pm

    An analogy in the Slavic languages might be Serbian (Cyrillic) vs. Croatian (Latin).

    As mentioned, the situation with the scripts is quite different: firstly, Serbian uses both almost at random — if you're in Serbia and can only read one, you're functionally illiterate –, and secondly, the two alphabets have an exact 1:1 correspondence to each other (with a little fudging, but still); if it weren't for the false friends between the alphabets, you could mix them in the same crossword puzzle, and it would still work!

  38. David Marjanović said,

    December 4, 2008 @ 8:57 pm

    To wit: Давид Марјановић. Can I have a "duh"? :-)
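The near one-to-one correspondence between the two Serbian alphabets described above can be sketched as a simple lookup table. This is an illustration only: the function name is hypothetical, and the three Latin digraphs (lj, nj, dž) are the "little fudging" that keeps the mapping from being strictly character-for-character.

```python
# Serbian Cyrillic letters and their Latin counterparts, in alphabet order.
CYRILLIC = "абвгдђежзијклљмнњопрстћуфхцчџш"
LATIN = [
    "a", "b", "v", "g", "d", "đ", "e", "ž", "z", "i",
    "j", "k", "l", "lj", "m", "n", "nj", "o", "p", "r",
    "s", "t", "ć", "u", "f", "h", "c", "č", "dž", "š",
]
TABLE = dict(zip(CYRILLIC, LATIN))


def cyrillic_to_latin(text):
    """Transliterate Serbian Cyrillic to Latin, letter by letter.

    Characters outside the table (spaces, punctuation) pass through
    unchanged. Uppercase letters are mapped via their lowercase form
    and then capitalized, so a capital digraph comes out as e.g. "Lj".
    """
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in TABLE:
            latin = TABLE[lower]
            out.append(latin.capitalize() if ch != lower else latin)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(cyrillic_to_latin("Давид Марјановић"))
```

No guessing is needed in this direction, which is the contrast with the Urdu-to-Devanagari case: every Cyrillic letter determines its Latin counterpart, whereas an unwritten Urdu short vowel does not determine a Devanagari matra.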

  39. nosleepingdog said,

    December 10, 2008 @ 7:55 pm

    @Mark Liberman

    in an early reply (December 1, 2008 @ 10:49 am) you said,
    " I can puzzle out the pronunciation of a passage, I think, modulo some issues with deleted vowels."

    "Modulo", ablative of "modulus", has mathematical meanings. In your use does it mean "except for [with an implication of small difference]"?

    [(myl) Like most educated adults, I'm aware of the mathematical meaning, as well as the extended or figurative sense that the OED glosses as A.b.(a) "With respect to an equivalence defined by (some feature), disregarding differences indicated by (some unimportant feature)". Or sense 2 of the American Heritage Dictionary's entry "Correcting or adjusting for something, as by leaving something out of account". ]
