"Love in Translation" (with footnotes)

In the Aug. 8 & 15 issue of The New Yorker, staff writer Lauren Collins has a "personal history" piece entitled "Love in Translation" (subtitled, "Learning about culture, communication, and intimacy in my husband's native French"). It's very nicely written and will surely be of interest to Language Log readers. But Collins relies on some linguistic research without giving proper credit, an oversight I've tried to rectify below.

About halfway through the piece comes this paragraph:

Linguists have attempted to make an objective assessment of the relative difficulty of languages by breaking them down into parts. One factor is the level of inflection, or the amount of information that a language carries on a single word. The languages of large, literate societies usually have larger vocabularies. You might think that their structures are also more elaborate, but the opposite is generally true: the simpler the society, the more baroque its morphology. In Archi, a language spoken in the village of Archib, in southern Dagestan, a single verb—taking into account prefixes and suffixes and other modifications—can occur in 1,502,839 different forms. This makes sense, if you think about it. Because large societies have frequent interaction with outsiders, their languages undergo simplification. Members of relatively homogeneous groups, on the other hand, share a base of common knowledge, enabling them to pile on declensions without confusing one another. Small languages stay spiky. But, amid waves of contact, large languages lose their sharp edges, becoming bevelled as pieces of glass.

The Russian linguist Aleksandr Kibrik studied Archi morphology and came up with 1,502,839 as the number of different forms a verb can take. See Kibrik's entry on Archi in The Handbook of Morphology (Andrew Spencer and Arnold M. Zwicky, eds., Blackwell, 1998, pp. 455–476), and also these articles that draw on Kibrik's work: "Morphological complexity of Archi verbs" (Marina Chumakina, in Tense, Aspect, Modality and Finiteness in East Caucasian Languages, Brockmeyer, 2011, pp. 1-14) and "The unique challenge of the Archi paradigm" (Greville G. Corbett, in BLS37, 2013, pp. 52-67).

But given her larger argument about linguistic simplicity and complexity, Collins is almost certainly relying on the work of John McWhorter. In the introduction to his book What Language Is (Gotham, 2011), Archi is McWhorter's opening example (Kibrik's figure of 1,502,839 verb forms is given on p. 9). Later on in the book, he has a chapter on how Pashto is more complex than Persian and why. Archi also comes up in McWhorter's Jan. 19, 2016 article for The Atlantic, "What's a Language, Anyway?" A Reddit thread (in the r/linguistics subreddit) inspired by McWhorter's piece provides the calculations Kibrik made for Archi (COMM = "commentative" and CNM = "class-number markers").

(See also McWhorter's most recent piece for The Atlantic, "The World's Most Efficient Languages," June 29, 2016.)

In the next paragraph, Collins writes:

Another way to try to rate the difficulty of a language is to consider its unusual features: putting the verb before the subject in a sentence, for example, or not having a question particle ("do"). Researchers analyzed two hundred and thirty-nine languages to create the Language Weirdness Index, anointing Chalcatongo Mixtec—a verb-initial tonal language spoken by six thousand people in Oaxaca—the world's oddest language. The most conventional was Hindi, with only a single unusual feature, predicative possession. English came in thirty-third, making it a third as weird as German but seven times weirder than Purépecha.

The Language Weirdness Index credited to unnamed "researchers" is actually the work of Tyler Schnoebelen, a Stanford-trained linguist who helped found the language analysis startup Idibon (which has now sadly joined the startup graveyard in the sky). Schnoebelen posted on "the weirdest languages" on the now-defunct Idibon blog on June 21, 2013; his post lives on thanks to the Wayback Machine and Schnoebelen's Corpus Linguistics blog. The Idibon post got a fair amount of attention at the time, including on Language Log.

As Victor Mair notes, "in this context, 'weird' means roughly 'has linguistic features that are unlike those of most other languages'" (with features derived from The World Atlas of Language Structures).
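To make that definition concrete, here is a toy sketch of how a score of this general kind could be computed. It is not Schnoebelen's actual procedure: the languages, features, and values are invented for illustration, and the function name `weirdness` and the simple averaging are assumptions of mine. For each feature, it takes the share of languages that have the same value as the language in question, averages those shares, and subtracts the result from 1, so that rarer values push the score up.

```python
from collections import Counter

# Hypothetical languages and feature values, invented purely for illustration.
# These are NOT WALS data.
LANGUAGES = {
    "A": {"word_order": "SVO", "tone": "none", "question_particle": "yes"},
    "B": {"word_order": "SVO", "tone": "none", "question_particle": "yes"},
    "C": {"word_order": "SOV", "tone": "none", "question_particle": "yes"},
    "D": {"word_order": "VSO", "tone": "complex", "question_particle": "no"},
}


def weirdness(languages):
    """Return {language: score}, where the score is 1 minus the mean share
    of languages that have the same value for each of its features."""
    features = sorted({f for values in languages.values() for f in values})
    # For each feature, count how many languages have each value.
    counts = {
        f: Counter(v[f] for v in languages.values() if f in v) for f in features
    }
    totals = {f: sum(counts[f].values()) for f in features}

    scores = {}
    for name, values in languages.items():
        shares = [counts[f][values[f]] / totals[f] for f in values]
        scores[name] = 1 - sum(shares) / len(shares)
    return scores


if __name__ == "__main__":
    # Print the languages from most to least unusual.
    for name, score in sorted(weirdness(LANGUAGES).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.3f}")
```

In this toy data set, language D has the minority value for every feature and so comes out on top, which is the sense of "weird" at issue: unlike most other languages in the sample, not intrinsically strange.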

Here are the top 10 and bottom 10 languages according to Schnoebelen's Weirdness Index:

Rank  Language                  Weirdness Index
1     Mixtec (Chalcatongo)      0.972
2     Nenets                    0.935
3     Choctaw                   0.924
4     Diegueño (Mesa Grande)    0.920
5     Oromo (Harar)             0.919
6     Kutenai                   0.908
7     Iraqw                     0.900
8     Kongo                     0.883
9     Armenian (Eastern)        0.861
10    German                    0.858
230   Basque                    0.189
231   Bororo                    0.153
232   Quechua (Imbabura)        0.151
233   Usan                      0.151
234   Cantonese                 0.143
235   Hungarian                 0.132
236   Chamorro                  0.128
237   Ainu                      0.128
238   Purépecha                 0.100
239   Hindi                     0.087

Despite the lack of references, Collins's piece is well researched and well argued, which is a relief given The New Yorker's spotty record on language-related matters. See our past coverage on Language Log.


  1. Rebecca said,

    August 10, 2016 @ 12:05 am

    Now that virtually everything is published electronically (even if also in print), I wish it would become the norm to include a digital appendix with citations such as you give. Some types of writing, such as Collins' article, might lose their style if they included in-text citations like an academic article does. But it's nice for the readers who want to go deeper to have that choice (not to mention it gives credit where it's due).

  2. Thomas Rees said,

    August 10, 2016 @ 3:06 am

    What Rebecca said. Ms. Collins does give credit to Pullum.

  3. RP said,

    August 10, 2016 @ 3:16 am

    This snippet "not having a question particle ("do")" interested me, because I don't regard "do" as a question particle. But nor does WALS:

  4. leoboiko said,

    August 10, 2016 @ 7:57 am

    Yeah I mean English has this totally weird auxiliary inversion thing: "you can kiss him" → "can you kiss him"? In reasonable languages with question particles, you just tack on the particle and presto, now it's a question.

  5. David Marjanović said,

    August 10, 2016 @ 8:29 am

    You might think that their structures are also more elaborate, but the opposite is generally true: the simpler the society, the more baroque its morphology.

    It's not that simple. Languages spoken in mountain villages at the other side of Asia are often isolating.

  6. J.W. Brewer said,

    August 10, 2016 @ 9:39 am

    @leoboiko, many non-prestige varieties of BrEng do have a question particle ("innit") that works by being tacked on to the end of what would otherwise be a declarative sentence. I don't know to what extent scholars of typology have had occasion to comment on that, as potential evidence of a universal tendency popping up under the radar of the prestige standard form of the language in question.

  7. Coby Lubliner said,

    August 10, 2016 @ 11:19 am

    The two linked articles by John McWhorter are intriguing. In the second one he contradicts what he wrote in the first by referring to Riau Malay as a "dialect of Indonesian."

  8. Michael Watts said,

    August 10, 2016 @ 12:19 pm

    Well, I'd say WALS is right that "do" is not a question particle in any sense. I sympathize with the journalist's desire to provide examples in a language her readers know, though.

    But I have some serious questions about how WALS assigns features.

    English is listed as "no tone" for the feature "tone". At least in American English, the I-don't-know three-note contour, the difference between "You're not going?" and "You're not going.", uptalk, and flat "what" are all purely tonal phenomena. The rising tone applied to yes/no questions is a syntactic requirement, which I would think makes it hard to argue that English is atonal.

    Mandarin Chinese is listed as "3rd person singular only" for the feature "gender distinctions in independent personal pronouns". Mandarin speakers experience extreme difficulty selecting the correct third-person pronoun when speaking a language that does have gendered pronouns. A gender distinction is obeyed when writing the third-person pronoun, but if speakers are unable to perceive it in real time I don't see the case that the distinction is present in the language. And if the difference in writing is what they're looking at, certain Mandarin-writing populations also distinguish 你 from 妳, which is second-person.

  9. Michael Watts said,

    August 10, 2016 @ 12:38 pm

    Follow-up question relating question particles and tone: why is applying a rising tone to the end of a yes/no question different in kind from applying the clitic syllable "ma" to the end of a yes/no question, as Mandarin does, or from applying the clitic syllable "ne" after the first word of a yes/no question, as Latin does? English already features zero-syllable clitic particles (the 's of Jack's).

  10. Bloix said,

    August 10, 2016 @ 1:07 pm

    JW Brewer – every lawyer (and anyone who watches lawyer shows) knows that standard English has a question particle: right.
    Q. You got on the train at 59th Street; right?
    A. Correct.
    Q. And your testimony was you felt something touching you from behind?
    A. Yes.
    Q. But you don't know what that was at that point that was touching you; right?
    A. No.
    Q. And then you turned around?
    A. I turned around and saw a bag, handle of the bag on his shoulders.
    Q. I'm sorry. You turned around and you saw the bag on his shoulders?
    A. Yes.
    Q. And he was wearing a red shirt; right?
    A. Correct.


  11. Rodger C said,

    August 11, 2016 @ 7:05 am

    The idea that "do" is a question particle reminds me of Pedro Carolino: "Don't you will not more?"

  12. flow said,

    August 12, 2016 @ 6:00 am

    The more I think about this whole enterprise the more skeptical I'm becoming. I mean I'm sure the engineering principles are sound—which is what Seven of Nine said about Icheb's wormhole detector. I'm confident we're measuring *something* here, some property Ψ that may or may not be open to intuitive interpretation—which may or may not be what some / all people would describe as 'weirdness'.

    The problem I have with that is that I've encountered such a situation numerous times when counting, sorting, grouping Chinese characters according to some graphical and / or structural properties. Believe me when I say I know a weird CJK glyph when I see it. I have reams of data and long lists that group characters by weirdness because they have so many horizontal strokes, radicals in strange places, form little internal phrases, look like a face, have curved strokes and so on and so on.

    But then again, can I measure that? 一 'one' and 〇 'zero' are quite 'weird' because there's only that single character with that minimal shape—a single stroke—in common use, and only that common character that has a circle. But you use them each day, doesn't that make them commonplace?

    When I do come up with a new measure I frequently find myself pondering what the results are trying to tell me. I'm guessing those numbers only start to make sense when you can put them into context, when there's a baseline, some sort of gauge.

    I just read the Wikipedia articles on German modal particles (https://de.wikipedia.org/wiki/Modalpartikel, https://en.wikipedia.org/wiki/German_modal_particle) and now I feel I'm speaking a weird language. Go try and stick those particles *mal* into your English sentences, and you will *schon* see what comes out of it. I didn't feel that weirdness before but now I acutely do *schon*. You can *ja doch gar* not measure that objectively, can you?

  13. flow said,

    August 12, 2016 @ 6:55 am

    Addendum: For those who'd prefer to see an actual collection of weird characters, head over to https://gist.github.com/loveencounterflow/b8be8b1436f138f8854eb7fa362c4ae0. There you find a short excerpt from a file that I've been treating as a dump yard of sorts, unloading stuff that came to my attention (watt mir ufffiel). The collection would definitely profit from a cleanup but there you have it.

  14. John Cowan said,

    August 14, 2016 @ 1:25 am

    Michael Watts: Tone is a difference in pitch that makes a lexical difference. Chinese and Swedish have tone because many distinct words are differentiated only by their pitch. That's not the same as pitch differences that convey different meanings to whole sentences.

  15. Florence Artur said,

    August 15, 2016 @ 3:49 pm

    @John Cowan: thank you for clarifying this, as I was getting confused for a moment. I always thought of French as a clearly atonal language (it is, isn't it?) but we do use tone to transform an affirmative sentence into a question.

