Language extinction and language creation


I just thought of a (not so) funny phenomenon that amounts to a linguistic paradox.  Namely, as languages die out, one after another, so do they arise at a steady rate. This has been a verity throughout human history.  So inexorable are these trajectories that a clever mathematician might be able to work out an equation to account for them.

The cycle of language saṃsāra संसार is ceaseless.

We know well enough how languages disappear — usually forever (see "Archive for Language extinction"; see also here).  We have also witnessed the birth of languages, which happens for a variety of reasons (social, political, linguistic, etc.).  Increasingly, however, artificial languages are being invented in astonishing numbers.

I was led to these ruminations by having read the following article:

“Dune” and the Delicate Art of Making Fictional Languages
The alien language spoken in Frank Herbert’s novels carries traces of Arabic. Why has that influence been scrubbed from the films?

By Manvir Singh
February 28, 2024

This is a long piece full of detailed information.  I will not attempt to summarize or abridge it, but will simply hit a few of the high spots.  I will preface the excerpts by saying that this is all about conlangs, or constructed languages.

A constructed language (shortened to conlang) is a language whose phonology, grammar, and vocabulary, instead of having developed naturally, are consciously devised for some purpose, which may include being devised for a work of fiction. A constructed language may also be referred to as an artificial, planned or invented language, or (in some cases) a fictional language. Planned languages (or engineered languages/engelangs) are languages that have been purposefully designed; they are the result of deliberate, controlling intervention and are thus a form of language planning.

There are many possible reasons to create a constructed language, such as to ease human communication (see international auxiliary language and code); to give fiction or an associated constructed setting an added layer of realism; for experimentation in the fields of linguistics, cognitive science, and machine learning; for artistic creation; for fantasy role-playing games; and for language games. Some people may also make constructed languages as a hobby.


Second, the star of this New Yorker feature is David J. Peterson (b. 1981).  I remember him from several earlier Language Log posts, especially here, here, and here.

Born in Long Beach, California, Peterson started to create languages in 2000, while he was a sophomore at U.C. Berkeley. His early projects were amusing experiments: X, a language that could only be written; Sheli, which included only sounds that he liked and was initially unpronounceable; and Zhyler, which he created because he enjoyed Turkish and which, in honor of the Heinz Company, had fifty-seven noun cases. In 2005, he graduated with a master’s degree in linguistics from U.C. San Diego. Two years later, he co-founded the Language Creation Society with nine other conlangers.

Peterson’s big break came in 2009, when HBO reached out to the Language Creation Society with a strange request. They were creating a television show (which would turn out to be “Game of Thrones”) and wanted someone to develop a language (which would emerge as Dothraki). Nothing like this had ever happened before, so the society organized a competition that would be judged by the show’s producers. After signing a nondisclosure agreement, applicants were invited to send in a phonetic breakdown of Dothraki, a romanized transcription system, six to eight lines of translated text, and any additional notes or translations.

Peterson had an edge over his competitors: unemployment. For two and a half weeks, he worked eighteen-hour days, assembling a hundred and eighty pages of material. He made it to the second round and eventually produced more than three hundred pages in Dothraki. He landed the job and was later invited to develop five more languages for the series, including High Valyrian, which proved especially popular among fans. In 2017, a High Valyrian course launched on the language-learning app Duolingo; at one point in 2023, more than nine hundred thousand people had signed up as active users.

Along with James Cameron’s “Avatar” (2009), which appeared in theatres soon after Peterson was hired by HBO, the first season of “Game of Thrones” demonstrated that audiences not only tolerated fictional languages—they loved them. What had previously been a nerdy pastime transformed into a standard of fantasy filmmaking. Peterson became the go-to language wizard….

Peterson does not just make his fantasy languages out of whole cloth.   He puts in a lot of hard spadework.

Peterson’s success stems from a commitment to naturalism. He knows languages well; he has studied more than twenty, including Swahili, Middle Egyptian, and Esperanto, and seems to have an endless mental Rolodex of the lexical, grammatical, and phonological patterns found around the world. Yet, when an interviewer asked him how, when assembling a new conlang, he decides “which aspects of a language to borrow from and mimic” (Greek suffixes? Mongolian tenses? Japanese particles?), he rejected the premise. “If you just ripped out a structure from one language and put it in your own, the result would be inauthentic,” he replied.

Peterson's lingual imaginings are not all a bed of roses:

Peterson’s idea of authenticity sometimes puts him at odds with his source texts. When creating High Valyrian, Peterson was forced to include words that George R. R. Martin had composed for the books, including dracarys, meaning “dragon fire.” The word was obviously inspired by the Latin draco, meaning “dragon,” a decision that Peterson found “unfortunate.” “In the universe of the books, there is no such thing as the Latin language—or any of the other languages on Earth,” he once wrote. “It is literally impossible for any word (or anything else) in the Song of Ice and Fire universe to be related to anything in our universe.” As a result, he made dracarys its own root and chose zaldrīzes as the word for “dragon,” provoking a string of disappointed comments from “Game of Thrones” fans on his blog.

As Peterson laid out in his 2015 book, “The Art of Language Invention,” he treats languages as evolving systems whose features are interconnected and shaped by a unique history. To design verbs in High Valyrian, for example, he simulated a four-stage evolution from a prehistoric form. In the version of High Valyrian spoken in “Game of Thrones,” verbs have an imperfect stem (for past actions that were continuous or incomplete) and a perfect stem (for past actions that were completed). The perfect stem, he decided, was formed in ancient times by appending -tat to the end of the imperfect. Over time, this became -tet and then -et, which often reduces to -t in the version spoken in the television show. (During that imagined history, -tat also gave rise to the verb tatagon, meaning “to finish.”) There are countless other intricacies to High Valyrian verbs, yet, for Peterson, even producing this lone grammatical feature required simulating generations of linguistic change.
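The suffix history in that passage (-tat, then -tet, then -et, then -t) can be pictured as a short chain of ordered rewrite rules. What follows is only a toy sketch of that idea, not Peterson's actual method: the rule list, the function name `evolve`, and the placeholder stem `STEM` are all assumptions made for illustration, and since the excerpt does not say under what conditions each change applied, every rule here is applied unconditionally.

```python
# Toy illustration only: these three rules simplify the suffix history
# described in the excerpt (-tat > -tet > -et > -t). The conditioning
# environments are not given in the text, so each rule is applied
# unconditionally to the end of the word (an assumption).
SUFFIX_STAGES = [
    ("tat", "tet"),  # stage 1 -> 2: the suffix vowel changes
    ("tet", "et"),   # stage 2 -> 3: the suffix loses its first t
    ("et", "t"),     # stage 3 -> 4: reduction (only "often", per the text)
]


def evolve(form, rules=SUFFIX_STAGES):
    """Apply each rule in order and return every intermediate form."""
    history = [form]
    for old, new in rules:
        if form.endswith(old):
            form = form[: -len(old)] + new
        history.append(form)
    return history


# "STEM" is a placeholder, not an attested High Valyrian stem.
stages = evolve("STEM-tat")
print(" > ".join(stages))
```

Even in this crude form, the order of the rules matters: each stage only makes sense as the output of the one before it, which is the point Peterson makes about languages being shaped by their history.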

When Peterson was invited to work on "Dune", he decided "to develop what conlangers call an a-priori language—one whose vocabulary and grammar are wholly original, and not derived from an existing linguistic system."

Creating something new might have made sense for other projects, but, as fans will surely inform you, language functions differently in “Dune.” Written by Frank Herbert, and originally published in 1965, the novel recounts how noble houses compete to control the desert planet Arrakis (the eponymous Dune), the only source of the most precious substance in the universe. The story entwines the fate of the aristocratic Paul Atreides with the indigenous Fremen, whose harsh desert life style and religious prophecies set the scene for ecological challenges and epic political face-offs.

Herbert’s “Dune” takes place unimaginably far in the future. The time span separating us from the events of “Dune” is roughly twice the distance between us and the end of the Ice Age; sabre-toothed tigers are closer to us than the plot of “Dune” is. Nevertheless, it’s a world suffused with familiar echoes, most of which manifest in language. The novel features words derived from French (“verite”), Turkish (“kanly”), Hebrew (“Kwisatz Haderach”), German (“schlag”), and Navajo (“Nezhoni”). Having been raised in a Sikh household, I remember noticing the emperor’s title, Padishah, a Persian term that has been used as an honorific for rulers in North Africa, the Middle East, and South Asia. Sikhs use it to refer to God and the ten prophet leaders, or gurus.

Often when one is dealing with fabricated languages, illusion comes smack dab up against actuality.  The identity and quiddity of Arabic in "Dune" is one such clash of cultures, history, and sheer linguistics.  The Dune universe began with the epic 1965 science fiction novel by Frank Herbert.  Along the way, it has evolved into a franchise of the same name, the latest iteration of which, in 2024, is directed by Denis Villeneuve, with the invention of its languages entrusted to David J. Peterson.  It is perhaps inevitable that the visions of the science fiction novelist and the language maven would come into conflict.  In my estimation, at least in this instance, the linguist is the more honest and truthful because he does not allow himself to be constrained by the political polemics of the novelist.

The language with the greatest influence in “Dune” is Arabic. In the novel, the Fremen use at least eighty terms with clear Arabic origins, many of them tied to Islam. The Fremen follow istislah (“natural law”) and ilm (“theology”). They respect karama (“miracle”) and ijaz (“prophecy”), and are attentive to ayat (“signs”) and burhan (“proof”) of life. They quote the Kitab al-Ibar, or “Book of Lessons,” an allusion to the encyclopedia of world history penned by the fourteenth-century Arab historian Ibn Khaldun.

Central characters are dignified with Arabic names. The colossal sandworms are called shai-hulud (“thing of eternity”). Paul Atreides’s sister is Alia (“exalted”). Paul himself is known as Muad’Dib, an epithet that resembles the Arabic word for teacher (mu’addib), and he is fabled to be the Lisan al-Gaib, translated in the book as “Voice of the Outer World” but which, in modern Arabic, means something closer to “Tongue of the Unseen.”

The book explains these similarities. “We are the people of Misr,” says a Fremen wise woman, using the Arabic word for Egypt, elaborating that their “Sunni ancestors fled from Nilotic al-Ourouba,” or Nile of the Arabs. The intervening millennia fused their Sunni heritage with a variant of Buddhism, but that doesn’t change a basic fact: the Fremen are descendants of Muslim Arabs, and they wear that heritage in their speech.

I find it difficult to envisage how Sunni heritage could be fused with a variant of Buddhism, yet somehow Herbert seems to have allowed himself to be seduced by such a train of thought, one, as we shall see momentarily, that Peterson refused to follow.

Why did Herbert Arabize the Fremen? Scholars such as Ali Karjoo-Ravary, a professor of history at Columbia, and Haris Durrani, a Ph.D. student at Princeton, have argued that the Fremen identity is an allegory. The book, about barons and dukes with European names vying for a desert land and the invaluable commercial booty buried in it, is a transparent metaphor for the liberationist struggles that convulsed the world in the nineteenth and twentieth centuries. Herbert’s research materials, combined with explicit references in the novel, reveal that many of the campaigns that inspired him were by Muslims: by Chechens against Russians, by the Sudanese against the Anglo-Egyptians, by Algerians against the French. Herbert also said, in a 1976 interview, that he resented the tendency “not to study Islam, not to recognize how much it has contributed to our culture.” By making it a “strong element” in the book, Herbert may have been trying to convey the “enormous debts of gratitude” that he felt humanity owed Islam.

Although Peterson’s version of the Fremen language retains a vaguely Arabic sound, almost all other traces of the language have been expunged from Villeneuve’s “Dune” films. Peterson claims that this is in the name of believability. “The time depth of the Dune books makes the amount of recognizable Arabic that survived completely (and I mean COMPLETELY) impossible,” he wrote on Reddit. When a user asked him to explain, he pointed to “Beowulf,” which was written around a thousand years ago and is uninterpretable to most modern English speakers. “And we’re talking about twenty thousand years?! Not a single shred of the language should be recognizable.” Key terms like shai-hulud and Lisan al-Gaib have made it into the films, but they’re treated in Peterson’s conlang as fortuitous convergences, not ancient holdovers, as if English were to one day lose the word “sandwich” only to serendipitously re-create it thousands of years later from new etymological building blocks.

Of the Arabic excisions in the new “Dune” films, two in particular stand out. One is of jihad, Herbert’s term for the fervent crusade led by Paul Atreides with the Fremen against the oppressive interstellar regime. Herbert saw jihad as the embodiment of messianic and religious passion—a force that is socially transformative and potentially liberating, but also dangerous and to be feared: “The ancient way, the tried and certain way that rolled over everything in its path.” Though now the word is overwhelmingly associated with Islamic extremism and terrorism, the original “Dune” offers a nuanced consideration of the concept that goes beyond simplistic and negative portrayals.

The second omission is evident in that powerful moment from the trailer, Paul Atreides’s call to his fighters. From what we’ve seen, Paul speaks Peterson’s fictional language. Without a subtitle, he would be unintelligible. In the book, however, the phrase “Long live the fighters” is written as “Ya hya chouhada,” a reference to a celebratory chant from the Algerian war of independence, which Herbert renders in Frenchified Arabic. This line, more than any other, connects the Fremen’s struggle to recent independence movements, turning them from outer-space sand people into portraits of anti-imperialism. The scholar Khaldoun Khelil, drawing on his Palestinian Algerian heritage, has described the whitewashing of these characters as an effect of Western media’s tendency to portray Arabs as “bad guys—fanatics with unreasonable demands and a strange religion.” Because “Arabs can’t be heroes,” Khelil writes, “we must be erased.”

Herbert may have striven to undo erasure, but Peterson will have none of it.  Herbert is an enormously successful polemicist and novelist, but Peterson wishes only to maintain coherence and integrity within the scope of his fabricated language.

Herbert’s and Peterson’s competing takes on this phrase embody two approaches to speculative world-building. For Herbert, the imagined world becomes relevant when it includes fragments of our reality—when, as he put it in a 1978 interview, “something of here and now has been carried to that faraway place and time.” For Peterson, in contrast, the imagined world works best when it makes logical sense, when the languages its inhabitants speak are consistent with the grounds of speculation. His techniques serve to enhance internal coherence and thus immersion, which is why he has become so sought after in Hollywood.

Whatever Peterson says, the world we see in “Dune” was never meant to be fully sealed off from the one we know. The story supposedly takes place in the far-off future, yet even Villeneuve’s version is filled with elements from our here and now. Characters speak English. They have names like Paul, Duncan, Jessica, and Vladimir. The first film included a conversation in modern Mandarin between Paul Atreides and his doctor, Wellington Yueh. And then there are those bagpipes. These choices poke holes in some of the franchise’s internal logic, but they also evoke our sympathies and aesthetic associations, and make the story comprehensible to us.

When all is said and done, though, do fictional, artificial languages — whether of Herbert's compromised (I would argue) style or Peterson's pure (as I see it) type — work?  Do they fulfill their purpose qua language?  How often do they exist as a vehicle of sustained, coherent speech?

In short, are conlangs functional?  Or are they merely for effect — or are they for flavor and fragrance, like spices added to real languages?  Even if they are not systems of communication that can convey all the meanings, sentiments, and information that human beings find it necessary to pass on to each other, perhaps they impart an extra desirable ingredient that stimulates human introspection and sensitivity.

One thing is certain:  conlangs are proliferating, in considerable measure because of the genius of linguists like David J. Peterson.



[h.t. Jonathan Silk]


  1. Da Def said,

    March 2, 2024 @ 2:20 am

    "perhaps [conlangs] impart an extra desirable ingredient that stimulates human introspection and sensitivity."

    Nick Farmer's conlang, Lang Belta (aka "Belter Creole" or qdb in eSpeak), stimulated this site and the material to which it links:

    As to "vehicle of sustained, coherent speech": you have not experienced Shakespeare until you've read him in Belter Creole.

  2. Chris Button said,

    March 2, 2024 @ 7:39 am

    I enjoyed Peterson's book. Even for someone like me who knows the linguistics, it was still an interesting read.

    I suppose there is always going to be a suspension of disbelief. That can go for the conlang being used or even the visuals. Take a Sergio Leone film where a character suddenly appears from the edge of the screen to confront another character. The effect on the viewer is great, but it would hardly be possible in real life because they would have seen them coming.

  3. John Kozak said,

    March 2, 2024 @ 8:45 am

    "Nothing like this had ever happened before" – eh? Marc Okrand constructed Klingon for the Star Trek movies in the 1980s.

    It's a long time since I read Dune, but isn't Muad'Dib explained there as a desert mouse? I always hoped subsequent editions would render "Paul Muad'Dib" as "Paul the Gerbil".

  4. David Marjanović said,

    March 2, 2024 @ 8:50 am

    In short, are conlangs functional? Or are they merely for effect — or are they for flavor and fragrance, like spices added to real languages[?]

    Depends. Some are capable of being functional; it's just that few to no people have the dedication it takes to create native speakers for a conlang – as has happened, barely, with Esperanto and Klingon.

  5. David Marjanović said,

    March 2, 2024 @ 8:51 am

    (oops, I switched my italics tags around)

  6. Jamie said,

    March 2, 2024 @ 9:03 am

    I sympathise with Herbert's literary ambitions for the Fremen language and with Peterson's more functional view. But, in the end, this is a work of fiction/art and I can't help feeling that Peterson's take is undermined by the fact that most of the dialog is in contemporary English. If we accept that whatever language is spoken can be rendered based on English, why can't the other fictional language be rendered as something Arabic based?

  7. wsa said,

    March 2, 2024 @ 9:10 am

    Another language of the Fremen in the books is Chakobsa, a hunting language. The idea and name originate in the Caucasus, but in the books it is just Romani, the use of which raises even more questions around appropriation. See this Language Hat post:

    The Islamic resonances are in Dune for a reason (not just with the Fremen by any means), and in this case I think "good" linguistics has overrun the needs of literature. But I have to wonder more broadly, would the films get approved in China, say, or do well in Western nations, if you have Arabic coded characters, speaking Arabic, shouting "long live the martyrs" in Arabic? I suspect not entirely brave economic decisions were part of the (partial) de-Islamification of the film adaptations.

  8. GH said,

    March 3, 2024 @ 7:25 am

    I would agree that Peterson's rationale, though probably sincere on his part, is almost certainly an alibi for the movie producers to avoid awkward political issues. And having seen the film, I think the effect is unfortunate.

    The Fremen are still very obviously Arab- and Muslim-coded, and removing authentic elements (such as references to the hajj, as well as the casting) in favor of imaginary imitations only serves to make them more of an Orientalist caricature.

    The purpose of science fiction is not to offer the most realistic projection of what the distant future will be like but to speak to the world that created it, and quibbles about plausible linguistic developments are beside the point.

    There is no way to come up with any internal logic by which Frank Herbert's mishmash of languages makes sense within the universe, but that mishmash creates the literary effect he was going for: a future where today's cultures have blended, remixed and differentiated in new ways. Also, each language brings familiar connotations. The Bene Gesserit Sisterhood's use of Latin suggests the Catholic church, the French vocabulary used by the Great Houses suggests European aristocracy, archaic English words suggest Biblical quotations, etc.

  9. GH said,

    March 3, 2024 @ 9:09 am

    @ wsa:

    As the page you link to mentions in passing, the "Chakobsa" texts used by Herbert are not "just" Romani, but include bits in Serbo-Croat (as well as some words from Arabic, not to mention the phrase giudichar mantene, which is medieval Italian). This is because they were mainly adapted from the book Gypsy Sorcery and Fortune Telling by Charles G. Leland, which includes a lot of Roma poems and songs from various countries, many of them in the local majority languages. The meaning Herbert gives them is sometimes loosely based on Leland's translation, sometimes invented for the novel.

  10. Lameen said,

    March 3, 2024 @ 4:12 pm

    We're talking about a fictional world in which Reverend Mothers, Paul, and who knows how many other people canonically have access to ancestral memories going all the way back to the dawn of human history – evidently including the languages spoken by their ancestors. If you're willing to swallow that, the preservation of ancient languages across 30,000 years should be a cinch.
