The xkcd site is promoting Randall Munroe's forthcoming book Thing Explainer, in which things are explained in the style of his comic "Up Goer Five", "using only the ten hundred words people use the most often".

At the time that "Up Goer Five" came out, Theo Sanderson created the Up Goer Five Text Editor, which checks words as you type them:

I'm sure that others have composed interesting things in this particular form of constrained writing, but Aaron (Doctor Whom) Dinkin is the only person I know who published a version of the abstract of his dissertation in this style:

I went to the state that the biggest city in this part of the world is in—but the "up" part of the state, not the part near that city—and listened to how people talk. I picked that state because it sits between four or five areas where people talk in different ways, and I wanted to find out where one way of talking ends and the next begins.  

There's a sound change that a lot of people have in the cities near the Great Bodies of Water, where the sound in "cat" and "back" becomes very high, the sound in "hot" and "lock" becomes very front, and a few other sounds change too. In the state I was looking at, it turns out that a lot of people have this sound change even as far away as the right edge (so to speak) of the state—but the "cat" sound only becomes high in cities that were founded by people from the state that's further down and to the right, along the Sound. I think the reason the "cat" sound doesn't become that high in cities that were founded by people from other places might be because there's a bigger space between the "cat" sound and the "can't" sound in those cities, and the "can't" sound is already very high. This means that who founded which cities can still be important to how people talk even hundreds of years later.  

Also, I found out that people in most parts of the state have different sounds in "hot" and "lock" on the one hand and in "caught" and "talk" on the other hand. The only part of the state where most people have the same sound in the middle of all of those words is the far top right corner. However, it looks like changes are happening that will end up with people saying them the same in most of the rest of the state; for younger people, the "hot" sound is a lot further back than it is for older people. I didn't expect that. But it shows that when two different sounds come to be said the same way in one area, it really is easy for that change to be picked up by people in other areas. Having that sound change I mentioned above, where "hot" became more front, doesn't stop them from picking up this other change; they just started moving "hot" back again.  

Finally, I listened to how people say the long word that you use to talk about the kind of school that children between the ages of about five and ten go to. It turns out that in most of the state, except for the part near that one big city, they make the second-to-last bit of that word much stronger than people in most places (though the middle bit is still the strongest). That's kind of weird.

Here's the original version.

I should note in passing that Theo used a list that includes specific contracted forms with 'll and 'd, thus allowing I'll and we'll, but forbidding he'll and she'll, as well as that'll, it'll, that'd, it'd, etc. And of course other word-frequency lists will have a different set of the thousand most frequent words, depending on the sources used, the tokenization and lemmatization algorithms, etc.
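
To make the point about contracted forms concrete, here is a minimal sketch of how a checker of this kind might flag words. It is not Theo's code: the `ALLOWED` set is a tiny hypothetical stand-in for a real thousand-word list, and the tokenization rule (keep a contraction such as "we'll" as a single token) is an assumption chosen for illustration.

```python
import re

# Hypothetical stand-in for a "ten hundred words" list. A real list is far
# longer; these entries are chosen only to illustrate the contraction issue.
ALLOWED = {"i", "i'll", "we'll", "he", "she", "will", "go", "up", "to", "the", "sky"}

# Assumed tokenization: runs of letters, optionally joined by internal
# apostrophes, so that "we'll" stays one token instead of "we" + "ll".
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def disallowed_words(text, allowed=ALLOWED):
    """Return tokens of `text` not found in `allowed`, in order, without repeats."""
    # Treat typographic apostrophes the same as plain ones.
    text = text.replace("\u2019", "'")
    seen = set()
    flagged = []
    for token in TOKEN_RE.findall(text):
        key = token.lower()
        if key not in allowed and key not in seen:
            seen.add(key)
            flagged.append(token)
    return flagged
```

With this toy list, `disallowed_words("I'll go up to the sky")` returns an empty list, while `disallowed_words("He'll go up to the sky")` flags `He'll`, because the list contains the contracted form for one pronoun but not the other. A tokenizer that split contractions at the apostrophe would judge the same text differently, which is one way two tools working from similar sources can end up disagreeing.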


  1. Adrian Morgan said,

    August 1, 2015 @ 9:09 am

    Of course, it's not exactly in the right spirit to use terms like "high" in their specialised phonetic sense, but "said with not much room between the tongue and the top of the mouth (this kind of sound is called 'high')" is OK.

    The same criticism applies when "there's a bigger space" is used with reference to the vowel diagram, but again, we can patch it up easily enough, e.g. "where the tongue is is more different".

  2. mira said,

    August 1, 2015 @ 9:37 am

    Wait, how do people in New York say "elementary"?

  3. Antti-Juhani Kaijanaho said,

    August 1, 2015 @ 9:38 am

    Adrian Morgan's comment about "high" reminds me of Guy Steele's keynote "Growing a Language" from a computer science conference in 1998. There's a video, and a corrected text was later published in a CS journal (paywalled).

    Steele started out with a restricted English (only words of one syllable) and then, like a programmer does, grew the language by definitions until he was able to actually discuss his main point (that programming languages should be designed for growth by definitions). Dunno how much the actual talk makes sense to a non-programmer, however.

  4. MattF said,

    August 1, 2015 @ 10:45 am

    This exposition of relativity in words of four letters or less (describing the achievements of Izzy and Al):

    is a predecessor of the thing-explainer.

  5. Guy said,

    August 1, 2015 @ 12:14 pm

    I disagree, the meaning of the technical sense of "high" is fairly transparent given its ordinary meaning, and doesn't strike me as particularly different from the locutions used elsewhere in the paper. The use of "front" instead of "in front", however, is harder not to take as a different lexeme.

    Oh wait –

    I don't agree, the tongue placement sense of "high" is pretty clear given its usual meaning, and doesn't seem to me as very different from the ways of talking used in other places in the paper. The use of "front" instead of "in front", however, is harder not to take as a different word.

    ("However" makes it? I'm guessing the word list isn't based on casual speech since not even my initial gloss of "special" for "technical" worked)

  6. Guy said,

    August 1, 2015 @ 12:16 pm

    Oops, "placement" should be "position"

  7. Tim O'Neill said,

    August 1, 2015 @ 12:27 pm

    Not sure what inspired the question about NY, but my wife is from New York and she says [ˌɛləˈmɛntejɹij] (rhymes with "wary" and possibly, depending on your dialect, with "merry" and "bury") whereas many English speakers skip that [ej], rendering it [ˌɛləˈmɛntʃɹij].

  8. AJD said,

    August 1, 2015 @ 12:56 pm

    Mira, I'll answer in the form of a double-dactyl:

    Plattsburgh to Buffalo,
    Interview evidence
    Shows us that fieldworkers
    Ought to be wary:
    Upstate New Yorkers pro-
    nounce documentary

  9. AJD said,

    August 1, 2015 @ 12:58 pm

    Um, except replace "documentary" with "elementary". :-p

  10. AJD said,

    August 1, 2015 @ 1:01 pm

    Adrian, I think the real way I cheated was by using "sound" in the 'body-of-water' sense.

  11. Victor Mair said,

    August 1, 2015 @ 1:18 pm

    Does anyone remember Basic English?


    Basic English is an English-based controlled language created by linguist and philosopher Charles Kay Ogden as an international auxiliary language, and as an aid for teaching English as a second language. Basic English is, in essence, a simplified subset of regular English. It was presented in Ogden's book Basic English: A General Introduction with Rules and Grammar (1930).

    Ogden's Basic, and the concept of a simplified English, gained its greatest publicity just after the Allied victory in World War II as a means for world peace. Although Basic English was not built into a program, similar simplifications have been devised for various international uses. Ogden's associate I. A. Richards promoted its use in schools in China.[1] More recently, it has influenced the creation of Voice of America's Special English for news broadcasting, and Simplified English, another English-based controlled language designed to write technical manuals.

    What survives today of Ogden's Basic English is the basic 850-word list used as the beginner's vocabulary of the English language taught worldwide, especially in Asia.[2]



    It's interesting that Ogden's Basic English had a basic 850-word list, 150 fewer than Randall Munroe's Thing Explainer.

    C. K. Ogden was an extremely intelligent man who put a lot of thought into the Design Principles and execution of Basic English. He was ably assisted in his endeavor by I. A. Richards.


    …[I]n works like The General Basic English Dictionary and Times of India Guide to Basic English, Richards and Ogden developed their most internationally influential project—the Basic English program for the development of an international language based with an 850-word vocabulary. Richards' own travels, especially to China, made him an effective advocate for this international program. At Harvard, he took the next step, integrating new media (television, especially) into his international pedagogy.



    Together with Ogden, in 1923 Richards had written a hugely influential book called The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism.

    Richards later taught in China and wrote a challenging book called Mencius on the Mind. I met Richards around 1967 when he was in his mid-70s and still sharp as a tack. He is thought of as the father of New Criticism and was the originator of the concept of "feedforward" (the opposite of feedback).


    According to the OED Richards coined the term feedforward in 1951 at the 8th annual Macy Conferences on cybernetics and hence Richards's influence extended to cybernetics which makes liberal use of the term feedforward. One of Richards's most famous students was Marshall McLuhan, who also made use of the notion of feedforward.



    Basic English may seem like a simple idea, but here we have two of the most powerful minds of the 20th century devoting a tremendous amount of their time and effort to its development and popularization.

  12. Jerry Friedman said,

    August 1, 2015 @ 1:19 pm

    Here in New Mexico, I hear "documentary" with a secondary accent on the "tar" a lot, but I haven't noticed the same thing with "elementary".

    Was the original version of this sort of thing Basic English?

  13. Jerry Friedman said,

    August 1, 2015 @ 1:37 pm

    Come to think of it, what I think I hear here is what Aaron would write as dócumentàry. It scans at least as well in his double-dactyl, though.

    And how do upstate New Yorkers pronounce "documentary"?

  14. AJD said,

    August 1, 2015 @ 1:39 pm

    Jerry, I've also heard anecdotal reports of stressed-penultimate -mentary words from places as disparate as Cincinnati and New Orleans. At some point I want to do a larger-scope study to try and find out where else this exists.

    Also, I suspect less-frequently-used words are more susceptible to this pattern, which means documentary is more likely to get the stressed penultimate than elementary is, in general. (Though I don't think I have any evidence in Upstate New York of someplace where one is used but not the other.)

  15. Victor Mair said,

    August 1, 2015 @ 1:58 pm

    During the 30s, I think that there were some efforts to come up with Basic Chinese on the model of Basic English, but at the moment I do not recall the details. Perhaps they will come back to me later.

  16. Theophylact said,

    August 1, 2015 @ 2:21 pm

    This somehow reminds me of Poul Anderson's "Uncleftish Beholding", an exposition of basic atomic theory using only words and compounds derived from Germanic sources. A sample:

    The underlying kinds of stuff are the firststuffs, which link together in sundry ways to give rise to the rest. Formerly we knew of ninety-two firststuffs, from waterstuff the lightest and barest, to ymirstuff the heaviest. Now we have made more, such as aegirstuff and helstuff.

    The firststuffs have their being as motes called unclefts. These are mighty small… .

  17. Charles Antaki said,

    August 1, 2015 @ 2:58 pm

    There's a movement in some places (the ones I know of are Australia, New Zealand and Finland, but I'm sure it's wider than that) to use "Easy English" (and its equivalent for Finnish) in writing for people with intellectual disabilities. (They usually have to cope with the usual officialese and bad writing that confounds the rest of us.)

    Because the aim isn't so much to reduce volume of vocabulary as complexity of concept, Easy English doesn't have a word limit as such, but gives tips to write plainly (e.g. use active voice, use short sentences…).

    Here's an example from an Australian guide:
    "• Our organisation can help people with a disability. (not Easy English)
    • We can help you. (Easy English)"
    ("Use bullet points" is another rule.)

    Intriguingly, one of the recommendations is to limit one's use of punctuation – counter-intuitive perhaps, but dashes, brackets and the like add considerable load to the idea being expressed and can usually be avoided.

  18. Mr Punch said,

    August 1, 2015 @ 4:40 pm

    Aren't some of the later Dr. Seuss books written with a very limited (and simple) vocabulary?

  19. Adrian Morgan said,

    August 1, 2015 @ 5:42 pm

    @Guy As an experiment, try talking to someone with no linguistics background about "high" and "low" vowels and see if they have any notion of what you mean. :-)

  20. maidhc said,

    August 1, 2015 @ 6:32 pm

    There's also Simplified Technical English. This is a "controlled language" developed for the aerospace industry to minimize ambiguity.

    The aerospace and defense standard started as an industry-regulated writing standard for aerospace maintenance documentation, but has become mandatory for an increasing number of military land vehicle, sea vehicle and weapons programs as well. Although it was not intended for use as a general writing standard, it has been successfully adopted by other industries and for a wide range of document types…

    The regulated aerospace standard was formerly called AECMA Simplified English, because AECMA (a French acronym for the Association Européenne des Constructeurs de Matériel aérospatial, in English the European Association of Aerospace Manufacturers) originally created the standard in the 1980s. The AECMA standard originally came from Fokker, which had based their standard on earlier controlled languages, especially Caterpillar Fundamental English. In 2005, AECMA was subsumed by the Aerospace and Defence Industries Association of Europe (ASD), which renamed its standard to ASD Simplified Technical English or STE. STE is defined by the specification ASD-STE100, which is maintained by the Simplified Technical English Maintenance Group (STEMG). The specification contains a set of restrictions on the grammar and style of procedural and descriptive text. It also contains a dictionary of approx. 875 approved general words. Writers are given guidelines for adding technical names and technical verbs to their documentation. STE is mandated by several commercial and military specifications that control the style and content of maintenance documentation, most notably ASD S1000D.

    One characteristic is that each word may only be one part of speech. So "close" can only be used in the sense of closing a door. If you want to use an adjective it has to be "near" instead.

  21. Mark Mandel said,

    August 1, 2015 @ 7:04 pm

    Of course, one huge fundamental problem with Basic English is that it defines a "word" as a string of letters between spaces. E.g., instead of "surrender" it uses "give up". Look, we've just scored another point! Never mind the learner's need to memorize a totally unpredictable idiom as another (ahem) lexical item, as long as the word list stays small.

    … Oh dear, that's the second "Dr. Whom" I've heard of besides myself.

  22. ohwilleke said,

    August 1, 2015 @ 7:17 pm

    It is interesting to note that while limiting the lexicon to 1000 words can do a lot to make something easier to understand, the exclusion of proper nouns actually makes the exposition much more difficult to understand and follow.

  23. John Swindle said,

    August 2, 2015 @ 12:18 am

    Wikipedia has an edition in Simple English.

    Voice of America offers programming in what they used to call Special English and now call Learning English. They limit the vocabulary and slow down the delivery. The result feels different from spontaneous foreigner talk.

    Are there similar offerings elsewhere for other languages?

  24. glenf said,

    August 2, 2015 @ 3:14 am

    Finland's national broadcaster, YLE, has a daily 5-minute news bulletin in "clear Finnish", again with limited vocabulary and slower delivery. Some of the newsreaders seem, to my ear, to be very uncomfortable speaking so slowly, and the resulting unnatural phrasing can actually make them more difficult to understand!


  25. maidhc said,

    August 2, 2015 @ 4:13 am

    I've heard that ESL teachers recommend that their students read USA Today, because it is written to a 5th grade reading level. Of course reading the news is good for language learning because you already have some idea what it's about.

    If you carry this too far, you will learn to say "The minister stated that increased employment will result from financial incentives to locate factories in the district" and still not be able to ask for a glass of water in a restaurant.

  26. Rubrick said,

    August 2, 2015 @ 4:27 am

    Project: Combine Charles Kay Ogden's work (as mentioned by Victor) with John Nash's seminal work on the foundations of game theory. Apply the result to the study of comic verse.

  27. CL Thornett said,

    August 2, 2015 @ 4:41 am

    The "Quick Reads" series in the UK is an excellent example of using restricted vocabulary and grammar to produce a variety of texts in the form of short inexpensive books. One of the distinctive features of the series is that well-known writers, especially novelists and sports writers, have been recruited. I had a small set which I loaned to my ESOL students for recreational reading. I read several with pleasure, as with good children's books.

    The series was created for adults whose literacy is limited for whatever reason.

    It's easier to write on this topic without using the restrictions of Quick Reads; I've tried writing for even lower adult literacy or English language levels, and it's a real challenge to produce an interesting text.

  28. Keith said,

    August 2, 2015 @ 4:49 am

    I'm being encouraged to use Simplified Technical English (STE) in my work… for manuals that I'm writing from scratch, this is almost doable, but trying to write in STE when updating an existing manual leads to more and more inconsistencies between chapters.

    Eventually each manual will have been updated so many times that every section will have been rewritten in STE, or so the theory goes. Kind of like "this is my grandfather's axe, my father replaced the haft and I replaced the head, but it's still my grandfather's axe".

  29. Sean Manning said,

    August 2, 2015 @ 5:01 pm

    Basic English appears in the fiction of H. Beam Piper spoken by Thothans, z'Srauff, and other aliens. ("I knew who the z'Srauff were; I'd run into them, here and there. One of the extra-solar intelligent humanoid races, who seemed to have been evolved from canine or canine-like ancestors, instead of primates. Most of them could speak Basic English, but I never saw one who would admit to understanding more of our language than the 850-word Basic vocabulary." - Lone Star Planet)

    Mark Mandel: I am not sure I agree. I think that most Indo-European languages, and some others, have an expression for "to surrender" based on a verb "to give" (Lat. dedere, Akk. something based on nadānu, Sum. probably something based on e3). The directionality and choice of preposition or case or auxiliary phrase may vary, but someone fluent in another language is likely to be able to pick some appropriate marker and be understood. So I do not see such idioms as "totally unpredictable" although the difference in connotations between "give up," "give over," and "hand over" certainly is!

  30. MMcM said,

    August 2, 2015 @ 8:29 pm

    Ogden, with Joyce's consent, translated(?) the part of Work in Progress read for a Gramophone recording into Basic English. (Reprinted here.)

