A thousand things to say… Not!


It is not clear to me whether Chris Lonsdale, the managing shyster director at the language-teaching company Chris Lonsdale & Associates, is an out-and-out liar or merely has pork for brains and believes the nonsense he spouts. But what is clear to me is that not enough people are paying attention to the conjecture I mention in one section of this paper: that almost all strings of English words are ungrammatical.

The conjecture should be put more precisely, because if there is no upper bound on sentence length, the set of well-formed sentences is infinite, just like the set of ill-formed strings. One way to make the claim coherent is to phrase it in terms of what happens to the ratio of well-formed to ill-formed strings in the limit as the string length goes up. The claim is that when you consider longer and longer random word strings, the probability that the string will be grammatical heads very rapidly toward (but never reaches) zero as the string length increases toward (but never reaches) infinity.
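One way the limit claim might be written down is sketched here. The symbols are assumptions introduced for illustration only: V is a finite vocabulary, G_n is the set of grammatical strings of exactly n words, and the random string is assumed to be drawn uniformly from all |V|^n strings of that length.

```latex
% P_n: probability that a uniformly random n-word string is grammatical
P_n = \frac{|G_n|}{|V|^{n}}, \qquad
P_n > 0 \ \text{for all } n, \qquad
\lim_{n \to \infty} P_n = 0 .

% Equivalent statement in terms of the ratio of well-formed to ill-formed strings
\frac{|G_n|}{|V|^{n} - |G_n|} = \frac{P_n}{1 - P_n} \longrightarrow 0
\quad \text{as } n \to \infty .
```

Both sets grow without bound as n increases, so neither is "bigger" in the naive sense; it is only the proportion at each length that shrinks.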

In his vapidly mendacious TEDx bloviation about how you can become fluent in any language in six months, Chris Lonsdale, trying ineptly to get across the idea of syntactic systematicity, announces: "If you have 10 nouns, 10 verbs, 10 adjectives, you can say 1,000 things." What an extraordinarily dumb statement. It takes some work just to peel away a few layers of the onion of its dumbness.

You do get 1,000 possible strings if you pick one word at random from each of three 10-word lists, provided the choices are independent and there is a fixed order among the lists, because 10 × 10 × 10 = 1,000. (If you can also choose the order of the lists, you get 6,000 possibilities, since three lists can be arranged in 3 × 2 × 1 = 6 ways.) But Lonsdale forgot to check how often random strings fail to represent something you can say because they are not even grammatically possible.
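The counting can be checked mechanically. This is a minimal sketch using placeholder items (N0, V0, A0 and so on, which are hypothetical names and not real words); it assumes the three lists share no items, so that no string is counted twice.

```python
from itertools import permutations, product

# Placeholder vocabulary: ten dummy items per category (hypothetical names).
nouns = [f"N{i}" for i in range(10)]
verbs = [f"V{i}" for i in range(10)]
adjectives = [f"A{i}" for i in range(10)]

# Fixed order Noun-Verb-Adjective: one independent choice from each list.
fixed_order = list(product(nouns, verbs, adjectives))

# Free order: every arrangement of the three lists, then one choice from each.
# A set is used so that any accidental duplicates would be collapsed.
free_order = {
    combo
    for ordering in permutations([nouns, verbs, adjectives])
    for combo in product(*ordering)
}

print(len(fixed_order))  # 1000
print(len(free_order))   # 6000
```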

I'm assuming Lonsdale means you have to pick one noun, one verb, and one adjective (because otherwise you could pick Noun three times and get something like *Wombat wombat wombat, which doesn't look like a thing you can say). So to be maximally fair, let's consider only the order Noun-Verb-Adjective, which at least has some hope of yielding genuine English sentences: those with the structure of That felt good or Lonsdale was mendacious or TEDx is ridiculous. (Adjective-Noun-Verb would also work sometimes, as seen from Big girls cry or Mean people suck or Little dogs laughed; but adding that would mean there were 2,000 strings to consider, and he said there aren't, so let's just stick with Noun-Verb-Adjective.)

I made a list of 10 random nouns (hat, table, pizza, dog, milk, friend, gift, kidney, prohibition, war), and a list of 10 random verbs (expected, went, seemed, had, severed, fled, assisted, elapsed, put, expired — I generously inflected them in the preterite to avoid agreement problems like have vs. has), and a list of 10 random adjectives (new, bright, good, false, bad, cool, proud, hopeless, fond, delicious).

Then I wrote a program to construct all of the 1,000 Noun-Verb-Adjective strings you can make with these words. Here is a representative sample, just 10 percent of them:

*Dog assisted new. *Dog elapsed bright. *Dog expected good. *Dog expired false.
*Dog fled bad. *Dog went cool. *Dog had proud. *Dog put hopeless.
*Dog seemed fond. *Dog severed delicious. *Friend assisted new. *Friend elapsed bright.
*Friend expected bright. *Friend expired false. *Friend fled bad. *Friend went cool.
*Friend had proud. *Friend put hopeless. *Friend seemed fond. *Friend severed delicious.
*Gift assisted new. *Gift elapsed good. *Gift expected cool. *Gift expired fond.
*Gift fled false. *Gift went bad. *Gift had proud. *Gift put bright.
*Gift seemed new. *Gift severed delicious. *Hat assisted good. *Hat elapsed hopeless.
*Hat expected bad. *Hat expired false. *Hat fled fond. *Hat went proud.
*Hat had bright. *Hat put new. *Hat seemed good. *Hat severed delicious.
*Kidney assisted fond. *Kidney elapsed cool. *Kidney expected bad. *Kidney expired false.
*Kidney fled good. *Kidney went proud. *Kidney had bright. *Kidney put new.
*Kidney seemed good. *Kidney severed delicious. *Milk assisted good. *Milk elapsed good.
*Milk expected good. *Milk expired new. *Milk fled bright. *Milk went proud.
*Milk had false. *Milk put bad. Milk seemed hopeless. *Milk severed delicious.
*Pizza assisted good. *Pizza elapsed good. *Pizza expected new. *Pizza expired bright.
*Pizza fled proud. *Pizza went fond. *Pizza had delicious. *Pizza put false.
Pizza seemed bad. *Pizza severed cool. *Prohibition assisted delicious. *Prohibition elapsed good.
*Prohibition expected proud. *Prohibition expired delicious. *Prohibition fled proud. Prohibition went bad.
*Prohibition had false. *Prohibition put good. *Prohibition seemed fond. *Prohibition severed bright.
*Table assisted new. *Table elapsed bad. *Table expected false. *Table expired delicious.
*Table fled fond. *Table went hopeless. *Table had proud. *Table put fond.
*Table seemed bright. *Table severed new. *War assisted delicious. *War elapsed fond.
*War expected hopeless. *War expired proud. *War fled bad. *War went cool.
*War had false. *War put proud. ?War seemed bright. *War severed new.
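The program itself is not shown in the post. For anyone who wants to repeat the exercise, here is a minimal sketch of how such strings could be generated from the three word lists given above; the function name make_strings and the capitalize-plus-period formatting are assumptions made for illustration. Judging which strings deserve an asterisk still has to be done by a human.

```python
from itertools import product

# The three 10-word lists from the post.
nouns = ["hat", "table", "pizza", "dog", "milk",
         "friend", "gift", "kidney", "prohibition", "war"]
verbs = ["expected", "went", "seemed", "had", "severed",
         "fled", "assisted", "elapsed", "put", "expired"]
adjectives = ["new", "bright", "good", "false", "bad",
              "cool", "proud", "hopeless", "fond", "delicious"]


def make_strings(nouns, verbs, adjectives):
    """Return every Noun-Verb-Adjective string, capitalized, with a final period."""
    return [
        f"{noun.capitalize()} {verb} {adjective}."
        for noun, verb, adjective in product(nouns, verbs, adjectives)
    ]


strings = make_strings(nouns, verbs, adjectives)
print(len(strings))

# Show the first few strings for inspection.
for s in strings[:5]:
    print(s)
```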

It's not looking too good, is it? By my judgment there are just three clear English sentences in this set of 100, perhaps four: Milk seemed hopeless, Pizza seemed bad, Prohibition went bad, and maybe ?War seemed bright if you like that one. The nonsentences vastly predominate. The indications are that, of the 1,000 strings you can build from your 10 nouns and 10 verbs and 10 adjectives, maybe 30 or 40 would be things you could say, if you were lucky, not a thousand. (Notice that if seem had not been one of my random verbs, it would have been a catastrophe. And notice also that we are considering very short strings, which enormously enhances their probability of being grammatical.)
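The "30 or 40" figure is just the sample proportion scaled up to the full set. A minimal sketch of that arithmetic follows, assuming the 100 strings shown really are representative of all 1,000; the helper name extrapolate is hypothetical.

```python
def extrapolate(hits, sample_size, total):
    """Scale a count observed in a sample up to the full set (integer arithmetic)."""
    return hits * total // sample_size


# Three clear sentences, or four if ?War seemed bright is counted.
low = extrapolate(3, 100, 1000)
high = extrapolate(4, 100, 1000)
print(low, high)  # 30 40
```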

It is no surprise that the audience at a TEDx talk could be taken in by Lonsdale's outrageously false claim. People's bullshit detectors, even when switched on, don't work on linguistic material (though for heaven's sake, what I went through above is not rocket science). That's why they can believe there is a way to become a fluent Mandarin speaker between now and next January. Dream on.

It's just one more illustration of the fact that when the subject matter is language (and perhaps also when it's statistics) you can get away with just about any lie you might want to tell. Nobody checks. You can just make stuff up.
