## Metaphor of the month

Joshua David Stein, "The Loud, Empty Word That Defines President-Elect Trump", The Daily Beast 1/1/2017:

Perhaps because there are so many casualties already accruing and so much damage already being done, it has gone less noted than it should that among the incoming Trump administration’s most endangered victims is the English language itself. Nouns shudder. Adjectives cower. The entire edifice of grammar quivers with fear as January 20th nears.

Of course, one could make the argument that at a time when all the groceries are up in the air, we must prioritize what to catch. Climate change and war are eggs; perhaps language is a loaf of bread.

But language, as any linguist, Lacanian or deliman knows, is the sandwich within which stuff our world. If a thing doesn’t fit inside our words, we can’t bring it to our mouths. It is fundamentally indigestible.

I'm going to guess that there's a missing "we" in "the sandwich within which (we) stuff our world".

And are "linguist, Lacanian or deliman" three epistemological alternatives? Or are Lacanian and deliman subtypes of linguist? Compare "cow, sheep or goat" to "cow, Guernsey or Holstein"…

Morris Halle once told me about a lecture in Paris after which someone — perhaps a Lacanian — asked him suspiciously to define his philosophical orientation. Morris's answer: "Does a shoemaker need a philosophical orientation? If so, then that's mine as well." In this case, I guess I'll follow Morris in identifying myself as an adherent of the deliman school. Though someone that I respect has been trying to persuade me that Jacques Lacan was not, in Noam Chomsky's words, "an amusing and perfectly self-conscious charlatan". So stay tuned.

Anyhow, I can't take any credit or blame for Stein's linguistic world-sandwich. But I'm cited later in the article:

In an analysis of Trump’s language during the campaign, University of Pennsylvania Professor of Linguistics Mark Liberman found Trump’s usage of the word “very” surpassed only “they,” “I,” “don’t,” “going,” and “it.” It rates just above “tremendous.”

This passage makes it seem as if I'm making a claim about word frequencies — but that would be false. In the material that I surveyed (transcripts of four pre-convention debates) very was actually the 23rd commonest word in Trump's contributions, and tremendous was in 140th place. Here's what I actually wrote to Mr. Stein:

It's certainly true that Donald Trump uses "very" more than other politicians in the same context. Thus from the (within-party) debates during the presidential campaign, the overall rates of "very" (per million words) were

Trump   6960
Sanders 3405
Clinton 3246
Cruz    1966
Kasich  1344

If we compare Trump's word counts against the others in that set, and sort the words by the "weighted log-odds-ratio, informative dirichlet prior" algorithm (from Monroe, Colaresi, and Quinn 2009, "Fightin Words"), "very" is in sixth place, just above "tremendous":

they       569 (14043.8)  929 (5854.85) 1498 (7520.53) 8.087
i         1726 (42600.5) 4476 (28209.1) 6202 (31136.4) 7.170
don't      274 (6762.76)  380 (2394.88)  654 (3283.33) 6.492
going      323 (7972.16)  502 (3163.76)  825 (4141.82) 6.374
it         836 (20633.8) 1953 (12308.4) 2789 (14001.8) 6.075
very       282 (6960.21)  429 (2703.69)  711 (3569.49) 6.074
tremendous  45 (1110.67)    2 (12.6046)   47 (235.958) 5.829
you        848 (20930)   2115 (13329.4) 2963 (14875.4) 5.387
he         300 (7404.48)  557 (3510.39)  857 (4302.47) 5.071
nobody      50 (1234.08)   22 (138.651)   72 (361.468) 4.822
mexico      34 (839.175)    5 (31.5115)   39 (195.795) 4.753

where the columns are

WORD TrumpCount (TrumpRate) OtherCount (OtherRate) TotalCount (TotalRate) WLOR

So Donald Trump does tend to use certain intensifiers much more frequently than (most?) other politicians.

I should have sent the actual lexical histograms (i.e. the lists of words with their associated counts), and a link to an explanation of the "weighted log-odds ratio" method.

The idea of that method is to solve the problem of misleading predictions from small counts. If person A uses word X just once in a sample of 40,000 words, and person B doesn't use X at all in a sample of comparable size, then simply estimating probabilities in a naive way from the counts would imply that word X is an infallible indicator that the speaker is A rather than B. But in fact those usage counts are not telling us much at all, since in the next sample of comparable size, the counts might well be reversed.

In contrast, if person A uses word X 200 times in the same sample, and person B uses X once, then we can expect that in the future, use of word X is giving us a pretty strong (though not completely infallible) clue about person-A-ness. And the whole "weighted log-odds ratio" business is a particular way of carrying out the process of appropriately discounting the predictive value of things we haven't seen much — basically a form of what's called statistical shrinkage. (See here for some other approaches to this general problem.)
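For readers who want to see the arithmetic, here is a minimal sketch of the z-scored log-odds ratio with an informative Dirichlet prior, as I read Monroe, Colaresi, and Quinn's description. Two things in it are assumptions rather than facts from the letter: the corpus sizes (40,516 words for Trump and 158,672 for the others) are back-computed from the counts and per-million rates in the table, and the prior pseudo-counts are taken to be the pooled counts from both samples. The function and variable names are mine.

```python
import math

def weighted_log_odds(y_a, y_b, n_a, n_b, prior_w, prior_total):
    """Z-scored log-odds ratio for one word, shrunk by Dirichlet pseudo-counts.

    y_a, y_b    : counts of the word for speaker A and speaker B
    n_a, n_b    : total word counts for A and B
    prior_w     : prior pseudo-count for this word
    prior_total : sum of prior pseudo-counts over all words
    """
    odds_a = (y_a + prior_w) / (n_a + prior_total - y_a - prior_w)
    odds_b = (y_b + prior_w) / (n_b + prior_total - y_b - prior_w)
    delta = math.log(odds_a) - math.log(odds_b)
    # Approximate variance of the log-odds difference; small counts give a
    # large variance, which pulls the final score toward zero.
    variance = 1.0 / (y_a + prior_w) + 1.0 / (y_b + prior_w)
    return delta / math.sqrt(variance)

# Assumed corpus sizes, back-computed from count / (rate per million).
N_TRUMP = 40516
N_OTHER = 158672
N_TOTAL = N_TRUMP + N_OTHER

# (TrumpCount, OtherCount) copied from the table above.
COUNTS = {"they": (569, 929), "very": (282, 429), "tremendous": (45, 2)}

def score(word):
    """Score a word using the pooled counts as the (assumed) prior."""
    y_a, y_b = COUNTS[word]
    return weighted_log_odds(y_a, y_b, N_TRUMP, N_OTHER, y_a + y_b, N_TOTAL)

if __name__ == "__main__":
    for word in COUNTS:
        print(f"{word:12s}{score(word):.3f}")
```

Under those assumptions the scores agree closely with the WLOR column in the table. Note what the shrinkage does: Trump's raw rate for "tremendous" is about 88 times the others' rate, while his rate for "very" is only about 2.6 times theirs, yet the two words end up with similar scores, because the "tremendous" comparison rests on far fewer observations.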

But it's hard to put "informative Dirichlet priors" into a palatable mass-media sandwich.

1. ### tangent said,

January 1, 2017 @ 11:16 pm

"Deliman" has no space or hyphen?

2. ### Neal Goldfarb said,

January 2, 2017 @ 2:50 am

I think there's a calypso song in this…

Come mister deli man,
Deli me Lacana.
Linguist come,
Donald Trump, go home!

3. ### Rubrick said,

January 2, 2017 @ 4:45 am

The final paragraph, from "But language" to "indigestible", seems to suggest that Stein thinks all linguists are Whorfians, if not indeed Orwellians (in the "if you can't say it, you can't think it" sense), which is rather far off the mark.

[(myl) But Edward Sapir, the first half of the "Sapir-Whorf Hypothesis", wrote that

The outstanding fact about any language is its formal completeness. This is as true of a primitive language, like Eskimo or Hottentot, as of the carefully recorded and standardized language of our great cultures. […] [W]e may say that a language is so constructed that no matter what any speaker of it may desire to communicate, no matter how original or bizarre his idea or his fancy, the language is prepared to do his work. […] The world of linguistic forms, held within the framework of a given language, is a complete system of reference, very much as a number system is a complete system of quantitative reference or as a set of geometrical axes of coordinates is a complete system of reference to all points of a given space.

as well as that

We see and hear and otherwise experience very largely as we do because the language habits of our community predispose certain choices of interpretation …

This is sometimes called the "weak Sapir-Whorf hypothesis", which Lane Greene has expressed as "Language nudges thought". And it's pretty clearly true, for some value of "very largely".

(For the context of the quotations, see "The Grammarian and his Language", originally published in The American Mercury in 1924.)]

4. ### Theophylact said,

January 2, 2017 @ 10:50 am

I should think a deliman might well be a tongue maven.

6. ### Jerry Friedman said,

January 2, 2017 @ 11:31 am

In my idea of punctuation, the possibility that "Lacanian" and "deliman" are (the) two varieties of linguist would need another comma:

"But language, as any linguist, Lacanian or deliman, knows, is the sandwich within which stuff our world."

However, it seems to me that a lot of people don't bother with the second comma in pairs that set off parenthetical elements, since all it does is make things clearer.

(If "linguist", "Lacanian", and "deliman" are three different kinds of people, I'd prefer a comma after "Lacanian", but of course that's a minority preference these days.)

The metaphor would have worked a lot better, in my opinion, if Stein had said that language is the bread we use to sandwich our ideas.

tangent: A space or hyphen in "deliman" would have done nothing but make things clearer.

Theophylact: I should think a deliman might well be a tongue maven.

Or well supplied with baloney.

7. ### Jerry Friedman said,

January 2, 2017 @ 11:35 am

Prof. Liberman: So if "Whorfian" isn't the right word for people who believe that "If a thing doesn’t fit inside our words, we can’t bring it to our mouths," what's the right word or short phrase? Or is it a Liff?

[(myl) The meaning of Whorfian is well established by custom, and so there's no need to find another word — and in any case I believe that Benjamin Lee Whorf was more of a Whorfian, in the current sense, than Sapir was.

Adjectives in -(i)an often drift away from their source. Consider boolean, which has come to mean a certain type of algebra, or perhaps a binary variable, although George Boole made contributions in other areas including differential equations and probability theory.]

8. ### Jerry Friedman said,

January 2, 2017 @ 2:52 pm

9. ### J.W. Brewer said,

January 2, 2017 @ 3:33 pm

I'm not sufficiently convinced that Lacan is amusing to accept Prof. Chomsky's characterization of him, but Prof. Halle's retort reminded me of a line from the at least occasionally amusing (and perhaps also a bit charlatanish) Alan Watts: "The self-styled practical man of affairs who pooh-poohs philosophy as a lot of windy notions is himself a pragmatist or a positivist, and a bad one at that, since he has given no thought to his position."

[(myl) I didn't take Morris as pooh-poohing philosophy in general, but rather as objecting to the presupposition that you can't make shoes properly without first working out your philosophical stance on the process.]

10. ### philip said,

January 2, 2017 @ 7:21 pm

I'm with Jerry on the punctuation … and I'm also with the followers of Whorf on the absence of a need for a word for a nameless concept, on the grounds that a nameless concept cannot exist (in language, anyway – it might exist in art, music or other spheres, but we could not talk about it, maybe just feel it?).

[(myl) Are you serious? There are indefinitely many well-defined concepts that are "nameless" in the sense that there is no single word that denotes them. Some simple ones from grade-school mathematics: "negative number"; "equilateral triangle"; "division by zero"; "greatest common divisor"; … Or from grade-school English: "short story"; "topic sentence"; "subordinate clause"; …

I'm always surprised by instances of this strange idea, which you express so clearly: "a nameless concept cannot exist (in language, anyway – it might exist in art, music or other spheres, but we could not talk about it, maybe just feel it?)".]

11. ### philip said,

January 3, 2017 @ 3:07 am

I am half-serious, as (nearly) always, but my definition of nameless would not be 'described by a single word'. Concepts such as those you list are not nameless; they are defined in words. Here is a (futile) attempt at describing a nameless concept: 'You know that sort of thingy that you kind of half-feel or sense sometimes when that other thingy happens? That will be the subject of this essay.'

My original comment was based on the quotation you gave above: "[W]e may say that a language is so constructed that no matter what any speaker of it may desire to communicate, no matter how original or bizarre his idea or his fancy, the language is prepared to do his work. […]" … and was also back-referencing the other post on the subject.