Wright on language and linguistics

According to Dana Milbank, "Still More Lamentations From Jeremiah", Washington Post, 4/29/2008:

The Rev. Jeremiah Wright, explaining why he had waited so long before breaking his silence about his incendiary sermons, offered a paraphrase from Proverbs yesterday: "It is better to be quiet and be thought a fool than to open your mouth and remove all doubt."

Barack Obama's former pastor should have stuck with the wisdom of the prophets.

Milbank focuses on the fact that at the National Press Club yesterday,

Wright praised Louis Farrakhan, defended the view that Zionism is racism, accused the United States of terrorism, repeated his belief that the government created AIDS to extinguish racial minorities, and stood by his suggestion that "God damn America."

We'll leave those issues to the political blogs. But Rev. Wright's recent divagations have extended into linguistic territory as well. And the results are mixed at best.



What is the polite word for "pimp"?

Hate speech laws, in my opinion, are in general offensive and counterproductive, but the Alberta Human Rights, Citizenship and Multiculturalism Act contains a provision that really takes the cake. Section 3(1) provides:

No person shall publish, issue or display or cause to be published, issued or displayed before the public any statement, publication, notice, sign, symbol, emblem or other representation that

(a) indicates discrimination or an intention to discriminate against a person or a class of persons, or

(b) is likely to expose a person or a class of persons to hatred or contempt

because of the race, religious beliefs, colour, gender, physical disability, mental disability, age, ancestry, place of origin, marital status, source of income or family status of that person or class of persons.

You can't "expose a person … to hatred or contempt because of … source of income"‽ I'm sympathetic to the goal of discouraging the idea that, say, toilet cleaners or leather workers are inferior to other people, but aren't there some occupations that are legitimately and more-or-less universally held in contempt? Are there ways to describe a pimp, a torturer, a pirate, or a slave trader that don't expose them to hatred or contempt? I hope not.



Do People Know What They Say?

Earlier today, in a speech at the National Press Club, Jeremiah Wright, the controversial pastor of the church that Barack Obama attends, said of his relationship with Louis Farrakhan, the leader of the Nation of Islam (transcript) (video):

Louis Farrakhan is not my enemy. He did not put me in chains. He did not put me in slavery. And he didn't make me this color.

Let me get this straight. Putting someone in chains is bad, right? Putting someone in slavery is bad, right? So "making me this color" is also bad, right? Personally, I don't think that there's anything wrong with being black. I'm dismayed that Rev. Wright does.



Happy Birthday: The Legal Story

Some time ago Geoff Pullum wrote about the connection between the song "Happy Birthday" and linguistics, via Archibald Hill, Professor of Linguistics at the University of Texas, who inherited part of the rights to the song. The convoluted story of the copyright to the song has now been sorted out by Robert Brauneis in a paper available here. Professor Brauneis is guest-blogging on the topic today at the Volokh Conspiracy.



A sane survey of Crazy English

As promised a few days ago by Victor Mair, Amber R. Woodward's senior thesis has now been published: "A Survey of Li Yang Crazy English", Sino-Platonic Papers No. 180, April 2008.



Is English more efficient than Chinese after all?

[Executive summary: Who knows?]

This follows up on a series of earlier posts about the comparative efficiency — in terms of text size — of different languages ("One world, how many bytes?", 8/5/2005; "Comparing communication efficiency across languages", 4/4/2008; "Mailbag: comparative communication efficiency", 4/5/2008). Hinrich Schütze wrote:

I'm not sure we have interacted since you taught your class at the 1991 linguistics institute in Santa Cruz — I fondly remember that class, which got me started in StatNLP.

I'm writing because I was intrigued by your posts on compression ratios of different languages.

As somebody else remarked, gzip can't really be used to judge the informativeness of a piece of text. I did the following simple experiment.

I read the first 10^9 or so characters from the xml Wikipedia dump and wrote them to a file (which I called wiki). I wrote the same characters to a second file (wikispace), but inserted a space after each character. Then I compressed the two files. Here is what I got:

1012930723 wiki
2025861446 wikispace
314377664 wiki.gz
385264415 wikispace.gz
385264415/314377664 approx 1.225

The two files contain the same information, but gzip's model does not handle this type of encoding well.

In this example we know what the generating process of the data was. In the case of Chinese and English we don't. So I think that until there is a more persuasive argument we should stick with the null hypothesis: the two texts of a Chinese-English bitext are equally informative, but the processes transforming the information into text are different in that the output of one can be more efficiently compressed by gzip than the other. I don't see how we can conclude anything about deep cultural differences.

Note that a word-based language model also would produce very different numbers for the two files.

Does this make sense or is there a flaw in this argument?
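For readers who want to try a scaled-down version of this experiment, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Hinrich's actual procedure. The sample text is a small synthetic stand-in rather than the Wikipedia dump, and the names make_sample, add_spaces, and gzip_size are hypothetical helpers invented for this example. The numbers it prints will differ from the ones above.

```python
import gzip
import random


def make_sample(n_words=50_000, seed=0):
    """Build a reproducible stand-in text (an assumption: NOT the Wikipedia dump)."""
    vocab = ["the", "of", "language", "text", "compress", "model",
             "information", "Chinese", "English", "file", "space", "and"]
    rng = random.Random(seed)
    return " ".join(rng.choice(vocab) for _ in range(n_words))


def add_spaces(text):
    """Insert a space after every character, as in the 'wikispace' file."""
    return "".join(ch + " " for ch in text)


def gzip_size(text):
    """Return the size in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


wiki = make_sample()
wikispace = add_spaces(wiki)

# Same layout as the listing in the email: raw sizes, then compressed sizes.
print(len(wiki), "wiki")
print(len(wikispace), "wikispace")
print(gzip_size(wiki), "wiki.gz")
print(gzip_size(wikispace), "wikispace.gz")
print("ratio approx", round(gzip_size(wikispace) / gzip_size(wiki), 3))
```

Both strings carry the same information, since dropping every second character of wikispace gives back wiki exactly. Any difference between the two compressed sizes therefore reflects the compressor's model, not the content.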



Recent WTF reactions: some remarks

Last week, I posted a couple of example sentences that had given me pause:

  1. I'll never forget how he must have felt. (overheard)
  2. Aren’t you glad you archived instead of deleted? (over-read)

I promised I'd get back to these, so here I am.



Everyone knows each other

"Everyone knows each other", said someone on BBC Radio 4 this morning, speaking about some tight-knit community. And instantly I saw that this was the key to a definitive argument against the logic of the opponents of singular they. I wonder if I can make you see how awesomely beautiful the insight is.

The -s suffix on the present-tense verb knows tells us that the subject is morphosyntactically singular. That is, it counts as singular for purposes of subject-verb agreement. But each other, famously, requires a semantically plural subject. That is why They know each other is grammatical and *He knows each other is not. From this and nothing else it follows that semantic plurality and morphosyntactic singularity are compatible in English. No prescriptivist has suggested that there is something grammatically wrong with Everyone knows each other. But because of that, the logical objection to singular they just collapses. Everyone knows themselves has no grammatically relevant property that isn't already instantiated by Everyone knows each other.



Getter better

Yesterday on ADS-L, Doug Harris noted a surprise (boldfaced below) in a piece by TVNewser columnist Gail Shister:

With "CBS Evening News" on life support, Katie Couric should walk away.

Now.

So says Emily Rooney, former executive producer of "ABC World News Tonight," among others.

"She should do it sooner than later. I'd do it now," says Rooney, media critic for Boston's WGBH. "What's she waiting for? Will it getter better after the election? After the inauguration? Of course not."

(I'll post on "sooner than later" on another occasion.)

Was this just an inadvertent slip, with the -er of the comparative better anticipated on the preceding verb get (perhaps facilitated by the rhyme of get and bet-)? Almost surely not; Harris got 21,300 raw webhits for {"getter better"}, and even granting that there are many duplicates and that some might be slips, there are still many examples remaining that look like people are saying and writing just what they intend to. It looks like a new idiom — new to me and possibly to the usage literature, and possibly recent.



Mayo in the ano

[Update 4/28/2008: Let me spoil the fun by pointing out that this post was supposed to be a joke. Apologies, for being excessively indirect again, to the half-a-dozen commenters who have earnestly informed me that English-language puzzles limit themselves to our standard 26 letters. I was just trying to underline, jocularly, Roger Shuy's jocular point that analogous limiting conventions in texting will probably not destroy … Oh, never mind.]

In a recent post, Roger Shuy warned us about the threat to civilization posed by the New York Times crossword puzzle:

Correct answers to the Times puzzles require no apostrophes to mark the important distinction between “its” from “it’s” or even to indicate possessive nouns. No correctly hyphenated words are permitted. And even though you know better, have you ever been able to use a comma, colon, semicolon, quotation mark, virgule, or question mark in a New York Times crossword puzzle? No, you haven’t! Not even periods after abbreviations. No spaces between words in phrases. No dashes in front of suffixes. How’s that for creeping whateverism?

As Roger observed, it's striking that those who urge action against the barbarian hordes of txters are unconcerned about the fifth column of crossworders in our midst. But Joe Gordon is sounding the tocsin. A long-time Language Log correspondent, Joe has sent me a series of notes on this subject, focusing especially on the New York Times crossword for Thursday, April 24, in which the clue to 28 across was "Mayo can be found in it", and the required answer was A-N-O. As Joe explains:

This is wrong … N is not the same letter as the one that appears in the word Year translated into Spanish. It is a different letter. I swear. Look it up.

Given the error, the clue reads, translated, "Mayo can be found in it", answer, "Anus".



Smart mistakes

Students of speech errors have long observed that they provide insight into the way language is organized mentally; the inadvertent slips that people make show that they know (tacitly) enormous amounts of stuff about their language. So do mistakes of another sort, in which people produce what they intend to, but this diverges in some way from what they are expected to produce in some community or context: persistent misspellings (not typos) like loose for lose, for example (discussed here). Many of these mistakes are "smart mistakes", which show that those who produce them know a lot about the standard system; at the same time, they are "mistakes of ignorance", meaning ignorance of the complete standard system — but actually ignorance of just one or two relevant details.



Closer, my ex, of you

The most recent xkcd:

For me, at least, it should be "closer to", not "closer of". This isn't a necessary truth: "I can't get within 500 yards of you" would be perfectly fine; and in French, for example, the preposition used with "plus proche" is de, not à:

Tout ce qui est plus proche que 3 mètres ou plus éloigné que 5 mètres de vos yeux est flou.
C'est une saveur qui est plus proche du thym que de l'anis.

But in English, it seems to me, close and most of its synonyms (with or without -er and -est) should take to. Walt Whitman wrote "Come closer to me", not "Come closer of me". The old song is "Nearer, my God, to thee", not "Nearer, my God, of thee". There are more recent songs "Closer to you" and "Close to you", as well as "Closer to me" and "Close to me". But the only pop resonance for "closer of you/me" seems to be a non-native-speaker's translation of "Un poco cerca de mi".

It's possible that Randall Munroe originally wrote, or at least thought, "I can't get within 500 yards of you", and then changed "within 500 yards" to "closer than 500 yards" without changing the preposition.

But it's also possible that this is one of the many points on which different speakers of English have different ideas.



320 kg or 4 people

Hesitant though I am to take on still more uses of or (most recently discussed here) — this could quickly become an endless chain of fascinating data — here's one that at first is puzzling, until you figure out what people are trying to do with it. It came from Benjamin Massot, a French linguist currently living in Germany, who noticed signs like the following in lifts (or, as we say in American English, elevators):

French: capacité: 320kg ou 4 personnes

German: Tragfähigkeit: 320kg oder 4 Personen

In English, 320 kg or 4 persons. Massot tried at first to figure out who (the lift company or the users of the elevator) would be declared responsible in case of an accident, for various combinations of weights, numbers of people, and interpretations of or, but eventually concluded (correctly, I think) that in this context a limit of "320 kg or 4 persons" just meant 320 kg.
