The Cambridge Grammar of the English Language, Chapter 3, The Verb, by Rodney Huddleston, covers "The preterite forms could, might, would, should" in section 9.8, pp. 198-302. The section starts this way:
We have distinguished three uses of the preterite: past time, backshift, and modal remoteness. It is a distinctive property of the modal auxiliaries that the modal remoteness use is much more frequent and less restricted than the past time use — the complete reverse of what holds for other verbs.
There's a recently-fashionable construction, in which "would be" is used where plain "is" might have been expected. For example, in the imaginary Q&A below, I might respond with B2 rather than B1:
A: I'm looking for Mark Liberman.
B1: That's me.
B2: That would be me.
A couple of weeks ago, our comments section featured a lively discussion of this phenomenon. (As far as I know, there isn't any commonly used term for it, so pending a better idea, I'll call it the TWBM construction, for "That Would Be Me".) Opinions differed, as they often do in discussions of matters linguistic, about where to draw the boundaries of the phenomenon, as well as about its meaning, origins, circumstances of use, and so on. In particular, Bloix suggested that "The point of the 'would be' construction is that it implies doubt on the part of the speaker", while I expressed skepticism about the relevance of doubt to the meaning of this construction.
I can find no better description of Amazon's Mechanical Turk than in the "description" tag at the site itself:
The online market place for work. We give businesses and developers access to an on-demand scalable workforce. Workers can work at home and make money by choosing from thousands of tasks and jobs.
This is followed by a "keywords" meta tag:
make money, make money at home, make money from home, make money on the internet, make extra money, make money …
This makes the site sound a bit like the next stop on Dave Chappelle's tour of his imagined Internet as physical place, and indeed it does have its seamy side. But I come to defend Mechanical Turk as a useful tool for linguistic research: a quick and inexpensive way to gather data and conduct simple experiments.
One aspect of the relationship between meaning and interaction is explored here by taking the English particle actually, which is characterized by flexibility of syntactic position, and investigating its use in a range of interactional contexts. Syntactic alternatives in the form of clause-initial or clause-final placement are found to be selected by reference to interactional exigencies. The temporally situated, contingent accomplishment of utterances in turns and their component turn-constructional units shows the emergence of meaning across a conversational sequence; it reveals syntactic flexibility as both a resource to be exploited for interactional ends and a constraint on that interaction.
She cites a detailed subdivision of possible positions, from Karin Aijmer's 1986 paper "Why is actually so popular in spoken English?" (Tottie and Backlund, eds., English in speech and writing):
The opening of John McPhee's article on fact-checking in the current New Yorker (Checkpoints, Feb 9 & 16, 2009) suggests that checking the facts means checking each word for its factuality. Quoting a legendary fact-checker there, he writes:
Each word in the piece that has even a shred of fact clinging to it is scrutinized, and, if passed, given the checker's imprimatur, which consists of a tiny pencil tick.
This is revealed later on to be a metaphor and/or a record-keeping device; I think all involved know that literally checking at the word-level would be mostly pretty vacuous, and would miss a lot of assertions. My favorite non-word-level anecdote in the article:
Penn's daughter Margaret fished in the Delaware, and wrote home to a brother asking him to "buy for me a four joynted, strong fishing Rod and Real with strong good Lines …"
The problem was not with the rod or the real but with William Penn's offspring. Should there be commas around Margaret or no commas around Margaret? The presence or absence of commas would, in effect, say whether Penn had one daughter or more than one. The commas—there or missing there—were not just commas; they were facts, neither more nor less factual than the kegs of Bud or the colors of Santa's suit.
Victor Mair recently told me about someone who began a letter to him in a way that struck him as odd:
"I actually have a pseudo linguistics question for you about the title of the Manchu emperor."
Victor was surprised by this use of actually. He added:
The next day, my sister from Seattle, who was visiting us in Swarthmore after attending the inauguration in DC, happened to complain about this very usage of "actually" among our nieces and nephews and their friends (when they are expressing an opinion).
Just last week I reported on a couple of accounts describing Barack Obama's conversational skills in Indonesian, a language he learned living in Indonesia from age six to ten. In both of the accounts, Obama was said to handle conventional Indonesian greeting routines with aplomb. Now thanks to ABC News we have the video evidence, from an exchange that President Obama had with State Department staffer Charles Silver on Thursday as the president worked the State Department rope-line. Silver has been stationed in Jakarta at various times since 1969 and now works in the State Department's Office of Inspector General.
Back in July, Bill Poser noted that "Barack Obama is reported to speak Indonesian as result of the four years, from age six to age ten, that he spent in Indonesia." Bill asked for any evidence about Obama's competence in Indonesian. Since then, we've gotten some anecdotal reports about Obama's Indonesian (including from the President of Indonesia!), but we still don't know if his language skills rise above the basic conversational level.
In their new book Sense and Sensitivity, Brady Clark and LL's own David Beaver identify and discuss a class of intensives. The items they name are (most) importantly, significantly, especially, really, truly, fucking, damn, well, and totally. Here's one of their examples:
MTV like totally gave us TWO episodes back to back. It was like so random. The more the merrier, but it's like waay too much for one recap.
I'm intrigued by the classification and independently interested in some of the words and phrases involved, so I went looking in a large weblog corpus I recently collected, to see if I could gain some new insights into where and why people use these things. This post describes a first experiment along these lines.
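For readers who want to try something similar on their own text, here is a minimal sketch of the simplest possible version of such a search: counting how often each of the listed items occurs in a collection of posts. It is not the method used for the experiment described here. The function name `count_intensives`, the crude tokenizer, and the sample input are assumptions made for illustration, and the sketch makes no attempt to separate intensive uses of words like "well" and "damn" from their other uses.

```python
import re
from collections import Counter

# The single-word items named in the text. The optional "(most)" before
# "importantly" is ignored here; that simplification is an assumption.
INTENSIVES = {
    "importantly", "significantly", "especially", "really", "truly",
    "fucking", "damn", "well", "totally",
}

def count_intensives(texts):
    """Return (counts, total_tokens) for an iterable of text strings.

    Tokenization is deliberately crude: lowercase the text and keep runs
    of letters and apostrophes. Every occurrence of a listed word is
    counted, whether or not it is functioning as an intensive.
    """
    counts = Counter()
    total_tokens = 0
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        total_tokens += len(tokens)
        counts.update(tok for tok in tokens if tok in INTENSIVES)
    return counts, total_tokens

# Hypothetical usage with one sentence from the example quoted above.
sample = ["MTV like totally gave us TWO episodes back to back."]
counts, total_tokens = count_intensives(sample)
for word, n in counts.most_common():
    print(word, n, f"{1e6 * n / total_tokens:.0f} per million tokens")
```

Raw counts like these are only a starting point. Comparing rates across authors or contexts requires normalizing by corpus size, as the last line does, and a serious study would need some way of filtering out the non-intensive uses.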
Verb phrase ellipsis in English normally requires an overt linguistic antecedent of approximately the right morphological form. That is, I can't normally begin my conversation with "He did!", but this is perfectly normal after "Sam said he would win, and …". There are exceptions, of course (Geoff Pullum's Hankamer Was! is lively and informative on this topic). Obama's campaign slogans "Yes, we can" and "Together, we can" were prominent exceptions. Lacking antecedents themselves, they invited inferred antecedents or allowed Obama to fill in occasion-appropriate ones. The first time I noticed headline writers playing with the slogan was November 5, 2008:
Using Google News, I gathered a bunch more, based on can, can't, and do.
Swearing is risky behavior. Many of its implications are out of the speaker's control. Thus, it is advisable to know your audience well before, say, dropping the F-bomb. I think this is basically true in any setting, and I expect it to be even more powerfully felt in situations where swearing is highly transgressive.
The Enron email dataset provides a nice chance to test out these claims. It is large (about 250,000 distinct messages, sent and received by over 11,000 distinct email addresses), and it contains a moderate amount of bad language. Not everyone swears, but a fair number of people do. The topics range widely: fantasy football, faith, energy markets, vacation time (and of course bankruptcy and the FERC). So, with some qualifications that I'll get to, it is a useful testing ground for claims about swearing and risky verbal behavior. The following email network graph is my first stab at conducting such a test:
In this earlier post, I was critical of the FCC's claim that the F-word "inherently has a sexual connotation" no matter what the context. (The Supreme Court took up this question yesterday.) However, my post doesn't offer any suggestions for how to get a clear look at what the F-word does contribute to a discourse. Though I don't have results for the F-word in particular, I do have results for more mildly-taboo items, including English damn and the Chinese intensive tama(de). (I'm hoping that this follow-up post allays any fears Geoff Pullum might have that I now see language as a big bag of words…)