I like journalists, really I do. But sometimes they make it hard for me to maintain my positive attitude. The recent flurry of U.K. media uptake of Language Log posts on UM and UH provides some examples of this stress and strain.
By chance, I came across this most revealing section of a perceptive book by Linda Jakobson entitled A Million Truths: A Decade in China (pp. 175-177), which shows how people from different parts of China often don't really understand each other. That includes people who are supposedly speaking different varieties of Mandarin. I know this from my own experience travelling across the length and breadth of China. For instance, see the second paragraph of this post, "Mutual Intelligibility of Sinitic Languages", also the next to the last paragraph of this article, "English and Mandarin juxtaposed", where I describe a climb up Emei Mountain in Sichuan, during which incomprehension among "Mandarin" speakers was a demonstrable, inescapable fact. Seldom, however, do people write about this semi-taboo topic so clearly as Linda Jakobson.
We start with a psycholinguistic controversy. On one side, there's Herbert Clark and Jean Fox Tree, "Using uh and um in spontaneous speaking", Cognition 2002:
The proposal examined here is that speakers use uh and um to announce that they are initiating what they expect to be a minor (uh), or major (um), delay in speaking. Speakers can use these announcements in turn to implicate, for example, that they are searching for a word, are deciding what to say next, want to keep the floor, or want to cede the floor. Evidence for the proposal comes from several large corpora of spontaneous speech. The evidence shows that speakers monitor their speech plans for upcoming delays worthy of comment. When they discover such a delay, they formulate where and how to suspend speaking, which item to produce (uh or um), whether to attach it as a clitic onto the previous word (as in “and-uh”), and whether to prolong it. The argument is that uh and um are conventional English words, and speakers plan for, formulate, and produce them just as they would any word.
And on the other side, there's Daniel C. O'Connell and Sabine Kowal, "Uh and Um Revisited: Are They Interjections for Signaling Delay?", Journal of Psycholinguistic Research 2005:
Clark and Fox Tree (2002) have presented empirical evidence, based primarily on the London–Lund corpus (LL; Svartvik & Quirk, 1980), that the fillers uh and um are conventional English words that signal a speaker’s intention to initiate a minor and a major delay, respectively. We present here empirical analyses of uh and um and of silent pauses (delays) immediately following them in six media interviews of Hillary Clinton. Our evidence indicates that uh and um cannot serve as signals of upcoming delay, let alone signal it differentially: In most cases, both uh and um were not followed by a silent pause, that is, there was no delay at all; the silent pauses that did occur after um were too short to be counted as major delays; finally, the distributions of durations of silent pauses after uh and um were almost entirely overlapping and could therefore not have served as reliable predictors for a listener. The discrepancies between Clark and Fox Tree’s findings and ours are largely a consequence of the fact that their LL analyses reflect the perceptions of professional coders, whereas our data were analyzed by means of acoustic measurements with the PRAAT software (www.praat.org). [...] Clark and Fox Tree’s analyses were embedded within a theory of ideal delivery that we find inappropriate for the explication of these phenomena.
This is an old video of Jiang Zemin berating a female reporter and defending the right of the central government in Beijing to handpick the Chief Executive of Hong Kong, in this case the first, Tung Chee-hwa. The video, which is an amazing display of Jiang's verbal pyrotechnics, is getting a lot of circulation these days, for obvious reasons. Here it is as recently posted by Shanghaiist on Facebook.
A few months ago, I posted here (and on Slate's Lexicon Valley blog) about PangramTweets, a bot created by Jesse Sheidlower that combs Twitter for tweets that include all 26 letters of the alphabet. I mentioned that it would be interesting to see if PangramTweets turns up any particularly short "pangrammatic windows," i.e., pangrammatic strings in naturally occurring text. At the time, the shortest known example was 42 letters long, in a passage from Piers Anthony's Cube Route:
"We are all from Xanth," Cube said quickly. "Just visiting Phaze. We just want to find the dragon."
My post inspired Malcolm Rowe, a software engineer at Google, to set about finding short pangrammatic windows in an automated fashion, first on the Project Gutenberg corpus and then on the megacorpus of web pages indexed by Google. (Let's hear it for Google's 20 percent time!) On his blog, Malcolm now reports on his findings, including the discovery of a 36-letter pangrammatic window that appeared in a review of the movie Magnolia on PopMatters:
Further, fractal geometries are replicated on a human level in the production of certain “types” of subjectivity: for example, aging kid quiz show whiz Donnie Smith (William H. Macy) and up and coming kid quiz show whiz Stanley Spector (Jeremy Blackman) are connected (or, perhaps, being cloned) in ways they couldn’t possibly imagine.
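For readers who want to hunt for pangrammatic windows themselves, here is a minimal sketch of one way to do it. It is not Malcolm's code and makes no claim about how he searched Gutenberg or the web; it is an ordinary sliding-window search, and the function name `shortest_pangrammatic_window` is my own invention. It assumes that only the ASCII letters a-z count toward window length, ignoring case, spaces, and punctuation.

```python
from collections import Counter


def shortest_pangrammatic_window(text):
    """Return (letter_count, passage) for the shortest window in `text`
    containing all 26 letters, or None if there is no such window.

    Assumption: length is measured in ASCII letters only; spaces,
    digits, and punctuation inside the window are not counted.
    """
    # Keep each letter with its position in the original text so the
    # passage can be recovered with its punctuation intact.
    letters = [(i, ch.lower()) for i, ch in enumerate(text)
               if ch.isascii() and ch.isalpha()]

    counts = Counter()   # occurrences of each letter in the current window
    distinct = 0         # how many different letters the window holds
    best = None          # (letter_count, start_index, end_index)
    left = 0

    for right, (_, ch) in enumerate(letters):
        counts[ch] += 1
        if counts[ch] == 1:
            distinct += 1

        # While the window is pangrammatic, record it and shrink from the left.
        while distinct == 26:
            length = right - left + 1
            if best is None or length < best[0]:
                best = (length, letters[left][0], letters[right][0])
            dropped = letters[left][1]
            counts[dropped] -= 1
            if counts[dropped] == 0:
                distinct -= 1
            left += 1

    if best is None:
        return None
    length, start, end = best
    return length, text[start:end + 1]


if __name__ == "__main__":
    print(shortest_pangrammatic_window(
        "The quick brown fox jumps over the lazy dog"))
```

Each letter enters and leaves the window at most once, so the search is linear in the length of the text, which is what would make this kind of approach plausible on a large corpus.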
Mouseover title: "If you want to have more fun at the expense of language pedants, try developing an hypercorrection habit."
That should be "…developing another hypercorrection habit", since making data plural in that situation is exactly analogous to using whom in "Whom are you, anyways?". But then, as Ben Zimmer has pointed out to me, that would spoil the joke involved in the choice of an in "an hypercorrection".
Far from prohibiting translation (see the last item here), the young demonstrators in Hong Kong are offering free translation services for the media and others who may be in need of them.
The following photograph was shared on Twitter by Newsweek's Lauren Walker:
From David Donnell:
"Not for nothin'," as the native NY'ers say, but I saw this commercial on the idiot-box tonight and was tickled by the play on words. Surprised to google and discover "half-fast" has been around for some time. But the TV ad still makes me laugh!
From "Signspotting around the world: Funny fails", a "Lonely Planet travel signs" feature of CNN Travel, I have selected an ensemble of four signs to illustrate different types of translation difficulties.
The first was spotted in a Beijing cafe:
Josef Fruehwald, "America's Ugliest Accent: Something's ugly alright", Val Systems 10/1/2014.
Update — See "The beauty of Brummie", 7/28/2004 — some quotes therein from Steve Thorne:
In May 2002, I recorded short samples of 20 different accents of English… In order to limit the influence of extraneous variables, the speakers chosen were all male, white, aged between 35 and 40, and upper-working to lower-middle class. These recordings were played to 96 native and 109 non-native English speakers who were then asked to briefly describe each accent and rate each one on a scale of 1-10 (1 = very unpleasant, 5 = neutral, 10 = very pleasant). [...]
From Dick Margulis, for the misnegation files:
This is another one of those posts that I wanted to write long ago (actually almost a year ago), but it got lost in the shuffle until now, when I found it while going through my old drafts.
It was prompted by an article that Christine Gross-Loh wrote for The Atlantic (October 8, 2013) titled "Why Are Hundreds of Harvard Students Studying Ancient Chinese Philosophy? The professor who teaches Classical Chinese Ethical and Political Theory claims, 'This course will change your life.'"
We've previously observed a surprisingly consistent pattern of age and gender effects on the relative frequency of filled pauses (or "hesitation sounds") with and without final nasals — what we usually write as "um" and "uh" in American English, or often as "er" and "erm" in British English.
Specifically, younger people use the UM form more than older people, while at any age, women use the UM form more than men do. We've seen this same pattern in various varieties of American English and in John Coleman's analysis of the spoken portion of the British National Corpus, and we found the sex effect in the HCRC Map Task Corpus, which involves task-oriented dialogues among college students from Glasgow in Scotland.
It was even more surprising that Martijn Wieling found the same pattern in a collection of Dutch conversational speech. And to make the puzzle more puzzling, Joe Fruehwald's analysis of the Philadelphia Neighborhood Corpus, which includes recordings across several decades of real time, suggests an ongoing change in the direction of greater overall UM usage, as well as a life-cycle effect within each cohort of speakers. And Jack Grieve's analysis of Twitter data indicates a pattern of geographical variation within the U.S.
For additional details, see "Young men talk like old women", 11/6/2005; "Fillers: Autism, gender, age", 7/30/2014; "More on UM and UH", 8/3/2014; "UM UH 3", 8/4/2014; "Male and female word usage", 8/7/2014; "UM / UH geography", 8/13/2014; "Educational UM / UH", 8/13/2014; "UM / UH: Lifecycle effects vs. language change", 8/15/2014; "Filled pauses in Glasgow", 8/17/2014; "ER and ERM in the spoken BNC", 8/18/2014; "Um and uh in Dutch", 9/16/2014.
Now Martijn Wieling has found the same pattern in German. His guest post follows.
Left-handed toons from 8/13/2010, "Jasper got a dog", starts like this:
In "Biomedical nerdview", I noted that the terms "sensitivity" and "specificity" seem to be hard even for biomedical researchers to remember, and also denote concepts that are deeply misleading from the perspective of patients and their physicians. I offered a "flash of insight" about why researchers chose to focus on the concepts — they're relevant to public health concerns, though not to patients — but I confessed to being baffled about the hard-to-remember choice of terminology. Bob Ladd responded by email:
While not wanting to take away anything from your flash of insight, I was wondering if you wanted to write another LL post, not about nerdview, but about inexcusably unmemorable terminology for related concepts that have to be sharply distinguished from one another.
Since Bob goes on to suggest an interesting morpho-phonological theory about why some terminological oppositions are so problematic, I got his permission to post his note.
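As a reminder of what the two hard-to-remember terms mean, here is a small sketch using their standard definitions. The counts are invented purely for illustration (a hypothetical test applied to 10,000 people, 100 of whom have the condition), and the function name `screening_metrics` is mine. Positive predictive value is included because it is arguably the number a patient with a positive result would care about.

```python
def screening_metrics(tp, fn, fp, tn):
    """Compute sensitivity, specificity, and positive predictive value
    from the four cells of a 2x2 test-versus-truth table."""
    sensitivity = tp / (tp + fn)  # of those WITH the condition, share testing positive
    specificity = tn / (tn + fp)  # of those WITHOUT it, share testing negative
    ppv = tp / (tp + fp)          # of those testing positive, share who have it
    return sensitivity, specificity, ppv


# Hypothetical counts, not taken from any study:
# 100 people with the condition, 9,900 without.
sens, spec, ppv = screening_metrics(tp=90, fn=10, fp=990, tn=8910)

if __name__ == "__main__":
    print(f"sensitivity = {sens:.3f}")
    print(f"specificity = {spec:.3f}")
    print(f"positive predictive value = {ppv:.3f}")
```

With these made-up numbers both sensitivity and specificity are 90%, yet only about 8% of the people who test positive actually have the condition, which is one way to see why the two terms can mislead from the patient's side.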