This morning's BBC News Hour program featured one of the most densely nonsensical three-minute sequences that I can recall having heard from a respectable media outlet:
The most preposterous stuff, of course, comes from Claire Bolderson, the BBC interviewer; but the responsible scientist, Mark Pagel, doesn't come off very well in this exchange either, in my opinion. (Though I recognize that such interviews are highly edited, and may seriously misrepresent the way that the interviewee would prefer to express the ideas involved.)
Part of the scientific background is Mark Pagel, Quentin D. Atkinson, and Andrew Meade, "Frequency of word-use predicts rates of lexical evolution throughout Indo-European history", Nature 2007, which was a sensible-enough paper.
There must be a new research report behind the recent blizzard of related stories, but none of those that I've seen so far tells us where to find it. I think that I can guess what Pagel et al. probably did, and how it relates to what's being said about it in the mass media, but I'll reserve further comment on the scientific background until I've had a chance to read the research in a form unfiltered by journalism. [Update 2/27/2009: actually, I'm now leaning to the conclusion that the 2007 Nature paper is all that there is behind this. ]
The print version at BBC News is here: "'Oldest English words' identified", 2/28/2009.
Mark Henderson, "A handy little guide to small talk in the Stone Age", The Times, leads with the value to cross-era communication:
A “time traveller’s phrasebook” that could allow basic communication between modern English speakers and Stone Age cavemen is being compiled by scientists studying the evolution of language.
Ian Sample at The Guardian ("Word facing extinction: 'Dirty' will be scrubbed from the English dictionary") is worried about dirty:
The unrelenting force of evolution is about to take an unexpected toll on the English language by forcing some of our favourite words into extinction. The word "dirty" is most in danger of going the way of the dodo, and could vanish from use completely within 750 years, researchers said.
Robert Roy Britt at LiveScience ("Oldest English Words Revealed?") features the impact on Scrabble:
A game of Scrabble might not have been all that different in Stone Age times.
Using a computer simulation, a British researcher says he's examined the rate of change of words in languages to reveal the oldest English-sounding words, which would have been used by Stone Age humans 20,000 years ago.
Niall Firth in the Daily Mail explains it all as follows:
Some of the oldest words in the English language date back more than 20,000 years, it has been revealed.
Words like 'I', 'we', 'two', 'three' and 'five' were probably used by our ancestors in the Stone Age – and have changed very little since then.
All of these cute conceits, not to say idiocies, were prominent in the brief News Hour piece, which thus managed an extraordinary compression of nonsense. I wouldn't object if ideas like small talk with cavemen and stone-age Scrabble words were presented as little jokes, in the context of an explanation of why (for example) the effects of sound change mean that conserved cognates (like the various IE versions of five) are unlikely to be much more helpful to communication than unrelated words are, over a span of even a few thousand years, much less ten thousand or twenty thousand or (lord help us) forty thousand. But that's not what we have here, unfortunately.
[Update: As several commenters have observed, the source of the flurry in the news media was a University of Reading press release, "Scientists discover oldest words in the English language and predict which ones are likely to disappear in the future", 2/26/2009. There is apparently no associated research publication, at least so far.
The press release is rather misleading in several ways, most strikingly by talking about retention of cognates over time without mentioning the sound changes that may make them unrecognizable to ordinary speakers after a few hundred years, and will almost certainly do so after a few thousand years, much less 10,000 or 30,000 years. And the press release also makes it seem as if the scientists have "been able to go back almost 30,000 years" in reconstructing Indo-European, which is almost certainly not what they are claiming. In the absence of a coherent explanation it's hard to tell what they did do — but my current hypothesis is that they used the method documented here to estimate rates of replacement for various words in something like the Swadesh list for Indo-European languages, and then applied standard methods to those estimates in order to calculate how probable it is that each word might survive (in however mutated a cognate form) for N years. [On further reflection, I think that this is probably just a re-presentation of the work in the 2007 Nature paper, which uses Swadesh-list-type data to estimate the half-life of various words, allowing guesses about how likely it is that a particular word might have been retained -- in the sense of having a modern cognate -- from time depths many millennia before there is any documentation or even any reconstruction.]
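To make the half-life idea concrete, here is a minimal sketch of the arithmetic, assuming that lexical replacement behaves like a constant-rate (exponential) process. That assumption, the function name, and the two half-life values are mine, chosen for illustration; they are not figures or methods taken from Pagel et al.

```python
def retention_probability(half_life_years: float, elapsed_years: float) -> float:
    """Chance that a word still has a cognate descendant after elapsed_years.

    Assumes a constant replacement rate, so the chance of survival
    halves with every half-life that passes.
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    if elapsed_years < 0:
        raise ValueError("elapsed_years must be non-negative")
    return 0.5 ** (elapsed_years / half_life_years)


# Hypothetical half-lives, for illustration only (not measured values).
illustrative_half_lives = {
    "rapidly replaced word": 1_000.0,
    "slowly replaced word": 10_000.0,
}

for label, half_life in illustrative_half_lives.items():
    for elapsed in (1_000, 10_000, 20_000):
        p = retention_probability(half_life, elapsed)
        print(f"{label}: P(cognate retained after {elapsed} years) = {p:.4f}")
```

Note what the number does and does not mean: it is the estimated chance that some cognate form survives, however mutated by sound change, not the chance that a modern speaker would recognize it.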
But there's nothing in the press release, as misleading as it may be, about playing Scrabble with cavemen.
In the context of making fun of the sillier journalistic excesses, and of Mark Pagel's apparent lack of prudence, or even collusion, in encouraging them, I'd like to underline the fact that the methods (of computational phylogeny in general, and Bayesian approaches in particular) are worthwhile and interesting, and have in other cases (e.g. here) been applied in a responsible way to advance our understanding of linguistic history. ]