I've occasionally complained that when it comes to comparing sampled distributions, modern western intellectuals are mostly just as clueless as the members of the Pirahã tribe in the Amazon are said to be with respect to counting (see e.g. "The Pirahã and us", 10/6/2007). And it doesn't take high-falutin concepts like "variance" or "effect size" to engage this incapacity — simple percentages are often enough.
I discussed one example a few days ago: the coverage of news about a new blood test for Alzheimer's ("(Mis-) Interpreting medical tests", 3/10/2014). In that post, I cited an article by John Gever ("Is Blood Test for Alzheimer's Disease Oversold?", MedPage Today 3/10/2014). Today, Gever has a follow-up article on the question of how to evaluate medical tests:
Last week's much-publicized study of a blood test purported to identify healthy elderly people who would develop clear cognitive impairments highlighted the uncertain grasp that most journalists, and even some healthcare providers, have on measures of diagnostic test accuracy.
In that study, you may recall, researchers from Georgetown University found that levels of 10 substances in blood differed in cognitively normal older individuals who developed mild amnestic impairment or Alzheimer's disease within 3 years, compared with a similar group of people whose cognition remained intact. The researchers reported that the 10-marker panel had both specificity and sensitivity of 90% for distinguishing the two groups.
But they left out an important statistic for judging the usefulness of such a test, as it would be applied in the clinic — the positive predictive value (PPV) or the accuracy of positive results seen in the target population (in this case, cognitively healthy seniors).
Contrary to what I later learned is popular belief, calculating a PPV is easy, requiring nothing more than fourth grade arithmetic.
And the positive predictive value in this case is 32%, based on the numbers available in the original blood-test report. As Gever observes, this is
terrible, given that there is currently no "gold standard" test that can confirm an individual patient's positive result, and there is also no treatment currently or likely in the near future. At this point, a positive result serves only to put the patient and his or her family on the alert for problems, which, for an older person, they probably already are.
It's true that the calculations required to derive the PPV involve nothing beyond adding, subtracting, multiplying, and dividing easily-available modest-sized numbers. But in order to know what fourth-grade arithmetic to do, you also need to understand the concept of a "contingency table" or "confusion matrix" (though you don't need to know those terms) — and maybe kids aren't ready for that until the sixth grade.
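To make the contingency-table arithmetic concrete, here is a minimal sketch in Python. The 90% sensitivity and 90% specificity are the figures the researchers reported; the base rate of roughly 5% (the fraction of cognitively healthy seniors who convert within 3 years) is my assumption, chosen to match the numbers Gever worked from, not a statistic quoted above.

```python
def positive_predictive_value(sensitivity, specificity, prevalence, population=1000):
    """Fill in the four cells of a contingency table and read off the PPV.

    The population size is arbitrary; it cancels out of the final ratio.
    """
    converters = population * prevalence            # people who will develop impairment
    non_converters = population - converters        # people who will stay intact
    true_positives = converters * sensitivity       # converters the test correctly flags
    false_positives = non_converters * (1 - specificity)  # intact people wrongly flagged
    return true_positives / (true_positives + false_positives)

# Reported test performance, with an ASSUMED ~5% base rate of conversion:
ppv = positive_predictive_value(0.90, 0.90, 0.05)
print(round(ppv * 100))  # prints 32
```

With a notional 1,000 seniors, the table has 45 true positives (90% of 50 converters) against 95 false positives (10% of 950 non-converters), so 45/140 ≈ 32% — roughly two of every three positive results would be false alarms.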
I don't recall ever having encountered this concept in my own K-12 education, and I don't think I saw it in my children's homework assignments either. Given the lack of similar calculations in the many other stories about that widely-covered blood test research, it seems that few journalists (even those who cover relevant technical areas) have learned to deploy fourth-grade arithmetic in this way.
And as a result, the average science writer is apparently just as clueless about the concept of "accuracy" as the average Pirahã villager is about the concept of "seven".