Having had a quick look at the study, I'm not sure what 'allostatic load' really measures, and I'm a little surprised no-one else has mentioned it. It gives binary outcomes for the variables, and at the very least the blood pressure/BMI are going to be pretty highly correlated. I don't have access to the articles referenced, but they all seem to have the same person as 1st/last author, which increases my scepticism. Any thoughts?

]]>Moreover, they tend to work on very short deadlines and to be subject to editors whose departments are so expansive that they can't possibly have any kind of expertise in more than a tiny fraction of what crosses their desks. It's really surprising that the magazine (do they still call themselves a "newspaper"?) is as good as it is.

]]>Chart title: Who wins on Memory Task?

Subtitle: (pairs chosen at random)

Distribution labels: Rich girl wins/Poor girl wins

]]>Projecting outcomes with only one factor is not a good teaching strategy.

]]>Excellent point. (See this article to learn more about Tufte.) myl's graphical illustration of the two distributions is fine as far as it goes; but I can't help feeling that another chart — which I haven't got the software to produce just now, but which should be pretty simple for him to generate — would make the point more clearly. Underlying his 64/36 split between randomly-selected middle- & low-SES girls' performance on a working memory task is the basic statistical result that, for independent random variables, the variance of a difference is the sum of the variances. Without this knowledge, there's no simple way to jump from the 2 overlapping distributions to the answer to his original question.

The Tufte approach would, I suggest, be to show a single distribution centred on 0.5 & with standard deviation √2 (both measured in units of each group's standard deviation), entitled *Differences in Memory Task Scores (randomly-selected pairs)*. The larger part of this distribution (to the right of the y-axis, where the difference is positive) could be labelled "Middle-SES girl outscores low-SES girl"; & conversely the smaller part to the left (where the difference is negative) could be labelled "Low-SES girl outscores middle-SES girl". I'm sure that with a little thought the labels could be improved to make the point more succinctly.
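For anyone who wants to check the arithmetic behind that chart, here is a minimal sketch. It assumes (my assumption, not something stated in the study) that both groups' scores are normally distributed with the same standard deviation of 1, and that the means differ by 0.5; the function names are made up for illustration. The difference of two such scores then has mean 0.5 and standard deviation √2, and the share of that distribution lying above zero is the chance that the middle-SES girl wins.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_first_outscores(mean_gap, sd_a=1.0, sd_b=1.0):
    """P(A > B) for independent normal scores whose means differ by mean_gap.

    For independent variables the variance of the difference is the sum
    of the variances, so A - B has standard deviation sqrt(sd_a^2 + sd_b^2).
    """
    sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2)
    return normal_cdf(mean_gap / sd_diff)


def simulate(mean_gap, n_pairs=200_000, seed=0):
    """Monte Carlo check: draw random pairs and count how often A beats B."""
    rng = random.Random(seed)
    wins = sum(
        rng.gauss(mean_gap, 1.0) > rng.gauss(0.0, 1.0)
        for _ in range(n_pairs)
    )
    return wins / n_pairs


# Assumed setup: unit standard deviations, means half a standard deviation apart.
analytic = prob_first_outscores(0.5)
simulated = simulate(0.5)
print(f"analytic:  {analytic:.3f}")
print(f"simulated: {simulated:.3f}")
```

Under those assumptions the analytic figure comes out at roughly 0.64, which matches the 64/36 split; the simulation is only there as a sanity check on the variance-of-a-difference step.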

For example, look at this comment from Carl Zimmer, a NYT contributor:

"2) Continue to write about this research in with our essentialist fallacies, content in the knowledge that nobody–not even a linguist–can do better."

I don't even know what that means, other than that the jargon term "essentialist fallacies" shouts "Please don't Watson me! I'm a true believer."

[(myl) If you don't know what Carl meant, you must not have read the post that he's commenting on. So let me explain it again.

We start from the observation that people often use generic expressions of the form "Xs are P-er than Ys" to refer to a situation where the mean value for P among Xs is somewhat higher than the mean value for P among Ys, such that a randomly selected X has (say) a 51% (or 53% or 64% or 79%) chance of being higher in P than a randomly selected Y. The "essentialist fallacy" is to take the generic expression to describe a fact about the essence of Xs as a group and Ys as a group, which leads a reader to the natural (but by hypothesis invalid) inference that (almost) any X is more P than (almost) any Y. When such relationships are used in public policy decisions (or in medical decisions or any other situation where we want to make inferences from group membership to individual characteristics), it's really important to know whether we're talking about a 51% chance of X being bigger, or an 80% chance, or a 99% chance. But the usual language of science reporting doesn't give us a clue about this.

Some people have philosophical objections to making decisions about individuals based on group membership. But the question of whether such objections are appropriate or not is not really relevant here; and James Watson's famous remarks about race and intelligence are doubly or triply irrelevant, since that controversy involved many questions of fact about his specific claims, and many other moral and political questions about his specific suggestions.

The problem here is that the statistical illiteracy of the public, combined with the lack of convenient ordinary-language expressions for comparing distributions between groups, makes things difficult for science writers. The way I put it is that you can have any two out of the three goals of clarity, brevity, and accuracy. And this applies whether you're talking about the effects of childhood stress on adult short-term memory, as in the case discussed in this post, or the effects of fetal testosterone on adult ability to read faces, or whatever. It's got nothing to do with political correctness. ]

]]>