Importance of publishing data and code


J.W. writes:

In connection with some of your prior statements on the Log about the importance of publishing underlying data, you might be interested in Thomas Herndon, Michael Ash, and Robert Pollin, "Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff", PERI 4/15/2013 (explanation in lay language at "Shocking Paper Claims That Microsoft Excel Coding Error Is Behind The Reinhart-Rogoff Study On Debt", Business Insider 4/16/2013). In sum, a look at the data spreadsheet underlying a really influential 2010 economics paper reveals that its results were driven by selective data exclusions, idiosyncratic weighting, and an Excel coding error [!].


Paul Krugman notes another recent case of "death by Excel". And there have been some equally shocking and damaging cases in several other fields, including translational cancer research. From the abstract of Keith Baggerly's presentation in the symposium on "Reproducible Science" at AAAS 2011:

In this talk, we examine several related papers using array-based signatures of drug sensitivity derived from cell lines to predict patient response. Patients in clinical trials were allocated to treatment arms based on these results. However, we show in several case studies that the reported results incorporate several simple errors that could put patients at risk. One theme that emerges is that the most common errors are simple (e.g., row or column offsets); conversely, it is our experience that the most simple errors are common. We briefly discuss steps we are taking to avoid such errors in our own investigations.

In the case that Keith explores in detail, the program involved was R rather than Excel. For more detail, see Keith Baggerly and Kevin Coombes, "What Information Should Be Required to Support Clinical 'Omics' Publications?", Clinical Chemistry 2011:

A major goal of “omics” is personalizing therapy—the use of “signatures” derived from biological assays to determine who gets what treatment. Recently, Potti et al. (1) introduced a method that uses microarray profiles to better predict the cytotoxic agents to which a patient would respond. The method was extended to include other drugs, as well as combination chemotherapy (2, 3). We were asked if we could implement this approach to guide treatment at our institution; however, when we tried to reproduce the published results, we found that poor documentation hid many simple errors that undermined the approach (4). These signatures were nonetheless used to guide patient therapy in clinical trials initiated at Duke University in 2007, which we learned about in mid-2009. We then published a report that detailed numerous problems with the data (5). As chronicled in The Cancer Letter, trials were suspended (October 2, 9, and 23, 2009), restarted (January 29, 2010), resuspended (July 23, 2010), and finally terminated (November 19, 2010). The underlying reports have now been retracted; further investigations at Duke are under way. We spent approximately 1500 person-hours on this issue, mostly because we could not tell what data were used or how they were processed. Transparently available data and code would have made checking results and their validity far easier. Because transparency was absent, an understanding of the problems was delayed, trials were started on the basis of faulty data and conclusions, and patients were endangered. Such situations need to be avoided.

In my opinion, problems of this general kind are endemic in linguistics, psychology, and other fields as well — our errors are not likely to damage world economies or cancer patients, but truth and science do suffer.

And of course there's a spectrum of error, delusion, and fraud, from simple coding errors to hidden choices about data exclusion, data weighting, modeling choices, hypothesis shopping, data dredging, and so forth. The current system of peer review, although it often delays publication for two years or more, does a very bad job of detecting problems on this spectrum. A more streamlined reviewing system, with insistence on publication of all relevant data and code, and provisions for post-publication peer commentary, would be much better.
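To make the "simple coding errors" end of that spectrum concrete, here is a minimal sketch in Python. The numbers and country labels are invented for illustration and are not taken from any of the studies discussed. It shows two slips of the kind that spreadsheets make easy: an average taken over a range that silently omits some rows, and a single column sorted independently of its labels.

```python
# Hypothetical data: invented growth figures, not taken from any real study.
countries = ["A", "B", "C", "D", "E"]
growth = [2.0, 3.0, -1.0, 4.0, 2.0]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Intended calculation: average over all five rows.
full_mean = mean(growth)

# Range slip: the selection stops two rows short, as a mis-dragged
# spreadsheet range might, and nothing signals the omission.
truncated_mean = mean(growth[:3])

# Single-column sort: the values are reordered but the labels are not,
# so rows now pair a country with some other country's number.
sorted_growth = sorted(growth)
original = dict(zip(countries, growth))
corrupted = dict(zip(countries, sorted_growth))
mismatched = [c for c in countries if corrupted[c] != original[c]]

print(full_mean, truncated_mean, mismatched)
```

Neither slip raises an error or a warning, which is part of why such problems are hard to spot without access to the underlying data and code.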


  1. Jon Weinberg said,

    April 18, 2013 @ 6:40 am

    Reinhart and Rogoff (the authors of the economics paper) have now conceded the Excel error, although they defend their other choices.

  2. Michael Williams said,

    April 18, 2013 @ 7:19 am

    I attended a great talk by Cameron Neylon about just this.

    He cites in particular "Empiricism is not a matter of faith" by Ted Pedersen.

  3. bks said,

    April 18, 2013 @ 9:59 am

    A very common spreadsheet error is to sort a single column without realizing that it results in corruption of the entire set of data.


  4. Brett said,

    April 18, 2013 @ 10:13 am

    Lest readers get the wrong idea, in the second case discussed above, the problem was not merely errors in the analysis but apparently fabricated data.

    [(myl) There was certainly "research misconduct", as discussed in the Wikipedia article. But it's not clear to me that this actually involved fabrication of data, rather than some coding mistakes that led to artefactual results, with subsequent stonewalling to prevent or at least delay verification and correction of the problems. Duke did not intervene effectively in the case until it was discovered that the key researcher had fabricated or at least exaggerated an item in his C.V. ]

  5. peter said,

    April 18, 2013 @ 11:27 am

    "Death by Excel" sounds exaggerated until one realizes just how few arithmetic coding errors away we may be from major disasters:

  6. bfwebster said,

    April 18, 2013 @ 12:54 pm

    This is also a critical issue in climate science, where many of the published researchers have been loath to release the data (and in some cases, the code) behind their models and claims. As it turns out, the hesitancy often was for good reasons.

  7. Alex Blaze said,

    April 18, 2013 @ 1:12 pm

    This is a huge issue when it comes to pharmaceuticals – drug companies think their clinical trials are their intellectual property and won't release them, but without them other researchers and the public are denied information about the actual efficacy of these drugs. Dean Baker's idea is just to let the government take over clinical trials for drugs, ensuring uniform administration and public domain of the information, at the same time cutting the cost of researching a new drug by more than 50% (which should mean that patent protections become looser).

    But, yeah, R&R's response points more towards academic dishonesty than a simple coding error. Rogoff particularly, at least from his previous work, is invested in this idea that debt levels are a valuable metric for predicting market crashes, so he might have wanted to just fudge it a little to get the "90% is death" result.

  8. J.W. Brewer said,

    April 18, 2013 @ 1:34 pm

    Here, the Herndon et al critique was possible because R&R did give them the data. I don't know the story, in terms of how many times H&c had to ask before they got the data, if others had previously asked and been rebuffed, and/or what may have factored into R&R's decision to respond positively to this particular request (if it was not simply that this was the first time anyone had asked), but it doesn't seem like the sort of response people who subjectively thought they had cheated would make. (Don't academics in those scandals usually have some elaborate story about how an unfortunate flood destroyed their contemporaneous research notes or something?) It does seem like some of the points R&R make in their defense as to what data wasn't included and how and why they did the weighting as they did are the sort of thing that could helpfully have been made more explicit in the original paper than I assume they were, but I don't know what's customary in the relevant discipline.

    [(myl) I haven't investigated the history in detail, but as I understand what I've read, the data is from generally-available sources. Herndon et al. tried to replicate R & R's results without success — after trying and failing they asked R & R for help, and were given the spreadsheet. I interpret this to mean that there was neither any dishonesty nor any stonewalling involved, just some arguable choices about data exclusion and data weighting, and one coding error that perhaps was not noticed because the result was what the researchers wanted to find.]

  9. Brett said,

    April 18, 2013 @ 2:37 pm

    @myl: As I recall, Potti's collaborator at Duke, Joe Nevins, eventually concluded that the problems with the data must have involved intentional fabrication. I believe this was covered in the 60 Minutes report on the case (although I can't get the CBS news Web site to load right now, so I'll have to double check that later).

  10. Alex Blaze said,

    April 18, 2013 @ 3:18 pm

    Ha, turns out both R and R are on payroll for an anti-debt advocacy group:

    Isn't it great that the next generation of Harvard economics PhDs is taking intro macro from someone being paid to manipulate data to advance an interest group's agenda?

    And, yes, sure, could have been a sequence of simple mistakes that helped advance a conclusion that the authors wanted. Strange that all the elements of the sequence worked in the same direction, but that just might be a coincidence. It could also be a coincidence that they didn't want to release information about how they did their calculations, even to other scholars, because… I don't really know why. It's an even bigger coincidence that they never reviewed this study over the last half-decade, even though it was their most-cited paper in policy debates.

    On top of all that, it's an even bigger coincidence that all these mistakes that went ignored for years that R&R prevented other people from finding out all advanced the agenda of a group that was paying them money (salary and book deal, as well as fancy titles, plus Reinhart's husband works for another org that uses debt scare-mongering).

    That's a really huge coincidence! I wonder what the probability of all of that falling into place by chance is.

    Clearly, though, we can never admit that Serious and Respected Harvard Professors are corrupt. That's unthinkable!

  11. Sybil said,

    April 18, 2013 @ 9:39 pm

    @Alex Blaze: I'd be reluctant to assume that being paid meant being biased, but that's because that's how my own biases go. I tend to be extra skeptical of my own biases.

    I do agree (it seems you agree) that such things should be reported. It is information needed to assess reliability.

    Here there seems to be ample prior evidence of bias (conscious or unconscious), so the payroll evidence just ices the cake.

    But, as you say, interesting it went so long before being pointed out. As Captain Hook himself says,

    "We'll cook a cake quite large…
    Fill each layer in-between…
    With icing mixed with poison…
    'Till it's turned a tempting green!!! [obvious symbology]

    We'll place it near the house…
    Just were the boy's are sure to come…
    And being greedy they won't care…
    To question such a plum!!! [Who would?]

    The Boy's who have no mother sweet,
    No one to show them their mistake,
    Won't know it's dangerous to eat,
    such damp, and rich a cake!!!

    And so, before, and in the blinking of an eye…

    The boy's will eat that poison cake, and one by one…

    THEY'LL DIE!!!!!"

  12. Sybil said,

    April 18, 2013 @ 9:44 pm

    To my previous comment: I got the lyrics of Hook's song (lazily) from the WWWeb, which with its usual sense of humor had entered every instance of "boys" as "boy's". Terrible job, internet! [Slaps self]

  13. Mark P said,

    April 19, 2013 @ 8:32 am

    @bfwebster – I would like to see some examples of climate science in refereed journals in which data and methodology were not available.
