Reproducible research

For the last few days, I've been in Düsseldorf for the Berlin 6 Open Access conference, where I organized a session on "Open Data and Reproducible Research". Here's the abstract:

In many scientific and technical fields, research is increasingly based on published data. Researchers also often publish detailed instructions or even executable recipes for reproducing their results. Combined with inexpensive networked computing and mass storage, these trends can radically accelerate the pace of research, by lowering barriers to entry and decreasing the time required to reproduce and extend innovations. These changes may also modify the balance between data collection and data analysis, and between experimental and theoretical work.

Nevertheless, these potentially revolutionary developments are mostly happening below the surface, with uneven progress across disciplines, and little general discussion of how to guide or react to the process. The goal of this panel is to publicize the experience of several communities who have up to two decades of experience with what Jon Claerbout has termed "reproducible research", and to begin a general discussion of the broader implications for scientific, technical and scholarly publication.

The session is described on p. 8 of the program, but since two of the participants were taking part remotely (Sergey Fomel from Las Vegas, and Jelena Kovacevic from Pittsburgh), I prepared a web page with everyone's presentations linked in. At some point the slides and videos of the presentations will be put up on the web by Cornelius Puschmann and the other conference organizers; but if the ideas involved interest you, you might want to take a look at the material linked from the page that I set up.

11 Comments

  1. Tim Kelby said,

    November 14, 2008 @ 8:21 am

Two decades of experience? I thought the idea of "reproducible research" had been around since the Enlightenment. Maybe I've missed something, but surely "reproducible research" is a basic part of modern science. Part of the point of publishing papers in peer-reviewed journals is to allow other researchers to reproduce your results and build on your research.

    [(myl) It's certainly true that the *ideal* of reproducible research has been around since the Enlightenment. In the particular case of computational experiments with published inputs, it's now possible to eliminate most of the things that usually get in the way of the ideal.]

  2. Mark P said,

    November 14, 2008 @ 11:31 am

I guess this mainly applies to particular types of research that involve large quantities of collected data rather than, for example, experiments undertaken with some sort of physical apparatus. In the latter case, reproducibility is supposed to be ensured by describing the apparatus and techniques in sufficient detail that another researcher could build the apparatus and repeat the experiment. There is great suspicion of results that depend on hidden methods or data, like the cold fusion research of a few years ago. But other types of research depend on large quantities of collected data, like long-term, large-population medical research. In cases like that, access to data as well as techniques is required for reproducibility.

    And we certainly have new capabilities available today for distribution of large quantities of data. For example, NOAA has truly enormous quantities of data collected over many years by many satellites. Twenty years ago you could access it by identifying an instrument, a time and a location and then requesting that NOAA staff find the right data and send it to you by magnetic tape, at a fairly significant cost and with a delay of weeks. Today you can go online and access it, identify the data sets you want and order them. Then you simply FTP the data on the same day. And the data sets I order, which total a hundred megabytes at a time, are free. This particular program, the Comprehensive Large Array-data Stewardship System, is a model for how government should work, and not a bad model for how other types of data could be made available.
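
    A minimal sketch of how such a retrieval might be scripted, using Python's standard ftplib; the host name and order directory below are hypothetical placeholders, not the actual CLASS endpoints:

        # Fetch the files of a completed data order over FTP.
        import ftplib
        import os

        HOST = "ftp.example.gov"        # placeholder for the data server
        ORDER_DIR = "/orders/1234567"   # placeholder for one order's directory

        ftp = ftplib.FTP(HOST)
        ftp.login()                     # anonymous login
        ftp.cwd(ORDER_DIR)
        for name in ftp.nlst():         # list the files in the order
            with open(os.path.basename(name), "wb") as out:
                ftp.retrbinary("RETR " + name, out.write)
        ftp.quit()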

    [(myl) These are good points — but the same ideas apply (for instance) to complex simulations, where the amount of data being modeled may be fairly small, and in fact the fit may be more or less qualitative.

    And likewise in the case of experiments on humans, where one (important) definition of reproducibility would involve recruiting another set of subjects, but another (and also important) one involves evaluating alternative hypotheses with respect to the raw results of a particular set of sessions. ]

  3. Peter said,

    November 14, 2008 @ 12:57 pm

There's an obvious linkage here to the recent development of domain-specific programming languages for the remote and/or automated manipulation of scientific instruments, such as mark-up languages for the remote use of telescopes and microscopes. Not only is the data being shared, but so too are the expensive scientific instruments which generate it, and the grids of computers which may process it.

  4. Dan T. said,

    November 14, 2008 @ 2:20 pm

The Journal of Irreproducible Results is a parody publication playing on scientists' need for the converse (reproducible results). (It's spawned a schismatic publication, the Annals of Improbable Research, after a former editor and his staff quit in a dispute with the publisher.)

  5. Chris Callison-Burch said,

    November 14, 2008 @ 3:03 pm

Ted Pedersen had an excellent editorial in the latest edition of Computational Linguistics entitled Empiricism Is Not a Matter of Faith. It focuses on the difficulty of reproducing results in NLP and advocates that people release their software and data sets.

I've taken it to heart and released the software and data for my latest EMNLP paper. I've bundled the 6TB of training data that I used, and created step-by-step instructions on how to extract paraphrases. It was a useful exercise for me, and I hope that other people give it a try for their own projects.
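
    Instructions of this kind are easiest to follow when they are themselves executable. A minimal sketch of such a driver in Python, with hypothetical file names, checksums, and stage scripts standing in for the real pipeline:

        # Executable "reproduce the paper" recipe: verify the data, then run
        # each pipeline stage in order. All names below are hypothetical.
        import hashlib
        import subprocess
        import sys

        DATA = {"training_corpus.txt": "d41d8cd98f00b204e9800998ecf8427e"}
        STAGES = [
            ["python", "align.py", "training_corpus.txt", "alignments.out"],
            ["python", "extract_paraphrases.py", "alignments.out", "paraphrases.out"],
        ]

        for path, want in DATA.items():
            # refuse to run on data other than what the paper used
            got = hashlib.md5(open(path, "rb").read()).hexdigest()
            if got != want:
                sys.exit("checksum mismatch for %s" % path)

        for cmd in STAGES:
            print("running: " + " ".join(cmd))
            subprocess.check_call(cmd)  # stop at the first failing stage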

    [(myl) For those who unaccountably find themselves without a subscription to CL, there's a version of Ted's editorial on his web site here.]

  6. Cameron Majidi said,

    November 14, 2008 @ 4:27 pm

Has there ever been a linguistics-based humor magazine along the lines of the Journal of Irreproducible Results, or Worm-Runner's Digest? I used to enjoy browsing through both of those magazines while avoiding real work at the library.

  7. Diarmuid said,

    November 14, 2008 @ 5:28 pm

    Re linguistics humour: Try the Speculative Grammarian.

    And for the machine learning-minded there is (was?) also the Journal of Machine Learning Gossip, which published among other gems LaLoudouana and Tarare's Data Set Selection – an evil second-cousin of the Pedersen paper Chris cites.

  8. Theo Vosse said,

    November 15, 2008 @ 9:50 am

In my former field, psycholinguistics, the standard was as described above: "reproducibility is supposed to be ensured by describing … in sufficient detail …" Even though that makes an exact repetition feasible, it does not guarantee generalizability. Linguistic stimuli can contain many confounds, and it is hard to counter-balance for all factors or average them out; sometimes people overlook them (a colleague once said that you shouldn't check your materials too thoroughly if you want to get an effect). The upshot is that this strict definition of reproducibility is not sufficient; what you would ideally like to see is that the same effect appears under slightly different conditions, with a different set of stimuli, in a different language, etc.
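
    The counter-balancing itself is mechanical enough to script. A minimal sketch of the usual Latin-square rotation, with hypothetical item and condition counts: each item appears once per list, every condition occurs equally often within a list, and across lists every item appears in every condition:

        # Latin-square counterbalance: k condition-rotated stimulus lists.
        # The item and condition counts are hypothetical.
        n_items, n_conditions = 24, 4

        lists = [
            [(item, (item + offset) % n_conditions) for item in range(n_items)]
            for offset in range(n_conditions)
        ]

        # list 0 shows item 0 in condition 0, list 1 shows it in condition 1, etc.
        for i, stimuli in enumerate(lists):
            print("list %d starts: %s ..." % (i, stimuli[:3]))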

  9. Berlin 6 Open Access Conference » Wrapping up Berlin 6 said,

    November 17, 2008 @ 10:05 am

    […] Language Log […]

  10. Patrick said,

    November 17, 2008 @ 3:11 pm

@Tim Kelby: Quickly commenting on this… Tim, yes, I agree that the essence of making research reproducible is already centuries old, and dates back to the scientific method itself. All research should be reproducible. And there we are at a crucial point: should! Unfortunately, I have the impression (speaking for myself, from my experience in image processing) that with the descriptions in many peer-reviewed journal publications, I would not be able to reproduce the results. Often details such as initialization, exact parameter values, and datasets are left out (generally because of limited space), which makes it very hard, if not impossible, to reproduce the exact results presented in those papers.
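
    One low-cost habit that addresses exactly this problem is to have the experiment itself write out every detail it depends on. A minimal sketch using only the Python standard library; the parameter names, values, and dataset file are hypothetical:

        # Record everything a reader would need to rerun this experiment.
        import hashlib
        import json
        import random
        import sys

        params = {"seed": 42, "iterations": 100, "threshold": 0.75}
        random.seed(params["seed"])  # pin down the initialization

        with open("dataset.txt", "rb") as f:  # hypothetical dataset file
            dataset_md5 = hashlib.md5(f.read()).hexdigest()

        manifest = {
            "python": sys.version,
            "params": params,
            "dataset_md5": dataset_md5,
        }
        with open("run_manifest.json", "w") as f:
            json.dump(manifest, f, indent=2)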

So to summarize: yes, I agree that the idea of reproducible research has been around for a very long time. But I think we have quite some work to do (at least in image processing, though from what I read from other people, it probably applies more generally) to make our publications and research results really reproducible!

  11. Berlin 6 Open Access Conference at Pixeltje Blog said,

    November 19, 2008 @ 2:02 pm

    […] Language Log, by Mark Liberman […]
