A surprising mistake

« previous post | next post »

The "Federal Demonstration Partnership" is "is association of federal agencies, academic research institutions with administrative, faculty and technical representation, and research policy organizations that work to streamline the administration of federally sponsored research." There are 155 participating universities, and a larger number of "participating organizations" since e.g. the University of California is counted as one "university" but 8 "institutions".

So this is a seriously heavyweight organization, with massive bureaucratic inertia behind its policy decisions. And it's therefore really unfortunate that the contracts it mandates for scientific, technological and scholarly data sharing are deeply problematic.

I discovered this a couple of months ago in preparing for a workshop about the development and future of an interesting government-funded linguistic dataset, for which I had submitted a letter of support. The dataset was to be made freely available to other researchers, but I was shocked to see that Paragraph 8 of the User Agreement read:

This Agreement shall terminate five (5) years after the Effective Date. Upon such termination, Recipient shall destroy all Data.

I can't emphasize strongly enough what a bad idea this is. It would mean that after five years, the "recipient" would no longer be able to replicate research based on this dataset.

It's also a stupid idea, since I seriously doubt that the providing institution would really want to police this, or to take violators to court to enforce it. And at some unnecessary bureaucratic cost, the recipient could just get another copy — unless the providing institution has stopped distributed the dataset, since (for instance) the author has retired or died.

The dataset in question involved annotation of some material taken from podcasts, so my first thought was this clause was dictated by the podcast owners. But no, all the podcasts were distributed under relatively open Creative Commons licenses.

So then I figured that some inexperienced lawyer at the institution's General Counsel's Office has stuck in some irrelevent contractual boilerplate. But no, the author told me that this was mandated by the Federal Demonstration Partnership, so his institution felt unable to change it.

But when he looked into it further, it turned out that the clause I objected to was from an earlier version of the mandated contract, and as of February 2019 the FDP had modified its "Data Transfer and Use Agreement" so that paragraph 8 now reads

Unless terminated earlier in accordance with this section or extended via a modification in accordance with Section 13, this Agreement shall expire as of the End Date set forth above. Either party may terminate this Agreement with thirty (30) days written notice to the other party’s Authorized Official as set forth below. Upon expiration or early termination of this Agreement, Recipient shall follow the disposition instructions provided in Attachment 1, provided, however, that Recipient may retain one (1) copy of the Data to the extent necessary to comply with the records retention requirements under any law, and for the purposes of research integrity and verification.

This is slightly better, depending on what Attachment 1 says — and the providing institution's lawyers, who seem to agree with my evaluation of this clause, decided to leave section 5 of Attachment 1 ("Disposition Requirements upon the termination or expiration of the Agreement") blank.

But there's still plenty of scope for damage, and I found it shocking that the people running the FDP are apparently so unaware of the issues involved in data publication for Open Science, traditional scholarship, and technological development.

Upate — if you happen to know any of the officers or "representatives", please bring this issue to their attention.



  1. MattF said,

    August 14, 2019 @ 9:53 am

    I suspect that a user actually reading the User Agreement was unanticipated.

    [(myl) There's that — but the agreement needs to be signed by an authorized representative of the receiving institution, which means that lawyers are likely to actually read it. And in the case, what actually first brought this to my attention was the reaction of a government agency, which regarded the "destroy after five years" clause as a deal breaker.]

  2. David Marjanović said,

    August 14, 2019 @ 1:44 pm

    may retain one (1) copy of the Data

    Copies of digital data are not countable. Does my latest peer-reviewed paper, which is in an open-access journal that does not publish on dead trees, exist in one copy (at the journal's website), two (there and on ResearchGate), three (I haven't checked if SciHub has downloaded it), or hundreds (on the harddisks of everyone who has downloaded it)?

  3. David Walker said,

    August 15, 2019 @ 12:50 pm

    This was briefly mentioned in the original, but it's important to emphasize that best practices say that all published research should keep the underlying data accessible (basically forever). This is for others who want to validate that the conclusions and analysis in the research, actually match the underlying data. Replication (or discovering errors in the analysis) is a powerful tool in academic research.

    If all data obtained for research purposes had to follow this policy, it would prevent double-checking or replication efforts that might happen 5 years later.

  4. Dennis Dow said,

    August 16, 2019 @ 5:59 am

    The seed for the 5-year data retention might actually be within the IRBs within the universities. I had an identical "destroy the source data in 5 years" for research I did. In many ways disappointing but the intent was to protect the information of people being "researched."

RSS feed for comments on this post