"Instant replay" and intellectual referees
The title of a post at MedPage Today echoes the widely negative reaction to obviously blown calls in the recent NFL conference title games — "Is Journal Peer-Review Now Just a Game? Milton Packer wonders if the time has come for instant replay":
Many believe that there is something sacred about the process by which manuscripts undergo peer-review by journals. A rigorous study described in a thoughtful paper is sent out to leading experts, who read it carefully and provide unbiased feedback. The process is conducted with honor and in a timely manner.
It sounds nice, but most of the time, it does not happen that way.
For some comments about the process from the perspective of editors, reviewers, and authors, see the rest of Packer's post. His experience is in the biomedical field, but the situation is similar in other fields. Amazingly bad stuff is often published in respectable and even eminent journals, and genuinely insightful work can be delayed for years by painfully slow interactions with inattentive and dubiously competent reviewers.
Packer's conclusion:
The peer-review process is horribly broken. My dear friend Harlan Krumholz, MD, of Yale University, has written about this for years. In a famed paper ("The End of Journals"), he argued that journals are too slow, too expensive, too limited, too unreliable, too parochial, and too static.
In his efforts to forge a solution, Harlan has been the leading advocate for preprint servers, a platform that allows authors to post a version of a scholarly paper that precedes peer review. The idea has real merit, and its following is growing rapidly.
If you think of peer-review as a game, then the preprint server is easy to understand. If you are fond of football or baseball, just think of it as "instant replay." You can see the paper immediately and repeatedly in its original state from every possible angle. But in the case of preprint servers, the "instant replay" takes place before the play has actually occurred.
Got it? Peer-review is a game — with a dollop of talent and an abundance of chance. So, are you feeling lucky today?
The current structure of scientific, technical, and scholarly journals evolved in a situation where reproduction (typesetting, printing) and distribution (snail mail) were scarce and expensive resources that had to be rationed.
In increasingly many fields, actual intellectual communication now takes place through the digital distribution of drafts on sites like arXiv, data and code on sites like github, and conference proceedings (which are lightly reviewed since space and time at conferences are still limited resources). In those fields, publication in traditional journals has become an empty ritual like the wearing of academic robes, capes, and headgear on ceremonial occasions — a cultural survival without any real role in teaching and research.
A sample of vaguely relevant earlier posts:
"The business of newspapers is news", 12/10/2009
"Sproat asks the question", 9/17/2010
"Why most (science) news is false", 9/21/2012
"From the American Association for the Advancement (?) of Science (?)", 5/25/2013
"(Not) trusting data", 8/4/2013
"The open access hoax and other failures of peer review", 10/5/2013
"Bad Science", 10/19/2013
Simon Spero said,
February 1, 2019 @ 7:34 am
Journals exist in order to serve as an ongoing shared data collection activity for scientometric researchers.
In return, these metrics can be used by tenure committees to reduce the amount of time wasted evaluating candidates, as well as providing a safe, non-violent outlet for The Madness of Reviewer II.
Not gate-keepers, nor goal-keepers, but goal-posts.
*doink* *doink*
AntC said,
February 1, 2019 @ 8:35 am
The benefit we (the general-reading public) are supposed to get from peer-reviewed journals is that they sift stuff with reliable evidence from woo. The rationing of physical production then served as a proxy for the rationing of attention span. Unfortunately, some journals (like Nature) seem to actually amplify the woo.
I regularly get articles from preprint servers/arXiv. And much of it is excellent. But some of it is terrible. And some of it is so badly written (poor English, overblown claims, dense jargon) that I cry out for a sub-editor, even if the content might be useful. I worry that on subjects where my own knowledge is shaky, I'm not able to tell those categories apart; is it any better than the 'lucky dip' that is Wikipedia? Case in point from another thread.
For those in the field/tenure committees/interview panels, how much time do you use up chasing around arXiv trying to assess whether person X's impressively-long list of publications is just (say) one article reworked many times?
I agree the peer-review process is horribly broken. I'm unconvinced preprint servers really fix anything.
Levantine said,
February 1, 2019 @ 8:55 am
Woo?
Philip Taylor said,
February 1, 2019 @ 9:04 am
I think that the peer-review process is sub-optimal but can serve a useful purpose; however, it does not (in general) encourage heterodoxy, which may well hold science back. Furthermore, given that science is predicated on replicability, there are some fields in which that desideratum simply cannot be achieved (there is, for example, only one LHC). And the Sir Cyril Burt affair should teach us that even one's most eminent peers can be deceived if one is sufficiently determined …
AntC said,
February 1, 2019 @ 9:09 am
Woo = pseudo-science, unsubstantiated or unfounded ideas [Urban dictionary];
contraction from woo-woo = unfounded or ludicrous beliefs, esp New Age 'theories'.
Levantine said,
February 1, 2019 @ 9:36 am
Thanks, AntC.
Brett said,
February 1, 2019 @ 10:49 am
Even in theoretical particle physics, where everyone actually reads the papers on the arXiv, journal publication still plays a minor but useful role. It provides an additional level of verification; lots of papers are significantly improved by the peer review process, with small (or occasionally large) errors corrected and clarity improved.
David L said,
February 1, 2019 @ 12:01 pm
@AntC:
Unfortunately, some journals (like Nature) seem to actually amplify the woo.
As a former editor at Nature, I would say that what you call 'woo,' we liked to call 'public impact.' :) Meaning that an important selection criterion was whether a piece of research would be likely to interest researchers in other fields and readers of newspapers. Of course, Nature and a few other journals had and still have an oversize role in influencing what science reporters cover, so there is an element of circularity here. But I think that winnowing a handful of papers out of the vast numbers churned out by the academic world is a job that someone has to do.
I agree the peer-review process is horribly broken. I'm unconvinced preprint servers really fix anything.
I concur on both points. If a handful of select journals don't pick out the most 'notable' papers, then someone else will do it, and will have their own biases and prejudices.
Peter Erwin said,
February 1, 2019 @ 3:25 pm
Like a lot of "peer review is broken!" rants, this one strikes me as strongly biased by personal and parochial experience ("parochial" because people assume the peculiar details of how peer review and publication work in their field are universal), with anecdotes pretending to be universal experience.
Some comments with mention of how my field (astronomy) differs from what the author clearly thinks is universal practice:
In their cover letter, they often sell the paper to the editors.
We don't do cover letters.
The editors perform an initial cursory review, and reject many papers in an electronic instant. More than half of the submissions do not survive the screening process.
Whether this is good or bad depends on what's coming in, and on how "exclusive" the journal thinks itself. Any serious scientific journal has to discard the illiterate crackpottery that inevitably shows up. (From conversations with editors in my field, the vast majority of papers that aren't clear nonsense go out for review.)
… early- or mid-career investigators, who are anxious to curry favor with the journal — in the hope that the brownie points they earn might be useful when they submit their own work at a later time.
No one in my field thinks it works that way. People referee papers (the term is used for both US and European journals) out of a mixture of moral obligation, sense of duty, guilt (at not doing one's duty), and a desire to punish the wicked (i.e., you get to play a small role in improving the field — or at least preventing things from getting worse — by rejecting the papers that don't deserve publication).
When I act as a reviewer, I am sent copies of the comments of the other people who evaluated the paper.
A minor point, but since astronomy journals routinely use only one referee per paper (unless the authors are genuinely displeased with the report and request a new referee), this doesn't happen. (My only experience in this sense was refereeing for Nature; I thought one of the referees was a bit too lenient, but the other provided some very useful comments from a theoretical perspective that clearly complemented my observationally based critique.)
The really amazing part? Typically, the reviewers perform their evaluation without any ability to know if the authors accurately described their methods or their data
Well, sure. That's been a problem for centuries. It's really amazing there's been any progress in science at all. (Yes, I'm being slightly sarcastic.) And preprint servers do absolutely nothing to change that.
All too often, in their zeal to please the reviewers, the authors revise the paper in a way that makes it much worse than the original.
In my experience (speaking of papers I've authored or coauthored), the vast majority of times the revised paper is an improvement on the original. (Adding a few extra references to papers by the referee or their friends is a very, very minor negative at worst.)
And when the paper finally appears online or in print, the authors have the privilege of seeing their work ignored.
So? What on earth does that have to do with peer review? (You get exactly the same "privilege" if you put your paper on a preprint archive.)
And that final "instant-replay" metaphor makes very little sense to me.
Peter Erwin said,
February 1, 2019 @ 3:27 pm
painfully slow interactions with inattentive and dubiously competent reviewers.
Mark — since I'm genuinely curious about how these things might vary:
How long, on average, does it take to get the reviewers' reports from the time you submit a paper (or a revision) in your field? What does "painfully slow" mean in this context?
[(myl) Many (most?) journals have been trying to improve things, but it has been traditional to see a gap of 18 to 42 months between submission and publication.
I know of one important paper (in experimental psychology) that was delayed by more than three years — the author meanwhile had put his code, his raw data, and various notes and explanations up on his web site, had given several invited talks on the topic, etc., so that by the time the article finally appeared, everyone who mattered had known about the details as well as the basic ideas for several years.
If I look at the December 2013 edition of Language (the most recent one offered by JSTOR), the first 4 articles say:
[Received 15 June 2010; revision invited 21 February 2011; revision received 31 July] — so 42 months between submission and publication.
[Received 29 January 2012; accepted pending revisions 1 August 2012; revision received 15 January 2013; accepted 15 February 2013] — so 23 months between submission and publication.
[Received 29 January 2012; revision invited 10 July 2012; revision received 21 March 2013; accepted 25 March 2013] — so 23 months between submission and publication.
[Received 6 June 2011; revision invited 12 February 2012; revision received 12 March 2012; accepted 27 March 2013] — so 30 months between submission and publication.
]
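For anyone who wants to check those month counts, here is a minimal Python sketch of the arithmetic, assuming (as stated above) that all four articles appeared in the December 2013 issue; the received dates are the ones quoted in the front matter.

from datetime import date

def months_between(received: date, published: date) -> int:
    # Whole calendar months between submission and publication,
    # matching the counts quoted above (e.g. June 2010 -> December 2013 = 42).
    return (published.year - received.year) * 12 + (published.month - received.month)

publication = date(2013, 12, 1)  # the December 2013 issue of Language
for received in [date(2010, 6, 15), date(2012, 1, 29), date(2012, 1, 29), date(2011, 6, 6)]:
    print(received, "->", months_between(received, publication), "months")
# prints 42, 23, 23, and 30 months respectively

Counting whole calendar months rather than days keeps the figures consistent with the journal's front matter, which gives only the month of the issue.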
Chris C. said,
February 1, 2019 @ 4:36 pm
@David L — Yes, of course someone must do that winnowing job. But one would hope that the quality of a paper has more to do with selection than how many press releases your journal is likely to circulate on publication. You do no one any favors by publishing trash research. (You're obviously being at least a little sarcastic, but it's genuinely hard to tell how much of your reply to take seriously.)
David L said,
February 1, 2019 @ 6:29 pm
I don't see how you infer from my comments that I am or was in favor of publishing 'trash research.' Unlike most journals, Nature covers a wide range of sciences, and an important factor in choosing which papers to review and possibly publish was whether they were likely to have some interest or appeal beyond a specialist audience. Such appeal may well translate into interest by science reporters.
I hope you don't think that any research that gains the attention of the press is ipso facto trash.
The Other Mark P said,
February 1, 2019 @ 7:17 pm
I would like Nature, and similar, to be rather more concerned about whether the article is correct than they appear to be. An article with lots of "impact" can also do a lot of damage (The Lancet and the MMR fiasco is one which has cost lots of money and lives — all because they were so overwhelmed by its "impact" they didn't check if it was correct.)
I read journals in a range of fields (history, education, mathematics, climate, Russian studies, architecture) and academic publishing works in some and definitely doesn't work in others. The more easily reproducible the field, the more likely it is to be working — so someone with experience in astronomy may well feel there is no need for change, but someone in education may think very differently. The political impact of a field also vastly increases the chance of very poor peer review, as politicised journals send papers to suitably politically leaning reviewers. Publishing rubbish in astronomy will ruin your career, but there's plenty of total rubbish published in education and some complete charlatans work their way to the top.
Garrett Wollman said,
February 1, 2019 @ 9:07 pm
These issues of journal publication schedules are interesting to me in the metaphorically "academic" sense — I work in a computer science laboratory, and (with the exception of theory, which is really a branch of mathematics) our researchers overwhelmingly publish at conferences, not in journals. Nothing waits more than six or eight months until final publication in the conference proceedings. There are computer science journals, of course, so some research does get published in that venue, but far more goes to SOSP, PLDI, IMC, ECCV, CVPR, NeurIPS, FAST, and a bunch of other three- and four-letter acronyms.
Chris C. said,
February 1, 2019 @ 10:49 pm
@David L — The Other Mark P has said what I would have, only better. I'd also point out the widely noted bias in favor of positive results, which such a selection process clearly does nothing to address. Rather the opposite, from what you've said. "We confirmed the null hypothesis" isn't likely to garner headlines, but it's no less valuable a result.
David L said,
February 2, 2019 @ 12:02 pm
I would like Nature, and similar, to be rather more concerned about whether the article is correct than they appear to be.
Right, the editors of Nature spend their time deciding which of the obviously incorrect papers they have before them they most want to publish.
Back in the 1990s there was a Congressional hearing on some recent incident of academic fraud — or maybe several, I can't remember. One panel included both John Maddox and Dan Koshland, the editors of Nature and Science at the time. They were asked to give their assessment of the percentage of papers in their journals that could be reliably said to be correct.
Koshland gave the conventional answer, which was that their reviewers and editors did a wonderful job and that 90-something percent of the papers in Science were correct. Maddox said exactly the opposite — that if you look back over the long history of Nature, almost everything the journal has published, even the great papers, has turned out to be incorrect in one way or another. Because that's how science works.
"Correctness" in a formal way is a low bar in scientific publishing. You want some combination of originality, novelty, imagination, even a bit of provocation. That was always Maddox's belief when he was editor. Sometimes he went too far, to be sure, but I think that overall he had the right idea.
peterv said,
February 2, 2019 @ 3:49 pm
The notion that peer review is a game is not original. See the amusing paper:
Chambers, J. M. and Herzberg, Agnes M. (1968): A note on the game of refereeing. Journal of the Royal Statistical Society. Series C (Applied Statistics), 17(3), pp. 260-263
https://www.jstor.org/stable/2985643?origin=crossref&seq=1#metadata_info_tab_contents
Peter Erwin said,
February 3, 2019 @ 6:11 am
Mark — thanks for posting the 2013 Language dates. I also took a look at the most recent (December 2017) open-access issue on the Language website, which gives generally similar numbers, with a range of 18-30 months (median of 27 months) between submission and publication.
For comparison, I looked at the mid-January 2019 issue of The Astrophysical Journal (usually abbreviated ApJ), one of the top two or three journals in astronomy. Inspection of the first fifteen articles shows a median time from submission to publication of about four months. There is one paper with a time of 27 months, so long delays can happen, though that's a bit of an outlier; the next longest times are around ten months. The shortest was three months.
From personal experience with the four main journals in astronomy (including ApJ), the typical time between submission and receipt of the first referee report is around five weeks; the longest I've experienced was a little over two months. So from my perspective the four-to-nine-month referee turnaround time suggested by the Language articles (going by the difference between the "received" and "revision invited" dates) is… kind of astonishing.
Also, it looks as though it takes anywhere from eight months to a year between acceptance and actual publication in Language, which contrasts rather dramatically with the one-to-two month delay for most astronomical journals. (The ApJ issue I referenced is dated 20 January 2019; the article acceptance dates were all in late November or early December of 2018.) And, since almost all astronomers post their version of the accepted paper to the arXiv within a few days of acceptance (possibly replacing the submitted version they posted earlier), in practice those articles were probably "published" a median of three months after first submission.
Peter Erwin said,
February 3, 2019 @ 7:57 am
@ David L:
"Correctness" in a formal way is a low bar in scientific publishing. You want some combination of originality, novelty, imagination, even a bit of provocation. That was always Maddox's belief when he was editor. Sometimes he went too far, to be sure, but I think that overall he had the right idea.
The problem is that a bias towards papers which are "novel" and "provocative" may overemphasize dramatic positive results and discourage publishing negative or null results, which can contribute to things like the "file drawer problem" and discourage attempts at replications of previous studies.
mg said,
February 3, 2019 @ 3:56 pm
@The Other Mark P – the MMR fiasco was worse than that. Peer review actually worked – the reviewers recommended rejecting the obviously flawed paper – but the editor decided to publish anyway in search of the publicity and attention.
One problem for journals that actually care about doing peer review properly is the shortage of good-quality peer reviewers. Doing peer reviews is volunteer work that doesn't count towards getting tenure or promotions, and doing it correctly is time-consuming. I'm especially concerned by the number of medical journals that don't ensure that papers are reviewed by at least one methodologist (statistician or the like) to check that the data analysis methods are appropriate and the conclusions drawn are supported by the data.