Science's 'irreproducibility crisis' is a public policy crisis too

In 2012, the biotechnology firm Amgen could replicate only 6 of 53 “landmark” studies in hematology and oncology, or about 11 percent. That finding was not unusual. In the last 15 years, study after study has revealed that a great deal of peer-reviewed published scientific research fails the test of other researchers, using the same methods, finding the same results. This has been dubbed the “reproducibility crisis.”

How much of a crisis? In 2005, Dr. John Ioannidis, now of Stanford, estimated that as many as half of published research findings in biomedicine are probably false. In some other fields it may be worse. The research on which concepts such as “stereotype threat,” “power poses,” and “implicit bias” are based, for example, reproduces badly if at all.

The scientific community is not lacking for those who tut-tut these findings. Skeptics say that just because someone tried to reproduce an experiment and failed doesn’t mean the original results were wrong. That’s a reasonable point, but evidence keeps mounting that the reproducibility crisis is real.

My colleagues and I at the National Association of Scholars think we know one of the deep causes of the crisis: bad statistics. Nearly all science these days depends on assessing the likelihood of a hypothesis. Mess up the connection between hypothesis and data, and the result is a “finding” that might as well be pulled out of thin air.

This is how that often happens. Researchers keep running statistical tests, one after the next, until they hit a fluke correlation. They then publish the correlation as though it is a positive result. That’s like the Texas farmer who shoots a hole in the side of his barn and then paints a bullseye around it.
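
A rough simulation shows how easily this can happen. The sketch below is an illustration, not an analysis drawn from any of the studies mentioned here. It assumes the conventional 0.05 significance threshold, independent tests, and data that are pure noise with a known spread; the function names are made up for the example. Each imaginary researcher keeps testing unrelated noise until one test looks "significant" or the allotted number of tests runs out.

```python
import math
import random

ALPHA = 0.05  # conventional significance threshold (an assumption of this sketch)


def noise_p_value(rng, n=30):
    """Two-sided z-test p-value for two groups of pure noise (known sd = 1)."""
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    # Difference of the two sample means, scaled by its standard error.
    z = (sum(a) / n - sum(b) / n) / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))


def finds_fluke(rng, max_tests):
    """Run up to max_tests tests on noise; stop at the first 'significant' one."""
    return any(noise_p_value(rng) < ALPHA for _ in range(max_tests))


def fluke_rate(max_tests, studies=2000, seed=0):
    """Share of simulated studies that report at least one fluke result."""
    rng = random.Random(seed)
    hits = sum(finds_fluke(rng, max_tests) for _ in range(studies))
    return hits / studies


def expected_fluke_rate(max_tests):
    """Chance of at least one false positive in max_tests independent tests."""
    return 1 - (1 - ALPHA) ** max_tests


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, round(fluke_rate(k), 3), round(expected_fluke_rate(k), 3))
```

Under these assumptions, a single test on noise gives a false positive about 5 percent of the time, but the formula in `expected_fluke_rate` puts the chance of at least one fluke in 20 tries at roughly 64 percent. The simulated rates should land close to those figures.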

There are other ways to jiggle experiments towards results that look positive but aren’t. Some researchers are caught red-handed in fraud, but more commonly scientists see the statistical manipulations as a gray area, in which they are taking a shortcut to a legitimate finding. This, however, is where they go wrong. Shortcuts are what give us non-reproducible results.

The shortcuts are tempting because science journals are biased toward exciting, new results. They don’t favor hearing about hypotheses that didn’t pan out or replication studies that upset a previously published claim.

That’s especially true when the previously published claim is buttressed by political groupthink. Contrary to their public image as fearless skeptics who relentlessly look for flaws in their own work and the work of their colleagues, many scientists conform to the opinions of those around them. Those who break from the herd often encounter difficulties in their professional lives.

The statistical errors, the corner-cutting, and the groupthink flourish because we don’t have an effective oversight system.

A 2015 study estimated that America spends $28 billion annually on irreproducible preclinical research into new drug treatments. Economic policy, education policy, and environmental policy are all informed by scientific findings, half of which could be junk science. The irreproducibility crisis is a public policy crisis. We need new spending priorities, new regulations, and new policy initiatives to reform both scientific research and the way government uses it.

John Ioannidis at Stanford and Brian Nosek at the Center for Open Science have already done excellent work to address the irreproducibility crisis, funded by the Laura and John Arnold Foundation and the Templeton Foundation. The National Institutes of Health have strengthened their reproducibility standards. This good work must be reinforced.

The National Association of Scholars recommends changes in the way government funds and regulates science. We need more studies of publication bias, more re-evaluations of older studies that have suspect statistical foundations, and more safeguards against groupthink. Agencies should prioritize funding for this work and for improvements in experimental design.

Congress will have to pay for some steps to ensure greater reproducibility in the sciences. In the end, those steps will save enormous amounts now spent pursuing blind alleys and mirages. What’s needed are standardized descriptions of scientific materials and procedures, standardized statistics programs, and standardized archival formats. We should create “born-open” data, pre-registered research protocols, and journals dedicated to negative results.

The solutions to the reproducibility crisis are not obscure or expensive, but they may not be easy either. The scientific community knows it has a problem, but it will resist efforts to give up its bad habits. It will not reform itself. It needs a push from Congress in the shape of a Reproducible Science Reform Act that requires all future government regulations to be based on genuinely reproducible science.

Peter Wood is president of the National Association of Scholars, a network of scholars and citizens with a commitment to academic freedom, disinterested scholarship, and excellence in American higher education.