Academics can accurately predict whether an experiment's findings will be reproducible, according to a study that raises fresh questions about the reliability of research published in leading journals.
A major new investigation -- undertaken by 12 research centers across the world in partnership with the Center for Open Science (COS) and published in Nature Human Behaviour -- sought to test the most significant findings in social science papers published in Science and Nature between 2010 and 2015.
Conducting replication experiments for 21 eligible studies, the collaborative team of researchers from five laboratories found that 13 showed evidence "consistent with" the findings of the original paper. The remaining eight failed to do so, yielding a reproducibility rate of 62 percent.
To ensure "high statistical power," average sample sizes for the replication studies were about five times larger than the originals. Strikingly, however, effect sizes in the replications were, on average, about 50 percent smaller than those reported in the original studies.
Lily Hummer, an assistant research scientist at the COS and co-author of one of the replication studies, explained that the smaller effect sizes showed that “increasing power substantially is not sufficient to reproduce all published findings.”
Project leaders set up prediction markets -- allowing other researchers to bet money on whether they thought each one of the papers would replicate or not -- before conducting the experiments.
Taking into account the bets of 206 researchers, the markets correctly predicted replication outcomes for 18 of the 21 studies. Furthermore, market beliefs about replication were highly correlated with the true replication effect sizes, authors noted.
Thomas Pfeiffer, professor of computational biology at the New Zealand Institute for Advanced Study and another of the project leaders, said that this suggested that "researchers have advance knowledge about the likelihood that some findings will replicate."
Speaking to Times Higher Education, Brian Nosek, professor of psychology at the University of Virginia and executive director of the COS, said that this outcome did not necessarily indicate that academics were knowingly submitting papers with poor-quality data sets to journals.
“I wouldn’t infer that researchers are submitting work that they know to be irreproducible,” he said. “The market reflects the price that the whole community puts on the likelihood of replication, but there is a lot of variability between individuals.”
He added that a "shift in culture" likely meant that scientists who might once have been defensive about criticism of their findings were becoming increasingly aware of potential problems with the reproducibility of data and, thus, more open to public discussion than they might have been even just a few years ago.
“It may be that researchers themselves will sharpen their intuitions about the plausibility of results,” Nosek said. “When the culture was just rewarding finding the sexy result, regardless of its plausibility, there was little reason to consider whether it was reproducible. That is one of the biggest benefits of the changing norms.”
This is not the first time that betting has been used as an indicator of whether findings will replicate. A previous study published by the COS asked researchers to submit their predictions for the reproducibility of 44 studies published in prominent psychology journals.
Each researcher was given $100 to "trade" with, and a total of 2,496 transactions were carried out. The results suggested that prediction markets could be used as "decision markets" to help scientists prioritize which studies to attempt to replicate in the future.