Scientists are figuring out how to make scientific research more reliable. The federal government should take what they’ve learned and make it mandatory.
Since 2018, the National Association of Scholars (NAS) has been beating the drum about the irreproducibility crisis. That’s the failure of an enormous amount of modern scientific research to meet an elementary criterion for reliability: that the results of experiments can be reproduced by other scientists. The failure is rooted in politicization, groupthink, ambition, statistical error, and methodological sloppiness, and it has rendered a vast array of scientific research untrustworthy, false, or deceptive. There are easy solutions to the irreproducibility crisis. We’ve called for scientists to preregister their research, for example, as a way to increase reproducibility. Preregistration undermines one of the major contributors to irreproducibility, Hypothesizing After the Results are Known, or HARKing. Without preregistering, you vastly increase the chance that your “finding” is a false positive, or a statistical fluke. Because most scientific research is funded by taxpayer dollars, we’ve been calling for the federal government to require researchers to preregister their research if they want it to get federal support or to inform federal policy.
New research just came out that makes the case that preregistration really does work. A Who’s Who of the scientists working to reform psychology (ground zero for irreproducible research) has just published a fascinating experiment. They preregistered a large number of research studies and then worked to replicate them. Their preregistered research replicates about 86 percent of the time, a far higher proportion than un-preregistered psychology research (about 50 percent). From the paper:
This paper reports an investigation by four coordinated laboratories of the prospective replicability of 16 novel experimental findings using rigour-enhancing practices: confirmatory tests, large sample sizes, preregistration and methodological transparency. In contrast to past systematic replication efforts that reported replication rates averaging 50%, replication attempts here produced the expected effects with significance testing (P < 0.05) in 86% of attempts, slightly exceeding the maximum expected replicability based on observed effect sizes and sample sizes. When one lab attempted to replicate an effect discovered by another lab, the effect size in the replications was 97% that in the original study. This high replication rate justifies confidence in rigour-enhancing methods to increase the replicability of new discoveries.
This research already has critics. It’s possible, the critics argue, that the reproducibility reformers committed some methodological errors of their own!
However, the fact that the researchers chose which of their studies to put forward for replication gives some pause to Berna Devezer, a metascientist at the University of Idaho (UIdaho) who was not involved in the work. The 16 findings used in this study were chosen very differently from past replication studies—they ‘are not randomly selected from, or representative of, a well-defined literature,’ she says.
The authors also checked whether their results looked different when they used other ways to define a successful replication. Some of those methods put the replication rate as low as 71%. But highlighting higher estimates and burying lower rates in the details of the paper is “disturbing and ironic,” says UIdaho metascientist Erkan Buzbas, ‘because some of the authors are ardent proponents of not cherry-picking results.’
Quibbles over statistics cloud the basic issue, however. Preregistration is a sign of good faith, the research equivalent of escrow. Why wouldn’t scientists voluntarily embrace preregistration? Do they need to be given statistical proof that it makes scientists’ research more reliable?
Sad to say, science suffers from the irreproducibility crisis in the first place because every professional incentive encourages scientists to publish exciting new results, regardless of whether they’re reproducible. Most scientists won’t want to reform—especially the ones foisting politicized agendas on the public rather than conducting disengaged research.
Only a federal requirement to preregister research will work—a requirement linked both to federal grant money and to qualifying research as eligible to inform federal policy. The federal government is the most important single funder of scientific research in the world. It has funded, rather, a gusher of irreproducible science: an abuse of taxpayer dollars. Scientists don’t preregister hypotheses because they are disincentivized from doing so. The federal government needs to change that landscape of incentives. It amounts to a fiduciary duty.
This new research also tells us that a federal requirement to preregister is doable. The means for voluntary, verifiable, and trustworthy preregistration already exists. If scientists can preregister effectively, it’s a short leap for federal regulators to require preregistration effectively. Metascience regulation ain’t rocket science.
Preregistration isn’t a silver bullet. Every system can be gamed, and requiring preregistration will encourage scientists to figure out ways to get around the requirement. But preregistration surely will help make science more trustworthy, an attribute in short supply lately, and the federal government can and should use its power to make preregistration reform effective.
Hm. I suspect one reason scientists might be reluctant about pre-registration is that this will inevitably increase the costs of scientific studies. What will registration require? A statement of primary and secondary endpoints, with a sketch of how the endpoints will be tested and at what level? That would increase the statistical compliance costs. Furthermore, how will the government suppress results that have not been pre-registered?
Any decent-size study is bound to present interesting results that the designers did not anticipate when they wrote the protocol.
How does one do Hypothesis Preregistration?
“…scientists working to reform psychology (ground zero for irreproducible research)…”
The bigger problem with psychology is that something like 90% of psychologists self-identify as being on the far left on social issues. Self-identified and far left….
In any other context, this would be defined as inherent bias and structural variance and likely a few more things. No other cohort this segregated (and it *is* segregated) would ever be viewed as objective, and any findings that such a group produced would be dismissed on the basis of the segregation.
It is similar to what is said, legitimately, when the VFW attempted to address patriotism in K-12. Memory is that they were veterans of Korea and Vietnam, guys who genuinely meant well (a lot of them were Boy Scout leaders) but the issue parents had was that they weren’t representative of society at large. Same thing with the psychologists.
Back in the 1970s, a group of women in Boston said the same thing about male gynecologists, arguing that a woman could understand things about the female body that no man ever could, and published Our Bodies, Ourselves, which I believe is still in print. And the pressure to increase the number of minorities in the sciences is not to meet quotas (at least not officially) but because of the belief that people from different backgrounds are going to see the same problem differently.
This is inherently valid. I, who have seen automobile-sized boulders bouncing around like empty beer cans in the surf during a storm, am going to view things differently from someone from the Midwest; I, who grew up with 13′ tides, am going to view things differently from someone from the Gulf Coast, where they have 3′ tides (or a lake, where they have none); and as someone who has actually experienced -35 degree (Fahrenheit) weather and who once dropped three lit matches into a pail of gasoline only to have all three go out as if they had been dropped into water, I have an understanding of vapor pressure that someone from Texas probably doesn’t.
Let me restate that: the gal from Texas may be able to comprehend the fact that gasoline at -35 F has so little vapor pressure that it takes multiple matches to evaporate enough to ignite, but she’ll have never actually seen that. These are the benefits of diversity that we’re all supposed to champion.
So I come back to the fact that some 90% of psychologists self-identify as being on the far left of social issues. That’s abnormal; it’s not reflective of the social norm, and that’s inherently a problem for that reason. There may be more to the problem of irreproducible research, but I keep thinking it is something along the lines of the VFW’s making an attempt to understand why boys in the 1970s wore long hair.