The most recent case of scientific fraud, by Dutch social psychologist Diederik Stapel, recalls the 2010 case against Marc Hauser of Harvard University, a well-respected researcher in human and animal cognition. In both cases, the focus was on access to data and irregularities in the handling of data. Stapel retained full control of the raw data, never allowing his students or colleagues to have access to data files. In the case of Hauser, the scientific misconduct investigation found missing data files and unsupported scientific inference at the center of the accusations against him. Outright data fraud by Stapel, and sloppy data management and inappropriate data use by Hauser, underscore the critical role data transparency plays in preventing scientific misconduct.
Recent developments at the National Science Foundation (and earlier this decade at the National Institutes of Health) suggest a solution — data-sharing requirements for all grant-funded projects and by all scientific journals. Such a requirement could prevent this type of fraud by quickly opening up research data to scrutiny by a wider community of scientists.
Stapel’s case is an extreme example, and one more likely to occur in disciplines with substantially limited imperatives for data sharing and secondary data use. The research traditions of psychology suggest that collecting your own data is the only sound scientific practice. This tradition, less widely shared in other social sciences, encourages researchers to protect data from outsiders. The potential for abuse is clear.
According to published reports about Hauser, there were three instances in which the original data used in published articles could not be found. While Hauser repeated two of those experiments and produced data that supported his papers, his poor handling of data cast a significant shadow of uncertainty and suspicion over his work.
Hauser’s behavior is rare, but not unheard of. In 2008, the latest year for which data are available, the Office of Research Integrity at the U.S. Department of Health and Human Services reported 17 closed institutional cases that included data falsification or fabrication. These cases involved research funded by the federal government, and included the manipulation or misinterpretation of research data rather than the violation of scientific ethics or institutional oversight.
In both Hauser's and Stapel's cases, graduate students were the first to alert authorities to irregularities. Rather than relying on other members of a researcher’s lab to come forward (an action that requires a great deal of personal and professional courage), the new data sharing requirements at NSF and NIH have the potential to introduce long-term cultural changes in the conduct of science that may reduce the likelihood of misconduct based on data fabrication or falsification. The requirements were given teeth at NSF by the inclusion of new data management plans in the scored portion of the grant application.
NIH has since 2003 required all projects requesting more than $500,000 per year to include a data-sharing plan, and the NSF announced in January 2011 that it would require all grant requests to include data management plans. The NSF has an opportunity to reshape scientists' behavior by ensuring that the data-management plans are part of the peer review process and are evaluated for scientific merit. Peer review is essential for data-management plans for two reasons. First and foremost, it creates an incentive for scientists to actually share data. The NIH initiatives have offered the carrot for data sharing — the NSF provides the stick. The second reason is that the plans will reflect the traditions, rules, and constraints of the relevant scientific fields.
Past attempts to force scientists to share data have met with substantial resistance because the legislation did not acknowledge the substantial differences in the structure, use, and nature of data across the social, behavioral and natural sciences, and the costs of preparing data. Data sharing legislation has often been code for, "We don’t like your results," or political cover for previously highly controversial issues such as global warming or the health effects of secondhand smoke. The peer review process, on the other hand, forces consistent standards for data sharing, which are now largely absent, and allows scientists to build and judge those standards. "Witch hunts" disguised as data sharing would disappear.
The intent of the data sharing initiatives at the NIH and currently at NSF has very little to do with controlling or policing scientific misconduct. These initiatives are meant both to advance science more rapidly and to make the funding of science more efficient. Nevertheless, there is a very real side benefit of explicit data sharing requirements: reducing the incidence of true fraud and the likelihood that data errors would be misinterpreted as fraud.
The requirement to make one’s data available in a timely and accessible manner will change incentives and behavior. First, of course, if the data sets are made available in a timely manner to researchers outside the immediate research team, other scientists can begin to scrutinize and replicate findings immediately. A community of scientists is the best police force one can possibly imagine. Second, those who contemplate fraud will be faced with the prospect of having to create and share fraudulent data as well as fraudulent findings.
As scientists, it is often easier for us to imagine where we want to go than how to get there. Proponents of data sharing are often viewed as naïve scientific idealists, yet data sharing seems an efficient and elegant solution to the many ongoing struggles to maintain the scientific infrastructure and the public’s trust in federally funded research. Every case of scientific fraud, particularly on such controversial issues as the biological source of morality (which is part of Hauser’s research) or the sources of racial prejudice (in the case of Stapel), allows those suspicious of science and governments’ commitment to funding science to build a case in the public arena. Advances in technology have given the scientific community the opportunity to share data in a broad and scientifically valid manner, and in a way that would effectively counter those critics.
NIH and NSF have led the way toward more open access to scientific data. It is now imperative that other grant funding agencies and scientific journals redouble their own efforts to force data, the raw materials of science, into the light of day well before problems arise.
Felicia B. LeClere is a principal research scientist in the Public Health Department of NORC at the University of Chicago, where she works as research coordinator on multiple projects, including the National Immunization Survey and the National Children's Study.
For decades, debates about gender and science have often assumed that women are more likely than men to “leak” from the science and engineering pipeline after entering college.
However, new research that I coauthored shows this pervasive leaky pipeline metaphor is wrong for nearly all postsecondary pathways in science and engineering. It also devalues students who want to use their technical training to make important societal contributions elsewhere.
How could the metaphor be so wrong? Wouldn’t factors such as cultural beliefs and gender bias cause women to leave science at higher rates?
My research, published last month in Frontiers in Psychology, shows this metaphor was at least partially accurate in the past. The bachelor’s-to-Ph.D. pipeline in science and engineering leaked more women than men among college graduates in the 1970s and '80s, but not recently.
Men still outnumber women among Ph.D. earners in fields like physical science and engineering. However, this representation gap stems from college major choices, not persistence after college.
Other research finds remaining persistence gaps after the Ph.D. in life science, but surprisingly not in physical science or engineering -- fields in which women are more underrepresented. Persistence gaps in college are also exaggerated.
Consequently, this commonly used metaphor is now fatally flawed. As blogger Biochembelle discussed, it can also unfairly burden women with guilt about following paths they want. “It’s almost as if we want women to feel guilty about leaving the academic track,” she said.
Some depictions of the metaphor even show individuals funneling into a drain, never to make important contributions elsewhere.
In reality, many students who leave the traditional boundaries of science and engineering use their technical training creatively in other fields such as health, journalism and politics.
As one recent commentary noted, Margaret Thatcher and Angela Merkel were leaks in the science pipeline. I dare someone to claim that they funneled into a drain because they didn’t become tenured science professors. No takers? Didn’t think so.
Men also frequently leak from the traditional boundaries of science and engineering, as my research and other studies show. So why do we unfairly stigmatize women who make such transitions?
By some accounts, I’m a leak myself. I earned my bachelor’s degree in the “hard” science of physics before moving into psychology. Even though I’m male, I still encountered stigma when peers told me psychology was a “soft” science or not even science at all. I can only imagine the stigma that women might face when making similar transitions.
During a fellowship, I worked with two computer science graduate students and one bioengineering postdoc on a “big data” project for improving student success in high school. We partnered with Montgomery County Public Schools in Maryland to improve their early warning system. This system used warning signs such as declining grades to identify students who could benefit from additional supports.
This example shows why the leaky pipeline narrative is so absurd. Many leaks in the pipeline continue using their technical skills in important ways. For instance, my team’s data science skills helped improve our partner’s warning system, doubling performance in some cases.
Let’s abandon this inaccurate and pejorative metaphor. It unfairly stigmatizes women and perpetuates outdated assumptions.
Some have argued that my research indicates bad news because the gender gaps in persistence were closed by declines for men, not increases for women. However, others have noted how the findings could also be good news, given concerns about Ph.D. overproduction.
More importantly, this discussion of good news and bad news misses the point: the new data inform a new way forward.
By abandoning exclusive focus on the leaky pipeline metaphor, we can focus more effort on encouraging diverse students to join these fields in the first place. Helping lead the way forward, my alma mater -- Harvey Mudd College -- has had impressive success in encouraging women to pursue computer science.
Maria Klawe, Mudd’s first female president, led extensive efforts to make the introductory computer science courses more inviting to diverse students. For instance, course revisions emphasized how computational approaches can help solve pressing societal problems.
The results were impressive. Although women used to earn only 10 percent of Mudd’s computer science degrees, this number quadrupled over the years after Klawe became president. To help replicate these results more widely, we should abandon outdated assumptions and instead help students take diverse paths into science.
David Miller is an advanced doctoral student in psychology at Northwestern University. His current research aims to understand why some students move into and out of science and engineering fields.