Checking other people's experiments is essential to the process and progress of science, but there's a catch: Scientists have to earn a living. They have to establish careers. Replicating other people's data wins little prestige and few job offers.
How should early-career scientists deal with this asymmetry between scientific and career-development imperatives? When their adviser asks them to repeat someone else's experiment, what should they do, and how should they think about what they're doing?
They should adopt a broader view of replication and look for the opportunities it presents, because replicating other scientists' work is a necessary step in advancing your own research in the same area. And that, in turn, is an essential career-development activity.
"It's not that there's an activity called ‘science’ and there's a separate activity called ‘replication,’ " says Gary King, director of the Institute for Quantitative Social Science at Harvard University and a vocal proponent for data sharing and replication. "There's this general misconception that replication is this activity for low-level academics, but actually it's what the best people do."
The trick, he says, is in seeing replication not as an end in itself but as a means for acquainting yourself with the methods used in a study, the original author's line of thinking, the complications he or she must have faced, and the solutions he or she devised to those problems. Replicating an experiment, or even the whole study, can be useful for young scientists who are learning their way around the bench or lab, he says.
"The advances in science over the last several hundred years have not come only from individual scientists, … [they've] come from scholars and scientists working in concert with each other and the community," King says. "If you're going to start a new project … the first thing you should do is get up to the cutting edge of the field, and by far the best way to do that is to take the best article and to replicate it."
Victoria Stodden, an assistant professor of statistics at Columbia University, agrees with King. Stodden serves on the National Science Foundation's Advisory Committee for Cyberinfrastructure and has given numerous talks championing better data sharing and replication practices. "I think it's important if you're building on results in a field to replicate them, to be able to not just verify them but to understand how these results actually came about," she says. "That makes you a little more educated in terms of how you'll be able to build on those results. It's easier to extend the science if you know how it got to the place where it is today."
To that end, both Stodden and King require their undergraduate students to form teams and replicate current studies in class. It's a good way to instill the importance of data replication, they say, and to illustrate that in order to develop new ideas, scientists must first thoroughly understand the field in its current state.
King says that by working through the original author's paper and discovering the "enormous number of small decisions" he or she made that aren't explicitly described in the methods section, young scientists can often identify entirely new questions that the original author never thought to address. "The best papers that have come out of my class are not the ones that just say, 'I replicated the article in this journal and it came out the same,' or, 'it came out different.' Those are boring," he says. "The ones that are really interesting start that way and then think of a question that wasn't asked in the original article, or improve the methodology and produce a different answer."
Read the special section on data replication in Science’s 2 December issue.
King gives an example of a group of students a few years ago who were working to replicate a paper about presidential election campaign strategies by a respected social scientist -- King declined to name names -- and after following the paper's methods section to the letter came up with vastly different results. "It wasn't even close. It wasn't anywhere near where it should have been," he says. The students spent the next several weeks tweaking this or that variable and checking their own work to determine where the error was coming from.
At 2 o'clock one morning, King received an excited e-mail from the students. They'd figured it out. It seems the original author -- someone King describes as highly cited -- had mentioned in his methods section that he'd applied two statistical corrections to his data to account for a selection bias. But the analysis only matched the paper's figures and graphs if you ran the data without those statistical corrections; apparently, the author had forgotten to apply the corrections, even though he said he had. It was probably just an honest mistake by the author, King says, but the effects reported in the paper completely disappeared once the corrections were properly applied. The students went on to publish their findings. "It's only by people checking what we're doing that we can figure out what we really discovered originally," he says.
Richard Muller, a physicist at the University of California, Berkeley, who is leading a team of climate scientists in a high-profile replication of decades' worth of climate data and analysis, says that developing new ideas while doing a replication study is the only thing that can make such studies worthwhile for young scientists. Over the years, he's seen a number of ways in which replication studies can go wrong in career terms. He points to the Fleischmann–Pons cold fusion experiments of the late 1980s. Early on, a number of young scientists set out to replicate the attention-grabbing findings of Martin Fleischmann and Stanley Pons, and many of them did just that: They "verified" that Fleischmann and Pons had succeeded in achieving nuclear fusion by electrolyzing heavy water, he says. Within a couple of years, many more studies had proven them utterly wrong. "People who replicated the experiments lost their reputations," Muller says, "while those who failed to replicate them got no credit whatsoever."
Muller's advice to young scientists is to limit replication studies to those that help move forward your own and your lab's novel research. "There's a lot of replication that you do as a matter of course," he says. "When people discover a new elementary particle and determine its mass and how you produce it, for the next year people will be producing it to be able to use it." That kind of replication is valuable and pays dividends to young scientists. But doing a replication study just for the sake of checking whether the original author made a mistake isn't smart, he says.
Even in the case of Muller's recent verification of the findings from existing climate models, "we're still not published yet," Muller says. Despite the effort's high profile, a couple of journals have turned his paper down because, Muller says, they felt it didn't add anything novel to the existing literature. That illustrates why younger, less established scientists should avoid doing "pure" replication studies. Muller says his team's experiment is a little different from a traditional replication study, since they used new tools to analyze the same data.
Muller, King, and Stodden all say that with very few exceptions, if you replicate a study, you should limit its publication to a footnote -- no more -- in a paper reporting an original result. While occasionally a journal will publish a replication study by itself, they say, doing so is usually a waste of time and resources that would be better spent on original research that will further your career. Says King: "There's an enormous amount of information in any dataset that was worth publishing to begin with, and there is probably some other discovery to be made in there. It's good career advice to go find that thing."
Michael Price is a staff writer for Science Careers.