Scientists scrap data and keep what fits their theory
IT sounds like a simple, straightforward proposition: Scientists should disclose how they collect and analyze the data supporting their scientific publications.
Yet, as Wharton operations and information management professors Joseph Simmons and Uri Simonsohn and UC Berkeley colleague Leif Nelson point out in a recent research paper, too much emphasis is placed on getting research results published in respectable journals, without worrying enough about whether the evidence backs up those findings.
Their stance is hardly new. Not just academics but also the public at large have often looked skeptically at published studies that in some cases defy common sense. One problem with this skepticism, Simonsohn says, is that it ends up calling into question even solid research that can lead to new insights about everything from investment behavior to product marketing to consumer psychology.
Because it is so easy to find evidence for any hypothesis, and because counterintuitive findings are more likely to get noticed and praised, "the temptation is to conduct research that in the end doesn't contribute to society very much," Simonsohn notes. "Instead of asking questions that will lead to important findings in our respective areas, too often we ask questions that are more likely to get media attention."
The three researchers suggest that "the most costly error" in the scientific process - which includes generating hypotheses, collecting data and examining whether or not the data are consistent with those hypotheses - is a false positive: statistically significant evidence for an effect that is not actually real.
False positives, the authors note, are persistent. They waste resources "by inspiring investment in fruitless research programs," and they can eventually create credibility problems in any field known for publishing them.
False positives will necessarily happen sometimes, but they occur too often because researchers have many decisions to make during the course of collecting and analyzing data. Furthermore, the authors note, it is "common practice for researchers to search for a combination of analytic alternatives that yields 'statistical significance,' and then to report only what worked."
False evidence
The problem, of course, is that the likelihood of at least one (of many) analyses producing a falsely positive finding is high. Indeed, a researcher is often "more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not."
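The arithmetic behind that claim is worth spelling out (this back-of-the-envelope calculation is ours, not the paper's). If each analysis is treated as an independent test at the conventional 5 percent significance level, the chance that at least one of k analyses turns up falsely significant is 1 - 0.95^k. A short illustrative Python calculation:

    # Chance of at least one false positive across k independent
    # analyses, each tested at significance level alpha.
    # Illustrative simplification: real analyses are correlated,
    # so the true inflation is smaller but still substantial.
    alpha = 0.05
    for k in (1, 3, 5, 10, 14):
        p_any = 1 - (1 - alpha) ** k
        print(f"{k:2d} analyses -> {p_any:.0%} chance of a false positive")

By 14 independent analyses, the odds of at least one spurious "significant" result already exceed 50 percent.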
Add to that a researcher's desire to find a statistically significant result - the less intuitive the better - and to get the work published. As the authors write, "a large literature documents that people are self-serving in their interpretation of ambiguous information and remarkably adept at reaching justifiable conclusions that mesh with their desires."
Simonsohn gives a hypothetical example. Suppose a researcher is trying to help marketers figure out what would make a television or video ad appealing to young people. Say one possibility is to set it to the beat of a popular song. At that point, because we already know that this is a good idea, the researcher would merely study whether it is worth paying for the rights to that music based on anticipated revenues from the ad.
The less obvious, the better
"But that's boring," says Simonsohn. "Compare it to putting subliminal diagonal yellow lines on the ad as a way to increase sales. If you have two papers on this, the one that estimates the exact value of paying for the rights will not get published because it's not particularly interesting. The other one will, because it is less intuitive."
Asking less obvious questions makes sense because that is where information has the most value, Simonsohn notes, "but as soon as having correct information is no longer a requirement for studies to work and get published, then asking less obvious questions will tend to lead to less truthful findings. You, as a researcher, will go for the yellow lines. It distracts us from the questions that will lead to more substantive, verifiable findings."
While Simonsohn acknowledges that academics have long been concerned about the liberties that some researchers take when analyzing data, he highlights three unique contributions that his paper makes.
First, he and his co-authors offer a simple, low-cost solution to the problem that does not interfere with the work of scientists already doing everything right - asking them to disclose what they did in ways that would add only a few dozen words to most articles.
Second, they demonstrate just how big the problem can get: While it had been suspected that the consequences of taking these liberties were relatively minor, the authors show that these practices can increase the odds of finding evidence for a false hypothesis to more than 50 percent.
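A rough sense of how the rate climbs that high can be had from a small Monte Carlo simulation. The sketch below is not the authors' code; it assumes just two of the flexible choices the paper describes - testing two correlated outcome measures and adding more subjects after peeking at the results - and shows the false-positive rate already rising well above the nominal 5 percent. Combining more such choices, as the paper does, pushes it past 50 percent.

    import numpy as np
    from scipy import stats

    # No true effect exists: both groups are drawn from the same
    # distribution, with two outcome measures correlated at 0.5.
    rng = np.random.default_rng(0)
    cov = [[1.0, 0.5], [0.5, 1.0]]

    def significant(g1, g2):
        # Count the study as "significant" if either outcome
        # measure yields p < .05.
        return any(stats.ttest_ind(g1[:, j], g2[:, j]).pvalue < 0.05
                   for j in (0, 1))

    hits = 0
    trials = 5000
    for _ in range(trials):
        g1 = rng.multivariate_normal([0, 0], cov, 20)
        g2 = rng.multivariate_normal([0, 0], cov, 20)
        if not significant(g1, g2):
            # Peek, then add 10 more subjects per group and retest.
            g1 = np.vstack([g1, rng.multivariate_normal([0, 0], cov, 10)])
            g2 = np.vstack([g2, rng.multivariate_normal([0, 0], cov, 10)])
        hits += significant(g1, g2)
    print(f"False-positive rate: {hits / trials:.1%}")  # far above 5%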
And third, the authors did an actual experiment to illustrate their point about how data are manipulated to achieve a desired outcome.
The authors suggest that the disclosure requirements noted above "impose minimal costs on authors, readers and reviewers.... We should embrace these disclosure requirements as if the credibility of our profession depended on them. Because it does."
Adapted from Knowledge@Wharton, http://knowledge.wharton.upenn.edu. To read the original, please visit: http://bit.ly/LkAz6a