The Data Vigilante

Uri Simonsohn, a research psychologist at the University of Pennsylvania’s Wharton School, did not set out to be a vigilante. His first step down that path came two years ago, at a dinner with some fellow social psychologists in St. Louis. The pisco sours were flowing, Simonsohn recently told me, as the scholars began to indiscreetly name and shame various “crazy findings we didn’t believe.” Social psychology—the subfield of psychology devoted to how social interaction affects human thought and action—routinely produces all sorts of findings that are, if not crazy, strongly counterintuitive. For example, one body of research focuses on how small, subtle changes—say, in a person’s environment or positioning—can have surprisingly large effects on their behavior. Idiosyncratic social-psychology findings like these are often picked up by the press and on Freakonomics-style blogs. But the crowd at the restaurant wasn’t buying some of the field’s more recent studies. Their skepticism helped convince Simonsohn that something in social psychology had gone horribly awry. “When you have scientific evidence,” he told me, “and you put that against your intuition, and you have so little trust in the scientific evidence that you side with your gut—something is broken.”

Simonsohn does not look like a vigilante—or, for that matter, like a business-school professor: at 37, in his jeans, T-shirt, and Keen-style water sandals, he might be mistaken for a grad student. And yet he is anything but laid-back. He is, on the contrary, seized by the conviction that science is beset by sloppy statistical maneuvering and, in some cases, outright fraud. He has therefore been moonlighting as a fraud-buster, developing techniques to help detect doctored data in other people’s research. Already, in the space of less than a year, he has blown up two colleagues’ careers. (In a third instance, he feels sure fraud occurred, but he hasn’t yet nailed down the case.) In so doing, he hopes to keep social psychology from falling into disrepute.

Simonsohn initially targeted not flagrant dishonesty, but loose methodology. In a paper called “False-Positive Psychology,” published in the prestigious journal Psychological Science, he and two colleagues—Leif Nelson, a professor at the University of California at Berkeley, and Wharton’s Joseph Simmons—showed that psychologists could all but guarantee an interesting research finding if they were creative enough with their statistics and procedures.

The three social psychologists set up a test experiment, then analyzed it while staying within current academic methodology and widely permitted statistical rules. By going on what amounted to a fishing expedition (that is, by recording many, many variables but reporting only the results that came out to their liking); by failing to establish in advance the number of human subjects in an experiment; and by analyzing the data as they went, so they could end the experiment when the results suited them, they produced a howler of a result, a truly absurd finding: listening to the Beatles’ “When I’m Sixty-Four” apparently made subjects nearly a year and a half younger. They then ran a series of computer simulations using other experimental data to show that these methods could increase the odds of a false-positive result—a statistical fluke, basically—to nearly two-thirds.
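The mechanics are easy to reproduce. The sketch below is a minimal simulation of the general idea, not the authors’ actual code: two groups are drawn from the same population, so any “effect” is pure noise, but the analyst records two correlated outcome measures, reports whichever one comes out significant, and tops up the sample once if the first look disappoints. The group sizes, the correlation, and the top-up rule are all illustrative assumptions.

```python
# A minimal sketch (illustrative parameters, not the authors' code) of how two common
# "researcher degrees of freedom" inflate the false-positive rate even when there is
# no real effect: (a) recording two correlated outcome measures and reporting whichever
# one comes out significant, and (b) adding more subjects if the first look disappoints.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def sample(n, rho=0.5):
    # n subjects, two outcome measures correlated at rho, no true group difference
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)

def one_study(n_start=20, n_extra=10):
    a, b = sample(n_start), sample(n_start)
    while True:
        pvals = [stats.ttest_ind(a[:, k], b[:, k]).pvalue for k in (0, 1)]
        if min(pvals) < 0.05:
            return True                      # report whichever measure "worked"
        if len(a) > n_start:
            return False                     # already topped up once; give up
        a = np.vstack([a, sample(n_extra)])  # peek at the data, then collect more subjects
        b = np.vstack([b, sample(n_extra)])

runs = 5000
rate = sum(one_study() for _ in range(runs)) / runs
print(f"False-positive rate with these tricks: {rate:.1%}")  # well above the nominal 5%
```

Even these two shortcuts alone push the nominal 5 percent error rate noticeably higher; stacking more of them, as the paper’s simulations did, pushes it higher still.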

Just as Simonsohn was thinking about how to follow up on the paper, he came across an article that seemed too good to be true. In it, Lawrence Sanna, a professor who’d recently moved from the University of North Carolina to the University of Michigan, claimed to have found that people with a physically high vantage point—a concert stage instead of an orchestra pit—feel and act more “pro-socially.” (He measured sociability partly by, of all things, someone’s willingness to force fellow research subjects to consume painfully spicy hot sauce.) The size of the effect Sanna reported was “out-of-this-world strong, gravity strong—just super-strong,” Simonsohn told me over Chinese food (heavy on the hot sauce) at a restaurant around the corner from his office. As he read the paper, something else struck him, too: the data didn’t seem to vary as widely as you’d expect real-world results to. Imagine a study that measured male height: if the average man were 5-foot-10, you wouldn’t expect every group of male subjects to average out to precisely 5-foot-10. Yet this was exactly the sort of unlikely pattern Simonsohn detected in Sanna’s data.
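The check behind that hunch can be sketched with the same height example. Under ordinary sampling, the means of independent groups of n subjects should scatter around the population mean with a standard deviation of roughly sigma divided by the square root of n; group means that barely scatter at all are a red flag. The figures below (a 3-inch population standard deviation, six groups of 15 men, means all within a tenth of an inch of 70 inches) are invented purely for illustration and have nothing to do with Sanna’s actual numbers.

```python
# A sketch of the "too little variance" red flag, using invented height numbers
# (nothing here comes from Sanna's study). If independent groups of n subjects are
# drawn from a population with standard deviation sigma, their group means should
# scatter with a standard deviation of about sigma / sqrt(n). Means that are all
# nearly identical are far less likely than they might intuitively seem.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 70.0, 3.0          # assumed population mean and SD of male height, in inches
n, groups = 15, 6              # hypothetical study: six groups of 15 men each
observed_spread = 0.10         # suppose every reported group mean sat within ~0.1" of 70

def spread_of_means():
    means = [rng.normal(mu, sigma, n).mean() for _ in range(groups)]
    return np.std(means, ddof=1)

sims = np.array([spread_of_means() for _ in range(20_000)])
print(f"Expected scatter of group means: about {sigma / np.sqrt(n):.2f} inches")
print(f"Simulated runs with scatter <= {observed_spread}: {(sims <= observed_spread).mean():.4%}")
```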

Simonsohn launched an e-mail correspondence with Sanna and his co-authors; the co-authors later relayed his concerns to officials at the University of North Carolina, Sanna’s employer at the time of the study. Sanna, who could not be reached for comment, has since left Michigan. He has also retracted five of his articles, explaining that the data were “invalid,” and absolving his co-authors of any responsibility. (In a letter to the editor of Psychological Science, who had asked for more detail, Sanna mentioned “research errors” but added that he could say no more, “at the direction of legal counsel.”)

Not long after the exchange with Sanna, a colleague sent Simonsohn another study for inspection. Dirk Smeesters of Erasmus University Rotterdam, in the Netherlands, had published a paper about color’s effect on what social psychologists call “priming.” Past studies had found that after research subjects are prompted to think about, say, Albert Einstein, they are intimidated by the comparison, and perform poorly on tests. (Swap Einstein out for Kate Moss, and they do better.) Smeesters sought to build on this research by showing that colors can interact with this priming in strange ways. Simultaneously expose people to blue (a soothing hue), for example, and the Einstein and Moss effects reverse. But a strange thing caught Simonsohn’s eye: the outcomes that Smeesters had predicted ahead of time were eerily similar, across the board, to his actual outcomes.

Simonsohn ran some simulations using both Smeesters’s own data and data found in other papers, and determined that such a data array was unlikely to occur naturally. Then he sent Smeesters his findings, launching what proved to be a surreal exchange. Smeesters admitted to small mistakes; Simonsohn replied that those mistakes couldn’t explain the patterns he’d identified. “Something more sinister must have happened,” he recalled telling Smeesters. “Someone intentionally manipulated the data. This may be difficult to accept.”
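The flavor of those simulations can be sketched as a simple Monte Carlo question: if the predictions were exactly right and the only noise were ordinary sampling error, how often would every observed condition mean land as close to its prediction as the reported ones did? The predicted means, standard deviation, cell size, and gap below are hypothetical stand-ins, not Smeesters’s data, and Simonsohn’s actual analysis was more involved.

```python
# A sketch (hypothetical numbers, not Smeesters's data) of the question Simonsohn's
# simulations were getting at: if the predicted condition means were exactly right and
# the only noise were ordinary sampling error, how often would every observed mean land
# as close to its prediction as the reported ones did?
import numpy as np

rng = np.random.default_rng(0)
predicted = np.array([4.2, 5.1, 3.8, 5.6])  # hypothetical predicted means for four conditions
sd, n = 1.5, 25                              # hypothetical within-condition SD and cell size
max_gap = 0.05                               # suppose every reported mean was within 0.05 of prediction

se = sd / np.sqrt(n)                         # standard error of each condition mean

def all_that_close():
    means = rng.normal(predicted, se)        # condition means under pure sampling noise
    return np.all(np.abs(means - predicted) <= max_gap)

p = np.mean([all_that_close() for _ in range(50_000)])
print(f"Chance of results this close to prediction by luck alone: {p:.5f}")
```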

“I was trying to give him any out,” Simonsohn said, adding that he wasn’t looking to ruin anyone’s career. But in June, a research-ethics committee at Smeesters’s university announced that it had “no confidence in the scientific integrity” of three of his articles. (The committee noted that it had no reason to suspect Smeesters’s co-authors of any wrongdoing.) According to the committee’s report, Smeesters said “he does not feel guilty” and also claimed that “many authors knowingly omit data to achieve significance, without stating this.” Smeesters, who could not be reached for comment, resigned from the university, prompting another Dutch scholar to publicly remark that Simonsohn’s fraud-detecting technique was “like a medieval torture instrument.”

That charge disturbs Simonsohn, who told me he would have been content with a quiet retraction of Smeesters’s article. The more painful allegation, however, is that he is trying to discredit social psychology. He adores his chosen field, he said, funky, counterintuitive results and all. He studied economics as an undergrad at Chile’s Universidad Católica (his father ran a string of video-game arcades in Santiago; Simonsohn initially hoped to go into hotel management), but during his senior year, an encounter with the psychologist Daniel Kahneman’s work convinced him to switch fields. He prefers psychology’s close-up focus on the quirks of actual human minds to the sweeping theory and deduction involved in economics. (His own research, which involves decision making, includes a recent study titled “Weather to Go to College,” which finds that “cloudiness during [college] visits has a statistically and practically significant impact on enrollment rates.”)

So what, then, is driving Simonsohn? His fraud-busting has an almost existential flavor. “I couldn’t tolerate knowing something was fake and not doing something about it,” he told me. “Everything loses meaning. What’s the point of writing a paper, fighting very hard to get it published, going to conferences?”

Simonsohn stressed that there’s a world of difference between data techniques that generate false positives and outright fraud, but he said some academic psychologists have, until recently, been dangerously indifferent to both. Outright fraud is probably rare. Statistical manipulation that stops short of fraud is undoubtedly more common—and surely extends to other fields dependent on statistical study, including biomedicine. Worse, sloppy statistics are “like steroids in baseball”: throughout the affected fields, researchers who are too intellectually honest to use these tricks will publish less, and may perish. Meanwhile, the less fastidious flourish.

Last summer, not long after Sanna and Smeesters left their respective universities, Simonsohn laid out his approach to fraud-busting in an online article called “Just Post It: The Lesson From Two Cases of Fabricated Data Detected by Statistics Alone.” Afterward, his inbox was flooded with tips from strangers. People wanted him to investigate election results, drug trials, the work of colleagues they’d long doubted. He has not replied to these messages. Making a couple of busts is one thing. Assuming the mantle of the social sciences’ full-time Grand Inquisitor would be quite another.
