The limitations and perils of scientific studies

A staple of college psychology courses is the story of polio and “spongy tar.” Decades ago, researchers noticed that polio rates were higher at times when the tar in children's playgrounds was spongier. They mistakenly concluded that spongy tar causes polio and raised an alarm; in response, some schools went so far as to dig up their tar playgrounds. Only later did scientists realize that the tar and the polio were both symptoms of something else: hotter temperatures. Polio tended to be a summertime problem, and tar softens in hot weather.

The lesson: Correlation does not imply causation. Just because two things happen at the same time doesn't mean that one caused the other.
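A toy simulation makes the point concrete. In this sketch, everything is invented for illustration (the numbers and the simple linear model are assumptions, not real epidemiological data): a hidden third factor, temperature, drives both tar softness and polio incidence, so the two correlate strongly with each other even though neither causes the other. Holding temperature roughly constant makes the apparent link vanish.

```python
import random

random.seed(0)

# Illustrative toy model: daily temperature drives BOTH tar softness
# and polio incidence; neither one causes the other.
n_days = 1000
temps = [random.uniform(0, 35) for _ in range(n_days)]        # degrees C
tar_softness = [t + random.gauss(0, 3) for t in temps]        # softens in heat
polio_cases = [0.5 * t + random.gauss(0, 3) for t in temps]   # peaks in summer

def corr(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Strong correlation between tar softness and polio cases...
print(corr(tar_softness, polio_cases))  # ~0.8: looks like a causal link

# ...but it disappears once we "control for" temperature by comparing
# only days with nearly the same temperature.
hot = [(s, p) for t, s, p in zip(temps, tar_softness, polio_cases)
       if 29 < t < 31]
print(corr([s for s, _ in hot], [p for _, p in hot]))  # ~0: no direct link
```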

In this era of more robust and carefully vetted science, when policy decisions supposedly spring from research rather than hunches, the spongy-tar theory would probably not get much traction. Even so, studies aren't always what they seem — or what they're made out to be. Sometimes they're misanalyzed, and the wrong message is gleaned from them. Sometimes they can't be reproduced by subsequent studies, or the results aren't as clear-cut as the first studies suggested. (Vitamin E, anyone?) At times there are unintended flaws or biases in the study design, and at other times the findings are misrepresented or overblown.

One of the most recent reminders of this came from a massive study of studies. As reported in August in the journal Science, a group of scientists attempted to reproduce 100 psychology studies, all of which had been published in leading journals. Their finding: More than half the time, the results couldn't be replicated, even though many of the originals were seminal studies frequently cited in other research.

The conclusion of the Reproducibility Project caused some snickering among those who had thought the “soft sciences” of psychology and sociology were sketchy fields to start with. But even research in the hard sciences, including medicine, often has similar problems. A 2012 study found that when medical research reports a “very large effect,” follow-up studies usually find a much smaller effect, or none at all.
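One standard statistical explanation for that shrinkage is the “winner's curse,” a form of regression to the mean: when many noisy studies measure a modest true effect, the ones that happen to overshoot are the ones that make headlines and get followed up, and an independent replication then falls back toward the truth. The sketch below illustrates the mechanism; the effect sizes, noise level, and selection threshold are all invented for the example.

```python
import random

random.seed(1)

# Toy model of the "winner's curse": every trial measures the same
# modest true effect with sampling noise, but only trials whose
# initial estimate looks "very large" attract a follow-up.
true_effect = 0.2
noise_sd = 0.5            # sampling error in each individual study
n_studies = 10_000

initial = [true_effect + random.gauss(0, noise_sd) for _ in range(n_studies)]

# Select the headline-grabbing results: estimates five times the truth.
winners = [e for e in initial if e > 1.0]

# An independent replication of each winner draws fresh noise.
followups = [true_effect + random.gauss(0, noise_sd) for _ in winners]

print(sum(winners) / len(winners))      # ~1.2: the published "very large effect"
print(sum(followups) / len(followups))  # ~0.2: replication shrinks to the truth
```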
