Every Sunday morning on the news/talk shows, there's usually a segment hosted by a doctor. CNN, Fox, the networks: it's pretty common. They go over the medical news. Some of it is legitimate news, such as new surgical treatments or new drugs, but more often it's the type of story we refer to as "he who" stories, or just "he whos". Just this week, we've heard that "he who" is stressed at work is at increased risk of heart attack (didn't you hear that when you were a kid?), that heavy drinkers are at increased risk of stroke (did anyone ever say heavy drinking was good?), and that children of alcoholics are at increased risk of addictions (and children of Chinese speakers are at increased risk of speaking Chinese, too).
If you're like me, you've probably had it with these stories. I might listen long enough to see if the study group was a few people or thousands, and throw it out if it's fewer than 5,000 people, but the truth is that I pretty much just hit the channel button.
Now comes an analysis that gives a better reason to ignore this sort of story: "Coffee Causes Cancer?" on Statistics.com. Perhaps you remember the story the other day that eggs were as bad for your cardiovascular health as smoking. Like the vast majority of those studies, this was a food recall study, where they ask people how many eggs they ate over several years. In the broader sense, studies that rely on observations of groups, instead of controlled experiments, are notoriously inaccurate. The worst kind are the "data mining" studies, where statistics assembled from a wide variety of observations are piled together in an attempt to extract more conclusions from them. As Statistics.com says:
Statisticians have long been wary of scientific claims based on observational data - there is too much room to "torture the data long enough, until Nature confesses," in the words of Ronald Coase, the Nobel Prize-winning economist. Fiddle with the variables and model parameters sufficiently, run enough comparisons, look at enough subgroups, and you can find statistical significance in almost any observed data. Statisticians prefer controlled experiments, but those are expensive and hard to do, and most published epidemiological research is based on observational data.
In a review of 12 of these observational studies, investigators found 52 testable "statistically significant" claims, and they designed controlled experiments to test them. Quoting again:
There were 52 "statistically significant" claims arising from the original observational studies. None replicated in the controlled randomized studies. Five actually achieved statistical significance in the opposite direction.
Hence my title. If 5 out of 52 were actually completely backwards, then the next time you hear coffee causes cancer, there's almost a 1 in 10 chance they have it exactly backwards. And it's virtually a lock that it's wrong. At least according to this study.
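If you want to see why "run enough comparisons" is such a problem, here is a rough back-of-the-envelope sketch in Python. It is my own illustration, not anything from the Statistics.com piece, and it leans on two assumptions: that each comparison is judged at the usual 0.05 significance level, and that the comparisons are independent of each other (real subgroup analyses rarely are). The function name is made up for this example. The last lines just check the "almost 1 in 10" arithmetic from the 5-out-of-52 figure.

```python
def chance_of_false_positive(num_comparisons, alpha=0.05):
    """Chance that at least one of num_comparisons tests looks 'significant'
    when there is no real effect at all.

    Assumes independent tests, each run at significance level alpha.
    Both are simplifying assumptions for illustration only.
    """
    return 1 - (1 - alpha) ** num_comparisons


# How fast the odds of a spurious "finding" grow as you keep slicing the data.
for k in (1, 10, 52, 100):
    print(f"{k:>3} comparisons: {chance_of_false_positive(k):.0%} chance of a spurious hit")

# The figure quoted above: 5 of the 52 claims came out significant in the
# opposite direction when tested in randomized trials.
reversed_claims, total_claims = 5, 52
print(f"Share of claims that were backwards: {reversed_claims / total_claims:.1%}")
```

Under those assumptions, ten comparisons already give you about a 40% chance of at least one bogus "significant" result, and 5 out of 52 works out to roughly 9.6%, which is where "almost 1 in 10" comes from.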