# More on stats in science

Kaiser Fung over at the Junk Charts blog has more to say on the article from Science News. I agree with much of what he says, but I’d like to bring out some nuances. For example:

“I kept coming back to this question as I read Tom Siegfried’s wrong-headed article (“Odds are, it’s wrong”) about the use of statistics by scientists. In it, he makes a mockery of the statistics profession, accentuates the negatives, ignores the positives while offering no useful alternatives.”

It’s true that the article uses inflammatory language, such as calling statistics “a mutant form of math”, but I think Siegfried’s concerns have more to do with scientists misusing statistics, and with poor statistical education, than with any failure of statisticians or the (modern) statistics profession. Since Fung is a statistician I can see why he takes issue with the article, but as someone who has worked with scientists with a very limited understanding of statistics, I can see why Siegfried’s assessment of *statistics as practiced by everyday scientists* has a ring of truth to it.

Statisticians have been saying to move away from p values and emphasize effect size, and not to pun on “significant” for a while now. Modern statistical researchers work with sophisticated modeling tools all the time. I think the big problem we have in science is that basic statistics education does not address these modern statistical ideas at all, and scientists only get basic statistical education.
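To make the p value versus effect size distinction concrete, here is a minimal sketch. It assumes a two-sample z test with a known standard deviation, and the numbers (a difference of 0.02 standard deviations, the three sample sizes) are made up for illustration. The effect size stays fixed while the p value collapses as the sample grows:

```python
import math

def two_sample_z(mean_diff, sd, n_per_group):
    """Return (effect size d, z statistic, two-sided p value).

    Assumes two equal-sized groups with a known, common standard deviation.
    """
    d = mean_diff / sd                        # standardized effect size
    se = sd * math.sqrt(2.0 / n_per_group)    # standard error of the difference
    z = mean_diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal tail probability
    return d, z, p

# Illustrative numbers only: a tiny effect measured with growing samples.
for n in (100, 10_000, 1_000_000):
    d, z, p = two_sample_z(mean_diff=0.02, sd=1.0, n_per_group=n)
    print(f"n per group = {n:>9,}  d = {d:.2f}  z = {z:6.2f}  p = {p:.3g}")
```

The same negligible effect goes from "not significant" to overwhelmingly "significant" purely because of sample size, which is why reporting the effect size matters.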

If you are a productive, professional, grant-funded scientist today, you are probably about 50 years old. You went to graduate school in the 1980s. When you learned about statistics, computers were just about fast enough that they could sort of keep up with your typing speed over the 1200 baud modem that connected you to the university mainframe. The idea of running a 10,000-iteration MCMC sampling scheme on a partially nested 4-level model with half a million observations was something Andrew Gelman was maybe just dreaming about, and if he was trying it out he was certainly writing custom FORTRAN code to do it.

Even friends of mine doing postdocs now who went to grad school in the 1990s or early 2000s took a statistics course that started out saying something like “statistics can never prove a hypothesis, so we can only specify a null hypothesis and then check whether the observed values have a very small probability of occurring under that null hypothesis”. Such a course would go on to teach you about t-tests, ANOVA and F tests, and some kind of basic linear regression. You’d learn to read the tables of the Normal distribution, and if you were lucky you’d learn a little bit about plotting data.

Within the statistics community, things have moved on: we have people doing research on causal inference, and modern computing tools let us work more on specifying the model in a way that is meaningful for the scientific question at hand. But among day-to-day biologists, physicists, ecologists, and the like, much of the statistical analysis is still basically focused on shoehorning the question into one of the 7 canned routines that “the stats guy” in the lab has coded up in an Excel spreadsheet. And it’s *true* that those canned routines are pretty much not what we want them doing, that the p values that come out of them are misleading, and that, as a result of asking the wrong statistical question, many of the basic background scientific findings are pretty much wrong.

Science can no more move away from statistics than it could move away from collecting data, and I think this is Kaiser Fung’s point. But Siegfried’s point is that in the absence of good quality statistical thinking, getting a couple of p values out of some canned routines and calling it good is leading to a lot of poor science, and having seen it in action, I have to agree.


I’m just an econ grad student, not a statistician, but I did find Siegfried’s article unconvincing – while he certainly wrote about some relevant problems with statistics as practiced by scientists, he then incorrectly concluded that the whole frequentist approach is wrong (and the bombastic tone of the article didn’t help either).

It seems to me that the problem is not with p-values, hypothesis testing and the frequentist approach per se. The problem is that scientists often misinterpret these results for something they are not – in other words, they don’t even “get” those things that are taught in the basic statistics course. If that is the case, then teaching them even more fancy and sophisticated methods will hardly be a solution.

I actually think that you’re right that “they don’t even get those things that are taught in the basic statistics course”, but I think this is not because scientists can’t understand statistics. Rather, the things they learn are not really relevant to the questions they actually ask. So they try to fit the few concepts they learned into a broader context where they understand the science and the questions very well but can’t formulate them in statistical language.

For example, suppose you’re an ecologist collecting data on a wetland habitat. You have measurements like latitude and longitude, water pollutant concentrations, population counts of frogs, counts of deformed/intersex frogs, regional information about agricultural practices, and you have all of these things on a time series over 5 years.

You know, from biochemical lab experiments, that there is a nonlinear relationship between concentration and time of exposure and probability of a frog being intersex.

If your concept of statistics is that you have to formulate a null hypothesis and pick a canned test and figure out whether your data could arise by chance under that hypothesis, you have a hard time figuring out how to use statistics.

On the other hand, if you’ve got a more nuanced idea of what statistics *IS* then you can start to graph relationships between variables, put your lab data into a useful context, and formulate a model for the spatial variability of exposure to chemicals and the effect of those chemicals on intersex development. Statistics becomes the tool to investigate the variability related to a simple deterministic mathematical model and to see how well that model predicts reality.

These kinds of questions just can’t be answered by simple t tests and the like, but they are questions that scientists naturally formulate, and if they haven’t seen that statistics is more than a collection of canned tests… then they are doomed to misuse those canned tests.
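As a rough illustration of the frog example, here is a minimal sketch of what "fit a model to the question" might look like. Everything in it is an assumption made for illustration: the saturating form `1 - exp(-k * concentration * years)`, the parameter `k`, the site values, and the simulated counts. None of it is real data or a real ecological model, and the grid search is just a crude stand-in for a proper fitting routine:

```python
import math
import random

def p_intersex(k, conc, years):
    """Hypothetical exposure model: risk rises with concentration x time, then saturates."""
    return 1.0 - math.exp(-k * conc * years)

def log_likelihood(k, sites):
    """Binomial log-likelihood of the observed counts (constant terms dropped)."""
    total = 0.0
    for conc, years, n_frogs, n_intersex in sites:
        p = p_intersex(k, conc, years)
        total += n_intersex * math.log(p) + (n_frogs - n_intersex) * math.log(1.0 - p)
    return total

def simulate_sites(k_true, rng, n_frogs=200):
    """Fake survey data: one record per (concentration, years of exposure) combination."""
    sites = []
    for conc in (0.5, 1.0, 2.0, 3.5, 5.0):
        for years in (1, 2, 3, 4, 5):
            p = p_intersex(k_true, conc, years)
            n_intersex = sum(rng.random() < p for _ in range(n_frogs))
            sites.append((conc, years, n_frogs, n_intersex))
    return sites

def fit_k(sites):
    """Pick the k on a coarse grid (0.001 to 0.300) that maximizes the likelihood."""
    grid = [i / 1000.0 for i in range(1, 301)]
    return max(grid, key=lambda k: log_likelihood(k, sites))

rng = random.Random(1)                      # fixed seed so the sketch is repeatable
sites = simulate_sites(k_true=0.05, rng=rng)
k_hat = fit_k(sites)
print(f"true k = 0.05, estimated k = {k_hat:.3f}")
```

The point is not this particular model but the shape of the workflow: the scientific knowledge (a nonlinear dose and time relationship) goes into the model, and the statistics estimates its parameters and shows how well it matches the counts.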

I studied statistics in the era roughly from Friden mechanical calculators and computer punchcards to handheld calculators that couldn’t do square roots, long before the introduction of the 8″ floppy disk and CP/M. What they banged into our heads was to always consult a statistician _before_ starting to collect data. That was the summary, every time, in every course that mentioned statistics.

So the teachers weren’t doing badly at the time. But back then, doing statistics was expensive and time-consuming and we really, really didn’t want to waste our effort doing it wrong and having to redo it.

What I see happening now — particularly at the “septic edge of the bogusphere” — is a vast wave of people brand new to the notion that statistical claims are powerful debating points, who’ve never taken a Stat 101 class, never had a clue about the issues, and have been given computers with programs on them that churn out what claim to be results.

They get hold of raw data files — from the comfy convenience of their chair — and “do statistics” on them. And if they don’t like the picture that comes up on the screen, why, they can just “do some other statistics” — and keep trying things til they see a chart they like. Then they post it and copypaste like mad.

And they don’t even see the irony and fakery in playing with the tools to make pictures they like.
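A back-of-the-envelope sketch of why "keep trying things" works so reliably: assuming each analysis is an independent test at the 5% level on data with no real effect (a simplification, since real re-analyses of one data file are correlated), the chance that at least one of them looks "significant" grows quickly with the number of attempts.

```python
def chance_of_false_hit(n_analyses, alpha=0.05):
    """Probability that at least one of n independent null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_analyses

for m in (1, 5, 20, 50):
    print(f"{m:>2} analyses -> {chance_of_false_hit(m):.0%} chance of at least one 'significant' result")
```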

They sound convincing to people who don’t understand the tool, and outshout anyone with a real sense of how to use the tool because they lack any doubt. Their big problem is self-esteem; they have far too much of it.

When any tool that’s been used by trained professionals but was too expensive to misuse suddenly is cheaply available to anyone who can go out and buy one, we start finding out just how many heretofore undiscovered ways there are to misuse, misdirect, and break it.

Statistics can be a very useful tool.

Duck and cover.

“I think the big problem we have in science is that basic statistics education does not address these modern statistical ideas at all, and scientists only get basic statistical education.”

So what course progression, or set of textbooks, would you recommend to cover these modern statistical ideas?

It’s a good question. I don’t think I’ve seen an intro stats textbook that would do what I’d want. Perhaps some day I’ll have time to write one? I think I’d want to see the following in it:

I think this would be a 2 semester course, and it should be required for everyone in 3rd year of undergraduate scientific education in Biology, Chemistry, Physics, Engineering, Social Sciences (Econ, Psych, and Poly Sci at least).

Daniel: I agree with your response but not with the diagnosis. When I read the Siegfried article, I just can’t believe he’s talking about misapplication.

Your syllabus would make a good second course; it’s too technical for a first exposure to statistics. On this topic, my view is somewhat idiosyncratic. My ideal first course is strictly on concepts, rather than methods or techniques. (Not surprising that I just wrote a book on concepts.) For instance, your #7 can be taught without first digging into technical details.

I also think we can’t solve a problem of misapplication by introducing a different methodology. If the existing method is being misapplied, why would the new method not get misapplied too?

Kaiser: thanks for your reply. I think Siegfried’s article is sensationalistic on purpose, and that he himself is probably not that sure about whether the problems are misapplication or just fundamental. As people more familiar with what statistics is, I think we know that you can’t get away from statistics, and that many of the problems he talks about are caused by misapplication, so we have to crank the dialog up a notch above the sensationalistic magazine level. What things that he talks about are fixable? Why do they happen? etc. I guess I’m responding more to that than to his article’s specific individual bits of content.

I didn’t exactly mean to imply that the order of my syllabus was fixed. Perhaps I should have used an unnumbered list. In particular group 7 (philosophy of modeling) would be something that in every chapter or section I would revisit in a new context.

I agree that concepts can be taught without first knowing the technical details. In my mind, a college level course in statistics SHOULD be a second course. The first course in statistics should be at least a semester in high school. We don’t expect science majors to arrive at University without having taken algebra, and they shouldn’t be coming to University without at least having seen binomial, Poisson, and normal random variables, the sampling distribution of the mean, and some basic statistical ideas like that.

Now… in reality at the moment, that’s not the case, so I’m not sure what to do about that.

As for the misapplication and alternative methods, I think one of the main problems with misapplication is that scientists don’t think about statistics as something that is inherent in the scientific enterprise. To them, statistics is something that some magic software does to your data to let you release it to the world with a clear conscience… Because of that, scientists don’t frame their questions inside a reasonable framework that includes the idea of variability and the possibility of hidden causes. Scientists are very smart about their subject matter, so they have questions they want to answer, but they don’t have the tools to evaluate basic statistical questions that come along with their scientific questions.

Most scientists think the ideal set of experiments involves carefully, one by one, changing a single variable and evaluating its effect on their system of interest. Most of the time you can’t really do that, and even if you can do that, it’s not an efficient use of information gathering resources compared to blocking combined with randomization.
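For readers who haven’t seen it, blocking combined with randomization can be sketched in a few lines. This is a minimal example of a randomized complete block design; the block names (lab benches) and treatment labels are hypothetical. Every treatment appears once in every block, and the run order within each block is shuffled independently:

```python
import random

def randomized_block_design(blocks, treatments, rng):
    """Assign every treatment once per block, in an independently shuffled order."""
    plan = {}
    for block in blocks:
        order = list(treatments)   # copy so each block gets its own permutation
        rng.shuffle(order)
        plan[block] = order
    return plan

rng = random.Random(42)            # fixed seed only so the printed plan is repeatable
plan = randomized_block_design(
    blocks=["bench_A", "bench_B", "bench_C"],                 # hypothetical blocks
    treatments=["control", "low_dose", "high_dose", "heat"],  # hypothetical treatments
    rng=rng,
)
for block, order in plan.items():
    print(block, order)
```

Because every treatment is compared within each block, block-to-block differences (a drafty bench, a different day) cancel out of the comparison, and the shuffling guards against hidden trends in run order.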

I’m thinking especially of laboratory and field studies in Physics, Chemistry, and Biology. Most scientists in these fields think along the lines of the Ernest Rutherford quote that Gelman put up… “If your experiment needs statistics, you ought to have done a better experiment.”

My wife’s favorite saying is “Biologists use statistics like a drunkard uses a lamppost – for support rather than illumination”. With a better general conceptual background, and some practical methodology that can be used right away in their everyday lab experience, they can start to think about their scientific questions directly in statistical context. Meshing the scientific question with the concept of variability and hidden causes and parameter estimates will lead to less cognitive disconnect, and that will lead to less misapplication, I hope.

Absolutely agree with the cookbook attitude towards statistics. It goes back to how statistics is taught, and I think part of the blame falls on us. For instance, a “design of experiments” course, which would go a long way toward addressing the issue you raised above, is typically taught as if it were a linear algebra course. There is a lot more behind the math, and especially when it comes to real world applications, there is even more to learn. Such a course for scientists (at the intro level) should dispense with all the math, assume that software is available to do matrix computations, and spend all the time dealing with design issues, constraints and interpretation.

I totally agree about the problems with teaching statistics that you mention. Dwelling too long on the details of the math is one of the paramount problems in introductory classes in mathematical topics in general. Too much of statistics (and applied mathematics) is about HOW and not enough is about WHY. Starting with the problem and working toward the math is always better than starting with some math and looking for a problem you can bolt it onto.

In many areas of applied mathematics, there are some scientifically useless models that are out there just because some mathematicians know how to solve them.

That being said, eventually it is also good to know how the math works. But I think it’s ok to treat the mathematical techniques like black boxes early on, and then open those boxes later as the interior workings become more important.