Shorthand terminology and personification of assumptions

2016 May 30
by Daniel Lakeland

In philosophical discussions we often use a kind of shorthand that can be confusing. In many cases, we "personify" particular assumptions: we imagine a person who holds those assumptions as true and talk about what such a person would also logically have to believe. For example:

"A Frequentist has to believe that a moderate sized sample will fill up the high probability region of the sampling distribution."

or

"A Bayesian believes that if a moderate sized sample doesn't fill the data distribution it's not a problem for the model."

These and other statements like them are really shorthand for more complicated sentences such as:

"Once you assume that your distribution is the long run frequency distribution of IID repeated trials, this also implies that a moderate sized sample will have data points throughout the high probability region, as can be seen in simulations."

or

"The logic of Bayesian inference does not require that we believe the data distribution is the frequency distribution, so if there is a portion of the high probability region which isn't occupied by any data in a moderate sized sample, it doesn't invalidate the basic assumptions of the model."
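
The "as can be seen by simulations" part of the first expanded sentence can be checked with a short sketch. The Python below is an illustration only, not code from the original post: the function name `fraction_with_gaps`, the choice of a standard normal, the region [-2, 2), and the 8 bins of width 0.5 are all assumptions made for the sake of the example. It draws many IID samples and reports how often a sample leaves at least one bin of the high probability region empty.

```python
import random


def fraction_with_gaps(n_points=100, n_trials=2000, lo=-2.0, hi=2.0,
                       n_bins=8, seed=0):
    """Fraction of simulated IID standard-normal samples that leave
    at least one histogram bin in [lo, hi) empty.

    All defaults are illustrative assumptions, not values from the post.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    width = (hi - lo) / n_bins
    gappy = 0
    for _ in range(n_trials):
        counts = [0] * n_bins
        for _ in range(n_points):
            x = rng.gauss(0.0, 1.0)
            if lo <= x < hi:
                # Guard against floating-point rounding at the upper edge.
                idx = min(int((x - lo) / width), n_bins - 1)
                counts[idx] += 1
        if min(counts) == 0:
            gappy += 1
    return gappy / n_trials


if __name__ == "__main__":
    print("n=100:", fraction_with_gaps(n_points=100))
    print("n=10: ", fraction_with_gaps(n_points=10))
```

Under these assumed settings, a sample of 100 points should only rarely leave a bin empty, while a sample of 10 points should leave gaps most of the time. Changing the bin width or the region changes the numbers, so the output is best read as illustrative rather than as a test of any particular model.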

Real people are of course free to hold additional assumptions beyond the basics (for example, a person doing a Bayesian analysis might actually want to fit a frequency distribution, in which case a model with gaps in the sampling would be considered wrong), or to acknowledge that their choice of distribution is intentionally approximate or regularized in a way that doesn't fully satisfy the basic assumption. But it's a useful construct to think about a person who takes their model at literal face value and then see what else that model logically implies they should believe. Doing so helps to detect when additional assumptions are needed and what they might be, and it also helps to detect when the basic assumptions of a model are contrary to reality.

As such, when talking about "Frequentists" vs. "Bayesians" I'm really talking about what the math associated with pure forms of those kinds of analyses implies, not what particular real people actually think.

So much of Statistics is taught as formulas, procedures, heuristics, calculation methods, techniques, etc. that many people (myself included, up to a few years ago) never really see an overarching organizing principle. Discussing those organizing principles explicitly can be very useful to help people with their future analyses.

3 Responses
  1. Christian Hennig
    May 31, 2016

    Fair enough. I hope you don't think that this was the only issue we had in the previous discussion regarding the meaning of "frequentism" and "Bayesian Stats".

  2. Andrew Jebb
    June 1, 2016

    "A Frequentist has to believe that a moderate sized sample will fill up the high probability region of the sampling distribution." Is sampling distribution the right term here? How can data fill up the sampling distribution?

    • Daniel Lakeland
      June 1, 2016

      For example, if you believe that the distribution is approximately normal with mean 0 and standard deviation sigma, then a sample of 100 points won't have 80 of them negative and 20 positive, and if you make a histogram there won't be obvious "gaps" where the samples failed to fall (within the margin of sampling noise calculated from frequentist considerations).
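
A rough way to put a number on the 80/20 example above, offered as a sketch rather than as part of the original reply: if the points are IID from a distribution symmetric about 0, each point is negative with probability 1/2, so the count of negative points is Binomial(100, 1/2). The helper name `prob_at_least` is made up for this illustration.

```python
from math import comb


def prob_at_least(n, k):
    """P(X >= k) for X ~ Binomial(n, 1/2).

    The sum is done in exact integer arithmetic and divided once at the end.
    Assumes IID draws, each negative with probability 1/2.
    """
    favorable = sum(comb(n, j) for j in range(k, n + 1))
    return favorable / 2 ** n


if __name__ == "__main__":
    # Chance that 80 or more of 100 points land below zero.
    print(prob_at_least(100, 80))
```

Under those assumptions the probability of 80 or more negatives out of 100 comes out far below one in a million, which is the sense in which such a sample "won't" occur.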
