# On Flat Priors

Regular readers of my blog will know that I'm a fan of IST nonstandard analysis. It fits so well with model building. So, I thought I'd clear up how I interpret "maximum likelihood". If you write down a likelihood for data $D$ with parameter vector $q$, which we'll notate as $p(D|q)$, and you maximize it without putting any prior on $q$, clearly the maximum occurs at the same place independent of multiplication by any positive constant $C$.

Now suppose, for ease of exposition, that $q$ is just a single parameter. In nonstandard analysis, we can choose a nonstandard integer $N$ and create a prior for $q$ which is $1/(2N)$ for any $q$ value in the range $[-N, N]$ and zero otherwise. Clearly this is a constant for all standard values of $q$.

Now, do the maximization of this new nonstandard posterior, and provided that your maximum likelihood method picks out a limited value for $q$, it picks out the same $q$ as your maximum likelihood method without this nonstandard prior. It's clear that multiplying by this prior couldn't have any effect on the maximum point.
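Here is a rough numerical sketch of that claim. A computer has no nonstandard integers, so a very large finite `BIG_N` stands in for the nonstandard one. The data, the Normal(q, 1) model, and the grid search are all assumptions made up for illustration, not anything special about the argument.

```python
import math

# Hypothetical data; the model assumes each observation is Normal(q, 1).
DATA = [1.2, 0.7, 2.1, 1.5, 0.9]

def log_likelihood(q, data=DATA):
    """Log of p(D | q) for independent Normal(q, 1) observations."""
    return sum(-0.5 * (x - q) ** 2 - 0.5 * math.log(2 * math.pi) for x in data)

def log_flat_prior(q, big_n):
    """Log of the prior 1/(2N) on [-N, N]; minus infinity outside that range."""
    if -big_n <= q <= big_n:
        return -math.log(2 * big_n)
    return -math.inf

def argmax_on_grid(log_f, lo=-10.0, hi=10.0, steps=20001):
    """Return the grid point in [lo, hi] at which log_f is largest."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=log_f)

# Finite stand-in for a nonstandard integer (an assumption of this sketch).
BIG_N = 10 ** 100

q_mle = argmax_on_grid(log_likelihood)
q_map = argmax_on_grid(lambda q: log_likelihood(q) + log_flat_prior(q, BIG_N))

print(q_mle, q_map)
```

Adding the log prior shifts every value on the grid by the same constant, so the two searches land on the same grid point, which here sits next to the sample mean of the data.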

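Is this "legitimate?" Well, let me ask you a totally equivalent question. Are integrals legitimate? Because the integral of a continuous function $f$ over a region $[a, b]$ can be defined as follows: let $N$ be a nonstandard integer, $dx = (b-a)/N$, and $\mathrm{st}$ be the standard part function, then

$$\int_a^b f(x)\,dx = \mathrm{st}\left(\sum_{i=1}^{N} f(a + i\,dx)\,dx\right)$$

and if you don't like the Riemann integral, you can do the Lebesgue integral instead; in that case you need to evaluate $f$ at standard locations:

$$\int_a^b f(x)\,dx = \mathrm{st}\left(\sum_{i=1}^{N} f(\mathrm{st}(a + i\,dx))\,dx\right)$$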
Both of these ideas are mathematical constructs in which nonstandard numbers are used to define a mapping from a standard thing to another standard thing. So, maximum likelihood, when it gives a unique maximum, is just maximum a-posteriori for a nonstandard posterior with a nonstandard flat prior, the same way that the integral of a function is just the standard part of a nonstandard sum.
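To make the integral half of the analogy concrete, here is a small sketch of a sum of `f(a + i*dx) * dx` terms with `dx = (b - a) / n`. A finite `n` can only approximate what a nonstandard integer followed by the standard part would deliver exactly, so the sketch just watches the error shrink as `n` grows. The function name `riemann_sum` and the choice of `sin` on `[0, pi]`, whose integral is exactly 2, are illustrative assumptions.

```python
import math

def riemann_sum(f, a, b, n):
    """Sum f(a + i*dx) * dx for i = 1..n, with dx = (b - a) / n."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx

# The integral of sin over [0, pi] is exactly 2, so the error can be measured.
errors = [
    abs(riemann_sum(math.sin, 0.0, math.pi, n) - 2.0)
    for n in (10, 100, 1000, 10000)
]

for n, err in zip((10, 100, 1000, 10000), errors):
    print(n, err)
```

Each tenfold increase in `n` shrinks the error, which is the finite shadow of taking the standard part of the sum for a nonstandard `N`.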

It's not like this isn't a known thing, that maximum likelihood is the same as maximizing the Bayesian posterior with a "flat" prior. But usually that's taken as a kind of "intuition" because there *is no* (standard) flat prior on the real line. Well, there *is no* standard $dx$ value either, but that doesn't keep us from using nonstandard values to define an integral, and it doesn't keep us from using nonstandard priors to define a standard posterior either.

Of course, if you pick some likelihood that can't be normalized... then we're talking about a different story. You should probably rethink your model.