# Floating point arithmetic and probability

Kit Joyce asks about how floating point arithmetic interacts with random number generation

I thought about this for a while, and I think this is a good place to show the utility of concepts from nonstandard analysis and Internal Set Theory in particular. Here's how my reasoning goes.

First off, rather than using measure theory, we'll imagine the process of choosing a random number uniformly in the interval [0,1] as choosing one of a finite but nonstandard number of rational numbers, and then taking the standard part. So, we'll divide up the interval [0,1] into a bunch of possible outcomes as follows:

where for some nonstandard N, and so we are essentially choosing a random integer between 0 and N and then mapping that process into the interval [0,1]. To choose a standard number you take the standard part of .

Now, what's the probability of getting a value between where and both are standard numbers in [0,1]? It's where is the number of possible outcomes that result in a standard number between a and b. Since there is one different outcome every interval then . If you select a nonstandard number outside the interval [a,b] that is infinitesimally close to a or infinitesimally close to b you will also "round off" via the standard part to a or to b, so hence the symbol. Still because the number of outcomes in the halo of a or b outside the interval [a,b] is what I'm calling K and it is infinitesimal relative to N, so .

Now, I think this maps pretty directly to the idea of generating a random number say a 64 bit integer which maps then into rational numbers between 0,1 that are apart, and then the "standard part" operation is modeled by the "roundoff to nearest floating point" operation. And, so long as is big, the roundoff error doesn't introduce any problems, and that's true whether or not the floating point numbers between a,b are evenly distributed or unevenly distributed.

So, when do we have a problem? Mainly when we're trying to figure out the probability of landing in some region that is small relative to machine epsilon of the floating point numbers. For example, suppose you're trying to model the extreme tail of the normal distribution, by generating floats in the range 0.99999999999 to 1.0 and then transforming them by the inverse cumulative distribution to their normal quantiles... you are probably doing something wrong! If you want to generate normal variates in the extreme tail, better to do something like rejection sampling where you're generating the deviates directly in the outcome space and don't have big gaps caused by the roundoff errors in the process of generating numbers between 0.99999999999 and 1.0

Thanks Daniel, your second last paragraph, using a 64 bits to represent the numbers spaced 2^-64 apart on [0,1] was definitely the conceptual step I was missing. And I'll keep learning about nonstandard analysis; it seems like a really useful tool for thinking about a lot of issues in probability theory.