Feynman on Physical Law (apropos of Cox's Theorem)
I know, I know, everyone either loves Feynman or finds him pompous... but I do think this clip has something useful to say about what makes science different from all the other ways we might try to make sense of the world. And, in doing so, it helps me make the point about what part Cox's theorem and Bayesian probability theory play in science.
Guess is more or less his starting point. From a formal perspective this means we somehow specify a formal mathematical model that predicts the outcomes of experiments. We might also specify several competing models. We need to write down these models in a formal system; in modern terms, we need to be able to program them unambiguously into a computer. The basis of our formal systems is Set Theory.
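To make the "program it unambiguously into a computer" point concrete, here is a minimal sketch (in Python, with a made-up physical setup) of what two competing guesses can look like once written down formally: each one is just a function from quantities to predicted outcomes. The drag-law forms below are my own illustrative choices, nothing from the clip.

```python
import numpy as np

# Two competing "guesses" for how a projectile's speed decays with time,
# each written down unambiguously as a function from quantities to predictions.

def model_linear_drag(t, v0, k):
    """Guess 1: drag proportional to v, so speed decays exponentially."""
    return v0 * np.exp(-k * t)

def model_quadratic_drag(t, v0, k):
    """Guess 2: drag proportional to v^2, giving a 1/(1 + v0*k*t) falloff."""
    return v0 / (1.0 + v0 * k * t)
```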
Compute the Consequences: this is more or less straightforward if you are given numerical values for certain quantities; the heavy lifting here is done by the theory of functions, computation, calculus, differential equations, numerical analysis, etc. But we need those numerical values of the quantities. The Bayesian solution is to specify a state of knowledge about what those quantities might be, a prior distribution, in which some values have higher probability than others because we think those values are more reasonable (not more common under repetition, just more reasonable in this particular case).
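Continuing the toy example above, a prior is just such a state of knowledge written down explicitly; the lognormal shapes and scales here are assumptions made purely for illustration. With a prior in hand, the consequences can be computed even before the quantities are known, by pushing prior draws through the model.

```python
import numpy as np
from scipy import stats

# A state of knowledge about the unknown quantities, before seeing any data.
prior_v0 = stats.lognorm(s=0.5, scale=10.0)  # initial speed: plausibly around 10 m/s
prior_k = stats.lognorm(s=1.0, scale=0.1)    # decay constant: plausibly around 0.1 per s

# Consequences under uncertainty: predictions for many "reasonable" (v0, k) pairs.
# (model_linear_drag is defined in the sketch above.)
t = np.linspace(0.0, 20.0, 50)
prior_predictions = np.array([
    model_linear_drag(t, prior_v0.rvs(), prior_k.rvs()) for _ in range(1000)
])
```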
Compare Predictions to Experiment: to a Bayesian this means seeing which values of those quantities are needed to make the outcomes of the experiment "close" to the predictions, and, not only that, seeing how high in the probability distribution around the prediction the outcomes land (how precisely the best version of the model predicts reality). When there are several models, the ones that put the observed outcomes in a high-probability region get higher post-data probability; that is, we automatically end up putting probabilities over our several models based on how well they predict.
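Here is a rough sketch of that comparison, continuing the toy example; the measurements and the Gaussian spread sigma are made up purely for illustration (more on the choice of p(Data | Params) below). Each model's predictions are averaged over its prior, and the models then get post-data probabilities in proportion to how probable they made the observed outcomes.

```python
import numpy as np
from scipy import stats

t_obs = np.array([1.0, 3.0, 6.0, 10.0])  # hypothetical measurement times
v_obs = np.array([8.8, 7.1, 5.0, 3.1])   # hypothetical measured speeds
sigma = 0.5                               # assumed measurement spread

def p_data_given_model(model, n=20000, seed=0):
    """Monte Carlo average of p(Data | Params, model) over the prior."""
    rng = np.random.default_rng(seed)
    v0 = prior_v0.rvs(size=n, random_state=rng)  # priors from the sketch above
    k = prior_k.rvs(size=n, random_state=rng)
    pred = model(t_obs[None, :], v0[:, None], k[:, None])
    return stats.norm(pred, sigma).pdf(v_obs).prod(axis=1).mean()

evidences = np.array([p_data_given_model(m)
                      for m in (model_linear_drag, model_quadratic_drag)])
post_model_probs = evidences / evidences.sum()  # equal prior weight on each guess
```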
What parts are special to science? Certainly guessing is not unique to science; the history of Greek, Norse, Native American, and other mythologies shows that we like to guess a lot.
Computing the Consequences isn't unique to science either; Acupuncture / Chinese Medicine is largely a matter of guessing explanations in terms of Qi, hot vs. cold foods, meridian lines, and so forth... and then computing which spices or herbs or special tinctures or needle locations are recommended by the model.
Compare Predictions to Experiment is really the essence of what makes science special, and in order to do this step, we need a meaning for "compare". The Bayesian solution is to force the model builder to use some information to specify how well the model should predict. In other words, what's the "best case"? Specifically, the quantity p(Data | Params) should be taken to be a specification of how probable it would be to observe the Data as the outcomes of experiments if you knew *precisely* the correct quantities to plug into Params. The fact that we don't put delta functions over the Data values reflects our knowledge that we *don't* expect our models to predict the output of our instruments *exactly*.
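As a sketch of that last point (same toy setup as above, and again a Gaussian form chosen purely for illustration, not the only possible choice): p(Data | Params) gets written down with a finite spread, never as a delta function, because even the correct Params are not expected to reproduce the instrument readings exactly.

```python
from scipy import stats

def log_p_data_given_params(data, t, v0, k, model, sigma):
    """log p(Data | Params): how probable the observed data would be
    if (v0, k) were *precisely* the right quantities to plug in."""
    pred = model(t, v0, k)  # the model's best-case predictions
    # Finite spread sigma, not a delta function: we don't expect exact agreement.
    return stats.norm(pred, sigma).logpdf(data).sum()
```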
So, what do Cox's axioms teach us? It's really just that if you want to use a real number to describe a degree of plausibility about anything you don't know, and you want it to agree with the Boolean logic of YES vs. NO in the limit, then you should do your specifications of degrees of plausibility using probability theory.
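Concretely (this is just the standard statement of the result, glossing over the usual rescaling caveats): any such plausibility assignment must be equivalent to one obeying the product rule, p(A and B | C) = p(A | C) p(B | A, C), and the sum rule, p(A | C) + p(not A | C) = 1. Bayes' theorem, and with it the whole compare-predictions machinery above, is just the product rule read in both directions.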
Cox doesn't tell you anything much about how to Guess, or how to Compute The Consequences (except that you should also compute the probability consequences in a certain way), but it does have a lot to say about how you should Compare Predictions to Experiment.