Bayesian Uncertainty About Objective Functions in Optimization
Suppose we'd like to design some system to perform in an "optimal" way. One intuition I've often had is that "Optimal" is rarely worthwhile in real situations. It is difficult to specify a criterion in an exact enough way which makes the "optimal point" under your criterion actually "optimal" in any sense. Instead, it's better to settle for "Good" or if you're lucky "Pretty Darn Good". In other words, once you've climbed most of the way up the mountain, it really doesn't make much difference if you are standing on the very top, especially if your surveys can't really tell you which of the little peaks really is the highest one.
With that in mind, suppose we have a Bayesian model for the probability that some particular tradeoff function is truly our "favorite" tradeoff. To make it semi-concrete, suppose we have to choose amounts to consume between two quantities, say Beer (B) and Nachos (N). We have a certain amount of money, all of which we are going to spend. We can spend it in such a way that so that there is a line in space that defines the reachable region.
Now in simple micro-economics, the solution is to specify a utility function and maximize subject to the constraint on .
But, in reality, we don't know how to trade-off , there is no utility function. At best, we can sort of guess at how happy we'd be with any given position on the line.
Instead, suppose we specify some probability distribution over different , . This is a distribution over functions that specifies the region of space (say coordinates in a Hilbert space) where we think our true preferences lie. But it's now impossible to say with certainty what exact values of are best. We could try to maximize expected utility, we could maximize the utility under the most-probable , we could do a lot of things, but another way to think about this is that we could choose a range of which are in the high probability region, and find a range of points that are all "good" relative to all the in that region, and then any of those points are going to be good choices.
My intuition is, generally in real problems, this version of the problem is going to be easier to solve than finding the one-true-optimum point of a well specified . There are two issues.
Often in applied mathematics, the utility function is chosen to make things relatively easily solvable. Such a utility function doesn't necessarily incorporate all the things we know about the real problem. Once we do incorporate all those things, we'll often have a non-smooth, weird utility function (think of making investment decisions under accurate tax-code rules, where there are weird break-points and eligibility requirements, and soforth).
On the other hand, putting probability in the mix, we can't be sure of any "one true value" we can now compare "good values" across "good models". There are two scales for error, one is the difference between a "good" value and "the optimal" value for a given utility function, and then there is the difference between "the optimal" value under one utility function and "the optimal" value under another. It makes sense to stop trying to find "the optimum" of a given utility once the scale of your search-progress gets down below the scale of the differences between different utility functions. I think often solving a bunch of related optimizations in a very approximate way will give you a lot more information about what the good values are than misguiding yourself into believing you've specified a real utility function and then solving that (wrong) optimization to perfection.