Stochastic PDE and Chaos Expansions part 2

2010 January 17
by Daniel Lakeland

Let's recall where we left off last time:

We have some differential operator, and if we put some function u into the operator we get a residual R(u), which is a function of space that we'd like to be 0 uniformly, or, since we're doing approximations (which we always are), at least to have a 'small' norm.
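To make that concrete (an illustrative example of mine, not carried over from part 1): for a 1-D bar with axial stiffness EA(x), displacement u(x), and distributed load w(x), the equilibrium equation and its residual would be \frac{d}{dx}\left(EA(x)\frac{du}{dx}\right) + w(x) = 0 and R(u)(x) = \frac{d}{dx}\left(EA(x)\frac{du}{dx}\right) + w(x). The exact solution makes R(u) vanish everywhere; an approximate u just makes some norm of R(u) small.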

Suppose, however, that there's something about the operator we're not exactly sure of. If the operator models a physical process then uncertainty arises naturally due to lack of information. From a Bayesian perspective, for example, we can use randomness to describe our lack of information about the stiffness of a bar, which may change from place to place, or the magnitude of a load as a function of position, which may have various shapes or a different total force from just the "average" value. For a material like concrete the aggregate is randomly incorporated into the matrix, and the strength and size of the aggregate are randomly distributed quantities. In other words, the process of generating the residual R(u) has randomness incorporated into it, even if u is a fixed function.

Now a random function of space can be represented approximately by a sequence of coefficients with respect to some basis. These coefficients have a joint distribution, and the joint distribution of coefficients describes the distribution of shapes of the function. Sampling from the distribution of the coefficients and building a function from the basis functions gives you a sample function. If the basis functions diagonalize the covariance operator of the random function then you have the Karhunen-Loève expansion, and the individual coefficients are uncorrelated (but not necessarily independent). For a typical real-world situation you're not likely to get the easiest case, where each coefficient is normally distributed and uncorrelated, and hence independent (the sketch below shows that easy case just to fix ideas). So what do we do to represent the coefficients?
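A minimal numerical sketch of the Karhunen-Loève idea (my own illustration; the squared-exponential covariance and correlation length are made up): discretize the domain, eigendecompose the covariance matrix, and the eigenvectors play the role of the basis functions while the eigenvalues set the variance of each coefficient.

```python
import numpy as np

# Discretize the spatial domain and build a (made-up) covariance matrix
# for the random function: a squared-exponential kernel.
x = np.linspace(0.0, 1.0, 200)
ell = 0.2                                   # assumed correlation length
C = np.exp(-(x[:, None] - x[None, :])**2 / (2 * ell**2))

# Discrete Karhunen-Loeve: eigenvectors of the covariance are the basis
# functions, eigenvalues are the variances of the (uncorrelated) coefficients.
lam, phi = np.linalg.eigh(C)
order = np.argsort(lam)[::-1]               # largest eigenvalues first
lam, phi = lam[order], phi[:, order]

# Keep the first N terms and draw one sample function.  Using independent
# standard normal coefficients here is exactly the "easy case" from above.
N = 10
xi = np.sqrt(np.clip(lam[:N], 0.0, None)) * np.random.standard_normal(N)
w_sample = phi[:, :N] @ xi
```

In the Gaussian case the coefficients really are independent normals and we'd be done; the rest of the post is about what to do when they aren't.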

Representing a set of random variables through transformation:

Suppose you have, say, 10 random variables that have a joint distribution that is non-Gaussian. These random variables are the coordinates of the basis expansion of the random function defined above. How can we draw samples from their joint distribution? One way is to start with a set of 10 independent Gaussian random variables and transform them through 10 functions which map them into new random variables that have the correct joint distribution. This is a variation on the basic idea that if you have a single random variable uniform on [0,1] you can transform it through the inverse CDF of your distribution of interest and get a variable with the distribution you need.
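In one dimension that recipe is just a couple of lines (a sketch of the inverse-CDF idea; the gamma target is an arbitrary stand-in, not anything from the post):

```python
import numpy as np
from scipy import stats

# Push a standard normal through its own CDF to get a uniform on [0, 1],
# then through the inverse CDF (ppf) of the target distribution.
omega = np.random.standard_normal(10_000)      # easy-to-generate variables
u = stats.norm.cdf(omega)                      # uniform on [0, 1]
xi = stats.gamma.ppf(u, a=2.0, scale=1.5)      # target: a made-up gamma
```

The N-dimensional version does the same thing with functions X_i that couple the \omega's together, which is how correlation between the \xi_i gets built in.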

In the Polynomial Chaos concept, we write a series expansion of the random field or random process that defines the randomness in our differential operator, then each of the random coefficients is expanded in terms of a transformed version of independent Gaussian random variables, and this whole massive expansion is the representation of the randomness in our PDE.

For example, say w(x) = \sum_i \xi_i \phi_i(x) defines our unknown load, with the set \{\xi_i\} being a set of random variables with some given joint distribution (someone else has done the hard problem of inference to determine a good joint distribution model). Then to represent the given joint distribution we use a set of \omega_i which are independent standard normal random variables, and we transform them through some N dimensional transformation function X_i(\omega_1,\omega_2,\ldots,\omega_N). They could just as easily be independent uniform random variables or something else, but computationally it makes some sense to use Gaussians, and it makes sense if you're using Gaussians to use the Hermite polynomials to represent the functions X_i.

In particular the N random \omega variables get stuck into an expansion of order p of the transformation function X_i in terms of multidimensional Hermite polynomials. Fooling around with index names we wind up with X_k = a_{k}\left(\sum_i a_{1ik}H_i(\omega_1)\right)\left(\sum_i a_{2ik}H_i(\omega_2)\right)\cdots\left(\sum_i a_{Nik} H_i(\omega_N)\right) where each sum has p terms. We'll define the zeroth order coefficient to be 1 in all cases so that we've factored the overall magnitude out into the front coefficient, and then there are N(p-1)+1 total independent coefficients for each of the N dimensions of the \vec X transformation function (if I haven't somehow gotten lost in counting). Remember the goal is to transform N independent standard Gaussian random variables into N potentially correlated non-Gaussian random coefficients in the expansion of w(x).

So now we have w(x) = \sum_i X_i(\omega_1 ... \omega_N) \phi_i(x) and each of the X_i(...) functions is expanded in its own coefficients and basis functions.
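Here is a small end-to-end sketch of that machinery (every coefficient value below is invented just so the code runs; in a real application the a's would be fit so that the X_k reproduce the target joint distribution of the \xi_i):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval   # probabilists' Hermite He_i

rng = np.random.default_rng(0)

N, p = 4, 3            # N random dimensions, p Hermite terms per dimension
x = np.linspace(0.0, 1.0, 100)

# Spatial basis functions phi_i(x): sines, purely as placeholders.
phi = np.array([np.sin((i + 1) * np.pi * x) for i in range(N)])

# Invented coefficients: a[k] is the front factor, A[k, j, :] holds the p
# coefficients of the j'th univariate Hermite series inside X_k, with the
# zeroth-order coefficient pinned to 1 as in the post.
a = rng.uniform(0.5, 1.5, size=N)
A = rng.normal(0.0, 0.3, size=(N, N, p))
A[:, :, 0] = 1.0

def X(omega):
    """X_k = a_k * prod_j ( sum_i A[k, j, i] * He_i(omega_j) )."""
    return np.array([a[k] * np.prod([hermeval(omega[j], A[k, j])
                                     for j in range(N)])
                     for k in range(N)])

# One sample of the random function w(x):
omega = rng.standard_normal(N)    # independent standard normals
w_sample = X(omega) @ phi         # xi_i = X_i(omega), w(x) = sum_i xi_i phi_i(x)
```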

This is usually about where my mind boggles. It's especially hard to figure out what's going on if I write out all the expansions at once, so I think it's useful to do the bullet-point version:

  • w(x) is a random function that represents (one source of) uncertainty in the physical system.
  • \sum \xi_i \phi_i(x) is a basis series representation of w(x)
  • X(\{\omega_i\}) is a function that transforms some easily computed random variables \omega_i into some strangely distributed interdependent random variables \xi_i that are needed by the above basis series.
  • a_{k} \prod_j (\sum_i a_{jik} H_i(\omega_j)) is the k'th dimension of a multidimensional basis expansion of the \vec X function, usually in terms of the Hermite polynomials.

The Solution:

Now, finally, we're going to calculate the solution u(x,\{\omega_i\}); again we want the residual R(u(x,\{\omega_i\})) to have a "small" norm. Since the \omega_i are random numbers, the residual is a random field or process (it's a "process" if time is involved, which is special since time is one-directional). So how are we going to make the residual small? The answer my advisor would give would be to define an inner product over the space of multidimensional functions (we've got say 1 to 3 spatial dimensions and several tens or hundreds of random variables to represent the uncertain fields or processes) and then do Galerkin projections on the residual to make it orthogonal to all the multidimensional basis functions. So what's the right inner product there?

Inner Product on the Basis for random functions:

Let's take a cue from the spatial dimensions. If you have a function of one dimension x, then you can define an inner product on those functions by \langle f(x), g(x) \rangle = \int_{-\infty}^\infty f(x) g(x) dx. If the domain of interest is smaller then we can replace the limits of integration by something else, but the basic idea is to multiply the functions together and integrate. If we have two spatial dimensions then we do a double integral over x and y. When we add an \omega dimension, that is a dimension characterized by a probability distribution like the normal distribution, then we should integrate not by d\omega, that is the Lebesgue measure of the \omega dimension, but rather by the probability measure of the random dimension. If you're fancy you write this \int f(\omega) g(\omega) d\mu(\omega), but it's basically the same thing as \int f(\omega) g(\omega) {\rm pdf}(\omega) d\omega so long as the pdf of the \omega variable is a regular function and not some discrete probability distribution with delta functions involved or something.

This particular integral is the same as an expectation with respect to the variable \omega so we could write this as \mathbb{E}[f(\omega)g(\omega)] if it helps you understand what's going on.
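As a quick sanity check (my own aside, not from the original post), this expectation inner product is exactly what makes the Hermite polynomials orthogonal when \omega is standard normal, and you can verify that numerically with Gauss-Hermite quadrature:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

# Nodes and weights for integrals against exp(-w^2/2); dividing by sqrt(2*pi)
# turns the quadrature sum into an expectation over a standard normal.
nodes, weights = hermegauss(30)
weights = weights / np.sqrt(2 * np.pi)

def inner(f, g):
    """<f, g> = E[f(omega) g(omega)] for omega ~ N(0, 1)."""
    return np.sum(weights * f(nodes) * g(nodes))

def He(i):
    """The i'th probabilists' Hermite polynomial as a callable."""
    c = np.zeros(i + 1); c[i] = 1.0
    return lambda w: hermeval(w, c)

print(inner(He(2), He(3)))   # ~ 0:  distinct Hermite polynomials are orthogonal
print(inner(He(3), He(3)))   # ~ 6:  E[He_n(omega)^2] = n!
```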

So the inner product we need to use for doing our Galerkin projections is the one that multiplies the two functions together and then integrates across all the dimensions... in excruciating detail as follows:

\langle f(x,y,z,\omega_1,\omega_2,\ldots,\omega_n), g(x,y,z,\omega_1,\omega_2,\ldots,\omega_n) \rangle \\ = \int_x \int_y \int_z \int_{\omega_1} \int_{\omega_2} \cdots \int_{\omega_n} f(x,y,z,\omega_1,\ldots,\omega_n)\, g(x,y,z,\omega_1,\ldots,\omega_n)\; {\rm pdf}_1(\omega_1) \cdots {\rm pdf}_n(\omega_n)\; d\omega_n \cdots d\omega_1\, dz\, dy\, dx

Or, if you understand that the \omega integrals are expectations, we could write it as \mathbb{E}[f g]_{\{x,y,z,\omega_i\}}, which is to say that we multiply the functions together and then do the weighted integral: the standard inner product with respect to the spatial dimensions and the expectation across all the random dimensions. So if that's what we mean by the inner product, then the Galerkin solution is just this:

Find all the a coefficients such that \langle R(u), \Phi_i(x,\{\omega\}) \rangle = 0 for every i, where the \Phi_i are the full basis functions for the spatial and random dimensions (big multi-dimensional products, not just the spatial \phi_i from before). That is, the residual of the equation is orthogonal to all the basis vectors for the full problem.

YIKES!
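To see the whole recipe in the smallest possible setting (a toy example of my own, with no spatial dimension at all): take the scalar "equation" k(\omega)u = f with random stiffness k(\omega) = k_0 + k_1\omega, expand u(\omega) in Hermite polynomials, and require the residual to be orthogonal to each basis polynomial under the expectation inner product above.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

# Toy problem: k(omega) * u = f with k(omega) = k0 + k1*omega, omega ~ N(0, 1).
k0, k1, f = 1.0, 0.2, 1.0
P = 6                                     # keep Hermite terms He_0 .. He_{P-1}

# Expectation inner product via Gauss-Hermite quadrature (see sketch above).
nodes, weights = hermegauss(40)
weights = weights / np.sqrt(2 * np.pi)

def He(i, w):
    c = np.zeros(i + 1); c[i] = 1.0
    return hermeval(w, c)

# Galerkin: <R(u), He_i> = 0  =>  sum_j u_j E[k(omega) He_j He_i] = E[f He_i].
A = np.array([[np.sum(weights * (k0 + k1 * nodes) * He(j, nodes) * He(i, nodes))
               for j in range(P)] for i in range(P)])
b = np.array([np.sum(weights * f * He(i, nodes)) for i in range(P)])
u_coeff = np.linalg.solve(A, b)

# Compare the chaos-expansion solution with the exact u(omega) = f / k(omega)
# at a few moderate values of omega.
for om in (-1.0, 0.0, 1.0, 2.0):
    u_pce = sum(u_coeff[i] * He(i, om) for i in range(P))
    print(om, u_pce, f / (k0 + k1 * om))
```

A real problem adds back the spatial dimensions, so each u_j becomes a spatial field expanded in its own finite-element or spectral basis, but the projection step is the same idea.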
