Over at Andrew Gelman’s blog, Phil and I have been having a conversation about issues related to the building envelope problem I blogged about recently.

He rightly points out that we can’t reduce the problem to one equation. The second equation, which I didn’t mention explicitly, is the conservation of mass equation $$0 = \epsilon_{hg} (p_{hg}/p_0)^{n_{hg}} - \epsilon_{go} (p_{go}/p_0)^{n_{go}}$$. This equation is theoretically exact, in the sense that conservation of mass is most likely exactly true, down to the last molecule. In practice, though, it carries some assumptions. First of all, it would only be exact if there were absolutely no compression of the air in the building; otherwise you need a differential equation for the rate of compression and so forth. This is probably a small and unimportant effect.

A further problem is that the equation is supposed to balance flows, but we are predicting the flows as power laws of the pressure, and that could be not quite right. However, whatever prediction error we have can be thought of as lumped into this equation, so that, for example, the left-hand side might be modeled as a normal random variable centered on zero. On the other hand, perhaps the errors are systematically different at different flow rates. In essence, that is saying that the coefficients or the exponents are not really single constants over the entire range of conditions. That would be another form of modeling error.
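As a rough sketch of what the mass balance says, the snippet below evaluates the right-hand side as a residual and solves for the garage-to-outdoor pressure that makes it exactly zero. The function names, the reference pressure of 50 Pa, and all coefficient and exponent values are made up for illustration; none of them come from the actual measurements.

```python
def power_law_flow(eps, p, n, p0=50.0):
    """Dimensionless flow eps * (p / p0)**n through one leakage path."""
    return eps * (p / p0) ** n


def mass_balance_residual(p_hg, p_go, eps_hg, eps_go, n_hg, n_go, p0=50.0):
    """Flow house->garage minus flow garage->outdoors (zero if mass balances)."""
    return (power_law_flow(eps_hg, p_hg, n_hg, p0)
            - power_law_flow(eps_go, p_go, n_go, p0))


def garage_outdoor_pressure(p_hg, eps_hg, eps_go, n_hg, n_go, p0=50.0):
    """Invert the garage->outdoors power law so the two flows match exactly."""
    flow_in = power_law_flow(eps_hg, p_hg, n_hg, p0)
    return p0 * (flow_in / eps_go) ** (1.0 / n_go)


# Hypothetical parameter values, chosen only to exercise the functions.
P_HG = 40.0
PARAMS = dict(eps_hg=0.1, eps_go=0.3, n_hg=0.65, n_go=0.6)

P_GO = garage_outdoor_pressure(P_HG, **PARAMS)
RESIDUAL = mass_balance_residual(P_HG, P_GO, **PARAMS)
print(f"p_go = {P_GO:.3f} Pa, residual = {RESIDUAL:.2e}")
```

With a pressure that was measured rather than solved for, the residual would generally not be zero, which is the situation the rest of the discussion is about.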

In addition to a modeling error induced by the simplified model for flow, the measured values of pressures still have measurement error. In that sense, there is an error term in the equation when the p values that appear are measured rather than some theoretical exact values. This produces something like:

$$err_{\mathrm{model}} = \epsilon_{hg} ((p_{hg}+err_{hg})/p_0)^{n_{hg}} - \epsilon_{go} ((p_{go}+err_{go})/p_0)^{n_{go}}$$

where now the pressure values are measured and the $$err_{hg}$$ and $$err_{go}$$ are the measurement errors for the pressures.
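To get a feel for how pressure measurement error shows up in this balance, here is a small Monte Carlo sketch. It assumes the power-law model is exactly right, picks "true" pressures that balance perfectly, and then plugs noisy pressures into the equation. The Gaussian noise model, the noise level `SIGMA`, and every parameter value are assumptions made only for illustration.

```python
import random
import statistics


def flow(eps, p, n, p0=50.0):
    """Dimensionless power-law flow eps * (p / p0)**n."""
    return eps * (p / p0) ** n


# Hypothetical "true" values, chosen only for illustration.
EPS_HG, EPS_GO = 0.1, 0.3
N_HG, N_GO = 0.65, 0.6
P0 = 50.0
P_HG_TRUE = 40.0
# Pick the true garage->outdoor pressure so the exact balance holds.
P_GO_TRUE = P0 * (flow(EPS_HG, P_HG_TRUE, N_HG, P0) / EPS_GO) ** (1.0 / N_GO)

SIGMA = 0.5  # assumed standard deviation of pressure measurement noise (Pa)


def simulate_residuals(num_draws=10_000, seed=0):
    """Residual of the mass balance when noisy pressures are plugged in."""
    rng = random.Random(seed)
    residuals = []
    for _ in range(num_draws):
        # Clamp at zero so the fractional power is always defined.
        p_hg = max(rng.gauss(P_HG_TRUE, SIGMA), 0.0)
        p_go = max(rng.gauss(P_GO_TRUE, SIGMA), 0.0)
        residuals.append(flow(EPS_HG, p_hg, N_HG, P0)
                         - flow(EPS_GO, p_go, N_GO, P0))
    return residuals


RESIDUALS = simulate_residuals()
RESIDUAL_MEAN = statistics.fmean(RESIDUALS)
RESIDUAL_SD = statistics.stdev(RESIDUALS)
print(f"mean = {RESIDUAL_MEAN:.2e}, sd = {RESIDUAL_SD:.2e}")
```

Because the power law is nonlinear in the pressure, the mean of the simulated residual need not be exactly zero even when the measurement noise itself is unbiased, and the spread is dominated by whichever path sits on the steeper part of its curve.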

Since we’re allowing a variety of errors, perhaps it’s best to separate them. Writing out the full pair of equations (with the P values being measured), we have:

$\frac{Q_{ho}+err_Q}{Q_0} = (1+\epsilon_{ho})\left(\frac{P_{ho}+errP_{ho}}{P_0}\right)^{n_{ho}} + \epsilon_{hg}\left(\frac{P_{hg}+errP_{hg}}{P_0}\right)^{n_{hg}} + err_{\mathrm{model\,1}}$ and

$err_{\mathrm{model\,2}} = \epsilon_{hg} \left(\frac{P_{hg}+errP_{hg}}{P_0}\right)^{n_{hg}} - \epsilon_{go} \left(\frac{P_{go}+errP_{go}}{P_0}\right)^{n_{go}}$

Perhaps this form helps us put some priors on the size of the modeling errors. Focusing on them, there may well be some bias in these errors. For example, if $$\epsilon_{hg}$$ is small, then there is a large pressure difference between the house and the garage but only a small flow. Since the flow is small, there is only a small pressure difference between the garage and outdoors. So the two terms are operating in different ranges of the power law, and we might expect the modeling errors not to cancel out but rather to have a bias!
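The operating-range argument can be made concrete with a small numerical sketch. It adds one assumption that is not stated above: that the house-to-outdoor pressure splits in series, so that `p_go = p_ho - p_hg`. Given that, it bisects for the split at which the two power-law flows balance. The coefficients, exponents, and the 50 Pa test pressure are hypothetical.

```python
def flow(eps, p, n, p0=50.0):
    """Dimensionless power-law flow eps * (p / p0)**n."""
    return eps * (p / p0) ** n


def split_pressure(p_ho, eps_hg, eps_go, n_hg=0.65, n_go=0.65, p0=50.0,
                   iters=200):
    """Find (p_hg, p_go) that balance the flows, ASSUMING p_go = p_ho - p_hg.

    The residual grows monotonically with p_hg, so plain bisection works.
    """
    lo, hi = 0.0, p_ho
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        residual = (flow(eps_hg, mid, n_hg, p0)
                    - flow(eps_go, p_ho - mid, n_go, p0))
        if residual > 0:
            hi = mid  # too much flow into the garage: lower p_hg
        else:
            lo = mid  # too little flow into the garage: raise p_hg
    p_hg = 0.5 * (lo + hi)
    return p_hg, p_ho - p_hg


# Hypothetical values: shrink the house-garage leak, keep the garage leak fixed.
EPS_GO = 0.3
ROWS = []
for eps_hg in (0.3, 0.1, 0.03, 0.01):
    p_hg, p_go = split_pressure(50.0, eps_hg, EPS_GO)
    ROWS.append((eps_hg, p_hg, p_go))
    print(f"eps_hg={eps_hg:<5} p_hg={p_hg:7.3f} Pa  p_go={p_go:7.3f} Pa")
```

Under these assumptions, shrinking $$\epsilon_{hg}$$ pushes almost all of the pressure drop onto the house-garage path, so the two power laws are evaluated at very different pressures. If a single exponent does not describe a leak equally well at both ends of that range, the two modeling errors have no particular reason to cancel.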