A recent paper in The Lancet on alcohol consumption claims that, considering all causes, there’s no “safe” level of consumption. Of course that made a big splash in the news. Andrew Gelman also happened to post about another alcohol study, and when I mentioned some serious concerns about this Lancet paper there, the study author actually responded on Gelman’s blog.

I followed up with some specific concerns, but the main issue is this: if you want causal inference, you must either have randomized assignment, which ensures on average zero probabilistic dependence on important unaccounted-for effects, or have a *causal model* that is believable and mimics the real mechanism at least approximately. Of course the mechanism of death from diseases is … life and all its complexities, in particular its time-dependent aspects.

To illustrate why this time dependency matters, I created a qualitative differential equation model intended to simulate a simple risk model through time. The basics are these:

1 unit of people is born at time t=0. As time goes on (1 unit of time is about 20 years) they slowly die of various causes. There are 3 causes, which we track through 6 units of time:

1. “Heart disease”, which we assume begins to affect people around 2.5 units of time and whose risk is reduced by 25% for each dose of alcohol. The reduction is assumed valid only for small doses, such as fewer than 2.
2. “Cancer” which we assume begins to affect people later, around 3 units of time, and is completely unaffected by alcohol (just for illustration purposes).
3. “Other causes” which ramps up somewhat slowly between about 3 and 5 units of time but ensures that very few people really make it past 5 units or so (~100 yrs). Other causes are unaffected by alcohol.
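Written out as equations, the three causes above give the following system. This is a restatement of the rates used in the code later in the post, with $A$ the fraction still alive, $H$, $C$, $O$ the cumulative deaths by cause, $D$ the daily dose, and $\sigma$ the logistic function (the symbol $\sigma$ is introduced here for brevity; the code calls it `invlogit`):

$$
\begin{aligned}
\frac{dA}{dt} &= -A\,\bigl(r_H(t,D) + r_C(t) + r_O(t)\bigr), &
\frac{dH}{dt} &= A\,r_H(t,D), &
\frac{dC}{dt} &= A\,r_C(t), &
\frac{dO}{dt} &= A\,r_O(t),
\end{aligned}
$$

$$
\begin{aligned}
r_H(t,D) &= 0.2\,\Bigl(1 - \tfrac{D}{4}\Bigr)\,\sigma\!\Bigl(\tfrac{t-2.5}{0.5}\Bigr), &
r_C(t) &= 0.1\,\sigma\!\Bigl(\tfrac{t-3}{0.5}\Bigr), &
r_O(t) &= 0.02 + 10\,\sigma\!\Bigl(\tfrac{t-5}{0.2}\Bigr), &
\sigma(x) &= \frac{1}{1+e^{-x}},
\end{aligned}
$$

with initial condition $A(0)=1$, $H(0)=C(0)=O(0)=0$. Only $r_H$ depends on the dose.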

Code is below, but first, the graph, which shows overall survival, cumulative heart disease deaths, and cumulative cancer deaths through time for two groups: red = no alcohol, blue = 1 dose of alcohol daily.

[Figure: Qualitative model of alcohol risk and deaths]

Notice that by construction the people who take 1 dose of alcohol have lower instantaneous heart disease risk, and the same cancer and other-cause risk. This means they live longer (the blue survival curve is above the red one). If you look at the individual causes, you see fewer heart disease deaths, but MORE cancer deaths. By construction, alcohol doesn’t change your cancer risk in this model… So why do more alcohol drinkers die of cancer? The answer is that they live longer on average because they don’t die of heart disease as much… So eventually they die of cancer, or “other causes”.
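The same conclusion can be read off the model's cancer equation. As a sketch, write $A_D$ and $C_D$ for the survival and cumulative cancer curves at dose $D$ (the subscripts are added here only to label the two groups):

$$
C_D(t) = \int_0^t r_C(s)\,A_D(s)\,ds .
$$

The cancer rate $r_C$ is identical in both groups, but $A_1(s) \ge A_0(s)$ at every time $s$ because the drinkers lose fewer people to heart disease. The integrand is therefore never smaller for the drinkers, so $C_1(t) \ge C_0(t)$: more cumulative cancer deaths with no change in cancer risk.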

At the very minimum, your model needs to reproduce this kind of thing, or you have no hope of teasing out how things work.

Now for the code:

```
kill(all);
reset();

load(rkf45);
load(draw);

/* Logistic function: ramps from 0 to 1 around its argument = 0 */
invlogit(t) := 1.0/(1.0+exp(-t));

/* Instantaneous death rates at time t (1 unit is about 20 years) for daily dose D */
rH(t,D) := invlogit((t-2.5)/.5)*.2*(1.0-D/4.0);  /* heart disease: 25% lower per dose */
rC(t,D) := invlogit((t-3.0)/.5)*.1;              /* cancer: unaffected by alcohol */
rO(t,D) := .02+invlogit((t-5.0)/0.2)*10.0;       /* other causes: unaffected by alcohol */

/* A = fraction alive; H, C, O = cumulative deaths by cause */
dAdt: A*(-rH(t,D)-rC(t,D)-rO(t,D));
dHdt: A*rH(t,D);
dCdt: A*rC(t,D);
dOdt: A*rO(t,D);

/* Each solution is a list of rows [t, A, H, C, O] */
solnD1:rkf45(subst(D=1.0,[dAdt,dHdt,dCdt,dOdt]),[A,H,C,O],[1,0,0,0],['t,0,6]);
solnD0:rkf45(subst(D=0.0,[dAdt,dHdt,dCdt,dOdt]),[A,H,C,O],[1,0,0,0],['t,0,6]);

/* Three stacked panels; D=1 is drawn in blue, D=0 in red */
draw(terminal='png,file_name="AlcoholRisks",dimensions=[400,800],
  gr2d(title="Survival Overall (D=1 blue, D=0 red)",
    points_joined=true,
    color="blue",
    points(map(lambda([x],x[1]),solnD1),map(lambda([x],x[2]),solnD1)),
    color="red",
    points(map(lambda([x],x[1]),solnD0),map(lambda([x],x[2]),solnD0))),
  gr2d(title="Death by Heart Disease (D=1 blue, D=0 red)",
    points_joined=true,
    color="blue",
    points(map(lambda([x],x[1]),solnD1),map(lambda([x],x[3]),solnD1)),
    color="red",
    points(map(lambda([x],x[1]),solnD0),map(lambda([x],x[3]),solnD0))),
  gr2d(title="Death by Cancer (D=1 blue, D=0 red)",
    points_joined=true,
    color="blue",
    points(map(lambda([x],x[1]),solnD1),map(lambda([x],x[4]),solnD1)),
    color="red",
    points(map(lambda([x],x[1]),solnD0),map(lambda([x],x[4]),solnD0))));
```
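To put numbers on what the graph shows, one could compare the cumulative deaths by cause at the end of the run, t = 6. The following is only a minimal sketch and not part of the original listing: it restates the same rate functions so that it runs in a fresh Maxima session, and `final_state`, `end0`, and `end1` are illustrative names.

```
load(rkf45)$

/* Same rates as in the model: only rH depends on the dose D */
invlogit(t) := 1.0/(1.0+exp(-t))$
rH(t,D) := invlogit((t-2.5)/0.5)*0.2*(1.0-D/4.0)$
rC(t,D) := invlogit((t-3.0)/0.5)*0.1$
rO(t,D) := 0.02+invlogit((t-5.0)/0.2)*10.0$

/* Hypothetical helper: last row [t, A, H, C, O] of the solution for a given dose */
final_state(dose) := last(rkf45(
    subst(D=dose, [A*(-rH(t,D)-rC(t,D)-rO(t,D)), A*rH(t,D), A*rC(t,D), A*rO(t,D)]),
    [A,H,C,O], [1,0,0,0], [t,0,6]))$

end0: final_state(0.0)$
end1: final_state(1.0)$

/* rest(row, 2) drops t and A, leaving cumulative deaths [H, C, O] */
print("D=0 [heart, cancer, other]:", rest(end0, 2))$
print("D=1 [heart, cancer, other]:", rest(end1, 2))$
print("D=1 minus D=0:", rest(end1, 2) - rest(end0, 2))$
```

If the model behaves as described above, the heart disease entry of the difference should be negative and the cancer entry positive, even though `rC` ignores the dose.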

4 Responses
1. Chris Wilson
October 26, 2018

Daniel, I think your point here is a very good one, and is part of what troubles me about this kind of work. For simulation purposes, though, how well would “age adjusting” an epi (linear) model account for this effect? I will be even more disturbed if I find out some kind of age adjustment isn’t standard here…

• October 31, 2018

I’m not sure how well age adjustment would work. Ideally what you need is an age adjustment that takes into account alterations in earlier mortality caused by the consumption pattern. That seems unlikely to be the kind of thing that’s done. I think more likely a linear model is used.

Basically, age is treated as if it’s a pre-treatment predictor variable, but age itself is a post-treatment outcome variable: you live longer/shorter due to the consumption. What’s needed to do a good job is a causal model through time.

• Chris Wilson
January 11, 2019

Hi Daniel,
Super long lag in response here 🙂 I was reminded of this conversation by reading through some of the comments on Gelman’s blog vis a vis Pearl’s new book. I agree that a better causal model would be optimal. However, I am thinking that, if
“By construction, alcohol doesn’t change your cancer risk in this model… So why do more alcohol drinkers die of cancer? The answer is that they live longer on average because they don’t die of heart disease as much… So eventually they die of cancer, or “other causes”.”
Then, a hypothetical statistical model that controlled for age when deaths in the cohort were observed (even in a simple linear hazard model framework) would soak up the effect, and the estimate of alcohol effect would not be biased by this phenomenon. Shouldn’t be too hard to test this idea with some sort of simulation here…

• January 15, 2019

“Then, a hypothetical statistical model that controlled for age when deaths in the cohort were observed (even in a simple linear hazard model framework) would soak up the effect”

I don’t think so. Specifically, the effect at each age is in part determined by the integral of the effect at every prior age. Since alcohol consumption is time-varying, and so are the risks of various *other* causes of death, this general effect is time-varying, and you’ve got a definitely nonlinear response. The “correct” model in my toy problem is the solution to the differential equation. No linear regression as a function of age, for example, is going to do the right thing, though it might improve things. If you can specify a family of hazard curves, you might be able to fit the hazard by fitting the parameters of those curves, but you’ll have a hard time doing a causal model there, because it will be entirely confounded with the “accidents of history”. For example, the price of corn syrup maybe caused changes in the cost of certain foods and maybe changed diabetes risk; the alcohol drinkers are exposed to more of that risk because they don’t die of heart attacks, so it looks like alcohol “causes” diabetes if you just fit hazard curves… Etc.
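In the toy model from the post, this dependence on prior ages can be written down directly. As a sketch, with $A$ the fraction alive and $r_H$, $r_C$, $r_O$ the three rates (writing the dose as $D(s)$ to allow it to vary in time is an extension assumed here; the toy model holds $D$ fixed):

$$
A(t) = \exp\!\left(-\int_0^t \bigl[r_H(s, D(s)) + r_C(s) + r_O(s)\bigr]\,ds\right),
\qquad
\frac{dC}{dt} = r_C(t)\,A(t).
$$

The number of cancer deaths occurring at age $t$ therefore depends on the whole history of every hazard, and of the dose, up to $t$, and it does so through an exponential rather than a linear term.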