On his blog, Gelman shows yet another regression discontinuity analysis. Apparently people have never heard of the Runge phenomenon, or of why it happens. Here’s some R code, and a PDF of the output…

``````
## regression discontinuity such as:
## https://statmodeling.stat.columbia.edu/2020/01/09/no-i-dont-think-that-this-study-offers-good-evidence-that-installing-air-filters-in-classrooms-has-surprisingly-large-educational-benefits/

## generally is garbage that ignores what's essentially a corollary to
## the well-known Runge phenomenon... we demonstrate here.

library(ggplot2)

set.seed(131211)

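## 20 datasets of pure noise: x uniform on (0,2), y t-distributed with 5 df,
## generated independently of x
datasets = list()
for (i in 1:20) {
  datasets[[i]] = data.frame(x=runif(20,0,2),y=rt(20,5))
}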

## fit a separate straight line on each side of the "cutoff" at x = 1
plotgraph = function(d){
  g = ggplot(d,aes(x,y)) + geom_point() +
    geom_smooth(data=d[d$x < 1,],method="lm") +
    geom_smooth(data=d[d$x >= 1,],method="lm")
  return(g)
}

graphs = lapply(datasets,plotgraph)
## one plot per page; invisible() keeps the returned list off the console
pdf("discplots.pdf")
invisible(lapply(graphs,print))
dev.off()

``````

In almost every plot there is “something going on” at the discontinuity: either the level of the function has changed, or the slope, or both. And yet the whole thing is random t-distributed noise…
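One way to see why the fits misbehave right at the cutoff: a fitted line is least constrained at the edge of the data that produced it, and the cutoff is the edge for both halves. Here is a rough sketch, using the same noise model as above, that compares the standard error of the left-hand fit at the cutoff (x = 1) with its standard error mid-range (x = 0.5). The helper names `draw_noise` and `edge_vs_center`, the 500 replications, and the "at least 3 points per side" guard are my own choices for illustration.

```r
## Sketch: uncertainty of the left-hand fit at the cutoff vs. mid-range.
set.seed(1)

## same noise model as above; redraw in the rare case a side has < 3 points
draw_noise = function(n = 20, cutoff = 1) {
  repeat {
    d = data.frame(x = runif(n, 0, 2), y = rt(n, 5))
    n_left = sum(d$x < cutoff)
    if (n_left >= 3 && n - n_left >= 3) return(d)
  }
}

edge_vs_center = function() {
  d = draw_noise()
  fit = lm(y ~ x, data = d[d$x < 1, ])
  ## standard error of the fitted line at x = 1 (edge) and x = 0.5 (middle)
  p = predict(fit, newdata = data.frame(x = c(1, 0.5)), se.fit = TRUE)
  se = unname(p$se.fit)
  c(edge = se[1], center = se[2])
}

se = t(replicate(500, edge_vs_center()))

colMeans(se)                          # average standard error at each location
mean(se[, "edge"] > se[, "center"])   # how often the edge is the shakier spot
```

The discontinuity estimate is the difference of two of these edge predictions, one from each side, so it inherits the worst-case uncertainty of both fits.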

I don’t know what that paper did to calculate its p values, but it probably wasn’t simulations like this, and it should have been.
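For concreteness, here is roughly what I mean by simulations: generate lots of datasets where nothing is going on, compute the "jump" at the cutoff from two separate linear fits, and see how big a jump pure noise routinely produces. This is only a sketch under my toy noise model, not a reconstruction of anything the paper did. The functions `simulate_null` and `jump_at_cutoff` are illustrative helpers, and `observed_jump` is a made-up number.

```r
## Sketch: null distribution of the estimated jump at the cutoff.
set.seed(2)

## pure-noise dataset; redraw in the rare case a side has < 3 points
simulate_null = function(n = 20, cutoff = 1) {
  repeat {
    d = data.frame(x = runif(n, 0, 2), y = rt(n, 5))
    n_left = sum(d$x < cutoff)
    if (n_left >= 3 && n - n_left >= 3) return(d)
  }
}

## difference between the right-hand and left-hand fitted lines at the cutoff
jump_at_cutoff = function(d, cutoff = 1) {
  left  = lm(y ~ x, data = d[d$x <  cutoff, ])
  right = lm(y ~ x, data = d[d$x >= cutoff, ])
  at = data.frame(x = cutoff)
  unname(predict(right, at) - predict(left, at))
}

null_jumps = replicate(2000, jump_at_cutoff(simulate_null()))

## how large a jump does noise alone produce?
quantile(abs(null_jumps), c(0.5, 0.9, 0.95))

## simulation-based p value for a hypothetical observed jump
observed_jump = 1.5   # made-up value, for illustration only
p_sim = mean(abs(null_jumps) >= abs(observed_jump))
p_sim
```

For a real analysis you would replace `simulate_null` with a null model that matches the actual sample sizes, x distribution, and noise scale of the data, and apply to each simulated dataset exactly the same fitting procedure that was applied to the real one.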

2 Responses
1. January 10, 2020

There’s a typo in the code. See if you can find it without running it!
