# Under Stan-ding the Hot Hand

Summary: It is hard to detect the “hot hand” in basketball from hit/miss shot data. More generally, in science we often see people claim that a thing doesn’t exist because “they looked carefully and didn’t find it”. The problem is that if your method of looking can’t find the thing, then failing to find it tells us nothing, yet the failure is taken as evidence that the thing doesn’t exist. This is not just a triviality about basketball; it’s a common problem in other important areas of science.

Imagine a player shoots baskets, and their probability of success is 0.5 plus a sinusoidal function of the shot number. We don’t know the phase or period exactly; we want to look at a sequence of hits and misses and try to infer the amplitude of the variation and its period. In the simulation I’m using amplitude = 0.1, period = 40, and phase = 0.
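Written out, the setup is roughly the following. The symbols $A$ (amplitude), $T$ (period in shots), and $\phi$ (phase in shots) are my notation for this sketch; in the Stan code they correspond to `A`, `f`, and `ph`, so note that `f` is really a period, not a frequency.

$$
p_i = 0.5 + A \sin\!\left(\frac{2\pi\,(i + \phi)}{T}\right), \qquad y_i \sim \mathrm{Bernoulli}(p_i), \qquad i = 1, \dots, N
$$

The simulated data use $A = 0.1$, $T = 40$, $\phi = 0$, so the success probability swings between 0.4 and 0.6 over each 40-shot cycle.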

    ## The hot hand: a player takes shots at a basket with a sinusoidal
    ## underlying probability of success. Looking at the first N successive
    ## shots, with informative priors on the period and phase, can we detect
    ## the amplitude for N = 20, 50, 100, 200, 500?
    library(rstan)
    rstan_options(auto_write = TRUE)
    options(mc.cores = parallel::detectCores())

    set.seed(1)
    ## simulated shots: true amplitude 0.1, period 40, phase 0
    hitmiss = sapply(1:500, function(x) {
      return(rbinom(1, 1, prob = 0.5 + 0.1 * sin(2*pi*x/40)))
    })

    stancode = "
    functions{
      // success probability at shot i; f is the period in shots, ph the phase
      real binp(real A, int i, real f, real ph) {
        return(0.5 + A*sin(2*pi()*(i+ph)/f));
      }
    }
    data{
      int hitmiss[500];
    }
    parameters{
      // assumed bounds: they keep 0.5 + A*sin(...) inside [0,1]
      real<lower=0,upper=0.5> A20;
      real<lower=0,upper=0.5> A50;
      real<lower=0,upper=0.5> A100;
      real<lower=0,upper=0.5> A200;
      real<lower=0,upper=0.5> A500;
      // assumed bound: the gamma prior needs a positive period
      vector<lower=0>[5] f;
      vector[5] ph;
    }
    model{
      A20 ~ normal(0,.5);
      A50 ~ normal(0,.5);
      A100 ~ normal(0,.5);
      A200 ~ normal(0,.5);
      A500 ~ normal(0,.5);
      f ~ gamma(2,2.0/50);
      ph ~ normal(0,20);
      // each sub-model sees only the first 20, 50, 100, 200, or 500 shots
      for(i in 1:500){
        if(i < 21){ hitmiss[i] ~ binomial(1,binp(A20,i,f[1],ph[1])); }
        if(i < 51){ hitmiss[i] ~ binomial(1,binp(A50,i,f[2],ph[2])); }
        if(i < 101){ hitmiss[i] ~ binomial(1,binp(A100,i,f[3],ph[3])); }
        if(i < 201){ hitmiss[i] ~ binomial(1,binp(A200,i,f[4],ph[4])); }
        if(i < 501){ hitmiss[i] ~ binomial(1,binp(A500,i,f[5],ph[5])); }
      }
    }
    "

    ## the same starting point for each of the 4 chains
    init1 = list(f = rep(50, 5), A20 = 0.2, A50 = 0.2, A100 = 0.2,
                 A200 = 0.2, A500 = 0.2, ph = rep(0, 5))
    samps <- stan(model_code = stancode, data = list(hitmiss = hitmiss),
                  init = rep(list(init1), 4),
                  iter = 10000, thin = 10)
    samps

Running this model gives the following results:

    Inference for Stan model: c8293fb08b00b7823bcc1fc716df0ef9.
    4 chains, each with iter=10000; warmup=5000; thin=10;
    post-warmup draws per chain=500, total post-warmup draws=2000.

             mean se_mean    sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
    A20      0.15    0.01  0.11    0.01    0.06    0.12    0.21    0.39   422 1.00
    A50      0.11    0.00  0.08    0.01    0.04    0.10    0.17    0.30   410 1.00
    A100     0.09    0.00  0.06    0.00    0.04    0.09    0.14    0.24   569 1.00
    A200     0.09    0.00  0.06    0.00    0.04    0.09    0.13    0.20   288 1.01
    A500     0.13    0.00  0.03    0.06    0.12    0.13    0.16    0.20   270 1.01
    f[1]    45.95    1.09 35.87    5.22   19.06   37.12   62.08  138.20  1090 1.00
    f[2]    39.30    1.46 34.07    3.89   12.61   29.49   54.20  132.16   541 1.00
    f[3]    37.54    1.66 29.00    5.82   17.11   32.19   47.29  112.63   305 1.01
    f[4]    38.49    1.50 27.51    8.49   19.08   35.79   45.11  108.88   336 1.01
    f[5]    40.39    0.28  2.02   39.52   40.31   40.62   40.98   41.68    54 1.08
    ph[1]   -0.97    0.95 18.74  -34.41  -13.40   -3.96   11.86   38.11   387 1.02
    ph[2]    1.18    0.97 20.88  -40.78  -11.06    0.30   15.52   42.05   465 1.00
    ph[3]   -2.06    2.08 20.75  -43.41  -14.11   -1.80   12.54   36.89    99 1.03
    ph[4]    1.48    1.35 20.94  -39.44  -10.69    1.07   14.84   42.38   240 1.02
    ph[5]    5.01    0.28  4.57   -2.00    2.57    4.69    7.45   12.91   267 1.00
    lp__  -581.57    0.33  4.48 -591.30 -584.43 -581.32 -578.54 -573.58   185 1.02

Suppose we say we've "detected" the effect when the posterior mean amplitude divided by its posterior sd is about 2 or more. Then it takes upwards of 200 shots: at 200 shots E(A)/sd(A) = 0.09/0.06 ≈ 1.5, and only at 500 shots does the ratio clear the bar (0.13/0.03 ≈ 4.3). Also note how few effective samples I get. I ran 4 chains of 10000 iterations each, half of them warmup, thinned by a factor of 10, so I have 2000 total draws, but typical parameters have only a couple hundred effective samples. The model is hard to fit because the whole thing is noisy and messy, and the traceplots reflect this.
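To make the detection rule concrete, here is a small sketch that applies it to every sample size. The means and sds are copied by hand from the amplitude rows of the Stan output; the data frame `amp` and its column names are just names I made up for this illustration.

```r
## Posterior mean and sd of each amplitude parameter, copied from the
## A20 ... A500 rows of the Stan summary (hypothetical container: amp)
amp <- data.frame(
  shots = c(20, 50, 100, 200, 500),
  mean  = c(0.15, 0.11, 0.09, 0.09, 0.13),
  sd    = c(0.11, 0.08, 0.06, 0.06, 0.03)
)

## Detection rule from the text: posterior mean / posterior sd of about 2
amp$ratio    <- amp$mean / amp$sd
amp$detected <- amp$ratio >= 2

print(amp)
```

With these numbers the ratio is 1.5 at 200 shots and about 4.3 at 500, so only the 500-shot fit counts as a detection under this rule.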

Also, this is with the benefit of a precisely repeated periodic signal and informative priors on the size, period, and phase of the effect.
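A rough back-of-the-envelope check shows how much these advantages matter. As a sketch, assume the best case where the period $T$ and phase are known exactly, and treat the problem as a least-squares fit of the amplitude (this is my simplification, not what the Stan model does). With $s_i = \sin(2\pi i / T)$ and $y_i \in \{0, 1\}$:

$$
\hat{A} = \frac{\sum_{i=1}^{N} s_i\,(y_i - 0.5)}{\sum_{i=1}^{N} s_i^2},
\qquad
\operatorname{Var}(\hat{A}) = \frac{\sum_i s_i^2\, p_i (1 - p_i)}{\left(\sum_i s_i^2\right)^2}
\approx \frac{1/4}{N/2} = \frac{1}{2N}
$$

using $p_i(1 - p_i) \approx 1/4$ and $\sum_i s_i^2 \approx N/2$. So $\operatorname{sd}(\hat{A}) \approx 1/\sqrt{2N}$, which is about 0.05 at $N = 200$ and about 0.03 at $N = 500$, roughly in line with the posterior sds in the Stan output. Requiring $A / \operatorname{sd}(\hat{A}) \ge 2$ with $A = 0.1$ gives

$$
N \ge \frac{2^2}{2 A^2} = \frac{4}{2 \times 0.01} = 200,
$$

so even in this idealized case it takes about 200 shots to reach the threshold.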

If the effect varies like a smooth Gaussian process with a ~ 40 shot scale but no periodicity and no informative priors, it'd be like trying to estimate 20 or 30 Fourier coefficients simultaneously, or something like that. You'd need even more data and long runs to overcome the noise and get a good effective sample size.

The model is ill-posed and the measurement is not very informative, so the Hot Hand is hard to detect from hit/miss data.

**SO LACK OF DETECTION IS NOT EVIDENCE OF ABSENCE.**