Why gaming on a DSL line is terrible and the math says there’s nothing you can do about it

2020 December 5
by Daniel Lakeland

TL;DR: even with *perfect* queue control on your router, if you have less than about 3 Mbps upload you will NEVER get competitive gameplay in CoD or similar games; it will get worse and worse the slower your upload, and there is nothing your router can do about it.

I like to help people with their interactive internet issues: online gaming, video conferences, VOIP phone calls, etc. There are multiple long threads on forum.openwrt.org involving me and a few others such as moeller0 (Sebastian Moeller) teaching people how queues work and how to configure their routers to implement queue control either with the great “plug and play” SQM package on OpenWrt, or with more customized queue systems for people who want stricter prioritization on games.

The great work done by the Cake developers, such as Dave Taht and his ragtag band of internet warriors, makes this fairly plug-and-play when you have a proper broadband connection. Unfortunately even the crazy “good” broadband most people have is often highly asymmetric and focused on very high download speeds, at the expense of futzing up your video chats etc. due to insufficient upload speed.

So let’s get to the hardest pain-point of the internet: gaming on a low-end DSL line. For example, I’ll imagine, like one person I’ve tried to help, that you’re playing Call Of Duty Black Ops – Cold War, an interactive shooting game, and you have a DSL line that is 15000 kbps download and 850 kbps upload. Notice that huge asymmetry.

Now, based on packet captures, we know that this game like many others operates on more or less a 64Hz network tick rate, meaning it sends a game packet every 1/64 = 0.0156 seconds or 15.6 ms. Maybe it’s closer to 1/60, but it doesn’t matter because we don’t need super precision to get the idea here.


Here is the actual bandwidth usage, in bytes per second: the client on PS4 sends 18000 bytes/s, which is 144 kbps… and the server sends about 30000 bytes/s, with occasional peaks up to say 40000 bytes/s, or 320 kbps. At a sending rate of 64 packets per second, each packet from the server at the peak is about 625 bytes.


Now. Let’s imagine you have two things going on at the same time in your network. First, you are playing your game, and you have somehow managed to ensure that your game packets *always have strict priority*. So as soon as a game packet is received at your router, it is sent, and other packets will wait in line. But the other thing that’s going on is that someone is uploading their latest pictures to Google Photos or something like that, where the point is to send a bunch of data, but latency is totally unimportant. The process will take a minute or two, so delays of 100 ms, while totally unacceptable for your game, are completely unnoticeable to your photos upload.

Let’s look at the worst case: Your router receives a photo upload packet which by the convention for packets on the internet is around 1500 bytes, and at the moment it has no other packets to send and there has been sufficient time since the last packet that the router’s shaper software determines it’s ok to send this packet… so it sends the 1500 byte packet to the DSL modem.

Immediately after that a CoD game packet arrives, and because it’s strict priority it gets sent.

Now, the DSL modem has two packets in line in a FIFO… the first one is 1500 bytes, and the second one is about 600. The first one has 1500*8 bits to send, and your upload speed is 850 kbps or 850 bits/ms, which means 1500*8/850 = 14.1 ms… Now let’s look at the game tick rate: 1/64 of a second = 15.6 ms. Which means that *even a single packet ahead of you* in the modem delays your game packet by 14.1/15.6 = 90% of the tick time. In other words, this packet is almost as good as a packet drop. In fact, if any packets other than your game are going over your DSL upload, you will routinely experience jitter that is a large fraction of your inter-packet interval, just because that’s what’s inevitably going to happen.
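To make the arithmetic concrete, here is a quick sketch of the calculation, using the numbers from the post (1500-byte MTU, 850 kbps upload, 64 Hz tick):

```python
# Serialization-delay arithmetic for one full-size packet ahead of a
# game packet on a slow DSL uplink. Numbers are the ones from the post.

MTU_BYTES = 1500     # typical full-size bulk packet
UPLOAD_KBPS = 850    # DSL upload speed; kbps == bits per millisecond
TICK_HZ = 64         # game network tick rate

# Time to push one full-size packet out the DSL uplink, in milliseconds.
serialization_ms = MTU_BYTES * 8 / UPLOAD_KBPS

# Interval between game packets, in milliseconds.
tick_ms = 1000 / TICK_HZ

print(f"serialization delay: {serialization_ms:.1f} ms")   # 14.1 ms
print(f"tick interval:       {tick_ms:.1f} ms")            # 15.6 ms
print(f"fraction of a tick:  {serialization_ms / tick_ms:.0%}")  # 90%
```

So a single bulk packet ahead of the game packet eats almost the whole tick interval, which is the entire problem in one number.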

How can you fix this? The first thing is to increase your upload speed. Suppose you want a single-packet upload time to take at most 1/3 of the inter-arrival time between ticks… That means you should be able to send 1500*8 bits every (1/3 * 1/64 seconds), so your bandwidth should be about 2.3 Mbps at the very minimum. Below about 2.3 Mbps even very good packet controls will inevitably result in packet delays that are a significant fraction of the inter-packet arrival time for your game. To get very good response, you want to be well above this 2.3 Mbps rate. So let’s say above 3 Mbps is where you can reliably schedule packets in a queuing system and not have trouble squeaking your game packets in between other packets on the network.

If you don’t have 3Mbps then your only chance to play competitive level online games is to get yourself a SECOND DSL line and dedicate it to your gaming machines. If you do have 3Mbps then high quality queuing disciplines such as cake using layer_cake diffserv4, or HFSC with custom queue designs, or other mechanisms that shape and prioritize packet flows can work well. Below 3Mbps you will get progressively worse game play or voip or video chats EVEN WITH good queue control systems.

Let’s look at the next pain point: the speed below which two packets ahead of you in the queue delay you by a full inter-arrival time: 2*1500*8/(1/64) = 1.5 Mbps. Below 1.5 Mbps, if you ever get two bulk packets ahead of you, your game packet is as good as dropped even if it gets sent… Finally, let’s see where 1 packet ahead of you is as good as a drop: it’s 1500*8/(1/64) = 768 kbps. Do NOT try to game competitively on a DSL line with less than 768 kbps upload. In fact, don’t try to game on a DSL line with less than 1.5 Mbps upload, because then just 2 packets ahead of you is as good as a drop and 1 packet ahead of you is a 50% delay in your packet…
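All three threshold speeds come from the same little formula; here’s a sketch, under the same assumptions as above (1500-byte bulk packets, 64 Hz tick):

```python
# The three threshold speeds derived above, as a quick check. Assumes a
# 1500-byte bulk packet and a 64 Hz tick rate, as in the post.

PKT_BITS = 1500 * 8   # one full-size bulk packet, in bits
TICK_S = 1 / 64       # seconds between game packets

def speed_for_n_packets_per_tick(n):
    """Upload speed (bps) at which n bulk packets ahead of you delay
    your game packet by exactly one tick interval."""
    return n * PKT_BITS / TICK_S

print(speed_for_n_packets_per_tick(1))  # 768 kbps: 1 packet = a full tick
print(speed_for_n_packets_per_tick(2))  # 1.5 Mbps: 2 packets = a full tick
print(speed_for_n_packets_per_tick(3))  # 2.3 Mbps: 1 packet = 1/3 of a tick
```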

If you actually make money in e-sports, for example, you should be gaming on a line with a minimum of 10 Mbps upload and download, or a line with at least 1.5 Mbps upload that is 100% dedicated to gaming.

If you care enough, and you have the cash to pay, OpenWrt with Mwan3 package can let you connect two DSL modems and send all your gaming machine traffic over one line and the general web browsing stuff over the other line. Combine it with cake on the appropriate interfaces and you’ll be all set to play competitively even down to 800kbps or maybe a little below.

Regression Discontinuity fails again

2020 July 5
by Daniel Lakeland

Regression Discontinuity analysis is not a failure as an idea, it’s just a failure as a practical way to learn about the world in most cases. The problem is that most situations in which it’s being applied are noisy human / social studies where the effects are much smaller than the level of noise.

Andrew Gelman picks apart another one of these here.

I’ve been teaching myself to use Julia for all my future data analysis projects. It’s just a fabulous language. So here are the graphs I came up with:

Years lived post election against percentage point margin.
With LOESS fits using 0.2 0.1 and 0.05 bandwidths

What this shows is … NOTHING. As you decrease the bandwidth you can detect more rapid variation in the function value at the expense of more noise in the estimate. By the time you’re down to bandwidth 0.05 you’re using only 5% of the data to fit any given location in the fit. Right at margin = 0 you can see using the orange or red curves that the estimates are extremely noisy, and certainly nowhere near a 5-10 year bump in longevity moving from left of 0 to right of 0.

Science is broken. Here is Erik defending his study.

As we can clearly see in the raw data, there is no discernible signal, none. So whatever signal is supposedly there, if it’s there it just happens to be *exactly* hidden by the offsetting effect of whatever covariates he’s adjusting for. It just so happens basically that people who won elections didn’t live longer on average than people who lost elections, but if they hadn’t won those elections we have somehow strong evidence that they would have died earlier because they were all sicker people than the losers which we can determine from their covariates…

Whatever. Here’s what it looks like when you have an actual signal… Here I’ve used the same x coords, and then generated random y coordinate Normal(0,s) noise, and then added a signal to it… Three different kinds of signals. In the first graph is a step function that steps up by 5 units as we pass x = 0. The second one adds a little wavelet that decreases right before zero and increases right after zero (negative of the derivative of a gaussian) and the last one is the same as the second one, except confined only to x > 0.
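To make the contrast concrete, here is a sketch of generating those three signals, in Python/NumPy rather than Julia, and with assumed values (n = 1000 points, noise scale s = 5) since the post doesn’t give the exact simulation parameters:

```python
# Sketch of the three simulated signals described above: step, wavelet
# (negative derivative of a Gaussian), and half-wavelet confined to x > 0.
# n, s, and the wavelet width are illustrative assumptions, not the
# post's actual values.
import numpy as np

rng = np.random.default_rng(0)
n, s = 1000, 5.0
x = rng.uniform(-1, 1, n)        # stand-in for the vote-margin variable
noise = rng.normal(0, s, n)      # Normal(0, s) noise

# Signal 1: a step of +5 units as we pass x = 0.
y_step = noise + 5.0 * (x > 0)

# Signal 2: a wavelet proportional to the negative derivative of a
# Gaussian: it dips just before zero and rises just after zero.
w = 0.1                          # wavelet width (assumed)
wavelet = (x / w) * np.exp(-x**2 / (2 * w**2))
y_wavelet = noise + 5.0 * wavelet

# Signal 3: the same wavelet, but confined to x > 0.
y_half = noise + 5.0 * wavelet * (x > 0)
```

Running a LOESS fit at several bandwidths on these, versus on the real data, shows what a genuine discontinuity looks like against this noise level.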

Updated state by state graphs with bug fix

2020 May 17
by Daniel Lakeland

I discovered that the method I was using to smooth the cases per day had a bug.

The shapes of the case-per-day function were right, but the overall scale was reduced. Basically what I was doing was convolving with the derivative of a smoothing kernel… but when you calculate the derivative it’s (f(x+dx)-f(x-dx))/(2dx) that you’re trying to calculate, so when you’re averaging across multiple sizes of dx you need to take that into account… fixed. Now the grey points are the raw data, the black line is the short-term smoothed data, and the blue line is the ggplot smoother.
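Here’s a toy illustration of the scaling issue (not the actual smoothing code, just the central-difference idea):

```python
# Estimate f'(x) by central differences at several spacings dx. The bug
# was forgetting the 1/(2*dx) factor on each difference before averaging:
# the raw differences f(x+dx) - f(x-dx) have the right shape but the
# wrong overall scale.
def smoothed_derivative(f, x, spacings):
    estimates = [(f(x + dx) - f(x - dx)) / (2 * dx) for dx in spacings]
    return sum(estimates) / len(estimates)

# For f(x) = x**2 the central difference is exact at every spacing,
# so the multi-scale average returns 2*x exactly.
print(smoothed_derivative(lambda x: x**2, 3.0, [0.5, 1.0, 2.0]))  # 6.0
```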

I wasn’t using this method for doing deaths per day, though maybe I should be now… in any case here’s the current versions.

Updated state by state graphs

2020 May 7
by Daniel Lakeland

Here’s the current status…

State by state graphs of COVID-19 data (from covidtracking.com)

2020 April 25
by Daniel Lakeland

I’ve got a script that grabs data from covidtracking.com and generates several pdfs that give an overview of the pandemic situation one graph per state… I’ll try to update the graphs about weekly. But here are the ones as of today.

Cryptographically Distributed COVID contact tracing through WiFi ad-hoc networking

2020 April 10
by Daniel Lakeland

This is a quick note to try to sketch out an idea that I thought up about how to have people cooperatively determine if they have come in contact with a COVID patient. Here’s the basic idea.

Every Android or iPhone generates a random UUID. Then, when walking around, the phones periodically beacon out a peer-to-peer SSID on their WiFi radios called something like COVID-CONTACT-{UUID}. Everyone’s phone scans the surroundings for stations, and for whatever stations it hears, it records the UUID of that phone.

Now… at the end of the day, each phone uploads a one-way cryptographic hash of their own UUID, and the UUIDs that they contacted today.

If a person tests COVID positive, they upload a record of their cryptographic hash, and the fact that they tested positive.

Now, every day you look at all the UUIDs you’ve contacted in the last ~10 days, you hash those, and you see if any of those hashes report being COVID positive. Also, you look up all the COVID positives from the last ~10 days, and you see if they report having contacted YOUR hash…

Now, there’s probably some subtlety to this which requires working out by people who are more crypto nerdy than I am, but the dataset is such that you can’t determine whether A contacted B unless you know the UUID of *both* parties. Since the UUID itself is stored internal to the app and never sent to anyone else, basically the UUID itself is a secret, and it’s only possible to determine if A contacted B or vice versa if you are in fact either A or B.
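Here’s a minimal sketch of the hashing scheme in Python; the names and data layout are hypothetical, just to illustrate that only one-way hashes ever leave the phone:

```python
# Minimal sketch of the scheme above. SHA-256 stands in for "a one-way
# cryptographic hash"; the record structure is made up for illustration.
import hashlib
import uuid

def hash_uuid(u):
    # One-way hash of a UUID; the raw UUID never leaves the phone.
    return hashlib.sha256(str(u).encode()).hexdigest()

my_uuid = uuid.uuid4()                       # secret, stored only in the app
seen_today = [uuid.uuid4(), uuid.uuid4()]    # UUIDs heard via WiFi beacons

# End of day: upload only hashes, of yourself and of your contacts.
daily_upload = {
    "me": hash_uuid(my_uuid),
    "contacts": [hash_uuid(u) for u in seen_today],
}

# A positive report is just the reporter's own hash.
positive_reports = {hash_uuid(seen_today[0])}

# Daily check: did any UUID I actually contacted report positive?
exposed = any(hash_uuid(u) in positive_reports for u in seen_today)
print(exposed)  # True here, since seen_today[0] reported positive
```

Since you can only compute `hash_uuid(u)` for UUIDs you actually heard over the air, a third party holding the uploaded hashes can’t link A to B without knowing one of the raw UUIDs.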

Of course, you could just try EVERY UUID that’s possible… Good luck with that, since there are ~2^128 = 340282366920938463463374607431768211456 of them. If you tried 1 million per second, it’d take ~10^25 years to try them all.

So, is this a viable non-invasive contact tracing strategy? What am I missing?

Grocery handling, good bad or ugly?

2020 April 1
by Daniel Lakeland

Apparently this guy’s video is controversial:

I’m going to come right out and say this is a great video: it shows people how to handle objects in a way that minimizes transmission of virus from surfaces. Apparently the controversial part, though, is where he dumps his oranges in soapy water? Are you kidding me? Everyone should be washing their produce at all times, people! Have you ever heard of E. coli?

A frequently heard thing in the “anti” group is something along the lines of “there is zero evidence that xyz”, such as “there is zero evidence that food packaging is a significant source of infection” or “there is zero evidence that washing your food in soapy water is good for you” or whatever. This is typical “Null Hypothesis Significance Testing” type logic… Until we have collected a bunch of data rejecting the “null hypothesis” that “everything is just fine” then we should just “act as if everything is just fine”. Another way to put this is “until enough people have died, you shouldn’t take precautions to protect yourself”. Put that way it’s clearly UTTERLY irresponsible to “debunk” this video using that logic.

What we KNOW is that viruses are particles, essentially complex chemicals, which sit in droplets, which can be viable after floating in the air for 3 hours, which can settle out onto cardboard and be viable for 24 hours, and which can be viable for 3 days on plastic and steel. Guess what your groceries come in? Plastic bags, cardboard boxes, steel cans, plastic jars…

The assay used in the NIH study that established those timelines was to actually elute (wash) the virus off the surface and then infect cells in a dish with it and see how many were infected. It wasn’t just detecting the virus was there, but actually showing that it was active and viable.

So, there’s your evidence. There is *direct* laboratory evidence that the virus *can* be transmitted off the surfaces into cells and infect them.

Whether this is a significant source of infection or not is more or less irrelevant. How do you make a decision as to whether you should spend ~ 1hr every 2 weeks cleaning all your groceries?

Here’s the Bayesian Decision Theory:

Suppose two actions are possible: 1) do nothing, or 2) handle your groceries carefully and wash your fruits and vegetables in dish-soapy water

Costs of (1): probability p0 of getting infected from a contaminated surface. We don’t know what p0 is, but leave it as a symbolic quantity for the moment. Let’s just use a 0.5% chance of dying if you’re infected as the dominant cost, and a “statistical value of a life” on the order of $10M… so the cost is p0*0.005*10000000 = 50000*p0.

Cost of (2): probability of getting infected from a contaminated surface reduced to perhaps p0/100000, the same 0.5% chance of dying if infected, plus 1 hr of cleaning time. So the cost is 0.5*p0 + w*1, where w is an “hourly wage”. Suppose you are willing to work for a median-type wage, $50k/yr, which is $25/hr. So, what does the probability p0 need to be to “break even”? Ignoring the negligible quantity 0.5*p0, we have 50000*p0 = 25, so p0 = 0.0005. If you think there’s something like a 0.0005 chance you could transmit virus from your grocery items to your face by “doing nothing” then YOU SHOULD BE CAREFUL and wash your items. For me, I’ll spend some time quarantining my groceries and washing my produce… I also find it keeps the produce from spoiling and hence lasting longer in storage, so that should go into the “plus” side as well.
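The arithmetic above, as a quick sketch with the same assumed numbers:

```python
# Bayesian decision calculation from the post: compare the expected cost
# of doing nothing vs. washing, with the post's assumed numbers.

P_DIE_IF_INFECTED = 0.005       # 0.5% fatality if infected
VALUE_OF_LIFE = 10_000_000      # "statistical value of a life", dollars
HOURLY_WAGE = 25                # $50k/yr median-type wage
RISK_REDUCTION = 1 / 100_000    # careful handling cuts infection risk

def cost_do_nothing(p0):
    return p0 * P_DIE_IF_INFECTED * VALUE_OF_LIFE          # = 50000 * p0

def cost_wash(p0):
    return (p0 * RISK_REDUCTION * P_DIE_IF_INFECTED * VALUE_OF_LIFE
            + HOURLY_WAGE)                                 # ~ 0.5*p0 + 25

# Break-even p0: ignoring the negligible 0.5*p0 term, 50000*p0 = 25.
print(HOURLY_WAGE / (P_DIE_IF_INFECTED * VALUE_OF_LIFE))   # ~ 0.0005
```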

As to what to wash your produce with. I’m using sudsy water from dye and fragrance free dish soap (main ingredients: Water, Sodium Lauryl Sulfate…). I’m washing my fruit and veg, and then rinsing it thoroughly. The quantity of soap I’m ingesting is substantially the same as if I hand washed a glass, rinsed it, and then filled it with water and drank it… It’s substantially less than you get from brushing your teeth with a typical toothpaste. If you are afraid of washing your dishes with soap, or of brushing your teeth, then by all means don’t wash your fruit with soap either… For the rest of us, do a good job rinsing just like you’d rinse your glasses or bowls before putting food in them.

Confusion about coronavirus testing and the role of testing capacity

2020 March 30
by Daniel Lakeland

Here’s some code to simulate a process whereby we saturate testing capacity… First the graphs:

Confirmed cases (blue) follows the real cases (red) so long as the cases per day are below the maximum… once we saturate, the green line increases linearly, and so does the blue line…
Green line (tests) parallels the blue (positive tests), as we saturate

library(ggplot2)

t = seq(1,40)
realcases = 100*exp(t/4)
realincrement = diff(c(0,realcases))

testseekers = rnorm(NROW(realincrement),4,.25)*realincrement

maxtests = 20000

## now assume that you test *up to* 20k people. if more people are
## seeking tests, you test a random subset of the seekers
## getting a binomial count of positives for the given frequency

ntests = rep(0,NROW(t));
ntests[1] = 100;
confinc = rep(0,NROW(t));
confinc[1] = 100;
for(i in 2:NROW(t)){
    if(testseekers[i] < maxtests){
        confinc[i] = realincrement[i]
        ntests[i] = testseekers[i]
    } else {
        ## capacity saturated: test a random subset of the seekers, so the
        ## confirmed count is a binomial draw at the seekers' positive rate
        confinc[i] = min(realincrement[i], rbinom(1, maxtests, realincrement[i]/testseekers[i]))
        ntests[i] = maxtests
    }
}

cumconf = cumsum(confinc)
cumtests = cumsum(ntests)

ggplot(data.frame(t=t,conf=cumconf,nt=cumtests,real=realcases)) + geom_line(aes(t,conf),color="blue") + geom_line(aes(t,nt),color="green") + geom_line(aes(t,real),color="red") + coord_cartesian(xlim=c(0,35),ylim=c(0,400000));

ggplot(data.frame(t=t,conf=cumconf,nt=cumtests,real=realcases)) + geom_line(aes(t,log(conf)),color="blue") + geom_line(aes(t,log(nt)),color="green") + geom_line(aes(t,log(real)),color="red") + coord_cartesian(xlim=c(0,30),ylim=c(0,log(400000)));

The longer term outlook…

2020 March 10
by Daniel Lakeland

Coming out the other end of this whole COVID-19 thing… how do we do a good job of sustaining social distancing, and then returning sanely to productivity? The “flatten the curve” idea extends the amount of time one needs to be in “lockdown” but ultimately reduces deaths and severe morbidity… That’s good, but it starts to run into the “how long can we hole up?” question. If things go crazy through the roof, like in China, the duration is shorter. Data here shows that from “oh shit” to a relatively small per-day caseload was about 20 days in China.
That’s a bad thing, because that represents the really “peaked” shape that overwhelms healthcare facilities. Many people died who otherwise might not have…
But if we make that slower, then also the peak occurs later, and the duration is longer, we might need, say 80 days of rather intense social distancing to make that happen. If we figure lockdowns are going to start now and build up through the next 10 days (it’s already something WaPo and The Atlantic and etc are saying)… And then we need 80 days after that… you’re talking 90 days which is 3 months, and puts us starting to return to work around June 1.

Now let’s talk food supply. Unlike China, this virus is spreading country-wide. It’s not contained to a particular place. So mobilizing the national guard to bring food from the midwest to WA because people in the midwest are ok… is not a possibility. How do we feed our country for 80 days without people having to be in contact with each other? We need food delivery systems.

Fortunately, as people get the virus and then recover, they should be immune for at least some period of time. Recovery to the point that they’re not shedding the virus is however probably 30 days? Just a guess, we’ll have to see with serology and PCR combo tests (to test that someone had the virus at some point, and doesn’t shed it now).

This doesn’t help us a lot. We have to do 90 days of relative isolation, and during the first 30 days people are getting the thing and then over the next 30 days those early people are recovering… by the time we hit 90 days, if you haven’t gotten it, you’re running pretty lean on food and things even if you’re well stocked now (and most people really aren’t). Obviously we’ll need to distribute food throughout the 90 days. This is going to require coordination from govt I believe, otherwise we’ll have sick people out there handling food… not good.

Everything you need to know about what to do about Coronavirus

2020 March 9
by Daniel Lakeland

You need to stop interacting with people. And I’m not joking about this.

Here are the facts out of Italy: about 10% of tested positive cases require ICU ventilation. The death rate for people under age 65 is probably only ~1% **if you get ventilators to the 10% needing ventilation**… If you overwhelm the hospitals, the death rate will go to ~10%, which is on the order of 10x as bad as the 1918 pandemic influenza.

The current trending idea is #flattenthecurve to describe to people HOW IMPORTANT it is to start *NOW* avoiding the spread of the disease. This avoidance of overloading the infrastructure is a core idea in Civil Engineering (my PhD is in CE).

Reducing the spread of the disease is not important just because fewer people will eventually get it (though that is probably true) but because the peak number of people who need ventilators and other intensive type care will be lower, so that fatality rates can stay low. If all the ICU beds are full, and 300 patients show up needing ICU today… all 300 patients will die. Since 10% of cases may need ventilators, it’s a serious situation.

Does social distancing, closing schools, etc work? Evidence out of 1918 says HELL YES: Unfortunately servers are getting swamped, so the best way for me to link you to this info is via twitter, who will probably stand up to the pounding.


So, what do you need to do? TODAY make plans to not be at work by the end of the week. Why? Because the virus is doubling the number of symptomatic verified cases outside china about every 2-4 days, let’s call it 3 days. And, btw it takes 5 days to onset of symptoms and for many people ~ 10 or 15 days before they say “hey I need to go to the hospital” (though for the elderly… it can be like 1hr after onset of fever). So, whatever’s going on in a hospital near you… it’s maybe what was the case 3 or 4 doubling periods ago, so today it’s on the order of ~ 10x worse than that. 10 days from now, it will be 100x worse already, but that will show up at the hospital about 20 days from now.
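The doubling arithmetic, as a quick sketch (assuming the 3-day doubling time above):

```python
# With a 3-day doubling time, case counts grow about 10x every 10 days,
# since 2**(10/3) is roughly 10. The 3-day figure is the post's estimate.

DOUBLING_DAYS = 3

def growth_factor(days):
    return 2 ** (days / DOUBLING_DAYS)

print(round(growth_factor(10)))   # 10   (~10x in 10 days)
print(round(growth_factor(20)))   # 102  (~100x in 20 days)
```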

Early, proactive and significant reduction in interaction with other people WORKS and is one of the only things we can do. So we WILL be doing it. If we wait, we’ll be doing it AND have a massive tragedy. If we start now, we’ll be doing it but have less of a massive tragedy. The boulder is rolling down the hill, we can start walking off the path now, or get hit.