RED, the random-early-drop QDISC as leaky-bucket

2019 March 12
by Daniel Lakeland

The RED qdisc, which is available in Linux to control network traffic is mostly not used these days in favor of things like codel and fq_codel. Originally it was designed to give feedback to TCP streams by dropping packets randomly before hitting a hard tail-drop, as a way to get the TCP streams to throttle back.

Today, the place where it might get used more often is something like a core router at an ISP because it doesn't require much computation and can maybe handle things like 40Gbps or 100Gbps ethernet.

But I recently had some discussions that led me to think it might have a use in certain circumstances where it wasn't traditionally in use. The particular use I'm considering is for UDP streams carrying realtime traffic. For example, game or phone traffic where a small packet is sent every 10 or 20 ms or so to update the state of the game or provide the next piece of audio data.

Now, if you set up GAMEBW to contain the bandwidth allocated in kbps for your total game traffic, and you assume typical packets are about 220 bytes (as estimated by example data I collected) you can set up a RED queue as follows:

tc qdisc add dev ${DEV} parent 1:20 handle 20: red limit $(($GAMEBW * 70 /8)) min $(($GAMEBW * 20/8)) max $(($GAMEBW * 50/8)) avpkt 220 burst $(($GAMEBW *30/8 / 220)) probability 1.0 bandwidth ${GAMEBW}kbit harddrop

 

Here the magic numbers 20, 50, 30, etc refer to the time in milliseconds that it would take at GAMEBW to send the data in the queue so far. min at 20ms means that the RED queue acts like a bfifo for the first 20ms worth of data... assuming that this is a leaf qdisc for some bandwidth limiting qdisc, we assume that enough bandwidth is available to ensure that this level of queue rarely occurs. But if there's a burst of traffic for some reason, we might start queuing past a 20ms queue, and then we have some probability to drop packets... which rises at 20ms from 0 linearly up to 100% (probability 1.0) once we hit 50ms of queue (we set limit at 70ms just in case).

If we start to exceed our allotted bandwidth for a while, we build up a queue, but we want to ensure that this queue doesn't build up past some point, in our case 50ms worth of data. If we have 1.5 times the bandwidth allowed coming in, we'll build up to a steady state where 1-1/1.5 = 33% of the packets are dropped, which occurs at 20+(50-20)*.33 = 30ms of queue length.

If a burst of packets lasts for a while, we either need to buffer it and hope that later it will go away, in which case our delay grows without bound until the burst stops... (bufferbloat!) or drop some of the packets and make sure that what goes out is no more than the max bandwidth and try to keep the delay low at the expense of sending only some of the packets.

The RED queue linearly interpolates between 0% drop at min, and 100% drop at max, so that no matter what the incoming over-bandwidth rate is, the delay stays below 50ms in this case.

RED is like a "soft tailed" bfifo, and although the TCP exponentially backs off when there's a little droppage... UDP does no such thing, and so the recommendations in the RED manual page don't really apply here... we want to get up to 100% droppage at our maximum allowable delay level, whereas the RED manual page assumes something like just 1-2% droppage, because TCP will dramatically reduce its bandwidth usage when it detects drops.

You can think of a model of the RED queue as a bucket with a small hole in the bottom, connected to a pipe, and a large hole in the side that dumps water on the ground. Water pours into the top of the bucket, and goes out the bottom. If we pour a little faster, we can build up a queue of water waiting to go down the pipe... up to 20ms of delay and the bucket doesn't leak on the ground.... As we pour even faster... the hole in the side pours some of the water out on the ground (drops packets). Depending on how fast we pour, the height of the water increases, and the side hole dumps water faster... Eventually the water reaches the top of the bucket and 100% of the excess dumps onto the ground. That's basically what's going on. When we turn off the tap... no more than 80ms later we've got an empty bucket.

2 Responses leave one →
  1. hisham permalink
    March 12, 2019

    Nicely written!
    Anyway i'm the person who contributed in this test and provided the data.
    i hope we can write another post about advanced HFSC !

Leave a Reply

Note: You can use basic XHTML in your comments. Your email address will never be published.

Subscribe to this comment feed via RSS