Good News for Solving Bufferbloat: CoDel Provides "No Knobs" Solution

Data transfer speeds keep getting faster, but that doesn't mean we're actually reaping the full benefits. A few years ago, Jim Gettys put his finger on the "criminal mastermind" behind poor networking performance. Dubbed bufferbloat, the problem has no simple fix, but Controlled Delay (CoDel) active queue management (AQM) may represent serious progress towards a solution.

The problem, in a nutshell, is that TCP was not designed with today's bandwidth in mind. As Gettys wrote in his "criminal mastermind" post, the problem lies "end-to-end" in applications, operating systems and home networks. Some buffering is necessary, but today's devices and operating systems buffer far too much, and that excess is degrading performance. Says Gettys, "TCP attempts to run a link as fast as it can, any bulk data transfer will cause a modern TCP to open its window continually, and the standing queue grows the longer a connection runs at full bandwidth, continually adding delay unless a AQM is present."

The Linux Tips page on the Bufferbloat wiki highlights the scope of the problem. According to the page, buffers can "hide" in the operating system layer (the Linux transmit queue), in the device driver, in the hardware (which has buffers of its own), and elsewhere. One of the long-term solutions to bufferbloat is active queue management (AQM), and the Controlled Delay (CoDel) AQM proposed by Kathleen Nichols and Van Jacobson might be a big piece of the puzzle.

CoDel To the Rescue?

CoDel (pronounced "coddle") is being called a "no-knobs" AQM. That means users and admins aren't expected to tweak any parameters to get the best performance out of CoDel. According to the paper published in ACM Queue, "CoDel’s algorithm is not based on queue size, queue-size averages, queue-size thresholds, rate measurements, link utilization, drop rate or queue occupancy time. Starting from Van Jacobson's 2006 insight, we used the local minimum queue as a more accurate and robust measure of standing queue."

More importantly, CoDel promises to distinguish between "good" queues and "bad" queues. It's supposed to minimize delay without hampering bursts of traffic. As the paper puts it, "The core of the bufferbloat-detection problem is separating good from bad ... good queue is occupancy that goes away in about one RTT (round-trip time); bad queue persists for several RTTs. An easy, robust way to separate the two is to take the minimum of the queue length over a sliding time window that's longer than the nominal RTT."
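The sliding-window minimum described in that quote can be illustrated with a short Python sketch. This is not the CoDel algorithm itself, only the "minimum over a window" idea. The class name `SlidingMin`, the helper `windowed_minimum`, the 100 ms window, and the sample queue lengths are all assumptions made up for illustration.

```python
from collections import deque


class SlidingMin:
    """Track the minimum queue length seen over the last `window_ms` milliseconds."""

    def __init__(self, window_ms):
        self.window_ms = window_ms
        # (timestamp_ms, queue_length) pairs, kept with increasing queue_length.
        self._samples = deque()

    def update(self, now_ms, queue_length):
        # Older samples that are >= the new one can never be the minimum again.
        while self._samples and self._samples[-1][1] >= queue_length:
            self._samples.pop()
        self._samples.append((now_ms, queue_length))
        # Expire samples that have fallen out of the window.
        while self._samples[0][0] <= now_ms - self.window_ms:
            self._samples.popleft()
        return self._samples[0][1]


def windowed_minimum(samples, window_ms=100):
    """Return the running windowed minimum for a list of (time_ms, length) samples."""
    tracker = SlidingMin(window_ms)
    return [tracker.update(t, q) for t, q in samples]


# Hypothetical queue lengths (in packets), sampled every 25 ms.
times = range(0, 225, 25)
burst = list(zip(times, [4, 9, 6, 3, 0, 0, 5, 2, 0]))     # drains between bursts
standing = list(zip(times, [6, 9, 7, 8, 6, 7, 9, 6, 7]))  # never drains

print(windowed_minimum(burst))     # minimum reaches zero: "good" queue
print(windowed_minimum(standing))  # minimum stays above zero: "bad" queue
```

In the bursty trace the windowed minimum falls to zero even though the instantaneous queue spikes, while in the second trace it never drops below six packets. That persistent floor is the standing queue the paper describes.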

Finally, CoDel is supposed to be suitable for deployment in a wide range of devices. The paper says CoDel is "simple and efficient," and can be deployed in low-end devices or "high-end commercial router silicon."

Proof is in the Pudding, er, Deployment

A fair amount of testing has been done on CoDel, but the proof has to come via real-world deployments. According to the paper, Nichols and Jacobson performed "several thousand simulation runs" that showed CoDel "performed very well" with results "compelling enough to move on to the next step of extensive real-world testing in Linux-based routers."

Note that deploying CoDel in a home router may not be enough to rid yourself of bufferbloat. The researchers point out that "a savvy user could be tempted to deploy CoDel through a CeroWrt-enabled edge router to make bufferbloat disappear. Unfortunately, large buffers are not always located where they can be managed but can be ubiquitous and hidden. Examples include consumer-edge routers connected to cable modems and wireless access points with ring buffers. Many users access the Internet through a cable modem with varying upstream link speeds. ... The modem's buffers are at the fast-to-slow transition, and that's where queues will build up: inside a sealed device outside of user control."
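To see why the fast-to-slow transition matters, here is a rough back-of-the-envelope sketch in Python. The function names and every number in it (a 256 KiB modem buffer, a 100 Mbit/s home network, a 2 Mbit/s upstream link) are hypothetical values chosen for illustration, not figures from the paper.

```python
def drain_delay_seconds(buffer_bytes, link_bps):
    """Time for a full buffer to drain onto a link, i.e. the delay it adds."""
    return buffer_bytes * 8 / link_bps


def fill_time_seconds(buffer_bytes, arrival_bps, link_bps):
    """Time for sustained traffic to fill the buffer; None if it never fills."""
    surplus_bps = arrival_bps - link_bps
    if surplus_bps <= 0:
        return None  # the link keeps up, so no standing queue builds
    return buffer_bytes * 8 / surplus_bps


# Assumed, illustrative values only.
BUFFER_BYTES = 256 * 1024   # hypothetical modem buffer
LAN_BPS = 100_000_000       # fast side: home network
UPSTREAM_BPS = 2_000_000    # slow side: cable upstream

print(f"fills in {fill_time_seconds(BUFFER_BYTES, LAN_BPS, UPSTREAM_BPS):.4f} s")
print(f"adds {drain_delay_seconds(BUFFER_BYTES, UPSTREAM_BPS):.3f} s of delay")
```

Under these assumptions the buffer fills in a few hundredths of a second and then adds roughly a second of delay to every packet behind it. An AQM running on the router upstream of that modem never sees this queue.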

So, don't expect CoDel to solve the bufferbloat problem overnight, or even by the end of the year. Gettys says that "work to integrate adaptive AQM algorithms into wireless systems work will take months or years, rather than the week that initial CoDel prototype implementation for Ethernet took. But at least much testing of the CoDel algorithm, experimentation, and refinement can now take place."

There's also the matter of tackling wireless, which Gettys says "may be much more difficult, both because queuing is sometimes much more complex than Ethernet, but also since packet aggregation has resulted in OS/driver boundaries hiding information that is necessary for proper functioning."

But CoDel is still very good news, and it shows that the community that has come together around the bufferbloat problem is making progress. Bufferbloat won't be solved in one fell swoop by a single breakthrough, but by a number of technology improvements over time.

(Lead image provided courtesy of Jeremiah Ro via Flickr, under the Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0) license.)