Chuck Kollars Home School PC Administration

Traffic Shaping
to Prevent Monopolization


"Traffic shaping" is reordering network packets from their usual simple first-come-first-served order according to some desired priority scheme. It's an extra feature that's sometimes layered on top of a working TCP/IP connection; it's not needed just to get a connection to work at all. Traffic Shaping is sometimes used by carriers (and even ISPs) and large shared connections (corporate, school, etc.), and was previously even sometimes used for individual connections over slow modems. It's almost irrelevant to current SOHO networks though.

(Somewhat confusingly, the same term "traffic shaping" is also sometimes used to refer to the "bandwidth limiting" that some carriers have sometimes done. In the past, a few such implementations have been of questionable validity, perhaps unfairly giving a bad reputation to Traffic Shaping in general. Depending on the exact implementation details, sometimes carrier bandwidth limiting is virtually impossible to detect unambiguously, while other times it causes clear protocol violations which are sometimes reported to the user as communications errors.)

We can use traffic shaping for example to guarantee reasonable/fair access of every client user/computer to the Internet, prohibiting any one user/computer from either intentionally or accidentally monopolizing our bandwidth. And we can use it to emphasize legitimate educational uses while deemphasizing downloads of very large files so our school-provided network doesn't become simply a "fast download utility".

Bandwidth monopolization had not severely affected us, so we were leisurely about our implementation. Our implementation is now in place, and it's comforting to know it's there, handling the transient problems we were unaware of and otherwise serving mainly as a backstop.

Regular and careful attention to blocking particular sites largely stopped problem traffic before it could monopolize our bandwidth. We block most (but not all) P2P (peer-to-peer) file sharing traffic with a combination of IP "ports" and known external confederate servers. (Many P2P applications rely on a confederate server to bootstrap a new computer into the peer to peer network. Identifying then blocking those confederate servers one at a time is the current Achilles heel of many P2P applications.) Since P2P confederate servers (and even protocols) change all the time, we examine our logs and make appropriate changes very frequently.

Social Engineering

If your goal is simply to minimize appropriation of your bandwidth for big downloads, you may only need traffic shaping to "suggest" which uses are preferred (or even not need it at all). You may not need to completely solve the technical problem of how to rigorously enforce all restrictions. Possibly just making discouraged uses such as big downloads run "slowly" at the same time as preferred uses such as web browsing continue to run "quickly" will be enough.

Demand quickly and thoroughly adapts to whatever resources are provided. So just a minor shift in performance may be enough to profoundly change usage patterns.

Equipment

You'll need a single box at your "choke point" interface between your internal network and the Internet to "shape" the network traffic you place out on the open Internet. This box should almost certainly be a Linux box. Neither Windows nor most non-Linux *nix systems are even remotely capable of traffic shaping, and purpose-specific boxes are usually expensive.

Linux has much much more capability for traffic shaping/traffic control than virtually anything else. A Linux computer is probably better for bandwidth management than the priciest purpose-specific box. Using some other OS (*nix, Windows, etc.) as a reference point will probably lead to complete disbelief that Linux is so powerful ...but it is.

That's the plus. The minus is that making use of these capabilities can be quite confusing. Linux tends to be on the "bleeding edge". There are multiple ways to do most things, multiple possible programs to do them, and multiple possible ways to configure each program. As of today (summer 2006), Linux traffic shaping/traffic control can appear to be a techie's dream but an administrator's nightmare. Nevertheless it's one of the very few realistic options.

Software

There are several application programs that do traffic shaping, as well as very sophisticated capabilities built right into the Linux kernel. The kernel facilities can interact with applications, so most common configurations are actually a combination of use of the kernel with application extensions and/or configuration tools. Linux kernel configuration for traffic shaping is so arcane that some packages that appear to completely implement complex capabilities in fact consist mostly or exclusively of just "simpler" configuration utilities that manipulate the capabilities already in Linux.

Most of the bandwidth management capabilities in Linux are focused on handling a mix of more important (or more time sensitive) and less important traffic. For example, a common use is to give SSH traffic first priority no matter what. In fact, this emphasis on complex bandwidth management problems is so pervasive that almost all Linux documentation talks only about the fancy applications but not about the simple stuff. One can read lots of "Traffic-Control-HOWTO"s and learn a whole lot, yet never get a clue how to do the very basic things we need to do. The simple stuff simply isn't covered.

After experimentation with more than one tool, we wound up using the capabilities built into Linux. Specifically we use the Hierarchical Token Bucket (HTB) and Stochastic Fair Queueing (SFQ) queuing disciplines that are built into most Linuxes. We use Shorewall as the front-end user interface to these capabilities, rather than trying to configure the capabilities directly ourselves. We found that using Shorewall (which is mainly used to configure firewall capabilities but is also able to configure traffic shaping capabilities) made our traffic shaping configuration much simpler to create and much easier to understand.

What Purpose?

Typically "traffic shaping" involves classifying traffic flows --often based on the contents of individual packets-- and treating them differently. But our focus will be on a different purpose: preventing any one user/computer from monopolizing our Internet connection. "Traffic shaping" is the tool for this purpose too. But the existing documentation and example configurations barely scratch the surface of using traffic shaping for this purpose. In fact they often seem to incorrectly imply that traffic shaping can't be used at all for the purpose of ensuring every user/computer reasonable access to the Internet. Schools will need to tweak a little bit to adapt the existing close-but-not-exact technology to their needs.

Most current traffic shaping documentation focuses on two things: 1) making a single user's system work as fast as possible by separating "interactive" from "batch" traffic and handling them differently, and 2) making downloads run as quickly as possible even when something else is also going on. Schools need something a little different, and most current traffic shaping documentation is a little bit off the mark. For example the parts about making downloads of large files run quickly even though something else is going on are exactly the wrong thing to do for schools and will definitely need to be modified. Most schools would be well served by simply looking at the amount of bandwidth used so far and lowering the priority of any traffic flow that's been going on too long. But most traffic shaping doesn't work exactly this way, and this use isn't even mentioned by most documents.
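To make the "look at the amount of bandwidth used so far" idea concrete, here is a minimal sketch in Python. It is not how any particular Linux tool works; the threshold, the priority numbers, and the function names are all assumptions made up for illustration.

```python
# Hypothetical sketch: demote a flow once it has moved "too many" bytes.
# The threshold and the priority numbers are illustrative assumptions,
# not values taken from any real tool.

NORMAL_PRIORITY = 4                      # e.g. ordinary web browsing
DEMOTED_PRIORITY = 5                     # e.g. bulk downloads
DEMOTE_AFTER_BYTES = 5 * 1024 * 1024     # assumed threshold: 5 MiB


def priority_for(bytes_so_far, threshold=DEMOTE_AFTER_BYTES):
    """Return the priority class for a flow, given the bytes it has moved."""
    if bytes_so_far > threshold:
        return DEMOTED_PRIORITY
    return NORMAL_PRIORITY


def track(flows, flow_id, packet_len):
    """Add one packet to a flow's running byte count; return its priority."""
    flows[flow_id] = flows.get(flow_id, 0) + packet_len
    return priority_for(flows[flow_id])


if __name__ == "__main__":
    flows = {}
    # A short page fetch stays at normal priority.
    print(track(flows, "pc12->web", 40_000))
    # A long download eventually crosses the threshold and is demoted.
    last = NORMAL_PRIORITY
    for _ in range(4000):
        last = track(flows, "pc34->mirror", 1500)
    print(last)
```

The point of the sketch is that no individual packet needs to look suspicious; only the running total per flow matters.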

It can be tricky to tune traffic shaping parameters so everyone is happy, especially with the Squid method where the parameters must allow for not only normal traffic but also traffic that's "outside" the Squid traffic shaping regime because the traffic is not web-like. In fact, getting the Squid parameters right may be so difficult that the rule of thumb "if it ain't broke, don't fix it" should apply. In other words, if nobody has complained of network monopolization and you don't have any evidence of it and you're not using the Linux method, you may be better off just letting your network alone than trying to take preemptive action to solve a problem that hasn't annoyed your users.

Biggest Technical Challenge

Schools usually want to deemphasize large downloads and give preference to everything else. One way to do this is to identify every type of important traffic leaving your network and explicitly give them all higher priority, then let what's left (hopefully mainly the large downloads) default to a low priority.

While you can easily exclude most downloads this way, downloads over HTTP (web port 80) can't easily be distinguished; they appear to be just more legitimate web traffic. What's different is the sheer size and length of such flows, but each individual packet in isolation seems innocuous. Trying to identify and mark such flows using the Linux traffic shaping capabilities (that mainly key off the contents of individual packets) is currently a technical challenge.

In fact currently the easiest solution may be to not even try to identify and shape such flows, but rather discourage them some other way. A web filter like DansGuardian/Squid might also be used to psychologically discourage HTTP downloads by requiring the entire file to be received into the cache before any part of it is returned to the user.

Where To Do Traffic Shaping

With a Linux system you have three obvious candidate places to do Traffic Shaping.

  1. the delay_pools feature of the web proxy application `Squid` (Note that if you're using the web filter DansGuardian, you're already running `Squid`.)
  2. within the IPchains/IPtables part of the Linux kernel using a direct interface such as `tc` to specify queueing disciplines such as HTB and SFQ
  3. the Shorewall front-end to the Linux kernel packet filtering and prioritizing capabilities (at first glance Shorewall traffic shaping looks quite different from Linux kernel capabilities even though it's actually the same)

Each candidate has advantages and disadvantages.

Some of these methods prevent monopolization in a round about way or as a side effect. For example effectively limiting outbound ACKs will almost certainly provide fairness in use of inbound bandwidth even though inbound bandwidth isn't being directly metered.

Traffic shaping with Linux has changed a whole lot fairly recently. Using Linux kernel capabilities for traffic shaping was nerd nirvana two years ago, still on the edge of what mere mortals could handle a year ago, difficult at the beginning of this year, yet fairly straightforward by August 2006. The hot new technology of a couple years ago (wondershaper) has already been obviated by similar functionality built into Shorewall.

Squid candidate
  advantages

Probably most importantly, Squid traffic shaping meters the incoming (download) packets for traffic shaping. Everything else must look at the outgoing (upload) packets because a system has no control over the rate someone sends it packets. Squid is able to meter traffic in the reverse direction because it "holds" the received information and can sometimes pass it along to the end station at a different rate than it was received.

The configuration appears simpler. This is especially true in two cases. If you're comparing to the direct configuration of Linux kernel capabilities Squid will look quite a bit simpler. And separating the traffic by IPaddress would require hundreds of Linux kernel rules (usually generated by some kind of loop, for example in a shell script), but is done automatically by Squid.

All traffic from one workstation (really one IPaddress) is counted together. Other forms of traffic shaping often separate the traffic into "flows", which more closely correspond to individual application programs on an end system than they do to the whole end system. (In many cases this isn't really much of an advantage.)

There's a lot of room to add enhancements, and the syntax for doing so is straightforward. Bandwidth limitations can be tied to certain times of day. Some stations can be excepted or given different limits very easily. And destinations can be specified as a group of systems sharing a DNS "domain name" rather than as individual IPaddresses.

  disadvantages

Squid sees and can shape only HTTP (web) traffic. Other things like email and most file transfers don't go through Squid. In the best case this makes the Squid parameters a little more difficult to tune as they need to be set loose enough to accommodate other traffic. In the worst case it completely rules out using Squid for traffic shaping. If you suspect you have significant traffic that doesn't use the www port (ex: bittorrent), do not try to implement traffic shaping in Squid. Although you expend a lot of effort, in the end it will be ineffective.

The current Squid methodology does not allow one station to run at full speed when no other station is active. Bandwidth limits will always be applied to individual stations, not just when they're really necessary. Without support of bandwidth "borrowing", your users will be throttled even when some of your bandwidth is idle. If one of your requirements is having one station run full speed when everybody else is idle, you cannot use Squid traffic shaping.

Linux kernel candidate
  advantages

The Linux kernel capabilities are extremely flexible. You can make them do almost anything.

Configurations are widely available; you may be able to find something already built that fits your needs and just copy it.

  disadvantages

Configuration is pretty complex. Only computer nerds seem able to cope with it.

"Tweaking" an example that's almost but not quite right pretty much requires you to understand the whole thing. So the advantage of using a pre-built configuration is negated if it doesn't exactly fit your need.

Identification of type of traffic simply by IP "port" doesn't work very well any more, as too many modern applications try to sneak by filters by masquerading or using IP "port" in unconventional ways. This is particularly a problem with PeerToPeer (p2p) file sharing protocols. Clever configuration that demotes the non-preferred traffic without explicitly identifying it can largely obviate this, as can the "ipp2p" Linux feature.

Shorewall candidate
  advantages

Configuration is much simpler ...maybe even simpler than with Squid. This is largely because all the structure and most of the parameters have already been chosen. All you have to do is fill in the few remaining parameters.

A fair round-robin-like system (SFQ) is built right into the Shorewall system and is always enabled by default. Without even thinking about it, you get this additional layer of fairness which helps prevent bandwidth monopolization even without configuring traffic classes. At root this is the same SFQ functionality that exists in the Linux kernel, but if you configure traffic shaping directly in the Linux kernel, you have to explicitly specify its use, whereas if you use Shorewall it's always on by default.
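The round-robin idea behind SFQ can be sketched in a few lines of Python. This is a deliberate simplification and an assumption-laden toy: it keeps one queue per flow and serves one packet per flow per turn, whereas the real SFQ hashes flows into a limited number of buckets. The function name and flow labels are invented for the example.

```python
from collections import deque


def round_robin_order(packets):
    """Return payloads in the order a simple per-flow round-robin sends them.

    packets: list of (flow_id, payload) tuples in arrival order.
    """
    queues = {}                          # flow_id -> deque of waiting payloads
    for flow_id, payload in packets:
        queues.setdefault(flow_id, deque()).append(payload)

    sent = []
    while queues:
        # Visit every flow that still has packets, one packet per visit.
        for flow_id in list(queues):
            queue = queues[flow_id]
            sent.append(queue.popleft())
            if not queue:
                del queues[flow_id]
    return sent


if __name__ == "__main__":
    arrivals = [("bulk", "b1"), ("bulk", "b2"), ("bulk", "b3"),
                ("bulk", "b4"), ("web", "w1"), ("web", "w2")]
    # The web packets no longer wait behind the whole bulk backlog.
    print(round_robin_order(arrivals))
    # ['b1', 'w1', 'b2', 'w2', 'b3', 'b4']
```

With plain first-come-first-served, both web packets would have waited behind all four bulk packets.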

  disadvantages

You can't do anything different than what the dictated structure and default parameters provide. The dictated structure and default parameters have been very wisely chosen so they'll probably work for you. But in the few cases where they don't work for you, there's no way to do something different.

Extensions such as different limitations at different times of day may be tricky. (On the other hand because of the way Shorewall traffic shaping operates, you hopefully won't need to make any such extensions.)

Specifying whole groups of destination systems by IPaddress isn't quite as simple as just specifying the "domain name" they all share.

Since Shorewall is a front end configuration utility for the Linux kernel functionality and has no functionality of its own, it suffers many of the same limitations of the direct Linux kernel method of traffic shaping. Specifically, identification of type of traffic simply by IP "port" often doesn't work very well any more, as too many modern applications try to sneak by filters by masquerading or using IP "port" in unconventional ways. Clever configuration that demotes the non-preferred traffic without explicitly identifying it can largely obviate this, as can the "ipp2p" Linux feature.

Our Squid Experiences

We tried several different Squid traffic shaping parameters. But we could never get it tuned so its operation satisfied everyone; eventually we gave up.

Hindsight revealed we had all along an un-verbalized requirement that one station run at full speed when all the other stations were idle. Since no Squid delay_pools parameters can fully satisfy this requirement yet still prevent monopolization, it's not surprising we failed.

Squid Implementation Details

There's quite a bit of documentation available. Web search for "squid delay pools" and you'll find it. Even most of this documentation though appears more complex than it needs to be; take it with a grain of salt.

Unfortunately most earlier and some current pre-built distributions of Squid don't include the delay pools facility. So it may be necessary to either obtain or rebuild Squid with "--enable-delay-pools". Pre-built binaries may be available from a non-standard website (possibly http://www.acmeconsulting.it/pagine/opensource/download/squid.htm). If you feed a delay pools configuration to a version of Squid that isn't built to include the delay pools facility, you'll probably just get a bunch of (possibly nonsensical and even misleading) error messages.

Squid provides three slightly different forms of delay pools for specifying traffic shaping in different situations. Class 1 delay pools are most appropriate for limiting overall bandwidth usage without metering individual stations. Class 2 delay pools are most appropriate for small networks that are entirely encompassed by a single Class C IPaddress range. Class 3 delay pools are most appropriate for either a collection of subnetworks such as departments in a college, or for large networks without subnetworks.

Squid delay pools don't understand Subnet Masks, CIDR, or even the internal structure of IP Addresses. If you specify class 2, Squid delay pools will simply use the fourth eight bits of the IPaddress to select which computer pool no matter what and will be limited to 255 computer pools. If you have more than one internal computer with the same last eight bits in its IPaddress, all such computers will be placed in a single delay pool and their combined bandwidth use will be limited to what you intended for one computer. (For example the total network traffic from the computers 172.16.5.34 and 172.16.8.34 would be limited just as if the pair were a single computer #34.) Class 3 delay pools use the third eight bits of the IPaddress to select the uber-pool, and the fourth eight bits of the IPaddress to select the pool-within-a-pool for a total of 256^2 pools.
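The pool selection just described is easy to check with a little Python. This only mimics the octet-picking behavior described above; it is not Squid code, and the function names are made up for the illustration.

```python
def octets(ip):
    """Split a dotted-quad IPv4 address into four integers."""
    return [int(part) for part in ip.split(".")]


def class2_pool(ip):
    """Class 2: only the fourth octet selects the per-computer pool."""
    return octets(ip)[3]


def class3_pool(ip):
    """Class 3: third octet picks the outer pool, fourth the inner pool."""
    parts = octets(ip)
    return (parts[2], parts[3])


if __name__ == "__main__":
    a, b = "172.16.5.34", "172.16.8.34"
    # Under class 2 the two computers collide in the same pool.
    print(class2_pool(a), class2_pool(b))     # 34 34
    # Under class 3 they land in different pools.
    print(class3_pool(a), class3_pool(b))     # (5, 34) (8, 34)
```

If your address plan reuses the same last octet on different subnets, this is a quick way to see whether class 2 would lump unrelated computers together.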

Squid's delay pools algorithm ("leaky bucket") is fast, simple, and widely used, but it doesn't always behave quite like you expect it to. It's convenient to think of the two numbers as "bandwidth limit" and "max file size" (or "throttle" and "burst size"), and that fiction is good enough for suggesting initial parameter values. But actually they're the bucket fill rate and the bucket capacity, so the fiction is not exact. For example Squid delay pools sometimes won't "demote" a connection until it's rapidly consumed two or three times the configured number of bytes.
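A tiny simulation shows why "fill rate" and "capacity" don't behave exactly like "bandwidth limit" and "max file size". This is a generic bucket model written for illustration, not Squid's actual implementation; the class name, the one-second time step, and the numbers are all assumptions.

```python
class Bucket:
    """A bucket that refills at a steady rate up to a fixed capacity."""

    def __init__(self, fill_rate, capacity):
        self.fill_rate = fill_rate      # bytes added per second
        self.capacity = capacity        # most bytes the bucket can hold
        self.level = capacity           # assume the bucket starts full

    def take(self, wanted):
        """Hand out up to `wanted` bytes, limited by what's in the bucket."""
        granted = min(wanted, self.level)
        self.level -= granted
        return granted

    def tick(self, seconds=1):
        """Refill for `seconds`, never exceeding capacity."""
        self.level = min(self.capacity,
                         self.level + self.fill_rate * seconds)


def simulate(fill_rate, capacity, demand_per_second, seconds):
    """Return the bytes actually delivered in each one-second step."""
    bucket = Bucket(fill_rate, capacity)
    delivered = []
    for _ in range(seconds):
        delivered.append(bucket.take(demand_per_second))
        bucket.tick()
    return delivered


if __name__ == "__main__":
    # 8000 bytes/s fill rate, 64000 byte capacity, client wants 40000 bytes/s.
    print(simulate(8000, 64000, 40000, 5))
    # [40000, 32000, 8000, 8000, 8000]
```

Notice the client receives 72000 bytes before settling down to the fill rate, which is more than the 64000 byte capacity, because the bucket keeps refilling while the burst is draining it. That is at least one reason the "max file size" reading is only approximate; we don't claim it's the whole explanation of what Squid does.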

It's straightforward to put modified configurations into Squid quickly without rebooting or restarting anything. You might need to do this in the process of adding the delay pools configuration. Once everything is running okay, you won't need to do it any more. To force Squid to reprocess its configuration parameters, enter at a shell prompt (logged in as the same user that started squid)

      squid -k reconfigure

Note that a local Squid client such as DansGuardian must be stopped while you restart Squid. If you force a restart of Squid while DansGuardian is still alive running on top of it, you will wind up with a nonfunctional (and possibly hung) system.

Our Shorewall Experiences

We use Shorewall to put all our traffic into one of five priority categories. Our categories, highest priority first, are:

  1. SSH and Telnet - ports 22 (or non-default) and 23
    Normally there's no traffic in this category. But in an emergency, this ensures we can always get in quickly to fix the problem, even from remote locations.
  2. DNS - port 53
    We have internal caching DNS servers which absorb almost all traffic. We treat the little bit that goes offsite as very important, since just a little bit of missing DNS information can delay access to a web page.
  3. Incoming Email (POP3) - port 110
  4. World Wide Web - ports 80 and 443
    This is the bulk of our traffic. Much of the time there's nothing in the categories above it and this is in effect our most important traffic.
  5. everything else (bulk downloads, P2P, outgoing [SMTP] email, etc.)

This configuration works very well to meet our goals:

Specifically identifying all the bulk download activity we want to discourage would be difficult, time-consuming, and require constant maintenance. By simply letting it all fall into the "everything else" category with lower priority than world wide web access, we finesse this difficulty. We keep our network from becoming nothing more than a download utility service, yet we don't need to constantly invest effort in maintenance.

We did not try to categorize traffic all the way down to the individual computer that generated it. We found that just the more general prioritization by kind of traffic was sufficient. Simply by giving preference to world wide web traffic, any computer that's generating some other kind of traffic is demoted without our having to exactly identify either the computer or the traffic. Monopolization of bandwidth by a specific computer is a bit difficult to comprehend even in theory, and our experience is it doesn't happen in practice and isn't worth worrying about.
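The five categories can be pictured as a simple lookup with a catch-all at the end. The sketch below is only a model of the idea, written in Python; it assumes matching on destination port alone (real rules can also match protocol and addresses), and the names are invented for the example.

```python
# Category number -> destination ports, highest priority first.
# Port numbers are the default ones listed in the categories above.
CATEGORY_PORTS = {
    1: {22, 23},        # SSH and Telnet
    2: {53},            # DNS
    3: {110},           # incoming email (POP3)
    4: {80, 443},       # world wide web
}
EVERYTHING_ELSE = 5     # bulk downloads, P2P, outgoing email, etc.


def category_for(dest_port):
    """Return the priority category for a packet's destination port."""
    for category, ports in CATEGORY_PORTS.items():
        if dest_port in ports:
            return category
    # Nothing matched: the traffic is demoted without ever being identified.
    return EVERYTHING_ELSE


if __name__ == "__main__":
    for port in (22, 53, 110, 443, 6881):
        print(port, category_for(port))
```

The last line of `category_for` is the whole trick: discouraged traffic never has to be recognized, it just fails to match anything preferred.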

Shorewall Caveats and Assumptions

Each publicly visible server would go in the DMZ (not on the LAN), and is expected to require adjustments to both packet rules and traffic shaping. Such adjustments would probably involve alternate destination IPs, alternate port numbers, SOURCE columns, and something like Proxy ARP. Publicly visible servers point up the larger issue of balancing two competing uses of the same Internet drop: internal users of external services, and external users of our own servers. As we currently have no publicly visible servers, the example below does not accommodate them.

If it was quicker or if there was any doubt at all or if there was any possibility of a strange implementation surfacing, UDP as well as TCP ports were specified; also a few extra TCP ports were specified. (For example, there's almost certainly no legitimate UDP traffic on ports 20, 21, or 110; and there's probably no legitimate incoming TCP traffic other than ACKs on port 20). Also, because of confusion over some FTP-like clients using "passive mode" and others not, the configuration which attempts only to accommodate outgoing FTP-like traffic generated by clients may appear to start to accommodate incoming FTP-like traffic to a server too. This careless paranoia doesn't hurt anything in our environment, but the extraneous inclusions may make the example a bit less clear.

Shorewall Implementation Details

The configuration is simply a matter of setting up three Shorewall configuration files, then running Shorewall just as we already did for our firewall. (Shorewall traffic shaping configuration may be more complex with multiple simultaneous ISPs, but that was not our situation and we avoided any possible complexity.)

The three configuration files are /etc/shorewall/tcdevices, /etc/shorewall/tcrules, and /etc/shorewall/tcclasses. Here are our contents of those three files. Especially note the option "default" on the last line of tcclasses file; our scheme would not work without it.

The Shorewall configuration below was used in production on the RedHat distribution of Linux, which back then had very few tools for direct use of the Linux kernel's IPtables capability. Pay close attention to any current information about using Shorewall on different distributions of Linux. Without taking current distribution-specific steps, use of the Shorewall front end is likely to conflict with direct use of IPtables on many current distributions.

tcdevices
#INTERFACE      IN-BANDWIDTH    OUT-BANDWIDTH
# These available bandwidth numbers _should_mesh_with_reality_.
#  The example numbers shown below meshed with the service provided by our ISP. 
#  The outgoing bandwidth number _must_ be adjusted to what your ISP's 
#  equipment and plant are capable of, less expected inbound traffic bandwidth.  
# Note we limit out-bandwidth to considerably less than the modem is capable
#  of because inbound traffic _always_ predominates at IMHS.
# Also note we specify an in-bandwidth of 0, which means no qdisc is assigned
#  to the inbound direction of the interface. This is quicker and it eliminates
#  any possibility of the "Classifier actions preferred over netfilter" message.
#  But it means we don't control ingress at all so packets _could_ back up
#  in a queue somewhere inside our ISP.
eth1            0mbit           3mbit

tcrules
#MARK   SOURCE          DESTINATION     PROTOCOL        PORT(S) SOURCE PORT(S)
# Net effect is highest priority to admin/interactive, second to DNS in/out,
# third to email in, fourth to web (both http and https) in/out,
# and everything else last
1       0.0.0.0/0       0.0.0.0/0       tcp             20,21,22,23,220,222,2222
1       0.0.0.0/0       0.0.0.0/0       udp             20,21,22,23,220,222,2222
2       0.0.0.0/0       0.0.0.0/0       tcp             53
2       0.0.0.0/0       0.0.0.0/0       udp             53
3       0.0.0.0/0       0.0.0.0/0       tcp             110
3       0.0.0.0/0       0.0.0.0/0       udp             110
4       0.0.0.0/0       0.0.0.0/0       tcp             80,443

tcclasses
#INTERFACE      MARK    RATE    CEIL            PRIORITY        OPTIONS
# We use
#  mark/prio 1 for interactive (only remote administration, very little),
#  mark/prio 2 for DNS (normally much much less, this is a worst case ceiling,
#   the high priority is to minimize hangs as much as possible)
#  mark/prio 3 for email in (usually uses above allocations too, runs quickly
#   when needed, but just passes allocation down most of the time),
#  mark/prio 4 for web (main legitimate use),
#  and mark/prio 5 is everything else (i.e. P2P, MMORPG, "bulk", etc.)
#   (last class is assumed to contain most discouraged uses and little else)
# Class 5 guaranteed rate is very small to reduce
#  responsiveness so much MMORPGs aren't usable - mostly class 5 bandwidth
#  depends on "surplus" bandwidth cascading down unused from higher 
#  priorities, and as a result is too bursty for interactive use
eth1            1       2*full/100      full    1
eth1            2       20*full/100     full    2
eth1            3       10*full/100     full    3
eth1            4       64*full/100     full    4
eth1            5       4*full/100      full    5               default
#                       Note sum of RATEs is exactly 100%.
#
# Placing a CEIL other than "full" (or almost-full?) on the last
#  class slows our web responsiveness noticeably (!?) - some sort of interaction
#  with our ISP seems the most likely explanation; maybe enforcement of CEIL
#  introduces short quiet periods and our ISP interprets those quiet periods
#  as our needing less bandwidth...  otherwise it might be prudent to 
#  specify CEIL as 9*full/10 as yet one more method of enforcing that no
#  one class could ever completely exclude other classes no matter what
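
To see what the tcclasses numbers work out to, here is a rough Python model. It treats "3mbit" as 3000 kbit, gives each class its guaranteed RATE first, then hands leftover bandwidth to classes in PRIORITY order (CEIL is "full" for every class). This is a simplified sketch of that cascading behavior under those assumptions, not the kernel's actual HTB algorithm; the function names and the demand figures are made up.

```python
FULL_KBIT = 3000    # assumed: the 3mbit out-bandwidth, in kbit

# mark -> (percent of full that is guaranteed, priority)
CLASSES = {1: (2, 1), 2: (20, 2), 3: (10, 3), 4: (64, 4), 5: (4, 5)}


def guaranteed_rates(full=FULL_KBIT):
    """Return mark -> guaranteed rate in kbit."""
    return {mark: pct * full // 100 for mark, (pct, _prio) in CLASSES.items()}


def allocate(demands, full=FULL_KBIT):
    """Split `full` kbit among classes given mark -> demanded kbit."""
    rates = guaranteed_rates(full)
    # Step 1: every class gets its demand, up to its guaranteed rate.
    granted = {m: min(demands.get(m, 0), rates[m]) for m in CLASSES}
    surplus = full - sum(granted.values())
    # Step 2: leftover bandwidth cascades down in priority order.
    for mark in sorted(CLASSES, key=lambda m: CLASSES[m][1]):
        extra = min(demands.get(mark, 0) - granted[mark], surplus)
        granted[mark] += extra
        surplus -= extra
    return granted


if __name__ == "__main__":
    print(guaranteed_rates())
    # {1: 60, 2: 600, 3: 300, 4: 1920, 5: 120}
    # Busy web traffic plus a greedy download: web gets all it asks for.
    print(allocate({4: 2500, 5: 3000}))
    # {1: 0, 2: 0, 3: 0, 4: 2500, 5: 500}
    # A download alone on an idle link may use the whole link.
    print(allocate({5: 3000}))
    # {1: 0, 2: 0, 3: 0, 4: 0, 5: 3000}
```

The last case is the "borrowing" behavior we could not get from Squid: one station can run at full speed when everybody else is idle.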

Extensions

You could do more than just simply prevent bandwidth monopolization with traffic shaping. Additional things you could do include:


All content on this Personal Website (including text, photographs, audio files, and any other original works), unless otherwise noted on individual webpages, are available to anyone for re-use (reproduction, modification, derivation, distribution, etc.) for any non-commercial purpose under a Creative Commons License.