http://luxik.cdi.cz/~devik/qos/htb/
http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#ceiling
New text is shown in red. The coloring is removed from new text after 3 months. Currently it marks the HTB3 changes.
This document shows you how to use HTB. Most sections have examples, charts (with measured data) and discussion of particular problems.
This release of HTB should also be much more scalable. See the comparison at the HTB home page.
Please read: the tc tool (not only HTB) uses shortcuts to denote units of rate. kbps means kilobytes per second and kbit means kilobits per second! This is the most frequently asked question about tc in Linux.
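Because the two suffixes differ by a factor of eight, it can help to convert explicitly before writing a rule. The following is a minimal shell sketch; the function names kbps_to_kbit and kbit_to_kbps are made up for illustration and are not part of tc.

```shell
# Convert between tc's two rate suffixes.
# kbps = kilobytes per second, kbit = kilobits per second (1 byte = 8 bits).
kbps_to_kbit() {
    echo $(( $1 * 8 ))
}

kbit_to_kbps() {
    # Integer division: rounds down when the value is not a multiple of 8.
    echo $(( $1 / 8 ))
}

kbps_to_kbit 100   # a rate written as 100kbps is 800 kbit
kbit_to_kbps 120   # a rate written as 120kbit is 15 kbps
```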
HTB ensures that the amount of service provided to each class is at least the minimum of the amount it requests and the amount assigned to it. When a class requests less than the amount assigned, the remaining (excess) bandwidth is distributed to other classes which request service.
Also see the document about HTB internals; it describes the goal above in greater detail.
Note: In the literature this is called "borrowing" the excess bandwidth. We use that term below to conform with the literature. We mention, however, that this seems like a bad term since there is no obligation to repay the resource that was "borrowed".
The different kinds of traffic above are represented by classes in HTB. The simplest approach is shown in the picture at the right.
Let's see what commands to use:
tc qdisc add dev eth0 root handle 1: htb default 12
This command attaches queue discipline HTB to eth0 and gives it the "handle" 1:. This is just a name or identifier with which to refer to it below. The default 12 means that any traffic that is not otherwise classified will be assigned to class 1:12.
Note: In general (not just for HTB but for all qdiscs and classes in tc), handles are written x:y where x is an integer identifying a qdisc and y is an integer identifying a class belonging to that qdisc. The handle for a qdisc must have zero for its y value and the handle for a class must have a non-zero value for its y value. The "1:" above is treated as "1:0".
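The x:y rule can be written down as a small shell sketch. The helper names normalize_handle and handle_kind are hypothetical, and the sketch assumes its argument always contains a colon.

```shell
# Expand a tc handle to its full x:y form.
# A missing y (as in "1:") is read as 0, which denotes the qdisc itself.
normalize_handle() {
    major=${1%%:*}     # text before the first colon
    minor=${1#*:}      # text after the first colon (may be empty)
    echo "${major}:${minor:-0}"
}

# y = 0 means a qdisc, anything else means a class.
handle_kind() {
    full=$(normalize_handle "$1")
    if [ "${full#*:}" = "0" ]; then
        echo qdisc
    else
        echo class
    fi
}

normalize_handle 1:    # the qdisc handle used above
handle_kind 1:12       # the default class
```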
tc class add dev eth0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 30kbps ceil 100kbps
tc class add dev eth0 parent 1:1 classid 1:11 htb rate 10kbps ceil 100kbps
tc class add dev eth0 parent 1:1 classid 1:12 htb rate 60kbps ceil 100kbps
The first line creates a "root" class, 1:1 under the qdisc 1:. The definition of a root class is one with the htb qdisc as its parent. A root class, like other classes under an htb qdisc, allows its children to borrow from each other, but one root class cannot borrow from another. We could have created the other three classes directly under the htb qdisc, but then the excess bandwidth from one would not be available to the others. In this case we do want to allow borrowing, so we have to create an extra class to serve as the root and put the classes that will carry the real data under that. These are defined by the next three lines. The ceil parameter is described below.
Note: Sometimes people ask me why they have to repeat dev eth0 when they have already used handle or parent. The reason is that handles are local to an interface, e.g., eth0 and eth1 could each have classes with handle 1:1.
We also have to describe which packets belong in which class. This is really not related to the HTB qdisc. See the tc filter documentation for details. The commands will look something like this:
tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 \
   match ip src 1.2.3.4 match ip dport 80 0xffff flowid 1:10
tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 \
   match ip src 1.2.3.4 flowid 1:11
(We identify A by its IP address which we imagine here to be 1.2.3.4.)
Note: The U32 classifier has an undocumented design bug which causes duplicate entries to be listed by "tc filter show" when you use U32 classifiers with different prio values.
You may notice that we didn't create a filter for the 1:12 class. It might be clearer to do so, but this illustrates the use of the default. Any packet not classified by the two rules above (any packet not from source address 1.2.3.4) will be put in class 1:12.
Now we can optionally attach queuing disciplines to the leaf classes. If none is specified the default is pfifo.
tc qdisc add dev eth0 parent 1:10 handle 20: pfifo limit 5
tc qdisc add dev eth0 parent 1:11 handle 30: pfifo limit 5
tc qdisc add dev eth0 parent 1:12 handle 40: sfq perturb 10
That's all the commands we need. Let's see what happens if we send packets of each class at 90kbps and then stop sending packets of one class at a time. Along the bottom of the graph are annotations like "0:90k". The horizontal position at the center of the label (in this case near the 9, also marked with a red "1") indicates the time at which the rate of some traffic class changes. Before the colon is an identifier for the class (0 for class 1:10, 1 for class 1:11, 2 for class 1:12) and after the colon is the new rate starting at the time where the annotation appears. For example, the rate of class 0 is changed to 90k at time 0, 0 (= 0k) at time 3, and back to 90k at time 6.
Initially all classes generate 90kbps. Since this is higher than any of the rates specified, each class is limited to its specified rate. At time 3, when we stop sending class 0 packets, the rate allocated to class 0 is reallocated to the other two classes in proportion to their allocations, 1 part class 1 to 6 parts class 2. (The increase in class 1 is hard to see because it's only 4 kbps.) Similarly, at time 9 when class 1 traffic stops, its bandwidth is reallocated to the other two (and the increase in class 0 is similarly hard to see). At time 15 it's easier to see that the allocation to class 2 is divided 3 parts for class 0 to 1 part for class 1. At time 18 both class 1 and class 2 stop, so class 0 gets all 90 kbps it requests.
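The proportions quoted above can be reproduced with a little shell arithmetic. This is only a simplified model of the sharing, assuming the excess is split strictly in proportion to the assigned rates and ignoring ceil and quantum effects; the function name share_excess is made up for illustration.

```shell
# Split the bandwidth left unused by an idle class among the active
# classes in proportion to their assigned rates (all values in kbps).
# Usage: share_excess EXCESS RATE...
# Prints one line per active class: "<rate> +<share>".
share_excess() {
    excess=$1
    shift
    total=0
    for r in "$@"; do
        total=$(( total + r ))
    done
    for r in "$@"; do
        # Integer arithmetic, so each share is rounded down.
        echo "$r +$(( excess * r / total ))"
    done
}

# Time 3: class 0 (30kbps) goes idle; classes 1 (10kbps) and 2 (60kbps)
# share its bandwidth 1:6.
share_excess 30 10 60

# Time 15: class 2 (60kbps) goes idle; classes 0 and 1 share it 3:1.
share_excess 60 30 10
```

The first call gives class 1 about 4 kbps extra, which is the small increase mentioned above.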
This might be a good time to touch on the concept of quantums. When several classes want to borrow bandwidth, each is given some number of bytes before another competing class is served. This number is called the quantum. If several classes are competing for the parent's bandwidth, they get it in proportion to their quantums. For precise operation, quantums need to be as small as possible while still larger than the MTU.
Normally you don't need to specify quantums manually, as HTB chooses precomputed values. It computes a class's quantum (when you add or change the class) as its rate divided by the global r2q parameter. The default r2q is 10, and because the typical MTU is 1500 the default is good for rates from 15 kBps (120 kbit). For smaller minimal rates, specify r2q 1 when creating the qdisc; that is good from 12 kbit, which should be enough. If you need to, you can specify the quantum manually when adding or changing a class. This lets you avoid warnings in the log when the precomputed value would be bad. When you specify quantum on the command line, r2q is ignored for that class.
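The rate/r2q rule is easy to check by hand. Below is a minimal sketch that takes the rate in bytes per second; it assumes an MTU of 1500 and reads 15 kBps as 15000 bytes per second (how tc interprets the k prefix may differ between versions). The function names are made up for illustration.

```shell
# quantum = rate / r2q, with the rate in bytes per second.
# Usage: htb_quantum RATE_BYTES_PER_SEC [R2Q]   (r2q defaults to 10)
htb_quantum() {
    echo $(( $1 / ${2:-10} ))
}

# Report whether the computed quantum is at least one MTU.
check_quantum() {
    mtu=1500    # assumed typical MTU
    q=$(htb_quantum "$1" "${2:-10}")
    if [ "$q" -ge "$mtu" ]; then
        echo "quantum $q ok"
    else
        echo "quantum $q is below MTU; lower r2q or set quantum by hand"
    fi
}

check_quantum 15000      # 15 kBps (120 kbit) with the default r2q
check_quantum 1500       # 12 kbit with the default r2q: too small
check_quantum 1500 1     # 12 kbit with r2q 1
```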
This might seem like a good solution if A and B were not different customers. However, if A is paying for 40kbps then he would probably prefer his unused WWW bandwidth to go to his own other service rather than to B. This requirement is represented in HTB by the class hierarchy.
Notes: Packet classification rules can assign to inner nodes too. Then you have to attach another filter list to the inner node. Finally you should reach a leaf or the special 1:0 class. The rate supplied for a parent should be the sum of the rates of its children.
The commands are now as follows:
tc class add dev eth0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps
tc class add dev eth0 parent 1:1 classid 1:2 htb rate 40kbps ceil 100kbps
tc class add dev eth0 parent 1:2 classid 1:10 htb rate 30kbps ceil 100kbps
tc class add dev eth0 parent 1:2 classid 1:11 htb rate 10kbps ceil 100kbps
tc class add dev eth0 parent 1:1 classid 1:12 htb rate 60kbps ceil 100kbps
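The note that a parent's rate should be the sum of its children's rates can be checked for the classes above with a small helper. This is only a sketch; check_rate_sum is a hypothetical name and the rates are typed in by hand rather than read from tc.

```shell
# Check that a parent's rate equals the sum of its children's rates.
# Usage: check_rate_sum PARENT_RATE CHILD_RATE...   (same unit throughout)
check_rate_sum() {
    parent=$1
    shift
    sum=0
    for r in "$@"; do
        sum=$(( sum + r ))
    done
    if [ "$sum" -eq "$parent" ]; then
        echo "ok: children sum to $sum"
    else
        echo "mismatch: parent $parent, children sum to $sum"
        return 1
    fi
}

check_rate_sum 40 30 10     # class 1:2 with children 1:10 and 1:11
check_rate_sum 100 40 60    # class 1:1 with children 1:2 and 1:12
```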
We now turn to the graph showing the results of the hierarchical solution. When A's WWW traffic stops, its assigned bandwidth is reallocated to A's other traffic so that A's total bandwidth is still the assigned 40kbps.
If A were to request less than 40kbps in total then the excess would be given to B.
The graph at right differs from the previous one at time 3 (when WWW traffic stops) because A/other is limited to 20kbps. Therefore customer A gets only 20kbps in total and the unused 20kbps is allocated to B.
The second difference is at time 15 when B stops. Without the ceil, all of its bandwidth was given to A, but now A is only allowed to use 60kbps, so the remaining 40kbps goes unused.
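The ceilings described here (20kbps for A/other, 60kbps for A as a whole) could be set with commands along the following lines. This is a sketch reconstructed from the description, not necessarily the exact commands used for the graph. It assumes the class IDs of the hierarchical example (1:2 for A, 1:11 for A/other), and the helper name print_ceil_commands is made up. It only prints the commands, so nothing is changed until you run them yourself.

```shell
# Print (not run) the tc commands that would add the ceilings.
print_ceil_commands() {
    tc="echo tc"    # dry run; use tc=tc to really apply the changes
    # Customer A as a whole may use at most 60kbps.
    $tc class change dev eth0 parent 1:1 classid 1:2 htb rate 40kbps ceil 60kbps
    # A's other (non-WWW) traffic may use at most 20kbps.
    $tc class change dev eth0 parent 1:2 classid 1:11 htb rate 10kbps ceil 20kbps
}

print_ceil_commands
```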
This feature should be useful for ISPs because they probably want to limit the amount of service a given customer gets even when other customers are not requesting service. (ISPs probably want customers to pay more money for better service.) Note that root classes are not allowed to borrow, so there's really no point in specifying a ceil for them.
Notes: The ceil for a class should always be at least as high as the rate. Also, the ceil for a class should always be at least as high as the ceil of any of its children.
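Both rules in the note can be expressed as a quick sanity check before a class is added. The sketch below is illustrative only; check_ceil is a hypothetical helper and the values are entered by hand in a common unit.

```shell
# Usage: check_ceil RATE CEIL [PARENT_CEIL]   (same unit throughout)
check_ceil() {
    # Rule 1: the ceil must be at least as high as the rate.
    if [ "$2" -lt "$1" ]; then
        echo "ceil $2 is below rate $1"
        return 1
    fi
    # Rule 2: the parent's ceil must be at least as high as the child's.
    if [ -n "${3:-}" ] && [ "$3" -lt "$2" ]; then
        echo "ceil $2 exceeds parent ceil $3"
        return 1
    fi
    echo ok
}

check_ceil 10 20 60    # rate 10, ceil 20, parent ceil 60
```

A call such as check_ceil 30 100 60 would report that the child's ceil exceeds the parent's.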
If cburst is smaller (ideally one packet size) it shapes bursts to not exceed ceil rate in the same way as TBF's peakrate does.
When you set burst for a parent class smaller than for some child, you should expect the parent class to get stuck sometimes (because the child will drain more than the parent can handle). HTB will remember these negative bursts for up to 1 minute.
You may ask why I want bursts. It is a cheap and simple way to improve response times on a congested link. For example, WWW traffic is bursty. You ask for a page, get it in a burst and then read it. During that idle period the burst will "charge" again.
Note: The burst and cburst of a class should always be at least as high as that of any of its children.
On the graph you can see the case from the previous chapter where I changed burst for the red and yellow (agency A) classes to 20kb while cburst remained at its default (circa 2 kb).
The green hill at time 13 is due to the burst setting on the SMTP class. It has been under its limit since time 9 and has accumulated 20 kb of burst. The hill is up to 20 kbps high (limited by ceil because it has cburst near packet size).
A clever reader may wonder why there is no red and yellow hill at time 7. It is because yellow is already at its ceil limit, so it has no room for further bursts.
There is at least one unwanted artifact: the magenta crater at time 4. It is because I intentionally "forgot" to add burst to the root link (1:1) class. It remembered the hill from time 1, and when the blue class wanted to borrow yellow's rate at time 4, it denied it and compensated itself.
Limitation: when you operate with high rates on a computer with a low-resolution timer, you need some minimal burst and cburst to be set for all classes. Timer resolution is 10ms on i386 systems and 1ms on Alphas. The minimal burst can be computed as max_rate*timer_resolution, so for 10Mbit on plain i386 you need a burst of 12kb.
If you set the burst too small, you will see a smaller rate than you set. The latest tc tool will compute and set the smallest possible burst when it is not specified.
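The max_rate*timer_resolution rule can be worked through with shell arithmetic. This is a minimal sketch; min_burst_bytes is a made-up name, and it assumes the rate is given in bits per second with 1 Mbit taken as 1000000 bits.

```shell
# Minimal burst in bytes = max_rate * timer_resolution.
# Usage: min_burst_bytes RATE_BITS_PER_SEC TIMER_MS
min_burst_bytes() {
    # bits/s -> bytes/s, then multiply by the timer period in seconds
    echo $(( $1 / 8 * $2 / 1000 ))
}

min_burst_bytes 10000000 10   # 10Mbit with a 10ms timer (i386): 12500
min_burst_bytes 10000000 1    # 10Mbit with a 1ms timer (Alpha): 1250
```

The first result, 12500 bytes, is the roughly 12kb quoted above.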
There is also a second side to the problem: the total delay of a packet. It is relatively hard to measure on ethernet, which is too fast (the delay is negligible). But there is a simple aid. We can add a simple HTB with one class rate-limiting to less than 100 kbps and add a second HTB (the one we are measuring) as its child. Then we can simulate a slower link with larger delays.
For simplicity's sake I use a simple two-class scenario:
# qdisc for delay simulation
tc qdisc add dev eth0 root handle 100: htb
tc class add dev eth0 parent 100: classid 100:1 htb rate 90kbps
# real measured qdisc
tc qdisc add dev eth0 parent 100:1 handle 1: htb
AC="tc class add dev eth0 parent"
$AC 1: classid 1:1 htb rate 100kbps
# the two leaf classes attach to 1:1, the only class created under qdisc 1:
$AC 1:1 classid 1:10 htb rate 50kbps ceil 100kbps prio 1
$AC 1:1 classid 1:11 htb rate 50kbps ceil 100kbps prio 1
tc qdisc add dev eth0 parent 1:10 handle 20: pfifo limit 2
tc qdisc add dev eth0 parent 1:11 handle 21: pfifo limit 2
Note: HTB as a child of another HTB is NOT the same as a class under another class within the same HTB. This is because when a class in HTB can send, it will send as soon as the hardware equipment can. So the delay of an underlimit class is limited only by the equipment and not by its ancestors.
The simulator is set to generate 50 kbps for both classes and at time 3s it executes the command:
tc class change dev eth0 parent 1:1 classid 1:10 htb \
 rate 50kbps ceil 100kbps burst 2k prio 0
As you can see, the delay of the WWW class dropped nearly to zero while SMTP's delay increased. When you prioritize to get better delay, it always makes other classes' delays worse.
Which class should you prioritize? Generally those classes where you really need low delays. Examples are video or audio traffic (where you really need to set the correct rate to prevent that traffic from killing other traffic) or interactive (telnet, SSH) traffic, which is bursty in nature and will not negatively affect other flows.
A common trick is to prioritize ICMP to get nice ping delays even on fully utilized links (though from a technical point of view this is not what you want when measuring connectivity).
# tc -s -d qdisc show dev eth0
qdisc pfifo 22: limit 5p
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
qdisc pfifo 21: limit 5p
 Sent 2891500 bytes 5783 pkts (dropped 820, overlimits 0)
qdisc pfifo 20: limit 5p
 Sent 1760000 bytes 3520 pkts (dropped 3320, overlimits 0)
qdisc htb 1: r2q 10 default 1 direct_packets_stat 0
 Sent 4651500 bytes 9303 pkts (dropped 4140, overlimits 34251)
The first three disciplines are HTB's children. Let's ignore them, as the PFIFO stats are self-explanatory.
# tc -s -d class show dev eth0
class htb 1:1 root prio 0 rate 800Kbit ceil 800Kbit burst 2Kb/8 mpu 0b
    cburst 2Kb/8 mpu 0b quantum 10240 level 3
 Sent 5914000 bytes 11828 pkts (dropped 0, overlimits 0)
 rate 70196bps 141pps
 lended: 6872 borrowed: 0 giants: 0
class htb 1:2 parent 1:1 prio 0 rate 320Kbit ceil 4000Kbit burst 2Kb/8 mpu 0b
    cburst 2Kb/8 mpu 0b quantum 4096 level 2
 Sent 5914000 bytes 11828 pkts (dropped 0, overlimits 0)
 rate 70196bps 141pps
 lended: 1017 borrowed: 6872 giants: 0
class htb 1:10 parent 1:2 leaf 20: prio 1 rate 224Kbit ceil 800Kbit burst 2Kb/8 mpu 0b
    cburst 2Kb/8 mpu 0b quantum 2867 level 0
 Sent 2269000 bytes 4538 pkts (dropped 4400, overlimits 36358)
 rate 14635bps 29pps
 lended: 2939 borrowed: 1599 giants: 0
I deleted the 1:11 and 1:12 classes to make the output shorter. As you can see, there are the parameters we set. There is also level and DRR quantum information.
You have to patch the kernel to make it work with older kernels. Download the kernel source and use patch -p1 -i htb3_2.X.X.diff to apply the patch. Then use make menuconfig;make bzImage as before. Don't forget to enable QoS and HTB.
You will also have to use a patched tc tool. The patch is also in the downloads, or you can download a precompiled binary.
If you think you have found an error, I will appreciate an error report. For oopses I need ksymoops output. For weird qdisc behaviour, add the parameter debug 3333333 to your tc qdisc add .... htb. It will log many megabytes to the syslog facility kern at level debug. You will probably want to add a line like:
kern.debug -/var/log/debug
to your /etc/syslog.conf. Then bzip the log and send it to me via email (up to 10MB after bzipping) along with a description of the problem and its time.