Xen Traffic Control Problems


Problem 1: UDP, VLANs and Lack of Flow Control

Problem
    VLAN devices do not support scatter-gather
    This means that each skb needs to be linearised, and therefore cloned, if it is transmitted on a VLAN device
    Cloning results in the original fragments being released
    This breaks Xen's netfront/netback flow-control

Result
    A guest can flood dom0 with packets
    Very effective DoS attack on dom0 and other domUs

Work-Around
    Use the credit scheduler to limit the rate of a domU's virtual interface to something close to the rate of the physical interface:
        vif = [ "mac=00:16:36:6c:81:ae,bridge=eth4.100, script=vif-bridge,rate=950Mb/s" ]
    Still uses quite a lot of dom0 CPU if domU sends a lot of packets
    But the DoS is mitigated

Partial Solution
    scatter-gather enabled VLAN interfaces
    Problem is resolved for VLANs whose physical device supports scatter-gather
    Still a problem for any other device that doesn't support scatter-gather
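    A quick way to confirm that scatter-gather actually reached a VLAN device is ethtool; a minimal check, assuming the eth4.100 interface from the vif example above (exact output varies between ethtool versions):
        # ethtool -k eth4.100 | grep scatter-gather
        scatter-gather: on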

Patches
    Included in v2.6.26-rc4
        "Propagate selected feature bits to VLAN devices" and
        "Use bitmask of feature flags instead of seperate feature bit" by Patrick McHardy.
        "igb: allow vlan devices to use TSO and TCP CSUM offload" by Jeff Kirsher
    Patches for other drivers have also been merged

Problem 2: Bonding and Lack of Queues

Problem
    The default queue on bond devices is no queue (txqueuelen 0)
        This is because bonding is a software device, and queueing generally doesn't make sense on software devices
    qdiscs take their default queue length from the txqueuelen of their device
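    The missing queue is visible from ip link; a sketch, assuming the bond device is named bond0 (interface index and flags are illustrative):
        # ip link show bond0
        5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    "qdisc noqueue" means packets are handed straight to the underlying devices, with no buffering on bond0 itself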
   
Result
    It was observed that a netperf TCP_STREAM test only achieves 45-50Mbit/s when controlled by a class with a ceiling of 450Mbit/s
    A 10x degradation!

Solution 1a
    Set the queue length of the bonding device before adding qdiscs
    ip link set txqueuelen 1000 dev bond0
   
Solution 1b
    Set the queue length of the qdisc explicitly
    tc qdisc add dev bond0 parent 1:100 handle 1100: pfifo limit 1000
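    With either approach the effective queue can be verified, and drops watched, with tc; an illustrative check against the handle 1100: qdisc added above (statistics abbreviated):
        # tc -s qdisc show dev bond0
        qdisc pfifo 1100: parent 1:100 limit 1000p
         Sent ... bytes ... pkt (dropped ..., overlimits ... requeues ...)
    A limit of 0p indicates the device queue length was not picked up (relevant to Solution 1a)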

Problem 3: TSO and Lack of Accounting Accuracy

Problem
    If a packet is significantly larger than the MTU of the class, it is accounted as being approximately the size of the MTU.
    And the giants counter for the class is incremented
    The default MTU of a class is 2047 bytes
    But TCP Segmentation Offload (TSO) packets can be much larger: up to 64kbytes
    By default Xen domUs will use TSO
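    The mis-accounting shows up in the per-class statistics; a sketch against the 1:101 class used in Workaround 2 below (output abbreviated):
        # tc -s class show dev peth2
        class htb 1:101 parent 1:1 rate 10Mbit ceil 950Mbit ...
         Sent ... bytes ... pkt (dropped ..., overlimits ...)
         lended: ... borrowed: ... giants: ...
    A steadily growing giants value means packets larger than the class MTU, i.e. TSO frames, are being mis-accounted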

Result
    The result is similar to having no bandwidth control for TCP traffic

Workaround 1

Disable TSO in the guest (note that the guest can simply re-enable it):

    # ethtool -k eth0 | grep "tcp segmentation offload"
    tcp segmentation offload: on
    # ethtool -K eth0 tso off
    # ethtool -k eth0 | grep "tcp segmentation offload"
    tcp segmentation offload: off

Workaround 2

Set the MTU of classes to 40000
Large enough to give sufficient accuracy
Larger values will result in a loss of accuracy when accounting smaller packets
    # tc class add dev peth2 parent 1:1 classid 1:101 rate 10Mbit ceil 950Mbit mtu 40000

Solution
    Account for large packets
    Instead of truncating the index, use rtab values multiple times
        rtab[255] * (index >> 8) + rtab[index & 0xFF]
    "Make HTB scheduler work with TSO" by Ranjit Manomohan was included in 2.6.23-rc1
