MTU problems encountered in OpenStack ( by quqi99 )

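Author: Zhang Hua  Published: 2013-11-10
Copyright notice: this article may be freely reposted, provided the repost includes a hyperlink to the original source, the author information, and this copyright notice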

( http://blog.csdn.net/quqi99 )

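         Over the past two days I took over a bug: with OpenStack deployed on two physical compute nodes, a command that prints a large amount of output would sometimes hang when run over ssh between two VMs on those nodes. A foreign colleague even suspected an MTU problem caused by jumbo frames, because the two physical machines are connected through a 10G fiber switch. After more than a day of studying, analyzing, searching, experimenting with, and discussing MTU, the problem was finally solved with the second method described at the links below: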

http://openstack.redhat.com/forum/discussion/comment/1565

https://review.openstack.org/#/c/27937/16/etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini


Quoted below:

So I've been playing with OpenStack Quantum/Neutron for a while now in a flat/bridged networking configuration across a six node cluster on top of CentOS 6.4. Over that time I've had two problems related to MTU sizes, one on the host and the other inside VMs.

The problem on the host is related to the taps/tunnels/bridges that OpenVSwitch constructs between the integration bridge br-int and your OVS bridge interface associated with a physical NIC, in my case br-em1, in docs sometimes referred to as br-ex. This problem manifests itself in dropped packets to the target VM/instance, for example commands failing to finish when executed inside an SSH session to the VM. Simply try to edit a large text file: because the packets are at the maximum MTU they are dropped, and you don't see the file in your terminal. This problem is already documented in this Launchpad bug. It is caused by extra VLAN headers that OpenVSwitch adds to route the packets internally; these headers cause the packets to become too big for the MTU limits of the OpenVSwitch tunnels, causing them to be dropped. My workaround is to configure the quantum dhcp agent to set a lower MTU for VM instances using dnsmasq DHCP options. This can be done by specifying a dnsmasq config file in dhcp_agent.ini (/etc/neutron/dhcp_agent.ini). Simply add/uncomment the line dnsmasq_config_file=/var/lib/neutron/dnsmasq.conf; the specified file should contain a single line dhcp-option=26,1454 to allow leeway for the VLAN headers to be attached and let the packet through OpenVSwitch out onto the network. Also don't forget to make the quantum user the owner of the dnsmasq config file. My suggestion/question on this is: would there be any harm in setting the default MTU for these tunnels to the maximum value that OpenVSwitch supports, to avoid such issues in the future? I can think of situations when people will want to start using Jumbo Frames inside instances and this could flummox them.
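
The DHCP workaround in the quoted paragraph could be sketched roughly as follows. This is only a minimal illustration, not the quoted author's script: the function name write_dnsmasq_mtu_conf is hypothetical, the path and the value 1454 come from the text, and the owner in the commented chown line is an assumption (the text says the quantum user; on a neutron-era install it is probably neutron).

```shell
#!/bin/sh
# Hypothetical helper: write a dnsmasq config file that advertises an
# interface MTU to DHCP clients (DHCP option 26 is "Interface MTU").
write_dnsmasq_mtu_conf() {
    conf_file="$1"   # e.g. /var/lib/neutron/dnsmasq.conf (path from the text)
    mtu="$2"         # e.g. 1454 (value from the text)
    printf 'dhcp-option=26,%s\n' "$mtu" > "$conf_file"
}

# Example usage (needs root, so it is left commented out):
# write_dnsmasq_mtu_conf /var/lib/neutron/dnsmasq.conf 1454
# chown neutron /var/lib/neutron/dnsmasq.conf   # assumed user name
# Then set dnsmasq_config_file=/var/lib/neutron/dnsmasq.conf in
# /etc/neutron/dhcp_agent.ini and restart the DHCP agent.
```

Instances only pick up the new MTU when they next renew or request a DHCP lease.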

The second issue I have had is again related to MTUs and the above limits, but from a viewpoint inside the VM and a NIC capability known as generic segmentation offloading. This can manifest itself in very slow scp file copies between VMs, or failing VNC sessions to VMs. The problem is the virtio_net kernel module driver creating packets way too big for the above OpenVSwitch tunnel limits. The solution to this is to disable gso altogether in virtio_net. To check if tcp segmentation offload is enabled run ethtool -k eth0, where eth0 is the NIC in question inside the VM. To disable instantly use ethtool -K eth0 tso off; to disable permanently add the line options virtio_net gso=0 to /etc/modprobe.conf (RHEL5/CENTOS5 etc) OR in a new file /etc/modprobe.d/virtio_net.conf (RHEL6/CENTOS6 etc).
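
To make the offload check in the quoted paragraph scriptable, one option is to filter the output of ethtool -k. The sketch below assumes the usual "feature-name: on|off" layout of that output; the function name offload_state is hypothetical, and eth0 is the interface name used in the text.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k <nic>` output on stdin and print
# the state ("on" or "off") of the named offload feature.
offload_state() {
    feature="$1"   # e.g. tcp-segmentation-offload
    awk -v f="$feature" -F': ' '
        { gsub(/^[ \t]+/, "", $1) }              # strip indentation of sub-features
        $1 == f { split($2, a, " "); print a[1] } # drop suffixes such as "[fixed]"
    '
}

# Example usage inside the VM:
# ethtool -k eth0 | offload_state tcp-segmentation-offload
# ethtool -k eth0 | offload_state generic-segmentation-offload
```

If either feature reports "on", the ethtool -K command from the text can be used to switch it off for the running system.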


Check the PMTU value (sudo ip link set eth0 mtu 1500)
1, ping -c 2 -s 1472 -M do 192.168.99.1  # 1472-byte payload + 20-byte IP header + 8-byte ICMP header = 1500
2, tracepath 192.168.99.1

3, traceroute --mtu 192.168.99.1
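
The -s value in the ping test follows directly from the MTU being probed. A small sketch of that arithmetic, assuming IPv4 without IP options; the function name icmp_payload_for_mtu is hypothetical:

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus the 20-byte IPv4 header (no options) minus the 8-byte ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# icmp_payload_for_mtu 1500   -> 1472
# icmp_payload_for_mtu 1454   -> 1426
# Probe a path for a 1454-byte MTU with the don't-fragment bit set:
# ping -c 2 -M do -s "$(icmp_payload_for_mtu 1454)" 192.168.99.1
```

If the ping with the computed size succeeds and a payload one byte larger fails, the path MTU is the value that was probed.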


Set the MTU on the qvbXXX interface and the VM's tapXXX interface attached to the qbrXXX bridge

http://img.kuqin.com/upimg/allimg/140525/11305334V-5.png

http://images.cnitblog.com/blog/103165/201405/231722161528556.png

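VLAN="122"
MTU="1454"  # assumed value, taken from the workaround above; the original script left MTU undefined
# For every qvo port tagged with this VLAN on the OVS bridge, derive the matching qbr bridge
for qvo in $(ovs-vsctl show | grep -A1 "tag: ${VLAN}" | grep 'Interface' | cut -d '"' -f 2 | grep 'qvo'); do
    bridge=$(echo "$qvo" | sed 's/^qvo/qbr/')
    # Set the MTU on each interface attached to that bridge (qvbXXX and tapXXX)
    for interface in $(brctl show "$bridge" | awk '{print $NF}' | grep -v interfaces); do
        ifconfig "$interface" mtu "$MTU"
    done
done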


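2014-10-23:

Ran into this problem again today: the VM could be pinged from a remote machine, but ssh to it did not work. It was the MTU problem again. Logging into the VM and running sudo ip link set eth0 mtu 1454 fixed it. Reference: https://ask.openstack.org/en/question/30502/can-ping-vm-instance-but-cant-ssh-ssh-command-halts-with-no-output/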


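Another approach, which is better for performance, is to leave the VMs alone and instead raise the MTU of the physical interfaces along the GRE path to 1546. Remember to set the MTU on the switches as well. See: http://techbackground.blogspot.com/2013/06/path-mtu-discovery-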
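
The figure 1546 mirrors the 1454 used earlier: both differ from 1500 by 46 bytes. The sketch below only illustrates that arithmetic. Treating 46 bytes as the encapsulation overhead is an assumption derived from those two numbers, not a measured value, and the function name physical_mtu_for_guest is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: physical MTU needed so that a guest packet of
# guest_mtu bytes still fits after `overhead` bytes of encapsulation.
physical_mtu_for_guest() {
    guest_mtu="$1"
    overhead="$2"   # assumed 46 here, i.e. 1500 - 1454 from this post
    echo $(( guest_mtu + overhead ))
}

# physical_mtu_for_guest 1500 46   -> 1546
# Apply on each physical interface along the tunnel path, for example:
# sudo ip link set eth0 mtu "$(physical_mtu_for_guest 1500 46)"
```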
