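Dedicated packet-generation hardware is excellent, for example Spirent's SmartBits test equipment, but its one drawback is the price, and that one drawback is not easy to get around. Fortunately, Linux provides an advanced packet generator, pktgen (http://www.linuxfoundation.org/collaborate/workgroups/networking/pktgen, http://lxr.linux.no/#linux+v2.6.38.8/Documentation/networking/pktgen.txt, ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/). It is implemented as a kernel module, so in theory it should outperform comparable tools that run in user space, and it genuinely supports multiple cores. Below is a test run. (The environment is a virtual machine on my home computer, used so that I could create the several NICs needed; the point here is the test procedure, not the results.)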
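1. First load the module, after confirming that it has been built. Once it is loaded, the corresponding /proc interface appears. pktgen binds one kernel thread to each CPU; my virtual machine has 4 CPUs, so there are 4 kpktgend_* files here: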
[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.38.8 #4 SMP Mon Oct 31 20:49:48 CST 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]# cat /usr/src/linux-2.6.38.8/.config | grep CONFIG_NET_PKTGEN
CONFIG_NET_PKTGEN=m
[root@localhost ~]# insmod /usr/src/linux-2.6.38.8/net/core/pktgen.ko
[root@localhost ~]# cat /proc/cpuinfo | grep processor
processor : 0
processor : 1
processor : 2
processor : 3
[root@localhost ~]# ls /proc/net/pktgen/
kpktgend_0 kpktgend_1 kpktgend_2 kpktgend_3 pgctrl
[root@localhost ~]#
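The "one kpktgend_* file per CPU" check above can also be scripted. The following is only a minimal sketch: the function name `pktgen_threads` is made up for illustration, and the directory argument exists mainly so the function can be tried against a scratch directory.

```sh
#!/bin/sh
# Count the kpktgend_* entries in a pktgen /proc directory.
# pktgen_threads is a hypothetical helper name, not part of pktgen.
pktgen_threads() {
    dir="${1:-/proc/net/pktgen}"
    n=0
    for f in "$dir"/kpktgend_*; do
        # If the glob matches nothing, the literal pattern is skipped here.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

With the module loaded, comparing `pktgen_threads` against the output of `nproc` should give the same number on a machine where all CPUs are online.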
2. The virtual machine has 3 network interfaces in total; two of them (eth5/eth6) are used for the pktgen test. Set their IP addresses and change the link speed:
[root@localhost pktgen]# ifconfig eth5
eth5 Link encap:Ethernet HWaddr 00:0C:29:97:9B:B4
inet addr:192.168.1.95 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe97:9bb4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15044140 errors:0 dropped:0 overruns:0 frame:0
TX packets:14210498 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:903846363 (861.9 MiB) TX bytes:852636332 (813.1 MiB)
[root@localhost pktgen]# ifconfig eth6
eth6 Link encap:Ethernet HWaddr 00:0C:29:97:9B:BE
inet addr:192.168.1.96 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe97:9bbe/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:14256340 errors:0 dropped:0 overruns:0 frame:0
TX packets:14998657 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:856589260 (816.9 MiB) TX bytes:899925597 (858.2 MiB)
[root@localhost pktgen]# ethtool -s eth5 autoneg off speed 1000 duplex full
[root@localhost pktgen]# ethtool -s eth6 autoneg off speed 1000 duplex full
[root@localhost pktgen]# ethtool eth5
Settings for eth5:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
MDI-X: Unknown
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
Link detected: yes
[root@localhost pktgen]# ethtool eth6
Settings for eth6:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
MDI-X: Unknown
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
Link detected: yes
[root@localhost pktgen]#
Look up the interrupt numbers of the two interfaces and set the IRQ affinity so that eth5 and eth6 are each bound to a specific CPU (stop the system's irqbalance service first):
[root@localhost ~]# cat /proc/interrupts | grep eth
16: 2586 540 1351588 4172554 IO-APIC-fasteoi Ensoniq AudioPCI, eth6
19: 5117 1949714 6098060 40 IO-APIC-fasteoi eth4, eth5
[root@localhost ~]# /etc/init.d/irqbalance stop
Stopping irqbalance: [ OK ]
[root@localhost ~]# echo 4 > /proc/irq/19/smp_affinity
[root@localhost ~]# echo 8 > /proc/irq/16/smp_affinity
[root@localhost ~]# cat /proc/irq/19/smp_affinity
04
[root@localhost ~]# cat /proc/irq/16/smp_affinity
08
[root@localhost ~]#
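The values 4 and 8 written to smp_affinity above are hexadecimal CPU bitmasks: bit N set means CPU N may handle the interrupt, so 4 selects CPU 2 and 8 selects CPU 3. As a small sketch (the helper name `cpu_mask` is only illustrative), the mask for a single CPU can be computed like this:

```sh
#!/bin/sh
# Print the hexadecimal smp_affinity mask that selects exactly one CPU.
# cpu_mask is a hypothetical helper name used for illustration.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

cpu_mask 2   # prints 4, the value written for IRQ 19 (eth5)
cpu_mask 3   # prints 8, the value written for IRQ 16 (eth6)
```

The kernel echoes the mask back zero-padded (04, 08), which is why the cat output looks slightly different from what was written.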
3. The pktgen test script is shown below (modified from ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/pktgen.conf-2-2). In effect, it treats eth5 and eth6 as if they were connected directly by a cable and sending packets to each other:
#!/bin/bash
# FileName: pktgen-eth5-eth6.conf
# modprobe pktgen

# Write one command to the current pktgen /proc file ($PGDEV) and print
# the result line only if the command did not report "Result: OK:".
function pgset() {
    local result
    echo "$1" > "$PGDEV"
    result=`cat "$PGDEV" | fgrep "Result: OK:"`
    if [ "$result" = "" ]; then
        cat "$PGDEV" | fgrep Result:
    fi
}

function pg() {
    echo inject > "$PGDEV"
    cat "$PGDEV"
}

# Config Start Here -----------------------------------------------------------

# thread config
# Each CPU has its own thread. Two-CPU example: eth5 is added to the
# thread on CPU 2 and eth6 to the thread on CPU 3.
PGDEV=/proc/net/pktgen/kpktgend_2
echo "Removing all devices"
pgset "rem_device_all"
echo "Adding eth5"
pgset "add_device eth5"

PGDEV=/proc/net/pktgen/kpktgend_3
echo "Removing all devices"
pgset "rem_device_all"
echo "Adding eth6"
pgset "add_device eth6"

# device config
# delay 0 means maximum speed.
CLONE_SKB="clone_skb 1000000"
# NIC adds 4 bytes CRC
PKT_SIZE="pkt_size 60"
# COUNT 0 means forever
COUNT="count 0"
DELAY="delay 0"

PGDEV=/proc/net/pktgen/eth5
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "$PKT_SIZE"
pgset "$DELAY"
pgset "dst 192.168.1.96"
pgset "dst_mac 00:0C:29:97:9B:BE"

PGDEV=/proc/net/pktgen/eth6
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "$PKT_SIZE"
pgset "$DELAY"
pgset "dst 192.168.1.95"
pgset "dst_mac 00:0C:29:97:9B:B4"

# Time to run
PGDEV=/proc/net/pktgen/pgctrl
echo "Running... ctrl^C to stop"
pgset "start"
echo "Done"

# Result can be viewed in /proc/net/pktgen/eth[5,6]
4. Run the pktgen test script above and look at the data:
[root@localhost pktgen]# sh pktgen-eth5-eth6.conf
Removing all devices
Adding eth5
Removing all devices
Adding eth6
Configuring /proc/net/pktgen/eth5
Configuring /proc/net/pktgen/eth6
Running... ctrl^C to stop
Open another terminal and run mpstat to check CPU usage and to see whether the NIC interrupts are being handled as expected:
[root@localhost ~]# mpstat -P 2,3 1
Linux 2.6.38.8 (localhost.localdomain) 01/15/2012 _x86_64_ (4 CPU)
09:13:16 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
09:13:17 AM 2 0.00 0.00 86.87 0.00 0.00 13.13 0.00 0.00 0.00
09:13:17 AM 3 0.00 0.00 74.23 0.00 0.00 25.77 0.00 0.00 0.00
09:13:18 AM 2 0.00 0.00 87.63 0.00 0.00 12.37 0.00 0.00 0.00
09:13:18 AM 3 0.00 0.00 76.53 0.00 0.00 23.47 0.00 0.00 0.00
09:13:19 AM 2 0.00 0.00 85.71 0.00 0.00 14.29 0.00 0.00 0.00
09:13:19 AM 3 0.00 0.00 75.51 0.00 0.00 24.49 0.00 0.00 0.00
09:13:20 AM 2 0.00 0.00 86.60 0.00 0.00 13.40 0.00 0.00 0.00
09:13:20 AM 3 0.00 0.00 70.10 0.00 0.00 29.90 0.00 0.00 0.00
^C
[root@localhost ~]# cat /proc/interrupts | grep eth
16: 2586 540 588 6664664 IO-APIC-fasteoi Ensoniq AudioPCI, eth6
19: 5117 714 9729867 40 IO-APIC-fasteoi eth4, eth5
[root@localhost ~]#
After it has run for a while, look at the statistics:
[root@localhost pktgen]# sh pktgen-eth5-eth6.conf
Removing all devices
Adding eth5
Removing all devices
Adding eth6
Configuring /proc/net/pktgen/eth5
Configuring /proc/net/pktgen/eth6
Running... ctrl^C to stop
^C
[root@localhost pktgen]# cat /proc/net/pktgen/eth5
Params: count 0 min_pkt_size: 60 max_pkt_size: 60
frags: 0 delay: 0 clone_skb: 1000000 ifname: eth5
flows: 0 flowlen: 0
queue_map_min: 0 queue_map_max: 0
dst_min: 192.168.1.96 dst_max:
src_min: src_max:
src_mac: 00:0c:29:97:9b:b4 dst_mac: 00:0c:29:97:9b:be
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
src_mac_count: 0 dst_mac_count: 0
Flags:
Current:
pkts-sofar: 4405203 errors: 0
started: 2423250036us stopped: 2570673239us idle: 61432us
seq_num: 4405204 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 0x5f01a8c0 cur_daddr: 0x6001a8c0
cur_udp_dst: 9 cur_udp_src: 9
cur_queue_map: 0
flows: 0
Result: OK: 147423202(c147361769+d61432) usec, 4405203 (60byte,0frags)
29881pps 14Mb/sec (14342880bps) errors: 0
[root@localhost pktgen]# cat /proc/net/pktgen/eth6
Params: count 0 min_pkt_size: 60 max_pkt_size: 60
frags: 0 delay: 0 clone_skb: 1000000 ifname: eth6
flows: 0 flowlen: 0
queue_map_min: 0 queue_map_max: 0
dst_min: 192.168.1.95 dst_max:
src_min: src_max:
src_mac: 00:0c:29:97:9b:be dst_mac: 00:0c:29:97:9b:b4
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
src_mac_count: 0 dst_mac_count: 0
Flags:
Current:
pkts-sofar: 2846965 errors: 0
started: 2423223982us stopped: 2570673243us idle: 45963us
seq_num: 2846966 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 0x6001a8c0 cur_daddr: 0x5f01a8c0
cur_udp_dst: 9 cur_udp_src: 9
cur_queue_map: 0
flows: 0
Result: OK: 147449261(c147403297+d45963) usec, 2846965 (60byte,0frags)
19308pps 9Mb/sec (9267840bps) errors: 0
[root@localhost pktgen]#
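The pps and bps figures in the Result lines can be reproduced from the packet count, the elapsed time in microseconds, and the packet size. The sketch below uses integer arithmetic that happens to match the numbers printed above; it is an illustration with a made-up helper name (`pktgen_rate`), not a copy of the kernel's own code.

```sh
#!/bin/sh
# usage: pktgen_rate <packets> <elapsed_usec> <pkt_size_bytes>
# Prints "<pps> <bps>" using truncating integer division.
# pktgen_rate is a hypothetical helper name used for illustration.
pktgen_rate() {
    pkts="$1"
    usec="$2"
    size="$3"
    pps=$((pkts * 1000000 / usec))
    bps=$((pps * size * 8))
    echo "$pps $bps"
}

pktgen_rate 4405203 147423202 60   # eth5: prints 29881 14342880
pktgen_rate 2846965 147449261 60   # eth6: prints 19308 9267840
```

Note that the bps value counts only the 60 bytes pktgen builds per packet, not the CRC, preamble, or inter-frame gap.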
These results look very poor: the two directions average only about 25,000 pps. I will test again later on real hardware.
Performance testing with the pktgen packet generator (continued)
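Recently I tested pktgen's performance on a real machine. The pktgen used comes from http://tslab.ssvl.kth.se/pktgen/ and was tested on a Linux 2.6.37 kernel; the original pktgen source is http://tslab.ssvl.kth.se/pktgen/patches/net-next-v2.6.37/pktgen_rx.tgz. I also made some changes to it. The main ones are that a single interface can be bound to multiple CPUs by queue, and that received packets are counted separately per interface.
Hardware and software environment: Linux kernel 2.6.37 x86-64, ixgbe-3.8.21, a 4-core Xeon E5504 (2.00GHz) CPU giving 8 CPU threads with Hyper-Threading enabled, 4 GB of DDR3 1333 memory, and two 82599EB controllers.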
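pktgen is configured with "clone_skb 0", "pkt_size 60", "count 0", and "delay 0". The two 10 Gigabit interfaces (eth0, eth1) are put into promiscuous mode and connected back to back with optical fiber. After the ixgbe driver is loaded, each interface is automatically split into 8 TX/RX queues, matching the number of CPU threads in the system (eth0-TxRx-0, …, eth0-TxRx-7, eth1-TxRx-0, …, eth1-TxRx-7). Each queue is bound to one CPU thread (cpu0 handles eth0-TxRx-0 and eth1-TxRx-0, cpu1 handles eth0-TxRx-1 and eth1-TxRx-1, …, cpu7 handles eth0-TxRx-7 and eth1-TxRx-7), and packets are sent and received in both directions (eth0 and eth1 transmit and receive at the same time).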
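The queue-to-CPU binding described above could be scripted roughly as follows. This is a sketch under assumptions: it assumes the queue interrupts appear in /proc/interrupts with names of the form eth0-TxRx-0 in the last column (the naming used in this article), that irqbalance is stopped, and that it runs as root. The function names `queue_irqs` and `pin_queues` are made up for illustration.

```sh
#!/bin/sh
# Print "<irq> <queue>" for every <iface>-TxRx-<n> line of an
# interrupts file (defaults to /proc/interrupts).
queue_irqs() {
    iface="$1"
    file="${2:-/proc/interrupts}"
    awk -v pat="^${iface}-TxRx-[0-9]+$" '
        $NF ~ pat {
            irq = $1; sub(/:$/, "", irq)   # "45:" -> "45"
            n = $NF;  sub(/.*-/, "", n)    # "eth0-TxRx-3" -> "3"
            print irq, n
        }' "$file"
}

# Bind queue n of the interface to CPU n by writing the one-bit mask
# for CPU n to the IRQ's smp_affinity file. Needs root.
pin_queues() {
    queue_irqs "$1" | while read -r irq queue; do
        printf '%x\n' $((1 << queue)) > "/proc/irq/$irq/smp_affinity"
    done
}
```

Calling `pin_queues eth0` and then `pin_queues eth1` would give the layout described above, with cpu0 serving queue 0 of both interfaces, cpu1 serving queue 1, and so on.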
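Test result: 4 Mpps bidirectional, that is, eth0 and eth1 each simultaneously send 2 Mpps and receive 2 Mpps. mpstat shows CPU usage of %sys 35.00, %soft 65.00, %idle 0.00. This is the result without any tuning, and it seems to be consistent with what theory would predict.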
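For context, the measured rate can be compared with the theoretical Ethernet line rate for minimum-size frames. The sketch below assumes the standard per-frame overhead of 20 bytes on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap) and a 64-byte frame (pkt_size 60 plus the 4-byte CRC added by the NIC). The helper name `line_rate_pps` is only illustrative.

```sh
#!/bin/sh
# usage: line_rate_pps <link_bits_per_second> <frame_bytes_including_crc>
# Prints the maximum number of frames per second the link can carry.
# line_rate_pps is a hypothetical helper name used for illustration.
line_rate_pps() {
    link_bps="$1"
    frame_bytes="$2"
    echo $((link_bps / ((frame_bytes + 20) * 8)))
}

line_rate_pps 10000000000 64   # 10GbE: prints 14880952
line_rate_pps 1000000000 64    # 1GbE:  prints 1488095
```

By this estimate, 2 Mpps per direction is roughly 13% of the 10GbE line rate for 64-byte frames, so in this untuned setup the limit appears to be the CPUs (0% idle) rather than the link.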
Other links:
ftp://robur.slu.se/pub/Linux/bifrost/seminars/workshop-2011-03-31/
http://robur.slu.se/Linux/
http://caia.swin.edu.au/genius/tools/kute/