DRBD performance testing and tuning

Today in a QQ group someone claimed you should never use DRBD for MySQL HA because the performance hit is huge. DRBD certainly has some cost: shipping every write to the peer over the network cannot be free. Since I happen to have a DRBD setup, I took the opportunity to measure how large the impact actually is and to give others a reference point.

The test setup is a two-node drbd+pacemaker+corosync MySQL HA cluster. Both hosts are ordinary, outdated PCs with 2 GB of RAM and 2 CPU cores.

1. First, initialize test data on the healthy cluster with sysbench:
[root@topdb ]# sysbench --test=oltp --oltp-table-size=5000000 --mysql-host=192.168.1.163 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=mcldb --db-driver=mysql prepare
While the data loads, watch the load on both nodes with dstat:
Primary:
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  5   4  41  51   0   0|  80k   19M| 959k 9388k|   0     0 |5697    13k
  4   5  44  47   0   1|  88k   15M| 963k 9474k|   0     0 |4845    10k
  7   4  36  53   0   0|   0    16M| 326k 9048k|   0     0 |5324    12k
  8   3  41  47   0   1|   0    18M| 939k 8708k|   0     0 |5963    12k
  7   4  30  59   0   0|   0    21M| 975k 9659k|   0     0 |5763    14k
...
  6   5  42  48   0   0|   0    17M|1389k 7702k|   0     0 |5524    13k
 10   3  39  48   0   0|   0    17M| 380k   10M|   0     0 |5198    11k
  4   3  45  48   0   1|   0    19M| 950k 8993k|   0     0 |6003    14k
  5   4  43  48   0   0|   0    13M| 991k   10M|   0     0 |4863    11k
   
Secondary:
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  9   5  86   0   0   1|   0  8580k|9114k  319k|   0     0 |  12k   14k
  4   3  92   0   0   1|   0  7572k|7992k  280k|   0     0 |9422    12k
  0   3  96   0   0   1|   0  8348k|8842k  309k|   0     0 |  10k   14k
  0   2  97   0   0   1|   0  7544k|7988k  279k|   0     0 |9351    12k
  0   3  96   0   0   1|   0  9164k|  10M  345k|   0     0 |  12k   14k
  0   3  97   0   0   1|   0  8180k|8232k  293k|   0     0 |9662    13k
 ...
  0   3  97   0   0   1|   0  8544k|9036k  314k|   0     0 |  10k   14k
  0   3  96   0   0   1|   0  7672k|8123k  285k|   0     0 |9543    12k
  0   3  97   0   0   1|   0  8888k|9448k  327k|   0     0 |  11k   14k^C
As shown, the primary writes roughly 13-21 MB/s to disk and sends close to 10 MB/s over the network; the secondary writes 7-9 MB/s, and its network receive rate roughly matches its disk write rate.
Check the size of the sysbench test table:
[root@db163 mcldb]# du -sh sbtest.*
12K sbtest.frm
1.2G sbtest.ibd
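As a rough cross-check, 1.2 GiB spread over the 5,000,000 prepared rows works out to about 250 bytes per row, which is plausible for the sbtest schema plus index and InnoDB overhead:

```shell
# Approximate bytes per row in the 1.2 GiB sbtest.ibd (5,000,000 rows)
awk 'BEGIN { printf "%d bytes/row\n", 1.2 * 1024 * 1024 * 1024 / 5000000 }'
```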

2. Benchmark with DRBD replication active:
Use complex mode, i.e. a mixed insert/update/delete/select transactional workload:
[root@topdb ~]# sysbench --oltp-auto-inc=off --max-requests=0 --max-time=60 --num-threads=4 --test=oltp --db-driver=mysql --mysql-host=192.168.1.163 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=mcldb --oltp-test-mode=complex run
sysbench 0.4.10:  multi-threaded system evaluation benchmark

WARNING: Preparing of "BEGIN" is unsupported, using emulation
(last message repeated 3 times)
Running the test with following options:
Number of threads: 4

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Not using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 3 times)
Done.

OLTP test statistics:
    queries performed:
        read:                            11130
        write:                           3975
        other:                           1590
        total:                           16695
    transactions:                        795    (13.21 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 15105  (251.06 per sec.)
    other operations:                    1590   (26.43 per sec.)

Test execution summary:
    total time:                          60.1639s
    total number of events:              795
    total time taken by event execution: 240.4650
    per-request statistics:
         min:                                100.55ms
         avg:                                302.47ms
         max:                                889.42ms
         approx.  95 percentile:             614.07ms

Threads fairness:
    events (avg/stddev):           198.7500/3.90
    execution time (avg/stddev):   60.1163/0.03
So in one minute the test completed 16,695 queries at 13.21 tps (no surprise -- these machines are weak).
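The totals above divide evenly by the 795 transactions, which shows the per-transaction mix of this workload: 14 read queries, 5 write queries, and 2 other statements (the transaction BEGIN/COMMIT) per transaction:

```shell
# Per-transaction query mix implied by the sysbench totals above
awk 'BEGIN { t = 795; printf "%d reads, %d writes, %d other per txn\n", 11130/t, 3975/t, 1590/t }'
```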

3. Benchmark in standalone mode:
Take the secondary offline by putting its node into standby:
[root@db162 ~]# crm 
crm(live)# status
Last updated: Mon Jul 28 18:23:32 2014
Last change: Sat Jul 26 10:05:58 2014 via cibadmin on db163
Stack: classic openais (with plugin)
Current DC: db163 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured, 2 expected votes
7 Resources configured
Online: [ db162 db163 ]
 Master/Slave Set: ms_drbd_mysql [drbd_mysql]
     Masters: [ db163 ]
     Slaves: [ db162 ]
 Resource Group: g_mysql
     fs_mysql (ocf::heartbeat:Filesystem): Started db163 
     p_ip_mysql (ocf::heartbeat:IPaddr2): Started db163 
     mysqld (lsb:mysqld): Started db163 
 Clone Set: cl_ping [p_ping]
     Started: [ db162 db163 ]
crm(live)# node standby db162 

Now check the DRBD state on the primary:
[root@db163 mcldb]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:6074648 nr:0 dw:6075004 dr:294960 al:466 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:308  
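When scripting tests like this, the `cs:` field of `/proc/drbd` can be checked automatically. A minimal sketch, using the status line above as sample input:

```shell
# Extract the DRBD connection state (cs:) from a /proc/drbd status line
line=' 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----'
state=$(printf '%s\n' "$line" | sed -n 's/.*cs:\([A-Za-z]*\).*/\1/p')
echo "$state"
```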
Replication has indeed stopped. Run the same test again:
[root@topdb ~]# sysbench --oltp-auto-inc=off --max-requests=0 --max-time=60 --num-threads=4 --test=oltp --db-driver=mysql --mysql-host=192.168.1.163 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=mcldb --oltp-test-mode=complex run
sysbench 0.4.10:  multi-threaded system evaluation benchmark

WARNING: Preparing of "BEGIN" is unsupported, using emulation
(last message repeated 3 times)
Running the test with following options:
Number of threads: 4

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Not using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 3 times)
Done.

OLTP test statistics:
    queries performed:
        read:                            16394
        write:                           5851
        other:                           2340
        total:                           24585
    transactions:                        1169   (19.45 per sec.)
    deadlocks:                           2      (0.03 per sec.)
    read/write requests:                 22245  (370.14 per sec.)
    other operations:                    2340   (38.94 per sec.)

Test execution summary:
    total time:                          60.0990s
    total number of events:              1169
    total time taken by event execution: 240.2136
    per-request statistics:
         min:                                 73.11ms
         avg:                                205.49ms
         max:                                741.33ms
         approx.  95 percentile:             432.89ms

Threads fairness:
    events (avg/stddev):           292.2500/3.34
    execution time (avg/stddev):   60.0534/0.03
So with DRBD replicating, throughput dropped by (1 - 16695/24585) ≈ 32%. But DRBD was still at its default settings; next, tune its parameters and retest.
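The overhead figure works out as follows:

```shell
# Relative drop in total queries: with DRBD (16695) vs standalone (24585)
awk 'BEGIN { printf "%.1f%%\n", (1 - 16695/24585) * 100 }'
```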
4. Benchmark after tuning DRBD:
First bring the DRBD secondary back online:
[root@db162 ~]# crm
crm(live)# node online db162
crm(live)# status
Last updated: Mon Jul 28 18:45:16 2014
Last change: Mon Jul 28 18:45:30 2014 via crm_attribute on db162
Stack: classic openais (with plugin)
Current DC: db163 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured, 2 expected votes
7 Resources configured


Online: [ db162 db163 ]

 Master/Slave Set: ms_drbd_mysql [drbd_mysql]
     Masters: [ db163 ]
     Slaves: [ db162 ]
 Resource Group: g_mysql
     fs_mysql (ocf::heartbeat:Filesystem): Started db163 
     p_ip_mysql (ocf::heartbeat:IPaddr2): Started db163 
     mysqld (lsb:mysqld): Started db163 
 Clone Set: cl_ping [p_ping]
     Started: [ db162 db163 ]

[root@db162 ~]# cat /proc/drbd 
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:38064 dw:38064 dr:0 al:0 bm:18 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
DRBD is back: the nodes are Connected and data synchronization has resumed.
Now tune the DRBD parameters:
[root@db163 ~]# vim /etc/drbd.d/global_common.conf
common {
        protocol C; # stay with protocol C: a write completes only once the data has reached the peer's TCP buffer -- the strictest and safest mode
        disk {
                on-io-error detach;
                disk-flushes no;
        }
        net {
                max-buffers 8000;     # enlarge the request buffers to 8000
                max-epoch-size 8000;
                sndbuf-size 0;        # 0 lets the kernel auto-tune the send buffer
        }
        syncer {
                rate 10M;             # the link is 100 Mbit, so a 10 MB/s resync rate is plenty
                al-extents 257;       # enlarge the activity log to 257 extents
        }
}
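For context on `al-extents`: each activity-log extent covers a 4 MiB region of the backing device, so 257 extents let DRBD track roughly 1 GiB of "hot" data without extra metadata writes:

```shell
# Hot area covered by the activity log: 257 extents x 4 MiB each
awk 'BEGIN { printf "%d MiB\n", 257 * 4 }'
```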
Copy the file to the secondary as well:
[root@db163 ~]# scp /etc/drbd.d/global_common.conf db162:/etc/drbd.d/
global_common.conf                                                                                 100% 2181     2.1KB/s   00:00    
Apply the new configuration online on both primary and secondary:
[root@db163 ~]# drbdadm adjust all
[root@db162 ~]# drbdadm adjust all
Test again:
[root@topdb ~]# sysbench --oltp-auto-inc=off --max-requests=0 --max-time=60 --num-threads=4 --test=oltp --db-driver=mysql --mysql-host=192.168.1.163 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=mcldb --oltp-test-mode=complex run
sysbench 0.4.10:  multi-threaded system evaluation benchmark

WARNING: Preparing of "BEGIN" is unsupported, using emulation
(last message repeated 3 times)
Running the test with following options:
Number of threads: 4

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Not using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 3 times)
Done.

OLTP test statistics:
    queries performed:
        read:                            16366
        write:                           5845
        other:                           2338
        total:                           24549
    transactions:                        1169   (19.45 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 22211  (369.46 per sec.)
    other operations:                    2338   (38.89 per sec.)

Test execution summary:
    total time:                          60.1174s
    total number of events:              1169
    total time taken by event execution: 240.1222
    per-request statistics:
         min:                                 70.51ms
         avg:                                205.41ms
         max:                                685.97ms
         approx.  95 percentile:             413.63ms

Threads fairness:
    events (avg/stddev):           292.2500/5.45
    execution time (avg/stddev):   60.0306/0.05
This run achieved 19.45 tps, the same as the standalone test -- after this tuning, DRBD had essentially no measurable impact on the benchmark.

5. dd test:
Database workloads issue small, mostly random I/O, so the tests above say little about raw throughput. Next, test large sequential writes and the replication transfer rate:
[root@db163 data]# dd if=/dev/zero of=/data/dd.test bs=1M count=4096
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 341.432 s, 12.6 MB/s
/data is the DRBD-backed filesystem; with DRBD replicating over the network, sequential write speed is 12.6 MB/s.
Now take the standby node offline and test again:
[root@db162 ~]# crm node standby
[root@db163 data]# cat /proc/drbd 
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:10309484 nr:0 dw:10407952 dr:533640 al:1514 bm:18 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:65724
[root@db163 data]# dd if=/dev/zero of=/data/dd.test2 bs=1M count=4096
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 81.3911 s, 52.8 MB/s
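Note that without an explicit sync, dd largely reports page-cache speed rather than disk speed. Adding `conv=fdatasync` makes dd flush the file to disk before reporting; a small self-contained sketch (the 16 MiB size and temp path are illustrative only):

```shell
# Write a small file and force it to disk before dd reports throughput
tmpfile=$(mktemp)
dd if=/dev/zero of="$tmpfile" bs=1M count=16 conv=fdatasync
size=$(stat -c %s "$tmpfile")   # confirm the full 16 MiB landed in the file
rm -f "$tmpfile"
echo "size: $size bytes"
```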
52.8 MB/s standalone is expected: without an explicit sync, dd mostly writes into the page cache, so the number flatters the disk. The more telling figure is the 12.6 MB/s in the replicated case, which is right at the ceiling of a 100 Mbit network -- the write speed was limited by bandwidth, not by the disk. On a gigabit LAN in production, the network should not be a bottleneck and the impact on the primary should be small. (DRBD has in fact been part of the mainline Linux kernel since 2.6.33.) Overall, DRBD's performance holds up well.
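The bandwidth arithmetic backs this up: a 100 Mbit/s link carries at most about 12.5 MB/s of payload, so the 12.6 MB/s dd result means the link was saturated:

```shell
# Theoretical payload ceiling of a 100 Mbit/s link, in MB/s
awk 'BEGIN { printf "%.1f MB/s\n", 100 / 8 }'
```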
