Contents
1. Using fio + the rbd ioengine
2. Using fio + the libaio ioengine
3. Summary
References:
1. Using fio + the rbd ioengine
Environment preparation:
1) Newer fio releases (my 3.1 build, for example) already come with the rbd ioengine in the default build. Check whether your build supports it:
yjiang2@admin-node:~/Downloads/fio$ fio --enghelp
Available IO engines:
binject
cpuio
e4defrag
falloc
ftruncate
libaio
mmap
mtd
net
netsplice
null
posixaio
rbd
rdma
sg
splice
sync
psync
vsync
pvsync
pvsync2
yjiang2@admin-node:~/Downloads/fio$ fio -v
fio-3.1
You can see rbd in the list; my version is fio-3.1.
2) If your fio version is too old to support it, download the fio source code and rebuild fio:
$ git clone git://git.kernel.dk/fio.git
$ cd fio
$ ./configure
[...]
Rados Block Device engine yes
[...]
$ make
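If the configure output shows "Rados Block Device engine no" rather than "yes", the librbd/librados development headers are most likely missing. A minimal sketch for Debian/Ubuntu (the package names are the usual ones there and may differ on other distributions):
$ sudo apt-get install -y librbd-dev librados-dev libaio-dev   # headers for the rbd and libaio engines
$ ./configure   # should now report "Rados Block Device engine yes"
$ make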
Once fio uses the rbd I/O engine, it reads the settings in /etc/ceph/ceph.conf to connect to the Ceph cluster.
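If the machine running fio is not part of the cluster, copy that directory over first; a rough sketch (the host name is only an example, any node holding the cluster configuration and the client.admin keyring will do):
$ sudo mkdir -p /etc/ceph
$ sudo scp admin-node:/etc/ceph/ceph.conf /etc/ceph/
$ sudo scp admin-node:/etc/ceph/ceph.client.admin.keyring /etc/ceph/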
3) Create an rbd image (there is no need to rbd map it) and test it with fio:
rbd create -p rbd --size 20G fio_test_image
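The image can be checked before the test, for example:
$ rbd info rbd/fio_test_image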
The fio job file:
root@client:/home/s1# cat write.fio
[global]
description="write test with block size of 4M"
#fio reads /etc/ceph/ceph.conf by default. Make sure the host running fio has the cluster's /etc/ceph/ directory; if not, scp it over from a cluster node.
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test_image
runtime=120
rw=write #write = sequential write, randwrite = random write, read = sequential read, randread = random read
bs=4M
[iodepth]
iodepth=32
This runs a 100% sequential write test across the whole rbd image (its size is determined via librbd), as the Ceph admin user, against the pool rbd (the default) and the fio_test_image just created, with a 4M block size and an iodepth of 32 (the number of I/O requests kept in flight).
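The same job can also be started directly from the command line, since fio accepts every job-file option as a command-line option; a sketch equivalent to write.fio:
$ sudo fio --name=iodepth --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio_test_image --rw=write --bs=4M --iodepth=32 --runtime=120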
Below is the result of running fio over the network against the cluster (a single physical SSD, 20 GB rbd image):
$ sudo fio ./write.fio
iodepth: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=rbd, iodepth=32
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [W(1)][99.5%][r=0KiB/s,w=84.0MiB/s][r=0,w=21 IOPS][eta 00m:01s]
iodepth: (groupid=0, jobs=1): err= 0: pid=166918: Tue Oct 29 13:57:36 2019
Description : ["write test with block size of 4M"]
write: IOPS=28, BW=112MiB/s (118MB/s)(20.0GiB/182508msec)
slat (usec): min=769, max=9477, avg=1131.69, stdev=488.97
clat (msec): min=119, max=3205, avg=1139.12, stdev=639.71
lat (msec): min=121, max=3206, avg=1140.25, stdev=639.70
clat percentiles (msec):
| 1.00th=[ 342], 5.00th=[ 393], 10.00th=[ 430], 20.00th=[ 510],
| 30.00th=[ 609], 40.00th=[ 751], 50.00th=[ 953], 60.00th=[ 1217],
| 70.00th=[ 1586], 80.00th=[ 1888], 90.00th=[ 2089], 95.00th=[ 2198],
| 99.00th=[ 2400], 99.50th=[ 2500], 99.90th=[ 2702], 99.95th=[ 3071],
| 99.99th=[ 3205]
bw ( KiB/s): min=98107, max=139264, per=99.96%, avg=114857.43, stdev=7951.04, samples=363
iops : min= 23, max= 34, avg=28.03, stdev= 1.95, samples=363
lat (msec) : 250=0.10%, 500=19.14%, 750=20.90%, 1000=11.76%, 2000=33.63%
lat (msec) : >=2000=14.47%
cpu : usr=30.88%, sys=69.09%, ctx=1537, majf=0, minf=556167
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.3%, 32=99.4%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwt: total=0,5120,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=112MiB/s (118MB/s), 112MiB/s-112MiB/s (118MB/s-118MB/s), io=20.0GiB (21.5GB), run=182508-182508msec
Disk stats (read/write):
sdb: ios=0/5880, merge=0/619, ticks=0/2268, in_queue=2264, util=0.55%
Result of running the same fio test locally on an OSD node:
$ fio ./write.fio
iodepth: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=rbd, iodepth=32
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=116MiB/s][r=0,w=29 IOPS][eta 00m:00s]
iodepth: (groupid=0, jobs=1): err= 0: pid=49099: Tue Oct 29 14:03:52 2019
Description : ["write test with block size of 4M"]
write: IOPS=60, BW=242MiB/s (253MB/s)(20.0GiB/84753msec)
slat (usec): min=828, max=10937, avg=1645.38, stdev=547.21
clat (msec): min=173, max=1643, avg=527.74, stdev=204.72
lat (msec): min=175, max=1644, avg=529.39, stdev=204.63
clat percentiles (msec):
| 1.00th=[ 228], 5.00th=[ 275], 10.00th=[ 305], 20.00th=[ 363],
| 30.00th=[ 388], 40.00th=[ 430], 50.00th=[ 498], 60.00th=[ 550],
| 70.00th=[ 609], 80.00th=[ 667], 90.00th=[ 802], 95.00th=[ 936],
| 99.00th=[ 1167], 99.50th=[ 1250], 99.90th=[ 1351], 99.95th=[ 1569],
| 99.99th=[ 1636]
bw ( KiB/s): min=16582, max=425984, per=100.00%, avg=247500.82, stdev=77976.28, samples=169
iops : min= 4, max= 104, avg=60.21, stdev=19.11, samples=169
lat (msec) : 250=2.52%, 500=47.89%, 750=37.46%, 1000=8.73%, 2000=3.40%
cpu : usr=35.68%, sys=64.13%, ctx=437, majf=0, minf=646673
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=0.3%, 32=99.4%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwt: total=0,5120,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=242MiB/s (253MB/s), 242MiB/s-242MiB/s (253MB/s-253MB/s), io=20.0GiB (21.5GB), run=84753-84753msec
Disk stats (read/write):
sda: ios=0/27841, merge=0/311548, ticks=0/2858132, in_queue=2858416, util=99.55%
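The job above measures sequential 4M write throughput. For an IOPS-oriented number, the usual change is a small block size and random writes; a sketch of such a variant of write.fio (same image, purely illustrative values):
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test_image
runtime=120
rw=randwrite
bs=4k
[iodepth]
iodepth=32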
2. Using fio + the libaio ioengine
An example command that illustrates the most common fio parameters:
fio -rw=randwrite -ioengine=libaio -direct=1 -thread -numjobs=1 -iodepth=64 -filename=/testfolder/fio.test -size=100M -name=job1 -offset=0MB -bs=4k -name=job2 -offset=10G -bs=512 -output TestResult.log
Each parameter is explained below:
-rw=randwrite: the I/O pattern. randwrite is random write; other values include write (sequential write), read (sequential read), randread (random read), and mixed read/write modes.
-ioengine=libaio: libaio means asynchronous mode (Linux native asynchronous I/O); use sync for synchronous I/O.
-direct=1: whether to use direct I/O (bypassing the page cache).
-thread: create workers with pthread_create (threads) rather than fork (processes). Processes are more expensive to create than threads, so tests usually use threads.
-numjobs=1: number of threads per job, here 1. The number of jobs is determined by how many -name options are given, so the total number of threads = number of jobs * numjobs.
-name=job1: the name of a job; any name will do, and duplicates are allowed. This example defines two jobs, job1 and job2; the options that follow each -name apply only to that job.
-iodepth=64: a queue depth of 64.
-filename=/testfolder/fio.test: write the data to the file /testfolder/fio.test; this can also be a disk (block device) such as /dev/sda4.
-size=100M: each thread writes 100 MB of data.
-offset=0MB: start writing at offset 0 MB.
-bs=4k: each I/O carries 4 KB of data. A typical 4 KB IOPS test is configured here.
-output TestResult.log: write the output to TestResult.log.
See Ceph cluster setup series (4): using the CephFS client to mount CephFS
See Ceph cluster setup series (6): RBD use cases, how RBD works, and how to create and mount a Ceph RBD
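For the RBD case (kernel client, I/O path rbd.ko + libceph.ko) the image has to be mapped first; a minimal sketch reusing the fio_test_image created earlier (the /dev/rbdX name depends on what is already mapped, and writing to the raw device destroys its contents):
$ sudo rbd map rbd/fio_test_image   # the image shows up as e.g. /dev/rbd0
$ sudo fio -rw=randwrite -ioengine=libaio -direct=1 -iodepth=64 -bs=4k -size=1G -numjobs=1 -name=rbd_job -filename=/dev/rbd0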
In the rest of this section we take CephFS as the test target:
$ df -h|grep ceph
ceph-fuse 6.4G 0 6.4G 0% /home/yjiang2/client_cephfs_mnt
Change into the directory /home/yjiang2/client_cephfs_mnt and run the following fio command:
fio -rw=randwrite -ioengine=libaio -direct=1 -thread -numjobs=1 -iodepth=64 -filename=./cephfs.fio.test -size=20M -name=job1 --bs=512 -output TestResult2.log
Check the result:
$ cat TestResult2.log
job1: (g=0): rw=randwrite, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=libaio, iodepth=64
fio-3.1
Starting 1 thread
job1: Laying out IO file (1 file / 20MiB)
fio: native_fallocate call failed: Operation not supported
job1: (groupid=0, jobs=1): err= 0: pid=10322: Wed Jul 17 15:42:29 2019
write: IOPS=487, BW=244KiB/s (249kB/s)(20.0MiB/84067msec)
slat (usec): min=1552, max=95529, avg=2045.93, stdev=960.57
clat (usec): min=4, max=499402, avg=129194.66, stdev=17616.38
lat (usec): min=1635, max=501689, avg=131241.49, stdev=17755.64
clat percentiles (msec):
| 1.00th=[ 116], 5.00th=[ 120], 10.00th=[ 121], 20.00th=[ 123],
| 30.00th=[ 124], 40.00th=[ 126], 50.00th=[ 128], 60.00th=[ 130],
| 70.00th=[ 132], 80.00th=[ 134], 90.00th=[ 138], 95.00th=[ 144],
| 99.00th=[ 155], 99.50th=[ 190], 99.90th=[ 489], 99.95th=[ 498],
| 99.99th=[ 502]
bw ( KiB/s): min= 152, max= 265, per=100.00%, avg=243.19, stdev=14.94, samples=168
iops : min= 304, max= 530, avg=486.41, stdev=29.87, samples=168
lat (usec) : 10=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.04%
lat (msec) : 100=0.07%, 250=99.55%, 500=0.31%
cpu : usr=0.47%, sys=0.76%, ctx=81923, majf=0, minf=1
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.8%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwt: total=0,40960,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
WRITE: bw=244KiB/s (249kB/s), 244KiB/s-244KiB/s (249kB/s-249kB/s), io=20.0MiB (20.0MB), run=84067-84067msec
Interpreting the fio results:
fio prints statistics for every job, followed by the aggregate totals. What we usually care most about is the overall throughput and latency. Start with the total bandwidth: bw=244KiB/s (249kB/s).
Then look at latency. slat is the submission latency (the time to issue the command), clat is the completion latency, and lat is the total latency.
clat percentiles (usec) gives the statistical distribution of the completion latency.
Other fields in the output:
io = total amount of I/O performed
bw = average I/O bandwidth
iops = IOPS
runt = run time of the thread
cpu = CPU utilization
IO depths = distribution of the I/O queue depth
IO submit = number of I/Os submitted per submit call
IO complete = like the submit numbers above, but for completions
IO issued = the number of read/write requests issued, and how many of them were short
IO latencies = distribution of the I/O completion latencies
In the group status ("Run status group ..."):
io = total amount of I/O performed by the group
agg = aggregate bandwidth of the group
min = minimum average bandwidth
max = maximum average bandwidth
In the disk stats, in_queue = total time spent in the disk queue
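If the results are going to be post-processed rather than read by hand, fio can also emit them as JSON; a sketch (the field names and units below are as produced by fio 3.x and should be checked against your version):
$ fio ./write.fio --output-format=json --output=result.json
$ jq '.jobs[0].write.bw, .jobs[0].write.clat_ns.mean' result.json   # average bandwidth (KiB/s) and mean completion latency (ns)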
The test can also be written as a fio job file and run that way:
$ fio ./testceph.fio
job1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
job2: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.1
Starting 2 threads
job1: (groupid=0, jobs=1): err= 0: pid=10973: Wed Jul 17 16:31:08 2019
read: IOPS=3168, BW=12.4MiB/s (12.0MB/s)(10.0MiB/808msec)
slat (usec): min=205, max=1768, avg=311.69, stdev=85.39
clat (nsec): min=1795, max=26418k, avg=19430002.02, stdev=2434089.74
lat (usec): min=213, max=28188, avg=19742.11, stdev=2455.07
clat percentiles (usec):
| 1.00th=[ 5800], 5.00th=[17171], 10.00th=[17695], 20.00th=[18744],
| 30.00th=[19006], 40.00th=[19268], 50.00th=[19530], 60.00th=[19792],
| 70.00th=[20317], 80.00th=[20841], 90.00th=[21365], 95.00th=[22152],
| 99.00th=[23200], 99.50th=[23725], 99.90th=[24773], 99.95th=[25297],
| 99.99th=[26346]
bw ( KiB/s): min=11896, max=11896, per=46.93%, avg=11896.00, stdev= 0.00, samples=1
iops : min= 2974, max= 2974, avg=2974.00, stdev= 0.00, samples=1
lat (usec) : 2=0.04%, 250=0.04%, 500=0.04%, 750=0.04%, 1000=0.04%
lat (msec) : 2=0.20%, 4=0.31%, 10=0.94%, 20=60.86%, 50=37.50%
cpu : usr=1.61%, sys=1.73%, ctx=2564, majf=0, minf=65
IO depths : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=1.2%, >=64=97.5%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwt: total=2560,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
job2: (groupid=0, jobs=1): err= 0: pid=10974: Wed Jul 17 16:31:08 2019
read: IOPS=3180, BW=12.4MiB/s (13.0MB/s)(10.0MiB/805msec)
slat (usec): min=207, max=1776, avg=310.35, stdev=80.05
clat (usec): min=2, max=25557, avg=19362.08, stdev=2337.87
lat (usec): min=223, max=27336, avg=19672.84, stdev=2356.14
clat percentiles (usec):
| 1.00th=[ 6259], 5.00th=[17433], 10.00th=[17957], 20.00th=[18482],
| 30.00th=[19006], 40.00th=[19268], 50.00th=[19530], 60.00th=[19792],
| 70.00th=[20055], 80.00th=[20579], 90.00th=[21103], 95.00th=[22152],
| 99.00th=[23725], 99.50th=[24511], 99.90th=[24773], 99.95th=[25035],
| 99.99th=[25560]
bw ( KiB/s): min=11920, max=11920, per=47.03%, avg=11920.00, stdev= 0.00, samples=1
iops : min= 2980, max= 2980, avg=2980.00, stdev= 0.00, samples=1
lat (usec) : 4=0.04%, 250=0.04%, 500=0.04%, 1000=0.04%
lat (msec) : 2=0.20%, 4=0.27%, 10=0.90%, 20=65.23%, 50=33.24%
cpu : usr=1.00%, sys=2.49%, ctx=2565, majf=0, minf=65
IO depths : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=1.2%, >=64=97.5%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwt: total=2560,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=24.8MiB/s (25.0MB/s), 12.4MiB/s-12.4MiB/s (12.0MB/s-13.0MB/s), io=20.0MiB (20.0MB), run=805-808msec
The content of testceph.fio is as follows:
$ cat ./testceph.fio
[global]
filename=/home/yjiang2/client_cephfs_mnt/libaio-test-file
direct=1
iodepth=64
thread
rw=randread
ioengine=libaio
bs=4k
numjobs=1
size=10M
[job1]
name=job1
offset=0
[job2]
name=job2
offset=10M
;-- end of job file
3. Summary

| Tool | Purpose | Syntax | Notes |
| --- | --- | --- | --- |
| fio + rbd ioengine | Performance testing with fio and the rbd I/O engine; the I/O path is librbd + librados. | See fio --help | |
| fio + libaio | fio configured for Linux AIO; both rbd and CephFS can be tested. For rbd (I/O path rbd.ko + libceph.ko): after rbd map the image appears as /dev/rbd0, which is then formatted and mounted; for CephFS: create the filesystem, mount it, and test. | | |
References:
Ceph Performance Analysis: fio and RBD
fio测试ceph块设备rbd性能 (testing Ceph RBD block device performance with fio)