RocksDB ships a benchmark tool, db_bench, for evaluating its performance across all dimensions: sequential and random reads/writes, hot-key reads/writes, deletes, merges, point lookups, checksum validation, and more. It is very handy.
Here we build with cmake, mainly so that we can point the build at third-party libraries installed under our own user (gflags, etc.) and choose whether to compile in RocksDB's compression algorithms. Running make directly against the stock Makefile is more troublesome, and the libraries db_bench depends on are not linked in automatically (compression algorithms such as zstd and snappy are not compiled into db_bench by default).
If your db_bench tool is already installed, you can skip this step.
The basic workflow is as follows:
git clone https://github.com/facebook/rocksdb.git
If you need a specific version, run git checkout xxx after cloning to switch to that version's branch or tag.
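For instance, to pin a specific release (the tag below is only an illustration; substitute whichever version you need):
cd rocksdb
git checkout v6.4.6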
Next, build and install gflags:
a. git clone https://github.com/gflags/gflags.git
b. cd gflags
c. mkdir build && cd build
#The path after -DCMAKE_INSTALL_PREFIX below is whatever directory you want to install into; if you have root access and can install into the system directories, the prefix option can be omitted. BUILD_SHARED_LIBS enables building the gflags shared library, which is not built by default
d. cmake .. -DCMAKE_INSTALL_PREFIX=/xxx -DBUILD_SHARED_LIBS=1 -DCMAKE_BUILD_TYPE=Release
e. make && make install
#Add the gflags include and lib paths to the system search paths; if no prefix was specified above,
#the default install location is /usr/local
f. Edit the current user's bashrc and add the following:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/xxx/gcc-5.3/lib64:/xxx/gflags/lib
export LIBRARY_PATH=$LIBRARY_PATH:/xxx/gflags/lib
export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/xxx/gflags/include
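After editing, reload the environment and sanity-check that the paths took effect (using the same /xxx prefix placeholder as above):
source ~/.bashrc
echo $LD_LIBRARY_PATH
ls /xxx/gflags/lib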
Next, install the compression libraries.
snappy:
sudo yum install snappy snappy-devel
zlib:
sudo yum install zlib zlib-devel
bzip2:
sudo yum install bzip2 bzip2-devel
lz4:
sudo yum install lz4-devel
zstandard (built from source):
wget https://github.com/facebook/zstd/archive/v1.1.3.tar.gz
mv v1.1.3.tar.gz zstd-1.1.3.tar.gz
tar zxvf zstd-1.1.3.tar.gz
cd zstd-1.1.3
make && sudo make install
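Optionally verify the install (zstd's Makefile installs under the /usr/local prefix by default):
zstd --version
ls /usr/local/lib | grep zstd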
cd rocksdb && mkdir build && cd build
#The prefix path below must be the prefix where gflags was installed, otherwise the build cannot link against the gflags library
#If your cmake version is too old, use cmake3
#The -DWITH_xxx options enable the compression-algorithm build flags; without them, db_bench cannot find the corresponding libraries when RocksDB compresses data at runtime
cmake .. -DCMAKE_PREFIX_PATH=/xxx -DWITH_SNAPPY=1 -DWITH_LZ4=1 -DWITH_ZLIB=1 -DWITH_ZSTD=1 -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
On success, the db_bench tool is generated in the current directory.
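To confirm the freshly built binary works, a quick smoke test (the tiny --num value here is only illustrative, not a real measurement):
./db_bench --benchmarks="fillseq" --num=10000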
Since db_bench has far too many options to cover here, we simply follow the community's testing approach.
The core is the benchmark: it specifies the workload used for the run. The list of benchmarks is as follows (a combined example follows the list):
fillseq -- write N values in sequential key order in async mode
fillseqdeterministic -- write N values in the specified key order and keep the shape of the LSM tree
fillrandom -- write N values in random key order in async mode
filluniquerandomdeterministic -- write N values in a random key order and keep the shape of the LSM tree
overwrite -- overwrite N values in random key order in async mode
fillsync -- write N/100 values in random key order in sync mode
fill100K -- write N/1000 100K values in random order in async mode
deleteseq -- delete N keys in sequential order
deleterandom -- delete N keys in random order
readseq -- read N times sequentially
readtocache -- 1 thread reading database sequentially
readreverse -- read N times in reverse order
readrandom -- read N times in random order
readmissing -- read N missing keys in random order
readwhilewriting -- 1 writer, N threads doing random reads
readwhilemerging -- 1 merger, N threads doing random reads
readrandomwriterandom -- N threads doing random-read, random-write
prefixscanrandom -- prefix scan N times in random order
updaterandom -- N threads doing read-modify-write for random keys
appendrandom -- N threads doing read-modify-write with growing values
mergerandom -- same as updaterandom/appendrandom using merge operator. Must be used with merge_operator
readrandommergerandom -- perform N random read-or-merge operations. Must be used with merge_operator
newiterator -- repeated iterator creation
seekrandom -- N random seeks, call Next seek_nexts times per seek
seekrandomwhilewriting -- seekrandom and 1 thread doing overwrite
seekrandomwhilemerging -- seekrandom and 1 thread doing merge
crc32c -- repeated crc32c of 4K of data
xxhash -- repeated xxHash of 4K of data
acquireload -- load N*1000 times
fillseekseq -- write N values in sequential key, then read them by seeking to each key
randomtransaction -- execute N random transactions and verify correctness
randomreplacekeys -- randomly replaces N keys by deleting the old version and putting the new version
timeseries -- 1 writer generates time series data and multiple readers doing random reads on id
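Multiple benchmarks can be chained in a single run, comma-separated. A small sketch that fills the DB sequentially and then measures random and sequential reads (the --num value is illustrative):
./db_bench --benchmarks="fillseq,readrandom,readseq" --num=100000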
Create a DB and write some data:
./db_bench --benchmarks="fillseq"
However, this does not print much useful meta information:
DB path: [/tmp/rocksdbtest-1001/dbbench]
fillseq : 2.354 micros/op 424867 ops/sec; 47.0 MB/s
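As a sanity check on these numbers, assuming db_bench's defaults of 16-byte keys and 100-byte values: 1 op / 2.354 µs ≈ 424,867 ops/sec, and 424,867 ops/sec × 116 bytes/op ≈ 47.0 MB/s, which matches the report.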
Create a DB and also print some meta information:
./db_bench --benchmarks="fillseq,stats"
--benchmarks specifies the test sequence, and entries can be chained; this run does a sequential write and then prints the DB's status. That way the DB's stats are printed, including both the DB stats and the compaction stats:
DB path: [/tmp/rocksdbtest-1001/dbbench]
# performance of the sequential write
fillseq : 2.311 micros/op 432751 ops/sec; 47.9 MB/s
** Compaction Stats [default] **
Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
L0 1/0 28.88 MB 0.2 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 60.6 0.48 0.31 1 0.477 0 0
Sum 1/0 28.88 MB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 60.6 0.48 0.31 1 0.477 0 0
Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 60.6 0.48 0.31 1 0.477 0 0
** Compaction Stats [default] **
Priority Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
High 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 60.6 0.48 0.31 1 0.477 0 0
Uptime(secs): 2.3 total, 2.3 interval
Flush(GB): cumulative 0.028, interval 0.028
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.03 GB write, 12.34 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.5 seconds
Interval compaction: 0.03 GB write, 12.50 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.5 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
** File Read Latency Histogram By Level [default] **
** DB Stats **
Uptime(secs): 2.3 total, 2.3 interval
Cumulative writes: 1000K writes, 1000K keys, 1000K commit groups, 1.0 writes per commit group, ingest: 0.12 GB, 53.39 MB/s
Cumulative WAL: 1000K writes, 0 syncs, 1000000.00 writes per sync, written: 0.12 GB, 53.39 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 1000K writes, 1000K keys, 1000K commit groups, 1.0 writes per commit group, ingest: 124.93 MB, 54.06 MB/s
Interval WAL: 1000K writes, 0 syncs, 1000000.00 writes per sync, written: 0.12 GB, 54.06 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent
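If you additionally want RocksDB's internal tick counters (block cache hits, bytes compacted, and so on) in the report, db_bench also accepts the --statistics flag; a minimal example (the exact output format varies by version):
./db_bench --benchmarks="fillseq,stats" --statistics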
More meta operations are available beyond stats; for example, the sstables and levelstats meta operations display the following (a combined command follows the tables):
--- level 0 --- version# 2 ---
7:30286882[1 .. 448148]['00000000000000003030303030303030' \
seq:1, type:1 .. '000000000006D6933030303030303030' seq:448148, type:1](0)
--- level 1 --- version# 2 ---
--- level 2 --- version# 2 ---
--- level 3 --- version# 2 ---
--- level 4 --- version# 2 ---
--- level 5 --- version# 2 ---
--- level 6 --- version# 2 ---
Level Files Size(MB)
--------------------
0 1 29
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
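To reproduce both displays above in a single run, append the meta operations after a write benchmark, for example:
./db_bench --benchmarks="fillseq,sstables,levelstats"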
Standalone random-write test
The relevant parameters can all be configured yourself; only a subset is listed here. You can look up the options you want via ./db_bench --help, though you should have some understanding of each parameter before configuring it.
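For instance, to filter the very long help output for a particular option family (the keyword here is just an illustration):
./db_bench --help 2>&1 | grep write_buffer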
# --num: total number of write requests; if it is not reached, the run still stops after --duration seconds
# --duration: run the IO for 30 seconds
# --threads: 1000 concurrent threads
# --value_size: 8K values; --key_size: 16-byte keys
# --db / --wal_dir: directories in which to create the DB and its WAL
# --allow_concurrent_memtable_write: allow concurrent writes to the memtable
# --sync: whether to sync each write
./db_bench \
--benchmarks="fillrandom,stats,levelstats" \
--enable_write_thread_adaptive_yield=false \
--disable_auto_compactions=false \
--max_background_compactions=32 \
--max_background_flushes=4 \
--write_buffer_size=536870912 \
--min_write_buffer_number_to_merge=2 \
--max_write_buffer_number=6 \
--target_file_size_base=67108864 \
--max_bytes_for_level_base=536870912 \
--num=500000000 \
--duration=30 \
--threads=1000 \
--value_size=8192 \
--key_size=16 \
--enable_pipelined_write=true \
--db=./db_bench_test \
--wal_dir=./db_bench_test \
--allow_concurrent_memtable_write=true \
--disable_wal=false \
--batch_size=1 \
--sync=false \
--block_cache_trace_file=/readable_trace_path \
--block_cache_trace_max_trace_file_size_in_bytes=1073741824 \
--block_cache_trace_sampling_frequency=1
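While the test runs, you can watch the DB directory grow from another terminal (the path matches the --db setting above):
watch -n 1 du -sh ./db_bench_test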
Random read
First write sequentially and print the DB's status information, then read randomly.
# --num: total number of write requests; if it is not reached, the run still stops after --duration seconds
# --duration / --threads / --value_size / --key_size: 30-second run, 1000 concurrent threads, 8K values, 16-byte keys
# --db / --wal_dir: directories in which to create the DB and its WAL
# --allow_concurrent_memtable_write: allow concurrent writes to the memtable
./db_bench \
--benchmarks="fillseq,stats,readrandom,stats" \
--enable_write_thread_adaptive_yield=false \
--disable_auto_compactions=false \
--max_background_compactions=32 \
--max_background_flushes=4 \
--write_buffer_size=536870912 \
--min_write_buffer_number_to_merge=2 \
--max_write_buffer_number=6 \
--target_file_size_base=67108864 \
--max_bytes_for_level_base=536870912 \
--num=500000000 \
--duration=30 \
--threads=1000 \
--value_size=8192 \
--key_size=16 \
--enable_pipelined_write=true \
--db=./db_bench_test \
--wal_dir=./db_bench_test \
--allow_concurrent_memtable_write=true \
--disable_wal=false
To test hot-key reads, you can pass --key_id_range=100000, meaning the generated key ids fall within a range of 100000; this test requires adding timeseries to the benchmark list.
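Putting it together, a minimal hot-read sketch based on the flags above (the --num value is illustrative):
./db_bench --benchmarks="timeseries" --key_id_range=100000 --num=1000000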