Benchmarking RocksDB performance with db_bench

RocksDB ships with the db_bench benchmark tool, which can evaluate its performance across many dimensions: sequential and random reads/writes, hot-key reads/writes, deletes, merges, seeks, checksum computation, and more. It is very handy.

Building the tool

Here we build with cmake, mainly so that we can point the build at third-party libraries installed under our own user directory (gflags, etc.) and choose whether RocksDB's compression support is compiled in. Building directly with the stock Makefile causes more trouble, and some libraries that db_bench depends on do not get linked automatically (compression libraries such as zstd and snappy are not compiled into db_bench by default).

If your db_bench tool is already installed, you can skip this section.

The basic workflow is as follows:

  1. Download the RocksDB source code
    git clone https://github.com/facebook/rocksdb.git
    
    If you need a specific version, after cloning run
    git checkout xxx   # switch to the branch/tag of that version
  2. Build and install the third-party libraries
    a. gflags
    a. git clone https://github.com/gflags/gflags.git
    b. cd gflags
    c. mkdir build && cd build
    
    # The path after -DCMAKE_INSTALL_PREFIX is wherever you want gflags installed; if you
    # have root access and can install into the system directories, the prefix option can
    # be omitted. BUILD_SHARED_LIBS builds the gflags shared library, which is off by default.
    d. cmake .. -DCMAKE_INSTALL_PREFIX=/xxx -DBUILD_SHARED_LIBS=1 -DCMAKE_BUILD_TYPE=Release
    
    e. make && make install
    
    # Add the gflags include and lib paths to the environment. If no prefix was specified
    # above, gflags is installed under /usr/local by default.
    f. Edit the current user's ~/.bashrc and add:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/xxx/gcc-5.3/lib64:/xxx/gflags/lib
    export LIBRARY_PATH=$LIBRARY_PATH:/xxx/gflags/lib
    export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/xxx/gflags/include
    
    b. Install snappy
    sudo yum install snappy snappy-devel
    c. Install zlib
    yum install zlib zlib-devel
    d. Install bzip2
    yum install bzip2 bzip2-devel
    e. Install lz4
    yum install lz4-devel
    f. Install zstandard
    wget https://github.com/facebook/zstd/archive/v1.1.3.tar.gz
    mv v1.1.3.tar.gz zstd-1.1.3.tar.gz
    tar zxvf zstd-1.1.3.tar.gz
    cd zstd-1.1.3
    make && sudo make install
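    
    Optionally, a quick sanity check (a sketch) that the compression libraries are now visible to the dynamic linker; run sudo ldconfig first if zstd was just installed from source:
    ldconfig -p | grep -E 'snappy|zstd|lz4|bz2'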
    
  3. Generate the Makefile
    cd rocksdb && mkdir build && cd build
    
    # The CMAKE_PREFIX_PATH below must point to the prefix where gflags was installed,
    # otherwise the build cannot link against the gflags library.
    # If your cmake version is too old, use cmake3 instead.
    # The -DWITH_xxx options enable the compression algorithms; without them, db_bench
    # cannot find the corresponding libraries when RocksDB compresses data.
    cmake .. -DCMAKE_PREFIX_PATH=/xxx -DWITH_SNAPPY=1 -DWITH_LZ4=1 -DWITH_ZLIB=1 -DWITH_ZSTD=1 -DCMAKE_BUILD_TYPE=Release
    
  4. Build
    The step above uses the CMakeLists.txt in the parent directory to generate a Makefile in the current build directory; now run the build:
    make -j
    

On success, the db_bench binary is generated in the current (build) directory.
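
A quick way to confirm the build is usable (a sketch: the /tmp path is just a throwaway directory, and the ldd check only applies if the compression libraries were linked dynamically):

    ldd ./db_bench | grep -E 'snappy|zstd|lz4|bz2'
    ./db_bench --benchmarks=fillseq --num=10000 --db=/tmp/db_bench_smoke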

Performance benchmarking

db_bench has far too many options to cover one by one, so we follow the test approach used by the community.
The core option is --benchmarks, which selects the workloads for the run; the available benchmarks are listed below:

fillseq       -- write N values in sequential key order in async mode
fillseqdeterministic       -- write N values in the specified key order and keep the shape of the LSM tree
fillrandom    -- write N values in random key order in async mode
filluniquerandomdeterministic       -- write N values in a random key order and keep the shape of the LSM tree
overwrite     -- overwrite N values in random key order in async mode
fillsync      -- write N/100 values in random key order in sync mode
fill100K      -- write N/1000 100K values in random order in async mode
deleteseq     -- delete N keys in sequential order
deleterandom  -- delete N keys in random order
readseq       -- read N times sequentially
readtocache   -- 1 thread reading database sequentially
readreverse   -- read N times in reverse order
readrandom    -- read N times in random order
readmissing   -- read N missing keys in random order
readwhilewriting      -- 1 writer, N threads doing random reads
readwhilemerging      -- 1 merger, N threads doing random reads
readrandomwriterandom -- N threads doing random-read, random-write
prefixscanrandom      -- prefix scan N times in random order
updaterandom  -- N threads doing read-modify-write for random keys
appendrandom  -- N threads doing read-modify-write with growing values
mergerandom   -- same as updaterandom/appendrandom using merge operator. Must be used with merge_operator
readrandommergerandom -- perform N random read-or-merge operations. Must be used with merge_operator
newiterator   -- repeated iterator creation
seekrandom    -- N random seeks, call Next seek_nexts times per seek
seekrandomwhilewriting -- seekrandom and 1 thread doing overwrite
seekrandomwhilemerging -- seekrandom and 1 thread doing merge
crc32c        -- repeated crc32c of 4K of data
xxhash        -- repeated xxHash of 4K of data
acquireload   -- load N*1000 times
fillseekseq   -- write N values in sequential key, then read them by seeking to each key
randomtransaction     -- execute N random transactions and verify correctness
randomreplacekeys     -- randomly replaces N keys by deleting the old version and putting the new version
timeseries            -- 1 writer generates time series data and multiple readers doing random reads on id
  • Create a db and write some data: ./db_bench --benchmarks="fillseq"
    Run this way, however, it does not print much useful meta information:

    DB path: [/tmp/rocksdbtest-1001/dbbench]
    fillseq      :       2.354 micros/op 424867 ops/sec;   47.0 MB/s
    
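    As a quick sanity check, the two throughput figures are consistent: with db_bench's default 16-byte keys and 100-byte values, 424867 ops/sec × 116 bytes/op ≈ 47 MB/s (taking 1 MB = 2^20 bytes).
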
  • Create a db and print some meta information: ./db_bench --benchmarks="fillseq,stats"
    --benchmarks lists the tests to run in order and can chain several of them; here a sequential write is followed by printing the db's state.
    This prints the db-related stats, including both the db stats and the compaction stats:

    DB path: [/tmp/rocksdbtest-1001/dbbench]
    # sequential-write performance figures
    fillseq      :       2.311 micros/op 432751 ops/sec;   47.9 MB/s
    
    
    ** Compaction Stats [default] **
    Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      L0      1/0   28.88 MB   0.2      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0     60.6      0.48              0.31         1    0.477       0      0
     Sum      1/0   28.88 MB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0     60.6      0.48              0.31         1    0.477       0      0
     Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0     60.6      0.48              0.31         1    0.477       0      0
    
    ** Compaction Stats [default] **
    Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    High      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0     60.6      0.48              0.31         1    0.477       0      0
    Uptime(secs): 2.3 total, 2.3 interval
    Flush(GB): cumulative 0.028, interval 0.028
    AddFile(GB): cumulative 0.000, interval 0.000
    AddFile(Total Files): cumulative 0, interval 0
    AddFile(L0 Files): cumulative 0, interval 0
    AddFile(Keys): cumulative 0, interval 0
    Cumulative compaction: 0.03 GB write, 12.34 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.5 seconds
    Interval compaction: 0.03 GB write, 12.50 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.5 seconds
    Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
    
    ** File Read Latency Histogram By Level [default] **
    
    ** DB Stats **
    Uptime(secs): 2.3 total, 2.3 interval
    Cumulative writes: 1000K writes, 1000K keys, 1000K commit groups, 1.0 writes per commit group, ingest: 0.12 GB, 53.39 MB/s
    Cumulative WAL: 1000K writes, 0 syncs, 1000000.00 writes per sync, written: 0.12 GB, 53.39 MB/s
    Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
    Interval writes: 1000K writes, 1000K keys, 1000K commit groups, 1.0 writes per commit group, ingest: 124.93 MB, 54.06 MB/s
    Interval WAL: 1000K writes, 0 syncs, 1000000.00 writes per sync, written: 0.12 MB, 54.06 MB/s
    Interval stall: 00:00:0.000 H:M:S, 0.0 percent
    

    The additional meta operations are as follows:

    • compact: compact the entire database
    • stats: print the db's state information
    • resetstats: reset the db's state information
    • levelstats: print the number of files and the space used at each level
    • sstables: print information about the sst files
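
    For example, these meta operations can be chained after a write benchmark (a sketch; fillseq just gives them something to report on):

    ./db_bench --benchmarks="fillseq,levelstats,sstables"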

    The corresponding sstables and levelstats output looks like this:

    --- level 0 --- version# 2 ---
     7:30286882[1 .. 448148]['00000000000000003030303030303030' \
     seq:1, type:1 .. '000000000006D6933030303030303030' seq:448148, type:1](0)
    --- level 1 --- version# 2 ---
    --- level 2 --- version# 2 ---
    --- level 3 --- version# 2 ---
    --- level 4 --- version# 2 ---
    --- level 5 --- version# 2 ---
    --- level 6 --- version# 2 ---
    
    
    Level Files Size(MB)
    --------------------
      0        1       29
      1        0        0
      2        0        0
      3        0        0
      4        0        0
      5        0        0
      6        0        0
    
  • Standalone random write test
    The relevant parameters can be tuned as you like; only a subset is listed here.
    You can inspect all available options with ./db_bench --help; of course, you should understand a parameter before setting it.
    Note that the inline # comments in the command below are annotations only; strip them (keeping the trailing backslashes) before pasting the command into a shell, since a comment after a line-continuation backslash breaks the command.

    ./db_bench  \
    --benchmarks="fillrandom,stats,levelstats" \
    --enable_write_thread_adaptive_yield=false \
    --disable_auto_compactions=false \
    --max_background_compactions=32 \
    --max_background_flushes=4 \
    --write_buffer_size=536870912 \
    --min_write_buffer_number_to_merge=2 \
    --max_write_buffer_number=6 \
    --target_file_size_base=67108864 \
    --max_bytes_for_level_base=536870912 \
    --num=500000000 \ # total number of write requests; with --duration set, the run stops after 30 s even if this count is not reached
    --duration=30 \ # run the IO for 30 seconds
    --threads=1000 \ # 1000 concurrent threads
    --value_size=8192 \ # value size is 8 KB
    --key_size=16 \ # key size is 16 B
    --enable_pipelined_write=true \
    --db=./db_bench_test \ # directory in which the db is created
    --wal_dir=./db_bench_test \ # directory in which the WAL is created
    --allow_concurrent_memtable_write=true \ # allow concurrent memtable writes
    --disable_wal=false \
    --batch_size=1 \
    --sync=false \ # whether to sync every write
    --block_cache_trace_file=/readable_trace_path \
    --block_cache_trace_max_trace_file_size_in_bytes=1073741824 \
    --block_cache_trace_sampling_frequency=1
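    
    After such a run, a rough check of the on-disk footprint of the database created above might be (the path matches the --db directory in the command):
    du -sh ./db_bench_test
    ls ./db_bench_test/*.sst | wc -l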
    
  • Random read
    First write sequentially, print the db's state information, then read randomly:

    ./db_bench \
    --benchmarks="fillseq,stats,readrandom,stats" \
    --enable_write_thread_adaptive_yield=false \
    --disable_auto_compactions=false \
    --max_background_compactions=32 \
    --max_background_flushes=4 \
    --write_buffer_size=536870912 \
    --min_write_buffer_number_to_merge=2 \
    --max_write_buffer_number=6 \
    --target_file_size_base=67108864 \
    --max_bytes_for_level_base=536870912 \
    --num=500000000 \ # total number of write requests; with --duration set, the run stops after 30 s even if this count is not reached
    --duration=30 \ # run the IO for 30 seconds
    --threads=1000 \ # 1000 concurrent threads
    --value_size=8192 \ # value size is 8 KB
    --key_size=16 \ # key size is 16 B
    --enable_pipelined_write=true \
    --db=./db_bench_test \ # directory in which the db is created
    --wal_dir=./db_bench_test \ # directory in which the WAL is created
    --allow_concurrent_memtable_write=true \ # allow concurrent memtable writes
    --disable_wal=false
    

    To test hot-key reads, you can pass --key_id_range=100000, which restricts the generated key ids to a range of 100000; this test requires adding timeseries to --benchmarks, as sketched below.
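
    A minimal sketch of such a hot-read run (the flag values here are illustrative, and the db directory is the one created by the write test above):

    ./db_bench \
    --benchmarks="timeseries,stats" \
    --key_id_range=100000 \
    --duration=30 \
    --threads=32 \
    --value_size=8192 \
    --key_size=16 \
    --db=./db_bench_test \
    --wal_dir=./db_bench_test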
