7. Rally Production Configuration

Reference: https://esrally.readthedocs.io/en/latest/rally_daemon.html

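In production, Rally can be used flexibly, depending on what you need:

Rally can run on a single node or in distributed mode. A high-performance cluster needs several load-generating nodes before the benchmark can put enough pressure on it.
Rally can either create an ES instance itself from its built-in cars configuration, or benchmark an existing ES cluster that you point it at.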

I. Rally Daemon Overview

Rally can run as a daemon. There are 3 roles:

Benchmark coordinator(esrally)
Load driver(Rally daemon)
Provisioner(ES)

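Benchmark coordinator [exactly one]

The control node. It runs the esrally command and prints the benchmark report.

Load driver [one or more]

A load-generating node that listens on port 1900.

It parses the tracks configuration and executes the benchmark operations. Passing --load-driver-hosts on the coordinator makes these nodes run the benchmark operations.

Provisioner

The target ES cluster to be benchmarked.

By default Rally uses the cars configuration to create an ES instance locally; alternatively, --target-hosts points Rally at an ES cluster that is already set up.

For convenience, the Benchmark coordinator and the Load drivers are referred to together as the Rally cluster from here on.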

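II. Deploying the Rally Cluster

Suppose 2 machines are available for the Rally cluster. They are used as follows:

IP	roles
192.168.0.11	coordinator node, load driver node
192.168.0.12	load driver node

1. Build the runtime environment

First follow "2. Rally Installation" to install the esrally runtime environment (minimal install) on a non-production machine.

Python and Git have to be compiled from source, and both depend heavily on operating system tooling. To limit the extra work those dependencies cause, it is strongly recommended to build Python and Git separately for each OS. CentOS 6 is used as the example here.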

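Build Git (2.9.5)

# The install directory is named git-1.9 although the source built here is 2.9.5.
# GIT_HOME must be set before ./configure uses it.
GIT_HOME=/apps/svr/esrally/git-1.9

sudo yum install curl-devel expat-devel perl-ExtUtils-MakeMaker
wget https://www.kernel.org/pub/software/scm/git/git-2.9.5.tar.gz
tar -zxf git-2.9.5.tar.gz
cd git-2.9.5
./configure --prefix=$GIT_HOME
# one make job per CPU listed in /proc/cpuinfo
make -j $(grep -c ^process /proc/cpuinfo) all
make install
export PATH=$GIT_HOME/bin:$GIT_HOME/libexec/git-core:$PATH
which git git-add
git --version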
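
After the build it can be useful to confirm that the git found on PATH is new enough. The sketch below assumes the minimum Rally needs is git 1.9, which is what the install directory name suggests; the helper name version_ge is made up for this example, and it relies on `sort -V` from GNU coreutils.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Uses GNU `sort -V` (version sort) to order the two strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (commented out so the snippet has no side effects):
# installed=$(git --version | awk '{print $3}')
# version_ge "$installed" 1.9 || echo "git $installed is older than 1.9" >&2
```

The comparison is done on version strings rather than on numbers, so 1.10 is correctly treated as newer than 1.9.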

Build Python

PYTHON3_HOME=/apps/svr/esrally/python-3.5.2
GIT_HOME=/apps/svr/esrally/git-1.9

sudo yum install glibc-devel openssl-devel bzip2-devel
wget https://www.python.org/ftp/python/3.5.2/Python-3.5.2.tgz
tar -zxf Python-3.5.2.tgz
cd Python-3.5.2
./configure --prefix=$PYTHON3_HOME
make -j $(grep -c ^process /proc/cpuinfo)
make install
export PATH=$PYTHON3_HOME/bin:$PATH
which python3

Install esrally

mkdir -p ~/.pip
cat > ~/.pip/pip.conf << EOF
[global]
index-url = http://mirrors.aliyun.com/pypi/simple/

[install]
trusted-host=mirrors.aliyun.com
EOF

pip3 install esrally
which esrally

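To stay compatible with ES 5.5, the metrics, races and results code and the index template files have to be changed so that the mapping type is doc instead of _doc. Otherwise the following exception is thrown: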

elasticsearch.exceptions.RequestError: TransportError(400, 'invalid_type_name_exception', "mapping type name [_doc] can't start with '_'")

vim /apps/svr/esrally/python-3.5.2/lib/python3.5/site-packages/esrally/metrics.py

# before
METRICS_DOC_TYPE = "_doc"
RESULTS_DOC_TYPE = "_doc"
RACE_DOC_TYPE = "_doc"

# after
METRICS_DOC_TYPE = "doc"
RESULTS_DOC_TYPE = "doc"
RACE_DOC_TYPE = "doc"

cd /apps/svr/esrally/python-3.5.2/lib/python3.5/site-packages/esrally/resources

metrics-template.json
races-template.json
results-template.json

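Edit these 3 files. Each needs the same two changes; the example uses the pattern from metrics-template.json, so keep each file's own pattern value.

# before
"index_patterns": ["rally-metrics-*"],
# after
"template": "rally-metrics-*",

# before
"mappings": {
    "_doc": {
        ...
    }
}
# after
"mappings": {
    "doc": {
        ...
    }
}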
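
The two template edits can also be scripted instead of made by hand. The following is only a sketch: it assumes the key in the shipped templates is spelled `index_patterns` (the Elasticsearch 6+ form) and that the pattern array holds a single entry; the function name patch_template_for_es5 is hypothetical. A .bak copy of each file is kept so the change can be reverted.

```shell
# Rewrite one Rally index template for Elasticsearch 5.x:
#   "index_patterns": ["rally-xxx-*"]  ->  "template": "rally-xxx-*"
#   "_doc": {                          ->  "doc": {
# The original file is preserved as <file>.bak.
patch_template_for_es5() {
    f=$1
    cp "$f" "$f.bak" || return 1
    sed -E \
        -e 's/"index_patterns"[[:space:]]*:[[:space:]]*\[[[:space:]]*("[^"]*")[[:space:]]*\]/"template": \1/' \
        -e 's/"_doc"[[:space:]]*:/"doc":/' \
        "$f.bak" > "$f"
}

# Example usage, run inside the esrally/resources directory:
# for t in metrics-template.json races-template.json results-template.json; do
#     patch_template_for_es5 "$t"
# done
```

Review the result with diff against the .bak file before running a benchmark, since the regular expressions only cover the layout shown above.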

Production workloads rarely use the ingest feature, but Rally only accepts a bulk ingest percentage in the range (0,100]. The lower bound is exclusive, so lowering min_value below 0 makes 0 an accepted value and allows ingest to be disabled:

vim /apps/svr/esrally/python-3.5.2/lib/python3.5/site-packages/esrally/track/params.py

# before
self.ingest_percentage = self.float_param(params, name="ingest-percentage", default_value=100, min_value=0, max_value=100)
# after
self.ingest_percentage = self.float_param(params, name="ingest-percentage", default_value=100, min_value=-1, max_value=100)

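2. Configure Rally

ES clusters:

The benchmark reports are stored in a designated ES instance (in-memory storage is actually sufficient)
The target ES cluster is an existing cluster

In other words, Rally is not used to create any ES instance.

esrally configure --advanced-config

If you store the benchmark reports in an ES instance of your own, the following session can serve as a reference: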

    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/

Running advanced configuration. You can get additional help at:

  https://esrally.readthedocs.io/en/1.2.1/configuration.html

WARNING: Will overwrite existing config file at [/home/apps/.rally/rally.ini]

Enter the benchmark root directory (default: /home/apps/.rally/benchmarks): /apps/svr/esrally/rally/benchmarks

Enter your Elasticsearch project directory: (default: /apps/svr/esrally/rally/benchmarks/src/elasticsearch): /apps/svr/esrally/rally/benchmarks/src/elasticsearch

Where should metrics be kept?

(1) In memory (simpler but less options for analysis)
(2) Elasticsearch (requires a separate ES instance, keeps all raw samples for analysis)

 (default: 1): 2

Enter the host name of the ES metrics store (default: localhost): 192.168.0.11

Enter the port of the ES metrics store: 9200

Use secure connection (True, False) (default: False): 

Username for basic authentication (empty if not needed) (default: ): 

Password for basic authentication (empty if not needed) (default: ): 

Enter a descriptive name for this benchmark environment (ASCII, no spaces) (default: local): es_bentchmark   

Do you want Rally to keep the Elasticsearch benchmark candidate installation including the index (will use several GB per trial run)? (default: False): True

Configuration successfully written to /home/apps/.rally/rally.ini. Happy benchmarking!

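Move the configuration directory

mv /home/apps/.rally /apps/svr/esrally/rally

Create the tracks and data directories

INSTALL_HOME=/apps/svr/esrally
TRACKS_HOME=$INSTALL_HOME/rally/benchmarks/tracks/default
DATA_HOME=$INSTALL_HOME/rally/benchmarks/data
mkdir -p $TRACKS_HOME $DATA_HOME

If production uses an existing ES instance to store the benchmark reports and an existing ES cluster as the benchmark target, the teams directory is not needed here.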

3. Download the tracks configuration

cd /apps/svr/esrally/rally/benchmarks
sudo update-ca-trust
git clone https://github.com/elastic/rally-tracks.git tracks/default
esrally list tracks --offline

The .git* files and directories of the git project must be kept, because esrally checks the git configuration when it runs.

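4. Add an environment file

Save the following as /apps/svr/esrally/rally_env.sh; it is sourced before each benchmark run.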

INSTALL_HOME=/apps/svr/esrally
GIT_HOME=$INSTALL_HOME/git-1.9
PYTHON3_HOME=$INSTALL_HOME/python-3.5.2
export PATH=$PYTHON3_HOME/bin:$GIT_HOME/bin:$GIT_HOME/libexec/git-core:$PATH
which esrally git git-add
#esrally -h
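
If the same layout is deployed under different install roots, the environment file can be generated rather than copied. A minimal sketch, assuming the git-1.9 and python-3.5.2 directory names used in this article; the function name write_rally_env is hypothetical.

```shell
# Write the environment file that is sourced before running esrally.
# $1: install root (assumed to contain git-1.9/ and python-3.5.2/)
# $2: path of the file to write
write_rally_env() {
    cat > "$2" <<EOF
INSTALL_HOME=$1
GIT_HOME=\$INSTALL_HOME/git-1.9
PYTHON3_HOME=\$INSTALL_HOME/python-3.5.2
export PATH=\$PYTHON3_HOME/bin:\$GIT_HOME/bin:\$GIT_HOME/libexec/git-core:\$PATH
EOF
}

# Example usage:
# write_rally_env /apps/svr/esrally /apps/svr/esrally/rally_env.sh
# . /apps/svr/esrally/rally_env.sh && which esrally git git-add
```

Only the install root is expanded when the file is written; the other variables are escaped so they are resolved when the file is sourced.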

5. Package the esrally directory

Check the directory sizes

du -shc /apps/svr/esrally/*

90M     /apps/svr/esrally/git-1.9
234M    /apps/svr/esrally/python-3.5.2
2.9M    /apps/svr/esrally/rally
349M    total

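The git-1.9 directory must be kept, because esrally calls git to check the tracks configuration every time it runs.

Package esrally

Do not enable compression: most of the files are compiled binaries that compress poorly, and enabling compression can even make the archive larger.

cd /apps/svr/ && tar --exclude='*.bz2' -cf esrally.el6.tar esrally

Check the archive size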

du -sh esrally.el6.tar

273M    esrally.el6.tar

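This archive only contains the Rally configuration, the runtime environment and the tracks configuration. The sample data is large and has to be packaged and uploaded separately.

6. Package the sample data

The previous step packaged the Rally configuration, the runtime environment and the tracks configuration. This step packages the data of a single track, using geopoint as the example.

Download the sample data with a browser into the data directory /apps/svr/esrally/rally/benchmarks/data/geopoint/: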

http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geopoint/documents.json.bz2
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geopoint/documents-1k.json.bz2

Check the file sizes

cd /apps/svr/esrally/rally/benchmarks/data/geopoint/; du -sh *

253M    documents-2.json.bz2
482M    documents.json.bz2

Package the data. Compression is not needed because the files are already bz2-compressed.

cd /apps/svr/esrally/rally/benchmarks/data/; tar -cf geopoint.tar geopoint

Check the archive size

du -sh geopoint.tar

735M    geopoint.tar

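7. Upload the archives to all benchmark machines

A machine that only coordinates generally does not need the benchmark data. If it is also a load driver, it needs the data as well.

rz

Extract

tar -xf esrally.el6.tar -C /apps/svr/
tar -xf geopoint.tar -C /apps/svr/esrally/rally/benchmarks/data/
# Create a symlink, because the configuration directory ~/.rally is hard-coded in Rally
ln -svnf /apps/svr/esrally/rally /home/apps/.rally

Check that the extracted directory tree looks right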

tree /apps/svr/esrally/rally/benchmarks/ -L 3

/apps/svr/esrally/rally/benchmarks/
├── data
│   └── geopoint
│       ├── documents-2.json.bz2
│       └── documents.json.bz2
└── tracks
    └── default
        ├── download.sh
        ├── eventdata
        ├── geonames
        ├── geopoint
        ├── geopointshape
        ├── geoshape
        ├── http_logs
        ├── metricbeat
        ├── nested
        ├── noaa
        ├── nyc_taxis
        ├── percolator
        ├── pmc
        ├── README.md
        └── so

At this point both the tracks configuration and the data for geopoint are deployed.
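
The same check can be repeated on every load driver with a small pre-flight function instead of reading the tree output by eye. This is a sketch under the directory layout shown above; the function name check_offline_track is hypothetical, and it only tests that the directories exist, not that the data files are complete.

```shell
# Check that a track looks usable in offline mode:
# the tracks repository must still carry its .git metadata, and the track
# needs both its definition directory and its data directory.
# $1: benchmark root, e.g. /apps/svr/esrally/rally/benchmarks
# $2: track name, e.g. geopoint
check_offline_track() {
    root=$1
    track=$2
    [ -d "$root/tracks/default/.git" ]   || { echo "missing .git in tracks/default" >&2; return 1; }
    [ -d "$root/tracks/default/$track" ] || { echo "missing track definition: $track" >&2; return 1; }
    [ -d "$root/data/$track" ]           || { echo "missing data directory: $track" >&2; return 1; }
}

# Example usage:
# check_offline_track /apps/svr/esrally/rally/benchmarks geopoint && echo "geopoint ready"
```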

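8. Other benchmark scenarios

Depending on the production workload, several of the tracks that Rally provides can be selected because they resemble it:

Tracks	Workload type	Data source	Docs	Compressed Size	Uncompressed Size
eventdata	Log ingestion, similar to Dragonfly	HTTP access logs of the elastic.co website	20,000,000	755.1 MB	15.3 GB
nested	Nested search, similar to FAQ search	StackOverflow Q&A data	11,203,029	663.1 MB	3.4 GB
percolator	AOL queries	rally	2,000,000	102.7 kB	104.9 MB
pmc	Full-text search	Academic papers from PMC	574,199	5.5 GB	21.7 GB
so	Question search	StackOverflow Q&A data	36,062,278	8.9 GB	33.1 GB

Download URLs (only downloading with w3m or a browser works)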

eventdata
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/eventdata/eventdata-1k.json.bz2 (41KB)
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/eventdata/eventdata.json.bz2 (755MB)

nested
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/nested/documents-1k.json.bz2 (70KB)
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/nested/documents.json.bz2 (663MB)

percolator
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/percolator/queries-2-1k.json.bz2 (1KB)
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/percolator/queries-2.json.bz2 (103KB)

pmc
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/pmc/documents-1k.json.bz2 (8.57MB)
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/pmc/documents.json.bz2 (5.5GB)

so
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/so/posts-1k.json.bz2 (181KB)
http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/so/posts.json.bz2 (8.9GB)

III. Benchmark Modes

IP	roles	spec
192.168.0.11	coordinator node, load driver node, standalone ES instance	24U32G
192.168.0.12	load driver node	24U32G
192.168.0.17	ES master1	2U4G
192.168.0.18	ES master2	2U4G
192.168.0.19	ES master3	2U4G
192.168.0.89	ES node1	12U24G
192.168.0.90	ES node2	12U24G
192.168.0.91	ES node3	12U24G

1. Pre-benchmark checks

Set up the paths

source /apps/svr/esrally/rally_env.sh

Check the reporting configuration

vim ~/.rally/rally.ini

[reporting]
datastore.type = elasticsearch
datastore.host = 192.168.0.11
datastore.port = 9200
datastore.secure = False
datastore.user =
datastore.password =
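
Instead of opening the file in an editor on every machine, the relevant values can be read from the [reporting] section with a few lines of awk. A minimal sketch; the function name reporting_value is hypothetical, and it assumes the plain `key = value` layout shown above.

```shell
# Print the value of one key from the [reporting] section of a rally.ini file.
# $1: path to rally.ini
# $2: key name, for example datastore.host
reporting_value() {
    awk -v key="$2" -F '=' '
        /^\[/ { in_section = ($0 ~ /^\[reporting\]/); next }
        in_section {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
                exit
            }
        }
    ' "$1"
}

# Example usage: confirm that the metrics store answers before a run.
# host=$(reporting_value ~/.rally/rally.ini datastore.host)
# port=$(reporting_value ~/.rally/rally.ini datastore.port)
# curl -s "http://$host:$port" >/dev/null || echo "metrics store not reachable" >&2
```

Keys with the same name in other sections are ignored, because matching only happens while the parser is inside [reporting].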

All benchmarks below run in offline mode (--offline)

2. Single-machine benchmark mode

dryrun

esrally --pipeline=benchmark-only --target-hosts=192.168.0.89:9200,192.168.0.90:9200,192.168.0.91:9200 --user-tag="noah:p1_r0_b100" --track=percolator --offline --test-mode

Full run

Just drop --test-mode

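3. Two-machine distributed benchmark mode

Start the daemon on the coordinator machine (192.168.0.11)

esrallyd start --node-ip=192.168.0.11 --coordinator-ip=192.168.0.11

Start the daemon on the other load driver machine (192.168.0.12)

esrallyd start --node-ip=192.168.0.12 --coordinator-ip=192.168.0.11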
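
With more than two machines, the pattern is the same: every node passes its own IP as --node-ip and the coordinator's IP as --coordinator-ip. The sketch below only prints the command for each node so it can be reviewed or pasted; the function name print_daemon_commands is hypothetical and nothing is executed remotely.

```shell
# Print the esrallyd command to run on each Rally node.
# $1: coordinator IP
# remaining arguments: IPs of all nodes that run a daemon
#   (include the coordinator itself if it also drives load)
print_daemon_commands() {
    coordinator=$1
    shift
    for node in "$@"; do
        echo "on $node: esrallyd start --node-ip=$node --coordinator-ip=$coordinator"
    done
}

# Example usage:
# print_daemon_commands 192.168.0.11 192.168.0.11 192.168.0.12
```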

Check the daemon status

esrallyd status

Stop the daemon

esrallyd stop

dryrun (run the command on the coordinator machine)

esrally --pipeline=benchmark-only --load-driver-hosts=192.168.0.11,192.168.0.12 --target-hosts=192.168.0.89:9200,192.168.0.90:9200,192.168.0.91:9200 --user-tag="noahes:1_0_100_0" --track=percolator --offline --test-mode
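
Long host lists are easy to mistype, so it can help to assemble the command line from plain IP lists. The sketch below prints the command instead of running it; the function names join_hosts and rally_command are hypothetical, only flags already used in this article appear, and --user-tag is left out for brevity.

```shell
# Join host IPs into the comma-separated host:port list used by --target-hosts.
# $1: port; remaining arguments: host IPs
join_hosts() {
    port=$1
    shift
    list=""
    for h in "$@"; do
        list="${list:+$list,}$h:$port"
    done
    echo "$list"
}

# Print a benchmark-only command line.
# $1: track name
# $2: target host list (as produced by join_hosts)
# $3: comma-separated load driver IPs, or "" for single-machine mode
# $4: "dry" to append --test-mode, anything else for a full run
rally_command() {
    cmd="esrally --pipeline=benchmark-only --track=$1 --target-hosts=$2 --offline"
    if [ -n "${3:-}" ]; then
        cmd="$cmd --load-driver-hosts=$3"
    fi
    if [ "${4:-}" = "dry" ]; then
        cmd="$cmd --test-mode"
    fi
    echo "$cmd"
}

# Example usage:
# targets=$(join_hosts 9200 192.168.0.89 192.168.0.90 192.168.0.91)
# rally_command percolator "$targets" 192.168.0.11,192.168.0.12 dry
```

Printing rather than executing keeps the dry run and the full run one argument apart and leaves the final command visible in the shell history.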