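The Elastic Stack comprises Beats, APM, Elasticsearch, Elasticsearch Hadoop, Kibana, and Logstash. These products are usually deployed together, and they should all run the same version, which keeps deployment simple.
This article walks through installing Elasticsearch 8.4.3. It records how to build an Elasticsearch 8 cluster, and it uses the installation to show what Elasticsearch does behind the scenes, which helps in understanding how it starts up and how a cluster is formed.
Elasticsearch can be installed from the following package types: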
Package type | Applies to |
---|---|
tar.gz | Linux, macOS |
.zip | Windows |
deb | Debian, Ubuntu, and other Debian-based Linux systems |
rpm | Red Hat, CentOS, SLES, openSUSE, and other RPM-based Linux systems |
docker | Containers |
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.4.3-linux-x86_64.tar.gz
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.4.3-linux-x86_64.tar.gz.sha512
$ shasum -a 512 -c elasticsearch-8.4.3-linux-x86_64.tar.gz.sha512
elasticsearch-8.4.3-linux-x86_64.tar.gz: OK
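Download the package and verify its checksum; an "OK" in the output means the verification passed.
On macOS, download the macOS-specific package instead. The remaining steps are the same and are not repeated here.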
$ curl -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.4.3-darwin-x86_64.tar.gz
$ curl https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.4.3-darwin-x86_64.tar.gz.sha512 | shasum -a 512 -c -
If the shasum command is not found, install the package that provides it:
$ yum install perl-Digest-SHA
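The checksum step can also be wrapped in a small helper that works whether sha512sum or shasum is installed. This is only a sketch: sha512_of and verify_sha512 are hypothetical names introduced here, and the checksum file is assumed to hold the digest as its first field, as the .sha512 files downloaded above do.

```shell
#!/usr/bin/env bash
# Sketch: verify a download against a published SHA-512 checksum file.
# sha512_of and verify_sha512 are hypothetical helpers, not Elasticsearch tools.

# Print the SHA-512 hex digest of a file, using whichever tool is installed.
sha512_of() {
  if command -v sha512sum >/dev/null 2>&1; then
    sha512sum "$1" | awk '{print $1}'
  else
    shasum -a 512 "$1" | awk '{print $1}'
  fi
}

# Return 0 when the digest of file $1 equals the first field of checksum file $2.
verify_sha512() {
  local file="$1"
  local sumfile="$2"
  local expected
  local actual
  expected=$(awk '{print $1; exit}' "$sumfile")
  actual=$(sha512_of "$file")
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage:
#   verify_sha512 elasticsearch-8.4.3-linux-x86_64.tar.gz \
#     elasticsearch-8.4.3-linux-x86_64.tar.gz.sha512 && echo OK
```

Returning a status code rather than printing makes the helper easy to use in a script that should stop before extracting a corrupted archive.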
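Extract the archive and change into the extracted directory, which will serve as $ES_HOME. If ES_HOME is not set, the parent of the bin directory that the command is run from is used as ES_HOME by default.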
$ tar -xzf elasticsearch-8.4.3-linux-x86_64.tar.gz
$ cd elasticsearch-8.4.3/
$ pwd
/home/elastic/elasticsearch-8.4.3
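Add the following line to ~/.bashrc:
export ES_HOME=/home/elastic/elasticsearch-8.4.3
Reload the shell configuration (for example with source ~/.bashrc), then run the following command to verify it: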
$ echo $ES_HOME
/home/elastic/elasticsearch-8.4.3
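Some commercial features automatically create indices in Elasticsearch. By default Elasticsearch allows automatic index creation, so no extra step is needed. If automatic index creation has been disabled, however, action.auto_create_index must be set in elasticsearch.yml so that the commercial features can create the following indices: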
action.auto_create_index: .monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*
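Components such as Logstash and Beats also create indices automatically, under names of their own, so the setting has to cover whichever components are actually in use. If the exact names are unknown, the value can be set to * for convenience, which allows every index to be created automatically.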
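Elasticsearch uses a lot of system resources at runtime, so it places some requirements on the system configuration. Configure the system as follows.
1. Raise the file descriptor and maximum thread limits
Switch to the root user, edit /etc/security/limits.conf, add the following lines, and save the file.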
* soft nofile 65536
* hard nofile 131072
* soft nproc 4096
* hard nproc 4096
2. Raise the max_map_count parameter
Open /etc/sysctl.conf, add the following line, save the file, and run sysctl -p to apply it.
vm.max_map_count=262144
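A quick way to confirm that these settings took effect is to read the limits back. The sketch below is only an illustration: the helper names meets_minimum, report, and check_es_limits are made up here, and the thresholds are simply the soft limits and the max_map_count value configured above.

```shell
#!/usr/bin/env bash
# Sketch: compare the current limits with the values configured above.
# meets_minimum, report, and check_es_limits are hypothetical helper names.

# Return 0 when $1 (actual) is "unlimited" or numerically >= $2 (required).
meets_minimum() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Print one OK/LOW line for a named setting.
report() {
  local name="$1"
  local actual="$2"
  local required="$3"
  if meets_minimum "$actual" "$required"; then
    echo "OK   $name=$actual (required >= $required)"
  else
    echo "LOW  $name=$actual (required >= $required)"
  fi
}

# Read the soft limits of the current shell and the kernel setting.
check_es_limits() {
  report "nofile" "$(ulimit -n)" 65536
  report "nproc" "$(ulimit -u)" 4096
  # /proc/sys/vm/max_map_count exists on Linux only.
  if [ -r /proc/sys/vm/max_map_count ]; then
    report "vm.max_map_count" "$(cat /proc/sys/vm/max_map_count)" 262144
  fi
}
```

Run check_es_limits as the user that starts Elasticsearch, in a fresh login session, because the entries in limits.conf are applied at login.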
Run the following command to start Elasticsearch:
$ ./bin/elasticsearch
warning: ignoring JAVA_HOME=/opt/jdk-17.0.5; using bundled JDK
[2022-12-14T14:30:09,021][INFO ][o.e.n.Node ] [node1] version[8.4.3], pid[2073], build[tar/42f05b9372a9a4a470db3b52817899b99a76ee73/2022-10-04T07:17:24.662462378Z], OS[Linux/3.10.0-1160.76.1.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/18.0.2.1/18.0.2.1+1-1]
[2022-12-14T14:30:09,038][INFO ][o.e.n.Node ] [node1] JVM home [/home/elastic/elasticsearch-8.4.3/jdk], using bundled JDK [true]
......
[2022-12-14T14:30:42,065][INFO ][o.e.c.m.MetadataCreateIndexService] [node1] [.geoip_databases] creating index, cause [auto(bulk api)], templates [], shards [1]/[0]
[2022-12-14T14:30:42,463][INFO ][o.e.c.r.a.AllocationService] [node1] current.health="GREEN" message="Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.geoip_databases][0]]])." previous.health="YELLOW" reason="shards started [[.geoip_databases][0]]"
[2022-12-14T14:30:43,929][INFO ][o.e.i.g.GeoIpDownloader ] [node1] successfully downloaded geoip database [GeoLite2-ASN.mmdb]
[2022-12-14T14:30:44,355][INFO ][o.e.i.g.DatabaseNodeService] [node1] successfully loaded geoip database file [GeoLite2-ASN.mmdb]
[2022-12-14T14:30:45,144][INFO ][o.e.x.s.InitialNodeSecurityAutoConfiguration] [node1] HTTPS has been configured with automatically generated certificates, and the CA's hex-encoded SHA-256 fingerprint is [8a717506cb9c1f7c8190a6c3a2026cead47790aca632e35e988a65fdda78366d]
[2022-12-14T14:30:45,152][INFO ][o.e.x.s.s.SecurityIndexManager] [node1] security index does not exist, creating [.security-7] with alias [.security]
[2022-12-14T14:30:45,196][INFO ][o.e.x.s.e.InternalEnrollmentTokenGenerator] [node1] Will not generate node enrollment token because node is only bound on localhost for transport and cannot connect to nodes from other hosts
[2022-12-14T14:30:45,325][INFO ][o.e.c.m.MetadataCreateIndexService] [node1] [.security-7] creating index, cause [api], templates [], shards [1]/[0]
[2022-12-14T14:30:45,386][INFO ][o.e.x.s.s.SecurityIndexManager] [node1] security index does not exist, creating [.security-7] with alias [.security]
[2022-12-14T14:30:45,521][INFO ][o.e.c.r.a.AllocationService] [node1] current.health="GREEN" message="Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.security-7][0]]])." previous.health="YELLOW" reason="shards started [[.security-7][0]]"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ Elasticsearch security features have been automatically configured!
✅ Authentication is enabled and cluster connections are encrypted.
ℹ️ Password for the elastic user (reset with `bin/elasticsearch-reset-password -u elastic`):
LYePogNEis=ogbMaUzmJ
ℹ️ HTTP CA certificate SHA-256 fingerprint:
8a717506cb9c1f7c8190a6c3a2026cead47790aca632e35e988a65fdda78366d
ℹ️ Configure Kibana to use this cluster:
• Run Kibana and click the configuration link in the terminal when Kibana starts.
• Copy the following enrollment token and paste it into Kibana in your browser (valid for the next 30 minutes):
eyJ2ZXIiOiI4LjQuMyIsImFkciI6WyIxOTIuMTY4LjU2LjExOjkyMDAiXSwiZmdyIjoiOGE3MTc1MDZjYjljMWY3YzgxOTBhNmMzYTIwMjZjZWFkNDc3OTBhY2E2MzJlMzVlOTg4YTY1ZmRkYTc4MzY2ZCIsImtleSI6Im9aNVVENFVCbFJ4a2VsQU9EbXZxOlpvYXBVWjBUVERTdjk5aDNRV25oNUEifQ==
ℹ️ Configure other nodes to join this cluster:
• On this node:
⁃ Create an enrollment token with `bin/elasticsearch-create-enrollment-token -s node`.
⁃ Uncomment the transport.host setting at the end of config/elasticsearch.yml.
⁃ Restart Elasticsearch.
• On other nodes:
⁃ Start Elasticsearch with `bin/elasticsearch --enrollment-token <token>`, using the enrollment token that you generated.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
[2022-12-14T14:30:57,162][INFO ][o.e.i.g.GeoIpDownloader ] [node1] successfully downloaded geoip database [GeoLite2-City.mmdb]
[2022-12-14T14:30:58,768][INFO ][o.e.i.g.GeoIpDownloader ] [node1] successfully downloaded geoip database [GeoLite2-Country.mmdb]
[2022-12-14T14:30:58,815][INFO ][o.e.i.g.DatabaseNodeService] [node1] successfully loaded geoip database file [GeoLite2-City.mmdb]
[2022-12-14T14:30:58,898][INFO ][o.e.i.g.DatabaseNodeService] [node1] successfully loaded geoip database file [GeoLite2-Country.mmdb]
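After a successful start, the output looks like the log above. On first start, Elasticsearch enables and configures its security features by default: authentication is turned on and a password is generated for the built-in elastic user, and, as the output shows, TLS certificates and an enrollment token for Kibana are generated as well. All of this key information is printed in the startup output. By default, Elasticsearch prints its logs to the console (stdout) and to files in the logs directory. It logs some information while starting up, but once initialization is complete it keeps running in the foreground and logs nothing further until something worth recording happens.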
While Elasticsearch is running, it listens on port 9200 for HTTP by default, and clients interact with it through that port. The following command checks that Elasticsearch is up:
curl --cacert $ES_HOME/config/certs/http_ca.crt -u elastic https://localhost:9200
The result looks like this:
[elastic@node1 ~]$ curl --cacert ~/elasticsearch-8.4.3/config/certs/http_ca.crt -u elastic https://localhost:9200
Enter host password for user 'elastic':
{
"name" : "node1",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "-oF_yhG7TtamNIARhI4-Tg",
"version" : {
"number" : "8.4.3",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "42f05b9372a9a4a470db3b52817899b99a76ee73",
"build_date" : "2022-10-04T07:17:24.662462378Z",
"build_snapshot" : false,
"lucene_version" : "9.3.0",
"minimum_wire_compatibility_version" : "7.17.0",
"minimum_index_compatibility_version" : "7.0.0"
},
"tagline" : "You Know, for Search"
}
Elasticsearch has started normally. Note that because security is enabled, it must be accessed over HTTPS; the --cacert option specifies the CA certificate to trust.
To stop Elasticsearch, press Ctrl-C.
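Once Elasticsearch has started, the security auto-configuration binds the HTTP layer to 0.0.0.0, so it is reachable on every address of the machine, internal or external. The transport layer, however, is bound only to localhost by default, which guarantees that a first start always produces a single-node cluster.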
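To add a new node (node2), change transport.host on the node that already belongs to the cluster (referred to as node1) so that it binds to a specific address or to 0.0.0.0, then restart that node. A token (enrollment token) could expire while this is being done, which is why Elasticsearch does not generate a node enrollment token automatically during startup.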
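Then follow these steps:
1. On node1, generate a token from the Elasticsearch directory:
bin/elasticsearch-create-enrollment-token -s node
2. On the new node node2, start Elasticsearch with the token generated in step 1:
bin/elasticsearch --enrollment-token <token>
3. The new node generates its certificates under config/certs and joins the cluster automatically.
To add more nodes, repeat steps 1 to 3 for each of them.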
Let's try adding a new node to the existing single-node cluster. First, generate a token with the elasticsearch-create-enrollment-token tool:
[elastic@node1 elasticsearch-8.4.3]$ bin/elasticsearch-create-enrollment-token -s node
warning: ignoring JAVA_HOME=/opt/jdk-17.0.5; using bundled JDK
eyJ2ZXIiOiI4LjQuMyIsImFkciI6WyIxOTIuMTY4LjU2LjExOjkyMDAiXSwiZmdyIjoiOGE3MTc1MDZjYjljMWY3YzgxOTBhNmMzYTIwMjZjZWFkNDc3OTBhY2E2MzJlMzVlOTg4YTY1ZmRkYTc4MzY2ZCIsImtleSI6IjJQdVRESVlCWU1kQnBSaTdnODVQOlRfa1RDR3BlVDJPdTZvMzBzanBPRVEifQ==
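It can help to look inside a token before using it. The tokens in this article are base64-encoded JSON, and decoding one shows, among other fields, the address that the joining node will try to contact. The sketch below is an observation about the tokens shown here rather than a documented format guarantee, and decode_enrollment_token is a hypothetical helper name. The decoded text also contains a key, so treat real tokens as secrets.

```shell
#!/usr/bin/env bash
# Sketch: inspect an enrollment token by base64-decoding it.
# decode_enrollment_token is a hypothetical helper name.
decode_enrollment_token() {
  printf '%s' "$1" | base64 --decode
}

# The token generated on node1 above.
token='eyJ2ZXIiOiI4LjQuMyIsImFkciI6WyIxOTIuMTY4LjU2LjExOjkyMDAiXSwiZmdyIjoiOGE3MTc1MDZjYjljMWY3YzgxOTBhNmMzYTIwMjZjZWFkNDc3OTBhY2E2MzJlMzVlOTg4YTY1ZmRkYTc4MzY2ZCIsImtleSI6IjJQdVRESVlCWU1kQnBSaTdnODVQOlRfa1RDR3BlVDJPdTZvMzBzanBPRVEifQ=='

# Print the decoded JSON followed by a newline.
decode_enrollment_token "$token"
echo
```

For this token the decoded JSON includes "adr":["192.168.56.11:9200"], so the new node must be able to reach node1 at that address over HTTPS for enrollment to work.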
Start the new node with the generated token:
[elastic@node2 elasticsearch-8.4.3]$ bin/elasticsearch --enrollment-token eyJ2ZXIiOiI4LjQuMyIsImFkciI6WyIxOTIuMTY4LjU2LjExOjkyMDAiXSwiZmdyIjoiOGE3MTc1MDZjYjljMWY3YzgxOTBhNmMzYTIwMjZjZWFkNDc3OTBhY2E2MzJlMzVlOTg4YTY1ZmRkYTc4MzY2ZCIsImtleSI6IjJQdVRESVlCWU1kQnBSaTdnODVQOlRfa1RDR3BlVDJPdTZvMzBzanBPRVEifQ==
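The following message then repeats continuously. It shows that node2 has discovered only itself, at 127.0.0.1:9300, and cannot find a master, because node1's transport layer is bound to localhost by default and cannot be reached from other hosts.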
[2023-02-01T18:46:59,135][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node2] master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{node2}{kuM1xg2lR4myQH6N-AsxGA}{2cS3Wrm1RDKLxMyFgo_NYQ}{node2}{127.0.0.1}{127.0.0.1:9300}{cdfhilmrstw}]; discovery will continue using [] from hosts providers and [{node2}{kuM1xg2lR4myQH6N-AsxGA}{2cS3Wrm1RDKLxMyFgo_NYQ}{node2}{127.0.0.1}{127.0.0.1:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
For convenience, edit the Elasticsearch configuration on node1 to add transport.host: 0.0.0.0 and network.host: 0.0.0.0, then restart node1.
Joining node2 to the cluster again produces a different error, and startup fails:
ERROR: Skipping security auto configuration because it appears that the node is not starting up for the first time. The node might already be part of a cluster and this auto setup utility is designed to configure Security for new clusters only.
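The cause is that on the previous attempt node2 failed to join any node and was initialized as an independent new cluster of its own, so it can no longer join another cluster. Starting from scratch, extract a fresh copy of the Elasticsearch package on node2 and join the cluster again using the same method. This time the following error appears: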
ERROR: Aborting enrolling to cluster. Could not communicate with the node on any of the addresses from the enrollment token. All of [192.168.56.11:9200] were attempted.
This is because node1 changed from listening on localhost to listening on 0.0.0.0, so a new token has to be generated on node1 and used to join the cluster:
[elastic@node2 elasticsearch-8.4.3]$ bin/elasticsearch --enrollment-token eyJ2ZXIiOiI4LjQuMyIsImFkciI6WyIxOTIuMTY4LjU2LjExOjkyMDAiXSwiZmdyIjoiOGE3MTc1MDZjYjljMWY3YzgxOTBhNmMzYTIwMjZjZWFkNDc3OTBhY2E2MzJlMzVlOTg4YTY1ZmRkYTc4MzY2ZCIsImtleSI6IlBHUGNESVlCdkt6dldhTUF2SVF0Ok9kRzgzQS1YUVUtOGkwdVRrSnZpZWcifQ==
[2023-02-01T20:04:29,861][INFO ][o.e.n.Node ] [node2] version[8.4.3], pid[1675], build[tar/42f05b9372a9a4a470db3b52817899b99a76ee73/2022-10-
[2023-02-01T20:04:58,053][INFO ][o.e.c.s.ClusterApplierService] [node2] master node changed {previous [], current [{node1}{Ugh2e7ubSb2fw9Wj8U918A}{l_xB6z5QQee23U2jX8Ctiw}{node1}{192.168.56.11}{192.168.56.11:9300}{cdfhilmrstw}]}, added {{node1}{Ugh2e7ubSb2fw9Wj8U918A}{l_xB6z5QQee23U2jX8Ctiw}{node1}{192.168.56.11}{192.168.56.11:9300}{cdfhilmrstw}}, term: 7, version: 93, reason: ApplyCommitRequest{term=7, version=93, sourceNode={node1}{Ugh2e7ubSb2fw9Wj8U918A}{l_xB6z5QQee23U2jX8Ctiw}{node1}{192.168.56.11}{192.168.56.11:9300}{cdfhilmrstw}{ml.machine_memory=1907675136, ml.max_jvm_size=956301312, xpack.installed=true, ml.allocated_processors=1}}
......
......
[2023-02-01T20:04:58,299][INFO ][o.e.n.Node ] [node2] started {node2}{TplD7Q2DT_K1tZaxhL6PjA}{xgNgZioKTFWFu8jBKOmZJQ}{node2}{192.168.56.12}{192.168.56.12:9300}{cdfhilmrstw}{xpack.installed=true, ml.allocated_processors=1, ml.max_jvm_size=956301312, ml.machine_memory=1907675136}
node2 has now started and joined the cluster, and the log shows information about both node1 and node2.
The _cat API lists the nodes in the cluster:
curl --cacert $ES_HOME/config/certs/http_ca.crt -u elastic "https://192.168.56.11:9200/_cat/nodes?v"
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.56.11 32 91 0 0.01 0.07 0.12 cdfhilmrstw * node1
192.168.56.12 40 96 0 0.00 0.08 0.10 cdfhilmrstw - node2
The new node has joined the cluster, which now has 2 nodes in total.
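When the same check has to run from a script, the table can be summarized instead of read by eye. The following is a minimal sketch that assumes the column layout shown above, with master as the second-to-last column and name as the last; count_nodes and master_name are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: summarize `_cat/nodes?v` output read from standard input.
# count_nodes and master_name are hypothetical helper names.

# Count the data rows, skipping the header line and any blank lines.
count_nodes() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Print the name (last column) of the node whose master column is "*".
# Assumes master is the second-to-last column, as in the output above.
master_name() {
  awk 'NR > 1 && $(NF - 1) == "*" { print $NF }'
}

# Usage against a live cluster (curl prompts for the elastic password):
#   curl -s --cacert "$ES_HOME/config/certs/http_ca.crt" -u elastic \
#     "https://192.168.56.11:9200/_cat/nodes?v" | count_nodes
```

Both helpers depend on the default columns; if the request selects columns with the h parameter, the positions need adjusting.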
To run Elasticsearch as a daemon, start it with the following command:
./bin/elasticsearch -d -p pid
Here pid is a file that stores the process ID of the started Elasticsearch process.
To stop the Elasticsearch process, use the following command:
pkill -F pid
Elasticsearch can be configured in two ways: through elasticsearch.yml or on the command line. Every setting that can be placed in elasticsearch.yml can also be passed on the command line.
./bin/elasticsearch -d -Ecluster.name=my_cluster -Enode.name=node_1
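The official recommendation is to keep cluster-wide settings such as cluster.name in elasticsearch.yml and to pass node-specific settings such as node.name on the command line. To make later operations easier, though, my own experience is to put as much as possible in the configuration file.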
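Because TLS is enabled, every client that connects to Elasticsearch must trust the HTTPS certificate. Fleet Server and Fleet-managed Elastic Agent are configured automatically to trust the CA certificate; other clients can establish trust using either the fingerprint of the CA certificate or the CA certificate itself.
The certificates are stored in $ES_HOME/config/certs, where the certificate http_ca.crt can be found.
If the client supports CA fingerprints, the fingerprint can be taken from the log printed when Elasticsearch started. If that output was missed, the fingerprint can also be computed from the certificate with the following command: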
openssl x509 -fingerprint -sha256 -in config/certs/http_ca.crt
The result is:
SHA256 Fingerprint=8A:71:75:06:CB:9C:1F:7C:81:90:A6:C3:A2:02:6C:EA:D4:77:90:AC:A6:32:E3:5E:98:8A:65:FD:DA:78:36:6D
Below is the CA fingerprint printed in the startup log. Apart from the colons and letter case, the two values are identical.
ℹ️ HTTP CA certificate SHA-256 fingerprint:
8a717506cb9c1f7c8190a6c3a2026cead47790aca632e35e988a65fdda78366d
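Because the two tools format the fingerprint differently, comparing them by eye is error-prone. A minimal sketch of a normalized comparison follows; normalize_fingerprint is a hypothetical helper name, and the two values are the ones shown above.

```shell
#!/usr/bin/env bash
# Sketch: compare the openssl fingerprint with the one from the startup log.
# normalize_fingerprint is a hypothetical helper name.

# Strip any "...Fingerprint=" prefix, drop the colons, and lowercase the hex digits.
normalize_fingerprint() {
  printf '%s\n' "$1" | sed 's/^.*=//' | tr -d ':' | tr 'A-F' 'a-f'
}

from_openssl='SHA256 Fingerprint=8A:71:75:06:CB:9C:1F:7C:81:90:A6:C3:A2:02:6C:EA:D4:77:90:AC:A6:32:E3:5E:98:8A:65:FD:DA:78:36:6D'
from_log='8a717506cb9c1f7c8190a6c3a2026cead47790aca632e35e988a65fdda78366d'

if [ "$(normalize_fingerprint "$from_openssl")" = "$(normalize_fingerprint "$from_log")" ]; then
  echo "fingerprints match"
else
  echo "fingerprints differ"
fi
```

Normalizing both sides means the helper accepts either format, so the same comparison works for a fingerprint copied from a client's configuration.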
Note that for a CA fingerprint obtained this way, the certificate's issuer must be Elasticsearch security auto-configuration HTTP CA.
issuer= /CN=Elasticsearch security auto-configuration HTTP CA
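That essentially completes the setup of an Elasticsearch cluster. Elasticsearch is of course far more capable than this, with many more advanced features and settings, which later articles will cover in detail.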