Middleware ❀ Elasticsearch: Features and Usage in Detail

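Contents

  • 1 Service Overview
    • 1.1 Node
    • 1.2 Index
    • 1.3 Document
    • 1.4 Shard
      • 1.4.1 Primary Shard
      • 1.4.2 Replica Shard
      • 1.4.3 Shard Sizing Pitfalls
    • 1.5 Master Election
      • 1.5.1 Building the activeMasters List
      • 1.5.2 Building the masterCandidates List
      • 1.5.3 Electing the Master from the activeMasters List
      • 1.5.4 Electing the Master from the masterCandidates List
      • 1.5.5 When the Local Node Is the Master
      • 1.5.6 When the Local Node Is Not the Master
    • 1.6 Cluster Health Status
    • 1.7 Data Types
  • 2 Installation
    • 2.1 Disabling the Firewall and SELinux
    • 2.2 Preparing the JDK
    • 2.3 Setting Kernel Parameters
    • 2.3 Single-Node Deployment
      • 2.3.1 Installing from the yum Repository
      • 2.3.2 Installing from the tar Archive
    • 2.4 Cluster Deployment
      • 2.4.1 Pseudo-Cluster Installation
      • 2.4.2 Cluster Start/Stop Script
  • 3 Basic Operations
    • 3.1 Status Checks
      • 3.1.1 Cluster
      • 3.1.2 Nodes
      • 3.1.3 Indices
      • 3.1.4 Disk Allocation
      • 3.1.5 Cluster and Other Information
    • 3.2 Index Operations
      • 3.2.1 Create
      • 3.2.2 Query
      • 3.2.3 Delete
      • 3.2.4 Close and Open
    • 3.3 Document Operations
      • 3.3.1 Create
      • 3.3.2 Query
      • 3.3.3 Update
      • 3.3.4 Delete
      • 3.3.5 Special Queries
      • 3.3.6 Changing Data Types
      • 3.3.7 Aggregation Queries
    • 3.3 Scaling Shards
      • 3.3.1 Primary Shards - pri
      • 3.3.2 Replicas - rep
    • 3.4 Scaling the Cluster
      • 3.4.1 Scaling Out
      • 3.4.2 Scaling In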

1 Service Overview

Elasticsearch (ES for short) is an open-source distributed search engine written in Java.

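1.1 Node

An ES instance is essentially a single Java process, so several instances can run on one virtual machine to simulate a cluster. Each instance can take on different duties and is therefore called a node. The main node types are:

  • Master-eligible Node: every node is master-eligible by default after startup and can take part in master election to become the Master node. Setting node.master: false removes this duty, after which the node no longer takes part in elections (node.master applies to 7.x and earlier; 8.x expresses the same thing through the node.roles setting);
  • Master Node: although every node in an ES cluster keeps a copy of the cluster state, only the Master node may modify it. The cluster state includes the nodes in the cluster, all indices with their Mapping and Setting information, and the shard routing table. When the cluster starts, the first master-eligible node to come up elects itself master;
  • Data Node: stores shard data and is key to scaling data capacity;
  • Coordinating Node: accepts client requests, forwards them to the appropriate nodes, and merges the responses into the final result. Every node acts as a coordinating node by default;
  • Machine Learning Node: runs machine learning jobs, used for anomaly detection;
  • Ingest Node: pre-processes data. It supports pipelines, so documents can be filtered, transformed, and so on before they are indexed.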

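1.2 Index

In ES an index is a collection of documents of the same kind, that is, a container for documents. An index usually consists of two parts: Mapping and Setting.

  • Mapping: defines the data structure of the documents the index contains;
  • Setting: defines how the index's data is distributed (for example, the number of shards and replicas).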

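1.3 Document

ES is document-oriented: the document is the smallest searchable unit of data. Documents are serialized to JSON for storage, and each has a unique ID. The user may specify the ID at creation time; otherwise ES generates one automatically.

A document carries the following metadata:

  • _index: name of the index the document belongs to;
  • _type: type the document belongs to (types were deprecated in 7.x and removed in 8.x, which is why the field is absent from the v8 sample below);
  • _id: unique ID of the document;
  • _version: version of the document;
  • _seq_no: a strictly increasing sequence number per shard; a document written later has a larger _seq_no than one written earlier;
  • _primary_term: incremented by 1 each time the primary shard is reassigned; mainly used during recovery to resolve conflicts when several operations carry the same _seq_no;
  • _score: relevance score, computed during a search from how well the result matches the search terms;
  • _source: the document's JSON data.

Sample response for an indexed document (taken from v8):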

{
  "_index": "test_index",
  "_id": "kPJVyYgBfIG-po552LGQ",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
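The _seq_no and _primary_term pair is also what a client passes back to make a write conditional, through the if_seq_no and if_primary_term request parameters. A minimal sketch of building such a request path from the values in the response above; the helper name conditional_index_path is hypothetical, and the sketch only builds the path, it does not contact a cluster.

```python
from urllib.parse import urlencode


def conditional_index_path(index: str, doc_id: str,
                           seq_no: int, primary_term: int) -> str:
    """Build the path for a write that succeeds only if the document
    still has the given _seq_no and _primary_term."""
    query = urlencode({"if_seq_no": seq_no, "if_primary_term": primary_term})
    return f"/{index}/_doc/{doc_id}?{query}"


# Values copied from the sample response above.
path = conditional_index_path("test_index", "kPJVyYgBfIG-po552LGQ", 0, 1)
print(path)
```

If another writer has changed the document in the meantime, its _seq_no no longer matches and the write is rejected with a conflict rather than silently overwriting the newer version.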

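1.4 Shard

Because the storage of a single machine is limited, ES uses shards to scale data horizontally. The design defines two shard types: the primary shard and the replica shard.

1.4.1 Primary Shard

Primary shards solve horizontal scaling. ES can split the data of one index into several shards stored on different servers, so an index is no longer bounded by the storage capacity of one machine. Search and analysis work is spread across the servers as well, which improves throughput and performance.
Each primary shard is a Lucene instance and the smallest unit of work: it holds part of the data and can index documents and serve requests on its own. The number of primary shards must be set when the index is created and cannot simply be changed afterwards. Before ES 7.0 an index had 5 primary shards by default; from ES 7.0 on the default is 1.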
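Why the primary shard count is fixed follows from how a document is routed to a shard. A minimal sketch, assuming the simplified routing rule shard = hash(_routing) % number_of_primary_shards; Elasticsearch uses a Murmur3 hash internally, and zlib.crc32 below is only a stand-in so the example runs on the standard library. The function name route_to_shard is hypothetical.

```python
import zlib


def route_to_shard(routing: str, num_primary_shards: int) -> int:
    """Pick a primary shard for a routing value (the document _id by default).

    Stand-in hash: zlib.crc32. Elasticsearch itself uses Murmur3.
    """
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


doc_ids = ["1", "2", "kPJVyYgBfIG-po552LGQ"]
with_three = {doc_id: route_to_shard(doc_id, 3) for doc_id in doc_ids}
with_five = {doc_id: route_to_shard(doc_id, 5) for doc_id in doc_ids}
print("3 primaries:", with_three)
print("5 primaries:", with_five)
```

Since the shard number depends on the modulus, changing the number of primaries would send lookups for existing documents to different shards than the ones they were written to.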

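1.4.2 Replica Shard

Replica shards provide high availability. The shards of an index are stored on different machines, so when a server goes down, the shard data it held would be lost. ES therefore keeps copies of shards. A shard can have several replicas, and the number of replicas can be adjusted dynamically. A replica takes over when its primary fails, which keeps the data safe, and a sensible number of replicas also improves search throughput and performance.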

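1.4.3 Shard Sizing Pitfalls

Too few primary shards:

  • The index cannot be scaled out later by adding nodes
  • Individual shards grow very large, so relocating data is slow

Too many primary shards:

  • Search relevance can suffer, because scores are computed from per-shard statistics
  • Each node holds too many shards, wasting resources and hurting performance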

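1.5 Master Election

An election is triggered in any of the following situations:

  • The cluster is deployed and initialized for the first time;
  • The cluster's Master node crashes;
  • Any node finds that the current Master is no longer acknowledged by n/2+1 nodes.

In each case the election procedure runs and a new Master node is chosen. The procedure described in this section is Zen Discovery, used by ES 6.x and earlier (hence the discovery.zen.* settings); 7.0 and later replaced it with a new cluster coordination subsystem.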
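The n/2+1 threshold is a simple majority of the master-eligible nodes. A small sketch of the arithmetic, using integer division; the function names are hypothetical.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest majority of n master-eligible nodes: n // 2 + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def master_still_valid(acknowledging_nodes: int, master_eligible_nodes: int) -> bool:
    """True if enough nodes still acknowledge the current master."""
    return acknowledging_nodes >= quorum(master_eligible_nodes)


for n in (1, 2, 3, 5):
    print(f"{n} master-eligible node(s) -> quorum of {quorum(n)}")
```

With three master-eligible nodes the quorum is two, so a partition that isolates a single node leaves that node unable to form or keep a cluster by itself.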

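1.5.1 Building the activeMasters List

The ES Master is elected from either the activeMasters list or the masterCandidates list.

Each node first pings all members of the cluster and waits up to discovery.zen.ping_timeout for replies. It then filters the responses to build the activeMasters list, that is, the nodes that other nodes currently regard as the cluster's Master. The local node is excluded when this list is built, in order to prevent split brain.

To see why, suppose the node with the smallest ID, P0, believes it is the Master and is then separated from the other nodes by a network partition. If ES allowed a node to place itself in activeMasters, P0 would always elect itself because its ID is the smallest, and the cluster would end up with two masters (split brain).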

1.5.2 Building the masterCandidates List

The masterCandidates list contains the nodes that are eligible to become Master. A node configured as follows in elasticsearch.yml is not eligible and is therefore left out of the list.

# in elasticsearch.yml
node.master: false

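1.5.3 Electing the Master from the activeMasters List

The activeMasters list holds the nodes that other nodes currently regard as Master. If it is not empty, Elasticsearch elects from it first: the nodes in the list are compared by priority and the one with the highest priority becomes Master.

1.5.4 Electing the Master from the masterCandidates List

If the activeMasters list is empty, the election falls back to masterCandidates. This election also compares priorities, but differently from the activeMasters case. It first checks whether the list contains at least discovery.zen.minimum_master_nodes members. If it does, the candidates are compared by the version of the cluster state they hold, and then by ID. The point of this ordering is that the node with the most recent cluster state becomes Master.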
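The two-stage choice described in 1.5.3 and 1.5.4 can be sketched as follows. This is an illustration under stated assumptions rather than the actual Elasticsearch code: the Candidate class and elect_master function are hypothetical, and the tie-break on the smallest node ID is inferred from the P0 example in 1.5.1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    node_id: str
    cluster_state_version: int


def elect_master(active_masters: List[str],
                 candidates: List[Candidate],
                 minimum_master_nodes: int) -> Optional[str]:
    """Return the node id to vote for, or None if no election is possible."""
    # Stage 1: prefer a master that other nodes already recognise.
    if active_masters:
        return min(active_masters)  # assumption: smallest id wins
    # Stage 2: fall back to the candidates, but only with enough of them.
    if len(candidates) < minimum_master_nodes:
        return None
    # Newest cluster state first, then smallest id as the tie-break.
    best = min(candidates, key=lambda c: (-c.cluster_state_version, c.node_id))
    return best.node_id


cands = [Candidate("node-b", 7), Candidate("node-a", 5), Candidate("node-c", 7)]
print(elect_master([], cands, minimum_master_nodes=2))
print(elect_master(["node-c", "node-a"], cands, minimum_master_nodes=2))
```

In the first call node-b wins: it shares the newest cluster state with node-c and has the smaller ID, while node-a loses despite the smallest ID because its state is older.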

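1.5.5 When the Local Node Is the Master

The steps above produce a provisional Master, which then waits for votes from the other nodes. If discovery.zen.minimum_master_nodes - 1 nodes vote for it, the election succeeds. The provisional Master waits for at most discovery.zen.master_election.wait_for_joins_timeout; if the timeout expires first, the election fails.

1.5.6 When the Local Node Is Not the Master

If the current node determines that it cannot be the Master in the current cluster state, it first refuses join requests from other nodes and then votes for the provisional Master. It also watches the cluster state published by the Master (the MasterFaultDetection mechanism): if the Master named in the published state differs from the one this node voted for, the node starts a new election.

Non-master nodes also run fault detection against the Master. If a node finds the Master unreachable, it tries to join a new Master; if it finds that many nodes in the cluster cannot reach the Master, a new election is started.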

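1.6 Cluster Health Status

  • green: the healthiest state. All primary and replica shards are allocated and the cluster is 100% available.
  • yellow: all primary shards are allocated, but at least one replica is missing (or no replicas are configured). No data is lost, so search results are still complete, but high availability is weakened: if more shards disappear, data will be lost. Treat yellow as a warning that should be investigated promptly.
  • red: at least one primary shard, together with all of its replicas, is missing. Data is missing: searches return only partial results, and writes routed to the missing shard fail with an exception. Fix this as soon as possible.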
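The three states follow mechanically from which shards are unassigned. A minimal sketch of that rule; the function cluster_status and its input format are hypothetical, and a real cluster reports the status through the health API rather than computing it client-side.

```python
from typing import Iterable, Tuple


def cluster_status(shards: Iterable[Tuple[bool, bool]]) -> str:
    """Derive green/yellow/red from (is_primary, is_assigned) pairs."""
    status = "green"
    for is_primary, is_assigned in shards:
        if is_assigned:
            continue
        if is_primary:
            return "red"      # a missing primary means missing data
        status = "yellow"     # a missing replica only weakens availability
    return status


# One primary with two replicas, one of which is not allocated.
print(cluster_status([(True, True), (False, True), (False, False)]))
```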

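1.7 Data Types

  • String types: keyword, text (the difference is analysis: text is tokenized, by the standard analyzer by default, while keyword is not)
  • Numeric types
    • Integer: long, integer, short, byte
    • Floating point: double, float, half_float, scaled_float
  • Date type: date (ES supports custom formats: yyyy-MM-dd HH:mm:ss, yyyy-MM-dd, epoch_millis for milliseconds since the epoch)
  • Boolean type: boolean
  • Binary type: binary
  • Range types: integer_range, long_range, float_range, double_range, date_range
  • Complex types
    • Array: there is no dedicated array field type; all values in an array must have the same data type
    • Object: object
    • Nested: nested
  • Specialized types
    • Geo types: geo_point, geo_shape
    • IP type: ip
    • Auto-completion type: completion
    • Token count type: token_count
    • Hash type: murmur3
    • Percolator type: percolator
    • Parent/child relations: join
    • Alias type: alias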
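To make the list concrete, here is a sketch of an index creation body that uses several of these types, written as a Python dict and serialized to JSON. All field names (title, status, created_at and so on) are invented for illustration; only the type names and the date format syntax come from Elasticsearch.

```python
import json

# Hypothetical fields; the "type" values are the data types listed above.
create_index_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},          # analyzed, full-text search
            "status": {"type": "keyword"},      # not analyzed, exact match
            "views": {"type": "long"},
            "created_at": {
                "type": "date",
                # several accepted formats, separated by ||
                "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis",
            },
            "published": {"type": "boolean"},
            "client_ip": {"type": "ip"},
            "age_range": {"type": "integer_range"},
            "comments": {"type": "nested"},
        }
    }
}

print(json.dumps(create_index_body, indent=2))
```

Such a body would be sent with a PUT request to the index name when the index is created.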

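2 Installation

Test environments usually run ES either as a single node or as a multi-port cluster on one machine; only production environments use a distributed, multi-machine deployment.

2.1 Disabling the Firewall and SELinux

By default ES serves HTTP on port 9200 and upward, and a cluster uses ports in the range 9300-9400 for communication between nodes.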

[root@master cluster]# systemctl stop firewalld && systemctl disable firewalld && setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

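2.2 Preparing the JDK

ES needs a JDK to run. Many Linux distributions ship one; if its version is too old, follow the "JDK installation or upgrade" article to update it.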

[root@master cluster]# java -version
java version "1.8.0_351"
Java(TM) SE Runtime Environment (build 1.8.0_351-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.351-b10, mixed mode)

2.3 Setting Kernel Parameters

# Set resource limits per user or group
[root@master cluster]# vim /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536

# Configure the kernel parameter
[root@master cluster]# vim /etc/sysctl.conf
vm.max_map_count=655360
[root@master cluster]# sysctl -p

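2.3 Single-Node Deployment

ES can be installed in several ways; this article covers only two, the yum repository and the tar archive.

2.3.1 Installing from the yum Repository

# Import the rpm repository key and configure the yum repository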
[root@localhost test]#  rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
[root@localhost test]#  vim /etc/yum.repos.d/elasticsearch.repo
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md

# Install ES from the repository configured above
[root@localhost test]#  yum install --enablerepo=elasticsearch elasticsearch -y

# Configuration files installed by the package
[root@localhost test]#  rpm -qc elasticsearch
/etc/elasticsearch/elasticsearch-plugins.example.yml
/etc/elasticsearch/elasticsearch.yml  # ES configuration file
/etc/elasticsearch/jvm.options  # JVM (Java virtual machine) options
/etc/elasticsearch/log4j2.properties  # logging configuration
/etc/elasticsearch/role_mapping.yml
/etc/elasticsearch/roles.yml
/etc/elasticsearch/users
/etc/elasticsearch/users_roles
/etc/sysconfig/elasticsearch
/usr/lib/sysctl.d/elasticsearch.conf
/usr/lib/systemd/system/elasticsearch.service

# ES configuration file; individual settings can also be overridden with -E when starting the service
[root@localhost test]# vim /etc/elasticsearch/elasticsearch.yml
cluster.name: my-test-cluster  # cluster name; a node can join only one cluster
node.name: node-1  # node name, identifying this ES instance
path.data: /var/lib/elasticsearch  # data directory
path.logs: /var/log/elasticsearch  # log directory
network.host: 0.0.0.0  # bind address; 0.0.0.0 listens on all interfaces
#transport.port: 9300  # cluster transport port, bound from the range 9300-9400
http.port: 9200  # HTTP port; the default is the first free port from 9200 upward
#discovery.seed_hosts: ["host-1"]  # seed nodes for discovery, given as host:port or hostname
#cluster.initial_master_nodes: ["node-1"]  # initial master-eligible nodes, given by node.name

# Start the service
[root@localhost test]# systemctl start elasticsearch

# Check the service
[root@localhost test]# curl http://127.0.0.1:9200
{
  "name" : "node-1",
  "cluster_name" : "my-application",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "8.8.1",
    "build_flavor" : "default",
    "build_type" : "rpm",  # installed from the yum repository
    "build_hash" : "f8edfccba429b6477927a7c1ce1bc6729521305e",
    "build_date" : "2023-06-05T21:32:25.188464208Z",
    "build_snapshot" : false,
    "lucene_version" : "9.6.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

2.3.2 Installing from the tar Archive

# Extract the archive into the target directory
[root@master test]# tar -xvf elasticsearch-8.5.0-linux-x86_64.tar.gz -C /cluster/
[root@master test]# cd /cluster/
[root@master cluster]# ln -s elasticsearch-8.5.0/ es
[root@master cluster]# cd es/
[root@master es]# ll
total 2208
drwxr-xr-x.  2 root root    4096 Oct 25  2022 bin  # executables
drwxr-xr-x.  3 root root     210 Jun 17 13:38 config  # configuration files
drwxr-xr-x.  8 root root      96 Oct 25  2022 jdk  # bundled JDK
drwxr-xr-x.  5 root root    4096 Oct 25  2022 lib
-rw-r--r--.  1 root root    3860 Oct 25  2022 LICENSE.txt
drwxr-xr-x.  2 root root       6 Oct 25  2022 logs  # log files
drwxr-xr-x. 67 root root    4096 Oct 25  2022 modules
-rw-r--r--.  1 root root 2235851 Oct 25  2022 NOTICE.txt
drwxr-xr-x.  2 root root       6 Oct 25  2022 plugins
-rw-r--r--.  1 root root    8107 Oct 25  2022 README.asciidoc
[root@master es]# ll config/
total 40
-rw-rw----. 1 root root  1042 Oct 25  2022 elasticsearch-plugins.example.yml
-rw-rw----. 1 root root  2882 Jun 17 13:38 elasticsearch.yml  # service configuration file
-rw-rw----. 1 root root  2563 Oct 25  2022 jvm.options  # JVM options
drwxr-x---. 2 root root     6 Oct 25  2022 jvm.options.d
-rw-rw----. 1 root root 17417 Oct 25  2022 log4j2.properties
-rw-rw----. 1 root root   473 Oct 25  2022 role_mapping.yml
-rw-rw----. 1 root root   197 Oct 25  2022 roles.yml
-rw-rw----. 1 root root     0 Oct 25  2022 users
-rw-rw----. 1 root root     0 Oct 25  2022 users_roles

# Service configuration file
[root@localhost test]# vim /cluster/es/config/elasticsearch.yml
cluster.name: my-test-cluster
node.name: node-1
path.data: /cluster/es/data/
path.logs: /cluster/es/logs/
network.host: 0.0.0.0
#transport.port: 9300
http.port: 9200
#discovery.seed_hosts: ["host-1"]
#cluster.initial_master_nodes: ["node-1"]

# Create the data directory
[root@master es]# mkdir data

# Note: a tar installation refuses to start as the root user (a safety measure)
[root@master es]# ./bin/elasticsearch -d
warning: ignoring JAVA_HOME=/middleware/jdk; using bundled JDK
[2023-06-17T14:05:23,603][ERROR][o.e.b.Elasticsearch      ] [node-1] fatal exception while booting Elasticsearchjava.lang.RuntimeException: can not run elasticsearch as root

# Start the service as a dedicated non-root user
[root@master cluster]# useradd redhat
[root@master cluster]# chown -R redhat:redhat es
[root@master cluster]# su redhat
[redhat@master cluster]$ ./es/bin/elasticsearch -d  # -d runs in the background; use -h to list the options
~
bootstrap check failure [1] of [1]: Transport SSL must be enabled if security is enabled. Please set [xpack.security.transport.ssl.enabled] to [true] or disable security by setting [xpack.security.enabled] to [false]

# Change the setting as the message asks, then start the service again
[redhat@master cluster]$ sed -i '$a xpack.security.enabled:\ false' es/config/elasticsearch.yml
[redhat@master cluster]$ ./es/bin/elasticsearch -d
~
[2023-06-17T14:52:04,623][INFO ][o.e.r.s.FileSettingsService] [node-1] file settings service up and running [tid=57]  # service started successfully

# Check the service
[redhat@master es]$ curl http://127.0.0.1:9200
{
  "name" : "node-1",
  "cluster_name" : "test-ES",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "8.5.0",
    "build_flavor" : "default",
    "build_type" : "tar",  # installed from the tar archive
    "build_hash" : "c94b4700cda13820dad5aa74fae6db185ca5c304",
    "build_date" : "2022-10-24T16:54:16.433628434Z",
    "build_snapshot" : false,
    "lucene_version" : "9.4.1",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

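2.4 Cluster Deployment

This section simulates an ES cluster on a single virtual machine by running three nodes on different ports.

2.4.1 Pseudo-Cluster Installation

# Create the directory for the cluster
[root@master test]# mkdir /cluster
# Extract the archive and rename the installation directory
[root@master test]# tar -xvf elasticsearch-8.5.0-linux-x86_64.tar.gz -C /cluster/
[root@master test]# cd /cluster/
[root@master cluster]# mv elasticsearch-8.5.0/ es-1
# Repeat the same steps to create es-1, es-2 and es-3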
[root@master cluster]# ll
total 0
drwxr-xr-x. 9 root root 155 Oct 25  2022 es-1
drwxr-xr-x. 9 root root 155 Oct 25  2022 es-2
drwxr-xr-x. 9 root root 155 Oct 25  2022 es-3

# node-1 configuration
[root@master cluster]# grep -Ev "^$|^#" es-1/config/elasticsearch.yml
cluster.name: test-es
node.name: node-1
path.data: /cluster/es-1/data/
path.logs: /cluster/es-1/logs/
network.host: 0.0.0.0
http.port: 9201  # HTTP port
transport.port: 9300  # transport port
discovery.seed_hosts: ["127.0.0.1:9300", "127.0.0.1:9400", "127.0.0.1:9500"]  # seed nodes for discovery
cluster.initial_master_nodes: ["node-1"]  # initial master-eligible nodes
xpack.security.enabled: false  # disable the security features, including TLS

# node-2 configuration
[root@master cluster]# grep -Ev "^$|^#" es-2/config/elasticsearch.yml
cluster.name: test-es
node.name: node-2
path.data: /cluster/es-2/data/
path.logs: /cluster/es-2/logs/
network.host: 0.0.0.0
http.port: 9202
transport.port: 9400
discovery.seed_hosts: ["127.0.0.1:9300", "127.0.0.1:9400", "127.0.0.1:9500"]
xpack.security.enabled: false

# node-3 configuration
[root@master cluster]# grep -Ev "^$|^#" es-3/config/elasticsearch.yml
cluster.name: test-es
node.name: node-3
path.data: /cluster/es-3/data/
path.logs: /cluster/es-3/logs/
network.host: 0.0.0.0
http.port: 9203
transport.port: 9500
discovery.seed_hosts: ["127.0.0.1:9300", "127.0.0.1:9400", "127.0.0.1:9500"]
xpack.security.enabled: false

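# Create a data directory for each node
[root@master cluster]#  mkdir {es-1,es-2,es-3}/data/

# Reduce the JVM heap size; do this on every node
[root@master cluster]# grep -Ev "^$|^#" es-1/config/jvm.options | grep 256m
-Xms256m  # lowered for this test (older releases default to 1G; 8.x sizes the heap automatically)
-Xmx256m

# Start the service as the non-root user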
[root@master cluster]# chown -R redhat:redhat {es-1,es-2,es-3}
[root@master cluster]# su redhat
[redhat@master cluster]$ ll
total 0
drwxr-xr-x. 10 redhat redhat 167 Jun 17 19:27 es-1
drwxr-xr-x. 10 redhat redhat 167 Jun 17 19:27 es-2
drwxr-xr-x. 10 redhat redhat 167 Jun 17 19:27 es-3
# Start the cluster nodes
[redhat@master cluster]$ ./es-1/bin/elasticsearch -d
[redhat@master cluster]$ ./es-2/bin/elasticsearch -d
[redhat@master cluster]$ ./es-3/bin/elasticsearch -d

# Check the service
[redhat@master cluster]$ curl http://127.0.0.1:9201
{
  "name" : "node-1",
  "cluster_name" : "test-ES",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "8.5.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "c94b4700cda13820dad5aa74fae6db185ca5c304",
    "build_date" : "2022-10-24T16:54:16.433628434Z",
    "build_snapshot" : false,
    "lucene_version" : "9.4.1",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

# List the cluster nodes
[redhat@master cluster]$ curl http://127.0.0.1:9201/_cat/nodes
192.168.15.132 35 95 12 1.07 0.65 0.35 cdfhilmrstw - node-3
192.168.15.132 47 95 23 1.07 0.65 0.35 cdfhilmrstw * node-1  # node-1 is the master
192.168.15.132 45 95 23 1.07 0.65 0.35 cdfhilmrstw - node-2

2.4.2 Cluster Start/Stop Script

[root@master cluster]#  vim cluster.sh
#!/bin/bash
# Start, stop, or check every ES node installed under /cluster/es-*.

case "$1" in
    start)
        # Start each node as the unprivileged user; -d runs it in the background.
        for dir in /cluster/es-*/; do
            su - redhat -c "${dir}bin/elasticsearch -d" &> /dev/null
            echo "======== $(basename "$dir") es service started ========"
        done
        ;;
    stop)
        # Send SIGTERM (not kill -9) so each node can shut down cleanly.
        pkill -f elasticsearch
        echo "======== es service stopped ========"
        ;;
    status)
        jps | grep -i elasticsearch
        ;;
    *)
        echo "Usage: $0 {start|stop|status}"
        exit 1
        ;;
esac

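3 Basic Operations

ES is mostly driven by sending HTTP requests with curl, and it answers with JSON. For curl usage see the curl tutorial; for installing jq see "JSON data format and tools".

3.1 Status Checks

Request format:

Method URI_path [Body]
# Method: GET, POST, PUT, DELETE, HEAD...
# URI path: GET parameters are carried in the URI
# Body: POST and PUT parameters are carried in the request body; sending JSON requires the HTTP header: -H 'Content-Type: application/json' -d '{JSON_data}'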
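The listings in this chapter use the shorthand above instead of full curl commands. A small sketch of how one shorthand line expands into a curl invocation; the helper to_curl is hypothetical, and the default host http://127.0.0.1:9201 is assumed from the pseudo-cluster in 2.4.1.

```python
import json
import shlex
from typing import Optional


def to_curl(method: str, path: str, body: Optional[dict] = None,
            host: str = "http://127.0.0.1:9201") -> str:
    """Expand 'METHOD /path [body]' into a shell-safe curl command line."""
    args = ["curl", "-s", "-X", method.upper(), host + path]
    if body is not None:
        # A JSON body needs the Content-Type header.
        args += ["-H", "Content-Type: application/json", "-d", json.dumps(body)]
    return shlex.join(args)


print(to_curl("GET", "/_cat/health?v"))
print(to_curl("POST", "/test_index_001/_search", {"query": {"match_all": {}}}))
```

shlex.join quotes each argument, so a URL containing ? or a JSON body containing quotes can be pasted into a shell unchanged.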

3.1.1 Cluster

# Quick overview
GET /_cat/health?v
epoch      timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1687066363 05:32:43  test-es green           3         3     14   5    0    0        0             0                  -                100.0%

# Cluster health
GET /_cluster/health
{
  "cluster_name": "test-es",  # cluster name
  "status": "green",  # cluster status
  "timed_out": false,  # whether the request timed out
  "number_of_nodes": 3,  # number of nodes in the cluster
  "number_of_data_nodes": 3,  # number of data nodes
  "active_primary_shards": 5,  # number of active primary shards
  "active_shards": 14,  # total number of active shards
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}

3.1.2 Nodes

# List the cluster nodes
GET /_cat/nodes?v
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
192.168.15.132           44          91   0    0.00    0.01     0.06 cdfhilmrstw -      node-3
192.168.15.132           62          91   0    0.00    0.01     0.06 cdfhilmrstw *      node-1
192.168.15.132           43          91   0    0.00    0.01     0.06 cdfhilmrstw -      node-2

# Detailed information for every node
GET /_nodes/process
{
  "_nodes": {
    "total": 3,
    "successful": 3,
    "failed": 0
  },
  "cluster_name": "test-es",
  "nodes": {
    "szQRkbALS9Ol1Ne9w_fFqg": {
      "name": "node-2",
      "transport_address": "192.168.15.132:9400",
      "host": "192.168.15.132",
      "ip": "192.168.15.132",
      "version": "8.5.0",
      "build_flavor": "default",
      "build_type": "tar",
      "build_hash": "c94b4700cda13820dad5aa74fae6db185ca5c304",
      "roles": [
        "data",
        "data_cold",
        "data_content",
        "data_frozen",
        "data_hot",
        "data_warm",
        "ingest",
        "master",
        "ml",
        "remote_cluster_client",
        "transform"
      ],
      "attributes": {
        "ml.allocated_processors_double": "4.0",
        "ml.machine_memory": "3954188288",
        "ml.max_jvm_size": "268435456",
        "xpack.installed": "true",
        "ml.allocated_processors": "4"
      },
      "process": {
        "refresh_interval_in_millis": 1000,
        "id": 19616,
        "mlockall": false
      }
    },
    "FU1gu65SQWaeNrrbju6OdQ": {
      "name": "node-1",
      "transport_address": "192.168.15.132:9300",
      "host": "192.168.15.132",
      "ip": "192.168.15.132",
      "version": "8.5.0",
      "build_flavor": "default",
      "build_type": "tar",
      "build_hash": "c94b4700cda13820dad5aa74fae6db185ca5c304",
      "roles": [
        "data",
        "data_cold",
        "data_content",
        "data_frozen",
        "data_hot",
        "data_warm",
        "ingest",
        "master",
        "ml",
        "remote_cluster_client",
        "transform"
      ],
      "attributes": {
        "xpack.installed": "true",
        "ml.machine_memory": "3954188288",
        "ml.allocated_processors": "4",
        "ml.max_jvm_size": "268435456",
        "ml.allocated_processors_double": "4.0"
      },
      "process": {
        "refresh_interval_in_millis": 1000,
        "id": 19416,
        "mlockall": false
      }
    },
    "-wcMitOrTV6lVFzD7V16bw": {
      "name": "node-3",
      "transport_address": "192.168.15.132:9500",
      "host": "192.168.15.132",
      "ip": "192.168.15.132",
      "version": "8.5.0",
      "build_flavor": "default",
      "build_type": "tar",
      "build_hash": "c94b4700cda13820dad5aa74fae6db185ca5c304",
      "roles": [
        "data",
        "data_cold",
        "data_content",
        "data_frozen",
        "data_hot",
        "data_warm",
        "ingest",
        "master",
        "ml",
        "remote_cluster_client",
        "transform"
      ],
      "attributes": {
        "ml.allocated_processors_double": "4.0",
        "ml.machine_memory": "3954188288",
        "xpack.installed": "true",
        "ml.max_jvm_size": "268435456",
        "ml.allocated_processors": "4"
      },
      "process": {
        "refresh_interval_in_millis": 1000,
        "id": 19845,
        "mlockall": false
      }
    }
  }
}

# Information for a single node
GET /_nodes/node-1/process
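The node.role column of the _cat/nodes output above packs each node's roles into single letters. A sketch of decoding it, using the letter meanings as I understand them from the cat nodes API; the function decode_roles is hypothetical.

```python
# One letter per role, as abbreviated in the node.role column.
ROLE_LETTERS = {
    "c": "data_cold",
    "d": "data",
    "f": "data_frozen",
    "h": "data_hot",
    "i": "ingest",
    "l": "ml",
    "m": "master",
    "r": "remote_cluster_client",
    "s": "data_content",
    "t": "transform",
    "v": "voting_only",
    "w": "data_warm",
}


def decode_roles(letters: str) -> list:
    """Turn e.g. 'cdfhilmrstw' into a sorted list of role names."""
    if letters == "-":
        return []  # a coordinating-only node has no roles
    return sorted(ROLE_LETTERS[ch] for ch in letters)


print(decode_roles("cdfhilmrstw"))
```

For the three nodes above the result matches the "roles" array in the _nodes/process output, which is a quick way to cross-check the two views.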

3.1.3 Indices

GET /_cat/indices?v
health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   test_index FkEZLt8KTB6ESz5J9JBodg   1   2          2            4     48.3kb         21.7kb
green  open   user       t5panoawTJqrYhkLntxbjA   3   2          0            0      1.9kb           675b

3.1.4 Disk Allocation

GET /_cat/allocation?v
shards disk.indices disk.used disk.avail disk.total disk.percent host           ip             node
     5         40mb    14.6gb       13gb     27.6gb           52 192.168.15.132 192.168.15.132 node-2
     5         40mb    14.6gb       13gb     27.6gb           52 192.168.15.132 192.168.15.132 node-1
     4       15.4kb    14.6gb       13gb     27.6gb           52 192.168.15.132 192.168.15.132 node-3

3.1.5 Cluster and Other Information

GET /_cat
=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates
/_cat/component_templates
/_cat/ml/anomaly_detectors
/_cat/ml/anomaly_detectors/{job_id}
/_cat/ml/datafeeds
/_cat/ml/datafeeds/{datafeed_id}
/_cat/ml/trained_models
/_cat/ml/trained_models/{model_id}
/_cat/ml/data_frame/analytics
/_cat/ml/data_frame/analytics/{id}
/_cat/transforms
/_cat/transforms/{transform_id}

3.2 Index Operations

3.2.1 Create

PUT /index_test_001
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "index_test_001"
}

# An index name cannot be created twice
PUT /index_test_001
{
  "error": {
    "root_cause": [
      {
        "type": "resource_already_exists_exception",
        "reason": "index [index_test_001/4AZUKyCISYKGRtELJfTRkA] already exists",
        "index_uuid": "4AZUKyCISYKGRtELJfTRkA",
        "index": "index_test_001"
      }
    ],
    "type": "resource_already_exists_exception",
    "reason": "index [index_test_001/4AZUKyCISYKGRtELJfTRkA] already exists",
    "index_uuid": "4AZUKyCISYKGRtELJfTRkA",
    "index": "index_test_001"
  },
  "status": 400
}

3.2.2 Query

# View a specific index
GET /index_test_001
{
  "index_test_001": {
    "aliases": {},
    "mappings": {},
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "index_test_001",
        "creation_date": "1687085184940",
        "number_of_replicas": "1",
        "uuid": "4AZUKyCISYKGRtELJfTRkA",
        "version": {
          "created": "8050099"
        }
      }
    }
  }
}

# List all indices
GET /_cat/indices?v
health status index          uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   index_test_001 d6Yv9GqcRDOXxvEryKhO8A   1   1          0            0       450b           225b
green  open   index_test_002 SHcOfYpmTJyXRf062-6Mag   1   1          0            0       450b           225b
green  open   index_test_003 wbsc8AyVQpaUNoxUjvrYzQ   1   1          0            0       450b           225b

3.2.3 Delete

DELETE /index_test_003
{
  "acknowledged": true
}

3.2.4 Close and Open

# Close the index
POST /test_index_003-new/_close
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "indices": {
    "test_index_003-new": {
      "closed": true
    }
  }
}

# Verify: status is now close
GET /_cat/indices?v                                                                                                       
health status index              uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  close  test_index_003-new JYTNZR3CQ1-ITGpCcaveaw   1   1    

# Open the index
POST /test_index_003-new/_open
{
  "acknowledged": true,
  "shards_acknowledged": true
}

# Verify: status is now open
GET /_cat/indices?v 
health status index              uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   test_index_003-new JYTNZR3CQ1-ITGpCcaveaw   1   1          3            0      9.3kb          4.6kb

3.3 Document Operations

3.3.1 Create

POST /test_index_001/_doc/1 '{"username": "test_doc_001", "message": "test_data_001"}'
{
  "_index": "test_index_001",
  "_id": "1",  # unique identifier
  "_version": 1,
  "result": "created",  # outcome of the operation
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}

# If no document ID is given, ES generates a fixed-length ID automatically
POST /test_index_001/_doc/ '{"username": "test_doc_002", "message": "test_data_002"}'
{
  "_index": "test_index_001",
  "_id": "1dMozogBm1uMeriITrJf",  # generated ID
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}

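# _create also indexes a document, but only if the ID does not exist yet; unlike _doc it fails with a conflict instead of overwriting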
POST /test_index_001/_create/2 '{"username": "test_doc_003", "message": "test_data_003"}'
{
  "_index": "test_index_001",
  "_id": "2",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 2,
  "_primary_term": 1
}
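The difference between indexing through _doc and through _create shows up only when the ID already exists. The sketch below models just that behaviour in memory; TinyIndex and VersionConflict are hypothetical names, and the class is a model of the semantics, not a client for a real cluster.

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 conflict Elasticsearch returns."""


class TinyIndex:
    """In-memory model of the _doc and _create write semantics."""

    def __init__(self):
        self._docs = {}  # _id -> {"_version": int, "_source": dict}

    def index(self, doc_id, source):
        """Like PUT/POST <index>/_doc/<id>: create or overwrite."""
        exists = doc_id in self._docs
        version = self._docs[doc_id]["_version"] + 1 if exists else 1
        self._docs[doc_id] = {"_version": version, "_source": dict(source)}
        return {"_id": doc_id, "_version": version,
                "result": "updated" if exists else "created"}

    def create(self, doc_id, source):
        """Like POST <index>/_create/<id>: refuse to overwrite."""
        if doc_id in self._docs:
            raise VersionConflict(f"document {doc_id!r} already exists")
        return self.index(doc_id, source)


idx = TinyIndex()
first = idx.index("1", {"username": "test_doc_001"})
second = idx.index("1", {"username": "changed"})
print(first["result"], second["result"], second["_version"])
```

Calling idx.create("1", ...) at this point raises VersionConflict, whereas idx.create("2", ...) succeeds with version 1.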

3.3.2 Query

# Fetch a document by ID with a GET request
GET /test_index_001/_doc/1
{
  "_index": "test_index_001",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "username": "test_doc_001",
    "message": "test_data_001"
  }
}

# If the document does not exist, found is false
GET /test_index_001/_doc/10
{
  "_index": "test_index_001",
  "_id": "10",
  "found": false
}

# Retrieve all documents in the index
GET /test_index_001/_search
{
  "took": 255,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1",
        "_score": 1,
        "_source": {
          "username": "test_doc_001",
          "message": "test_data_001"
        }
      },
      {
        "_index": "test_index_001",
        "_id": "1dMozogBm1uMeriITrJf",
        "_score": 1,
        "_source": {
          "username": "test_doc_002",
          "message": "test_data_002"
        }
      },
      {
        "_index": "test_index_001",
        "_id": "2",
        "_score": 1,
        "_source": {
          "username": "test_doc_003",
          "message": "test_data_003"
        }
      }
    ]
  }
}

3.3.3 Update

# Full overwrite
PUT /test_index_001/_doc/2 '{"username": "test_doc_002", "message": "test_data_002"}'
{
  "_index": "test_index_001",
  "_id": "2",
  "_version": 2,  # version is incremented
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 3,  # _seq_no is incremented
  "_primary_term": 1
}

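# Attempted partial update. Note: a request to _doc always replaces the whole document, so the body is stored as-is with a top-level "doc" field; a true partial update goes through POST /test_index_001/_update/2
POST /test_index_001/_doc/2 '{"doc":{"message":"test_data_change"}}'
{
  "_index": "test_index_001",
  "_id": "2",
  "_version": 3,  # version is incremented
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 4,  # _seq_no is incremented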
  "_primary_term": 1
}

# Verify the result
GET /test_index_001/_doc/2
{
  "_index": "test_index_001",
  "_id": "2",
  "_version": 3,
  "_seq_no": 4,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "doc": {
      "message": "test_data_change"  # the whole document was replaced: username is gone and the data sits under "doc"
    }
  }
}
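The verification output shows a _source that contains only a "doc" field, which is what a full replacement produces. The sketch below contrasts that with the merge a partial update through the _update endpoint performs, assuming the documented behaviour that the partial document is merged into the existing one, object by object. Both function names are hypothetical.

```python
import copy


def replace_document(existing: dict, body: dict) -> dict:
    """Indexing via _doc: the request body becomes the new _source."""
    return copy.deepcopy(body)


def partial_update(existing: dict, partial: dict) -> dict:
    """Updating via _update with {"doc": partial}: merge into _source."""
    merged = copy.deepcopy(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = partial_update(merged[key], value)  # merge objects
        else:
            merged[key] = copy.deepcopy(value)  # scalars and arrays replace
    return merged


existing = {"username": "test_doc_002", "message": "test_data_002"}
body = {"doc": {"message": "test_data_change"}}

print(replace_document(existing, body))
print(partial_update(existing, body["doc"]))
```

The first line reproduces the _source seen above; the second keeps username and changes only message, which is the intended outcome of a partial update.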

3.3.4 Delete

DELETE /test_index_001/_doc/2
{
  "_index": "test_index_001",
  "_id": "2",
  "_version": 4,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 5,
  "_primary_term": 1
}

# Verify the result
GET /test_index_001/_doc/2
{
  "_index": "test_index_001",
  "_id": "2",
  "found": false
}

3.3.5 Special Queries

# Query through the URI; q stands for query
GET /test_index_001/_search?q=message:test_data_002
{
  "took": 377,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.6931471,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1dMozogBm1uMeriITrJf",
        "_score": 0.6931471,
        "_source": {
          "username": "test_doc_002",
          "message": "test_data_002"
        }
      }
    ]
  }
}

# Query through the request body
POST /test_index_001/_search '{"query":{"match":{"message":"test_data_002"}}}'
{
  "took": 1065,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.6931471,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1dMozogBm1uMeriITrJf",
        "_score": 0.6931471,
        "_source": {
          "username": "test_doc_002",
          "message": "test_data_002"
        }
      }
    ]
  }
}

# Match all documents through the request body
POST /test_index_001/_search '{"query":{"match_all":{}}}'
{
  "took": 30,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1",
        "_score": 1,
        "_source": {
          "username": "test_doc_001",
          "message": "test_data_001"
        }
      },
      {
        "_index": "test_index_001",
        "_id": "1dMozogBm1uMeriITrJf",
        "_score": 1,
        "_source": {
          "username": "test_doc_002",
          "message": "test_data_002"
        }
      }
    ]
  }
}

# Paginated query: from is the starting offset, size is the number of hits per page
POST /test_index_001/_search '{"query":{"match_all":{}},"from":0,"size":1}'
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,  # 2 documents match, but only 1 is returned
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1",
        "_score": 1,
        "_source": {
          "username": "test_doc_001",
          "message": "test_data_001"
        }
      }
    ]
  }
}
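Page numbers translate into from and size with a little arithmetic. A minimal sketch, assuming 1-based page numbers and the default index.max_result_window of 10000, which caps from + size; the helper page_to_from_size is hypothetical.

```python
MAX_RESULT_WINDOW = 10000  # default value of index.max_result_window


def page_to_from_size(page: int, size: int) -> dict:
    """Convert a 1-based page number into the from/size search parameters."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    start = (page - 1) * size
    if start + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window")
    return {"from": start, "size": size}


# Page 1 with one hit per page is the request shown above.
print(page_to_from_size(page=1, size=1))
print(page_to_from_size(page=3, size=20))
```

The window check is why deep pagination with from and size eventually fails.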

# Return only selected keys of each hit
POST /test_index_001/_search '{"query":{"match_all":{}},"from":0,"size":1,"_source":["message"]}'
{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1",
        "_score": 1,
        "_source": {
          "message": "test_data_001"  # username is not returned
        }
      }
    ]
  }
}

# View the mapping: a keyword field is not analyzed and matches only the exact value; a text field is analyzed and supports full-text matching
GET /test_index_001/_mapping
{
  "index_test-001": {
    "mappings": {
      "properties": {
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "fielddata": true
        },
        "username": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

# Mind the data type: fielddata is enabled here on the text field message
POST /test_index_001/_mapping '{"properties": {"message": {"type": "text","fielddata": true}}}'
{
  "acknowledged": true
}

# Sort the query results: asc for ascending, desc for descending
POST /test_index_001/_search '{"query":{"match_all":{}},"from":0,"size":2,"_source":["message"],"sort":{"message":{"order":"desc"}}}'
{
  "took": 20,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "3",
        "_score": null,
        "_source": {
          "message": "test_data_003"
        },
        "sort": [
          "test_data_003"
        ]
      },
      {
        "_index": "test_index_001",
        "_id": "2",
        "_score": null,
        "_source": {
          "message": "test_data_002"
        },
        "sort": [
          "test_data_002"
        ]
      }
    ]
  }
}

# Conditional match query
POST /test_index_001/_search '{"query":{"bool":{"must":[{"match":{"number":"1111"}}]}}}'
{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.9808291,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1",
        "_score": 0.9808291,
        "_source": {
          "number": "1111",
          "data": "1111"
        }
      }
    ]
  }
}

# Multiple conditions combined with AND: must
POST /test_index_001/_search '{"query":{"bool":{"must":[{"match":{"number":"1111"}},{"match":{"data":1111}}]}}}'
{
  "took": 36,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.9616582,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1",
        "_score": 1.9616582,
        "_source": {
          "number": "1111",
          "data": "1111"
        }
      }
    ]
  }
}

# Multiple conditions combined with OR: should
POST /test_index_001/_search '{"query":{"bool":{"should":[{"match":{"number":"1111"}},{"match":{"number":2222}}]}}}'
{
  "took": 49,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,                        # 2 documents matched
      "relation": "eq"
    },
    "max_score": 0.9808291,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1",
        "_score": 0.9808291,
        "_source": {
          "number": "1111",
          "data": "1111"
        }
      },
      {
        "_index": "test_index_001",
        "_id": "2",
        "_score": 0.9808291,
        "_source": {
          "number": "2222",
          "data": "2222"
        }
      }
    ]
  }
}


# Range query: gt means greater than, lt means less than
POST /test_index_001/_search '{"query":{"bool":{"should":[{"match":{"number":"1111"}},{"match":{"number":"2222"}}],"filter":{"range":{"data":{"gt":1111}}}}}}'
{
  "took": 222,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,                        # documents whose data is greater than 1111
      "relation": "eq"
    },
    "max_score": 0.9808291,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "2",
        "_score": 0.9808291,
        "_source": {
          "number": "2222",
          "data": "2222"
        }
      },
      {
        "_index": "test_index_001",
        "_id": "3",
        "_score": 0,
        "_source": {
          "number": "3333",
          "data": "3333"
        }
      }
    ]
  }
}

# Phrase match (match_phrase) with highlighting (highlight)
POST /test_index_001/_search '{"query":{"match_phrase":{"number":"1111"}},"highlight":{"fields":{"number":{}}}}'
{
  "took": 454,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.9808291,
    "hits": [
      {
        "_index": "test_index_001",
        "_id": "1",
        "_score": 0.9808291,
        "_source": {
          "number": "1111",
          "data": "1111"
        },
        "highlight": {
          "number": [
            "<em>1111</em>"                # the em tags mark the highlighted text
          ]
        }
      }
    ]
  }
}
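
The must, should and range examples above all share one bool query shape. A minimal sketch, assuming hypothetical helpers match_clauses and bool_query (illustrative names, not an ES client API), that assembles such a body from (field, value) pairs:

```python
import json


def match_clauses(pairs):
    """Turn (field, value) pairs into a list of match clauses."""
    return [{"match": {field: value}} for field, value in pairs]


def bool_query(must=(), should=(), range_filter=None) -> dict:
    """Compose a bool query.

    must / should: iterables of (field, value) pairs.
    range_filter:  dict such as {"data": {"gt": 1111}}, or None.
    """
    body = {}
    if must:
        body["must"] = match_clauses(must)      # every clause has to match (AND)
    if should:
        body["should"] = match_clauses(should)  # matching clauses raise the score (OR)
    if range_filter:
        body["filter"] = {"range": range_filter}  # filters do not affect the score
    return {"query": {"bool": body}}


# Rebuilds the body of the range example above.
query = bool_query(
    should=[("number", "1111"), ("number", "2222")],
    range_filter={"data": {"gt": 1111}},
)
print(json.dumps(query))
```

When a bool query also carries a filter or must clause, the should clauses become optional and only influence scoring, which is consistent with document 3 being returned above with a score of 0.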

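3.3.6 Changing a Field's Data Type

ES cannot change the data type of an existing field in place (this prevents existing data from becoming unusable after the type changes). Instead, create a new index with the desired mapping, copy the data into it, and use an index alias so that the old name resolves to the new index. The end result is equivalent to changing the field type of the original index.

# Create a new index and configure the field types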
PUT /test_index_003-new '{"mappings": {"properties": {"username": {"type": "text"},"test_info": {"type": "long"}}}}'
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "test_index_003-new"
}

# 查看映射关系
GET /test_index_003-new/_mapping
{
  "test_index_003-new": {
    "mappings": {
      "properties": {
        "test_info": {
          "type": "long"                  # numeric type
        },
        "username": {
          "type": "text"
        }
      }
    }
  }
}

# Copy the data from the old index into the new index
POST /_reindex '{"source": {"index": "test_index_003"},"dest": {"index": "test_index_003-new"}}' 
{
  "took": 55,
  "timed_out": false,
  "total": 3,
  "updated": 0,
  "created": 3,                          # 3 documents created
  "deleted": 0,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1.0,
  "throttled_until_millis": 0,
  "failures": []
}

# Delete the old index
DELETE /test_index_003/
{
  "acknowledged": true
}

# Create an alias on the new index that carries the old index name
PUT /test_index_003-new/_alias/test_index_003
{
  "acknowledged": true
}

# View the alias
GET /test_index_003-new/_alias
{
  "test_index_003-new": {
    "aliases": {
      "test_index_003": {}
    }
  }
}

# Verify the alias by querying with the old index name
GET /test_index_003/_search
{
  "took": 250,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "test_index_003-new",   # the index name is the new index
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "username": "test_doc_001",
          "test_info": "111"
        }
      },
      {
        "_index": "test_index_003-new",
        "_id": "2",
        "_score": 1.0,
        "_source": {
          "username": "test_doc_001",
          "test_info": "222"
        }
      },
      {
        "_index": "test_index_003-new",
        "_id": "3",
        "_score": 1.0,
        "_source": {
          "username": "test_doc_001",
          "test_info": "333"
        }
      }
    ]
  }
}

GET /test_index_003/_mapping
{
  "test_index_003-new": {
    "mappings": {
      "properties": {
        "test_info": {
          "type": "long"
        },
        "username": {
          "type": "text"
        }
      }
    }
  }
}
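
The procedure above can be condensed into an ordered list of requests. A minimal sketch, assuming a hypothetical helper retype_steps that only describes the calls as (method, path, body) tuples and leaves sending them to whatever HTTP client is in use:

```python
def retype_steps(old_index: str, new_index: str, properties: dict) -> list:
    """Return the requests, in order, for the alias-based type change."""
    return [
        # 1. Create the new index with the desired field types.
        ("PUT", f"/{new_index}", {"mappings": {"properties": properties}}),
        # 2. Copy every document from the old index into the new one.
        ("POST", "/_reindex",
         {"source": {"index": old_index}, "dest": {"index": new_index}}),
        # 3. Delete the old index so that its name becomes free.
        ("DELETE", f"/{old_index}", None),
        # 4. Reuse the old name as an alias of the new index.
        ("PUT", f"/{new_index}/_alias/{old_index}", None),
    ]


steps = retype_steps(
    "test_index_003",
    "test_index_003-new",
    {"username": {"type": "text"}, "test_info": {"type": "long"}},
)
for method, path, body in steps:
    print(method, path, body)
```

The order matters: the alias can only be created after the old index is deleted, because an alias may not share a name with an existing index.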

3.3.7 Aggregation Queries

Mind the field's data type: operations such as averaging cannot be applied to string fields.

# Prepare the data
POST /test_index_003-new/_doc/1 '{"username": "test_doc_001", "test_info": "111"}'
POST /test_index_003-new/_doc/2 '{"username": "test_doc_002", "test_info": "222"}'
POST /test_index_003-new/_doc/3 '{"username": "test_doc_003", "test_info": "333"}'

# Aggregation query (terms: group documents by the value of test_info)
POST /test_index_003-new/_search '{"aggs":{"avg_grade":{"terms":{"field":"test_info"}}}}'
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "test_index_003-new",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "username": "test_doc_001",
          "test_info": "111"
        }
      },
      {
        "_index": "test_index_003-new",
        "_id": "2",
        "_score": 1.0,
        "_source": {
          "username": "test_doc_001",
          "test_info": "222"
        }
      },
      {
        "_index": "test_index_003-new",
        "_id": "3",
        "_score": 1.0,
        "_source": {
          "username": "test_doc_001",
          "test_info": "333"
        }
      }
    ]
  },
  "aggregations": {
    "avg_grade": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 111,
          "doc_count": 1
        },
        {
          "key": 222,
          "doc_count": 1
        },
        {
          "key": 333,
          "doc_count": 1
        }
      ]
    }
  }
}

# Set size to 0 so that only the aggregation result is returned, without the hits
POST /test_index_003-new/_search '{"aggs":{"avg_grade":{"terms":{"field":"test_info"}}},"size":0}'
{
  "took": 30,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "avg_grade": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 111,
          "doc_count": 1
        },
        {
          "key": 222,
          "doc_count": 1
        },
        {
          "key": 333,
          "doc_count": 1
        }
      ]
    }
  }
}

# Compute the average with avg
POST /test_index_003-new/_search '{"aggs":{"avg_grade":{"avg":{"field":"test_info"}}},"size":0}'
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "avg_grade": {
      "value": 222.0
    }
  }
}
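
To make the two aggregations above concrete, the sketch below builds their request bodies and recomputes the same numbers locally from the three sample values. The helper agg_body is a hypothetical name used only for illustration, and the local computation stands in for what the cluster returns rather than calling it:

```python
from collections import Counter
from statistics import mean


def agg_body(name: str, kind: str, field: str, size: int = 0) -> dict:
    """Build a search body with one aggregation; size=0 suppresses the hits."""
    return {"aggs": {name: {kind: {"field": field}}}, "size": size}


terms_request = agg_body("avg_grade", "terms", "test_info")
avg_request = agg_body("avg_grade", "avg", "test_info")

# Local equivalent for the sample documents (test_info = 111, 222, 333).
values = [111, 222, 333]
buckets = [{"key": k, "doc_count": c} for k, c in sorted(Counter(values).items())]
average = mean(values)

print(terms_request)
print(avg_request)
print(buckets, average)
```

A terms aggregation counts documents per distinct value, while avg reduces the field to a single number, which is why avg needs a numeric field such as the long test_info.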

3.3 Scaling Shards and Replicas

The shard and replica settings of an index can be viewed as follows:

GET /test_index_003-new/_settings
{
  "test_index_003-new": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "test_index_003-new",
        "creation_date": "1687163817811",
        "number_of_replicas": "1",
        "uuid": "JYTNZR3CQ1-ITGpCcaveaw",
        "version": {
          "created": "8080199"
        }
      }
    }
  }
}

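3.3.1 Primary Shards - pri

In ES, data is stored across different shards. The number of shards of an index is set through settings when the index is created; the default is 1 (since version 7; earlier versions defaulted to 5), and it cannot be changed once set.

If the shard count must be changed, use the method from section 3.3.6: create a new index configured with the required number of shards, reindex into it, and set up an index alias. The end result is equivalent to changing the shard count of the original index.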
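
The reason the shard count is fixed is that, in simplified form, a document is routed to shard hash(routing) % number_of_shards, so changing the divisor would send existing documents to different shards. A minimal sketch of that effect; zlib.crc32 is only a stand-in hash chosen for illustration (Elasticsearch uses Murmur3 on the routing value, which defaults to the document _id), and shard_for is a hypothetical name:

```python
import zlib


def shard_for(doc_id: str, number_of_shards: int) -> int:
    """Pick a shard as hash(routing) % number_of_shards (stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


ids = [str(i) for i in range(1, 21)]

# Ids whose target shard differs between a 3-shard and a 4-shard layout.
moved = [i for i in ids if shard_for(i, 3) != shard_for(i, 4)]
print(f"{len(moved)} of {len(ids)} ids would map to a different shard")
```

Any id in the moved list could no longer be found on its original shard after such a change, which is why a reindex into a new index is needed.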

# View the shards
GET /_cat/shards           
test_index_003     0 p STARTED 0  247b 9.134.244.180 node-1
test_index_003     0 r STARTED 0  247b 9.134.244.180 node-4
test_index_002     0 r STARTED 0  247b 9.134.244.180 node-4
test_index_002     0 p STARTED 0  247b 9.134.244.180 node-2
test_index_001     0 p STARTED 0  247b 9.134.244.180 node-3
test_index_001     0 r STARTED 0  247b 9.134.244.180 node-1
temp               0 p STARTED 0  247b 9.134.244.180 node-1
temp               0 r STARTED 0  247b 9.134.244.180 node-2
test_index_003-new 0 r STARTED 3 4.6kb 9.134.244.180 node-3
test_index_003-new 0 p STARTED 3 4.6kb 9.134.244.180 node-2

# The shard count of an existing index cannot be modified
PUT /test_index_003-new/_settings '{"index": {"number_of_shards": 4}}'
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Can't update non dynamic settings [[index.number_of_shards]] for open indices [[test_index_003-new/JYTNZR3CQ1-ITGpCcaveaw]]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Can't update non dynamic settings [[index.number_of_shards]] for open indices [[test_index_003-new/JYTNZR3CQ1-ITGpCcaveaw]]"
  },
  "status": 400
}

# Create a new index and set its shard and replica counts
PUT /temp '{"settings": {"number_of_shards": 3,"number_of_replicas": 2}}'
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "temp"
}

GET /_cat/indices?v
health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   temp       t5panoawTJqrYhkLntxbjA   3   2          0            0      1.9kb           675b

3.3.2 Replicas - rep

The replica count can be changed on a live index.

PUT /test_index/_settings '{"number_of_replicas": 2}'
{
  "acknowledged": true
}

# rep = 2
GET /_cat/indices?v
health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   test_index FkEZLt8KTB6ESz5J9JBodg   1   2          2            7       64kb         21.4kb
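
The pri and rep columns determine how many shard copies the cluster has to place: each primary has rep replicas, and copies of the same shard are never placed on the same node. A minimal sketch of that arithmetic, using hypothetical helper names:

```python
def total_shard_copies(primaries: int, replicas: int) -> int:
    """Every primary shard plus its replicas."""
    return primaries * (1 + replicas)


def replicas_assignable(replicas: int, data_nodes: int) -> bool:
    """Each shard needs replicas + 1 distinct data nodes to be fully assigned."""
    return replicas + 1 <= data_nodes


# The temp index created above: 3 primaries with 2 replicas each.
print(total_shard_copies(3, 2))

# 2 replicas fit on a 4-node cluster, 4 replicas would not.
print(replicas_assignable(2, 4), replicas_assignable(4, 4))
```

If the replicas cannot all be assigned, the index is expected to report yellow instead of green until more data nodes join.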

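3.4 Scaling the Cluster

Scaling out: only install the ES service directory for the new node and do not start the service until it has been configured as shown below; otherwise the new node forms its own cluster with a different cluster UUID and cannot join the existing cluster.

3.4.1 Scaling Out

# Unpack the installation package under the cluster path and rename the directory to es-4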
[root@master es]# grep -Ev "^$|^#" es-4/config/elasticsearch.yml
cluster.name: test-cluster
node.name: node-4
path.data: /cluster/es-4/data
path.logs: /cluster/es-4/logs
network.host: 0.0.0.0
http.port: 9204
transport.port: 9999
discovery.seed_hosts: ["localhost:9700","localhost:9800","localhost:9900","localhost:9999"]
cluster.initial_master_nodes: ["node-1"]
xpack.security.enabled: false

[root@master cluster]# grep -Ev "^$|^#" es-1/config/jvm.options | grep 256m
-Xms256m
-Xmx256m

[root@master es]# mkdir es-4/data
[root@master es]# chown -R redhat:redhat es-4/

# Start in the background
[root@master es]# ./es-4/bin/elasticsearch -d 

# Verify that the node is online
[root@VM-244-180-centos cluster]# curl -s http://127.0.0.1:9201/_cat/nodes?v
ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
9.134.244.180           61          55   0    0.26    0.14     0.04 cdfhilmrstw *      node-1
9.134.244.180           58          55   0    0.26    0.14     0.04 cdfhilmrstw -      node-2
9.134.244.180           72          55   0    0.26    0.14     0.04 cdfhilmrstw -      node-3
9.134.244.180           70          55   5    0.26    0.14     0.04 cdfhilmrstw -      node-4     # the new node is online

3.4.2 Scaling In

# Exclude node-4 from shard allocation
PUT /_cluster/settings '{"persistent": {"cluster.routing.allocation.exclude._name": "node-4"}}' 
{
  "acknowledged": true,
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "exclude": {
            "_name": "node-4"              # _name: node name; _ip: node IP
          }
        }
      }
    }
  },
  "transient": {}
}

# On the remaining nodes, remove the node being retired from discovery.seed_hosts and cluster.initial_master_nodes

# Verify the result
GET /_cat/shards                       
test_index_002     0 r STARTED 0  247b 9.134.244.180 node-1
test_index_002     0 p STARTED 0  247b 9.134.244.180 node-2
test_index_003-new 0 r STARTED 3 4.6kb 9.134.244.180 node-3
test_index_003-new 0 p STARTED 3 4.6kb 9.134.244.180 node-2
temp               0 p STARTED 0  247b 9.134.244.180 node-1
temp               0 r STARTED 0  247b 9.134.244.180 node-2
test_index_003     0 r STARTED 0  247b 9.134.244.180 node-3
test_index_003     0 p STARTED 0  247b 9.134.244.180 node-1
test_index_001     0 p STARTED 0  247b 9.134.244.180 node-3
test_index_001     0 r STARTED 0  247b 9.134.244.180 node-1

# Clear the _name exclusion setting
PUT /_cluster/settings '{"persistent": {"cluster.routing.allocation.exclude._name": ""}}'

# View the node information after node-4 has been shut down
GET /_cat/nodes?v
ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
9.134.244.180           46          50   0    0.12    0.10     0.04 cdfhilmrstw -      node-3
9.134.244.180           80          50   0    0.12    0.10     0.04 cdfhilmrstw *      node-1
9.134.244.180           35          50   0    0.12    0.10     0.04 cdfhilmrstw -      node-2

GET /_cluster/health?pretty
{
  "cluster_name" : "test-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

GET _nodes/node-4/stats/indices?pretty
{
  "_nodes" : {
    "total" : 0,
    "successful" : 0,
    "failed" : 0
  },
  "cluster_name" : "test-cluster",
  "nodes" : { }
}
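
Before shutting down a retired node it is worth confirming that no shard still lives on it. A minimal sketch, assuming the default column layout of GET /_cat/shards shown earlier (index, shard, prirep, state, docs, store, ip, node) and a hypothetical helper named shards_on_node that works on the plain-text output:

```python
def shards_on_node(cat_shards_text: str, node_name: str) -> list:
    """Return (index, shard, prirep) for each shard line ending in node_name."""
    found = []
    for line in cat_shards_text.splitlines():
        cols = line.split()
        # Lines with fewer columns (for example unassigned shards) are skipped.
        if len(cols) >= 8 and cols[-1] == node_name:
            found.append((cols[0], int(cols[1]), cols[2]))
    return found


# Three lines taken from the _cat/shards output before the scale-in.
sample = """\
test_index_003     0 p STARTED 0  247b 9.134.244.180 node-1
test_index_003     0 r STARTED 0  247b 9.134.244.180 node-4
test_index_002     0 r STARTED 0  247b 9.134.244.180 node-4
"""

print(shards_on_node(sample, "node-4"))
```

An empty list for node-4 after the exclusion has taken effect suggests that its shards have been moved and the node can be stopped.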
