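First, create an index (roughly the equivalent of a database in MySQL). `number_of_shards` sets the number of primary shards and `number_of_replicas` sets the number of replica copies.
PUT /zzh
{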
"settings": {
"number_of_shards": "2",
"number_of_replicas": "0"
}
}
Response
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "zzh"
}
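Once the index exists, only the number of replicas can be changed through the settings API; the number of primary shards cannot.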
PUT /zzh/_settings
{
"number_of_replicas" : "2"
}
Delete the index
DELETE /zzh
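Insert data
In older versions, an index contained types, a type contained fields, and each field had properties. In newer versions each index has a single type, `_doc`. You do not have to define the fields or their properties up front (ES assigns default properties to each field).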
`POST /zzh/_doc/1`
{
"id":1,
"name":"zzh",
"url": "http://baidu.com/s"
}
Response
{
"_index": "zzh",
"_type": "_doc",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
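Outside the console, each request above is an ordinary HTTP call. The following is a minimal sketch of how the insert request could be built with Python's standard library. The address `http://localhost:9200` and the helper name `build_request` are assumptions for illustration (a local node without authentication), not something this note prescribes. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: a local single-node cluster without authentication.
ES_URL = "http://localhost:9200"

def build_request(method, path, body=None):
    """Build (but do not send) the HTTP request for a console-style call."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )

doc = {"id": 1, "name": "zzh", "url": "http://baidu.com/s"}
req = build_request("POST", "/zzh/_doc/1", doc)
print(req.get_method(), req.full_url)
# To actually send it against a running node: urllib.request.urlopen(req).read()
```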
Update data
A full update replaces the whole document, and the document's `_version` changes after every full update.
PUT /zzh/_doc/1
{
"id": 1,
"name": "zzh",
"url": "http://google.org"
}
A partial update increments `_version` by one. If the same update is sent a second time and changes nothing, `_version` is not incremented.
POST /zzh/_update/1
{
"doc": {
"name": "zyn"
}
}
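The difference between the two kinds of update can be pictured with a toy in-memory model. This is only a sketch of the observable behaviour described above, not how ES works internally. The function names `full_update` and `partial_update` are made up for illustration.

```python
def full_update(stored, new_source):
    """Model of PUT /index/_doc/id: replace the document, always bump the version."""
    return {"_version": stored["_version"] + 1, "_source": dict(new_source)}

def partial_update(stored, partial_doc):
    """Model of POST /index/_update/id: merge fields, skip the write if nothing changes."""
    merged = {**stored["_source"], **partial_doc}
    if merged == stored["_source"]:
        return stored, "noop"
    return {"_version": stored["_version"] + 1, "_source": merged}, "updated"

stored = {"_version": 1, "_source": {"id": 1, "name": "zzh", "url": "http://baidu.com/s"}}
stored = full_update(stored, {"id": 1, "name": "zzh", "url": "http://google.org"})
stored, first = partial_update(stored, {"name": "zyn"})
stored, second = partial_update(stored, {"name": "zyn"})
print(stored["_version"], first, second)
```

The second identical partial update leaves the version where it was, which matches the behaviour noted above.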
Query data
GET /zzh/_doc/1
The response looks like this:
{
"_index": "zzh",
"_type": "_doc",
"_id": "1",
"_version": 4,
"_seq_no": 3,
"_primary_term": 1,
"found": true,
"_source": {
"id": 1,
"name": "zynzzh",
"url": "http://google.org"
}
}
Delete data
DELETE /zzh/_doc/1
The id in the URL is the document `_id`, not the `id` field inside the document body. The response looks like this:
{
"_index": "zzh",
"_type": "_doc",
"_id": "1",
"_version": 5,
"result": "deleted",
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"_seq_no": 4,
"_primary_term": 1
}
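Standard analyzer
The `standard` analyzer splits text into English words and lowercases them; its support for Chinese is poor. See this article: https://blog.csdn.net/qq_26803795/article/details/106522611
https://github.com/medcl/elasticsearch-analysis-ik/releases
https://github.com/medcl/elasticsearch-analysis-ik/ contains the installation steps: create a folder named `ik` inside `plugins` and extract the archive into the `ik` directory.
Then send a request to test the analyzer:
POST /_analyze
Request body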
{
"analyzer":"ik_max_word",
"text": "你好吗 hello world"
}
Response
{
"tokens": [
{
"token": "你好",
"start_offset": 0,
"end_offset": 2,
"type": "CN_WORD",
"position": 0
},
{
"token": "好吗",
"start_offset": 1,
"end_offset": 3,
"type": "CN_WORD",
"position": 1
},
{
"token": "hello",
"start_offset": 4,
"end_offset": 9,
"type": "ENGLISH",
"position": 2
},
{
"token": "world",
"start_offset": 10,
"end_offset": 15,
"type": "ENGLISH",
"position": 3
}
]
}
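The `start_offset` and `end_offset` values in the response above are character positions in the input text. A small Python check illustrates this. It holds here because every character is a single code unit and the input is already lowercase; treat it as an illustration rather than a general guarantee.

```python
text = "你好吗 hello world"

# (token, start_offset, end_offset) copied from the response above.
tokens = [
    ("你好", 0, 2),
    ("好吗", 1, 3),
    ("hello", 4, 9),
    ("world", 10, 15),
]

for token, start, end in tokens:
    # Each token is exactly the slice of the input between its offsets.
    assert text[start:end] == token
    print(f"{token!r} -> text[{start}:{end}]")
```

Note that `ik_max_word` produced overlapping tokens (`你好` and `好吗` share a character), which is why the offsets of the first two tokens overlap.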
When creating an index, you can specify the field types, as follows:
PUT /iktest
{
"settings":{
"index":{
"number_of_shards":"2",
"number_of_replicas":"0"
}
},
"mappings":{
"properties": {
"id": {
"type":"integer"
},
"name": {
"type": "text",
"analyzer": "ik_max_word"
},
"headImg": {
"type": "text"
}
}
}
}
Bulk insert data
POST /_bulk
{ "create" : { "_index" : "iktest", "_id" : "1" } }
{"id":1,"name": "床前明月光","headImg": "http://baidu.com/s"}
{ "create" : { "_index" : "iktest", "_id" : "2" } }
{"id":2,"name": "疑是地上霜","headImg": "http://www.baidu.com/s"}
{ "create" : { "_index" : "iktest", "_id" : "3" } }
{"id":3,"name": "举头望明月","headImg": "http://google.com"}
{ "create" : { "_index" : "iktest", "_id" : "4" } }
{"id":4,"name": "低头思故乡","headImg": "http://www.google.com"}
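A pitfall I ran into: the error `The bulk request must be terminated by a newline [\n]`. The last line of the request must be an empty line, not JSON data, so press Enter after the final JSON line.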
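When the bulk body is generated by a program, the trailing newline is easy to forget. The sketch below builds the same kind of body in Python. The helper name `build_bulk_body` is hypothetical, and only two of the four documents are used to keep it short.

```python
import json

def build_bulk_body(index, docs):
    """Return an NDJSON body for the bulk API: an action line, then a source line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {"_index": index, "_id": str(doc["id"])}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # Every line, including the last one, must end with "\n".
    return "\n".join(lines) + "\n"

docs = [
    {"id": 1, "name": "床前明月光", "headImg": "http://baidu.com/s"},
    {"id": 2, "name": "疑是地上霜", "headImg": "http://www.baidu.com/s"},
]
body = build_bulk_body("iktest", docs)
print(body, end="")
```

Joining the lines with `"\n"` alone would leave the last line unterminated and reproduce the error above; the explicit `+ "\n"` is the fix.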
Once the data has been added, let's test it:
POST /iktest/_search
{
"query":{
"match":{
"name":"明月"
}
}
}
Response
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.5077718,
"hits": [
{
"_index": "iktest",
"_type": "_doc",
"_id": "3",
"_score": 0.5077718,
"_source": {
"id": 3,
"name": "举头望明月",
"headImg": "http://google.com"
}
},
{
"_index": "iktest",
"_type": "_doc",
"_id": "1",
"_score": 0.4091398,
"_source": {
"id": 1,
"name": "床前明月光",
"headImg": "http://baidu.com/s"
}
}
]
}
}
Create an index
PUT /iktest
{
"settings": {
"number_of_shards": "2",
"number_of_replicas": "3"
}
}
Delete an index
DELETE /iktest
Change the number of replicas of an index
PUT /iktest/_settings
{
"number_of_replicas": "2"
}
Insert without specifying an id (ES generates the `_id`; it is not the `id` field inside the document)
POST /iktest/_doc/
{
"id": 1,
"name": "逆水行舟",
"headImg": "http://google.com"
}
Insert with a specified id
POST /iktest/_doc/1
{
"id": 1,
"name": "逆水行舟",
"headImg": "http://google.com"
}
Delete data
DELETE /iktest/_doc/1
Update data
PUT /iktest/_doc/1
{
"id": 1,
"name": "人无再少年",
"headImg": "http://baidu.com"
}
POST /iktest/_update/1
{
"doc": {
"name": "莫待无花空折枝"
}
}
Basic search (returns 10 documents by default)
GET /iktest/_search
`match_all` matches every document. You can add further options, such as sorting:
POST /iktest/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"id": {
"order": "asc"
}
}
]
}
The response is shown below, followed by the meaning of its fields:
{
"took": 21,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 4,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "iktest",
"_type": "_doc",
"_id": "1",
"_score": null,
"_source": {
"id": 1,
"name": "床前明月光",
"headImg": "http://baidu.com/s?word=zzh"
},
"sort": [
1
]
},
{
"_index": "iktest",
"_type": "_doc",
"_id": "2",
"_score": null,
"_source": {
"id": 2,
"name": "疑是地上霜",
"headImg": "http://www.baidu.com/s"
},
"sort": [
2
]
},
{
"_index": "iktest",
"_type": "_doc",
"_id": "3",
"_score": null,
"_source": {
"id": 3,
"name": "举头望明月",
"headImg": "http://google.com"
},
"sort": [
3
]
},
{
"_index": "iktest",
"_type": "_doc",
"_id": "4",
"_score": null,
"_source": {
"id": 4,
"name": "低头思故乡",
"headImg": "http://www.google.com"
},
"sort": [
4
]
}
]
}
}
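A search response like the one above is plain JSON, so a client only needs to walk the nested structure. The following is a minimal sketch that uses an abbreviated copy of that response (two of the four hits, with fewer fields) and pulls out the parts most code cares about.

```python
import json

# Abbreviated copy of the response above (two of the four hits).
raw = """
{
  "took": 21,
  "timed_out": false,
  "hits": {
    "total": {"value": 4, "relation": "eq"},
    "max_score": null,
    "hits": [
      {"_id": "1", "_score": null, "_source": {"id": 1, "name": "床前明月光"}, "sort": [1]},
      {"_id": "2", "_score": null, "_source": {"id": 2, "name": "疑是地上霜"}, "sort": [2]}
    ]
  }
}
"""
response = json.loads(raw)
total = response["hits"]["total"]["value"]
names = [hit["_source"]["name"] for hit in response["hits"]["hits"]]
print(response["took"], total, names)
```

Because the request sorts by `id` rather than by relevance, `max_score` and each `_score` come back as `null`, which Python reads as `None`.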
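took: how long ES took to run the query, in milliseconds
timed_out: whether the search request timed out
_shards: how many shards were searched, with a breakdown of how many succeeded, failed, or were skipped
hits.max_score: the score of the most relevant document found
hits.total.value: how many matching documents were found
hits.sort: the document's sort position
hits._score: the document's relevance score (not applicable when using `match_all`)
Search by document id
GET /iktest/_doc/1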
Search by keyword
Search the `name` field of the `iktest` index:
GET /iktest/_search?q=name:"明月"
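The console accepts the query string as typed, but an HTTP client usually has to percent-encode the quotes and the Chinese characters. A short sketch with Python's standard library follows; the host `localhost:9200` is an assumption.

```python
from urllib.parse import urlencode

# Assumption: a local node on the default port.
base = "http://localhost:9200/iktest/_search"
query = urlencode({"q": 'name:"明月"'})
url = f"{base}?{query}"
print(url)
```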
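DSL queries are written in JSON, which makes them more flexible. A single request can contain both queries and filters, so complex queries are easy to build.
term query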
POST /iktest/_search
{
"query":{
"term":{
"id":2
}
}
}
terms query
Find the documents whose id is 2 or 4:
POST /iktest/_search
{
"query":{
"terms":{
"id":[2,4]
}
}
}
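range query
`range` is mainly used for filtering, typically to find a batch of documents within a given range. There are four keywords to remember:
gt: greater than
gte: greater than or equal to
lt: less than
lte: less than or equal to
POST /iktest/_search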
{
"query":{
"range":{
"id":{
"gte":2,
"lt":5
}
}
}
}
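To make the four keywords concrete, here is a small local sketch in Python that applies the same bounds to the `id` values of the four sample documents. It only mimics the comparison semantics; `in_range` is a hypothetical helper, not part of any ES client.

```python
import operator

# The four range keywords and the comparison each one stands for.
OPS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}

def in_range(value, bounds):
    """Return True if value satisfies every bound, e.g. {"gte": 2, "lt": 5}."""
    return all(OPS[key](value, limit) for key, limit in bounds.items())

ids = [1, 2, 3, 4]
matched = [i for i in ids if in_range(i, {"gte": 2, "lt": 5})]
print(matched)  # [2, 3, 4]
```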
exists query
POST /iktest/_search
{
"query":{
"exists":{
"field":"headImg"
}
}
}
match query
POST /iktest/_search
{
"query":{
"match":{
"name":"明月"
}
}
}
match_phrase query
POST /iktest/_search
{
"query":{
"match_phrase":{
"name":"明月 地上"
}
}
}
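bool query
A `bool` query combines the results of several query clauses with boolean logic. The operator keywords to remember are:
must: every clause must match, equivalent to and
must_not: the clauses must not match, equivalent to not
should: at least one clause should match, equivalent to or (when `must` or `filter` is also present, `should` clauses only affect scoring unless `minimum_should_match` is set)
Each of these parameters accepts either a single query clause or an array of query clauses.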
POST /iktest/_search
{
"query": {
"bool": {
"must": {
"match": {
"headImg": "http"
}
},
"must_not": {
"term": {
"id": 4
}
},
"should": [
{
"term": {
"name": "床前明月光"
}
},
{
"term": {
"name": "一起学习"
}
}
]
}
}
}
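Since each operator takes either one clause or an array of clauses, code that assembles bool queries often normalises both forms to a list. The sketch below shows one way to do that in Python. The helper names `as_list` and `bool_query` are hypothetical and not part of any ES client.

```python
import json

def as_list(clauses):
    """Accept a single clause (dict) or a list of clauses; always return a list."""
    return clauses if isinstance(clauses, list) else [clauses]

def bool_query(must=None, must_not=None, should=None, filter_clauses=None):
    """Assemble a bool query, leaving out the operators that were not given."""
    parts = {"must": must, "must_not": must_not, "should": should, "filter": filter_clauses}
    body = {name: as_list(value) for name, value in parts.items() if value is not None}
    return {"query": {"bool": body}}

query = bool_query(
    must={"match": {"headImg": "http"}},
    must_not={"term": {"id": 4}},
)
print(json.dumps(query, ensure_ascii=False, indent=2))
```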
filter query
POST /iktest/_search
{
"query":{
"bool":{
"filter":{
"match":{
"name":"明月"
}
}
}
}
}
Combining bool and filter
POST /iktest/_search
{
"query":{
"bool":{
"filter":{
"range":{
"id":{
"gte":2,
"lt":5
}
}
},
"must_not":{
"term":{
"id":4
}
},
"must":{
"match":{
"name":"明月"
}
}
}
}
}
avg: average
max: maximum
min: minimum
sum: sum
Compute the average
POST /iktest/_search
{
"aggs":{
"iktest":{
"avg":{
"field":"id"
}
}
},
"size":0
}
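Why add `"size": 0` when computing an average or a sum?
`size` controls how many documents are returned. Since we only want the aggregate over all matching documents, setting `size` to 0 returns just the aggregation result; otherwise ES would also return 10 documents by default.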
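As a sanity check on what the four metric aggregations compute, the same numbers can be worked out locally. This sketch assumes the index holds only the four bulk-inserted sample documents with `id` values 1 to 4.

```python
from statistics import mean

# Assumption: the `id` values of the four sample documents in iktest.
ids = [1, 2, 3, 4]

metrics = {
    "avg": mean(ids),
    "max": max(ids),
    "min": min(ids),
    "sum": sum(ids),
}
print(metrics)
```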
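cardinality (distinct count)
Aggregation scenarios often need de-duplication. ES provides the `cardinality` aggregation for this; it returns an approximate count of the distinct values of a field.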
POST /iktest/_search
{
"aggs":{
"iktest":{
"cardinality":{
"field":"id"
}
}
},
"size":0
}
value_count
Counts how many values there are:
POST /iktest/_search
{
"aggs":{
"iktest":{
"value_count":{
"field":"id"
}
}
},
"size":0,
"query":{
"match":{
"name":"明月"
}
}
}
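The difference between `value_count` and `cardinality` only shows up when values repeat. A tiny local illustration in Python follows, using made-up values because the sample documents in this note all have unique ids.

```python
# Hypothetical field values; the sample documents in this note have unique ids.
values = [1, 2, 2, 3, 3, 3]

value_count = len(values)        # counts every value, duplicates included
cardinality = len(set(values))   # counts distinct values only

print(value_count, cardinality)  # 6 3
```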
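terms aggregation
A `terms` aggregation groups documents into buckets by the value of the given field, one bucket per distinct value, and counts the documents in each bucket. By default the buckets are sorted by document count, largest first.
POST /iktest/_search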
{
"aggs":{
"iktest":{
"terms":{
"field":"id"
}
}
},
"size":0
}
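The bucketing idea can be sketched locally with `collections.Counter`. The field values below are hypothetical, chosen so that the buckets have different sizes. The `key` and `doc_count` names mirror the bucket fields ES returns, but this is only a model of the grouping.

```python
from collections import Counter

# Hypothetical values of one field across six documents.
field_values = ["a", "b", "a", "c", "a", "b"]

# One bucket per distinct value, largest doc_count first.
buckets = [
    {"key": key, "doc_count": count}
    for key, count in Counter(field_values).most_common()
]
print(buckets)
```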
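top_hits aggregation
SQL makes top-N problems easy to handle, and ES offers matching support. `top_hits` is usually combined with `terms` to fetch the top n documents of each group. The example below groups by `id` and takes the top 6 documents of each group:
POST /iktest/_search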
{
"aggs":{
"iktest":{
"terms":{
"field":"id"
},
"aggs":{
"count":{
"top_hits":{
"size":6
}
}
}
}
},
"size":0
}
range aggregation
Count how many documents have an `id` value in the range 1-2 and in the range 4-6. Each range includes `from` but excludes `to`.
POST /iktest/_search
{
"aggs":{
"id_ranges":{
"range":{
"field":"id",
"ranges":[
{"from":1,"to":2},
{"from":4,"to":6}
]
}
}
},
"size":0
}
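The inclusive `from` and exclusive `to` rule is easy to get wrong, so here is a local sketch of the same counting in Python. It assumes the four sample documents with `id` values 1 to 4.

```python
# Assumption: the `id` values of the four sample documents in iktest.
ids = [1, 2, 3, 4]
ranges = [{"from": 1, "to": 2}, {"from": 4, "to": 6}]

# `from` is inclusive, `to` is exclusive.
buckets = [
    {**r, "doc_count": sum(1 for i in ids if r["from"] <= i < r["to"])}
    for r in ranges
]
print(buckets)
```

With this data, `id` 2 falls outside the first range because `to` is excluded, so each bucket contains exactly one document.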