Elasticsearch 7.x

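Introduction

  • Elasticsearch is an open-source RESTful search engine built on the Apache Lucene library
  • Elasticsearch was released a few years after Solr. It provides a distributed, multi-tenant full-text search engine with an HTTP web interface (REST) and schema-free JSON documents. Official client libraries are available for Java, Groovy, PHP, Ruby, Perl, Python, .NET, and JavaScript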

Official site

  • [Official site] https://www.elastic.co/
  • [Downloads] https://www.elastic.co/downloads/

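Core concepts

  • Index
    • An index can be thought of as a database in a relational system
  • Type
    • A type is like a kind of table, for example a user table or an order table
    • Note:
      • In ES 5.x an index can have multiple types
      • In ES 6.x an index can have only one type
      • From ES 7.x onward the type concept is deprecated: the APIs are typeless and `_doc` remains only as a placeholder (types are removed completely in 8.x)
  • Mapping
    • A mapping defines the type and other properties of each field. It corresponds to a table schema in a relational database
  • Document
    • A document corresponds to a row in a relational database
  • Field
    • A field corresponds to a column of a relational table
  • Cluster
    • A cluster consists of one or more nodes and has the default name "elasticsearch"
  • Node
    • A member of the cluster: a single running Elasticsearch process, typically one per machine
  • Shards and replicas
    • Shards are either primary shards or replica shards; a replica is a copy of a primary shard
    • The data of an index is physically distributed across several primary shards, and each primary shard holds only part of the data
    • Each primary shard can have several copies, called replica shards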
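
The relationship between primary shards and replicas can be sketched with a small helper that builds a settings body for creating an index. This is only an illustration: the function names are hypothetical, while `number_of_shards` and `number_of_replicas` are the setting names that appear in the index responses in this document.

```python
def build_create_index_body(primary_shards: int = 1, replicas: int = 1) -> dict:
    """Build a settings body that could be sent with `PUT localhost:9200/<index>`."""
    if primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("the replica count cannot be negative")
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Every primary shard has `replicas` copies, so count primaries plus replicas."""
    return primary_shards * (1 + replicas)


body = build_create_index_body(primary_shards=3, replicas=2)
print(body)
print(total_shard_copies(3, 2))  # 3 primaries plus 6 replica shards
```

With the defaults (1 primary, 1 replica) a single-node cluster cannot place the replica on a different node, which is consistent with the yellow health shown for `stu` and `tea` in the `_cat/indices` output later in this document.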

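Field types

Core data types

Category | Type | Description
String | text | Used for full-text indexing; values of this type are split into terms by an analyzer
String | keyword | Not analyzed; only the complete value of the field can be searched
Numeric | long, integer, short, byte, double, float, half_float, scaled_float | -
Boolean | boolean | -
Binary | binary | The value is treated as a Base64-encoded string; it is not stored by default and is not searchable
Range | integer_range, float_range, long_range, double_range, date_range | A range type represents a range of values rather than a single value. For example, if age is of type integer_range, a value can be {"gte" : 20, "lte" : 40}, and the query "term" : {"age": 21} matches that value
Date | date | JSON has no date type, so ES decides whether a string is a date by checking it against the formats defined in format. The default format is strict_date_optional_time||epoch_millis, which accepts strings such as "2022-01-01" or the number of milliseconds since the epoch (1970-01-01 00:00 UTC). A string such as "2022/01/01 12:10:30" needs a matching custom format, for example yyyy/MM/dd HH:mm:ss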
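
To make the range-type row concrete, here is a minimal sketch of the containment check that a term query on a range field performs conceptually. The helper `range_contains` is hypothetical and runs locally; it is not part of any Elasticsearch client.

```python
def range_contains(range_value: dict, point) -> bool:
    """Return True if `point` satisfies every bound (gt/gte/lt/lte) of the range."""
    checks = {
        "gte": lambda bound: point >= bound,
        "gt": lambda bound: point > bound,
        "lte": lambda bound: point <= bound,
        "lt": lambda bound: point < bound,
    }
    return all(checks[name](bound) for name, bound in range_value.items())


age = {"gte": 20, "lte": 40}
print(range_contains(age, 21))  # the stored range contains 21
print(range_contains(age, 41))  # 41 lies outside the stored range
```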

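Complex data types

  • Array

    • ES has no dedicated array type; simply use []. All values in an array must be of the same data type; arrays of mixed types are not supported
    • String array [ "one", "two" ], integer array [ 1, 2 ]
    • Array of objects [ { "name": "Louis", "age": 18 }, { "name": "Daniel", "age": 17 }]
    • For example, [ 10, "some string" ] is invalid because it mixes a number and a string
  • Object

    • An object can contain inner objects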

      {
          "name": "李蒙",
          "age": 14,
          "sex": "0",
          "class": "7(2)班",
          "birthday": "2005-10-15",
          "hobbies": [
              "阅读",
              "跑步"
          ],
          "address":
          {
              "province": "山东",
              "location":
              {
                  "city": "日照"
              }
          }
      }
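
Internally, nested objects such as `address` above are indexed as a flat list of dotted field names. The sketch below shows that flattening locally; `flatten` is a hypothetical helper written for illustration, not an Elasticsearch API.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted paths, e.g. address.location.city."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value  # leaf values (including arrays) are kept as-is
    return flat


student = {
    "name": "李蒙",
    "hobbies": ["阅读", "跑步"],
    "address": {"province": "山东", "location": {"city": "日照"}},
}
print(flatten(student))
```

This is why an inner field is addressed in queries by its full path, such as `address.location.city`.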
      

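Specialized data types

  • IP

    A field of type ip stores IPv4 or IPv6 addresses. Only the old IPv4-only implementation was backed by a long field; current versions are not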

Indices

Operation | Method | URL | Parameters
Create | PUT (required) | localhost:9200/stu | -
Get | GET | localhost:9200/stu | -
Delete | DELETE | localhost:9200/stu | -
Get multiple | GET | localhost:9200/stu,tea | -
Get all (1) | GET | localhost:9200/_all | -
Get all (2) | GET | localhost:9200/_cat/indices?v | -
Exists | HEAD | localhost:9200/stu | -
Close | POST | localhost:9200/stu/_close | -
Open | POST | localhost:9200/stu/_open | -
Auto-create indices | PUT | localhost:9200/_cluster/settings | see below
Copy data (reindex) | POST | localhost:9200/_reindex | see below
  • Create

    PUT localhost:9200/stu

    // Response
    {
        "acknowledged": true,
        "shards_acknowledged": true,
        "index": "stu"
    }
    
  • Get

    GET localhost:9200/stu

    // Response
    {
        "stu": {
            "aliases": {},// aliases
            "mappings": {},// mappings
            "settings": {
                "index": {
                    "creation_date": "1576139082806",// creation time
                    "number_of_shards": "1",// primary shards
                    "number_of_replicas": "1",// replicas
                    "uuid": "-ocQkbgoSyG2vDTsugK_9Q",
                    "version": {
                        "created": "7020099"
                    },
                    "provided_name": "stu"
                }
            }
        }
    }
    
  • Delete

    DELETE localhost:9200/stu

    {
        "acknowledged": true
    }
    
  • Get multiple

    GET localhost:9200/stu,tea

    // Response
    {
        "stu": {
            "aliases": {},
            "mappings": {},
            "settings": {
                "index": {
                    "creation_date": "1576139586417",
                    "number_of_shards": "1",
                    "number_of_replicas": "1",
                    "uuid": "H9dyTutEQg-4OsV2Byt-gA",
                    "version": {
                        "created": "7020099"
                    },
                    "provided_name": "stu"
                }
            }
        },
        "tea": {
            "aliases": {},
            "mappings": {},
            "settings": {
                "index": {
                    "creation_date": "1576139593175",
                    "number_of_shards": "1",
                    "number_of_replicas": "1",
                    "uuid": "nYhKuggbT_Wa2RI-M_COGA",
                    "version": {
                        "created": "7020099"
                    },
                    "provided_name": "tea"
                }
            }
        }
    }
    
  • Get all (1)

    GET localhost:9200/_all

    // Response
    {
        "stu": {
            "aliases": {},
            "mappings": {},
            "settings": {
                "index": {
                    "creation_date": "1576139586417",
                    "number_of_shards": "1",
                    "number_of_replicas": "1",
                    "uuid": "H9dyTutEQg-4OsV2Byt-gA",
                    "version": {
                        "created": "7020099"
                    },
                    "provided_name": "stu"
                }
            }
        },
        "tea": {
            "aliases": {},
            "mappings": {},
            "settings": {
                "index": {
                    "creation_date": "1576139593175",
                    "number_of_shards": "1",
                    "number_of_replicas": "1",
                    "uuid": "nYhKuggbT_Wa2RI-M_COGA",
                    "version": {
                        "created": "7020099"
                    },
                    "provided_name": "tea"
                }
            }
        }
    }
    
  • Get all (2)

    GET localhost:9200/_cat/indices?v

    // Response
    health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana_task_manager            IjZxE0H9TtmTrpBgzjr-qg   1   0          2            0     12.8kb         12.8kb
    yellow open   stu                             H9dyTutEQg-4OsV2Byt-gA   1   1          0            0       283b           283b
    yellow open   tea                             nYhKuggbT_Wa2RI-M_COGA   1   1          0            0       283b           283b
    
  • Exists

    HEAD localhost:9200/stu

    // Response when the index exists
    200 OK
    
  • Close

    POST localhost:9200/stu/_close

    // Response
    {
        "acknowledged": true,
        "shards_acknowledged": true
    }
    
  • Open

    POST localhost:9200/stu/_open

    // Response
    {
        "acknowledged": true,
        "shards_acknowledged": true
    }
    
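  • Auto-create indices

    Controls whether an index is created automatically when a document is inserted into an index that does not exist yet (inserting documents is covered later in this document)

    Send GET http://localhost:9200/_cluster/settings to check the state of auto_create_index

    true means indices are created automatically

    • Change the auto_create_index setting

      PUT localhost:9200/_cluster/settings

      // Parameters
      {
          "persistent": {
              "action.auto_create_index": "true"// true or false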
          }
      }
      
  • Copy data (combined with index aliases, this lets you rebuild an index and import its data)

    POST localhost:9200/_reindex

    {
      "source": {
        "index": "stu"
      },
      "dest": {
        "index": "stu_oth"
      }
    }
    

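Index aliases

During development, as requirements evolve, older business logic has to be updated or even rewritten. For ES, adapting to new business logic may mean changing existing indices, for example adjusting some fields or even rebuilding an index. Such operations can affect the running service and may even require downtime. ES provides index aliases to address this. An index alias works like a shortcut or symbolic link: it can point to one or more indices and can be passed to any API that expects an index name. Aliases give applications a great deal of flexibility.

Multiple indices can share the same alias, and a single index can have multiple aliases.

Operation | Method | URL | Parameters
Query | GET | localhost:9200/_alias; localhost:9200/stu/_alias | -
Add | POST | localhost:9200/_aliases | see below
Add | PUT | localhost:9200/stu/_alias/stu_v1.0 | -
Remove | POST | localhost:9200/_aliases | see below
Remove | DELETE | localhost:9200/stu/_alias/stu_v1.0 | -
Rename | POST | localhost:9200/_aliases | see below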
  • Add

    POST localhost:9200/_aliases

    {
      "actions": [
        {
          "add": {
            "index": "stu",
            "alias": "stu_1214"
          }
        }
      ]
    }
    
  • Remove

    POST localhost:9200/_aliases

    {
      "actions": [
        {
          "remove": {
            "index": "stu",
            "alias": "stu_v1.1"
          }
        }
      ]
    }
    
  • Rename

    POST localhost:9200/_aliases

    {
      "actions": [
        {
          "remove": {
            "index": "stu",
            "alias": "stu_1214"
          }
        },
        {
          "add": {
            "index": "stu",
            "alias": "stu_1215"
          }
        }
      ]
    }
    
  • When an alias points to multiple indices, one of them can be designated as the write index

    POST localhost:9200/_aliases

    {
      "actions": [
        {
          "add": {
            "index": "stu",
            "alias": "alia_v1.0",
            "is_write_index": true
          }
        },
        {
          "add": {
            "index": "tea",
            "alias": "alia_v1.0"
          }
        }
      ]
    }
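
The rebuild workflow described in this section (copy the data into a new index, then move the alias) might be sketched as follows. The helpers only build the two request bodies; the alias name `stu_alias` is a hypothetical example, while `stu` and `stu_oth` are the index names used in the reindex example earlier.

```python
import json


def build_reindex_body(source_index: str, dest_index: str) -> dict:
    """Body for `POST localhost:9200/_reindex`."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def build_alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Body for `POST localhost:9200/_aliases` that moves an alias in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


reindex_body = build_reindex_body("stu", "stu_oth")
swap_body = build_alias_swap_body("stu_alias", "stu", "stu_oth")
print(json.dumps(reindex_body, indent=2))
print(json.dumps(swap_body, indent=2))
```

Because both actions travel in a single `_aliases` request, clients that query through the alias never observe a moment where it points at no index.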
    

Mappings

Operation | Method | URL | Parameters
Add | PUT | localhost:9200/stu/_mapping | see below
Get | GET | localhost:9200/stu/_mapping | -
Get multiple | GET | localhost:9200/stu,tea/_mapping | -
Get all (1) | GET | localhost:9200/_mapping | -
Get all (2) | GET | localhost:9200/_all/_mapping | -
Update | PUT | localhost:9200/stu/_mapping | see below
  • Add

    PUT localhost:9200/stu/_mapping

    // Parameters
    {
        "properties":
        {
            "name":
            {
                "type": "text"
            },
            "age": {
                "type": "long"
            },
            "sex": {
                "type": "keyword"
            },
            "class": {
                "type": "keyword"
            }
        }
    }
    
  • Get

    GET localhost:9200/stu/_mapping

    // Response
    {
        "stu": {
            "mappings": {
                "properties": {
                    "age": {
                        "type": "long"
                    },
                    "class": {
                        "type": "keyword"
                    },
                    "name": {
                        "type": "text"
                    },
                    "sex": {
                        "type": "keyword"
                    }
                }
            }
        }
    }
    
  • Get multiple

    GET localhost:9200/stu,tea/_mapping

    // Response
    {
        "tea": {
            "mappings": {}
        },
        "stu": {
            "mappings": {
                "properties": {
                    "age": {
                        "type": "long"
                    },
                    "class": {
                        "type": "keyword"
                    },
                    "name": {
                        "type": "text"
                    },
                    "sex": {
                        "type": "keyword"
                    }
                }
            }
        }
    }
    
  • Get all (1)

    GET localhost:9200/_mapping

    // Response
    {
        "stu": {
            "mappings": {
                "properties": {
                    "age": {
                        "type": "long"
                    },
                    "class": {
                        "type": "keyword"
                    },
                    "name": {
                        "type": "text"
                    },
                    "sex": {
                        "type": "keyword"
                    }
                }
            }
        },
        "tea": {
            "mappings": {}
        }
    }
    
  • Get all (2)

    GET localhost:9200/_all/_mapping

    // Response
    {
        "stu": {
            "mappings": {
                "properties": {
                    "age": {
                        "type": "long"
                    },
                    "class": {
                        "type": "keyword"
                    },
                    "name": {
                        "type": "text"
                    },
                    "sex": {
                        "type": "keyword"
                    }
                }
            }
        },
        "tea": {
            "mappings": {}
        }
    }
    
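  • Update

    Note:

    When updating a mapping you can only add new fields; the type of an existing field cannot be changed, and existing fields cannot be deleted

    PUT localhost:9200/stu/_mapping

    // Parameters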
    {
        "properties":
        {
            "name":
            {
                "type": "text"
            },
            "age": {
                "type": "long"
            },
            "sex": {
                "type": "keyword"
            },
            "class": {
                "type": "keyword"
            },
            "birthday": {
                "type": "date"
            }
        }
    }
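
The add-only rule for mapping updates can be checked locally before sending a request. The sketch below is a toy check with a hypothetical helper name; it treats any difference in an existing field definition as a conflict, which is an assumption that is stricter than what Elasticsearch itself enforces.

```python
def split_mapping_update(existing: dict, proposed: dict):
    """Split proposed properties into (new_fields, conflicts) against existing ones."""
    new_fields, conflicts = {}, {}
    for name, definition in proposed.items():
        if name not in existing:
            new_fields[name] = definition  # safe: a brand-new field
        elif existing[name] != definition:
            conflicts[name] = definition  # would change an existing field
    return new_fields, conflicts


existing = {
    "name": {"type": "text"},
    "age": {"type": "long"},
    "sex": {"type": "keyword"},
    "class": {"type": "keyword"},
}
proposed = dict(existing, birthday={"type": "date"})
print(split_mapping_update(existing, proposed))
```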
    

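Documents

Operation | Method | URL | Parameters
Create (with id) | PUT | localhost:9200/stu/_doc/1 | see below
Create (without id) | POST (required) | localhost:9200/stu/_doc | see below
Specify the operation type | PUT | localhost:9200/stu/_doc/1?op_type=create | see below
Get | GET | localhost:9200/stu/_doc/1 | -
Get multiple documents | POST | localhost:9200/_mget | see below
Update | POST | localhost:9200/stu/_update/1 | see below
Delete | DELETE | localhost:9200/stu/_doc/1 | -
Delete all | POST | localhost:9200/stu/_delete_by_query | see below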
  • Create (with id)

    PUT localhost:9200/stu/_doc/1

    // Parameters
    {
        "name": "杨光",
        "age": 14,
        "sex": "1",
        "class": "7(2)班",
        "birthday": "2005-08-26"
    }
    
    // Response
    {
        "_index": "stu",
        "_type": "_doc",
        "_id": "1",
        "_version": 1,
        "result": "created",
        "_shards": {
            "total": 2,
            "successful": 1,
            "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 3
    }
    
  • Create (without id)

    If no id is given, the system assigns one automatically

    POST localhost:9200/stu/_doc

    // Parameters
    {
        "name": "张世杰",
        "age": 13,
        "sex": "0",
        "class": "7(5)班",
        "birthday": "2004-11-01"
    }
    
    // Response
    {
        "_index": "stu",
        "_type": "_doc",
        "_id": "C_SI-W4Bj7nk6pLmw4Er",// id assigned by the system
        "_version": 1,
        "result": "created",
        "_shards": {
            "total": 2,
            "successful": 1,
            "failed": 0
        },
        "_seq_no": 1,
        "_primary_term": 3
    }
    
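  • Specify the operation type

    If no operation type is given, writing to an id that already exists replaces the existing document and produces a new version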

    PUT localhost:9200/stu/_doc/1

    {
     "name": "杨光11",
     "age": 14,
     "sex": "1",
     "class": "7(2)班",
     "birthday": "2005-08-26"
    }
    
    {
     "_index": "stu",
     "_type": "_doc",
     "_id": "1",
     "_version": 2,// a new version is created
     "result": "updated",// the result is updated rather than created
     "_shards": {
         "total": 2,
         "successful": 1,
         "failed": 0
     },
     "_seq_no": 2,
     "_primary_term": 3
    }
    

    PUT localhost:9200/stu/_doc/1?op_type=create (writing to an id that already exists returns an error)

    // Parameters
    {
        "name": "杨光22",
        "age": 14,
        "sex": "1",
        "class": "7(2)班",
        "birthday": "2005-08-26"
    }
    
    // Response
    {
        "error": {
            "root_cause": [
                {
                    "type": "version_conflict_engine_exception",
                    "reason": "[1]: version conflict, document already exists (current version [2])",
                    "index_uuid": "H9dyTutEQg-4OsV2Byt-gA",
                    "shard": "0",
                    "index": "stu"
                }
            ],
            "type": "version_conflict_engine_exception",
            "reason": "[1]: version conflict, document already exists (current version [2])",
            "index_uuid": "H9dyTutEQg-4OsV2Byt-gA",
            "shard": "0",
            "index": "stu"
        },
        "status": 409
    }
    
  • Get

    GET localhost:9200/stu/_doc/1

    // Response
    {
        "_index": "stu",
        "_type": "_doc",
        "_id": "1",
        "_version": 2,
        "_seq_no": 2,
        "_primary_term": 3,
        "found": true,
        "_source": {
            "name": "杨光11",
            "age": 14,
            "sex": "1",
            "class": "7(2)班",
            "birthday": "2005-08-26"
        }
    }
    
  • Get multiple documents

    1. Option 1

      POST localhost:9200/_mget

      // Parameters
      {
          "docs": [
          {
              "_index": "stu",
              "_type": "_doc",
              "_id": "1"
          },
          {
              "_index": "stu",
              "_type": "_doc",
              "_id": "C_SI-W4Bj7nk6pLmw4Er"
          }]
      }
      
      // Response
      {
          "docs": [
              {
                  "_index": "stu",
                  "_type": "_doc",
                  "_id": "1",
                  "_version": 3,
                  "_seq_no": 3,
                  "_primary_term": 3,
                  "found": true,
                  "_source": {
                      "name": "杨光33",
                      "age": 14,
                      "sex": "1",
                      "class": "7(2)班",
                      "birthday": "2005-08-26"
                  }
              },
              {
                  "_index": "stu",
                  "_type": "_doc",
                  "_id": "C_SI-W4Bj7nk6pLmw4Er",
                  "_version": 1,
                  "_seq_no": 1,
                  "_primary_term": 3,
                  "found": true,
                  "_source": {
                      "name": "张世杰",
                      "age": 13,
                      "sex": "0",
                      "class": "7(5)班",
                      "birthday": "2004-11-01"
                  }
              }
          ]
      }
      
    2. Option 2

      POST localhost:9200/stu/_mget

      // Parameters
      {
          "docs": [
          {
              "_type": "_doc",
              "_id": "1"
          },
          {
              "_type": "_doc",
              "_id": "C_SI-W4Bj7nk6pLmw4Er"
          }]
      }
      
    3. Option 3

      POST localhost:9200/stu/_doc/_mget

      // Parameters
      {
          "docs": [
          {
              "_id": "1"
          },
          {
              "_id": "C_SI-W4Bj7nk6pLmw4Er"
          }]
      }
      
  • Update

    1. Update the document from a supplied partial document

      POST localhost:9200/stu/_update/1

      // Parameters
      {
          "doc": {
              "name": "杨光33",
              "age": 14,
              "sex": "1",
              "class": "7(2)班",
              "birthday": "2005-08-26"
          }
      }
      
    2. Add a field to _source

      POST localhost:9200/stu/_update/1

      // Parameters
      {
         "script": "ctx._source.height = \"173cm\""
      }
      
    3. Remove a field from _source

      POST localhost:9200/stu/_update/1

      // Parameters
      {
         "script": "ctx._source.remove(\"height\")"
      }
      
  • Delete

    DELETE localhost:9200/stu/_doc/1

    // Response
    {
        "_index": "stu",
        "_type": "_doc",
        "_id": "1",
        "_version": 6,
        "result": "deleted",
        "_shards": {
            "total": 2,
            "successful": 1,
            "failed": 0
        },
        "_seq_no": 6,
        "_primary_term": 3
    }
    
  • Delete all

    POST localhost:9200/stu/_delete_by_query

    // Parameters
    {
        "query": {
            "match_all": {
                
            }
        }   
    }
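
The difference between the default write behavior and `op_type=create` shown in this section can be modeled with a small in-memory store. This is a toy model only: the class and exception names are hypothetical, and it mirrors just the `_version` and `result` fields seen in the responses above.

```python
class VersionConflict(Exception):
    """Raised when op_type=create targets an id that already exists."""


class ToyIndex:
    def __init__(self):
        self._docs = {}  # id -> (version, source)

    def put(self, doc_id: str, source: dict, op_type: str = "index") -> dict:
        current = self._docs.get(doc_id)
        if current is not None and op_type == "create":
            raise VersionConflict(f"[{doc_id}]: document already exists")
        version = 1 if current is None else current[0] + 1
        self._docs[doc_id] = (version, source)
        result = "created" if current is None else "updated"
        return {"_id": doc_id, "_version": version, "result": result}


index = ToyIndex()
print(index.put("1", {"name": "杨光"}))    # first write creates version 1
print(index.put("1", {"name": "杨光11"}))  # second write replaces it as version 2
try:
    index.put("1", {"name": "杨光22"}, op_type="create")
except VersionConflict as error:
    print("conflict:", error)
```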
    

Search queries

Data preparation: import the data in bulk. ES provides an API called bulk for batch operations

  • Data

    {"index": {"_index": "stu", "_type": "_doc", "_id": 1}}
    {"name":"杨光","age":14,"sex":"1","class":"7(2)班","birthday":"2005-08-26"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 2}}
    {"name":"张世杰","age":13,"sex":"0","class":"7(5)班","birthday":"2004-11-01"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 3}}
    {"name":"李蒙","age":14,"sex":"0","class":"7(2)班","birthday":"2005-10-15"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 4}}
    {"name":"李沁","age":15,"sex":"0","class":"7(3)班","birthday":"2004-10-15"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 5}}
    {"name":"王昭","age":14,"sex":"1","class":"7(3)班","birthday":"2005-01-26"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 6}}
    {"name":"李明","age":14,"sex":"1","class":"7(2)班","birthday":"2005-03-26"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 7}}
    {"name":"张璐","age":14,"sex":"1","class":"7(5)班","birthday":"2005-06-02"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 8}}
    {"name":"李思敏","age":14,"sex":"1","class":"7(3)班","birthday":"2005-06-02"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 9}}
    {"name":"吴民锡","age":13,"sex":"1","class":"7(5)班","birthday":"2006-04-02"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 10}}
    {"name":"赵曦","age":14,"sex":"0","class":"7(2)班","birthday":"2005-09-02"}
    
    
  • POST bulk

    curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/json' --data-binary @name
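
The bulk file used above has a strict shape: one action line followed by one source line per document, each on its own line, with a trailing newline at the end. A minimal sketch for generating it, assuming a hypothetical helper name and only two of the sample documents:

```python
import json


def build_bulk_body(index: str, docs) -> str:
    """Build a newline-delimited bulk body from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_id": doc_id}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API expects the body to end with a newline character.
    return "\n".join(lines) + "\n"


docs = [
    (1, {"name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26"}),
    (2, {"name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01"}),
]
bulk_body = build_bulk_body("stu", docs)
print(bulk_body, end="")
```

The output can be saved to a file and passed to the curl command above through `--data-binary @<file>`. The sketch omits `_type` from the action lines because types are deprecated in 7.x.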
    

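Term queries

Term-level queries do not analyze the search input; a document matches only when an indexed term is exactly equal to the query string. These queries are typically used for structured data such as number, date, and keyword fields rather than text fields. In other words, a full-text query analyzes the query text before searching, whereas a term-level query looks up the exact term directly in the field's inverted index.

Operation | Method | URL | Parameters | Description
Single-term query | POST | localhost:9200/stu/_search | see below | -
Multi-term query | POST | localhost:9200/stu/_search | see below | -
Exists Query | POST | localhost:9200/stu/_search | see below | Finds documents that have a non-null value in a given field
Prefix Query | POST | localhost:9200/stu/_search | see below | Finds documents containing a term with the given prefix
Wildcard Query | POST | localhost:9200/stu/_search | see below | Supports wildcards: * matches any sequence of characters, ? matches any single character
Regexp Query | POST | localhost:9200/stu/_search | see below | Regular-expression query
Ids Query | POST | localhost:9200/stu/_search | see below | Finds documents by id
  • Single-term query

    POST localhost:9200/stu/_search

    // Parameters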
    {
        "query":{
            "term":{
                "sex": "1"
            }
        }
    }
    
    // Response
    {
        "took": 1,
        "timed_out": false,
        "_shards": {
            "total": 1,
            "successful": 1,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": {
                "value": 1,
                "relation": "eq"
            },
            "max_score": 0.9808292,
            "hits": [
                {
                    "_index": "stu",
                    "_type": "_doc",
                    "_id": "1",
                    "_score": 0.9808292,
                    "_source": {
                        "name": "杨光",
                        "age": 14,
                        "sex": "1",
                        "class": "7(2)班",
                        "birthday": "2005-08-26"
                    }
                }
            ]
        }
    }
    
  • Multi-term query

    POST localhost:9200/stu/_search

    // Parameters
    {
        "query":{
            "terms":{
                "sex": ["0","1"]
            }
        }
    }
    
  • Exists Query

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "exists": {
          "field": "birthday"
        }
      }
    }
    
  • Prefix Query

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "prefix": {
          "class": {
            "value": "7"
          }
        }
      }
    }
    
  • Wildcard Query

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "wildcard": {
          "class": {
            "value": "*2*"
          }
        }
      }
    }
    
  • Regexp Query

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "regexp": {
          "class": "7.*"
        }
      }
    }
    
  • Ids Query

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "ids": {
          "values": [1,2]
        }
      }
    }
    

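Full-text queries

Elasticsearch first analyzes the query string and splits it into terms. A document matches if the analyzed field contains any of those terms (or all of them, depending on the query type and operator); if it contains none of them, no documents match the query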
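
The contrast between a full-text match and a term lookup can be illustrated with a toy inverted index. The sketch assumes an analyzer that emits one term per character, which is roughly how the default standard analyzer treats Chinese text; all function names are hypothetical and nothing here calls Elasticsearch.

```python
from collections import defaultdict


def analyze(text: str) -> list:
    """Toy analyzer: one term per non-space character."""
    return [ch for ch in text if not ch.isspace()]


def build_inverted_index(docs: dict) -> dict:
    """Map every term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def term_query(index: dict, value: str) -> set:
    """Look up the value as-is, without analyzing it."""
    return set(index.get(value, set()))


def match_query(index: dict, text: str) -> set:
    """Analyze the query text, then return documents containing any of its terms."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return hits


names = {1: "杨光", 2: "张世杰", 7: "张璐"}
index = build_inverted_index(names)
print(match_query(index, "张"))     # ids of the documents whose name contains 张
print(term_query(index, "张世杰"))  # empty: no single indexed term equals 张世杰
```

This is why term queries in this document target keyword fields such as `sex` and `class`, while `match` is used on the text field `name`.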

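Type | Method | URL | Parameters | Description
match_all | POST | localhost:9200/stu/_search | see below | Matches all documents
match | POST | localhost:9200/stu/_search | see below | Analyzed (tokenized) match
multi_match | POST | localhost:9200/stu/_search | see below | Match across multiple fields
match_phrase | POST | localhost:9200/stu/_search | see below | Phrase match: the terms must appear together and in order
match_phrase_prefix | POST | localhost:9200/stu/_search | see below | Phrase prefix match (text fields)
  • match_all

    POST localhost:9200/stu/_search

    // Parameters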
    {
        "query":{
            "match_all":{}
        },
        "from": 0,
        "size": 10
    }
    
  • match

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "match": {
          "name": "张"
        }
      }
    }
    
  • multi_match

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "multi_match": {
          "query": "世",// search text
          "fields": ["name","class"]// fields to search
        }
      }
    }
    
  • match_phrase

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "match_phrase": {
          "class": "7(2)班"
        }
      }
    }
    
  • match_phrase_prefix

    POST localhost:9200/stu/_search

    // Parameters
    {
      "query": {
        "match_phrase_prefix": {
          "name": "世杰"
        }
      }
    }
    

Range queries

Range queries work on dates, numbers, or strings

POST localhost:9200/stu/_search

// Find students aged 14 to 15
{
  "query": {
    "range": {
      "age": {
        "gte": 14,
        "lte": 15
      }
    }
  }
}
// Find students born between 2003 and 2004
{
  "query": {
    "range": {
      "birthday": {
        "gte": "2003",
        "lte": "31-12-2004",
        "format": "dd-MM-yyyy||yyyy"
      }
    }
  }
}
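
The two range bodies above share one shape, so a small builder can produce both. The helper names below are hypothetical; the local filter only mirrors the inclusive `gte`/`lte` semantics on the ages from the sample data and does not query Elasticsearch.

```python
def build_range_query(field: str, gte=None, lte=None, fmt=None) -> dict:
    """Build a range query body; bounds that are None are left out."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    if fmt is not None:
        bounds["format"] = fmt  # only meaningful for date fields
    return {"query": {"range": {field: bounds}}}


def in_range(value, gte, lte) -> bool:
    """Inclusive range check, matching gte/lte."""
    return gte <= value <= lte


age_query = build_range_query("age", gte=14, lte=15)
birthday_query = build_range_query(
    "birthday", gte="2003", lte="31-12-2004", fmt="dd-MM-yyyy||yyyy"
)

ages = [14, 13, 14, 15, 14, 14, 14, 14, 13, 14]  # ages of the ten sample students
matching = [age for age in ages if in_range(age, 14, 15)]
print(age_query)
print(birthday_query)
print(len(matching))
```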

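Bool queries

Clause | Method | URL | Parameters | Description
must | POST | localhost:9200/stu/_search | see below | The clause must match in returned documents
filter | POST | localhost:9200/stu/_search | see below | The clause must match in returned documents, but it does not contribute to the score
must_not | POST | localhost:9200/stu/_search | see below | The clause must not match in returned documents
should | POST | localhost:9200/stu/_search | see below | The clause should match in returned documents
  • must

    POST localhost:9200/stu/_search

    // Find students whose sex is "0" and whose name contains "曦"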
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "name": "曦"
              }
            },
            {
              "term": {
                "sex": {
                  "value": "0"
                }
              }
            }
          ]
        }
      }
    }
    
    // Response
    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 2.9985561,
        "hits" : [
          {
            "_index" : "stu",
            "_type" : "_doc",
            "_id" : "10",
            "_score" : 2.9985561,// score
            "_source" : {
              "name" : "赵曦",
              "age" : 14,
              "sex" : "0",
              "class" : "7(2)班",
              "birthday" : "2005-09-02"
            }
          }
        ]
      }
    }
    
    
  • filter

    Matches the same documents as must, but does not score them

    POST localhost:9200/stu/_search

    {
      "query": {
        "bool": {
          "filter": [
            {
              "match": {
                "name": "曦"
              }
            },
            {
              "term": {
                "sex": {
                  "value": "0"
                }
              }
            }
          ]
        }
      }
    }
    
  • must_not

    POST localhost:9200/stu/_search

    // Find students whose name contains "张" and whose sex is not "0"
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "name": "张"
              }
            }
          ],
          "must_not": [
            {
              "term": {
                "sex": {
                  "value": "0"
                }
              }
            }
          ]
        }
      }
    }
    
  • should

    POST localhost:9200/stu/_search

    // Find students whose sex is "1"
    {
      "query": {
        "bool": {
          "should": [
            {
              "term": {
                "sex": {
                  "value": "1"
                }
              }
            }
          ]
        }
      }
    }
    

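    When should is combined with other clauses, documents that do not match the should clause are still returned; only their score differs

    // Find students whose name contains "李"; an age between 13 and 14 is preferred but not required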
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "name": "李"
              }
            }
          ],
          "should": [
            {
              "range": {
                "age": {
                  "gte": 13,
                  "lte": 14
                }
              }
            }
          ]
        }
      }
    }
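
How the four clause types interact can be sketched with a toy evaluator over plain Python predicates. Everything here is illustrative: the function name is hypothetical and the score is a simple count of matching scoring clauses, not Elasticsearch's relevance score.

```python
def bool_query(docs, must=(), filter_=(), must_not=(), should=()):
    """Toy bool query: returns (name, score) pairs, best score first."""
    required = list(must) + list(filter_)
    results = []
    for doc in docs:
        if not all(clause(doc) for clause in required):
            continue
        if any(clause(doc) for clause in must_not):
            continue
        should_hits = sum(1 for clause in should if clause(doc))
        # Without must/filter clauses, at least one should clause has to match.
        if should and not required and should_hits == 0:
            continue
        # filter clauses select documents but add nothing to the score.
        score = sum(1 for clause in must if clause(doc)) + should_hits
        results.append((doc["name"], score))
    return sorted(results, key=lambda item: -item[1])


students = [
    {"name": "李蒙", "age": 14},
    {"name": "李沁", "age": 15},
    {"name": "李明", "age": 14},
    {"name": "李思敏", "age": 14},
    {"name": "赵曦", "age": 14},
]
name_has_li = lambda doc: "李" in doc["name"]
age_13_to_14 = lambda doc: 13 <= doc["age"] <= 14

print(bool_query(students, must=[name_has_li], should=[age_13_to_14]))
```

In the output, 李沁 (age 15) is still returned because only the must clause is mandatory, but it ranks last since it misses the should clause.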
    

Sorting

POST localhost:9200/stu/_search

// Find students in class 7(5), sorted by age in descending order
{
  "query": {
    "term": {
      "class": {
        "value": "7(5)班"
      }
    }
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

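Aggregations

  • Aggregation analysis is an important database feature: it performs aggregate calculations over the data set returned by a query, such as finding the maximum or minimum of a field (or of a computed expression), or computing a sum or an average. As a search engine that also serves as a data store, ES provides powerful aggregation capabilities too
  • Aggregations that compute metrics such as max, min, sum, and average over a data set are called metric aggregations in ES
  • Besides aggregate functions, relational databases can group query results with group by and then compute metrics per group. In ES this is called a bucket aggregation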
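
To make the two kinds of aggregation concrete, the sketch below computes them locally over the class and age values of the ten sample students. The helper names are hypothetical; the functions only mimic what a metric aggregation and a terms bucket aggregation ordered by a sub-aggregation return.

```python
from collections import defaultdict

students = [
    {"class": "7(2)班", "age": 14}, {"class": "7(5)班", "age": 13},
    {"class": "7(2)班", "age": 14}, {"class": "7(3)班", "age": 15},
    {"class": "7(3)班", "age": 14}, {"class": "7(2)班", "age": 14},
    {"class": "7(5)班", "age": 14}, {"class": "7(3)班", "age": 14},
    {"class": "7(5)班", "age": 13}, {"class": "7(2)班", "age": 14},
]


def metric_aggs(docs, field):
    """Metric aggregation: count, min, max, sum and avg of one field."""
    values = [doc[field] for doc in docs if doc.get(field) is not None]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "avg": sum(values) / len(values),
    }


def terms_agg_by_avg(docs, bucket_field, metric_field):
    """Bucket aggregation: group by a field, then order buckets by average, descending."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[bucket_field]].append(doc[metric_field])
    buckets = [
        {"key": key, "doc_count": len(vals), "avg": sum(vals) / len(vals)}
        for key, vals in groups.items()
    ]
    return sorted(buckets, key=lambda bucket: bucket["avg"], reverse=True)


class_73 = [doc for doc in students if doc["class"] == "7(3)班"]
print(metric_aggs(class_73, "age"))
print(terms_agg_by_avg(students, "class", "age"))
```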

Metric aggregations

  • max min sum avg

    POST localhost:9200/stu/_search

    // max: maximum age in class 7(3)
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "maxAge": {// custom name
          "max": {
            "field": "age"
          }
        }
      },
      "size": 0
    }
    
    // min: minimum age in class 7(3)
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "minAge": {
          "min": {
            "field": "age"
          }
        }
      },
      "size": 0
    }
    
    // sum
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "sumAge": {
          "sum": {
            "field": "age"
          }
        }
      }, 
      "size": 0
    }
    
    // avg: average age in class 7(3)
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "avgAge": {// custom name
          "avg": {
            "field": "age"
          }
        }
      },
      "size": 0
    }
    
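  • value_count

    Counts the values of a field; documents in which the field is missing are not counted

    POST localhost:9200/stu/_search

    // Count the students in class 7(3) whose age is not null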
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "countAge": {
          "value_count": {
            "field": "age"
          }
        }
      }, 
      "size": 0
    }
    
  • Cardinality

    Counts distinct values

    POST localhost:9200/stu/_search

    // Number of distinct age values in class 7(3)
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "cardinalityAge": {
          "cardinality": {
            "field": "age"
          }
        }
      }, 
      "size": 0
    }
    
  • stats

    Returns five values: count, max, min, avg, and sum

    POST localhost:9200/stu/_search

    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "statsAge": {
          "stats": {
            "field": "age"
          }
        }
      }, 
      "size": 0
    }
    
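  • Extended stats

    Returns four more statistics than stats: sum of squares, variance, standard deviation, and the interval of the mean plus or minus two standard deviations

    POST localhost:9200/stu/_search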

    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "extendedAge": {
          "extended_stats": {
            "field": "age"
          }
        }
      }, 
      "size": 0
    }
    
  • Percentiles

    Returns the values at given percentiles; by default the values at the [ 1, 5, 25, 50, 75, 95, 99 ] percentiles are returned

    POST localhost:9200/stu/_search

    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "percentilesAge": {
          "percentiles": {
            "field": "age"
          }
        }
      }, 
      "size": 0
    }
    
    // Specify the percentiles explicitly
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "percentilesAge": {
          "percentiles": {
            "field": "age",
            "percents": [
              20,
              50,
              75
            ]
          }
        }
      },
      "size": 0
    }
    

Bucket aggregations

  • Terms Aggregation: group by the values of a field

    POST localhost:9200/stu/_search

    // Group class 7(3) by age
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "aggsAge": {
          "terms": {
            "field": "age",
            "size": 5
          }
        }
      },
      "size": 0
    }
    
  • order: sort the buckets

    POST localhost:9200/stu/_search

    // Group class 7(3) by age, with buckets sorted by age in descending order
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "aggsAge": {
          "terms": {
            "field": "age",
            "size": 5,
            "order": {
              "_key": "desc"
            }
          }
        }
      },
      "size": 0
    }
    
    // Group class 7(3)班 by age, sorting the buckets by document count in descending order
    {
      "query": {
        "term": {
          "class": {
            "value": "7(3)班"
          }
        }
      },
      "aggs": {
        "aggsAge": {
          "terms": {
            "field": "age",
            "size": 5,
            "order": {
              "_count": "desc"
            }
          }
        }
      },
      "size": 0
    }
    
    // Group by class, sorting the buckets by their average age in descending order
    {
      "aggs": {
        "aggsClass": {
          "terms": {
            "field": "class",
            "size": 10,
            "order": {
              "aggsAge": "desc"
            }
          },
          "aggs": {
            "aggsAge": {
              "avg": {
                "field": "age"
              }
            }
          }
        }
      },
      "size": 0
    }
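The three orderings can be mimicked locally with a few lines of Python. This is only a sketch over hypothetical documents: `terms_buckets` is an illustrative helper, and the `avg` key stands in for the `aggsAge` sub-aggregation.

```python
from collections import defaultdict

students = [  # hypothetical documents
    {"class": "7(2)班", "age": 14},
    {"class": "7(2)班", "age": 15},
    {"class": "7(3)班", "age": 13},
    {"class": "7(5)班", "age": 16},
]


def terms_buckets(docs, field, avg_field):
    """Group docs by `field`, returning one bucket per distinct term."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[field]].append(doc[avg_field])
    return [
        {"key": key, "doc_count": len(vals), "avg": sum(vals) / len(vals)}
        for key, vals in groups.items()
    ]


buckets = terms_buckets(students, "class", "age")
by_key_desc = sorted(buckets, key=lambda b: b["key"], reverse=True)          # "_key": "desc"
by_count_desc = sorted(buckets, key=lambda b: b["doc_count"], reverse=True)  # "_count": "desc"
by_avg_desc = sorted(buckets, key=lambda b: b["avg"], reverse=True)          # "aggsAge": "desc"
```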
    
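  • Filtering buckets with include and exclude

    POST localhost:9200/nba/_search

    {
      "aggs": {
        "aggsClass": {
          "terms": {
            "field": "class",
            "include": ["7(3)班", "7(2)班", "7(5)班"], // terms to include
            "exclude": ["7(5)班"], // terms to exclude; exclude takes precedence over include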
            "size": 10,
            "order": {
              "aggsAge": "desc"
            }
          },
          "aggs": {
            "aggsAge": {
              "avg": {
                "field": "age"
              }
            }
          }
        }
      },
      "size": 0
    }
    
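    // Regular-expression matching
    // include and exclude must be of the same type (both regex strings, or both arrays)
    // Parentheses are regex operators, so literal parentheses in the exclude pattern are escaped
    {
      "aggs": {
        "aggsClass": {
          "terms": {
            "field": "class",
            "include": "7.*",
            "exclude": "7\\(5\\)班",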
            "size": 10,
            "order": {
              "aggsAge": "desc"
            }
          },
          "aggs": {
            "aggsAge": {
              "avg": {
                "field": "age"
              }
            }
          }
        }
      },
      "size": 0
    }
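One pitfall with regex filters is that parentheses are grouping operators, so a term such as 7(5)班 has to be written with escaped parentheses (`7\\(5\\)班` inside a JSON string). The sketch below shows the effect with Python's `re` module. Python's engine is not the Lucene regex engine Elasticsearch uses, but both treat parentheses as groups; `re.fullmatch` is used because Lucene patterns must match the whole term.

```python
import re

term = "7(5)班"

# Unescaped parentheses form a group, so this pattern means "7", "5", "班".
unescaped = re.fullmatch("7(5)班", term)

# Escaped parentheses match the literal characters "(" and ")".
escaped = re.fullmatch(r"7\(5\)班", term)

# The unescaped pattern matches a different string instead.
matches_other = re.fullmatch("7(5)班", "75班")
```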
    
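  • Range Aggregation: group by value ranges

    POST localhost:9200/nba/_search

    // Range buckets: below 13, 13 to 14, 15 and above
    // from is inclusive and to is exclusive, so with these ranges age 14 falls into no bucket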
    {
      "aggs": {
        "aggsrange": {
          "range": {
            "field": "age",
            "ranges": [
              {
                "to": 13
              },
              {
                "from": 13,
                "to": 14
              },
              {
                "from": 15
              }
            ]
          }
        }
      },
      "size": 0
    }
    
    // Range buckets with custom keys
    {
      "aggs": {
        "aggsrange": {
          "range": {
            "field": "age",
            "ranges": [
              {
                "to": 13,
                "key":"A"
              },
              {
                "from": 13,
                "to": 14,
                "key":"B"
              },
              {
                "from": 15,
                "key":"C"
              }
            ]
          }
        }
      },
      "size": 0
    }
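In a range aggregation `from` is inclusive and `to` is exclusive. The following sketch applies that rule to the ranges above and shows which bucket each age lands in. The helper `range_bucket` is hypothetical.

```python
def range_bucket(value, ranges):
    """Return the keys of all ranges containing value (from inclusive, to exclusive)."""
    keys = []
    for r in ranges:
        low = r.get("from", float("-inf"))
        high = r.get("to", float("inf"))
        if low <= value < high:
            keys.append(r["key"])
    return keys


ranges = [
    {"to": 13, "key": "A"},
    {"from": 13, "to": 14, "key": "B"},
    {"from": 15, "key": "C"},
]
assignments = {age: range_bucket(age, ranges) for age in (12, 13, 14, 15)}
```

With these ranges, age 14 belongs to no bucket: B stops just below 14 and C starts at 15. Setting B's `to` to 15 would close the gap.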
    
  • Date Range Aggregation: group by date ranges

    POST localhost:9200/nba/_search

    // Group by date ranges
    {
      "aggs": {
        "aggsrange": {
          "date_range": {
            "field": "birthday",
            "format": "yyyy-MM",
            "ranges": [
              {
                "to": "2004-12",
                "key":"A"
              },
              {
                "from": "2005-01",
                "to": "2005-12",
                "key":"B"
              },
              {
                "from": "2006-01",
                "key":"C"
              }
            ]
          }
        }
      },
      "size": 0
    }
    
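  • Date Histogram Aggregation: group into fixed date intervals

    Aggregates by day, month, year, and so on. calendar_interval accepts year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), and minute (1m); second-based intervals such as 1s go in fixed_interval instead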

    POST localhost:9200/nba/_search

    {
      "aggs": {
        "aggsrange": {
          "date_histogram": {
            "field": "birthday",
            "format": "yyyy",
            "calendar_interval": "year"
          }
        }
      },
      "size": 0
    }
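The yearly histogram can be approximated locally as follows. This sketch assumes that empty years between the first and last bucket are returned with a count of 0, which is my understanding of the default `min_doc_count`. The birthdays and the helper `yearly_histogram` are hypothetical.

```python
from collections import Counter
from datetime import date

birthdays = [  # hypothetical birthdays
    date(2004, 3, 9),
    date(2005, 9, 2),
    date(2005, 10, 15),
    date(2007, 1, 20),
]


def yearly_histogram(dates):
    """Count dates per calendar year, keeping empty years in between."""
    if not dates:
        return []
    counts = Counter(d.year for d in dates)
    first, last = min(counts), max(counts)
    return [
        {"key_as_string": str(year), "doc_count": counts.get(year, 0)}
        for year in range(first, last + 1)
    ]


histogram = yearly_histogram(birthdays)
```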
    

query_string queries

  • Querying a single field

    POST localhost:9200/nba/_search

    {
      "query": {
        "query_string": {
          "default_field": "name",
          "query": "李 AND 思 OR 敏"
        }
      }
    }
    
  • Querying multiple fields

    POST localhost:9200/nba/_search

    {
      "query": {
        "query_string": {
          "fields": ["name", "sex"],
          "query": "李 AND 0"
        }
      }
    }
    

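Analyzers

An analyzer is a tool that breaks a piece of user-supplied text into terms according to a set of rules

Built-in analyzers

  • standard analyzer

    The standard analyzer is the default; it is used when no analyzer is specified

  • simple analyzer

    The simple analyzer splits the text into terms whenever it meets a character that is not a letter, and lowercases every term

  • whitespace analyzer

    The whitespace analyzer splits the text into terms whenever it meets a whitespace character

  • stop analyzer

    The stop analyzer is like the simple analyzer, except that it also removes stop words; by default it uses the english stop word list

    stopwords: a predefined list of stop words, such as the, a, an, this, of, at

  • language analyzer

    Analyzers for specific languages (for example english, the English analyzer). Built-in languages: arabic, armenian, basque, bengali, brazilian, bulgarian, catalan, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, thai

  • pattern analyzer

    Splits the text into terms using a regular expression; the default expression is \W+ (non-word characters)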

eg:

GET /_analyze
{
  "analyzer": "simple", 
  "text": "Deploy a 14-day trial of Elasticsearch Service."
}
{
  "tokens" : [
    {
      "token" : "deploy",
      "start_offset" : 0,// start offset
      "end_offset" : 6,// end offset
      "type" : "word",
      "position" : 0 // position of the token
    },
    {
      "token" : "a",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "day",
      "start_offset" : 12,
      "end_offset" : 15,
      "type" : "word",
      "position" : 2
    },
    {
      "token" : "trial",
      "start_offset" : 16,
      "end_offset" : 21,
      "type" : "word",
      "position" : 3
    },
    {
      "token" : "of",
      "start_offset" : 22,
      "end_offset" : 24,
      "type" : "word",
      "position" : 4
    },
    {
      "token" : "elasticsearch",
      "start_offset" : 25,
      "end_offset" : 38,
      "type" : "word",
      "position" : 5
    },
    {
      "token" : "service",
      "start_offset" : 39,
      "end_offset" : 46,
      "type" : "word",
      "position" : 6
    }
  ]
}
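The differences between the simple, whitespace, stop, and pattern analyzers are easier to see side by side. The sketch below approximates them with Python's `re` module on the same sample sentence. It only imitates the behaviour described above and is not how Elasticsearch implements them. The `STOPWORDS` set is an abbreviated stand-in for the english stop word list.

```python
import re

STOPWORDS = {"a", "an", "and", "at", "of", "the", "this"}  # abbreviated list


def simple_analyzer(text):
    """Split on anything that is not a letter and lowercase the terms."""
    return re.findall(r"[^\W\d_]+", text.lower())


def whitespace_analyzer(text):
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()


def stop_analyzer(text):
    """Like simple_analyzer, but stop words are removed."""
    return [t for t in simple_analyzer(text) if t not in STOPWORDS]


def pattern_analyzer(text, pattern=r"\W+"):
    """Split on a regular expression (non-word characters by default)."""
    return [t for t in re.split(pattern, text.lower()) if t]


text = "Deploy a 14-day trial of Elasticsearch Service."
simple_terms = simple_analyzer(text)
whitespace_terms = whitespace_analyzer(text)
stop_terms = stop_analyzer(text)
pattern_terms = pattern_analyzer(text)
```

Here `simple_terms` contains the same seven tokens as the response above, while `pattern_terms` additionally keeps `14`, because digits are word characters.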

Chinese analyzers

  • smartCN

    A simple analyzer for Chinese or mixed Chinese-English text

    • Install (restart the service before use)

      sh elasticsearch-plugin install analysis-smartcn 
      
    • eg:

      GET /_analyze
      {
        "analyzer": "smartcn", 
        "text": "有限公司"
      }
      
      {
        "tokens" : [
          {
            "token" : "有限公司",
            "start_offset" : 0,
            "end_offset" : 4,
            "type" : "word",
            "position" : 0
          }
        ]
      }
      
  • IK分词器

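    A smarter, friendlier Chinese analyzer

    • Download: https://github.com/medcl/elasticsearch-analysis-ik/releases (the version must match your Elasticsearch version)

    • Install: unzip it into the plugins directory under the Elasticsearch installation directory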

    • eg:

      GET /_analyze
      {
        "analyzer": "ik_max_word", 
        "text": "有限公司"
      }
      
      {
        "tokens" : [
          {
            "token" : "有限公司",
            "start_offset" : 0,
            "end_offset" : 4,
            "type" : "CN_WORD",
            "position" : 0
          },
          {
            "token" : "有限",
            "start_offset" : 0,
            "end_offset" : 2,
            "type" : "CN_WORD",
            "position" : 1
          },
          {
            "token" : "公司",
            "start_offset" : 2,
            "end_offset" : 4,
            "type" : "CN_WORD",
            "position" : 2
          }
        ]
      }
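The ik_max_word result above contains overlapping tokens because that mode emits every dictionary word it can find in the text. The toy sketch below reproduces this idea by brute force with a hypothetical three-word dictionary. It is not IK's actual algorithm or dictionary, only an illustration of fine-grained segmentation.

```python
def max_word_tokens(text, dictionary):
    """Emit every substring of text that is a dictionary word."""
    tokens = []
    for start in range(len(text)):
        # Try longer candidates first, mirroring the order of the sample response.
        for end in range(len(text), start, -1):
            word = text[start:end]
            if word in dictionary:
                tokens.append(
                    {"token": word, "start_offset": start, "end_offset": end}
                )
    return tokens


dictionary = {"有限公司", "有限", "公司"}  # hypothetical mini dictionary
tokens = max_word_tokens("有限公司", dictionary)
```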
      

refresh

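It would be natural to expect newly indexed data to be searchable immediately, but that is not how it actually works

  • Index a document and search right away: the new document is not returned

    curl -X PUT localhost:9200/stu/_doc/666 -H 'Content-Type:application/json' -d '{ "name": "王丝菲" }'
    curl -X GET 'localhost:9200/stu/_search?pretty'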
    
  • Force a refresh

    curl -X PUT 'localhost:9200/stu/_doc/667?refresh' -H 'Content-Type:application/json' -d '{ "name": "王豆豆" }'
    curl -X GET 'localhost:9200/stu/_search?pretty'
    
  • Change the refresh interval (the default is 1s)

    PUT localhost:9200/stu/_settings

    {
      "index": {
        "refresh_interval": "5s"
      }
    }
    
  • Disable refresh

    PUT localhost:9200/stu/_settings

    {
      "index": {
        "refresh_interval": "-1"
      }
    }
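The behaviour can be pictured with a toy in-memory model: writes land in a buffer and only become visible to search after a refresh. This is a deliberately simplified sketch. The class `ToyIndex` is hypothetical and leaves out everything else Elasticsearch does, such as segments and the periodic refresh timer.

```python
class ToyIndex:
    """In-memory stand-in for an index: writes are buffered until refresh()."""

    def __init__(self):
        self._buffer = {}      # indexed, but not yet searchable
        self._searchable = {}  # visible to search

    def index(self, doc_id, doc, refresh=False):
        self._buffer[doc_id] = doc
        if refresh:  # plays the role of the ?refresh URL parameter
            self.refresh()

    def refresh(self):
        # A refresh makes everything buffered so far searchable.
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search(self):
        return list(self._searchable.values())


stu = ToyIndex()
stu.index("666", {"name": "王丝菲"})
before = stu.search()  # nothing visible yet
stu.index("667", {"name": "王豆豆"}, refresh=True)
after = stu.search()   # both documents are visible now
```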
    

Highlighting

  • Highlighted search

    POST localhost:9200/stu/_search

    // Request
    {
      "query": {
        "match": {
          "name": "赵"
        }
      },
      "highlight": {
        "fields": {
          "name": {}
        }
      }
    }
    
    // Response
    {
      "took" : 4,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 2.4191523,
        "hits" : [
          {
            "_index" : "stu",
            "_type" : "_doc",
            "_id" : "10",
            "_score" : 2.4191523,
            "_source" : {
              "name" : "赵曦",
              "age" : 14,
              "sex" : "0",
              "class" : "7(2)班",
              "birthday" : "2005-09-02"
            },
            "highlight" : {
              "name" : [
                "<em>赵</em>曦"
              ]
            }
          }
        ]
      }
    }
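By default Elasticsearch wraps each matched term in `<em>` and `</em>`, and `pre_tags` / `post_tags` replace that pair. The sketch below imitates the wrapping for a single literal term. The helper `highlight` is hypothetical and ignores analysis, fragments, and scoring.

```python
import re


def highlight(text, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap every occurrence of a literal term in the given tags."""
    if not term:
        return text
    return re.sub(
        re.escape(term),
        lambda match: f"{pre_tag}{match.group(0)}{post_tag}",
        text,
    )


default_fragment = highlight("赵曦", "赵")
custom_fragment = highlight("赵曦", "赵", pre_tag="<b>", post_tag="</b>")
```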
    
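  • Highlighting with custom tags

    POST localhost:9200/stu/_search

    // Request (<tag> and </tag> stand for whatever opening and closing HTML tags you choose)
    {
      "query": {
        "match": {
          "name": "赵"
        }
      },
      "highlight": {
        "fields": {
          "name": {
            "pre_tags": ["<tag>"],
            "post_tags": ["</tag>"]
          }
        }
      }
    }
    
    // Response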
    {
      "took" : 6,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 2.4191523,
        "hits" : [
          {
            "_index" : "stu",
            "_type" : "_doc",
            "_id" : "10",
            "_score" : 2.4191523,
            "_source" : {
              "name" : "赵曦",
              "age" : 14,
              "sex" : "0",
              "class" : "7(2)班",
              "birthday" : "2005-09-02"
            },
            "highlight" : {
              "name" : [
                "<tag>赵</tag>曦"
              ]
            }
          }
        ]
      }
    }

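Query suggestions

Query suggestions give users a better search experience. They cover term checking (spelling correction) and autocompletion

Parameters

Parameter      Description
text           The text to generate suggestions for
field          The field to fetch candidate suggestions from
analyzer       The analyzer to apply to the suggest text
size           Maximum number of suggestions returned per term
sort           How to sort the suggestions. score: by score first, then document frequency, then the term itself; frequency: by document frequency first, then score, then the term itself
suggest_mode   Controls which suggestions are offered. missing (default): only suggest for terms that do not exist in the index; popular: only suggest terms with a higher document frequency than the search term; always: always offer matching suggestions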

suggester

  • Term suggester

    The term suggester analyzes the input text into terms and offers suggestions for each term

    POST localhost:9200/stu/_search

    // Request
    {
      "suggest": {
        "MY_SUGGESTION": {
          "text": "7(6)班",
          "term": {
            "suggest_mode": "missing",
            "field": "class"
          }
        }
      }
    }
    
    {
      "took" : 105,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 0,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      },
      "suggest" : {
        "MY_SUGGESTION" : [
          {
            "text" : "7(6)班",
            "offset" : 0,
            "length" : 5,
            "options" : [
              {
                "text" : "7(2)班",
                "score" : 0.8,
                "freq" : 4
              },
              {
                "text" : "7(3)班",
                "score" : 0.8,
                "freq" : 3
              },
              {
                "text" : "7(5)班",
                "score" : 0.8,
                "freq" : 3
              }
            ]
          }
        ]
      }
    }
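A term suggester is essentially a search for indexed terms within a small edit distance of the input. The sketch below uses plain Levenshtein distance and a score of `1 - distance / length`. That formula is an assumption chosen because it reproduces the 0.8 in the sample response, and the real scoring inside Elasticsearch may differ. The helper names are hypothetical; the frequencies are copied from the response above.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def suggest(text, term_freqs, max_edits=2):
    """Suggest indexed terms close to text, in the spirit of suggest_mode=missing."""
    if text in term_freqs:  # the term exists in the index: nothing to suggest
        return []
    options = []
    for term, freq in term_freqs.items():
        dist = edit_distance(text, term)
        if 0 < dist <= max_edits:
            score = 1 - dist / max(len(text), len(term))  # assumed formula
            options.append({"text": term, "score": score, "freq": freq})
    # sort=score: score first, then document frequency, then the term itself
    return sorted(options, key=lambda o: (-o["score"], -o["freq"], o["text"]))


term_freqs = {"7(2)班": 4, "7(3)班": 3, "7(5)班": 3}
options = suggest("7(6)班", term_freqs)
```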
    
  • Phrase suggester

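    The phrase suggester builds on the term suggester and also takes the relationship between terms into account, such as whether they occur together in the indexed text, how close they are to each other, and their frequencies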

    POST localhost:9200/stu/_search

    // Request
    {
      "suggest": {
        "MY_SUGGESTION": {
          "text": "7(2) 班",
          "phrase": {
            "field": "class"
          }
        }
      }
    }
    
    // Response
    {
      "took" : 17,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 0,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      },
      "suggest" : {
        "MY_SUGGESTION" : [
          {
            "text" : "7(2) 班",
            "offset" : 0,
            "length" : 6,
            "options" : [
              {
                "text" : "7(2)班",
                "score" : 0.4678218
              },
              {
                "text" : "7(3)班",
                "score" : 0.37474233
              },
              {
                "text" : "7(5)班",
                "score" : 0.37474233
              }
            ]
          }
        ]
      }
    }
    
  • Completion suggester

    The completion suggester autocompletes whatever follows the text typed so far

    POST localhost:9200/stu/_search

    // The queried field must be mapped with the type completion
    {
      "suggest": {
        "MY_SUGGESTION": { // a name of your choice
          "text": "I like",
          "completion": {
            "field": "selfDesc"
          }
        }
      }
    }
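Prefix completion can be sketched with a sorted list and binary search. Elasticsearch uses its own in-memory structure for completion fields, so this is only a stand-in for the idea. The helper `complete` and the sample `selfDesc` values are hypothetical.

```python
import bisect


def complete(prefix, entries, size=5):
    """Return up to `size` entries starting with `prefix`; entries must be sorted."""
    start = bisect.bisect_left(entries, prefix)
    results = []
    for entry in entries[start:]:
        if len(results) >= size or not entry.startswith(prefix):
            break
        results.append(entry)
    return results


self_descs = sorted(["I like reading", "I like running", "I love music"])
suggestions = complete("I like", self_descs)
```

Because the list is sorted, all entries sharing a prefix sit next to each other, so the loop can stop at the first entry that no longer matches.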
    
