Part 3: Elasticsearch Core Search Syntax

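Contents

  • I. Bulk document operations
    • 1. _mget (fetch multiple documents)
    • 2. _bulk (create documents in bulk)
    • 3. _bulk (delete documents in bulk)
    • 4. _bulk (update documents in bulk)
  • II. Advanced queries with the DSL
    • 1. Overview
    • 2. Queries without conditions
    • 3. Queries with conditions
      • 3.1 Leaf queries (single-field conditions)
      • 3.2 Compound queries (multiple conditions)
      • 3.3 Joining queries (queries across related documents)
      • 3.4 Query DSL vs. filter DSL
    • 4. Query examples
      • 4.1 term query
      • 4.2 match full-text query
      • 4.3 multi_match: full-text match across multiple fields, with or without a field list
      • 4.4 query_string: queries with and without a field list, using AND and OR
      • 4.5 range query
      • 4.6 Pagination, returning selected fields, and sorting
    • 5. Filter example
    • 6. Summary of match, term, match_phrase and query_string
  • III. Document mapping
    • 1. Dynamic mapping
    • 2. Static mapping
    • 3. Retrieving a mapping
    • 4. Core types
    • 5. Differences between the keyword and text mapping types
    • 6. Specifying the ik analyzer for a text field in a static mapping
    • 7. Changing an existing mapping
    • 8. Elasticsearch optimistic concurrency control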

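I. Bulk document operations

1. _mget (fetch multiple documents)

(1) Without an index or type in the URL
Method: GET
Endpoint: _mget
Purpose: fetch documents by id from different indices (and, before 7.x, different types)
Request body: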

"docs":[
        {
            "_index":"<index name>",
            "_type":"<type; defaults to _doc>",
            "_id":<document id>
        }, # separate multiple entries with commas
        {
        .............
		}
    ]

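Mapping types are deprecated in ES 7.x, where _doc is the only type, so _type should no longer be used in queries (on 7.x and later you can simply omit it).

Example request:
This one queries two indices (test_index01 and test_index02).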

GET _mget
{
    "docs":[
        {
            "_index":"test_index01",
            "_id":1
        },
        {
            "_index":"test_index02",
            "_id":1
        }
    ]
}

Result

{
  "docs" : [
    {
      "_index" : "test_index01",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 3,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "秀儿",
        "sex" : 1,
        "age" : 25,
        "address" : "上海",
        "remark" : "java"
      }
    },
    {
      "_index" : "test_index02",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 3,
      "_seq_no" : 2,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "李四",
        "sex" : 1,
        "age" : 28,
        "address" : "北京",
        "remark" : "java"
      }
    }
  ]
}

(2) With the index in the URL
Method: GET
Endpoint: /<index name>/_mget
Purpose: fetch multiple documents from the given index
Request body:

GET /<index name>/_mget
{
    "docs":[
        {
            "_id":<document id>
        },
        {
            "_id":<document id>
        }
    ]
}

Example

GET /test_index01/_mget
{
    "docs":[
        {
            "_id":1
        },
        {
            "_id":2
        }
    ]
}

Result

{
  "docs" : [
    {
      "_index" : "test_index01",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 3,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "秀儿",
        "sex" : 1,
        "age" : 25,
        "address" : "上海",
        "remark" : "java"
      }
    },
    {
      "_index" : "test_index01",
      "_type" : "_doc",
      "_id" : "2",
      "found" : false
    }
  ]
}
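
The two request shapes above can be sketched in Python. This is only an illustration that builds the JSON bodies; the helper names mget_docs_body and mget_ids_body are made up for this sketch and are not part of any Elasticsearch client. The second helper uses the "ids" shorthand that _mget accepts when the index is already given in the URL.

```python
import json


def mget_docs_body(targets):
    """Body for `GET _mget`: one entry per (index, document id) pair."""
    return {"docs": [{"_index": index, "_id": str(doc_id)} for index, doc_id in targets]}


def mget_ids_body(doc_ids):
    """Body for `GET /<index>/_mget`: the index comes from the URL, so ids are enough."""
    return {"ids": [str(doc_id) for doc_id in doc_ids]}


cross_index = mget_docs_body([("test_index01", 1), ("test_index02", 1)])
single_index = mget_ids_body([1, 2])

print(json.dumps(cross_index, indent=2))
print(json.dumps(single_index))
```

Documents that do not exist are still listed in the response, with "found" : false, as the result for id 2 above shows.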

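2. _bulk (create documents in bulk)

The difference between PUT and POST was covered in the previous article; you can run both yourself to compare. The examples below use POST.

(1) Creating documents in bulk
Without an index or type in the URL
Method: POST
Endpoint: _bulk
Purpose: create, update and delete documents
Request body:
The body usually consists of pairs of lines, one pair per operation:
Line 1: the action (create, index, update or delete) and its target (index, type, id)
Line 2: the data for that action (a delete action has no second line)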

{"actionName":{"_index":"indexName", "_type":"typeName","_id":"id"}}
{"field1":"value1", "field2":"value2"}

POST _bulk
{"<action: create, index, delete or update>":{"_index":"<index name>", "_type":"<type; can be omitted from 7.x on>", "_id":<document id>}}
{"title":"<document content>","content":"<document content>","tags":["<document content>", "<document content>"]}
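
The line-pair layout described above is newline-delimited JSON, and it is easy to get wrong by hand. The following is a minimal sketch of how such a body could be assembled in Python; build_bulk_body is a hypothetical helper written for this article, not a client-library function, and the sample documents are illustrative.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) triples into a _bulk request body.

    `source` is None for delete, which has no second line.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if source is not None:
            lines.append(json.dumps(source, ensure_ascii=False))
    # Each JSON object sits on its own line, and the body must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("create", {"_index": "test_index03", "_id": "9"}, {"id": 3, "title": "1"}),
    ("update", {"_index": "test_index03", "_id": "9"}, {"doc": {"title": "ES"}}),
    ("delete", {"_index": "test_index03", "_id": "9"}, None),
])
print(body, end="")
```

The body would be sent with the Content-Type application/x-ndjson. Each operation succeeds or fails independently, so the "errors" flag and the per-item "status" in the response should both be checked.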

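Example
Here "id":3 is the application's own business id, not the Elasticsearch document id; the Elasticsearch id is the _id value, such as "_id":9.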

POST _bulk
{"create":{"_index":"test_index03", "_id":9}}
{"id":3,"title":"1","content":"111","tags":["java", "面向对象"]}
{"create":{"_index":"test_index03","_id":810}}
{"id":4,"title":"2","content":"222","tags":["java", "面向对象"]}

Result

{
  "took" : 3,
  "errors" : false,
  "items" : [
    {
      "create" : {
        "_index" : "test_index03",
        "_type" : "_doc",
        "_id" : "9",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 12,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "test_index03",
        "_type" : "_doc",
        "_id" : "810",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 13,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

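(2) Creating or replacing a document with the index action
If the document does not exist, it is created.
If it already exists, it is replaced in full.
Example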

 POST _bulk
 {"index":{"_index":"article", "_id":99}}
 {"title":"1","content":"2222","tags":["333", "444"]}

Result

{
  "took" : 5,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "article",
        "_type" : "_doc",
        "_id" : "99",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

Running the same request again changes the result to "result" : "updated":

{
  "took" : 2,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "article",
        "_type" : "_doc",
        "_id" : "99",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 3,
        "_primary_term" : 1,
        "status" : 200
      }
    }
  ]
}

3. _bulk (delete documents in bulk)

Without an index or type in the URL
Method: POST
Endpoint: _bulk
Purpose: delete documents from an index in bulk
Request body:

 POST _bulk
 {"delete":{"_index":"<index name>", "_id":<document id>}}

Example

 POST _bulk
 {"delete":{"_index":"test_index03", "_id":3}}
 {"delete":{"_index":"test_index03", "_id":4}}

Result

{
  "took" : 3,
  "errors" : false,
  "items" : [
    {
      "delete" : {
        "_index" : "test_index03",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 7,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 14,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "delete" : {
        "_index" : "test_index03",
        "_type" : "_doc",
        "_id" : "4",
        "_version" : 7,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 15,
        "_primary_term" : 1,
        "status" : 200
      }
    }
  ]
}

4. _bulk (update documents in bulk)

Without an index or type in the URL
Method: POST
Endpoint: _bulk
Purpose: update documents in bulk
Request body:

 POST _bulk
 {"update":{"_index":"<index name>", "_id":<document id>}}
 {"doc":{"<field to update>":"<new value>"}}

Example

 POST _bulk
 {"update":{"_index":"test_index03", "_id":3}}
 {"doc":{"title":"ES"}}
 {"update":{"_index":"test_index03",  "_id":4}}
 {"doc":{"content":"修改内容"}}

Result (I had deleted these documents earlier, so I re-created them before running the update)

{
  "took" : 35,
  "errors" : false,
  "items" : [
    {
      "update" : {
        "_index" : "test_index03",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 3,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 20,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "update" : {
        "_index" : "test_index03",
        "_type" : "_doc",
        "_id" : "4",
        "_version" : 3,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 21,
        "_primary_term" : 1,
        "status" : 200
      }
    }
  ]
}

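II. Advanced queries with the DSL

1. Overview

DSL stands for Domain Specific Language.
Elasticsearch provides a JSON-based DSL for defining queries.
The DSL is built from two kinds of clauses: leaf query clauses and compound query clauses.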

2. Queries without conditions

With no conditions, match_all returns every document.

GET /test_index03/_search
{
    "query":{
        "match_all":{
        }
    }
}

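3. Queries with conditions

3.1 Leaf queries (single-field conditions)

(1) Full-text (fuzzy) matching

Full-text matching targets text fields. The content of a text field is analyzed into tokens at index time, the query text is analyzed the same way at search time, and the matching documents are then looked up in the inverted index. It is mainly done with match and related queries.

① match: full-text match on the analyzed query text
② prefix: prefix match
③ regexp: match using a regular expression

Advanced use of match

The match query also accepts the following parameters:
  1. query: the text to match
  2. operator: how the analyzed tokens are combined
    • and: every token must match
    • or: at least one token must match (default)
  3. minimum_should_match: the minimum number of tokens that must match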
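
To make the effect of operator and minimum_should_match concrete, here is a toy model in Python. It is a sketch of the matching rule only, not of how Elasticsearch implements or scores a match query; toy_match is a made-up name, and only an integer minimum_should_match is modelled (the real parameter also accepts other forms, such as percentages).

```python
def toy_match(doc_tokens, query_tokens, operator="or", minimum_should_match=None):
    """Decide whether a document matches, given already-analyzed tokens."""
    doc = set(doc_tokens)
    wanted = set(query_tokens)
    matched = sum(1 for token in wanted if token in doc)
    if operator == "and":
        # and: every query token must be present in the document
        return matched == len(wanted)
    # or: one token is enough unless minimum_should_match raises the bar
    required = 1 if minimum_should_match is None else minimum_should_match
    return matched >= required


doc = ["hello", "july"]
print(toy_match(doc, ["hello", "world"]))                          # True
print(toy_match(doc, ["hello", "world"], operator="and"))          # False
print(toy_match(doc, ["hello", "world"], minimum_should_match=2))  # False
```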

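(2) Exact matching

  1. term: the field equals a single value
  2. terms: the field equals any value in a given array
  3. range: the field falls within a range
  4. exists: the field has a value
  5. ids: fetch documents by a list of IDs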

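3.2 Compound queries (multiple conditions)

A compound query combines leaf queries into one complete query.

  1. bool: combines conditions with and, or and not relationships
    • must: every condition must match (and)
    • should: at least one condition should match (or)
    • must_not: none of the conditions may match (not)
    • filter: conditions must match but are not scored (no _score is computed), which is more efficient
  2. constant_score: wraps a filter and gives every match the same constant score instead of computing relevance

The clauses under must/filter/should/must_not take leaf queries such as
term/terms/range/ids/exists/match as their conditions.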
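
As an illustration of how leaf queries nest inside a bool query, the sketch below assembles a request body in Python. The helper bool_query is hypothetical, and the field values are examples borrowed from the test_index02 data used later in this article (address.keyword assumes the default dynamic mapping with a keyword sub-field).

```python
import json


def bool_query(must=(), should=(), must_not=(), filter_=()):
    """Assemble a bool query, leaving out clause types that are empty."""
    clauses = {
        "must": list(must),
        "should": list(should),
        "must_not": list(must_not),
        "filter": list(filter_),
    }
    return {"bool": {name: leaves for name, leaves in clauses.items() if leaves}}


body = {
    "query": bool_query(
        must=[{"match": {"remark": "java"}}],                    # scored full-text condition
        filter_=[{"range": {"age": {"gte": 25, "lte": 28}}}],    # unscored yes/no condition
        must_not=[{"term": {"address.keyword": "北京"}}],         # exclusion
    )
}
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Putting yes/no conditions such as ranges and exact values under filter rather than must keeps them out of the relevance score.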

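3.3 Joining queries (queries across related documents)

  1. Parent/child queries: parent/child (the has_child and has_parent queries)
  2. Nested document queries: nested

3.4 Query DSL vs. filter DSL

The DSL has two contexts: query and filter. A clause in query context computes a relevance score for each match; a clause in filter context only answers yes or no, is not scored, and can be cached.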

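4. Query examples

4.1 term query

A term query does not analyze the search value; it looks for an exact match against the indexed terms.

For exact term lookups the queried field should be mapped as keyword (see the mapping section in the contents).

Index contents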

GET /test_index02/_search
{
    "query":{
        "match_all":{
        }
    }
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "C9bd_IYBuSAClcwx8fGZ",
        "_score" : 1.0,
        "_source" : {
          "name" : "李四",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "java"
        }
      }
    ]
  }
}

Example 1

POST /test_index02/_search
{
    "query":{
        "term":{
            "age":"28"
        }
    }
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "李四",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "java"
        }
      }
    ]
  }
}

Example 2

POST /test_index02/_search
{
    "query":{
        "term":{
            "name":"秀儿"
        }
    }
}

Result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Example 3

POST /test_index02/_search
{
    "query":{
        "term":{
            "name.keyword":"秀儿"
        }
    }
}

Result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6931471,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      }
    ]
  }
}
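
Why does the term query on name find nothing while the one on name.keyword finds the document? A likely explanation is that name is a text field analyzed by the standard analyzer, which indexes Chinese text one character at a time, so the whole string is never stored as a single term. The toy tokenizer below is a rough stand-in written for this article (it is not the real analyzer and only handles the basic CJK block), but it shows the idea.

```python
def standard_like_tokens(text):
    """Rough imitation: lowercase, split on whitespace, one token per CJK character."""
    tokens = []
    for word in text.lower().split():
        buffer = ""
        for ch in word:
            if "\u4e00" <= ch <= "\u9fff":   # basic CJK ideograph range only
                if buffer:
                    tokens.append(buffer)
                    buffer = ""
                tokens.append(ch)
            else:
                buffer += ch
        if buffer:
            tokens.append(buffer)
    return tokens


text_terms = set(standard_like_tokens("秀儿"))  # terms indexed for the text field
keyword_terms = {"秀儿"}                        # the keyword sub-field keeps the whole value

print(sorted(text_terms))
print("秀儿" in text_terms)     # False: term query on name finds nothing
print("秀儿" in keyword_terms)  # True: term query on name.keyword finds the document
```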

4.2 match full-text query

match analyzes the query text with the field's analyzer and searches for the resulting tokens.

Example

POST /test_index02/_search
{
    "from":0,
    "size":2,
    "query":{
        "match":{
            "name":"秀儿"
        }
    }
}

Result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.3862942,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.3862942,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      }
    ]
  }
}

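4.3 multi_match: full-text match across multiple fields, with or without a field list

This is only a brief demonstration; the remaining options are left for you to explore.

All documents in the index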

"hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "DNbg_IYBuSAClcwx4fEv",
        "_score" : 1.0,
        "_source" : {
          "name" : "李四",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "java"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Ddbm_IYBuSAClcwxbfER",
        "_score" : 1.0,
        "_source" : {
          "name" : "王五",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "秀儿多字段"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Dtbv_IYBuSAClcwx5fFw",
        "_score" : 1.0,
        "_source" : {
          "name" : "赵六",
          "sex" : 1,
          "age" : 28,
          "address" : "秀儿",
          "remark" : "python"
        }
      }
    ]

Example (fields specified)

POST /test_index02/_search
{
    "query":{
        "multi_match":{
            "query":"秀儿",
            "fields":[
                "remark",
                "name"
            ]
        }
    }
}

Result

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.9616582,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.9616582,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Ddbm_IYBuSAClcwxbfER",
        "_score" : 1.3921447,
        "_source" : {
          "name" : "王五",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "秀儿多字段"
        }
      }
    ]
  }
}

Example 2 (no fields specified)

POST /test_index02/_search
{
    "query":{
        "multi_match":{
            "query":"秀儿"
        }
    }
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 2.4079456,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 2.4079456,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Dtbv_IYBuSAClcwx5fFw",
        "_score" : 1.9333868,
        "_source" : {
          "name" : "赵六",
          "sex" : 1,
          "age" : 28,
          "address" : "秀儿",
          "remark" : "python"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Ddbm_IYBuSAClcwxbfER",
        "_score" : 1.5779729,
        "_source" : {
          "name" : "王五",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "秀儿多字段"
        }
      }
    ]
  }
}

Example 3 (wildcard in field names)

POST /test_index02/_search
{
    "query":{
        "multi_match":{
            "query":"秀儿",
            "fields":[
                "address",
                "na*"
            ]
        }
    }
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.4079456,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 2.4079456,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Dtbv_IYBuSAClcwx5fFw",
        "_score" : 1.9333868,
        "_source" : {
          "name" : "赵六",
          "sex" : 1,
          "age" : 28,
          "address" : "秀儿",
          "remark" : "python"
        }
      }
    ]
  }
}

4.4 query_string: queries with and without a field list, using AND and OR

AND and OR must be written in upper case.
Example without fields (OR)

POST /test_index02/_search
{
    "query":{
        "query_string":{
            "query":"王五 OR 李四"
        }
    }
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.4079456,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "DNbg_IYBuSAClcwx4fEv",
        "_score" : 2.4079456,
        "_source" : {
          "name" : "李四",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "java"
        }
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Ddbm_IYBuSAClcwxbfER",
        "_score" : 2.4079456,
        "_source" : {
          "name" : "王五",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "秀儿多字段"
        }
      }
    ]
  }
}

Example without fields (AND)

POST /test_index02/_search
{
    "query":{
        "query_string":{
            "query":"王五 AND 秀儿"
        }
    }
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 3.9859185,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Ddbm_IYBuSAClcwxbfER",
        "_score" : 3.9859185,
        "_source" : {
          "name" : "王五",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "秀儿多字段"
        }
      }
    ]
  }
}

Example with fields (the field list uses the wildcard na*, which only matches the name field here)

POST /test_index02/_search
{
    "query":{
        "query_string":{
            "query":"王五 OR 张三",
            "fields":[
                "address",
                "na*"
            ]
        }
    }
}

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 2.4079456,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Ddbm_IYBuSAClcwxbfER",
        "_score" : 2.4079456,
        "_source" : {
          "name" : "王五",
          "sex" : 1,
          "age" : 28,
          "address" : "北京",
          "remark" : "秀儿多字段"
        }
      }
    ]
  }
}

4.5 range query

gte: greater than or equal to; lte: less than or equal to; gt: greater than; lt: less than.

Example

POST /test_index02/_search
{
    "query":{
        "range":{
            "age":{
                "gte":25,
                "lte":26
            }
        }
    }
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      }
    ]
  }
}

4.6 Pagination, returning selected fields, and sorting

Example

POST /test_index02/_search
{
    "query":{
        "range":{
            "age":{
                "gte":25,
                "lte":28
            }
        }
    },
    "from":0,
    "size":2,
    "_source":[
        "name",
        "age"
    ],
    "sort":[
        {"age":"desc"},
        {"sex":"desc"}
    ]
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "DNbg_IYBuSAClcwx4fEv",
        "_score" : null,
        "_source" : {
          "name" : "李四",
          "age" : 28
        },
        "sort" : [
          28,
          1
        ]
      },
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "Ddbm_IYBuSAClcwxbfER",
        "_score" : null,
        "_source" : {
          "name" : "王五",
          "age" : 28
        },
        "sort" : [
          28,
          1
        ]
      }
    ]
  }
}
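
The from and size values in the request above are an offset and a page length. Applications usually think in page numbers, so a small conversion is needed; the sketch below shows one way to do it. page_to_from_size is a hypothetical helper, and pages are assumed to be numbered from 1.

```python
def page_to_from_size(page, page_size):
    """Convert a 1-based page number into the from/size pair used by _search."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must both be at least 1")
    return {"from": (page - 1) * page_size, "size": page_size}


print(page_to_from_size(1, 2))   # {'from': 0, 'size': 2}
print(page_to_from_size(3, 10))  # {'from': 20, 'size': 10}
```

By default from + size may not exceed 10,000 (the index.max_result_window setting), so very deep pagination needs a different mechanism such as search_after.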

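5. Filter example

A filter does not compute relevance scores, so hits are not ranked by relevance. This makes it somewhat faster, and filter results can be cached.

Example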

POST /test_index02/_search
{
    "query":{
        "bool":{
            "filter":{
                "term":{
                    "age":25
                }
            }
        }
    }
}

Result

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "test_index02",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      }
    ]
  }
}

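6. Summary of match, term, match_phrase and query_string

English text analysis: the text is split on whitespace into individual words.

Take "hello july" as an example.
Assume the finest-grained tokens are hello and july. (The first post in this series walks through the analysis process.)

  1. match
    Full-text match on a named field. The query text is analyzed first: "hello july" is split into hello and july, and each token is looked up in the inverted index. Documents whose field contains hello, july, or both are all returned.
  2. term
    term and match sometimes return the same results. For the single word july, term does not analyze the value, so july is looked up as-is; match does analyze it, but july is already a single token. Both end up looking up july in the inverted index, so the results are identical.
    Querying "hello july" is different: term looks up the whole string "hello july" as one term without analyzing it, whereas match splits it into tokens, so the results differ.
  3. match_phrase
    match_phrase also analyzes the query text, but a document must contain all of the tokens, adjacent and in the same order. For "hello july", the document must contain hello immediately followed by july; you can think of it as matching the phrase "hello july" as a unit. A document containing "july hello" has the wrong order and is not returned.
  4. query_string
    Similar to match, but match needs a field name, whereas query_string searches all fields by default (a fields list can restrict it) and supports operators such as AND and OR.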
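
The differences summarized above can be reproduced with a few lines of Python over a tiny in-memory "index". This is a toy model under simplifying assumptions (whitespace tokenization, a single text field, no scoring), not Elasticsearch's implementation; all function names are made up for the sketch.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def term(doc_tokens, value):
    # The value is NOT analyzed: it must equal one indexed token exactly.
    return value in doc_tokens


def match(doc_tokens, text):
    # The text is analyzed; any one token matching is enough (operator "or").
    return any(token in doc_tokens for token in analyze(text))


def match_phrase(doc_tokens, text):
    # All tokens must appear next to each other, in the same order.
    query = analyze(text)
    n = len(query)
    return any(doc_tokens[i:i + n] == query for i in range(len(doc_tokens) - n + 1))


docs = {1: analyze("hello july"), 2: analyze("july hello"), 3: analyze("hello world")}

for name, fn in [("term", term), ("match", match), ("match_phrase", match_phrase)]:
    hits = [doc_id for doc_id, tokens in docs.items() if fn(tokens, "hello july")]
    print(name, hits)
# term []
# match [1, 2, 3]
# match_phrase [1]
```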

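III. Document mapping

Elasticsearch has two kinds of mapping: dynamic and static.

1. Dynamic mapping

In a relational database such as MySQL, you must create the database and table and define the columns and their types before you can insert rows. In Elasticsearch you do not have to define a mapping (the counterpart of MySQL's tables, columns and types) up front: when a document is indexed, the field types are detected from its values. This mechanism is called dynamic mapping.

The dynamic mapping rules are:

JSON value        Mapped type
null              no field is added
true or false     boolean
floating point    float
integer           long
date string       date (if it matches a date format), otherwise text
string            text (with a keyword sub-field)
array             determined by the first non-null value in the array
JSON object       object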
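
The rules in the table can be written down as a small Python function. This is a simplified sketch of the table only: it ignores date detection (so every string is reported as text) and any custom dynamic templates, and dynamic_type is a name invented for this example.

```python
def dynamic_type(value):
    """Return the type a JSON value would get under the rules listed above."""
    if value is None:
        return None          # null: no field is added
    if isinstance(value, bool):
        return "boolean"     # checked before int, because bool is a subclass of int
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"        # date detection is deliberately left out of this sketch
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # An array takes the type of its first non-null element.
        for item in value:
            item_type = dynamic_type(item)
            if item_type is not None:
                return item_type
        return None
    raise TypeError(f"not a JSON value: {value!r}")


document = {"name": "秀儿", "sex": 1, "age": 25, "score": 1.5, "tags": [None, "java"]}
print({field: dynamic_type(value) for field, value in document.items()})
```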

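2. Static mapping

You can also define the mapping in advance, including each field's type, analyzer and so on. This is static (explicit) mapping. (Once a field's mapping has been set, its type cannot be changed.)

3. Retrieving a mapping

Request

GET /test_index02/_mapping

Result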

{
  "test_index02" : {
    "mappings" : {
      "properties" : {
        "address" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "age" : {
          "type" : "long"
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "remark" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "sex" : {
          "type" : "long"
        }
      }
    }
  }
}

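Creating a static mapping
"index":true: whether the field is indexed, i.e. searchable (default true)
"store":true: whether the field's original value is also stored separately from _source (default false)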

PUT /test_index05
{
    "mappings":{
        "properties":{
            "name":{
                "type":"keyword",
                "index":true,
                "store":true
            },
            "sex":{
                "type":"integer",
                "index":true,
                "store":true
            },
            "age":{
                "type":"integer",
                "index":true,
                "store":true
            },
            "address":{
                "type":"text",
                "index":true,
                "store":true
            },
            "remark":{
                "type":"text",
                "index":true,
                "store":true
            }
        }
    }
}

Request

GET /test_index05/_mapping

Result

{
  "test_index05" : {
    "mappings" : {
      "properties" : {
        "address" : {
          "type" : "text",
          "store" : true
        },
        "age" : {
          "type" : "integer",
          "store" : true
        },
        "name" : {
          "type" : "keyword",
          "store" : true
        },
        "remark" : {
          "type" : "text",
          "store" : true
        },
        "sex" : {
          "type" : "integer",
          "store" : true
        }
      }
    }
  }
}

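4. Core types

  1. String: text and keyword (these two replaced the old string type).
    • text: used for indexing long text. The text is analyzed into tokens before indexing so that Elasticsearch can search on individual words. By default a text field cannot be used for sorting or aggregations.
    • keyword: not analyzed. It can be used for filtering, sorting and aggregations, but not for full-text search on individual words. (If you store "hello july" as text, it is split into tokens in the index; as keyword, the whole string "hello july" is indexed as a single term.)
  2. Numeric: long, integer, short, byte, double, float
  3. Date: date
  4. Boolean: boolean

5. Differences between the keyword and text mapping types

Index mapping
name is now keyword (exact match only, no full-text search; supports aggregations and sorting)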

    "mappings" : {
      "properties" : {
        "address" : {
          "type" : "text",
          "store" : true
        },
        "age" : {
          "type" : "integer",
          "store" : true
        },
        "name" : {
          "type" : "keyword",
          "store" : true
        },
        "remark" : {
          "type" : "text",
          "store" : true
        },
        "sex" : {
          "type" : "integer",
          "store" : true
        }
      }
    }

All documents in the index

    "hits" : [
      {
        "_index" : "test_index05",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "天王盖地虎",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      }
    ]

term query

POST /test_index05/_search
{
    "query":{
        "term":{
            "name":"天王盖地虎"
        }
    }
}

Result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "test_index05",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "天王盖地虎",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      }
    ]
  }
}

match query

POST /test_index05/_search
{
    "query":{
        "match":{
            "name":"地虎"
        }
    }
}

Result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

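One thing to note: when name is mapped as keyword, a match query still returns the document if the search text is exactly the same as the stored value. match analyzes its input, so why does this work? This puzzled me for a while; here is what I found.
Please read the following carefully.

① A match query on a keyword field does not split the query text: keyword fields are searched with the whole input as a single term, so the query behaves like an exact match.
② A match query on a text field analyzes the query text with the field's analyzer, which is the standard analyzer unless another one is configured, and then searches for the resulting tokens.
③ The standard analyzer splits text on word boundaries (whitespace, punctuation and so on) and lowercases the tokens; it does not remove stop words such as a, an and the unless a stop word list is configured. Chinese text is split into single characters. Each resulting token is looked up in the inverted index and the matching documents are returned.
To analyze the query text with a different analyzer, specify it in the match query.

For example, specifying the analyzer: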

{
  "query": {
    "match": {
      "content": {
        "query": "天王盖地虎",
        "analyzer": "ik_max_word"
      }
    }
  }
}

Index mapping

"name" : {
          "type" : "keyword",
          "store" : true
        }

Index contents

"hits" : [
  {
    "_index" : "test_index05",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0,
    "_source" : {
      "name" : "天王盖地虎",
      "sex" : 1,
      "age" : 25,
      "address" : "上海",
      "remark" : "java"
    }
  }
]

Example

POST /test_index05/_search
{
    "query":{
        "match":{
            "name":"天王盖地虎"
        }
    }
}

Result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "test_index05",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "天王盖地虎",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      }
    ]
  }
}

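6. Specifying the ik analyzer for a text field in a static mapping

Here the address field is given the ik analyzer (ik_smart); name is a keyword field and is not analyzed.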

PUT /test_index05
{
    "mappings":{
        "properties":{
            "name":{
                "type":"keyword",
                "index":true,
                "store":true
            },
            "sex":{
                "type":"integer",
                "index":true,
                "store":true
            },
            "age":{
                "type":"integer",
                "index":true,
                "store":true
            },
            "address":{
                "type":"text",
                "index":true,
                "store":true,
                "analyzer":"ik_smart",
                "search_analyzer":"ik_smart"
            },
            "remark":{
                "type":"text",
                "index":true,
                "store":true
            }
        }
    }
}

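7. Changing an existing mapping

A simple approach:
① Create a new index whose static mapping meets your requirements.

② Migrate the data from the old index to the new one.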

POST _reindex
{
    "source":{
        "index":"index_old"
    },
    "dest":{
        "index":"index_new"
    }
}

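③ (Optional) If you want to keep using the old index name in your application, add an alias.
Give the new index an alias equal to the old index name; here the alias maps to exactly one index. (An alias can also point to several indices, which a later post will cover.) Because an alias cannot have the same name as an existing index, the old index must be deleted before this alias is created.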

PUT /index_new/_alias/index_old
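
As an alternative to deleting the old index and adding the alias in two separate steps, the aliases API (POST /_aliases) accepts a list of actions that are applied together. The sketch below only builds such a request body in Python; alias_swap_body is a hypothetical helper, and the index names are the placeholders used in this section. Note that remove_index deletes the old index and its data, so this should only be sent after the reindex has been verified.

```python
import json


def alias_swap_body(old_index, new_index):
    """Body for POST /_aliases: alias the new index under the old name and drop the old index."""
    return {
        "actions": [
            {"add": {"index": new_index, "alias": old_index}},
            {"remove_index": {"index": old_index}},
        ]
    }


body = alias_swap_body("index_old", "index_new")
print("POST /_aliases")
print(json.dumps(body, indent=2))
```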

A later post will explain how this kind of change is handled in a production environment.

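8. Elasticsearch optimistic concurrency control

  1. Pessimistic concurrency control

    Widely used by relational databases, this approach assumes that conflicting changes are likely and blocks access to a resource to prevent them. A typical example is locking a row before reading it, so that only the thread holding the lock can modify that row.

  2. Optimistic concurrency control
    Used by Elasticsearch, this approach assumes that conflicts are unlikely and does not block the operation being attempted. However, if the underlying data has been modified between the read and the write, the update fails. The application then decides how to resolve the conflict, for example by retrying the update, using the fresh data, or reporting the situation to the user.

Newer versions of ES (7.x and later) no longer use version for internal concurrency control; use if_seq_no=<sequence number>&if_primary_term=<primary term> instead.
Index a document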

PUT /test_index05/_doc/1
{
"name": "天王盖地虎",
"sex": 1,
"age": 25,
"address": "上海",
"remark": "java"
}

Update the document

POST /test_index05/_update/1
{
    "doc":{
        "name":"宝塔镇河妖"
    }
}

Result
[Screenshot of the update response; not preserved]
Update again, this time conditioned on the sequence number and primary term

POST /test_index05/_update/1/?if_seq_no=1&if_primary_term=1
{
    "doc":{
        "name":"秀儿"
    }
}

Result
[Screenshot of the response; not preserved]
Simulating concurrency
Run the same command again

POST /test_index05/_update/1/?if_seq_no=1&if_primary_term=1
{
    "doc":{
        "name":"秀儿"
    }
}

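Result
[Screenshot of the response; not preserved]
If the previous conditional update succeeded, the document's _seq_no has already advanced, so repeating the request with the old if_seq_no is expected to be rejected with a version conflict (HTTP 409).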
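
The compare-then-write rule behind if_seq_no and if_primary_term can be modelled in a few lines of Python. This is a toy simulation of the idea, not of Elasticsearch internals: ToyDocument and VersionConflict are invented names, and the sequence number is kept per document here, whereas a real cluster assigns sequence numbers per shard, so the actual values in your index will differ.

```python
class VersionConflict(Exception):
    """Raised when the caller's if_seq_no / if_primary_term are stale."""


class ToyDocument:
    def __init__(self, source):
        self.source = dict(source)
        self.seq_no = 0
        self.primary_term = 1

    def update(self, partial, if_seq_no=None, if_primary_term=None):
        conditional = if_seq_no is not None or if_primary_term is not None
        if conditional and (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict(
                f"expected seq_no={if_seq_no}, current seq_no={self.seq_no}"
            )
        self.source.update(partial)
        self.seq_no += 1  # every successful write advances the sequence number


doc = ToyDocument({"name": "天王盖地虎"})
doc.update({"name": "宝塔镇河妖"})                            # unconditional: seq_no 0 -> 1
doc.update({"name": "秀儿"}, if_seq_no=1, if_primary_term=1)  # matches: seq_no 1 -> 2
try:
    doc.update({"name": "秀儿"}, if_seq_no=1, if_primary_term=1)  # stale condition
    conflict = False
except VersionConflict:
    conflict = True

print(doc.seq_no, conflict)  # 2 True
```

On a conflict, a client would typically re-read the document to get its current _seq_no and _primary_term and then decide whether to retry.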
