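13. Basic syntax of the _search API; the difference between filter and query conditions

Main topics: the basic syntax of the _search API, and the difference between filter and query conditions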

1. Basic syntax of the _search API

1.1 Basic syntax review

GET /_search
GET /ecommerce,company/_search
GET /_search
{
  "from": 10,
  "size": 20
}
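
The from and size parameters implement offset-based paging. A minimal sketch of how a page number maps onto them, assuming 1-based page numbers and a hypothetical helper named page_to_search_body (neither is part of Elasticsearch):

```python
import json


def page_to_search_body(page: int, page_size: int) -> dict:
    """Build a _search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    # "from" is the zero-based offset of the first hit to return.
    return {"from": (page - 1) * page_size, "size": page_size}


body = page_to_search_body(page=2, page_size=10)
print(json.dumps(body))
```

By default the sum of from and size is capped by the index.max_result_window setting (10,000), so very deep paging usually needs a different mechanism.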

1.2 Query DSL syntax

## Basic structure of a Query DSL clause
{
    QUERY_NAME: {
        ARGUMENT: VALUE,
        ARGUMENT: VALUE,...
    }
}

## Structure of a clause that targets a specific field
{
    QUERY_NAME: {
        FIELD_NAME: {
            ARGUMENT: VALUE,
            ARGUMENT: VALUE,...
        }
    }
}
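
To make the field-level template concrete, here is a small sketch that fills it in programmatically. The helper name leaf_query is hypothetical; the match query and its query and operator arguments are real Elasticsearch parameters.

```python
import json


def leaf_query(query_name: str, field: str, **arguments) -> dict:
    """Build {QUERY_NAME: {FIELD_NAME: {ARGUMENT: VALUE, ...}}}."""
    return {query_name: {field: dict(arguments)}}


# A match query on "name" with two arguments: the text and the operator.
query = leaf_query("match", "name", query="special", operator="and")
body = {"query": query}
print(json.dumps(body, indent=2))
```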

Query examples

GET /_search
{
  "query": {
    "match_all": {}
  }
}

GET /ecommerce/_search
{
  "query": {
    "match": {
      "name": "special"
    }
  }
}

1.3 Combining multiple search conditions

Mock data for the examples

POST website/_doc
{
  "title": "my hadoop article",
  "content": "hadoop is very bad",
  "author_id": 111
}

POST website/_doc
{
  "title": "my elasticsearch article",
  "content": "es is very bad",
  "author_id": 110
}

POST website/_doc
{
  "title": "my elasticsearch article",
  "content": "es is very goods",
  "author_id": 111
}

Search requirement: title must contain elasticsearch; content may or may not contain 'elasticsearch'; author_id must not be 111

GET /website/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "elasticsearch"
          }
        }
      ],
      "should": [
        {
          "match": {
            "content": "elasticsearch"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "author_id": 111
          }
        }
      ]
    }
  }
}
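
To see which of the three mock documents this bool query selects, the sketch below replays its logic in plain Python. It is a deliberately crude model: whitespace-token containment stands in for the match query, and no scoring is done, so it only illustrates the must / should / must_not semantics, not how Elasticsearch actually executes the query.

```python
docs = [
    {"title": "my hadoop article", "content": "hadoop is very bad", "author_id": 111},
    {"title": "my elasticsearch article", "content": "es is very bad", "author_id": 110},
    {"title": "my elasticsearch article", "content": "es is very goods", "author_id": 111},
]


def contains(doc: dict, field: str, term) -> bool:
    """Crude stand-in for match: token containment for text, equality otherwise."""
    value = doc[field]
    if isinstance(value, str):
        return term in value.split()
    return value == term


def bool_matches(doc: dict) -> bool:
    must = contains(doc, "title", "elasticsearch")
    must_not = contains(doc, "author_id", 111)
    # The should clause is optional here because a must clause is present;
    # in Elasticsearch it would only raise the score of documents matching it.
    return must and not must_not


hits = [d for d in docs if bool_matches(d)]
print(hits)
```

Only the second document survives: the first fails the must clause and the third is excluded by must_not.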

Search requirement 02:

name must match tom, and at least one of the should clauses must also match (minimum_should_match is 1)

GET /test_index/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "tom"
        }
      },
      "should": [
        {
          "match": {
            "hired": true
          }
        },
        {
          "bool": {
            "must": {
              "match": {
                "personality": "good"
              }
            },
            "must_not": {
              "match": {
                "rude": true
              }
            }
          }
        }
      ],
      "minimum_should_match": 1   ## at least one should clause must match
    }
  }
}
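
The minimum_should_match setting matters because its default depends on the other clauses: as documented for recent Elasticsearch versions, it defaults to 1 when a bool query has should clauses but no must or filter clauses, and to 0 otherwise. A minimal sketch of that rule, using a hypothetical helper that only handles integer values (Elasticsearch also accepts percentages and other forms):

```python
def effective_minimum_should_match(bool_query: dict) -> int:
    """Return how many should clauses must match (integer values only)."""
    if "minimum_should_match" in bool_query:
        return int(bool_query["minimum_should_match"])
    if not bool_query.get("should"):
        return 0
    has_required = bool(bool_query.get("must") or bool_query.get("filter"))
    # With must/filter present, should clauses only influence scoring.
    return 0 if has_required else 1


with_must = {"must": {"match": {"name": "tom"}}, "should": [{"match": {"hired": True}}]}
only_should = {"should": [{"match": {"hired": True}}]}
explicit = dict(with_must, minimum_should_match=1)

for q in (with_must, only_should, explicit):
    print(effective_minimum_should_match(q))
```

Without the explicit setting, the query above would return every document whose name matches tom, with the should clauses affecting only the ranking.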

2. The difference between query and filter in ES

Sample data

POST /company/_doc/1
{
  "address": {
    "country": "china",
    "province": "guangdong",
    "city": "guangzhou"
  },
  "name": "jack",
  "age": 27,
  "join_date": "2017-01-01"
}

POST /company/_doc/2
{
  "address": {
    "country": "china",
    "province": "jiangsu",
    "city": "nanjing"
  },
  "name": "tom",
  "age": 30,
  "join_date": "2016-01-01"
}

POST /company/_doc/3
{
  "address": {
    "country": "china",
    "province": "shanxi",
    "city": "xian"
  },
  "name": "marry",
  "age": 35,
  "join_date": "2015-01-01"
}

Search request: age must be greater than or equal to 30, and join_date must be 2016-01-01

GET /company/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "join_date": "2016-01-01"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gte": 30
          }
        }
      }
    }
  }
}

2.1 Query DSL

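In query context, a clause answers the question: "Does this document match this query, and how relevant is it?"

Whether a document matches is easy to understand, but how is relevance measured? For every matching document, ES computes a relevance score at search time and returns it in the _score field; the higher the score, the better the match. The score is not stored with the indexed data. Computing it is fairly involved, so it costs extra time.

Query context is the execution environment for clauses passed under a query parameter, such as the query parameter of the search API. It applies whenever results need to be ranked by relevance.

Some typical query scenarios:

Which documents best match the words full text search?
Documents containing the word run, where runs, running, jog, or sprint also count as containing run
Documents containing quick, brown, and fox, where the closer together the words are, the more relevant the document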
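
Two of these scenarios can be sketched as query bodies. The field name content is a placeholder chosen for illustration; match and match_phrase with slop are standard queries. The run/jog scenario is left out because it depends on how the field is analyzed (stemming and synonyms), which is configured in the index mapping rather than in the query.

```python
import json

# Field name "content" is a placeholder, not taken from the examples above.
best_match = {"match": {"content": "full text search"}}

# match_phrase with slop lets the terms sit a few positions apart;
# documents where they are closer together score higher.
proximity = {
    "match_phrase": {"content": {"query": "quick brown fox", "slop": 3}}
}

for clause in (best_match, proximity):
    print(json.dumps({"query": clause}))
```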

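2.2 Filter DSL

In filter context, a clause answers the question: "Does this document match?"

The answer is a simple yes or no. No score is computed and relevance ranking plays no part, so filtering is somewhat more efficient.

Filter context is the execution environment for clauses passed under a filter parameter, for example the filter or must_not clauses of a bool query.

Some typical filter scenarios:

Is the creation date within 2013-2014?
Is the status field equal to published?
Is the lat_lon field within 10 km of a given coordinate?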
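
The three scenarios above can be sketched as filter clauses inside a bool query. The field name created, the reading of "2013-2014" as both full years, and the coordinates are assumptions made for illustration; range, term, and geo_distance are standard queries.

```python
import json

filters = [
    # Both calendar years 2013 and 2014 (assumed interpretation of the range).
    {"range": {"created": {"gte": "2013-01-01", "lt": "2015-01-01"}}},
    {"term": {"status": "published"}},
    # Placeholder coordinates; lat_lon must be mapped as a geo_point field.
    {"geo_distance": {"distance": "10km", "lat_lon": {"lat": 40.0, "lon": -70.0}}},
]

body = {"query": {"bool": {"filter": filters}}}
print(json.dumps(body, indent=2))
```

Because the bool query contains only filter clauses, no relevance scoring takes place; recent versions return such hits with a score of 0.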

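Summary:

Choosing between filter and query:

In general, if you are running a search and want the documents that best match the conditions returned first, use query; if you only want to select a subset of the data by some conditions and do not care about ranking, use filter
Put a condition in query only when you want documents that match it better to rank higher; if a condition should not influence the ranking of documents, put it in filter

Performance of filter and query:

filter: no relevance score is computed and no sorting by score is needed; in addition, ES automatically caches the most frequently used filters
query: the opposite. Relevance scores must be computed and the hits sorted by score, and the results are not cached in this way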
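
The selection rule can be sketched as a small builder that keeps relevance-affecting clauses under must and yes/no clauses under filter. The helper name build_bool_query is hypothetical, and the example clauses reuse the company sample data from section 2.

```python
import json


def build_bool_query(scoring_clauses, filtering_clauses) -> dict:
    """Put relevance-affecting clauses in must and yes/no clauses in filter."""
    bool_body = {}
    if scoring_clauses:
        bool_body["must"] = list(scoring_clauses)
    if filtering_clauses:
        bool_body["filter"] = list(filtering_clauses)
    return {"query": {"bool": bool_body}}


body = build_bool_query(
    scoring_clauses=[{"match": {"name": "tom"}}],
    filtering_clauses=[
        {"range": {"age": {"gte": 30}}},
        {"range": {"join_date": {"gte": "2016-01-01", "lte": "2016-01-01"}}},
    ],
)
print(json.dumps(body, indent=2))
```

Here only the name match contributes to _score; the age and join_date conditions just narrow the result set, so they are candidates for the filter cache.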

2.3 Combining filter and query

Suppose we have this query clause, which finds emails whose content contains "business opportunity":

{ 
    "match": { 
        "email": "business opportunity" 
    } 
}

Next we want to add a term filter so that only emails in the inbox are matched:

{ 
    "term": { 
        "folder": "inbox" 
    } 
}

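The query parameter of the search API accepts only one query clause, so the two clauses have to be wrapped in a compound query. Elasticsearch 1.x used the filtered query for this, but it was deprecated in 2.0 and removed in 5.0. Current versions use a bool query, with the scoring clause under must and the yes/no clause under filter:

{
    "bool": {
        "must":   { "match": { "email": "business opportunity" }},
        "filter": { "term":  { "folder": "inbox" }}
    }
}

Wrapping it in the query parameter gives the full search request:

GET /_search
{
    "query": {
        "bool": {
            "must":   { "match": { "email": "business opportunity" }},
            "filter": { "term":  { "folder": "inbox" }}
        }
    }
}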
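
Requests written for Elasticsearch 1.x with the filtered query can be migrated mechanically to bool. A minimal sketch, assuming a hypothetical helper filtered_to_bool that handles only the simple top-level case and does not recurse into nested queries:

```python
import json


def filtered_to_bool(query: dict) -> dict:
    """Rewrite {"filtered": {"query": Q, "filter": F}} as a bool query."""
    if "filtered" not in query:
        return query  # nothing to migrate
    legacy = query["filtered"]
    bool_body = {}
    if "query" in legacy:
        bool_body["must"] = legacy["query"]      # still scored
    if "filter" in legacy:
        bool_body["filter"] = legacy["filter"]   # yes/no only, not scored
    return {"bool": bool_body}


legacy_query = {
    "filtered": {
        "query": {"match": {"email": "business opportunity"}},
        "filter": {"term": {"folder": "inbox"}},
    }
}
migrated = filtered_to_bool(legacy_query)
print(json.dumps({"query": migrated}, indent=2))
```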

References:

"The difference between query and filter in ES", Java王文健's Blog, CSDN: https://blog.csdn.net/qq_29580525/article/details/80908523