Getting Started with Elasticsearch

Elasticsearch Notes


Preface

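Elasticsearch is an open-source search engine built on top of Apache Lucene™, a full-text search engine library. Lucene is arguably the most advanced, high-performance, full-featured search engine library available today, open source or proprietary.

But Lucene is just a library. To use its full power you have to work in Java and integrate Lucene directly into your application. Worse, you may need a degree in information retrieval to understand how it works. Lucene is very complex.

Elasticsearch is also written in Java and uses Lucene internally for indexing and searching, but its goal is to make full-text search easy by hiding Lucene's complexity behind a simple, consistent RESTful API.

However, Elasticsearch is more than Lucene, and more than a full-text search engine. It can be accurately described as:

A distributed real-time document store where every field can be indexed and searched
A distributed search engine with real-time analytics
Able to scale to hundreds of server nodes and handle petabytes of structured or unstructured data

Elasticsearch packages all of this into a single server that you talk to through its simple RESTful API, using a web client written in your favorite programming language or even the command line.

Getting started with Elasticsearch is easy. It ships with sensible defaults and hides complicated search theory from beginners. It works out of the box, and with minimal understanding you can become productive very quickly.

This article is based on Elasticsearch 2.2 and follows https://www.elastic.co/guide/en/elasticsearch/reference/2.2/getting-started.html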

Elasticsearch Notes

Basic Concepts

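Near RealTime
Elasticsearch is a near real-time search engine: there is a short delay (normally about one second) between indexing a document and that document becoming searchable.
Cluster
A cluster is a collection of nodes that together hold all of the data and provide search across it.
- The default cluster name is elasticsearch
- A cluster is identified by its name, so clusters on the same network must use distinct names
- Every cluster has at least one node
Node
Nodes make up a cluster; they store the cluster's data and carry out its searches. A node's name is random by default, but you can set it yourself.
- By default a node joins the cluster named elasticsearch
- There is no limit on the number of nodes in a cluster
- If no other node is running on the network, starting a node forms a new single-node cluster named elasticsearch
Index
An index is a collection of documents. Through an index you can search, update, and delete documents.
- An index name must be all lowercase
- You can define multiple indices in one cluster
Type
A type can be thought of as a class in Java, and documents as instances created from that class.
Document
A document is the smallest unit of data that can be indexed and searched.
- Documents are expressed in JSON
- Every document must be assigned a type
Shards
Shards split an index into pieces so that it can keep growing beyond the capacity of one machine. Sharding exists because a single node may not be able to hold all the data of an index.
Replicas
Because shards spread data across different physical nodes, a query may hit a shard whose node has failed. The solution is to keep copies (replicas) of each shard.
- By default, Elasticsearch gives each index 5 primary shards and 1 replica. With a cluster of only two nodes A and B, if node A holds the primary data, then after replication node B holds an identical copy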
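
To make the shard arithmetic concrete, here is a minimal Python sketch. The function name `total_shards` is hypothetical and not part of any Elasticsearch client; it only counts shard copies, assuming each primary shard gets one copy per configured replica.

```python
def total_shards(primaries, replicas):
    """Return the total number of shard copies (primaries plus replicas)."""
    # Every primary shard is copied `replicas` times.
    return primaries * (1 + replicas)


# Defaults described above: 5 primary shards, 1 replica.
print(total_shards(5, 1))  # 10
```

With the defaults, a fully allocated index therefore consists of 10 shards: 5 primaries and 5 replica copies.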

Starting Elasticsearch

Run the following in the bin directory

./elasticsearch

To override the cluster name and node name at startup

./elasticsearch --cluster.name fanyank_cluster --node.name fanyank_node1

Quick Start

Run a health check

curl "localhost:9200/_cat/health?v"

On success you will see

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent 
1541906922 11:28:42  elasticsearch green           1         1      0   0    0    0        0             0                  -                100.0%

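status: green (everything is fine), yellow (all data is available, but some replica shards are not allocated), red (some data is unavailable)

List all nodes

curl "localhost:9200/_cat/nodes?v"

On success you will see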

host      ip        heap.percent ram.percent  load node.role master name       
127.0.0.1 127.0.0.1            4          81 -1.00 d         *      Nightshade

List all indices

curl "localhost:9200/_cat/indices?v"

On success you will see

health status index pri rep docs.count docs.deleted store.size pri.store.size

This means we have not created any index yet

Creating an index
Create an index named customer
```
curl -XPUT 'localhost:9200/customer?pretty'
curl 'localhost:9200/_cat/indices?v'
```
After running it you will see
```
health status index    pri rep docs.count docs.deleted store.size pri.store.size 
yellow open   customer   5   1          0            0       650b           650b
```
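We can see that the customer index has 5 primary shards and 1 replica, and contains 0 documents. Its health is yellow because we only have one node: a replica is never allocated on the same node as its primary, so the replica shards stay unassigned until a second node joins the cluster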

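Creating a mapping
Every type has a corresponding mapping. When the first document is indexed, ES inspects the document's fields and builds a mapping for the type automatically; as later documents introduce new fields, those fields are added to the mapping as well.
Here we create a mapping by hand and define the type of every field strictly. We want the following mapping in the customer index

{
    "properties": {
         "name": {
             "type": "string",
             "index": "not_analyzed"  /* disable analysis (no tokenization) */
         },
         "age": {
             "type": "integer"
         },
         "score": {
             "type": "double"
         },
         "create_time": {
             "type": "date",
             "format": "yy-MM-dd HH:mm:ss"   /* date format */
         }
     }
}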
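
The text does not show how to submit this mapping. The sketch below builds (without sending) such a request in Python. It assumes the 2.x-style endpoint `/{index}/_mapping/{type}` and the type name `external` used later in these notes; the helper `build_put_mapping_request` is hypothetical. The `/* ... */` comments are left out because they are not valid JSON.

```python
import json
import urllib.request


def build_put_mapping_request(host, index, doc_type, mapping):
    """Build (but do not send) a PUT request that registers a mapping."""
    # Assumed 2.x-style endpoint: /{index}/_mapping/{type}
    url = "http://{}/{}/_mapping/{}".format(host, index, doc_type)
    body = json.dumps(mapping).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


customer_mapping = {
    "properties": {
        "name": {"type": "string", "index": "not_analyzed"},
        "age": {"type": "integer"},
        "score": {"type": "double"},
        "create_time": {"type": "date", "format": "yy-MM-dd HH:mm:ss"},
    }
}

request = build_put_mapping_request(
    "localhost:9200", "customer", "external", customer_mapping
)
print(request.get_method(), request.full_url)
# To actually send it against a running node: urllib.request.urlopen(request)
```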

Inserting a document
Before inserting data you must specify its type. Here we index a document of type external into the customer index, with ID 1
The JSON document is {"name":"fanyank"}
```
curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '{"name" : "fanyank"}'
```
After running it you will see
```
{
    "_index" : "customer",
    "_type" : "external",
    "_id" : "1",
    "_version" : 1,
    "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
    },
    "created" : true
}
```
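Note that Elasticsearch does not require the index to exist beforehand: if the index is missing, Elasticsearch creates it automatically and then indexes the document. Always double-check the spelling of the index name when inserting.

We can also omit the ID when inserting, in which case Elasticsearch generates a unique ID for us. Note that the HTTP method is POST in this case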
```
curl -XPOST 'localhost:9200/customer/external?pretty' -d '{"name" : "fanyank"}'
```
The response is as follows
```
{
    "_index" : "customer",
    "_type" : "external",
    "_id" : "AWcBDGmpUlH0QZg74qVp",
    "_version" : 1,
    "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
    },
    "created" : true
}
```

Updating data
Updating works like inserting: if the ID already exists, Elasticsearch performs an update
Under the hood, an update deletes the existing document and indexes a new one

Method 1

curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '{"name":"Big Bang"}'

Method 2

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '{"doc":{"name":"Nathan James"}}'

Method 3
Add an age field to the document

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '{"doc":{"name":"Nathan James","age":20}}'

Method 4
Use a script

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
    "script" : "ctx._source.age += 5"
}'

Note that methods 2, 3, and 4 send POST requests to the _update endpoint, and that methods 2 and 3 wrap the changes in a doc field
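
As a rough mental model of what a `doc` update does, the following Python sketch merges the partial document into the stored one and bumps the version. It is a simplified local model under stated assumptions (a shallow merge, a version that always increases), not Elasticsearch code; the function `partial_update` is hypothetical.

```python
def partial_update(stored, partial_doc):
    """Model a `doc` update: merge fields, then store the result as a new version."""
    merged = dict(stored["_source"])
    merged.update(partial_doc)  # new fields are added, existing ones are overwritten
    return {"_version": stored["_version"] + 1, "_source": merged}


stored = {"_version": 1, "_source": {"name": "fanyank"}}
# Method 3: change the name and add an age field.
stored = partial_update(stored, {"name": "Nathan James", "age": 20})
# Method 4, modelled locally: the script "ctx._source.age += 5".
stored = partial_update(stored, {"age": stored["_source"]["age"] + 5})
print(stored)
```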

Retrieving data
Retrieve the document with ID 1 that we just inserted

curl 'localhost:9200/customer/external/1?pretty'

The response is as follows

{
    "_index" : "customer",
    "_type" : "external",
    "_id" : "1",
    "_version" : 4,
    "found" : true,
    "_source" : {
        "name" : "fanyank"
    }
}

Deleting an index

curl -XDELETE 'localhost:9200/customer?pretty'
curl 'localhost:9200/_cat/indices?v'

Deleting a document

curl -XDELETE 'localhost:9200/customer/external/2?pretty'

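Bulk operations
A bulk request can index, update, and delete multiple documents in a single request
The following example indexes one document {"name":"Alice"}, updates one document to {"name":"tom","age":16}, and deletes the document with {"_id":"1"}, all in one request
```
curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"index":{"_id":"3"}}
{"name":"Alice"}
{"update":{"_id":"2"}}
{"doc":{"name":"tom","age":16}}
{"delete":{"_id":"1"}}
'
```
Each action and each document must be a complete JSON object on its own line, and the body must end with a newline. A single malformed line (for example a missing quote) causes a parse error.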
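
Because hand-written bulk bodies are easy to get wrong, it can help to generate them. The sketch below serializes each action and document with `json.dumps`, so every line is valid JSON by construction. The helper `build_bulk_body` is hypothetical and only builds the request body; it does not talk to a server.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, optional document) pairs as newline-delimited JSON."""
    lines = []
    for action, document in actions:
        lines.append(json.dumps(action))
        if document is not None:  # delete actions carry no document line
            lines.append(json.dumps(document))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body([
    ({"index": {"_id": "3"}}, {"name": "Alice"}),
    ({"update": {"_id": "2"}}, {"doc": {"name": "tom", "age": 16}}),
    ({"delete": {"_id": "1"}}, None),
])
print(bulk_body, end="")
```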

Search API in Detail

Search all data in an index

curl 'localhost:9200/bank/_search?q=*&pretty'

Or

curl 'localhost:9200/bank/_search?pretty' -d '
{
    "query" : {"match_all":{}}
}
'

Limit the number of returned results to 100 (the default is 10)

curl 'localhost:9200/bank/_search?pretty' -d '
{
    "query" : {"match_all":{}},
    "size": 100
}
'

Pagination
Return 10 results starting at offset 10

curl 'localhost:9200/bank/_search?pretty' -d '
{
    "query" : {"match_all":{}},
    "from": 10,
    "size": 10
}
'
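
The relationship between a page number and the `from`/`size` pair can be sketched as follows. The helper `page_params` and the 1-based page numbering are assumptions made for illustration.

```python
def page_params(page, per_page=10):
    """Translate a 1-based page number into `from` and `size`."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"from": (page - 1) * per_page, "size": per_page}


search_body = {"query": {"match_all": {}}}
search_body.update(page_params(2))  # second page of 10 results
print(search_body)
```

Page 2 with 10 results per page yields `from` 10 and `size` 10, the same values as in the request above.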

Sorting
Sort by balance in descending order

curl 'localhost:9200/bank/_search?pretty' -d '
{
    "query" : {"match_all":{}},
    "sort": { "balance": { "order": "desc" } }
}
'

Restricting returned fields
Return only the account_number and balance fields

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"_source": ["account_number", "balance"]
}'

Conditional query
Find the account whose account_id is 10010

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match": {"account_id":"10010"} }
}'

AND query
Find accounts whose address matches test and whose account_id is 10010

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { 
    "bool": {
        "must": [
            {"match": {"account_id":"10010"}},
            {"match": {"address": "test"}}
        ]
    }
 }
}'

OR query
Find accounts whose account_id is 10010 or 10011

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { 
    "bool": {
        "should": [
            {"match": {"account_id":"10010"}},
            {"match": {"account_id": "10011"}}
        ]
    }
 }
}'

NOT query
Find accounts whose account_id is neither 10010 nor 10011

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { 
    "bool": {
        "must_not": [
            {"match": {"account_id":"10010"}},
            {"match": {"account_id": "10011"}}
        ]
    }
 }
}'

Combined query (AND, OR, NOT)
Find accounts whose account_id is not 10011 and whose address matches test

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { 
    "bool": {
        "must_not": [
            {"match": {"account_id": "10011"}}
        ],
        "must": [
            {"match": {"address": "test"}}
        ]
    }
        ]
    }
 }
}'
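
The boolean logic of the queries above can be mimicked locally. The Python sketch below is a deliberately simplified model: it treats `match` as a plain equality test (real `match` queries analyze text and score results), and the sample accounts are made up. The function names are hypothetical.

```python
def matches(doc, clause):
    """Treat {"match": {field: value}} as an equality test (a simplification)."""
    field, value = next(iter(clause["match"].items()))
    return str(doc.get(field)) == str(value)


def bool_matches(doc, bool_query):
    """Evaluate must / should / must_not against one document."""
    must = bool_query.get("must", [])
    should = bool_query.get("should", [])
    must_not = bool_query.get("must_not", [])
    if not all(matches(doc, c) for c in must):
        return False
    if any(matches(doc, c) for c in must_not):
        return False
    # With no must clause, at least one should clause has to match.
    if should and not must:
        return any(matches(doc, c) for c in should)
    return True


accounts = [  # hypothetical sample data
    {"account_id": "10010", "address": "test"},
    {"account_id": "10011", "address": "test"},
    {"account_id": "10012", "address": "other"},
]
query = {
    "must_not": [{"match": {"account_id": "10011"}}],
    "must": [{"match": {"address": "test"}}],
}
print([a["account_id"] for a in accounts if bool_matches(a, query)])
```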

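Filter queries

A filter selects documents purely by whether they match, without computing a relevance score. In the examples below the filter is applied to all documents in the index

Filtering on a single field
When looking up an exact value, we usually don't need relevance scoring; we only want to include or exclude documents quickly. So we wrap the term in a constant_score query, which makes ES evaluate it in non-scoring mode. In principle, non-scoring queries are faster than scoring ones.
1. Filter a single field with term
  Over all data, keep the users whose name is fanyank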
```
{
    "query": {
        "constant_score": {
            "filter": {
                "term": {
                    "name": "fanyank"
                }
            }
        }
    }
}
```
2. Use terms to match a field against several values
  Over all data, keep the users whose name is fanyank or jerry
```
{
    "query": {
        "constant_score": {
            "filter": {
                "terms": {
                    "name": [
                        "fanyank",
                        "jerry"
                    ]
                }
            }
        }
    }
}
```

Combining filters on multiple fields
Some scenarios need combined filters. For example, how should we write the ES query for the following SQL?

select * from customer where (name = 'fanyank' and age = 10)
or (age = 2)

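This requires filtering on several fields. Before starting, it helps to understand the bool query, which provides AND, OR, and NOT logic
The bool query has the following structure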

{
    "bool": {
        "must": [],
        "must_not": [],
        "should": []
    }
}

Where
must is AND
must_not is NOT
should is OR

Combining multiple term filters with AND, OR, NOT
For the SQL statement above, we can write the following query

{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "should": [
                        {"term": {"age": 2}},
                        {
                            "bool": {
                                "must": [
                                    {"term": {"name": "fanyank"}},
                                    {"term": {"age": 10}}
                                ]
                            }
                        }
                    ]
                }
            }
        }
    }
}
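
Nested bool filters like this one are easier to read when assembled from small helpers. The sketch below rebuilds the same request body in Python; the helper names `term`, `all_of`, and `any_of` are hypothetical and simply produce dictionaries.

```python
import json


def term(field, value):
    """Exact-value filter on one field."""
    return {"term": {field: value}}


def all_of(*clauses):
    """AND: every clause must match."""
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses):
    """OR: at least one clause must match."""
    return {"bool": {"should": list(clauses)}}


# (name = 'fanyank' AND age = 10) OR (age = 2)
sql_filter = any_of(
    term("age", 2),
    all_of(term("name", "fanyank"), term("age", 10)),
)
request_body = {"query": {"filtered": {"filter": sql_filter}}}
print(json.dumps(request_body, indent=4))
```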

Range query

select * from customer where create_time between '2018-11-01' and '2018-11-30'

The corresponding ES query is

{
    "query": {
        "filtered": {
            "filter": {
                "range": {
                    "create_time": {
                        "gte": "2018-11-01",
                        "lte": "2018-11-30"
                    }
                }
            }
        }
    }
}
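
The upper bound of a month range is easy to get wrong because months differ in length. As a small sketch, the bounds can be derived with Python's standard `calendar` module; the helper `month_range_filter` is hypothetical and only builds the filter dictionary.

```python
import calendar
from datetime import date


def month_range_filter(field, year, month):
    """Build a range filter covering one whole calendar month."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return {
        "range": {
            field: {
                "gte": date(year, month, 1).isoformat(),
                "lte": date(year, month, last_day).isoformat(),
            }
        }
    }


print(month_range_filter("create_time", 2018, 11))
```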

Combining range and terms

select * from customer where 
name in ('fanyank','jerry')
and create_time between '2018-11-01' and '2018-11-03'

The corresponding ES query is:

{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {
                            "terms": {
                                "name": [
                                    "fanyank",
                                    "jerry"
                                ]
                            }
                        },
                        {
                            "range": {
                                "create_time": {
                                    "gte": "2018-11-01",
                                    "lte": "2018-11-03"
                                }
                            }
                        }
                    ]
                }
            }
        }
    }
}

Aggregations

avg
Use avg to compute the average value of a field
Compute the students' average grade

{
    "aggs": {
        "avg_grade": {
            "avg": {
                "field": "grade"
            }
        }
    }
}

The result is as follows

{
    ...

    "aggregations": {
        "avg_grade": {
            "value": 75
        }
    }
}

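cardinality
Use cardinality to count the distinct values of a field
Count the number of students: since every student's ID is unique, the distinct count of IDs is the number of students. Counting distinct names instead would be inaccurate, because different students may share a name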
```
{
    "size": 0,
    "aggs": {
        "student_count": {
            "cardinality": {
                "field": "id"
            }
        }
    }
}
```
The result is as follows:
```
{
    ...

    "aggregations": {
        "student_count": {
            "value": 100
        }
    }
}
```

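stats
Compute statistics over student grades (min, max, avg, count, sum). The example uses extended_stats, which additionally returns sum_of_squares, variance, and std_deviation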

{
    "size": 0,
    "aggs": {
        "student_grade_stats": {
            "extended_stats": {
                "field": "grade"
            }
        }
    }
}

The result is as follows:

{
    ...

    "aggregations": {
        "student_grade_stats": {
            "count": 9,
            "min": 72,
            "max": 99,
            "avg": 86,
            "sum": 774,
            "sum_of_squares": 67028,
            "variance": 51.55555555555556,
            "std_deviation": 7.180219742846005,
            "std_deviation_bounds": {
                "upper": 100.36043948569201,
                "lower": 71.63956051430799
            }
        }
    }
}
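
The extra fields in this result follow from count, sum, and sum_of_squares. The sketch below recomputes them in Python from the three numbers in the response; the formulas (population variance, bounds at two standard deviations from the mean) are inferred from the numbers shown, and the function name is hypothetical.

```python
import math


def derive_extended_stats(count, total, sum_of_squares):
    """Derive avg, variance and std deviation bounds from three running sums."""
    avg = total / count
    variance = sum_of_squares / count - avg ** 2  # population variance
    std_deviation = math.sqrt(variance)
    return {
        "avg": avg,
        "variance": variance,
        "std_deviation": std_deviation,
        # Bounds as they appear in the response: avg plus or minus 2 std deviations.
        "upper": avg + 2 * std_deviation,
        "lower": avg - 2 * std_deviation,
    }


stats = derive_extended_stats(count=9, total=774, sum_of_squares=67028)
print(stats)
```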

terms
terms groups documents by the distinct values of a given field; each group is returned as a bucket
Group students by gender

{
    "size": 0,
    "aggs": {
        "genders": {
            "terms": {
                "field": "gender"
            }
        }
    }
}

The result is as follows:

{
    ...

    "aggregations" : {
        "genders" : {
            "doc_count_error_upper_bound": 0, 
            "sum_other_doc_count": 0, 
            "buckets" : [ 
                {
                    "key" : "male",
                    "doc_count" : 10
                },
                {
                    "key" : "female",
                    "doc_count" : 10
                }
            ]
        }
    }
}

value_count
Count the students (the count is correct even when two students share the same name)

{
    "size": 0,
    "aggs": {
        "student_count": {
            "value_count": {
                "field": "name"
            }
        }
    }
}
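
The difference between value_count and cardinality can be illustrated with plain Python collections. The names below are made-up sample data. Note that this local version is exact, whereas the cardinality aggregation in Elasticsearch is an approximation on large data sets.

```python
names = ["fanyank", "jerry", "fanyank", "tom"]  # hypothetical sample data

value_count = len(names)       # counts every value, duplicates included
cardinality = len(set(names))  # counts distinct values only

print(value_count, cardinality)  # 4 3
```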

Nested aggregations
Compute the average math score separately for male and female students

{
    "aggs": {
        "groupby_gender": {
            "terms": {
                "field": "gender"
            },
            "aggs": {
                "avg_math_score": {
                    "avg": {
                        "field": "math"
                    }
                }
            }
        }
    }
}

The result is as follows

{
    ...

    "aggregations": {
        "groupby_gender": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "male",
                    "doc_count": 2,
                    "avg_math_score": {
                        "value": 94.9
                    }
                },
                {
                    "key": "female",
                    "doc_count": 1,
                    "avg_math_score": {
                        "value": 9.9
                    }
                }
            ]
        }
    }
}
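
What this nested aggregation computes can be reproduced locally: group the documents by gender, then average the math score inside each group. The Python sketch below uses made-up sample data and mirrors the bucket structure of the response.

```python
from collections import defaultdict

students = [  # hypothetical sample data
    {"gender": "male", "math": 90.0},
    {"gender": "male", "math": 100.0},
    {"gender": "female", "math": 80.0},
]

scores_by_gender = defaultdict(list)
for student in students:
    # Outer "terms" step: one bucket per distinct gender.
    scores_by_gender[student["gender"]].append(student["math"])

buckets = [
    {
        "key": gender,
        "doc_count": len(scores),
        # Inner "avg" step: average inside the bucket.
        "avg_math_score": {"value": sum(scores) / len(scores)},
    }
    for gender, scores in scores_by_gender.items()
]
# Like the response above, list the largest bucket first.
buckets.sort(key=lambda bucket: bucket["doc_count"], reverse=True)
print(buckets)
```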

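top_hits
top_hits is also an aggregation. It lets us see selected properties of the documents inside each bucket (otherwise we would only see the field we aggregated on).
A concrete scenario: group students by name, compute the average score of the students sharing each name, and additionally list the students in each group sorted by age in descending order

{   
    "size": 0,
    "aggs": {
        "group-by-name": {
            "terms": {
                "field": "name"
            },
            "aggs": {
                "add-top-hits": {
                    "top_hits": {
                        "sort": [
                            {
                                "age": {
                                    "order": "desc"
                                }
                            }
                        ],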
                        "_source": {
                            "include": [
                                "age"
                            ]
                        }
                    }
                }, 
                "score-avg": {
                    "avg": {
                        "field": "score"
                    }
                }
            }
        }
    }
}

The result is as follows:

{
    "took": 379,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 6,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "group-by-name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "fanyank",
                    "doc_count": 3,
                    "score-avg": {
                        "value": 7.3999999999999995
                    },
                    "add-top-hits": {
                        "hits": {
                            "total": 3,
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "customer",
                                    "_type": "external",
                                    "_id": "AWc_90rRosSO2HBCMsNS",
                                    "_score": null,
                                    "_source": {
                                        "age": 29
                                    },
                                    "sort": [
                                        29
                                    ]
                                },
                                {
                                    "_index": "customer",
                                    "_type": "external",
                                    "_id": "AWc_9xGqosSO2HBCMsNR",
                                    "_score": null,
                                    "_source": {
                                        "age": 10
                                    },
                                    "sort": [
                                        10
                                    ]
                                },
                                {
                                    "_index": "customer",
                                    "_type": "external",
                                    "_id": "AWdAA4_cosSO2HBCMsNW",
                                    "_score": null,
                                    "_source": {
                                        "age": 8
                                    },
                                    "sort": [
                                        8
                                    ]
                                }
                            ]
                        }
                    }
                },
                {
                    "key": "jerry",
                    "doc_count": 2,
                    "score-avg": {
                        "value": 6.95
                    },
                    "add-top-hits": {
                        "hits": {
                            "total": 2,
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "customer",
                                    "_type": "external",
                                    "_id": "AWdABHJyosSO2HBCMsNX",
                                    "_score": null,
                                    "_source": {
                                        "age": 8
                                    },
                                    "sort": [
                                        8
                                    ]
                                },
                                {
                                    "_index": "customer",
                                    "_type": "external",
                                    "_id": "AWc_95cqosSO2HBCMsNT",
                                    "_score": null,
                                    "_source": {
                                        "age": 2
                                    },
                                    "sort": [
                                        2
                                    ]
                                }
                            ]
                        }
                    }
                },
                {
                    "key": "tom",
                    "doc_count": 1,
                    "score-avg": {
                        "value": 12.8
                    },
                    "add-top-hits": {
                        "hits": {
                            "total": 1,
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "customer",
                                    "_type": "external",
                                    "_id": "AWc_9oRhosSO2HBCMsNQ",
                                    "_score": null,
                                    "_source": {
                                        "age": 28
                                    },
                                    "sort": [
                                        28
                                    ]
                                }
                            ]
                        }
                    }
                }
            ]
        }
    }
}

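script
Although ES offers a wide variety of aggregation functions, they are often not enough for complex business scenarios. In that case we compute statistics with our own scripts
First, enable scripting in the config/elasticsearch.yml configuration file

script.inline: on
script.indexed: on

Restart ES after changing the configuration

Count the distinct values of firstName and lastName concatenated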

{
    "size": 0,
    "aggs": {
        "fullNameCount": {
            "cardinality": {
                "script": "doc['firstName'].value + ' ' + doc['lastName'].value"
            }
        }
    }
}

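Sum the students' scores; if a score is missing, use the student's age in its place (not very sensible, but it illustrates the idea). The script treats a score of 0 as missing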

{
    "size": 0,
    "aggs": {
        "group_by_name": {
            "terms": {"field": "name"},
            "aggs": {
                "score_or_age": {
                    "sum": {
                        "script": "if(doc['score'].value==0){return doc['age'].value}else{return doc['score'].value}"
                    }
                }
            }
        }
    }
}

Simplified version (using the ternary operator)

{
    "size": 0,
    "aggs": {
        "group_by_name": {
            "terms": {"field": "name"},
            "aggs": {
                "score_or_age": {
                    "sum": {
                        "script": "return (doc['score'].value==0) ? doc['age'].value : doc['score'].value"
                    }
                }
            }
        }
    }
}

Field Reference

The following search result is used to explain what each field means

{
"took": 63,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
},
"hits": {
    "total": 1000,
    "max_score": 1,
    "hits": [
    {
        "_index": "bank",
        "_type": "account",
        "_id": "1",
        "_score": 1,
        "_source": {
        "account_number": 1,
        "balance": 39225,
        "firstname": "Amber",
        "lastname": "Duke",
        "age": 32,
        "gender": "M",
        "address": "880 Holmes Lane",
        "employer": "Pyrami",
        "email": "[email protected]",
        "city": "Brogan",
        "state": "IL"
        }
    },
    {
        "_index": "bank",
        "_type": "account",
        "_id": "6",
        "_score": 1,
        "_source": {
        "account_number": 6,
        "balance": 5686,
        "firstname": "Hattie",
        "lastname": "Bond",
        "age": 36,
        "gender": "M",
        "address": "671 Bristol Street",
        "employer": "Netagy",
        "email": "[email protected]",
        "city": "Dante",
        "state": "TN"
        }
    }
    ]
}
}

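Field reference:

took: time the query took, in milliseconds
timed_out: whether the query timed out
_shards: number of shards searched, and how many of them succeeded and failed
hits: the search results
- hits.total: total number of documents matching the query
- hits.hits: the returned results, as an array
- _score: how well the document matches the query
- max_score: the highest score among the matching documents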
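
As a small illustration of reading these fields from client code, the Python sketch below walks a trimmed-down copy of the response shown earlier (only a few `_source` fields are kept). It works on a plain dictionary and does not contact a server.

```python
response = {  # trimmed-down version of the search result above
    "took": 63,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "failed": 0},
    "hits": {
        "total": 1000,
        "max_score": 1,
        "hits": [
            {"_id": "1", "_score": 1, "_source": {"firstname": "Amber"}},
            {"_id": "6", "_score": 1, "_source": {"firstname": "Hattie"}},
        ],
    },
}

total_matches = response["hits"]["total"]  # all matching documents
returned = response["hits"]["hits"]        # only the documents in this response
first_names = [hit["_source"]["firstname"] for hit in returned]
all_shards_ok = response["_shards"]["failed"] == 0

print(total_matches, len(returned), first_names, all_shards_ok)
```

Note that hits.total (1000 here) can be much larger than the number of entries in hits.hits, which is limited by size.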

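Filtering out empty strings and handling null values in ES

First determine whether you need to filter out empty strings or null values. Suppose the field to filter on is non_field

Empty strings
An empty string looks like this: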

{
    ...
    "non_field": ""
}

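To filter out empty strings, first check the type of non_field in the mapping. If it is

{
    "non_field": {
        "type": "string",
        "index": "not_analyzed"  /* analysis disabled */
    }
    /* or this type (Elasticsearch 5.0 and later) */
    "non_field": {
        "type": "keyword"
    }
}

then the following query filters out empty strings directly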

{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must_not": {
                        "term": {
                            "non_field": ""
                        }
                    }
                }
            }
        }
    }
}

If non_field is an analyzed string in the mapping, there are two options: change the field's type in the mapping, or filter with a script
The script-based filter looks like this:

{
    "query": {
        "filtered": {
            "filter": {
                "script": {
                    "script": "_source.non_field.length() != 0"
                }
            }
        }
    }
}

A null value looks like this:

{
    ...
    "non_field": null
    /* or the field is missing entirely */
}

Filtering out null values is simple:

{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": {
                        "exists": {"field": "non_field"}
                    }
                }
            }
        }
    }
}

Filtering out both empty strings and null values
If analysis is disabled for non_field, use:

{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must_not": {
                        "term": {
                            "non_field": ""
                        }
                    },
                    "must": {
                        "exists": {
                            "field": "non_field"
                        }
                    }
                }
            }
        }
    }
}

If analysis is enabled for non_field, use:

{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {
                            "exists": {
                                "field": "non_field"
                            }
                        }, {
                            "script": {
                                "script": "_source.non_field.length() != 0"
                            }
                        }
                    ]
                }
            }
        }
    }
}
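
The combined condition (field exists, is not null, and is not an empty string) can be written as a local predicate, which is handy for checking expectations against sample documents. The function `has_non_empty_value` and the documents below are hypothetical.

```python
def has_non_empty_value(document, field):
    """True when the field exists, is not null, and is not an empty string."""
    value = document.get(field)  # a missing field and an explicit null both give None
    return value is not None and value != ""


documents = [  # hypothetical sample data
    {"id": 1, "non_field": "hello"},
    {"id": 2, "non_field": ""},
    {"id": 3, "non_field": None},
    {"id": 4},
]
print([d["id"] for d in documents if has_non_empty_value(d, "non_field")])
```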
