Querying Empty Strings with ES Scripts

The query methods described here are based on ES 5.2 and may not apply to other versions. For other versions, see the official documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/5.2/modules-scripting-fields.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.2/modules-scripting-painless-syntax.html

Querying documents where a field is empty

curl localhost:9200/customer/_search?pretty -d'{
    "size": 5,
    "query": {
        "bool": {
            "must": {
                "script": {
                    "script": {
                        "inline": "params._source.strnickname.length()<1",
                        "lang": "painless"
                    }
                }
            }
        }
    }
}'

Using doc, fields, and _source

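doc: doc_values are enabled by default on every field except analyzed text fields. doc can also read a text field, but only if fielddata is enabled on it. With fielddata enabled, a query loads all of the field's terms into the JVM heap, which consumes a lot of memory, so use it with caution. The official documentation says a field can be read with doc['field_name'], but in my tests that failed while doc.field_name worked. Further testing showed that the single quotes were the problem: the request body is passed to curl inside a single-quoted shell string, so the inner single quotes end that string and never reach Elasticsearch. Writing doc['field_name'] directly in such a command always fails with: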

"caused_by" : {
        "type" : "script_exception",
        "reason" : "compile error",
        "caused_by" : {
          "type" : "illegal_argument_exception",
          "reason" : "Variable [field_name] is not defined."
        },
        "script_stack" : [
          "doc[field_name].length() <  ...",
          "    ^---- HERE"
        ],
        "script" : "doc[field_name].length() < 1",
        "lang" : "painless"
      }
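
The error message shows the script arriving as doc[field_name], with the quotes already gone. A minimal sketch of how the shell produces that result, using hypothetical variable names and no Elasticsearch node at all:

```shell
#!/bin/sh
# Hypothetical variable names; only the quoting behaviour matters here.

# Inner single quotes close and reopen the shell string, so they are dropped.
naive='doc['field_name'].length() < 1'

# '\'' closes the string, emits one escaped quote, then reopens the string.
escaped='doc['\''field_name'\''].length() < 1'

printf '%s\n' "$naive"    # doc[field_name].length() < 1
printf '%s\n' "$escaped"  # doc['field_name'].length() < 1
```

The first line printed matches the "script" value in the error above, which suggests the quotes were lost in the shell rather than inside Painless.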

However, in my tests doc[\u0027field_name\u0027] works, and so does doc['''field_name''']. The example above can therefore be written as follows:

curl localhost:9200/customer/_search?pretty -d'{
    "size": 5,
    "query": {
        "bool": {
            "must": {
                "script": {
                    "script": {
                        "inline": "doc['\''strnickname'\''].length()<1",
                        "lang": "painless"
                    }
                }
            }
        }
    }
}'
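
The '\'' escapes in the command above are easy to get wrong. One possible alternative, sketched here on the assumption that a POSIX shell is available, is to build the request body with a quoted here-document, which keeps single quotes literally. The variable name body is hypothetical.

```shell
#!/bin/sh
# A quoted delimiter ('EOF') disables all expansion, so the body is taken literally.
body=$(cat <<'EOF'
{
    "size": 5,
    "query": {
        "bool": {
            "must": {
                "script": {
                    "script": {
                        "inline": "doc['strnickname'].length()<1",
                        "lang": "painless"
                    }
                }
            }
        }
    }
}
EOF
)

# Print the body to confirm the single quotes survived.
printf '%s\n' "$body"

# To send it (assumes the same local node and index as above):
# printf '%s' "$body" | curl 'localhost:9200/customer/_search?pretty' -d @-
```

The trade-off is a slightly longer command in exchange for a script string that can be copied from the documentation unchanged.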

fields: available only for fields mapped with "store": true. Access them with _fields['field_name'].value or _fields['field_name'].values.

_source: a special stored field. Any field can be read from it, using _source.field_name.

curl -XPUT 'localhost:9200/my_index?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "my_type": {
      "properties": {
        "title": { 
          "type": "text"
        },
        "first_name": {
          "type": "text",
          "store": true
        },
        "last_name": {
          "type": "text",
          "store": true
        }
      }
    }
  }
}
'
curl -XPUT 'localhost:9200/my_index/my_type/1?pretty' -H 'Content-Type: application/json' -d'
{
  "title": "Mr",
  "first_name": "Barry",
  "last_name": "White"
}
'
curl -XGET 'localhost:9200/my_index/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "script_fields": {
    "source": {
      "script": {
        "inline": "params._source.title + '\'' '\'' + params._source.first_name + '\'' '\'' + params._source.last_name"
      }
    },
    "stored_fields": {
      "script": {
        "inline": "params._fields['\''first_name'\''].value + '\'' '\'' + params._fields['\''last_name'\''].value"
      }
    }
  }
}
'

doc performs much better than stored fields

Stored fields (which includes the stored _source field) are much slower than doc-values. They are optimised for returning several fields per result, while doc values are optimised for accessing the value of a specific field in many documents.
It makes sense to use _source or stored fields when generating a script field for the top ten hits from a search result but, for other search and aggregation use cases, always prefer using doc values.
