Elasticsearch Common Syntax Rules

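Contents

    • Query operations
      • Main query keywords
      • Template for grouped aggregations
      • Range aggregation with nested drill-down aggregations
      • Multi-index and multi-type search examples
      • Template for a multi-condition query
      • Matching several fields with multi_match
      • Delete by query (newer versions only)
      • Template combining highlight, aggregation, filter and bool
      • terms aggregation
      • Multiple aggregations
      • Grouping by a field, then computing statistics per group
      • Highlighting (the highlighted field must appear in the query)
      • query_string query
      • Combining multiple filter, must and should clauses
      • Nesting should under must: where a = 1 and b in (1,2,3)
    • Index operations
      • Removing and changing an index alias
      • Adding a field to an existing index
      • Reindexing (newer versions only)
    • Analysis
      • Values of the index option for a string field
      • Inspecting how a field is analyzed
      • Defining a custom analyzer
    • Debugging and tuning
      • Viewing the execution plan of a query
      • Validating a search request
      • Common problems
      • Cannot sort on a text field: set fielddata to true
    • Other tools
      • Exporting an index with elasticdump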

Query operations

Main query keywords

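Top-level keys of a search request
query
    Query types that go under query
    prefix  wildcard
    constant_score
        filter
    prefix  prefix match: the field contains a term that starts with the search value
        {field:value}
    term
        exact match: the search value is not analyzed, so the field must contain exactly that term
        {field:value}
    match
        Unlike term, match first runs the search text through the analyzer of the target field; term skips analysis entirely
        match is a loose match: with the default or operator, a document only needs to contain one of the analyzed terms
        {field:value}
    match_all   takes no parameters and returns every document
        {}
    match_phrase    matches the given words as a phrase: all terms, adjacent and in order
        {field:value}
    multi_match     runs the same match against several fields
        {
          "query": {
            "multi_match": {
              "query": "caoke2",
              "fields": [
                "msg",
                "code"
              ]
            }
          }
        }
    range
        {"age":{"lt":10,"gt":1}}
    -- only one of the query types above can appear directly under query; to combine several, use bool
    bool    boolean query that combines several conditions; each clause can nest any of the query types above. A range query cannot sit directly under bool: wrap it in a clause such as filter
        must should must_not filter
        wrap multiple conditions in an array
highlight   sets the tags placed before and after each highlighted term; with require_field_match set to true, the fields listed under fields must be fields used in the query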
    "highlight":{
        "pre_tags":[
          ""
        ],
        "post_tags":[
          ""
        ],
        "require_field_match":true,
        "fields":{
          "msg":{
            "fragment_size":1000,
            "number_of_fragments":0,
            "fragment_offset":0
          }
        }
      }
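from    offset of the first hit to return (default 0)
size    number of hits to return (default 10); it is a count, not an end position
sort    sort order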
      "sort": [
        {
          "age": "desc"
        }
      ]
_source     fields to include in or exclude from the returned _source
    {"includes":["code"],"excludes":[]}
script_fields
aggs    aggregations (aggs is shorthand for aggregations)
    "aggregations":{
        "term_agg":{
          "terms":{
            "field":"age",  field to group by
            "size":10,      number of buckets to return
            "order":[{   sort order of the buckets; several criteria are allowed
                "min_agg":"desc"
            }]
          },
          "aggregations":{  statistics computed per bucket; several are allowed
            "min_agg":{
              "min":{
                "field":"age"
              }
            }
          }
        }
    }
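
The top-level keys above combine into a single request body. The following is a minimal sketch in Python, not tied to any client library: build_search_body and its page and page_size parameters are hypothetical names introduced here, and the field names are reused from the examples in this article. It mainly shows that from is an offset while size is a count.

```python
import json


def build_search_body(query, page=1, page_size=10, sort=None, source_includes=None):
    """Assemble a search request body from the top-level keys listed above."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    body = {
        "query": query,
        "from": (page - 1) * page_size,  # offset of the first hit
        "size": page_size,               # number of hits, not an end position
    }
    if sort:
        body["sort"] = sort
    if source_includes:
        body["_source"] = {"includes": source_includes, "excludes": []}
    return body


# Third page of ten hits, sorted by age, returning only the code field.
body = build_search_body(
    {"match": {"msg": "caoke2"}},
    page=3,
    page_size=10,
    sort=[{"age": "desc"}],
    source_includes=["code"],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request.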

Template for grouped aggregations

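"aggregations" : {
    "<aggregation_name>" : {                                --aggregation name
        "<aggregation_type>" : {                            --aggregation type: terms, max, min, sum, avg, value_count
            <aggregation_body>                              --aggregation body: field, size, order[]
        }
        [,"meta" : {  [<meta_data_body>] } ]?               --optional metadata, returned unchanged with the aggregation result
        [,"aggregations" : { [<sub_aggregation>]+ } ]?      --sub-aggregations computed on each bucket of the aggregation above
    }
    [,"<aggregation_name_2>" : { ... } ]*                   --further aggregations at the same level
}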
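
As one possible reading of the template, the sketch below builds a concrete aggregation tree as plain Python dicts. The helper names metric_agg and terms_agg are hypothetical. The aggregation names term_agg and min_agg and the field age are taken from the other examples in this article.

```python
METRIC_TYPES = {"max", "min", "sum", "avg", "value_count"}


def metric_agg(kind, field):
    """Single-value metric aggregation such as min or sum on one field."""
    if kind not in METRIC_TYPES:
        raise ValueError(f"unsupported metric aggregation: {kind}")
    return {kind: {"field": field}}


def terms_agg(field, size=10, order=None, sub_aggs=None):
    """terms aggregation, optionally ordered and with per-bucket sub-aggregations."""
    agg = {"terms": {"field": field, "size": size}}
    if order:
        agg["terms"]["order"] = order
    if sub_aggs:
        agg["aggregations"] = sub_aggs
    return agg


# Group by age, compute the minimum age per bucket, order buckets by that minimum.
aggs = {
    "term_agg": terms_agg(
        "age",
        order=[{"min_agg": "desc"}],
        sub_aggs={"min_agg": metric_agg("min", "age")},
    )
}
print(aggs)
```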

Range aggregation with nested drill-down aggregations

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 20
          },
          {
            "from": 20,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": {
            "field": "tags"
          },
          "aggs": {
            "average_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}
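
To make the nesting concrete, here is a small pure-Python sketch that mimics what this request computes: bucket by price range, then by tag, then average the price. It is an illustration of the semantics, not how Elasticsearch implements it, and the sample products are invented. Each range includes its from value and excludes its to value, so a price of exactly 50 falls outside all three buckets.

```python
from collections import defaultdict

# Same boundaries as the ranges in the request above.
RANGES = [(0, 20), (20, 40), (40, 50)]


def bucket_of(price):
    """Return the (from, to) range a price falls into, or None if it fits none."""
    for low, high in RANGES:
        if low <= price < high:  # from is inclusive, to is exclusive
            return (low, high)
    return None


def drill_down(products):
    """Group by price range, then by tag, and average the price per tag."""
    prices = defaultdict(lambda: defaultdict(list))
    for product in products:
        bucket = bucket_of(product["price"])
        if bucket is None:
            continue
        for tag in product["tags"]:
            prices[bucket][tag].append(product["price"])
    return {
        bucket: {tag: sum(values) / len(values) for tag, values in by_tag.items()}
        for bucket, by_tag in prices.items()
    }


# Hypothetical sample documents.
products = [
    {"price": 10, "tags": ["x", "y"]},
    {"price": 20, "tags": ["x"]},
    {"price": 30, "tags": ["x"]},
    {"price": 50, "tags": ["y"]},
]
result = drill_down(products)
print(result)
```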

Multi-index and multi-type search examples

/_search: search all documents in all indices and all types
/index1/_search: search all types in one index
/index1,index2/_search: search two indices at once
/*1,*2/_search: match several indices with wildcards
/index1/type1/_search: search one type in one index
/index1/type1,type2/_search: search several types in one index
/index1,index2/type1,type2/_search: search several types across several indices
/_all/type1,type2/_search: _all stands for every index, so this searches the given types in all indices
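
The path patterns above follow one rule: comma-joined index names, then optional comma-joined type names, then _search. A minimal sketch, with search_path as a hypothetical helper name. Note that the type segment only applies to older versions, since mapping types were deprecated in 7.0 and removed in 8.0.

```python
def search_path(indices=None, types=None):
    """Build a _search path from optional lists of index and type names."""
    index_part = ",".join(indices) if indices else ""
    type_part = ",".join(types) if types else ""
    if type_part and not index_part:
        index_part = "_all"  # types need an index segment; _all means every index
    parts = [p for p in (index_part, type_part) if p]
    return "/" + "/".join(parts + ["_search"])


print(search_path())                                        # /_search
print(search_path(["index1", "index2"]))                    # /index1,index2/_search
print(search_path(["index1"], ["type1", "type2"]))          # /index1/type1,type2/_search
print(search_path(types=["type1", "type2"]))                # /_all/type1,type2/_search
```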

Template for a multi-condition query

{
    "query": {
            "bool": {
                "must": [{ "match":   { "name": "tom" }}],
                "should": [
                    { "match":       { "hired": true }},
                    { "bool": {
                        "must":      { "match": { "personality": "good" }},
                        "must_not":  { "match": { "rude": true }}
                    }}
                ],
                "must_not": [ { "match": { "author_id": 111 } } ],
                "filter": {  "range": { "age": { "gte": 30 } } },
                "minimum_should_match": 1
            }
    }
}
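
The matching logic of this template can be sketched in plain Python. This is a simplified model under stated assumptions: clauses are ordinary predicates, match is replaced by exact equality, and scoring is ignored. bool_matches and template_matches are hypothetical names. It shows that must and filter clauses all have to hold, no must_not clause may hold, and at least minimum_should_match should clauses have to hold.

```python
def bool_matches(doc, must=(), should=(), must_not=(), filters=(), minimum_should_match=0):
    """Decide whether doc satisfies a bool query whose clauses are predicates."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filters):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


def template_matches(doc):
    """Predicate version of the template above (equality stands in for match)."""
    def nested(d):
        return bool_matches(
            d,
            must=[lambda x: x.get("personality") == "good"],
            must_not=[lambda x: x.get("rude") is True],
        )

    return bool_matches(
        doc,
        must=[lambda d: d.get("name") == "tom"],
        should=[lambda d: d.get("hired") is True, nested],
        must_not=[lambda d: d.get("author_id") == 111],
        filters=[lambda d: d.get("age", 0) >= 30],
        minimum_should_match=1,
    )


print(template_matches({"name": "tom", "age": 35, "hired": True}))
print(template_matches({"name": "tom", "age": 25, "hired": True}))
```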

Matching several fields with multi_match

{
  "query": {
    "multi_match": {
      "query": "test",
      "fields": ["test_field", "test_field1"]
    }
  }
}

Delete by query (only supported in newer versions)

_delete_by_query
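
The endpoint takes the same query section as a search and deletes every document that matches. A minimal sketch of how the request might be assembled. delete_by_query_request is a hypothetical helper, and the index name crud and the code field are borrowed from the sample results elsewhere in this article.

```python
import json


def delete_by_query_request(index, query):
    """Describe a delete-by-query call as method, path and JSON body."""
    if not query:
        raise ValueError("refusing to build a delete request without a query")
    return {
        "method": "POST",
        "path": f"/{index}/_delete_by_query",
        "body": {"query": query},
    }


# Delete the documents whose code is exactly 10121.
request = delete_by_query_request("crud", {"term": {"code": "10121"}})
print(request["method"], request["path"])
print(json.dumps(request["body"], indent=2))
```

Running the same query against _search first is a cheap way to check what would be deleted.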

Template combining highlight, aggregation, filter and bool

{
	"query":{
		"bool":{
			"must":[
				{
					"match":{
						"title":"smith"
					}
				}
			],
			"must_not":[
				{
					"match_phrase":{
						"title":"granny smith"
					}
				}
			],
			"filter":[
				{
					"exists":{
						"field":"title"
					}
				}
			]
		}
	},
	"aggs":{
		"my_agg":{
			"terms":{
				"field":"user",
				"size":10
			}
		}
	},
	"highlight":{
		"pre_tags":[
			""
		],
		"post_tags":[
			""
		],
		"fields":{
			"body":{
				"number_of_fragments":1,
				"fragment_size":20
			},
			"title":{}
		}
	},
	"size":20,
	"from":100,
	"_source":[
		"title",
		"id"
	],
	"sort":[
		{
			"_id":{
				"order":"desc"
			}
		}
	]
}

terms aggregation

------------terms aggregation
{
  "from":0,
  "size":10,
  "query":{
    "prefix":{
      "msg":{
        "value":"caoke",
        "boost":-1.0
      }
    }
  },
  "aggregations":{
    "term_agg":{
      "terms":{
        "field":"age",
        "size":10,
        "min_doc_count":1,
        "shard_min_doc_count":0,
        "show_term_doc_count_error":false,
        "order":[
          {
            "_count":"desc"
          },
          {
            "_key":"asc"
          }
        ]
      },
      "aggregations":{
        "min_agg":{
          "min":{
            "field":"age"
          }
        },
        "sum_agg":{
          "sum":{
            "field":"age"
          }
        }
      }
    }
  }
}
------------------result
{
  "took":51,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "skipped":0,
    "failed":0
  },
  "_clusters":{
    "total":0,
    "successful":0,
    "skipped":0
  },
  "hits":{
    "total":5,
    "max_score":-1.0,
    "hits":[]
  },
  "aggregations":{
    "term_agg":{
      "doc_count_error_upper_bound":0,
      "sum_other_doc_count":0,
      "buckets":[
        {
          "key":1,
          "doc_count":1,
          "min_agg":{
            "value":1.0
          },
          "sum_agg":{
            "value":1.0
          }
        },
        {
          "key":5,
          "doc_count":1,
          "min_agg":{
            "value":5.0
          },
          "sum_agg":{
            "value":5.0
          }
        }
      ]
    }
  }
}
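
A response like the one above is usually flattened into rows before use. The sketch below copies only the aggregations part of that response into a Python dict and reads the buckets. bucket_rows is a hypothetical helper name.

```python
# The aggregations section of the response shown above.
response = {
    "aggregations": {
        "term_agg": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": 1, "doc_count": 1, "min_agg": {"value": 1.0}, "sum_agg": {"value": 1.0}},
                {"key": 5, "doc_count": 1, "min_agg": {"value": 5.0}, "sum_agg": {"value": 5.0}},
            ],
        }
    }
}


def bucket_rows(response, agg_name, metric_names):
    """Turn the buckets of one terms aggregation into flat dicts."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        row = {"key": bucket["key"], "doc_count": bucket["doc_count"]}
        for name in metric_names:
            row[name] = bucket[name]["value"]  # sub-aggregation results sit under their names
        rows.append(row)
    return rows


rows = bucket_rows(response, "term_agg", ["min_agg", "sum_agg"])
for row in rows:
    print(row)
```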

Multiple aggregations

------------------multiple aggregations-------------------------------
{
  "from":0,
  "size":10,
  "query":{
    "prefix":{
      "msg":{
        "value":"caoke",
        "boost":-1.0
      }
    }
  },
  "aggregations":{
    "min_agg":{
      "min":{
        "field":"age"
      }
    },
    "sum_agg":{
      "sum":{
        "field":"age"
      }
    }
  }
}
-------------result
{
  "took":9,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "skipped":0,
    "failed":0
  },
  "_clusters":{
    "total":0,
    "successful":0,
    "skipped":0
  },
  "hits":{
    "total":5,
    "max_score":-1.0,
    "hits":[
      {},
      {}
    ]
  },
  "aggregations":{
    "min_agg":{
      "value":1.0
    },
    "sum_agg":{
      "value":15.0
    }
  }
}

Grouping by a field, then computing statistics per group

----------------------------group by a field, then compute statistics per group-----------
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "a-a": {
      "terms": {
        "field": "age",
        "order": [{
            "min_agg": "desc"
        }]
      },
      "aggregations": {
        "min_agg": {
          "min": {
            "field": "age"
          }
        }
      }
    }
  }
}
------------result: documents are grouped by age, then the minimum age is taken in each group
{
  "took": 11,
  "timed_out": false,
  "_shards": { "total": 5,"successful": 5,"skipped": 0,"failed": 0},
  "hits": { "total": 5,"max_score": 1,"hits": [ 
  { "_index": "crud","_type": "_doc","_id": "BOE5C2oB98oJuFh74UVc","_score": 1,"_source": { "msg": "caoke3","code": "10123","age": 3}},
  { "_index": "crud","_type": "_doc","_id": "AuE5C2oB98oJuFh74UVb","_score": 1,"_source": { "msg": "caoke1","code": "10121","age": 1}},
  { "_index": "crud","_type": "_doc","_id": "A-E5C2oB98oJuFh74UVc","_score": 1,"_source": { "msg": "caoke2","code": "10122","age": 2}},
  { "_index": "crud","_type": "_doc","_id": "BeE5C2oB98oJuFh74UVc","_score": 1,"_source": { "msg": "caoke4","code": "10124","age": 4}},
  { "_index": "crud","_type": "_doc","_id": "BuE5C2oB98oJuFh74UVc","_score": 1,"_source": { "msg": "caoke5","code": "10125","age": 5}}]
  },
  "aggregations": {
    "a-a": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 5,
          "doc_count": 1,
          "min_agg": {
            "value": 5
          }
        },
        {
          "key": 1,
          "doc_count": 1,
          "min_agg": {
            "value": 1
          }
        }
      ]
    }
  }
}

Highlighting (the highlighted field must appear in the query)

---------------------------------highlighting
{
  "from":0,
  "size":10,
  "query":{
    "prefix":{
      "msg":{
        "value":"caoke",
        "boost":-1.0
      }
    }
  },
  "_source":{
    "includes":[
      "code",
      "msg"
    ],
    "excludes":[
    ]
  },
  "highlight":{
    "pre_tags":[
      ""
    ],
    "post_tags":[
      ""
    ],
    "require_field_match":true,
    "fields":{
      "msg":{
        "fragment_size":1000,
        "number_of_fragments":0,
        "fragment_offset":0
      }
    }
  }
}

query_string query

POST db_student_live_detail_20190715/_search
{
  "query": {
    "query_string": {
      "default_field": "product_name",
      "query": "考研"
    }
  }
}

Combining multiple filter, must and should clauses

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "status": 5
          }
        }
      ],
      "must": [
        {
          "query_string": {
            "default_field": "productName",
            "query": "wp-雅思-1天有效"
          }
        }
      ],
      "should": [
        {
          "bool": {
            "must": {
              "term": {
                "isMemeberProduct": true
              }
            }
          }
        },
        {
          "bool": {
            "must": [
              {
                "term": {
                  "toSearch": true
                }
              },
              {
                "term": {
                  "displayProductInfo": true
                }
              }
            ]
          }
        }
      ]
    }
  }
}

Nesting should under must: where a = 1 and b in (1,2,3)

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "user_id": 54646190
          }
        },
        {
          "match": {
            "product_id": 89419
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "live_status": 2
                }
              },
              {
                "term": {
                  "live_status": 1
                }
              }
            ]
          }
        }
      ]
    }
  }
}
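
The nested bool with only should clauses acts as the in (...) part: with no must or filter beside them, at least one should clause has to match. A terms query expresses the same condition more compactly. The sketch below builds both forms as Python dicts. The helper names are hypothetical and the field names come from the example above.

```python
def in_as_should(field, values):
    """b in (...) written as a bool with one term clause per value."""
    return {"bool": {"should": [{"term": {field: value}} for value in values]}}


def in_as_terms(field, values):
    """b in (...) written as a single terms query."""
    return {"terms": {field: list(values)}}


def where_eq_and_in(equals, in_field, in_values, use_terms=True):
    """where k1 = v1 and k2 = v2 ... and in_field in (in_values)."""
    must = [{"match": {field: value}} for field, value in equals.items()]
    must.append(in_as_terms(in_field, in_values) if use_terms else in_as_should(in_field, in_values))
    return {"query": {"bool": {"must": must}}}


compact = where_eq_and_in({"user_id": 54646190, "product_id": 89419}, "live_status", [2, 1])
verbose = where_eq_and_in({"user_id": 54646190, "product_id": 89419}, "live_status", [2, 1], use_terms=False)
print(compact)
```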

Index operations

Removing and changing an index alias

POST /_aliases
{
    "actions": [
        {"remove": {"index": "goods_v1", "alias": "goods"}},
        { "add": {"index": "goods_v2",  "alias": "goods"}}
    ]
}
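
Because the remove and add actions travel in one request, the alias moves from the old index to the new one in a single atomic step. A minimal sketch that generates the same body for any pair of indices. swap_alias_actions is a hypothetical helper name.

```python
import json


def swap_alias_actions(alias, old_index, new_index):
    """Body for POST /_aliases that moves alias from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = swap_alias_actions("goods", "goods_v1", "goods_v2")
print(json.dumps(body, indent=4))
```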

Adding a field to an existing index

PUT index_name/_mappings/type_name
{
  "properties": {
    "test": {
      "type": "integer"
    }
  }
}

Reindexing (only supported in newer versions)

POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "twitter_new"
  }
}
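Elasticsearch cannot change the type of an existing field in a mapping (adding new fields is allowed), so changing a field type means rebuilding the index.
The _reindex API supports this; each document keeps its _id when it is copied.
Create a new index with the desired mapping, then copy the data with _reindex. In testing on 6.2.2 the copy was slow.
For large volumes, schedule the copy in an off-peak window so it does not affect normal service, and delete the original index once the copy has finished.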
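
The procedure described above can be written down as an ordered list of requests. This is only a sketch: reindex_plan is a hypothetical helper, the body used to create the new index is supplied by the caller because its shape depends on the Elasticsearch version, and the settings value in the example is a placeholder, not a recommendation.

```python
def reindex_plan(old_index, new_index, new_index_body):
    """Ordered (method, path, body) steps for rebuilding an index."""
    return [
        # 1. Create the new index with the desired settings and mappings.
        ("PUT", f"/{new_index}", new_index_body),
        # 2. Copy every document; _id values are preserved.
        ("POST", "/_reindex", {"source": {"index": old_index}, "dest": {"index": new_index}}),
        # 3. Drop the original index once the copy has been verified.
        ("DELETE", f"/{old_index}", None),
    ]


steps = reindex_plan("twitter", "twitter_new", {"settings": {"number_of_replicas": 1}})
for method, path, body in steps:
    print(method, path, body)
```

For a long copy, the _reindex call also accepts wait_for_completion=false as a URL parameter, which returns a task that can be polled instead of holding the connection open.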

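Analysis

Values of the index option for a string field

analyzed        the value is analyzed (tokenized) and then indexed
not_analyzed    the value is indexed as a single exact term, without analysis
no              the field is not indexed and cannot be searched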

Inspecting how a field is analyzed

GET /my_store/_analyze
{
  "field": "productID",
  "text": "XHDK-A-1293-#fJ3"
}
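
To get a feel for what such an analysis returns, the sketch below imitates the standard analyzer with a regular expression. It assumes productID is an analyzed field that uses the standard analyzer, and the regex only approximates it for ASCII input like this one. The real tokenizer follows Unicode word-boundary rules.

```python
import re


def approx_standard_analyzer(text):
    """Rough ASCII-only stand-in: split on non-alphanumerics, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


tokens = approx_standard_analyzer("XHDK-A-1293-#fJ3")
print(tokens)
```

Since the value is split into several lowercase terms, a term query for the full original string would not find it in an analyzed field. This is why identifiers like this are usually indexed without analysis.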

Defining a custom analyzer

PUT /my_index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&_to_and": {
          "type": "mapping",
          "mappings": ["&=> and"]
        }
      },
      "filter": {
        "my_stopwords": {
          "type": "stop",
          "stopwords": ["the", "a"]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip", "&_to_and"],
          "tokenizer": "standard",
          "filter": ["lowercase", "my_stopwords"]
        }
      }
    }
  }
}
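
The analyzer defined above is a pipeline: character filters first, then the tokenizer, then token filters in the listed order. The following pure-Python sketch imitates each stage to show that order. It is an approximation under stated assumptions: html_strip is reduced to tag removal plus entity decoding, the standard tokenizer to a \w+ regex, and the replacement of & is padded with spaces.

```python
import re
from html import unescape

STOPWORDS = {"the", "a"}  # same list as my_stopwords


def my_analyzer(text):
    """Approximate the my_analyzer pipeline stage by stage."""
    # char_filter 1: html_strip (approximated: drop tags, decode entities)
    text = unescape(re.sub(r"<[^>]+>", " ", text))
    # char_filter 2: &_to_and
    text = text.replace("&", " and ")
    # tokenizer: standard (approximated by runs of word characters)
    tokens = re.findall(r"\w+", text)
    # filter 1: lowercase
    tokens = [token.lower() for token in tokens]
    # filter 2: my_stopwords
    return [token for token in tokens if token not in STOPWORDS]


tokens = my_analyzer("The quick & brown <b>fox</b>")
print(tokens)
```

The order of the token filters matters: lowercase runs before my_stopwords, so a capitalized "The" is also removed.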

Debugging and tuning

Viewing the execution plan of a query

GET /test_index/test_type/6/_explain
{
  "query": {
    "match": {
      "test_field": "test hello"
    }
  }
}

Validating a search request

GET /test_index/test_type/_validate/query?explain
{
  "query": {
    "math": {
      "test_field": "test"
    }
  }
}
------------result: the misspelled query type math is rejected
{
  "valid": false,
  "error": "org.elasticsearch.common.ParsingException: no [query] registered for [math]"
}

Common problems

Cannot sort on a text field: set fielddata to true on the field

Fielddata is disabled on text fields by default.
Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index.
Note that this can however use significant memory.
PUT /ecommerce/_mapping/product
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}

Other tools

Exporting an index with elasticdump

elasticdump \
  --input=https://elasticsearch2.release.koolearn.com/koo_lexicon \
  --output=http://10.155.20.50:9200/koo_lexicon \
  --type=analyzer
elasticdump \
  --input=https://elasticsearch2.release.koolearn.com/koo_lexicon \
  --output=http://10.155.20.50:9200/koo_lexicon \
  --type=mapping
elasticdump \
  --input=https://elasticsearch2.release.koolearn.com/koo_lexicon \
  --output=http://10.155.20.50:9200/koo_lexicon \
  --type=data
