Prepare the data
PUT _bulk
{"index": {"_index": "test_index", "_id": "1"}}
{"test_field": "hello world"}
{"index": {"_index": "test_index", "_id": "2"}}
{"test_field": "hello win"}
{"index": {"_index": "test_index", "_id": "3"}}
{"test_field": "hello dog"}
{"index": {"_index": "test_index", "_id": "4"}}
{"test_field": "hello cat"}
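Search suggestions: match_phrase_prefix
match_phrase_prefix works much like match_phrase. The only difference is that the last term is treated as a prefix, and that prefix is expanded at search time.
Take the search hello w as an example.
hello is matched as an ordinary term, while w is treated as a prefix: Elasticsearch scans the inverted index for terms that begin with w (up to max_expansions terms, 50 by default) and keeps the documents that contain both hello and a term beginning with w.
Finally, within those documents it checks positions: given the configured slop (0 by default), hello and the term beginning with w must sit at positions that match the order of the query terms.
The search request: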
GET /test_index/_search
{
  "query": {
    "match_phrase_prefix": {
      "test_field": "hello w"
    }
  }
}
Output:
{
  "took" : 40,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.5133061,
    "hits" : [
      {
        "_index" : "test_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 2.5133061,
        "_source" : {
          "test_field" : "hello world"
        }
      },
      {
        "_index" : "test_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 2.5133061,
        "_source" : {
          "test_field" : "hello win"
        }
      }
    ]
  }
}
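The matching steps described for match_phrase_prefix can be sketched in plain Python. This is a toy model, not Elasticsearch's implementation: it assumes whitespace tokenization, slop 0, and no scoring. The names DOCS, build_positional_index and phrase_prefix_search are made up for illustration; the max_expansions argument only mirrors the query parameter of the same name.

```python
from collections import defaultdict

# Same four documents as in the bulk request (toy in-memory copy).
DOCS = {
    "1": "hello world",
    "2": "hello win",
    "3": "hello dog",
    "4": "hello cat",
}


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} (whitespace split, lowercased)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_prefix_search(index, query, max_expansions=50):
    """Return sorted ids of docs where the leading terms and some term starting
    with the final prefix occur at consecutive positions (slop 0)."""
    *terms, prefix = query.lower().split()
    # Search-time step: expand the prefix against the term dictionary.
    expansions = sorted(t for t in index if t.startswith(prefix))[:max_expansions]
    hits = set()
    for last in expansions:
        phrase = terms + [last]
        # Candidate docs must contain every term of the expanded phrase.
        candidates = set(index.get(phrase[0], {}))
        for term in phrase[1:]:
            candidates &= set(index.get(term, {}))
        for doc_id in candidates:
            # Phrase check: term i must sit at position start + i.
            for start in index[phrase[0]][doc_id]:
                if all(start + i in index[t][doc_id] for i, t in enumerate(phrase)):
                    hits.add(doc_id)
                    break
    return sorted(hits)


INDEX = build_positional_index(DOCS)
print(phrase_prefix_search(INDEX, "hello w"))  # ['1', '2']
```

The sketch returns documents 1 and 2, the same two documents as the real query above. The cost it makes visible is the prefix expansion, which happens on every search.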
Search suggestions: ngram
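The big difference between ngram-based suggestions and prefix matching is that ngram works at index time: each word is split into grams when the document is indexed. With the edge_ngram filter used below, world is split into w, wo, wor, worl, world. The position-based phrase matching is the same as in match_phrase_prefix; what changes is how the last term is found.
Take the search hello w as an example.
hello is matched as an ordinary term. w is also looked up as an ordinary term, because w was already written to the inverted index as a gram of world and win when the documents were indexed, so no prefix scan is needed at search time. Elasticsearch then keeps the documents that contain both hello and w.
Finally, within those documents it checks positions: given the configured slop, hello and w must sit at positions that match the order of the query terms.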
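Create the index (if test_index still exists from the previous section, delete it first with DELETE test_index, since creating an index that already exists fails):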
PUT test_index
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "test_field": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}
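To get a feel for what the autocomplete analyzer defined above writes into the index, here is a small Python approximation. It is a sketch under assumptions: whitespace splitting stands in for the standard tokenizer, and edge_ngrams and autocomplete_analyze are hypothetical helper names, not Elasticsearch APIs. Each gram keeps the position of the word it came from, which is what later lets a phrase query line up positions.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of token with length between min_gram and max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def autocomplete_analyze(text, min_gram=1, max_gram=20):
    """Rough stand-in for the analyzer: split on whitespace, lowercase, then
    emit edge n-grams that share the position of their source word."""
    tokens = []
    for position, word in enumerate(text.lower().split()):
        for gram in edge_ngrams(word, min_gram, max_gram):
            tokens.append((gram, position))
    return tokens


print(edge_ngrams("world"))
# ['w', 'wo', 'wor', 'worl', 'world']
print(autocomplete_analyze("Hello Win"))
# [('h', 0), ('he', 0), ('hel', 0), ('hell', 0), ('hello', 0), ('w', 1), ('wi', 1), ('win', 1)]
```

The mapping sets search_analyzer to standard so that the query text itself is not split into grams; only the indexed documents are.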
Insert the data
PUT _bulk
{"index": {"_index": "test_index", "_id": "1"}}
{"test_field": "hello world"}
{"index": {"_index": "test_index", "_id": "2"}}
{"test_field": "hello win"}
{"index": {"_index": "test_index", "_id": "3"}}
{"test_field": "hello dog"}
{"index": {"_index": "test_index", "_id": "4"}}
{"test_field": "hello cat"}
Query
GET /test_index/_search
{
  "query": {
    "match_phrase": {
      "test_field": "hello w"
    }
  }
}
Output:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.1620307,
    "hits" : [
      {
        "_index" : "test_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.1620307,
        "_source" : {
          "test_field" : "hello world"
        }
      },
      {
        "_index" : "test_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.1620307,
        "_source" : {
          "test_field" : "hello win"
        }
      }
    ]
  }
}
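Why a plain match_phrase is enough once edge n-grams are indexed can be illustrated with another toy Python model. As before, this is only a sketch under assumptions (whitespace tokenization, slop 0, no scoring), and index_with_edge_ngrams and match_phrase are hypothetical names rather than Elasticsearch functions. The point it tries to show is that the query terms are looked up as exact terms, with no prefix expansion at search time.

```python
from collections import defaultdict

# Same four documents as in the bulk request (toy in-memory copy).
DOCS = {
    "1": "hello world",
    "2": "hello win",
    "3": "hello dog",
    "4": "hello cat",
}


def index_with_edge_ngrams(docs, min_gram=1, max_gram=20):
    """Index every prefix of every word at the position of that word."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            for n in range(min_gram, min(len(word), max_gram) + 1):
                index[word[:n]][doc_id].add(pos)
    return index


def match_phrase(index, query):
    """Exact term lookups only: the query text is not split into grams,
    mirroring search_analyzer = standard."""
    terms = query.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Docs that contain every query term somewhere.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in sorted(candidates):
        # Phrase check: term i must sit at position start + i.
        if any(
            all(start + i in index[t][doc_id] for i, t in enumerate(terms))
            for start in index[terms[0]][doc_id]
        ):
            hits.append(doc_id)
    return hits


NGRAM_INDEX = index_with_edge_ngrams(DOCS)
print(match_phrase(NGRAM_INDEX, "hello w"))   # ['1', '2']
print(match_phrase(NGRAM_INDEX, "hello wo"))  # ['1']
```

The trade-off is the usual one: the index grows because every prefix is stored, but each search becomes a handful of direct term lookups.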