三种场景
-
最佳字段 (Best Fields)
- 当字段之间相互竞争,⼜相互关联。例如 title 和 body 这样的字段。评分来⾃最匹配字段
-
多数字段 (Most Fields)
- 处理英⽂内容时:⼀种常⻅的⼿段是,在主字段( English Analyzer),抽取词⼲,加⼊同义词,以 匹配更多的⽂档。相同的⽂本,加⼊⼦字段(Standard Analyzer),以提供更加精确的匹配。其他字 段作为匹配⽂档提⾼相关度的信号。匹配字段越多则越好
-
混合字段 (Cross Field)
- 对于某些实体,例如⼈名,地址,图书信息。需要在多个字段中确定信息,单个字段只能作为整体 的⼀部分。希望在任何这些列出的字段中找到尽可能多的词
Multi Match Query
Best Fields 是默认类型,可以不⽤指定
Minimum should match 等参数可以传递到⽣成的 query 中
POST blogs/_search
{
"query": {
"multi_match": {
"type": "best_fields",
"query": "Quick pets",
"fields": ["title","body"],
"tie_breaker": 0.2,
"minimum_should_match": "20%"
}
}
}
⼀个查询案例
- 英⽂分词器,导致精确度降低,时态信息丢失
PUT /titles
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english"
}
}
}
}
POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }
GET titles/_search
{
"query": {
"match": {
"title": "barking dogs"
}
}
}
使⽤多数字段匹配解决
⽤⼴度匹配字段 title 包括尽可能多的⽂档——以提 升召回率——同时⼜使⽤字段 title.std 作为信号 将 相关度更⾼的⽂档置于结果顶部。
每个字段对于最终评分的贡献可以通过⾃定义值 boost 来控制。⽐如,使 title 字段更为重要, 这样同时也降低了其他信号字段的作⽤
DELETE /titles
PUT /titles
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english",
"fields": {"std": {"type": "text","analyzer": "standard"}}
}
}
}
}
POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }
GET /titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title", "title.std" ]
}
}
}
GET /titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title^10", "title.std" ]
}
}
}
跨字段搜索
⽆法使⽤ Operator
可以⽤ copy_to 解决,但是需要额外的存储空间
PUT address/_doc/1
{
"street": "5 Poland Street",
"city": "London",
"country": "United Kingdom",
"postcode": "W1V 3Dg"
}
POST address/_search
{
"query": {
"multi_match": {
"query": "Poland Street W1V",
"type": "most_fields",
"fields": ["street", "city", "country", "postcode"]
}
}
}
跨字段搜索 [cross_fields解决]
POST address/_search
{
"query": {
"multi_match": {
"query": "Poland Street W1V",
"type": "cross_fields",
"operator": "and",
"fields": ["street", "city", "country", "postcode"]
}
}
}
⽀持使⽤ Operator
与 copy_to, 相⽐,其中⼀个优势就是它可以在搜索时为单个字段提升权重。
本节知识点回顾
Multi Match 查询的基本语法
查询的类型
最佳字段 / 多数字段 / 跨字段
Boosting
控制 Precision
以及使⽤⼦字段多数字段算分,控制
使⽤ Operator
课程demo
POST blogs/_search
{
"query": {
"dis_max": {
"queries": [
{ "match": { "title": "Quick pets" }},
{ "match": { "body": "Quick pets" }}
],
"tie_breaker": 0.2
}
}
}
POST blogs/_search
{
"query": {
"multi_match": {
"type": "best_fields",
"query": "Quick pets",
"fields": ["title","body"],
"tie_breaker": 0.2,
"minimum_should_match": "20%"
}
}
}
POST books/_search
{
"multi_match": {
"query": "Quick brown fox",
"fields": "*_title"
}
}
POST books/_search
{
"multi_match": {
"query": "Quick brown fox",
"fields": [ "*_title", "chapter_title^2" ]
}
}
DELETE /titles
PUT /titles
{
"settings": {
"number_of_replicas": 1
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english",
"fields": {
"std": {
"type": "text",
"analyzer": "standard"
}
}
}
}
}
}
PUT /titles
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english"
}
}
}
}
POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }
GET titles/_search
{
"query": {
"match": {
"title": "barking dogs"
}
}
}
DELETE /titles
PUT /titles
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english",
"fields": {"std": {"type": "text","analyzer": "standard"}}
}
}
}
}
POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }
GET /titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title", "title.std" ]
}
}
}
GET /titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title^10", "title.std" ]
}
}
}
PUT address/_doc/1
{
"street": "5 Poland Street",
"city": "London",
"country": "United Kingdom",
"postcode": "W1V 3Dg"
}
POST address/_search
{
"query": {
"multi_match": {
"query": "Poland Street W1V",
"type": "most_fields",
"fields": ["street", "city", "country", "postcode"]
}
}
}
POST address/_search
{
"query": {
"multi_match": {
"query": "Poland Street W1V",
"type": "cross_fields",
"operator": "and",
"fields": ["street", "city", "country", "postcode"]
}
}
}
相关阅读
- https://www.elastic.co/guide/en/elasticsearch/reference/7.1/query-dsl-dis-max-query.html