Elasticsearch学习笔记(四):ES的搜索方式

大纲:

  • query string search
  • query DSL
  • query filter
  • full-text search
  • phrase search
  • highlight search
  • 聚合分析

1. query string search

 搜索全部商品:get /ecommerce/product/_search

{
  "took": 20,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "chaojiban gaolujie yaogao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}

took:耗费了几毫秒

time_out:是否超时,这里没有

_shards:数据拆成了5个分片,所以对于搜索请求,会打到所有的primary shard(或是它的某个 replica shard也可以)

hits.total:结果数量,这里是三个document。

hits.max_score:score的含义,就是document对于一个search相关度的匹配分数,越相关,就越匹配,分数越高

hits.hits:包含匹配搜索的document数据 


query string search的由来,是因为search 参数都是以http请求的query string来附带的。

搜索商品中包含牙膏的数据,按照售价降序排序:get /ecommerce/produce/_search?q=name:yaogao&sort=price:desc

适用于临时在命令行使用的工具,比如curl,快速的发出请求,来检索想要的信息;但是如果查询的条件非常复杂,是很难去构建的,因此,在生产环境当中,是跟少去用的。

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": null,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        },
        "sort": [
          40
        ]
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": null,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        },
        "sort": [
          30
        ]
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": null,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        },
        "sort": [
          25
        ]
      }
    ]
  }
}

2. query DSL

DSL: domain specified language: 特定领域的语言

http request body,请求体,可以用json来构建复杂的查询语法,比较方便,比query string search强大很多。

查询所有的商品: 

get ecommerce/product/_search
{
  "query":{
    "match_all": {}
  }
}

查询名称包含yagao的商品,并价格降序

GET /ecommerce/product/_search
{
    "query": {
        "match": {
          "name": "yagao"
        }
    },
    "sort": [
      {
        "price": {
          "order": "desc"
        }
      }
    ]
}


{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": null,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        },
        "sort": [
          40
        ]
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": null,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        },
        "sort": [
          30
        ]
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": null,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        },
        "sort": [
          25
        ]
      }
    ]
  }
}

分页查询商品,假设每页一条,先查询第二页,则应该返回第二条数据

GET /ecommerce/product/_search
{
    "query": {
        "match_all": {}
    },
    "from": 1, 
    "size": 1
}


{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      }
    ]
  }
}

只查询指定字段

GET /ecommerce/product/_search
{
    "query": {
        "match_all": {}
    },
    "_source": ["name","price"]
}

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "price": 25,
          "name": "jiajieshi yagao"
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "price": 30,
          "name": "chaojiban gaolujie yagao"
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "price": 40,
          "name": "zhonghua yagao"
        }
      }
    ]
  }
}

3. query filter

搜索包含名称,且售价大于26的商品

GET /ecommerce/product/_search
{
    "query": {
      "bool": {
         "must": {
            "match": {
               "name": "yagao"
            }
         },
         "filter": {
            "range": {
              "price": {
                "gte": 26
              }
            }
         }
      }
    }
}


{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.5565415,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.5565415,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}

4. full_text search(全文检索)

 

GET /ecommerce/product/_search
{
    "query": {
       "match": {
         "producer": "yagao producer"
       }
    }
}

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.25811607,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 0.25811607,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.16358379,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      }
    ]
  }
}

全文检索,首先producer这个字段会首先被es建立倒排索引,如 gaolujie producer会被建立 gaolujie 和producer这样的索引,而查询条件yagao producer也会被拆解为yagao 和producer去搜索。查询出来的结果score分数高的排在前面,表示相关度也越高。


5. phrase search (短语搜索)

与全文检索相对应,相反,全文检索会将输入的搜索串拆解开来,去倒排索引里面一一匹配,只要能匹配任意一个拆解出来的单词,就会作为结果返回,而短语搜索,要求输入的搜索字符串,必须在指定的字段文本中,完全包含一模一样的,才可以算匹配,作为结果返回。

GET /ecommerce/product/_search
{
    "query": {
       "match_phrase": {
         "producer": "zhonghua producer"
       }
    }
}

{
  "took": 121,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.51623213,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.51623213,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}

6. highlight search(高亮搜索结果)

GET /ecommerce/product/_search
{
    "query" : {
        "match" : {
            "producer" : "producer"
        }
    },
    "highlight": {
        "fields" : {
            "producer" : {}
        }
    }
}

{
  "took": 27,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.25811607,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 0.25811607,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        },
        "highlight": {
          "producer": [
            "jiajieshi producer"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        },
        "highlight": {
          "producer": [
            "zhonghua producer"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.16358379,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        },
        "highlight": {
          "producer": [
            "gaolujie producer"
          ]
        }
      }
    ]
  }
}

7. 聚合分析

(一)计算每个tag(基于之前的数据)下的商品数量

GET /ecommerce/product/_search
{
   "aggs": {
     "group_by_aggs": {
       "terms": {
         "field": "tags"
       }
     }
   }
}

在直接这样查询的时候,es会报错,原因在于字段aggs的fielddata属性不是为true,只有当为true的情况下才可以取消倒排索引来进行聚合。所以需要设置一下:

PUT /ecommerce/_mapping/product?update_all_types
{
   "properties": {
      "tags": {
        "type": "text",
        "fielddata": true
        
      }
   }
}

设置完了之后查询的结果为

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "group_by_aggs": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2
        },
        {
          "key": "meibai",
          "doc_count": 1
        },
        {
          "key": "qingxin",
          "doc_count": 1
        }
      ]
    }
  }
}

(二)对名称中包含chaojiban的商品,计算每个tag下的商品数量

GET /ecommerce/product/_search
{
   "size": 20,
   "query": {
     "match": {
       "name": "chaojiban"
     }
   }, 
   "aggs": {
     "group_by_aggs": {
       "terms": {
         "field": "tags"
       }
     }
   }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5565415,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.5565415,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "group_by_aggs": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 1
        },
        {
          "key": "meibai",
          "doc_count": 1
        }
      ]
    }
  }
}

size是设置查询结果的最大数量

(三)先按照tags分组,再算每组的平均值,计算每个tag下的商品的平均价格

GET /ecommerce/product/_search
{
   "size": 20,
   "aggs": {
     "group_by_aggs": {
       "terms": {
         "field": "tags"
       },
       "aggs": {
         "avg_price": {
           "avg": {
             "field": "price"
           }
         }
       }
     }
   }
}


{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "chaojiban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "group_by_aggs": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2,
          "avg_price": {
            "value": 27.5
          }
        },
        {
          "key": "meibai",
          "doc_count": 1,
          "avg_price": {
            "value": 30
          }
        },
        {
          "key": "qingxin",
          "doc_count": 1,
          "avg_price": {
            "value": 40
          }
        }
      ]
    }
  }
}

(四)计算每个tag下的商品的平均价格,并且按照平均价格降序排序

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": {
        "field": "tags",
        "order": {
          "avg_price": "desc"
        }
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "all_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "qingxin",
          "doc_count": 1,
          "avg_price": {
            "value": 40
          }
        },
        {
          "key": "meibai",
          "doc_count": 1,
          "avg_price": {
            "value": 30
          }
        },
        {
          "key": "fangzhu",
          "doc_count": 2,
          "avg_price": {
            "value": 27.5
          }
        }
      ]
    }
  }
}

(五)按照指定的价格范围区间进行分组,然后在每组内再按照tag进行分组,最后再计算每组的平均价格

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 20
          },
          {
            "from": 20,
            "to": 35
          },
          {
            "from":35,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": {
            "field": "tags"
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}


{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_price": {
      "buckets": [
        {
          "key": "0.0-20.0",
          "from": 0,
          "to": 20,
          "doc_count": 0,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": []
          }
        },
        {
          "key": "20.0-35.0",
          "from": 20,
          "to": 35,
          "doc_count": 2,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "fangzhu",
                "doc_count": 2,
                "avg_price": {
                  "value": 27.5
                }
              },
              {
                "key": "meibai",
                "doc_count": 1,
                "avg_price": {
                  "value": 30
                }
              }
            ]
          }
        },
        {
          "key": "35.0-50.0",
          "from": 35,
          "to": 50,
          "doc_count": 1,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "qingxin",
                "doc_count": 1,
                "avg_price": {
                  "value": 40
                }
              }
            ]
          }
        }
      ]
    }
  }
}

在区间范围的查询过程中,也是基于包左不包右的原则。

你可能感兴趣的:(Elasticsearch)