ES通过Collapse实现类似SQL over开窗函数功能

数据如下:

a b
10 1
10 2
10 3
20 20
20 10
20 30

需求:

按照a列进行分组,然后按照b列进行排序,返回b列中最小的结果对应的数据。类似SQL :

select a,b from (select a,b,row_number() over (partition by a order by b asc) as rank from tbl ) t where rank =1

想要得到的结果:

a b
10 1
20 10

实现:

在es中可以通过collapse来实现以上功能,关于collapse的使用官方解释如下:

You can use the collapse parameter to collapse search results based on field values. The collapsing is done by selecting only the top sorted document per collapse key.

For example, the following search collapses results by user.id and sorts them by http.response.bytes.

GET my-index-000001/_search
{
  "query": {
    "match": {
      "message": "GET /search"
    }
  },
  "collapse": {
    "field": "user.id"    ----- ①     
  },
  "sort": [
    {
      "http.response.bytes": { -----②
        "order": "desc"
      }
    }
  ],
  "from": 0     -------③               
}

    
①:Collapse the result set using the user.id field


②:Sort the results by http.response.bytes


③:Define the offset of the first collapsed result

 es中创建索引并插入测试数据:

DELETE test

PUT /test/_doc/1
{
  "a":10,
  "b":1
}

PUT /test/_doc/2
{
  "a":10,
  "b":2
}

PUT /test/_doc/3
{
  "a":10,
  "b":3
}

PUT /test/_doc/4
{
  "a":20,
  "b":20
}

PUT /test/_doc/5
{
  "a":20,
  "b":10
}

PUT /test/_doc/6
{
  "a":20,
  "b":30
}

编写查询语句:
 

GET  /test/_search
{
  "query": {
    "match_all": {}
  },
  "collapse": {
    "field": "a"
  },
  "sort": [
    {
      "b": {
        "order": "asc"
      }
    }
  ]
}

结果如下:

{
  "took" : 855,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "a" : 10,
          "b" : 1
        },
        "fields" : {
          "a" : [
            10
          ]
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : null,
        "_source" : {
          "a" : 20,
          "b" : 10
        },
        "fields" : {
          "a" : [
            20
          ]
        },
        "sort" : [
          10
        ]
      }
    ]
  }
}

你可能感兴趣的:(Elasticsearch,sql,数据库,es,elasticsearch)