[es] Inaccurate results from the cardinality aggregation

The problem:

Two queries that should report the same distinct count return different results.

Result 1:

 
     
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match_phrase": {
            "reqUA": "Jakarta Commons-HttpClient/3.1"
          }
        },
        {
          "match_phrase": {
            "reqReferer": "http://www.baidu.com/s?wd=www"
          }
        }
      ],
      "must": [
        {
          "range": {
            "reqTime": {
              "gte": "2016-09-25 22:00:00",
              "lte": "2016-09-26 22:00:00"
            }
          }
        },
        {
          "range": {
            "operateBeforeObj.sendTime": {
              "gte": "2016-09-25 22:00:00",
              "lte": "2016-09-26 22:00:00"
            }
          }
        },
        {
          "terms": {
            "productPageCode": [
              "10001",
              "33002"
            ]
          }
        }
      ]
    }
  },
  "from": 0,
  "aggs": {
    "channelTag": {
      "terms": {
        "field": "channelTag",
        "size": 0
      },
      "aggs": {
        "userId": {
          "cardinality": {
            "field": "user.userId"
          }
        }
      }
    }
  },
  "size": 0
}

Result 2:

 
     
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match_phrase": {
            "reqUA": "Jakarta Commons-HttpClient/3.1"
          }
        },
        {
          "match_phrase": {
            "reqReferer": "http://www.baidu.com/s?wd=www"
          }
        }
      ],
      "must": [
        {
          "range": {
            "reqTime": {
              "gte": "2016-09-25 22:00:00",
              "lte": "2016-09-26 22:00:00"
            }
          }
        },
        {
          "range": {
            "operateBeforeObj.sendTime": {
              "gte": "2016-09-25 22:00:00",
              "lte": "2016-09-26 22:00:00"
            }
          }
        },
        {
          "terms": {
            "productPageCode": [
              "10001",
              "33002"
            ]
          }
        }
      ]
    }
  },
  "from": 0,
  "aggs": {
    "userId": {
      "cardinality": {
        "field": "user.userId"
      }
    }
  },
  "size": 0
}

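Analysis:

The problem most likely lies in the cardinality aggregation, which returns an approximate count (Elasticsearch computes it with the HyperLogLog++ algorithm). The aggregation accepts a parameter, "precision_threshold": 100, where 100 is a threshold you set yourself. When the true number of distinct values is below the threshold, the result is expected to be close to exact; when it is above the threshold, the result is only approximate. The threshold can be raised, up to a maximum of 40000, at the cost of more memory.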
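As a rough illustration of the rule above, the hypothetical helper below picks a "precision_threshold" that covers an expected number of distinct values. The function name and the 20% headroom are assumptions made for this sketch, not Elasticsearch settings; the cap of 40000 is the maximum the Elasticsearch documentation lists for this parameter.

```python
# Maximum precision_threshold documented by Elasticsearch; larger values
# behave the same as 40000.
MAX_PRECISION_THRESHOLD = 40000


def choose_precision_threshold(expected_distinct):
    """Return a threshold slightly above the expected distinct count.

    Hypothetical helper: the 20% headroom is an arbitrary safety margin
    chosen for illustration.
    """
    if expected_distinct < 0:
        raise ValueError("expected_distinct must be non-negative")
    # Integer arithmetic only: add 20% headroom plus one.
    wanted = expected_distinct + expected_distinct // 5 + 1
    return min(wanted, MAX_PRECISION_THRESHOLD)


if __name__ == "__main__":
    print(choose_precision_threshold(80))
    print(choose_precision_threshold(1000000))
```

If the expected number of distinct values is far above the cap, the count stays approximate no matter which threshold is chosen.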

Solution:

 
    
{
  "aggs": {
    "author_count": {
      "cardinality": {
        "field": "author_hash",
        "precision_threshold": 100
      }
    }
  }
}
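Applied to the two queries from this post, the same parameter would go into each "cardinality" clause. The Python sketch below builds both aggregation bodies and prints them as JSON. The helper name and the threshold of 40000 are assumptions chosen for illustration, and the "query" part is left out for brevity.

```python
import json


def cardinality(field, precision_threshold):
    """Build a cardinality clause with an explicit precision_threshold."""
    return {
        "cardinality": {
            "field": field,
            "precision_threshold": precision_threshold,
        }
    }


# Assumed value for illustration; tune it to the expected number of users.
THRESHOLD = 40000

# Result 1: distinct users inside each channelTag bucket.
per_channel_aggs = {
    "channelTag": {
        "terms": {"field": "channelTag", "size": 0},  # as in the post
        "aggs": {"userId": cardinality("user.userId", THRESHOLD)},
    }
}

# Result 2: distinct users over all matching documents.
overall_aggs = {"userId": cardinality("user.userId", THRESHOLD)}

if __name__ == "__main__":
    print(json.dumps({"size": 0, "aggs": per_channel_aggs}, indent=2))
    print(json.dumps({"size": 0, "aggs": overall_aggs}, indent=2))
```

A higher threshold costs more memory: the Elasticsearch documentation puts it at roughly 8 bytes times the threshold for each cardinality aggregation, and the nested form pays that once per bucket.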
Reference:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html
