POST index/type/_search
{
  "size": 10,
  "aggs": {
    "duplicateCount": {
      "terms": {
        "field": "link",
        "min_doc_count": 10
      },
      "aggs": {
        "duplicateDocuments": {
          "top_hits": {}
        }
      }
    }
  }
}
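size : number of documents returned in the top-level hits section
min_doc_count : only return buckets whose link value occurs in at least 10 documents
inner aggs : for each such bucket, the top_hits aggregation named duplicateDocuments returns the details of the duplicate documents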
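The response to the query above can be post-processed in a few lines. The following is a minimal Python sketch, not a definitive implementation: the helper names `build_duplicate_query` and `extract_duplicates` and the sample response are hypothetical, and the response fragment is trimmed to the fields the code reads.

```python
from typing import Any, Dict, List


def build_duplicate_query(field: str, min_doc_count: int, hits_size: int = 0) -> Dict[str, Any]:
    """Build the search body shown above for an arbitrary field (hypothetical helper)."""
    return {
        "size": hits_size,
        "aggs": {
            "duplicateCount": {
                "terms": {"field": field, "min_doc_count": min_doc_count},
                "aggs": {"duplicateDocuments": {"top_hits": {}}},
            }
        },
    }


def extract_duplicates(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each duplicated value to the IDs of the sample documents returned for it."""
    buckets = (
        response.get("aggregations", {})
        .get("duplicateCount", {})
        .get("buckets", [])
    )
    result: Dict[str, List[str]] = {}
    for bucket in buckets:
        # top_hits results live under the sub-aggregation name.
        hits = bucket["duplicateDocuments"]["hits"]["hits"]
        result[bucket["key"]] = [hit["_id"] for hit in hits]
    return result


# Hypothetical response fragment; values are made up for illustration.
sample_response = {
    "aggregations": {
        "duplicateCount": {
            "buckets": [
                {
                    "key": "http://example.com/a",
                    "doc_count": 12,
                    "duplicateDocuments": {
                        "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}
                    },
                }
            ]
        }
    }
}

duplicates = extract_duplicates(sample_response)
```

Note that top_hits returns only a few documents per bucket by default (3), so the ID lists are samples unless a larger "size" is set inside top_hits; doc_count holds the full count.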
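To detect duplicates across several fields, replace the field with a script that combines them. size is 0 here, so only the aggregation results are returned:
{
  "size": 0,
  "aggs": {
    "duplicateCount": {
      "terms": {
        "script": "doc['name'].values + doc['employeeID'].values + doc['organisation'].values",
        "min_doc_count": 2
      },
      "aggs": {
        "duplicateDocuments": {
          "top_hits": {}
        }
      }
    }
  }
}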
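The grouping that the multi-field query asks for can be illustrated locally. This is a plain-Python sketch of the semantics only (group by the combined field values, keep groups with at least min_doc_count members); it is not how Elasticsearch evaluates the script, and the function name `find_duplicates`, the `id` key, and the sample records are assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple


def find_duplicates(
    docs: Sequence[Dict[str, Any]],
    fields: Sequence[str],
    min_doc_count: int = 2,
) -> Dict[Tuple[Any, ...], List[str]]:
    """Group documents by the combined values of `fields` and keep repeated groups."""
    groups: Dict[Tuple[Any, ...], List[str]] = defaultdict(list)
    for doc in docs:
        # The composite key plays the role of the script-generated term.
        key = tuple(doc.get(name) for name in fields)
        groups[key].append(doc["id"])
    # Same threshold rule as min_doc_count: keep groups with at least that many docs.
    return {key: ids for key, ids in groups.items() if len(ids) >= min_doc_count}


# Made-up records for illustration.
records = [
    {"id": "1", "name": "Ann", "employeeID": "E1", "organisation": "Acme"},
    {"id": "2", "name": "Ann", "employeeID": "E1", "organisation": "Acme"},
    {"id": "3", "name": "Bob", "employeeID": "E2", "organisation": "Acme"},
]

dupes = find_duplicates(records, ["name", "employeeID", "organisation"])
```

Here only the two "Ann" records share all three field values, so they form the single duplicate group.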
Reference: https://qbox.io/blog/minimizing-document-duplication-in-elasticsearch