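Accessing Elasticsearch from Linux with curl (RESTful style)
This post shows how to access Elasticsearch with the PUT and GET methods.
1. Writing data with PUT
About the examples: megacorp is the index name, employee is the type name, and 1 and 2 are employee IDs.
With PUT, either of the following two forms works.
Form 1, with the URL at the end: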
# curl -H 'Content-Type: application/json' -X PUT \
-d '{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}' \
http://
Form 2, with the data at the end:
# curl -H 'Content-Type: application/json' -X PUT http:// \
-d '{"first_name":"Tom","last_name":"Crewes","age":23,"about":"I like making movies","interests":["sports","music"]}'
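The long one-line -d payloads above are easy to mistype. Below is a minimal sketch of the same PUT that keeps the document in a here-document and wraps the call in a function. The base URL http://localhost:9200 and the helper names employee_url and put_employee are assumptions made for illustration (the URLs in this post are cut off); the path simply follows the index/type/ID layout described above.

```shell
# Assumed base URL; replace with the address of your own node.
ES_URL="http://localhost:9200"

# Document body kept in a here-document so the JSON stays readable.
doc=$(cat <<'EOF'
{
  "first_name": "John",
  "last_name": "Smith",
  "age": 25,
  "about": "I love to go rock climbing",
  "interests": ["sports", "music"]
}
EOF
)

# Hypothetical helper: build the document URL as index/type/ID.
employee_url() {
  printf '%s/megacorp/employee/%s' "$ES_URL" "$1"
}

# Hypothetical helper: PUT the JSON body ($2) under employee ID ($1).
put_employee() {
  curl -s -H 'Content-Type: application/json' -X PUT \
    -d "$2" "$(employee_url "$1")"
}

# Usage, once a node is reachable:
#   put_employee 1 "$doc"
```

Nothing is sent until put_employee is called, so the body and URL can be checked first with echo "$doc" and employee_url 1.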
2. Querying data with GET
Fetch the data for employee 1:
# curl -X GET http://
Response:
{"_index":"megacorp","_type":"employee","_id":"1","_version":3,"_seq_no":2,"_primary_term":1,"found":true,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}}
Fetch all employees:
# curl -X GET http://
Response:
{"took":70,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":1.0,"_source":{"first_name" : "Jane","last_name":"Smith","age":32,"about":"I like to collect rock albums","interests":["music"]}},{"_index":"megacorp","_type":"employee","_id":"1","_score":1.0,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}},{"_index":"megacorp","_type":"employee","_id":"3","_score":1.0,"_source":{"first_name" :"Douglas","last_name":"Fir","age":35,"about":"I like to build cabinets","interests":[ "forestry" ]}}]}}
Find employees whose last name contains "Smith":
# curl -X GET http://
Response:
{"took":31,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.2876821,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":0.2876821,"_source":{"first_name" : "Jane","last_name":"Smith","age":32,"about":"I like to collect rock albums","interests":["music"]}},{"_index":"megacorp","_type":"employee","_id":"1","_score":0.2876821,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}}]}}
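A search issued as a plain GET with no body usually carries its condition in the q query-string parameter of the _search endpoint. The sketch below shows what such a request could look like; the base URL and the helper names search_url and search_by_field are assumptions for illustration, since the URL in the command above is cut off.

```shell
# Assumed base URL; replace with the address of your own node.
ES_URL="http://localhost:9200"

# Hypothetical helper: print the URI-search URL for a field ($1) and a value ($2).
search_url() {
  printf '%s/megacorp/employee/_search?q=%s:%s' "$ES_URL" "$1" "$2"
}

# Hypothetical helper: run the URI search with curl.
search_by_field() {
  curl -s -X GET "$(search_url "$1" "$2")"
}

# Usage, once a node is reachable:
#   search_by_field last_name Smith
```

This form is convenient for quick checks from the command line. Values containing spaces or special characters would need URL encoding, which this sketch does not handle.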
Find employees whose last name contains "Smith", this time with a JSON request body:
# curl -H 'Content-Type: application/json' -X GET http:// \
-d '{"query":{"match":{"last_name":"Smith"}}}'
Response:
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.6931472,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":0.6931472,"_source":{"first_name" : "Jane","last_name":"Smith","age":32,"about":"I like to collect rock albums","interests":["music"]}},{"_index":"megacorp","_type":"employee","_id":"1","_score":0.2876821,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}}]}}
Find employees whose last name contains "Smith" and who are older than 30:
# curl -H 'Content-Type: application/json' -X GET http:// \
-d '{"query":{"bool":{"filter":{"range":{"age":{"gt":30}}},"must":{"match":{"last_name":"Smith"}}}}}'
Response:
{"took":3,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.6931472,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":0.6931472,"_source":{"first_name" : "Jane","last_name":"Smith","age":32,"about":"I like to collect rock albums","interests":["music"]}}]}}
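The nesting of the bool query is hard to see on one line. Here is the same query body laid out across lines, as a sketch: the must clause matches on last_name and contributes to _score, while the filter clause only includes or excludes documents by age and does not affect scoring. The base URL and the helper name search_employees are assumptions for illustration.

```shell
# Assumed base URL; replace with the address of your own node.
ES_URL="http://localhost:9200"

# Same query as above, formatted for readability.
query=$(cat <<'EOF'
{
  "query": {
    "bool": {
      "must":   { "match": { "last_name": "Smith" } },
      "filter": { "range": { "age": { "gt": 30 } } }
    }
  }
}
EOF
)

# Hypothetical helper: send a JSON query body ($1) to the _search endpoint.
search_employees() {
  curl -s -H 'Content-Type: application/json' -X GET \
    -d "$1" "$ES_URL/megacorp/employee/_search"
}

# Usage, once a node is reachable:
#   search_employees "$query"
```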
Full-text search:
# curl -H 'Content-Type: application/json' -X GET http:// \
-d '{"query":{"match":{"about":"rock climbing"}}}'
Response:
{"took":5,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.6099695,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":0.6099695,"_source":{"first_name" : "Jane","last_name":"Smith","age":32,"about":"I like to collect rock albums","interests":["music"]}},{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}}]}}
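This is the same match query as before, now searching the about field for "rock climbing". Why does it return two matching documents?
A match query analyzes the search text into individual terms, returns the documents that contain any of them, and ranks the results by relevance (_score). John Smith's about field contains both rock and climbing, while Jane Smith's mentions rock but not climbing, so both documents match, and one would expect John's to score higher. In the response above Jane's document actually scores slightly higher (0.6099695 versus 0.5753642); a likely cause is that the index has 5 shards and only a handful of documents, and by default each shard computes scores from its own term statistics. Either way, the example shows how Elasticsearch performs full-text search: the notion of relevance is central, and it is the biggest difference from a traditional database, where a record simply matches or does not.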
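Scores on a very small index can look surprising, because each shard scores documents using only the term statistics it holds locally. One way to check whether that explains an ordering is to rerun the search with the search_type=dfs_query_then_fetch URL parameter, which gathers term statistics across shards before scoring. The sketch below assumes a base URL of http://localhost:9200; the helper names dfs_search_url and dfs_search are made up for illustration.

```shell
# Assumed base URL; replace with the address of your own node.
ES_URL="http://localhost:9200"

# The full-text query from above.
query='{"query":{"match":{"about":"rock climbing"}}}'

# Hypothetical helper: _search URL that requests cross-shard term statistics.
dfs_search_url() {
  printf '%s/megacorp/employee/_search?search_type=dfs_query_then_fetch' "$ES_URL"
}

# Hypothetical helper: run a JSON query body ($1) against that URL.
dfs_search() {
  curl -s -H 'Content-Type: application/json' -X GET \
    -d "$1" "$(dfs_search_url)"
}

# Usage, once a node is reachable:
#   dfs_search "$query"
```

The extra round trip makes this slower than the default, so it is better suited to diagnosing small test indices than to routine use.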
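Phrase search:
Finding individual words in a field is useful, but sometimes you need to match an exact phrase. For example, suppose we only want employees whose about field contains the exact phrase rock climbing.
To do that, change the match query to a match_phrase query.
# curl -H 'Content-Type: application/json' -X GET http:// \
-d '{"query":{"match_phrase" : {"about" : "rock climbing"}}}'
Response: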
{"took":8,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}}]}}
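This time only John Smith's document is returned, as expected.
Highlighting search results:
Many applications like to highlight (or italicize) the matched keywords in search results. This is easy in Elasticsearch: just add a highlight parameter.
# curl -H 'Content-Type: application/json' -X GET http:// \
-d '{"query":{"match_phrase":{"about":"rock climbing"}},"highlight":{"fields":{"about":{}}}}'
Response: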
{"took":48,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]},"highlight":{"about":["I love to go rock climbing"]}}]}}
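The result contains a new section named highlight, which holds the matching text from the about field with the matched words wrapped in HTML tags (<em></em> by default). The tags do not appear in the output shown above; they were most likely stripped when the page was rendered.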
Aggregations:
Elasticsearch calls its analytics feature aggregations. With it you can compute complex statistics over your data. It is similar to GROUP BY in SQL, but more powerful.
(1) Find the most popular interests among employees
# curl -H 'Content-Type: application/json' -X GET http:// \
-d '{"aggs":{"all_interests":{"terms":{"field":"interests.keyword"}}}}'
Response:
{"took":29,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":4,"max_score":1.0,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":1.0,"_source":{"first_name" : "Jane","last_name":"Smith","age":32,"about":"I like to collect rock albums","interests":["music"]}},{"_index":"megacorp","_type":"employee","_id":"4","_score":1.0,"_source":{"first_name" :"Douglas","last_name":"Tom","age":25,"about":"I like football","interests":[ "mathematics" ]}},{"_index":"megacorp","_type":"employee","_id":"1","_score":1.0,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}},{"_index":"megacorp","_type":"employee","_id":"3","_score":1.0,"_source":{"first_name" :"Douglas","last_name":"Fir","age":35,"about":"I like to build cabinets","interests":[ "forestry" ]}}]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"music","doc_count":2},{"key":"forestry","doc_count":1},{"key":"mathematics","doc_count":1},{"key":"sports","doc_count":1}]}}}
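Two employees like music, while forestry, mathematics, and sports have one employee each. These numbers are not precomputed; they are calculated on the fly when the query runs.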
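When only the buckets matter, the hits array in the response is just noise. A sketch of the same aggregation with "size": 0 added to the body, which asks Elasticsearch to return no hits and only the aggregations section. The base URL and the helper name run_aggregation are assumptions for illustration.

```shell
# Assumed base URL; replace with the address of your own node.
ES_URL="http://localhost:9200"

# Same terms aggregation as above, with "size": 0 to suppress the hits.
agg_query=$(cat <<'EOF'
{
  "size": 0,
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests.keyword" }
    }
  }
}
EOF
)

# Hypothetical helper: send an aggregation body ($1) to the _search endpoint.
run_aggregation() {
  curl -s -H 'Content-Type: application/json' -X GET \
    -d "$1" "$ES_URL/megacorp/employee/_search"
}

# Usage, once a node is reachable:
#   run_aggregation "$agg_query"
```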
(2) Aggregate the interests of employees whose last name is Smith
# curl -H 'Content-Type: application/json' -X GET http:// \
-d '{"query":{"match":{"last_name":"smith"}},"aggs":{"all_interests":{"terms":{"field":"interests.keyword"}}}}'
Response:
{"took":4,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.6931472,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":0.6931472,"_source":{"first_name" : "Jane","last_name":"Smith","age":32,"about":"I like to collect rock albums","interests":["music"]}},{"_index":"megacorp","_type":"employee","_id":"1","_score":0.2876821,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}}]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"music","doc_count":2},{"key":"sports","doc_count":1}]}}}
(3) Compute the average age for each interest
# curl -H 'Content-Type: application/json' -X GET http:// \
-d '{"aggs":{"all_interests":{"terms":{"field":"interests.keyword"},"aggs":{"avg_age":{"avg":{"field":"age"}}}}}}'
Response:
{"took":13,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":4,"max_score":1.0,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":1.0,"_source":{"first_name" : "Jane","last_name":"Smith","age":32,"about":"I like to collect rock albums","interests":["music"]}},{"_index":"megacorp","_type":"employee","_id":"4","_score":1.0,"_source":{"first_name" :"Douglas","last_name":"Tom","age":25,"about":"I like football","interests":[ "mathematics" ]}},{"_index":"megacorp","_type":"employee","_id":"1","_score":1.0,"_source":{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}},{"_index":"megacorp","_type":"employee","_id":"3","_score":1.0,"_source":{"first_name" :"Douglas","last_name":"Fir","age":35,"about":"I like to build cabinets","interests":[ "forestry" ]}}]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"music","doc_count":2,"avg_age":{"value":28.5}},{"key":"forestry","doc_count":1,"avg_age":{"value":35.0}},{"key":"mathematics","doc_count":1,"avg_age":{"value":25.0}},{"key":"sports","doc_count":1,"avg_age":{"value":25.0}}]}}}
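The result shows not only the count for each interest but also the average age of the employees who share it. Analyses like this take very little effort, so how much you get out of aggregations depends on what data you store and how it is structured.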
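As a sanity check on the avg_age values, the music bucket can be recomputed by hand from the hits in the response: its two employees are Jane (32) and John (25). A small sketch using awk for the arithmetic; the variable name avg_music is only illustrative.

```shell
# Ages of the two employees in the "music" bucket (Jane: 32, John: 25).
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
avg_music=$(LC_ALL=C awk 'BEGIN { printf "%.1f", (32 + 25) / 2 }')

echo "music avg_age: $avg_music"
# prints: music avg_age: 28.5
```

This matches the 28.5 reported for the music bucket; the other three buckets each hold a single employee, so their average is simply that employee's age.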
OK, that's all!
Reference: https://www.cnblogs.com/Wolfmanlq/p/5984376.html