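First, prepare some test data. Elasticsearch provides the _bulk API for batch operations.
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
However, the bulk insert failed when I sent it from Postman, so for now the documents are inserted one at a time: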
PUT localhost:9200/movies/movie/1
{
"title": "The Godfather",
"director": "Francis Ford Coppola",
"year": 1972,
"genres": ["Crime", "Drama"]
}
PUT localhost:9200/movies/movie/2
{
"title": "Lawrence of Arabia",
"director": "David Lean",
"year": 1962,
"genres": ["Adventure", "Biography", "Drama"]
}
PUT localhost:9200/movies/movie/3
{
"title": "To Kill a Mockingbird",
"director": "Robert Mulligan",
"year": 1962,
"genres": ["Crime", "Drama", "Mystery"]
}
PUT localhost:9200/movies/movie/4
{
"title": "Apocalypse Now",
"director": "Francis Ford Coppola",
"year": 1979,
"genres": ["Drama", "War"]
}
PUT localhost:9200/movies/movie/5
{
"title": "Kill Bill: Vol. 1",
"director": "Quentin Tarantino",
"year": 2003,
"genres": ["Action", "Crime", "Thriller"]
}
PUT localhost:9200/movies/movie/6
{
"title": "The Assassination of Jesse James by the Coward Robert Ford",
"director": "Andrew Dominik",
"year": 2007,
"genres": ["Biography", "Crime", "Drama"]
}
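For reference, here is a minimal Python sketch of how a _bulk request body for the same documents could be assembled. The Postman error itself is not shown above, so its cause is unknown; two common pitfalls are a body that does not end with a newline and a Content-Type other than application/x-ndjson. The helper name build_bulk_body is made up for this illustration, and only the first two movies are included to keep it short.

```python
import json


def build_bulk_body(index, docs, doc_type=None):
    """Build an NDJSON body: one action line followed by one source line per document."""
    lines = []
    for doc_id, source in docs.items():
        meta = {"_index": index, "_id": str(doc_id)}
        if doc_type is not None:
            # Mapping types such as "movie" are only accepted by older versions.
            meta["_type"] = doc_type
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(source))
    # The _bulk API requires the body to end with a newline character.
    return "\n".join(lines) + "\n"


docs = {
    1: {"title": "The Godfather", "director": "Francis Ford Coppola",
        "year": 1972, "genres": ["Crime", "Drama"]},
    2: {"title": "Lawrence of Arabia", "director": "David Lean",
        "year": 1962, "genres": ["Adventure", "Biography", "Drama"]},
}

body = build_bulk_body("movies", docs, doc_type="movie")
print(body)
```

The printed text would be sent as the raw body of POST localhost:9200/_bulk with the header Content-Type: application/x-ndjson.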
Once the data is inserted, a match_all query can be used to check that all documents are there:
POST localhost:9200/_search
{
"query":{
"match_all":{}
}
}
Find the movies whose "genres" field contains "Drama":
POST localhost:9200/_search
{
"query":{
"bool":{
"must":[
{"match":{"genres":"Drama"}}
]
}
}
}
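The clause "match":{"genres":"Drama"} matches documents whose "genres" field contains "Drama". The "must" array can hold several "match" clauses, and a document has to satisfy all of them.
Result: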
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 0.2876821,
"hits": [
{
"_index": "movies",
"_type": "movie",
"_id": "1",
"_score": 0.2876821,
"_source": {
"title": "The Godfather",
"director": "Francis Ford Coppola",
"year": 1972,
"genres": [
"Crime",
"Drama"
]
}
},
{
"_index": "movies",
"_type": "movie",
"_id": "3",
"_score": 0.2876821,
"_source": {
"title": "To Kill a Mockingbird",
"director": "Robert Mulligan",
"year": 1962,
"genres": [
"Crime",
"Drama",
"Mystery"
]
}
},
{
"_index": "movies",
"_type": "movie",
"_id": "4",
"_score": 0.14874382,
"_source": {
"title": "Apocalypse Now",
"director": "Francis Ford Coppola",
"year": 1979,
"genres": [
"Drama",
"War"
]
}
},
{
"_index": "movies",
"_type": "movie",
"_id": "2",
"_score": 0.12703528,
"_source": {
"title": "Lawrence of Arabia",
"director": "David Lean",
"year": 1962,
"genres": [
"Adventure",
"Biography",
"Drama"
]
}
},
{
"_index": "movies",
"_type": "movie",
"_id": "6",
"_score": 0.12703528,
"_source": {
"title": "The Assassination of Jesse James by the Coward Robert Ford",
"director": "Andrew Dominik",
"year": 2007,
"genres": [
"Biography",
"Crime",
"Drama"
]
}
}
]
}
}
There are five hits in total, and the "genres" field of every one of them contains "Drama".
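As a small illustration of putting more than one "match" clause under "must", the request body can be built programmatically. This is only a sketch: the helper names match and bool_must are invented here, and the second clause on "Crime" is an example of my own rather than a query run above.

```python
import json


def match(field, text):
    """Return a single match clause for one field."""
    return {"match": {field: text}}


def bool_must(*clauses):
    """Wrap the given clauses in a bool query; every clause under "must" has to match."""
    return {"query": {"bool": {"must": list(clauses)}}}


# Movies that are tagged with both "Drama" and "Crime".
two_genres = bool_must(match("genres", "Drama"), match("genres", "Crime"))
print(json.dumps(two_genres, indent=2))
```

The printed JSON can be pasted as the body of POST localhost:9200/_search.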
To narrow these five hits down further, use a "filter" clause.
For example: find the movies whose "genres" field contains "Drama", then keep only those whose "year" field is 1962:
POST localhost:9200/_search
{
"query":{
"bool":{
"must":[
{"match":{"genres":"Drama"}}
],
"filter":[
{"term":{"year":1962}}
]
}
}
}
Result:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.2876821,
"hits": [{
"_index": "movies",
"_type": "movie",
"_id": "3",
"_score": 0.2876821,
"_source": {
"title": "To Kill a Mockingbird",
"director": "Robert Mulligan",
"year": 1962,
"genres": ["Crime", "Drama", "Mystery"]
}
}, {
"_index": "movies",
"_type": "movie",
"_id": "2",
"_score": 0.12703528,
"_source": {
"title": "Lawrence of Arabia",
"director": "David Lean",
"year": 1962,
"genres": ["Adventure", "Biography", "Drama"]
}
}]
}
}
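Only two hits remain, and both have "year" equal to 1962. Their "_score" values are the same as in the unfiltered query, because clauses under "filter" only restrict the hits and do not contribute to the score.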
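According to the official documentation, a "term" clause matches documents in which the field contains exactly the given value, for example "term":{"year":1962}.
However, when I tried "term":{"director":"David Lean"} there were no hits. The most likely reason is that "director" was dynamically mapped as an analyzed text field: the standard analyzer indexes it as the lowercase tokens "david" and "lean", while "term" does not analyze its input, so the literal value "David Lean" never appears in the index. Assuming the default dynamic mapping, a "term" query on the "director.keyword" sub-field, or a "match_phrase" query on "director", should match instead.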
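The interaction between analysis and a "term" lookup can be imitated locally. The sketch below is a rough stand-in, not Elasticsearch's actual analyzer: simple_analyze only splits on non-word characters and lowercases, and both function names are invented for this illustration.

```python
import re


def simple_analyze(text):
    """Rough stand-in for the standard analyzer: split into words and lowercase them."""
    return re.findall(r"\w+", text.lower())


def term_matches(indexed_tokens, value):
    """A term query looks the value up exactly as given, without analyzing it."""
    return value in indexed_tokens


director_tokens = simple_analyze("David Lean")
print(director_tokens)                              # ['david', 'lean']
print(term_matches(director_tokens, "David Lean"))  # False
print(term_matches(director_tokens, "lean"))        # True

# A keyword-style field stores the whole value as one unmodified token.
print(term_matches(["David Lean"], "David Lean"))   # True
```

The numeric "year" field is not affected, which is why "term":{"year":1962} worked as expected.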
Elasticsearch also supports range filtering through the "range" clause, for example:
POST localhost:9200/_search
{
"query":{
"bool":{
"must":[
{"match":{"genres":"Drama"}}
],
"filter":[
{
"range":{
"year":{"gt":2000}
}
}
]
}
}
}
The clause "range":{"year":{"gt":2000}} keeps only the hits whose "year" field is greater than 2000.
Result:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.12703528,
"hits": [
{
"_index": "movies",
"_type": "movie",
"_id": "6",
"_score": 0.12703528,
"_source": {
"title": "The Assassination of Jesse James by the Coward Robert Ford",
"director": "Andrew Dominik",
"year": 2007,
"genres": [
"Biography",
"Crime",
"Drama"
]
}
}
]
}
}
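Besides "gt", the "range" clause accepts "gte", "lt" and "lte", and several bounds can be combined. The following sketch imitates that comparison locally on the years of the five "Drama" hits listed earlier; it is an illustration of the semantics only, and the names RANGE_OPS and in_range are made up.

```python
import operator

# The four comparison keys accepted inside a range clause.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """Return True when the value satisfies every bound, e.g. {"gte": 1960, "lt": 1970}."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())


# Document id -> year, taken from the "Drama" hits above.
years = {1: 1972, 2: 1962, 3: 1962, 4: 1979, 6: 2007}

after_2000 = [doc_id for doc_id, year in years.items() if in_range(year, {"gt": 2000})]
sixties = [doc_id for doc_id, year in years.items() if in_range(year, {"gte": 1960, "lt": 1970})]

print(after_2000)  # [6]
print(sixties)     # [2, 3]
```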
The "match_phrase" query:
POST localhost:9200/_search
{
"query":{
"match_phrase":{
"title":{
"query":"Lawrence,Arabia",
"slop" : 1
}
}
}
}
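The official documentation describes "match_phrase" as a phrase query: the analyzed query terms have to appear in the field in the same order. I have not fully mastered this clause, but as far as I can tell the query string "Lawrence,Arabia" is analyzed into the two terms "lawrence" and "arabia"; the comma is simply dropped by the analyzer and is not a special separator.
The setting to pay attention to in this example is "slop", which is how many positions the terms may be moved apart and still count as a match. With "slop": 1, one other word may sit between the two terms.
Result: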
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.6103166,
"hits": [
{
"_index": "movies",
"_type": "movie",
"_id": "2",
"_score": 1.6103166,
"_source": {
"title": "Lawrence of Arabia",
"director": "David Lean",
"year": 1962,
"genres": [
"Adventure",
"Biography",
"Drama"
]
}
}
]
}
}
In the "title" of the matching document, "Lawrence" and "Arabia" are separated by exactly one word, "of".
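To make the role of "slop" concrete, here is a simplified local model of a sloppy phrase match. It is an approximation written for this note, not the real Lucene implementation: the tokenizer is a plain regex, and the functions analyze and phrase_matches are invented names.

```python
import re
from itertools import product


def analyze(text):
    """Rough stand-in for the standard analyzer: split into words and lowercase them."""
    return re.findall(r"\w+", text.lower())


def phrase_matches(field_text, query_text, slop=0):
    """Return True when the query terms occur in the field within the allowed slop."""
    field_tokens = analyze(field_text)
    query_tokens = analyze(query_text)
    # For each query term, collect the positions at which it occurs in the field.
    positions = [
        [i for i, token in enumerate(field_tokens) if token == term]
        for term in query_tokens
    ]
    if not query_tokens or any(not found for found in positions):
        return False
    for combo in product(*positions):
        # Shift every position back by the term's offset in the query.
        # For an exact phrase all shifted values are identical.
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


title = "Lawrence of Arabia"
print(phrase_matches(title, "Lawrence,Arabia", slop=1))  # True
print(phrase_matches(title, "Lawrence,Arabia", slop=0))  # False
print(phrase_matches(title, "Lawrence of Arabia"))       # True
```

With "slop": 0 the same query would therefore be expected to return no hits for this title, since "of" sits between the two terms.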
The "match_phrase_prefix" query (phrase prefix matching):
POST localhost:9200/_search
{
"query":{
"match_phrase_prefix":{
"title":{"query":"Mock"}
}
}
}
Unlike "match_phrase", "match_phrase_prefix" treats the last term of the query as a prefix, so any "title" containing a word that starts with "Mock" matches.
Result:
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "movies",
"_type": "movie",
"_id": "3",
"_score": 0.2876821,
"_source": {
"title": "To Kill a Mockingbird",
"director": "Robert Mulligan",
"year": 1962,
"genres": [
"Crime",
"Drama",
"Mystery"
]
}
}
]
}
}
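According to the official documentation this clause also accepts a "max_expansions" setting. In my tests it made no visible difference. It caps how many indexed terms the final prefix term may expand to (50 by default), so with only a handful of distinct terms in this index the limit is probably never reached.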
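A possible way to picture "max_expansions" is as a cut-off on the list of indexed terms that start with the prefix. The sketch below is a toy model under that assumption: expand_prefix is an invented helper, the term set is taken from the titles of movies 1 to 4, and the extra terms "mock", "mockery" and "mockup" are hypothetical additions used only to reach the limit.

```python
def expand_prefix(indexed_terms, prefix, max_expansions=50):
    """Return the terms starting with the prefix, in sorted order, cut off at max_expansions."""
    candidates = [term for term in sorted(indexed_terms) if term.startswith(prefix)]
    return candidates[:max_expansions]


# Lowercased words from the titles of movies 1 to 4.
title_terms = {
    "the", "godfather", "lawrence", "of", "arabia",
    "to", "kill", "a", "mockingbird", "apocalypse", "now",
}

# Only one term starts with "mock", so any limit of 1 or more gives the same answer.
print(expand_prefix(title_terms, "mock"))                    # ['mockingbird']
print(expand_prefix(title_terms, "mock", max_expansions=1))  # ['mockingbird']

# With more candidate terms (hypothetical), a small limit starts to drop matches.
larger_terms = title_terms | {"mock", "mockery", "mockup"}
print(expand_prefix(larger_terms, "mock", max_expansions=2))  # ['mock', 'mockery']
```

In the last call "mockingbird" falls outside the limit, which is the kind of effect that cannot show up with a six-document index.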
The "multi_match" query (matching across several fields):
POST localhost:9200/_search
{
"query":{
"multi_match":{
"query":"Ford",
"fields":["title","director^3"]
}
}
}
"multi_match" runs the same query against several fields; "fields" lists the fields to search.
Result:
{
"took": 17,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 2.634553,
"hits": [
{
"_index": "movies",
"_type": "movie",
"_id": "4",
"_score": 2.634553,
"_source": {
"title": "Apocalypse Now",
"director": "Francis Ford Coppola",
"year": 1979,
"genres": [
"Drama",
"War"
]
}
},
{
"_index": "movies",
"_type": "movie",
"_id": "1",
"_score": 0.8630463,
"_source": {
"title": "The Godfather",
"director": "Francis Ford Coppola",
"year": 1972,
"genres": [
"Crime",
"Drama"
]
}
},
{
"_index": "movies",
"_type": "movie",
"_id": "6",
"_score": 0.69607234,
"_source": {
"title": "The Assassination of Jesse James by the Coward Robert Ford",
"director": "Andrew Dominik",
"year": 2007,
"genres": [
"Biography",
"Crime",
"Drama"
]
}
}
]
}
}
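A document matches when either "title" or "director" contains the term "Ford".
According to the official documentation, "director^3" makes the "director" field count three times as much when scoring. The effect is clear in the result: the two hits that matched on "director" are ranked ahead of the one that matched on "title".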
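The ordering can be reproduced with a toy scoring model. This is deliberately not the real relevance formula: every field match is worth 1 before boosting, and a document takes the best score among its fields. The helpers parse_field and toy_multi_match are invented for this sketch, and the documents are copied from the hits shown above plus movie 2 as a non-match.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: split into words and lowercase them."""
    return re.findall(r"\w+", text.lower())


def parse_field(spec):
    """Split "director^3" into ("director", 3.0); a missing boost means 1.0."""
    name, _, boost = spec.partition("^")
    return name, float(boost) if boost else 1.0


def toy_multi_match(docs, query, fields):
    """Rank documents by their best boosted field match (toy model, not real scoring)."""
    term = query.lower()
    ranked = []
    for doc_id, doc in docs.items():
        score = 0.0
        for spec in fields:
            name, boost = parse_field(spec)
            if term in analyze(doc[name]):
                score = max(score, boost)
        if score > 0:
            ranked.append((doc_id, score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


docs = {
    1: {"title": "The Godfather", "director": "Francis Ford Coppola"},
    4: {"title": "Apocalypse Now", "director": "Francis Ford Coppola"},
    6: {"title": "The Assassination of Jesse James by the Coward Robert Ford",
        "director": "Andrew Dominik"},
    2: {"title": "Lawrence of Arabia", "director": "David Lean"},
}

print(toy_multi_match(docs, "Ford", ["title", "director^3"]))
# [(1, 3.0), (4, 3.0), (6, 1.0)]
```

Without the "^3" all three matches would tie in this model, so the boost is what pushes the "director" matches to the top.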