Elasticsearch Notes 004 - Document API - CRUD - Single-Document Read Operations

[toc]

Single-document GET API

1. Basic GET query

GET weibo/_doc/1

output:

{
  "_index" : "weibo",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "user" : "niewj",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
  }
}
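
The same lookup can be sketched from Python using only the standard library. This is a minimal sketch, not an official client: the base URL `http://localhost:9200` and the helper names `doc_url`, `parse_get_response`, and `get_document` are assumptions made for illustration, and the cluster is assumed to be unsecured.

```python
import json
import urllib.error
import urllib.request

# Assumption: a local, unsecured single-node cluster; adjust for your setup.
BASE_URL = "http://localhost:9200"


def doc_url(index, doc_id, base_url=BASE_URL):
    """Build the URL for GET {index}/_doc/{id}."""
    return f"{base_url}/{index}/_doc/{doc_id}"


def parse_get_response(body):
    """Return the document's _source, or None when found is false."""
    data = json.loads(body)
    return data.get("_source") if data.get("found") else None


def get_document(index, doc_id, base_url=BASE_URL):
    """Fetch one document over HTTP (needs a running cluster)."""
    try:
        with urllib.request.urlopen(doc_url(index, doc_id, base_url)) as resp:
            return parse_get_response(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # A missing document comes back as 404 with a JSON body.
        if err.code == 404:
            return parse_get_response(err.read().decode("utf-8"))
        raise
```

Calling `get_document("weibo", 1)` against a cluster holding the document above would return the `_source` dictionary; the parsing helper can be tried on its own with any JSON string shaped like the output shown.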

2. Checking whether a document exists with HEAD

Result of GET weibo/_doc/2:

{
  "_index" : "weibo",
  "_type" : "_doc",
  "_id" : "2",
  "found" : false
}

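The response shows that the document with id=2 does not exist. The same check with HEAD:

HEAD weibo/_doc/2 returns: 404 - Not Found

The document with id=1 does exist. Result of GET weibo/_doc/1: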

{
  "_index" : "weibo",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "user" : "niewj",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
  }
}

Checking with HEAD: HEAD weibo/_doc/1 returns: 200 - OK

In short, HEAD returns 404 when the document is not found and 200 when it is.
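
An existence check built on those two status codes might look like the following sketch. The function names and the default base URL are illustrative assumptions; only the 200/404 mapping comes from the notes above.

```python
import urllib.error
import urllib.request


def exists_from_status(status):
    """Map a HEAD status code to a boolean: 200 means found, 404 means missing."""
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"unexpected status for HEAD: {status}")


def document_exists(index, doc_id, base_url="http://localhost:9200"):
    """Send HEAD {index}/_doc/{id} (needs a running cluster)."""
    req = urllib.request.Request(f"{base_url}/{index}/_doc/{doc_id}", method="HEAD")
    try:
        with urllib.request.urlopen(req) as resp:
            return exists_from_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses, so a 404 arrives here.
        return exists_from_status(err.code)
```

Because HEAD carries no response body, this is cheaper than a full GET when only existence matters.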

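3. Real-time behavior of GET

By default the get API is realtime and is not affected by the index refresh rate: if a document has been updated but not yet refreshed, the get API issues a refresh call so that the document becomes visible. To disable realtime GET, set the realtime parameter to false.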
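
As a small illustration, the request path with `realtime=false` can be assembled like this. The helper `get_path` is hypothetical; it only builds the string that would be sent and does not contact a cluster.

```python
from urllib.parse import urlencode


def get_path(index, doc_id, **params):
    """Build a GET path, rendering Python booleans as true/false."""
    rendered = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    query = urlencode(rendered)
    path = f"{index}/_doc/{doc_id}"
    return f"{path}?{query}" if query else path


print(get_path("weibo", 1))                  # weibo/_doc/1
print(get_path("weibo", 1, realtime=False))  # weibo/_doc/1?realtime=false
```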

4. Disabling _source

By default, a get operation returns the contents of the _source field unless it has been disabled:

GET weibo/_doc/1:

{
  "_index" : "weibo",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "user" : "niewj",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
  }
}

With _source turned off: GET weibo/_doc/1?_source=false:

{
  "_index" : "weibo",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true
}

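5. Filtering _source fields (excludes/includes)

If you only need one or two fields from _source, use the _source_includes and _source_excludes parameters to include or filter out fields:

_source_includes: return only the user and message fields of the _source node: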

GET weibo/_doc/1?_source_includes=user,message

{
  "_index" : "weibo",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "message" : "trying out Elasticsearch",
    "user" : "niewj"
  }
}

_source: the include filter above can be shortened to the _source parameter:

GET weibo/_doc/1?_source=user,message

_source_excludes: exclude the message field from the _source node and show everything else:

GET weibo/_doc/1?_source_excludes=message

{
  "_index" : "weibo",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "post_date" : "2009-11-15T14:12:12",
    "user" : "niewj"
  }
}
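
The effect of the two filters can be modeled locally on a plain dictionary. This is a simplified sketch that handles exact top-level field names only; it is not how Elasticsearch implements filtering, and `filter_source` is a hypothetical helper.

```python
def filter_source(source, includes=None, excludes=None):
    """Apply an include list, then an exclude list, to top-level field names."""
    result = dict(source)
    if includes:
        result = {k: v for k, v in result.items() if k in includes}
    if excludes:
        result = {k: v for k, v in result.items() if k not in excludes}
    return result


doc = {
    "user": "niewj",
    "post_date": "2009-11-15T14:12:12",
    "message": "trying out Elasticsearch",
}

# Mirrors ?_source_includes=user,message
print(filter_source(doc, includes=["user", "message"]))
# Mirrors ?_source_excludes=message
print(filter_source(doc, excludes=["message"]))
```

The two calls keep the same fields as the two responses shown above, although the key order in a real response may differ.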

6. Fetching the data directly with _source

Use /{index}/_source/{id} to get only the document's _source field, without any additional content. For example:

GET weibo/_source/1

{
  "user" : "niewj",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

As shown, only the raw data is returned. The _source filtering parameters described above also work here:

GET weibo/_source/1?_source=user

{
  "user" : "niewj"
}

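7. The stored_fields parameter and fields stored at index time

By default, field values are indexed to make them searchable, but they are not stored as original values; this means the field can be queried, but its original value cannot be retrieved from the field itself. An example:

# In the mapping, counter is indexed but not stored; tags is indexed and stored.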
PUT twitter
{
   "mappings": {
       "properties": {
          "counter": {
             "type": "integer",
             "store": false
          },
          "tags": {
             "type": "keyword",
             "store": true
          }
       }
   }
}

Index a document:

# Index one document with id=1
PUT twitter/_doc/1
{
    "counter" : 1,
    "tags" : ["red"]
}

Query: GET twitter/_doc/1

{
  "_index" : "twitter",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "counter" : 1,
    "tags" : [
      "red"
    ]
  }
}

Both fields come back in _source. Querying with the stored_fields parameter: GET twitter/_doc/1?stored_fields=tags,counter:

{
  "_index" : "twitter",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "fields" : {
    "tags" : [
      "red"
    ]
  }
}

The counter field is missing because the mapping sets store: false for it.
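
The selection rule can be modeled locally as follows. This is a simplified sketch under the assumption that only the `store` flag in the mapping decides what is returned; `stored_field_values` is a hypothetical helper, not part of any Elasticsearch client.

```python
def stored_field_values(mapping, source, requested):
    """Return the requested fields that the mapping marks as stored."""
    properties = mapping["mappings"]["properties"]
    fields = {}
    for name in requested:
        if properties.get(name, {}).get("store") and name in source:
            value = source[name]
            # Stored values are returned as arrays, as in the response above.
            fields[name] = value if isinstance(value, list) else [value]
    return fields


mapping = {
    "mappings": {
        "properties": {
            "counter": {"type": "integer", "store": False},
            "tags": {"type": "keyword", "store": True},
        }
    }
}
source = {"counter": 1, "tags": ["red"]}

print(stored_field_values(mapping, source, ["tags", "counter"]))
# {'tags': ['red']}
```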

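8. A parameter that affects indexing and query performance: refresh

Set refresh to true to refresh the relevant shard before the get operation and make it searchable. Think carefully and verify before setting it to true, because it can put a heavy load on the system and slow down indexing.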