Elasticsearch 7.x索引、文档基本操作

索引操作

1、查询索引

查询索引命令上面已经展示过了,这里再补充一些

条件过滤

_cat/indices?v&health=yellow   #查询健康状态为yellow的索引

排序

_cat/indices?v&health=yellow&s=docs.count:desc #根据文档数量进行索引排序

索引详细信息

curl -X GET "localhost:9200/my_index/_stats?pretty" #索引详细信息

2、创建索引

PUT /student
{
    "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 1
    },
    "mappings": {
            "properties": {
                "name": {
                    "type":"text"
                },
                "country": {
                    "type":"keyword"
                },
                "age": {
                    "type":"integer"
                },
                "date": {
                    "type": "date",
                    "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
            }
        }
    }
}

创建成功

image

3、删除索引

curl -X DELETE "localhost:9200/index-name"

文档操作

在讲文档的CRUD之前我们要先理解 GET/PUT/POST/DELETE

POST /uri        #创建
DELETE /uri/xxx  #删除
PUT /uri/xxx     #更新或创建
GET /uri/xxx     #查看

思考POST和PUT的区别

  1. 在ES中,如果不确定文档的ID,那么就需要用POST,它可以自己生成唯一的文档ID。如果确定文档的ID,那么就可以用PUT,当然也可以用POST,它们都可以创建或修改文档(如果是修改,那么_version版本号提高1)。

  2. PUT、GET、DELETE是幂等的,而POST并不一定是幂等。如果你对POST也指定了文档ID,那它其实和PUT没啥区别,那它就是幂等。如果你没有指定文档ID那么就不是幂等操作了,因为同一数据,你执行多次POST,那么生成多个UUID。

1、添加文档

  • 1.1、指定文档ID
PUT blog/_doc/1
{
  "title":"1、VMware Workstation虚拟机软件安装图解",
  "author":"chengyuqiang",
  "content":"1、VMware Workstation虚拟机软件安装图解...",
  "url":"http://x.co/6nc81"
}

Elasticsearch服务会返回一个JSON格式的响应

{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 2
}

响应结果说明

_index:文档所在的索引名
_type:文档所在的类型名
_id:文档ID
_version:文档的版本
result:created已经创建
_shards: _shards表示索引操作的复制过程的信息。
total:指示应在其上执行索引操作的分片副本(主分片和副本分片)的数量。
successful:表示索引操作成功的分片副本数。
failed:在副本分片上索引操作失败的情况下包含复制相关错误。
  • 1.2、不指定文档ID
    添加文档时可以不指定文档id,则文档id是自动生成的字符串。注意,需要使用POST方法,而不是PUT方法
POST blog/_doc
{
  "title":"2、Linux服务器安装图解",
  "author":"chengyuqiang",
  "content":"2、Linux服务器安装图解解...",
  "url":"http://x.co/6nc82"
}

响应返回结果

{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "5P2-O2gBNSQY7o-KMw2P",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}

2、获取文档

  • 2.1、通过文档id获取指定的文档
GET blog/_doc/1
{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "title" : "1、VMware Workstation虚拟机软件安装图解",
    "author" : "chengyuqiang",
    "content" : "1、VMware Workstation虚拟机软件安装图解...",
    "url" : "http://x.co/6nc81"
  }
}

响应结果说明

found:是否查询到该文档
_source:文档的内容
  • 2.2、文档不存在的情况
GET blog/_doc/2
{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "2",
  "found" : false
}
  • 2.3、判定文档是否存在
HEAD blog/_doc/1
200 - OK

3、更新文档

  • 3.1、更改id为1的文档,删除了author,修改content字段
PUT blog/_doc/1
{
  "title":"1、VMware Workstation虚拟机软件安装图解",
  "content":"下载得到VMware-workstation-full-15.0.2-10952284.exe可执行文件...",
  "url":"http://x.co/6nc81"
}
{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}

_version更新为2
查看该文档

{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "found" : true,
  "_source" : {
    "title" : "1、VMware Workstation虚拟机软件安装图解",
    "content" : "下载得到VMware-workstation-full-15.0.2-10952284.exe可执行文件...",
    "url" : "http://x.co/6nc81"
  }
}
  • 3.2、添加文档时,防止覆盖已存在的文档,可以通过_create加以限制
PUT blog/_doc/1/_create
{
  "title":"1、VMware Workstation虚拟机软件安装图解",
  "content":"下载得到VMware-workstation-full-15.0.2-10952284.exe可执行文件...",
  "url":"http://x.co/6nc81"
}

该文档已经存在,添加失败

{
  "error": {
    "root_cause": [
      {
        "type": "version_conflict_engine_exception",
        "reason": "[_doc][1]: version conflict, document already exists (current version [2])",
        "index_uuid": "GqC2fSqPS06GRfTLmh1TLg",
        "shard": "1",
        "index": "blog"
      }
    ],
    "type": "version_conflict_engine_exception",
    "reason": "[_doc][1]: version conflict, document already exists (current version [2])",
    "index_uuid": "GqC2fSqPS06GRfTLmh1TLg",
    "shard": "1",
    "index": "blog"
  },
  "status": 409
}
  • 3.3、更新文档的字段
    通过脚本更新制定字段,其中ctx是脚本语言中的一个执行对象,先获取_source,再修改content字段
POST blog/_doc/1/_update
{
  "script": {
    "source": "ctx._source.content=\"从官网下载VMware-workstation,双击可执行文件进行安装...\""  
  } 
}

响应结果如下

{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}

再次获取文档GET blog/_doc/1,响应结果如下

{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "found" : true,
  "_source" : {
    "title" : "1、VMware Workstation虚拟机软件安装图解",
    "content" : "从官网下载VMware-workstation,双击可执行文件进行安装...",
    "url" : "http://x.co/6nc81"
  }
}
  • 3.4、添加字段
POST blog/_doc/1/_update
{
  "script": {
    "source": "ctx._source.author=\"chengyuqiang\""  
  } 
}

再次获取文档GET blog/_doc/1,响应结果如下

{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 4,
  "found" : true,
  "_source" : {
    "title" : "1、VMware Workstation虚拟机软件安装图解",
    "content" : "从官网下载VMware-workstation,双击可执行文件进行安装...",
    "url" : "http://x.co/6nc81",
    "author" : "chengyuqiang"
  }
}
  • 3.5、删除字段
POST blog/_doc/1/_update
{
  "script": {
    "source": "ctx._source.remove(\"url\")"  
  } 
}

再次获取文档GET blog/_doc/1,响应结果如下

{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 5,
  "found" : true,
  "_source" : {
    "title" : "1、VMware Workstation虚拟机软件安装图解",
    "content" : "从官网下载VMware-workstation,双击可执行文件进行安装...",
    "author" : "chengyuqiang"
  }
}

  • 4、删除文档
DELETE blog/_doc/1
{
  "_index" : "blog",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 6,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 6,
  "_primary_term" : 1
}

再次判定该文档是否存在,执行HEAD blog/_doc/1,响应结果404 - Not Found


5、批量操作

如果文档数量非常庞大,商业运维中都是海量数据,一个一个操作文档显然不合实际。幸运的是ElasticSearch提供了文档的批量操作机制。我们已经知道mget允许一次性检索多个文档,ElasticSearch提供了Bulk API,可以执行批量索引、批量删除、批量更新等操作,也就是说Bulk API允许使用在单个步骤中进行多次 create 、 index 、 update 或 delete 请求
bulk 与其他的请求体格式稍有不同,bulk请求格式如下:

{ action: { metadata }}\n
{ request body        }\n
{ action: { metadata }}\n
{ request body        }\n
...

这种格式类似一个有效的单行 JSON 文档 流 ,它通过换行符(\n)连接到一起。注意两个要点

- 每行一定要以换行符(\n)结尾, 包括最后一行 。这些换行符被用作一个标记,可以有效分隔行。
- 这些行不能包含未转义的换行符,因为他们将会对解析造成干扰。这意味着这个 JSON 不 能使用 pretty 参数打印。
- action/metadata 行指定 哪一个文档 做 什么操作 。metadata 应该 指定被索引、创建、更新或者删除的文档的 _index 、 _type 和 _id 。
- request body 行由文档的 _source 本身组成–文档包含的字段和值。它是 index 和 create 操作所必需的。
  • 5.1、批量导入
POST /_bulk
{ "create": { "_index": "blog", "_type": "_doc", "_id": "1" }}
{ "title": "1、VMware Workstation虚拟机软件安装图解" ,"author":"chengyuqiang","content":"官网下载VMware-workstation,双击可执行文件进行安装" , "url":"http://x.co/6nc81" }
{ "create": { "_index": "blog", "_type": "_doc", "_id": "2" }}
{ "title":  "2、Linux服务器安装图解" ,"author":  "chengyuqiang" ,"content": "VMware模拟Linux服务器安装图解" , "url": "http://x.co/6nc82" }
{ "create": { "_index": "blog", "_type": "_doc", "_id": "3" }}
{ "title":  "3、Xshell 6 个人版安装与远程操作连接服务器" , "author": "chengyuqiang" ,"content": "Xshell 6 个人版安装与远程操作连接服务器..." , "url": "http://x.co/6nc84" }

这个 Elasticsearch 响应包含 items 数组, 这个数组的内容是以请求的顺序列出来的每个请求的结果

{
  "took" : 132,
  "errors" : false,
  "items" : [
    {
      "create" : {
        "_index" : "blog",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 7,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 7,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "blog",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 8,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "blog",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}
  • 5.2、批量操作,包括删除、更新、新增
POST /_bulk
{ "delete": { "_index": "blog", "_type": "_doc", "_id": "1" }}
{ "update": { "_index": "blog", "_type": "_doc", "_id": "3", "retry_on_conflict" : 3} }
{ "doc" : {"title" : "Xshell教程"} }
{ "index": { "_index": "blog", "_type": "_doc", "_id": "4" }}
{ "title": "4、CentOS 7.x基本设置" ,"author":"chengyuqiang","content":"CentOS 7.x基本设置","url":"http://x.co/6nc85" }
{ "create": { "_index": "blog", "_type": "_doc", "_id": "5" }}
{ "title": "5、图解Linux下JDK安装与环境变量配置","author":"chengyuqiang" ,"content": "图解JDK安装配置" , "url": "http://x.co/6nc86" }

在7.0版本中,retry_on_conflict参数取代了之前的_retry_on_conflict

{
  "took" : 125,
  "errors" : false,
  "items" : [
    {
      "delete" : {
        "_index" : "blog",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 2,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 3,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "update" : {
        "_index" : "blog",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 4,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "index" : {
        "_index" : "blog",
        "_type" : "_doc",
        "_id" : "4",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "blog",
        "_type" : "_doc",
        "_id" : "5",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 5,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

6、批量获取

GET blog/_doc/_mget
{
    "ids" : ["1", "2","3"]
}

id为1的文档已经删除,所以没有搜索到

{
  "docs" : [
    {
      "_index" : "blog",
      "_type" : "_doc",
      "_id" : "1",
      "found" : false
    },
    {
      "_index" : "blog",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "title" : "2、Linux服务器安装图解",
        "author" : "chengyuqiang",
        "content" : "VMware模拟Linux服务器安装图解",
        "url" : "http://x.co/6nc82"
      }
    },
    {
      "_index" : "blog",
      "_type" : "_doc",
      "_id" : "3",
      "_version" : 2,
      "found" : true,
      "_source" : {
        "title" : "Xshell教程",
        "author" : "chengyuqiang",
        "content" : "Xshell 6 个人版安装与远程操作连接服务器...",
        "url" : "http://x.co/6nc84"
      }
    }
  ]
}

你可能感兴趣的:(Elasticsearch 7.x索引、文档基本操作)