Basic Elasticsearch Operations

1. Integrating the IK analyzer with Elasticsearch

1.1 The IK analyzer

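IKAnalyzer is an open-source, lightweight Chinese word segmentation toolkit written in Java.
Since version 1.0 was released in December 2006, IKAnalyzer has gone through three major versions.
It started as a Chinese segmentation component built around the open-source Lucene project, combining
dictionary-based segmentation with grammar analysis. From version 3.0 on, IKAnalyzer became a
general-purpose segmentation component for Java that is independent of Lucene, while still providing
a default implementation optimized for Lucene.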

1.2 Features of the IK analyzer

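1. Uses its own "forward iterative finest-grained segmentation algorithm", with a stated throughput of 600,000 characters per second.
2. Uses a multi-sub-processor analysis mode that handles English letters (IP addresses, email addresses, URLs), numbers (dates, common Chinese
quantifiers, Roman numerals, scientific notation), and Chinese words (including personal and place names).
3. Optimized dictionary storage for individual entries, with a smaller memory footprint.
4. Supports user-defined dictionary extensions.
5. Provides IKQueryParser, a query analyzer optimized for Lucene full-text search; it uses an ambiguity-analysis algorithm to optimize
how query keywords are combined, which can greatly improve Lucene's hit rate.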

1.3 Download

https://github.com/medcl/elasticsearch-analysis-ik/tags?after=v8.0.0
The release matching the Elasticsearch version used in this article (7.12.0):
https://github.com/medcl/elasticsearch-analysis-ik/releases/tag/v7.12.0

1.4 Create an analysis-ik directory under plugins in the Elasticsearch installation directory, then unzip the plugin into it

[root@basenode analysis-ik]# pwd
/root/tools/elasticsearch/plugins/analysis-ik
[root@basenode analysis-ik]# unzip elasticsearch-analysis-ik-7.12.0.zip
[root@basenode analysis-ik]# rm -rf elasticsearch-analysis-ik-7.12.0.zip

1.5 Restart Elasticsearch

[root@basenode analysis-ik]# docker restart 949389b28b4e 6bc63ce3ab40 ea28731af194
949389b28b4e
6bc63ce3ab40
ea28731af194
[root@basenode analysis-ik]#

This approach failed: after the restart, Elasticsearch reported the following error:

"at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-7.12.0.jar:7.12.0]",
"... 6 more"] }
java.lang.IllegalStateException: Could not load plugin descriptor for plugin directory [elasticsearch-analysis-ik-7.12.0]
Likely root cause: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/plugins/elasticsearch-analysis-ik-7.12.0/plugin-descriptor.properties
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:218)
    at java.base/java.nio.file.Files.newByteChannel(Files.java:375)
    at java.base/java.nio.file.Files.newByteChannel(Files.java:426)
    at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:420)
    at java.base/java.nio.file.Files.newInputStream(Files.java:160)
    at org.elasticsearch.plugins.PluginInfo.readFromProperties(PluginInfo.java:173)
    at org.elasticsearch.plugins.PluginsService.readPluginBundle(PluginsService.java:405)
    at org.elasticsearch.plugins.PluginsService.findBundles(PluginsService.java:382)
    at org.elasticsearch.plugins.PluginsService.getPluginBundles(PluginsService.java:375)
    at org.elasticsearch.plugins.PluginsService.<init>(PluginsService.java:146)
    at org.elasticsearch.node.Node.<init>(Node.java:336)
    at org.elasticsearch.node.Node.<init>(Node.java:278)
    at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:217)
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:217)
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:397)
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159)
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150)
    at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:75)
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:116)
    at org.elasticsearch.cli.Command.main(Command.java:79)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:81)
For complete error details, refer to the log at /usr/share/elasticsearch/logs/elasticsearch.log
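
The trace above shows Elasticsearch looking for plugin-descriptor.properties directly inside each directory under plugins. A minimal Python sketch of that check, which could be run before restarting, follows. The helper name find_broken_plugin_dirs is hypothetical and not part of Elasticsearch, and the path in the usage section is simply the one reported in the error message.

```python
from pathlib import Path

DESCRIPTOR = "plugin-descriptor.properties"  # file name taken from the error above


def find_broken_plugin_dirs(plugins_dir):
    """Return names of plugin directories that lack a descriptor file.

    Hypothetical helper: it only mirrors the check that failed in the
    stack trace and does not validate the descriptor's contents.
    """
    broken = []
    for entry in sorted(Path(plugins_dir).iterdir()):
        if entry.is_dir() and not (entry / DESCRIPTOR).is_file():
            broken.append(entry.name)
    return broken


if __name__ == "__main__":
    # Assumed path: the plugins directory reported in the error message.
    plugins = Path("/usr/share/elasticsearch/plugins")
    if plugins.is_dir():
        print(find_broken_plugin_dirs(plugins))
```

An empty list means every plugin directory has its descriptor at the top level; any name returned is a directory Elasticsearch would likely refuse to load.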

1.6 Install the analyzer inside the container

Enter the container:

 docker exec -it elasticsearch /bin/bash

Download and install:

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.12.0/elasticsearch-analysis-ik-7.12.0.zip

Enter y when asked to confirm.
The full session:

[root@basenode ~]# docker exec -it elasticsearch /bin/bash
[root@949389b28b4e elasticsearch]#
[root@949389b28b4e elasticsearch]# ls
LICENSE.txt  NOTICE.txt  README.asciidoc  bin  config  data  jdk  lib  logs  modules  plugins
[root@949389b28b4e elasticsearch]# ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.12.0/elasticsearch-analysis-ik-7.12.0.zip
-> Installing https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.12.0/elasticsearch-analysis-ik-7.12.0.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.12.0/elasticsearch-analysis-ik-7.12.0.zip
[=================================================] 100%
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.net.SocketPermission * connect,resolve
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.

Continue with installation? [y/N]y
-> Installed analysis-ik
-> Please restart Elasticsearch to activate any plugins installed
[root@949389b28b4e elasticsearch]#

Run:

POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "我是中国人我爱大数据"
}

Output:

{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "中国人",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "中国",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "国人",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "人我",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "爱",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "CN_CHAR",
      "position" : 6
    },
    {
      "token" : "大数",
      "start_offset" : 7,
      "end_offset" : 9,
      "type" : "CN_WORD",
      "position" : 7
    },
    {
      "token" : "数据",
      "start_offset" : 8,
      "end_offset" : 10,
      "type" : "CN_WORD",
      "position" : 8
    }
  ]
}
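
The output above contains overlapping tokens such as 中国人, 中国, and 国人, which is what "finest-grained" segmentation means in practice. The toy sketch below is not IK's implementation. It assumes a tiny hand-picked dictionary (TOY_DICTIONARY) and a simple rule: emit every dictionary word found at any offset, plus any single character that no dictionary word covers. The function name and the dictionary contents are assumptions chosen for illustration.

```python
def finest_grained_tokens(text, dictionary):
    """Toy finest-grained segmentation: emit every dictionary word found in
    the text, plus any single character that no dictionary word covers."""
    max_len = max((len(word) for word in dictionary), default=1)
    covered = [False] * len(text)
    spans = []
    for start in range(len(text)):
        longest_end = min(len(text), start + max_len)
        # Try longer candidates first; words have at least two characters.
        for end in range(longest_end, start + 1, -1):
            if text[start:end] in dictionary:
                spans.append((start, end, "CN_WORD"))
                for i in range(start, end):
                    covered[i] = True
    # Characters outside every dictionary match become single-character tokens.
    spans += [(i, i + 1, "CN_CHAR") for i, hit in enumerate(covered) if not hit]
    # Order by start offset, longer token first when offsets tie.
    spans.sort(key=lambda s: (s[0], -s[1]))
    return [
        {"token": text[s:e], "start_offset": s, "end_offset": e, "type": kind}
        for s, e, kind in spans
    ]


# Hand-picked dictionary (an assumption made to match the output above).
TOY_DICTIONARY = {"中国人", "中国", "国人", "人我", "大数", "数据"}

if __name__ == "__main__":
    for tok in finest_grained_tokens("我是中国人我爱大数据", TOY_DICTIONARY):
        print(tok)
```

With this dictionary the sketch yields the same token texts and offsets as the response above. The real analyzer relies on a far larger dictionary and additional sub-processors, so agreement on other inputs should not be expected.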

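2. Creating an index

Syntax

PUT /index_name
{
  "settings": {
    "setting_name": "setting_value"
  }
}
# Example
PUT /wudldb
# Check whether an index exists
HEAD /index_name
# View an index
GET /index_name
# View several indices at once
GET /index_name1,index_name2,index_name3,...
# View all indices
GET _all
# or
GET /_cat/indices?v
# Open an index
POST /index_name/_open
# Close an index
POST /index_name/_close
# Delete one or more indices
DELETE /index_name1,index_name2,index_name3...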
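
The requests above follow a regular pattern: an HTTP method, one or more comma-separated index names, and an optional suffix. As a rough sketch of that pattern, the hypothetical helper index_request below maps an action name to a (method, path) pair. The helper and its action names are assumptions for illustration, and it does not send anything to a cluster.

```python
def index_request(action, *index_names):
    """Return the (method, path) pair for an index-level operation.

    Hypothetical helper covering only the operations listed above.
    """
    methods = {
        "create": ("PUT", ""),
        "exists": ("HEAD", ""),
        "get": ("GET", ""),
        "open": ("POST", "/_open"),
        "close": ("POST", "/_close"),
        "delete": ("DELETE", ""),
    }
    if action not in methods:
        raise ValueError(f"unsupported action: {action}")
    if not index_names:
        raise ValueError("at least one index name is required")
    if action == "create" and len(index_names) != 1:
        # Index names cannot contain commas, so creation takes a single name.
        raise ValueError("an index is created one name at a time")
    method, suffix = methods[action]
    # Several indices are addressed by joining their names with commas.
    return method, "/" + ",".join(index_names) + suffix


if __name__ == "__main__":
    print(index_request("create", "wudldb"))
    print(index_request("delete", "index_name1", "index_name2"))
```

Keeping request construction separate from sending makes the paths easy to inspect before anything is run against a real cluster.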

3. Mapping operations

PUT /index_name/_mapping
{
  "properties": {
    "field_name": {
      "type": "type",
      "index": true,
      "store": true,
      "analyzer": "analyzer_name"
    }
  }
}
# field_name: any name you choose; each field takes a number of attributes, for example:
# type: the field type, such as text, long, short, date, integer, or object
# index: whether the field is indexed; defaults to true
# store: whether the field is stored separately; defaults to false
# analyzer: the analyzer to use

# Example
PUT /wudldb-index
PUT /wudldb-index/_mapping
{
  "properties": {
    "name": {
      "type": "text",
      "analyzer": "ik_max_word"
    },
    "job": {
      "type": "text",
      "analyzer": "ik_max_word"
    },
    "logo": {
      "type": "keyword",
      "index": false
    },
    "payment": {
      "type": "float"
    }
  }
}
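
Mapping bodies written by hand can easily end up as invalid JSON, for example through a full-width comma or a boolean typed as a quoted string. One way to avoid that, sketched here, is to build the body as a Python dict and let json.dumps serialize it. The helper names text_field and build_mapping_body are assumptions for illustration; the field names and analyzer come from the example above.

```python
import json


def text_field(analyzer=None):
    """Mapping for a full-text field, optionally with a named analyzer."""
    field = {"type": "text"}
    if analyzer is not None:
        field["analyzer"] = analyzer
    return field


def build_mapping_body():
    """Assemble the properties used in the example above."""
    return {
        "properties": {
            "name": text_field("ik_max_word"),
            "job": text_field("ik_max_word"),
            # A Python False serializes to the JSON literal false.
            "logo": {"type": "keyword", "index": False},
            "payment": {"type": "float"},
        }
    }


if __name__ == "__main__":
    print("PUT /wudldb-index/_mapping")
    print(json.dumps(build_mapping_body(), indent=2))
```

The printed body can be pasted into the console after the PUT line.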

3.1 View the mapping

# View the mapping of one index
GET /index_name/_mapping
# or, for all indices
GET _mapping
# or
GET _all/_mapping

3.2 Modify the mapping

Update the mapping of an existing index:
PUT /index_name/_mapping
{
    "properties": {
        "field_name": {
            "type": "type",
            "index": true,
            "store": true,
            "analyzer": "analyzer_name"
        }
    }
}

3.3 Open an index

POST /index_name/_open

3.4 Close an index

POST /index_name/_close

3.5 Delete an index

DELETE /index_name1,index_name2,index_name3...

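4. Index mappings

Overview:

Once an index has been created, you have the equivalent of a database in a relational database system.
Elasticsearch 7.x drops the per-index type setting: the index APIs are typeless and documents go under the default _doc.
Fields still exist, however, and their constraints need to be defined; this is called field mapping (mapping).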

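4.1 Define index fields

PUT /index_name/_mapping
{
    "properties": {
        "field_name": {
            "type": "type",
            "index": true,
            "store": true,
            "analyzer": "analyzer_name"
        }
    }
}

field_name: any name you choose; each field takes a number of attributes, for example:
type: the field type, such as text, long, short, date, integer, or object
index: whether the field is indexed; defaults to true
store: whether the field is stored separately; defaults to false
analyzer: the analyzer to use

4.1.1 Example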

PUT /wudldb-index
PUT /wudldb-index/_mapping
{
  "properties": {
    "name": {
      "type": "text"
    },
    "job": {
      "type": "text"
    },
    "logo": {
      "type": "keyword",
      "index": false
    },
    "payment": {
      "type": "float"
    }
  }
}

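4.2 Overview of mapping field types

Field types fall roughly into the following categories:

String: two kinds
    1. text: analyzed (split into terms); not available for aggregations by default
    2. keyword: not analyzed; the value is matched as a whole and can be used in aggregations.
Numerical: numeric types, in two groups
    1. Basic types: long, integer, short, byte, double, float, half_float.
    2. High-precision floating-point type: scaled_float.
            1. Requires a scaling factor such as 10 or 100; Elasticsearch multiplies the real value by this factor before storing it.
Date: date type
    1. Elasticsearch can store dates as formatted strings, but storing them as millisecond values (long) is recommended
to save space.
Array: array type
    1. When matching, the field matches if any one element matches.
    2. When sorting, ascending order uses the smallest value in the array and descending order uses the largest.
Object: object type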
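
Two of the rules above are easy to check with a little arithmetic: the scaled_float scaling factor and the value an array contributes to a sort. The sketch below models both in plain Python. The function names are hypothetical, and the code only imitates the described behavior (multiply and round; min for ascending, max for descending) without calling Elasticsearch.

```python
def scaled_float_stored(value, scaling_factor):
    """Integer kept for a scaled_float: the value times the factor, rounded."""
    return round(value * scaling_factor)


def array_sort_key(values, order):
    """Value an array field contributes when sorting: min for asc, max for desc."""
    if order == "asc":
        return min(values)
    if order == "desc":
        return max(values)
    raise ValueError("order must be 'asc' or 'desc'")


if __name__ == "__main__":
    print(scaled_float_stored(12.34, 100))
    print(array_sort_key([3, 9, 5], "asc"), array_sort_key([3, 9, 5], "desc"))
```

For instance, a price of 12.34 with a scaling factor of 100 is kept as the integer 1234, and the array [3, 9, 5] sorts as 3 in ascending order and as 9 in descending order.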

4.3 View mappings

GET /index_name/_mapping
# or, for all indices
GET _all/_mapping
GET _mapping

4.4 Create an index and its mapping in one request

PUT /index_name
{
  "settings": {
    "setting_name": "setting_value"
  },
  "mappings": {
    "properties": {
      "field_name": {
        "mapping_attribute": "attribute_value"
      }
    }
  }
}
# Example
PUT /wubigdata
{
  "settings": {},
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}

5. Document CRUD operations

5.1 Add a document

Example

POST /wubigdata/_doc/1
{
  "name": "flink",
  "job": "大数据架构师",
  "payment": "30000",
  "logo": "https://image.baidu.com/"
}

Auto-generated ID:

POST /index_name/_doc
{
    "field":"value"
}
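
The two forms above differ only in whether the path ends with an ID. A small sketch of that rule follows; index_document_request is a hypothetical helper that builds the request line and JSON body without sending anything.

```python
import json


def index_document_request(index, document, doc_id=None):
    """Return (method, path, body) for adding one document.

    Hypothetical helper: with doc_id the document is stored under that ID,
    without it Elasticsearch generates one.
    """
    path = f"/{index}/_doc"
    if doc_id is not None:
        path += f"/{doc_id}"
    # ensure_ascii=False keeps Chinese text readable in the serialized body.
    return "POST", path, json.dumps(document, ensure_ascii=False)


if __name__ == "__main__":
    print(index_document_request("wubigdata", {"name": "flink"}, doc_id=1))
    print(index_document_request("wubigdata", {"name": "flink"}))
```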

5.2 View a single document