hancoder

Gulimall (谷粒商城): Official Notes, Distributed Advanced Part (3/4)

Personal notes (a detailed beginner-level tutorial):

https://blog.csdn.net/hancoder/article/details/106922139
Official notes, basics: https://blog.csdn.net/hancoder/article/details/107612619
Official notes, advanced: https://blog.csdn.net/hancoder/article/details/107612746
Official notes, cluster: https://blog.csdn.net/hancoder/article/details/107612802
Official notes download: https://download.csdn.net/download/hancoder/12665396

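Contents

- 1. ELASTICSEARCH
  - 1. Install Elasticsearch
  - 2. Basic retrieval
    - 1) _cat
    - 2) Index a document
    - 3) Get a document
    - 4) Update a document
    - 5) Delete a document or an index
    - 6) Elasticsearch bulk operations: bulk
    - 7) Sample test data
  - 3. Search
    - 1) Search API
    - 2) Query DSL
      - (1) Basic syntax
      - (2) Return selected fields
      - (3) match query
      - (4) match_phrase [phrase matching]
      - (5) multi_match [multi-field matching]
      - (6) bool for compound queries
      - (7) Filter [result filtering]
      - (8) term
      - (9) Aggregation
    - 3) Mapping
      - (1) Field types
      - (2) Mapping
      - (3) Changes in newer versions
        - Create a mapping
        - View a mapping
        - Add a new field mapping
        - Update a mapping
        - Data migration
    - 4) Analysis (tokenization)
      - (1) Install the IK analyzer
        - (1) Check the Elasticsearch version
        - (2) Enter the plugins directory inside the ES container
      - (2) Test the analyzer
      - (3) Custom dictionary
  - 4. elasticsearch-Rest-Client
    - 1) 9300: TCP
    - 2) 9200: HTTP
  - 5. Appendix: install Nginx
- SpringBoot integration with ElasticSearch
  - 1. Add dependencies
  - 2. Write a test class
    - 1) Test saving data
    - 2) Test fetching data
- Other
  - 1. Kibana console commands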

1. ELASTICSEARCH

1. Install Elasticsearch

Install Elasticsearch in Docker

(1) Pull the Elasticsearch and Kibana images

docker pull elasticsearch:7.6.2
docker pull kibana:7.6.2

(2) Configuration

mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/

(3) Start Elasticsearch

docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e  "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v  /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.6.2

Set Elasticsearch to start automatically on boot:

docker update elasticsearch --restart=always

(4) Start Kibana:

docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.137.14:9200 -p 5601:5601 -d kibana:7.6.2

Set Kibana to start automatically on boot:

docker update kibana  --restart=always

(5) Test

View the Elasticsearch version information: http://192.168.137.14:9200/

{
     
    "name": "0adeb7852e00",
    "cluster_name": "elasticsearch",
    "cluster_uuid": "9gglpP0HTfyOTRAaSe2rIg",
    "version": {
     
        "number": "7.6.2",
        "build_flavor": "default",
        "build_type": "docker",
        "build_hash": "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
        "build_date": "2020-03-26T06:34:37.794943Z",
        "build_snapshot": false,
        "lucene_version": "8.4.0",
        "minimum_wire_compatibility_version": "6.8.0",
        "minimum_index_compatibility_version": "6.0.0-beta1"
    },
    "tagline": "You Know, for Search"
}

Show the Elasticsearch node information: http://192.168.137.14:9200/_cat/nodes

127.0.0.1 76 95 1 0.26 1.40 1.22 dilm * 0adeb7852e00

Open Kibana: http://192.168.137.14:5601/app/kibana

2. Basic retrieval

1) _cat

(1) GET /_cat/nodes: list all nodes

For example, http://192.168.137.14:9200/_cat/nodes :

127.0.0.1 61 91 11 0.08 0.49 0.87 dilm * 0adeb7852e00

Note: * marks the master node of the cluster.

(2) GET /_cat/health: check the health of Elasticsearch

For example, http://192.168.137.14:9200/_cat/health

1588332616 11:30:16 elasticsearch green 1 1 3 3 0 0 0 0 - 100.0%

Note: green means the cluster is healthy.

(3) GET /_cat/master: show the master node

For example, http://192.168.137.14:9200/_cat/master

vfpgxbusTC6-W3C2Np31EQ 127.0.0.1 127.0.0.1 0adeb7852e00

(4) GET /_cat/indices: list all indices, comparable to show databases; in MySQL

For example, http://192.168.137.14:9200/_cat/indices

green open .kibana_task_manager_1   KWLtjcKRRuaV9so_v15WYg 1 0 2 0 39.8kb 39.8kb
green open .apm-agent-configuration cuwCpJ5ER0OYsSgAJ7bVYA 1 0 0 0   283b   283b
green open .kibana_1                PqK_LdUYRpWMy4fK0tMSPw 1 0 7 0 31.2kb 31.2kb

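2) Index a document

Save a document: specify which index and which type it goes under, and which unique identifier to use.
PUT customer/external/1 saves document 1 under the external type of the customer index, with the following body: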

PUT customer/external/1

{
  "name": "John Doe"
}

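Both PUT and POST can be used.
POST creates a document. If no id is given, one is generated automatically. If an id is given and the document already exists, the document is updated and its version number is incremented.
PUT can create or update, but it always requires an id; a PUT without an id is rejected with an error, so PUT is normally used for updates.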
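The PUT/POST rules above can be sketched in Python. This is only an illustration: it builds the HTTP request without sending it, and the helper name build_index_request and the BASE_URL constant are assumptions made for this example, not part of any Elasticsearch client.

```python
import json
from urllib.request import Request

# Assumption: the host used throughout these notes.
BASE_URL = "http://192.168.137.14:9200"


def build_index_request(index, doc_type, doc, doc_id=None, method="POST"):
    """Build (but do not send) a request that indexes one document."""
    if method not in ("PUT", "POST"):
        raise ValueError("method must be PUT or POST")
    if method == "PUT" and doc_id is None:
        # PUT always needs an explicit id.
        raise ValueError("PUT requires a document id")
    path = f"/{index}/{doc_type}"
    if doc_id is not None:
        # With an id, the same URL creates the document or overwrites it.
        path += f"/{doc_id}"
    return Request(
        BASE_URL + path,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


req = build_index_request("customer", "external", {"name": "John Doe"},
                          doc_id=1, method="PUT")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would actually send it; that step is left out here so the sketch runs without a cluster.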

Test data sent with Postman:

When the document is created successfully, the response is 201 Created.

{
     
    "_index": "customer",
    "_type": "external",
    "_id": "1",
    "_version": 1,
    "result": "created",
    "_shards": {
     
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 0,
    "_primary_term": 1
}

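Meaning of the fields in the returned JSON: the fields that begin with an underscore are called metadata and describe the basic information of the document.

"_index": "customer": the index the document belongs to (comparable to a database);

"_type": "external": the type the document belongs to;

"_id": "1": the id of the saved document;

"_version": 1: the version of the saved document;

"result": "created": a document was created. If the same document is PUT again, the result becomes updated and the version number changes as well.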

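Now with POST:

When a document is added without an id, an id is generated automatically and the result is created:

POSTing again without an id still creates a new document:

When a document is added with an explicit id, that id is used and the result is created:

POSTing again with the same id gives the result updated.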

3) Get a document

GET /customer/external/1

http://192.168.137.14:9200/customer/external/1

{
     
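    "_index": "customer",// which index
    "_type": "external",// which type
    "_id": "1",// document id
    "_version": 3,// version number
    "_seq_no": 6,// concurrency-control field, incremented on every update, used for optimistic locking
    "_primary_term": 1,// same purpose; changes when the primary shard is reassigned, e.g. after a restart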
    "found": true,
    "_source": {
     
        "name": "John Doe"
    }
}

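Passing "if_seq_no=1&if_primary_term=1" makes the write conditional: the document is modified only when the sequence number and primary term match, otherwise it is left unchanged.

Example: update the document with id=1 to name=1, then to name=2, starting from _seq_no=6 and _primary_term=1.

(1) Update name to 1

http://192.168.137.14:9200/customer/external/1?if_seq_no=6&if_primary_term=1

(2) Update name to 2, still using if_seq_no=6

http://192.168.137.14:9200/customer/external/1?if_seq_no=6&if_primary_term=1

The update fails with a version conflict, because the first update already advanced _seq_no.

(3) Query the document again

http://192.168.137.14:9200/customer/external/1

_seq_no is now 7.

(4) Update again with if_seq_no=7; this time the update succeeds

http://192.168.137.14:9200/customer/external/1?if_seq_no=7&if_primary_term=1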
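The four steps above are a compare-and-set pattern. The following Python sketch is an in-memory model of that idea, not Elasticsearch code: the classes Doc and VersionConflict are hypothetical, and the model tracks a single document, whereas a real cluster assigns sequence numbers per shard.

```python
class VersionConflict(Exception):
    """Raised when the caller's (seq_no, primary_term) is stale."""


class Doc:
    """Toy model of one document guarded by optimistic locking."""

    def __init__(self, source, seq_no=0, primary_term=1):
        self.source = source
        self.seq_no = seq_no
        self.primary_term = primary_term

    def update(self, source, if_seq_no, if_primary_term):
        # Write only if the caller saw the latest state of the document.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict(
                f"expected seq_no={if_seq_no}, current seq_no={self.seq_no}"
            )
        self.source = source
        self.seq_no += 1  # every successful write advances the sequence number


doc = Doc({"name": "John Doe"}, seq_no=6, primary_term=1)

doc.update({"name": "1"}, if_seq_no=6, if_primary_term=1)      # step (1): succeeds
try:
    doc.update({"name": "2"}, if_seq_no=6, if_primary_term=1)  # step (2): stale
    conflict = False
except VersionConflict:
    conflict = True
seen = doc.seq_no                                              # step (3): re-read
doc.update({"name": "2"}, if_seq_no=seen, if_primary_term=1)   # step (4): succeeds
print(conflict, doc.source, doc.seq_no)
```

The point of the pattern is that the second writer must re-read the document before retrying, so a concurrent change is never overwritten silently.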

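4) Update a document

(1) POST update with _update

http://192.168.137.14:9200/customer/external/1/_update

With _update, the fields to change are wrapped in a "doc" object in the request body. Running the same update again does nothing, and the sequence number does not change.

This form of update compares the new data with the existing document; if they are identical, nothing is executed and neither _version nor _seq_no changes.

(2) POST update without _update

Repeating the same update still succeeds every time, because the new data is not compared with the existing document.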

5) Delete a document or an index

DELETE customer/external/1
DELETE customer

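Note: Elasticsearch does not provide an operation to delete a type; only indices and documents can be deleted.

Example: delete the document with id=1, then query it again

Example: delete the whole customer index

Indices before the deletion: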

green  open .kibana_task_manager_1   KWLtjcKRRuaV9so_v15WYg 1 0 2 0 39.8kb 39.8kb
green  open .apm-agent-configuration cuwCpJ5ER0OYsSgAJ7bVYA 1 0 0 0   283b   283b
green  open .kibana_1                PqK_LdUYRpWMy4fK0tMSPw 1 0 7 0 31.2kb 31.2kb
yellow open customer                 nzDYCdnvQjSsapJrAIT8Zw 1 1 4 0  4.4kb  4.4kb

Delete the "customer" index

Indices after the deletion:

green  open .kibana_task_manager_1   KWLtjcKRRuaV9so_v15WYg 1 0 2 0 39.8kb 39.8kb
green  open .apm-agent-configuration cuwCpJ5ER0OYsSgAJ7bVYA 1 0 0 0   283b   283b
green  open .kibana_1                PqK_LdUYRpWMy4fK0tMSPw 1 0 7 0 31.2kb 31.2kb

6) Elasticsearch bulk operations: bulk

Syntax:

{action: {metadata}}\n
{request body}\n

{action: {metadata}}\n
{request body}\n

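If one action in a bulk request fails, the remaining actions are still executed; the actions are independent of each other.

The bulk API executes all actions in order. If a single action fails for any reason, it continues with the actions after it. When the bulk API returns, it reports the status of every action, in the order they were sent, so you can check whether a particular action failed.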
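Because every action line and every body line must be a single-line JSON object terminated by a newline, it is easy to get the payload wrong by hand. A minimal Python sketch of building such a payload follows; the helper name build_bulk_body and the third (delete) operation are assumptions added for illustration.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, body) tuples into a bulk payload.

    body is None for actions that carry no request body, such as delete.
    """
    lines = []
    for action, metadata, body in operations:
        lines.append(json.dumps({action: metadata}))  # action line
        if body is not None:
            lines.append(json.dumps(body))            # optional source line
    # One JSON object per line, and the payload ends with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("index", {"_id": "1"}, {"name": "John Doe"}),
    ("index", {"_id": "2"}, {"name": "John Doe"}),
    ("delete", {"_id": "3"}, None),  # hypothetical extra action
])
print(body, end="")
```

json.dumps never emits raw newlines with its default settings, which is what keeps each object on one line.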

Example 1: index several documents

POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"John Doe"}

Result:

#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
{
     
  "took" : 491,
  "errors" : false,
  "items" : [
    {
     
      "index" : {
     
        "_index" : "customer",
        "_type" : "external",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
     
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
     
      "index" : {
     
        "_index" : "customer",
        "_type" : "external",
        "_id" : "2",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
     
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

Example 2: bulk operations that name the index in each action

POST /_bulk
{"delete":{"_index":"website","_type":"blog","_id":"123"}}
{"create":{"_index":"website","_type":"blog","_id":"123"}}
{"title":"my first blog post"}
{"index":{"_index":"website","_type":"blog"}}
{"title":"my second blog post"}
{"update":{"_index":"website","_type":"blog","_id":"123"}}
{"doc":{"title":"my updated blog post"}}

Result:

#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
{
     
  "took" : 608,
  "errors" : false,
  "items" : [
    {
     
      "delete" : {
     
        "_index" : "website",
        "_type" : "blog",
        "_id" : "123",
        "_version" : 1,
        "result" : "not_found",
        "_shards" : {
     
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 404
      }
    },
    {
     
      "create" : {
     
        "_index" : "website",
        "_type" : "blog",
        "_id" : "123",
        "_version" : 2,
        "result" : "created",
        "_shards" : {
     
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
     
      "index" : {
     
        "_index" : "website",
        "_type" : "blog",
        "_id" : "MCOs0HEBHYK_MJXUyYIz",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
     
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
     
      "update" : {
     
        "_index" : "website",
        "_type" : "blog",
        "_id" : "123",
        "_version" : 3,
        "result" : "updated",
        "_shards" : {
     
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 3,
        "_primary_term" : 1,
        "status" : 200
      }
    }
  ]
}

7) Sample test data

A sample set of fictitious JSON documents with customer bank account information is provided. Each document has the following schema.

{
     
	"account_number": 1,
	"balance": 39225,
	"firstname": "Amber",
	"lastname": "Duke",
	"age": 32,
	"gender": "M",
	"address": "880 Holmes Lane",
	"employer": "Pyrami",
	"email": "[email protected]",
	"city": "Brogan",
	"state": "IL"
}

Import the test data from https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json with:

POST bank/account/_bulk

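3. Search

1) Search API

Elasticsearch supports two basic ways to search:

Send the search parameters in the REST request URI (URI + search parameters);
Send them in the REST request body (URI + request body).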
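The two styles carry the same information in different places. The Python sketch below only constructs the two requests and sends nothing; the function names and the BASE_URL constant are assumptions made for this example.

```python
import json
from urllib.parse import urlencode

# Assumption: the host used throughout these notes.
BASE_URL = "http://192.168.137.14:9200"


def uri_search_url(index, q="*", sort_field=None, order="asc"):
    """Style 1: everything is encoded in the query string."""
    params = {"q": q}
    if sort_field is not None:
        params["sort"] = f"{sort_field}:{order}"
    # Keep '*' and ':' readable instead of percent-encoding them.
    return f"{BASE_URL}/{index}/_search?{urlencode(params, safe='*:')}"


def body_search(index, sort_field, order="asc"):
    """Style 2: the URL names the index, the JSON body carries the query."""
    url = f"{BASE_URL}/{index}/_search"
    body = {"query": {"match_all": {}}, "sort": [{sort_field: order}]}
    return url, json.dumps(body)


url1 = uri_search_url("bank", sort_field="account_number")
url2, payload = body_search("bank", "account_number")
print(url1)
print(url2, payload)
```

The URI style is handy for quick checks in a browser, while the body style scales to the nested queries used in the rest of this section.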

Retrieving information

Search with URI + request body

GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" },
    { "balance": "desc" }
  ]
}

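Some HTTP client tools cannot send a request body with a GET request; in that case the search can be expressed with URI parameters instead: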

GET bank/_search?q=*&sort=account_number:asc

Result:

{
     
  "took" : 235,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "0",
        "_score" : null,
        "_source" : {
     
          "account_number" : 0,
          "balance" : 16623,
          "firstname" : "Bradshaw",
          "lastname" : "Mckenzie",
          "age" : 29,
          "gender" : "F",
          "address" : "244 Columbus Place",
          "employer" : "Euron",
          "email" : "[email protected]",
          "city" : "Hobucken",
          "state" : "CO"
        },
        "sort" : [
          0
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "1",
        "_score" : null,
        "_source" : {
     
          "account_number" : 1,
          "balance" : 39225,
          "firstname" : "Amber",
          "lastname" : "Duke",
          "age" : 32,
          "gender" : "M",
          "address" : "880 Holmes Lane",
          "employer" : "Pyrami",
          "email" : "[email protected]",
          "city" : "Brogan",
          "state" : "IL"
        },
        "sort" : [
          1
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "2",
        "_score" : null,
        "_source" : {
     
          "account_number" : 2,
          "balance" : 28838,
          "firstname" : "Roberta",
          "lastname" : "Bender",
          "age" : 22,
          "gender" : "F",
          "address" : "560 Kingsway Place",
          "employer" : "Chillium",
          "email" : "[email protected]",
          "city" : "Bennett",
          "state" : "LA"
        },
        "sort" : [
          2
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "3",
        "_score" : null,
        "_source" : {
     
          "account_number" : 3,
          "balance" : 44947,
          "firstname" : "Levine",
          "lastname" : "Burks",
          "age" : 26,
          "gender" : "F",
          "address" : "328 Wilson Avenue",
          "employer" : "Amtap",
          "email" : "[email protected]",
          "city" : "Cochranville",
          "state" : "HI"
        },
        "sort" : [
          3
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "4",
        "_score" : null,
        "_source" : {
     
          "account_number" : 4,
          "balance" : 27658,
          "firstname" : "Rodriquez",
          "lastname" : "Flores",
          "age" : 31,
          "gender" : "F",
          "address" : "986 Wyckoff Avenue",
          "employer" : "Tourmania",
          "email" : "[email protected]",
          "city" : "Eastvale",
          "state" : "HI"
        },
        "sort" : [
          4
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "5",
        "_score" : null,
        "_source" : {
     
          "account_number" : 5,
          "balance" : 29342,
          "firstname" : "Leola",
          "lastname" : "Stewart",
          "age" : 30,
          "gender" : "F",
          "address" : "311 Elm Place",
          "employer" : "Diginetic",
          "email" : "[email protected]",
          "city" : "Fairview",
          "state" : "NJ"
        },
        "sort" : [
          5
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "6",
        "_score" : null,
        "_source" : {
     
          "account_number" : 6,
          "balance" : 5686,
          "firstname" : "Hattie",
          "lastname" : "Bond",
          "age" : 36,
          "gender" : "M",
          "address" : "671 Bristol Street",
          "employer" : "Netagy",
          "email" : "[email protected]",
          "city" : "Dante",
          "state" : "TN"
        },
        "sort" : [
          6
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "7",
        "_score" : null,
        "_source" : {
     
          "account_number" : 7,
          "balance" : 39121,
          "firstname" : "Levy",
          "lastname" : "Richard",
          "age" : 22,
          "gender" : "M",
          "address" : "820 Logan Street",
          "employer" : "Teraprene",
          "email" : "[email protected]",
          "city" : "Shrewsbury",
          "state" : "MO"
        },
        "sort" : [
          7
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "8",
        "_score" : null,
        "_source" : {
     
          "account_number" : 8,
          "balance" : 48868,
          "firstname" : "Jan",
          "lastname" : "Burns",
          "age" : 35,
          "gender" : "M",
          "address" : "699 Visitation Place",
          "employer" : "Glasstep",
          "email" : "[email protected]",
          "city" : "Wakulla",
          "state" : "AZ"
        },
        "sort" : [
          8
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "9",
        "_score" : null,
        "_source" : {
     
          "account_number" : 9,
          "balance" : 24776,
          "firstname" : "Opal",
          "lastname" : "Meadows",
          "age" : 39,
          "gender" : "M",
          "address" : "963 Neptune Avenue",
          "employer" : "Cedward",
          "email" : "[email protected]",
          "city" : "Olney",
          "state" : "OH"
        },
        "sort" : [
          9
        ]
      }
    ]
  }
}

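(1) Only 10 documents are returned although hits.total.value is 1000, because the results are paginated and the default page size is 10;

(2) For details on the fields, see: https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-search.html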

The response also provides the following information about the search request:

took – how long it took Elasticsearch to run the query, in milliseconds

timed_out – whether or not the search request timed out

_shards – how many shards were searched and a breakdown of how many shards succeeded, failed, or were skipped.

max_score – the score of the most relevant document found

hits.total.value - how many matching documents were found

hits.sort - the document’s sort position (when not sorting by relevance score)

hits._score - the document’s relevance score (not applicable when using match_all)

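2) Query DSL

(1) Basic syntax

Elasticsearch provides a JSON-style DSL for running queries. It is called the Query DSL, and it is a very comprehensive query language.

Typical structure of a query clause:

{
  QUERY_NAME: {
    ARGUMENT: VALUE,
    ARGUMENT: VALUE,...
  }
}

When the query targets a specific field, the structure is:

{
  QUERY_NAME: {
    FIELD_NAME: {
      ARGUMENT: VALUE,
      ARGUMENT: VALUE,...
    }
  }
}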

GET bank/_search
{
     
  "query": {
     
    "match_all": {
     }
  },
  "from": 0,
  "size": 5,
  "sort": [
    {
     
      "account_number": {
     
        "order": "desc"
      }
    }
  ]
}

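query defines how to search;

match_all is the query type (it matches all documents); in Elasticsearch many query types can be combined inside query to build complex searches;
Besides query, other parameters such as sort and size can be passed to change the result;
from + size together implement pagination;
sort orders the results; with several sort fields, a later field only orders documents whose earlier fields are equal, otherwise the earlier field decides.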
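As a small illustration of how from and size map to pages, the Python sketch below turns a 1-based page number into a request body. The helper name paged_search_body and the 1-based page convention are assumptions made for this example.

```python
import json


def paged_search_body(page, size, sort_field, order="asc"):
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # number of hits to skip
        "size": size,               # number of hits to return
        "sort": [{sort_field: {"order": order}}],
    }


first_page = paged_search_body(1, 5, "account_number", "desc")
third_page = paged_search_body(3, 5, "account_number", "desc")
print(json.dumps(first_page))
print(third_page["from"])
```

The first page reproduces the request shown above (from 0, size 5); each later page only changes from.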

(2) Return selected fields

GET bank/_search
{
     
  "query": {
     
    "match_all": {
     }
  },
  "from": 0,
  "size": 5,
  "sort": [
    {
     
      "account_number": {
     
        "order": "desc"
      }
    }
  ],
  "_source": ["balance","firstname"]
  
}

Result:

{
     
  "took" : 18,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "999",
        "_score" : null,
        "_source" : {
     
          "firstname" : "Dorothy",
          "balance" : 6087
        },
        "sort" : [
          999
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "998",
        "_score" : null,
        "_source" : {
     
          "firstname" : "Letha",
          "balance" : 16869
        },
        "sort" : [
          998
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "997",
        "_score" : null,
        "_source" : {
     
          "firstname" : "Combs",
          "balance" : 25311
        },
        "sort" : [
          997
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "996",
        "_score" : null,
        "_source" : {
     
          "firstname" : "Andrews",
          "balance" : 17541
        },
        "sort" : [
          996
        ]
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "995",
        "_score" : null,
        "_source" : {
     
          "firstname" : "Phelps",
          "balance" : 21153
        },
        "sort" : [
          995
        ]
      }
    ]
  }
}

(3) match query

Basic (non-string) types: exact match

GET bank/_search
{
     
  "query": {
     
    "match": {
     
      "account_number": "20"
    }
  }
}

match returns the document whose account_number is 20.

Result:

{
     
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "20",
        "_score" : 1.0,
        "_source" : {
     
          "account_number" : 20,
          "balance" : 16418,
          "firstname" : "Elinor",
          "lastname" : "Ratliff",
          "age" : 36,
          "gender" : "M",
          "address" : "282 Kings Place",
          "employer" : "Scentric",
          "email" : "[email protected]",
          "city" : "Ribera",
          "state" : "WA"
        }
      }
    ]
  }
}

Strings: full-text search

GET bank/_search
{
     
  "query": {
     
    "match": {
     
      "address": "kings"
    }
  }
}

In a full-text search the query string is analyzed (tokenized) before matching, and the hits are sorted by relevance score.

Result:

{
     
  "took" : 30,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 5.990829,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "20",
        "_score" : 5.990829,
        "_source" : {
     
          "account_number" : 20,
          "balance" : 16418,
          "firstname" : "Elinor",
          "lastname" : "Ratliff",
          "age" : 36,
          "gender" : "M",
          "address" : "282 Kings Place",
          "employer" : "Scentric",
          "email" : "[email protected]",
          "city" : "Ribera",
          "state" : "WA"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "722",
        "_score" : 5.990829,
        "_source" : {
     
          "account_number" : 722,
          "balance" : 27256,
          "firstname" : "Roberts",
          "lastname" : "Beasley",
          "age" : 34,
          "gender" : "F",
          "address" : "305 Kings Hwy",
          "employer" : "Quintity",
          "email" : "[email protected]",
          "city" : "Hayden",
          "state" : "PA"
        }
      }
    ]
  }
}

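(4) match_phrase [phrase matching]

Searches for the value as one whole phrase: the terms must appear together and in the same order, instead of being matched independently.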

GET bank/_search
{
     
  "query": {
     
    "match_phrase": {
     
      "address": "mill road"
    }
  }
}

Finds all documents whose address contains the phrase "mill road" and returns their relevance scores.

Result:

{
     
  "took" : 32,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 8.926605,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "970",
        "_score" : 8.926605,
        "_source" : {
     
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "[email protected]",
          "city" : "Lopezo",
          "state" : "AK"
        }
      }
    ]
  }
}

The difference between match_phrase and match is shown by the following examples:

GET bank/_search
{
     
  "query": {
     
    "match_phrase": {
     
      "address": "990 Mill"
    }
  }
}

Result:

{
     
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 10.806405,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "970",
        "_score" : 10.806405,
        "_source" : {
     
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "[email protected]",
          "city" : "Lopezo",
          "state" : "AK"
        }
      }
    ]
  }
}

Using match on the keyword sub-field:

GET bank/_search
{
     
  "query": {
     
    "match": {
     
      "address.keyword": "990 Mill"
    }
  }
}

Result: nothing matches

{
     
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Change the query value to "990 Mill Road":

GET bank/_search
{
     
  "query": {
     
    "match": {
     
      "address.keyword": "990 Mill Road"
    }
  }
}

One document is found:

{
     
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 6.5032897,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "970",
        "_score" : 6.5032897,
        "_source" : {
     
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "[email protected]",
          "city" : "Lopezo",
          "state" : "AK"
        }
      }
    ]
  }
}

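When a text field is matched through its keyword sub-field, the query value must equal the entire field value; it is an exact match.

match_phrase performs phrase matching: a document matches as long as its text contains the phrase.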
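The behaviour observed in the three queries above can be imitated with a deliberately simplified Python model. It is not how Elasticsearch is implemented: the analyze function below only lowercases and splits on whitespace as a stand-in for a real analyzer, scoring is ignored, and all function names are hypothetical.

```python
def analyze(text):
    """Very rough stand-in for an analyzer: lowercase, split on spaces."""
    return text.lower().split()


def match(field_value, query):
    """True if any analyzed query term occurs in the field."""
    terms = set(analyze(field_value))
    return any(term in terms for term in analyze(query))


def match_phrase(field_value, query):
    """True if the query terms occur adjacent and in the same order."""
    doc, phrase = analyze(field_value), analyze(query)
    n = len(phrase)
    return n > 0 and any(
        doc[i:i + n] == phrase for i in range(len(doc) - n + 1)
    )


def keyword_match(field_value, query):
    """keyword sub-field: the whole value must be equal, no analysis."""
    return field_value == query


address = "990 Mill Road"
print(match(address, "road mill"))              # terms in any order
print(match_phrase(address, "990 Mill"))        # adjacent, in order
print(match_phrase(address, "road mill"))       # wrong order
print(keyword_match(address, "990 Mill"))       # partial value
print(keyword_match(address, "990 Mill Road"))  # full value
```

The model makes the distinction concrete: match is term-based, match_phrase adds an ordering constraint on the terms, and the keyword sub-field skips analysis entirely.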

(5) multi_match [multi-field matching]

GET bank/_search
{
     
  "query": {
     
    "multi_match": {
     
      "query": "mill",
      "fields": [
        "state",
        "address"
      ]
    }
  }
}

Matches documents whose state or address contains mill; the query string is analyzed (tokenized) during the search.

Result:

{
     
  "took" : 28,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 5.4032025,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "970",
        "_score" : 5.4032025,
        "_source" : {
     
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "[email protected]",
          "city" : "Lopezo",
          "state" : "AK"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "136",
        "_score" : 5.4032025,
        "_source" : {
     
          "account_number" : 136,
          "balance" : 45801,
          "firstname" : "Winnie",
          "lastname" : "Holland",
          "age" : 38,
          "gender" : "M",
          "address" : "198 Mill Lane",
          "employer" : "Neteria",
          "email" : "[email protected]",
          "city" : "Urie",
          "state" : "IL"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "345",
        "_score" : 5.4032025,
        "_source" : {
     
          "account_number" : 345,
          "balance" : 9812,
          "firstname" : "Parker",
          "lastname" : "Hines",
          "age" : 38,
          "gender" : "M",
          "address" : "715 Mill Avenue",
          "employer" : "Baluba",
          "email" : "[email protected]",
          "city" : "Blackgum",
          "state" : "KY"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "472",
        "_score" : 5.4032025,
        "_source" : {
     
          "account_number" : 472,
          "balance" : 25571,
          "firstname" : "Lee",
          "lastname" : "Long",
          "age" : 32,
          "gender" : "F",
          "address" : "288 Mill Street",
          "employer" : "Comverges",
          "email" : "[email protected]",
          "city" : "Movico",
          "state" : "MT"
        }
      }
    ]
  }
}

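(6) bool for compound queries

A compound clause can combine any other query clauses, including other compound clauses. This means compound clauses can be nested in one another to express very complex logic.

must: every condition listed under must has to be satisfied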

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "gender": "M" } }
      ]
    }
  }
}

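must_not: none of the conditions listed under must_not may match.

should: the conditions listed under should ought to be satisfied.

Example: find documents with gender=M whose address contains mill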
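Since a bool query is just nested JSON, it can be assembled programmatically. The Python sketch below builds the request body from lists of (field, value) pairs; the helper name bool_query is an assumption, and it covers only match clauses.

```python
import json


def bool_query(must=(), must_not=(), should=()):
    """Build a bool query body from (field, value) pairs per clause type."""
    clauses = {}
    for name, pairs in (("must", must), ("must_not", must_not),
                        ("should", should)):
        if pairs:  # leave out clause types that were not requested
            clauses[name] = [{"match": {field: value}}
                             for field, value in pairs]
    return {"query": {"bool": clauses}}


body = bool_query(
    must=[("gender", "M"), ("address", "mill")],
    must_not=[("age", "18")],
    should=[("lastname", "Wallace")],
)
print(json.dumps(body, indent=2))
```

Serialized with json.dumps, the result can be sent as the body of GET bank/_search.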

GET bank/_search
{
     
  "query": {
     
    "bool": {
     
      "must": [
        {
     
          "match": {
     
            "gender": "M"
          }
        },
        {
     
          "match": {
     
            "address": "mill"
          }
        }
      ]
    }
  }
}

Result:

{
     
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 6.0824604,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "970",
        "_score" : 6.0824604,
        "_source" : {
     
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "[email protected]",
          "city" : "Lopezo",
          "state" : "AK"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "136",
        "_score" : 6.0824604,
        "_source" : {
     
          "account_number" : 136,
          "balance" : 45801,
          "firstname" : "Winnie",
          "lastname" : "Holland",
          "age" : 38,
          "gender" : "M",
          "address" : "198 Mill Lane",
          "employer" : "Neteria",
          "email" : "[email protected]",
          "city" : "Urie",
          "state" : "IL"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "345",
        "_score" : 6.0824604,
        "_source" : {
     
          "account_number" : 345,
          "balance" : 9812,
          "firstname" : "Parker",
          "lastname" : "Hines",
          "age" : 38,
          "gender" : "M",
          "address" : "715 Mill Avenue",
          "employer" : "Baluba",
          "email" : "[email protected]",
          "city" : "Blackgum",
          "state" : "KY"
        }
      }
    ]
  }
}

must_not: the listed conditions must not hold

Example: find documents with gender=M whose address contains mill, but whose age is not 38


GET bank/_search
{
     
  "query": {
     
    "bool": {
     
      "must": [
        {
     
          "match": {
     
            "gender": "M"
          }
        },
        {
     
          "match": {
     
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
     
          "match": {
     
            "age": "38"
          }
        }
      ]
    }
  }
}

Result:

{
     
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 6.0824604,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "970",
        "_score" : 6.0824604,
        "_source" : {
     
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "[email protected]",
          "city" : "Lopezo",
          "state" : "AK"
        }
      }
    ]
  }
}

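should: the document should match the listed conditions. A match raises the document's relevance score but does not change which documents are returned. If the query contains only should with a single matching rule, however, the should condition is used as the default match condition and does change the query result.

Example: match documents whose lastname should equal Wallace.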

GET bank/_search
{
     
  "query": {
     
    "bool": {
     
      "must": [
        {
     
          "match": {
     
            "gender": "M"
          }
        },
        {
     
          "match": {
     
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
     
          "match": {
     
            "age": "18"
          }
        }
      ],
      "should": [
        {
     
          "match": {
     
            "lastname": "Wallace"
          }
        }
      ]
    }
  }
}

Query result:

{
     
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 12.585751,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "970",
        "_score" : 12.585751,
        "_source" : {
     
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "[email protected]",
          "city" : "Lopezo",
          "state" : "AK"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "136",
        "_score" : 6.0824604,
        "_source" : {
     
          "account_number" : 136,
          "balance" : 45801,
          "firstname" : "Winnie",
          "lastname" : "Holland",
          "age" : 38,
          "gender" : "M",
          "address" : "198 Mill Lane",
          "employer" : "Neteria",
          "email" : "[email protected]",
          "city" : "Urie",
          "state" : "IL"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "345",
        "_score" : 6.0824604,
        "_source" : {
     
          "account_number" : 345,
          "balance" : 9812,
          "firstname" : "Parker",
          "lastname" : "Hines",
          "age" : 38,
          "gender" : "M",
          "address" : "715 Mill Avenue",
          "employer" : "Baluba",
          "email" : "[email protected]",
          "city" : "Blackgum",
          "state" : "KY"
        }
      }
    ]
  }
}

As the results show, the more relevant a document is, the higher its score.
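The interplay of must, must_not, and should can be sketched in plain Python without a cluster. This is a simplified model, not how Elasticsearch actually scores: the `matches` helper is a hypothetical case-insensitive word check, and each matching must or should clause adds a flat 1.0 to a made-up score. The should-only special case is not modeled. The three documents reuse values from the results above.

```python
# Simplified model of a bool query: must / must_not decide membership,
# should never removes a document here; it only raises the score.
# The flat 1.0-per-clause score is an illustrative assumption, not BM25.

docs = [
    {"lastname": "Wallace", "age": 28, "gender": "M", "address": "990 Mill Road"},
    {"lastname": "Holland", "age": 38, "gender": "M", "address": "198 Mill Lane"},
    {"lastname": "Hines", "age": 38, "gender": "M", "address": "715 Mill Avenue"},
]

def matches(doc, field, value):
    """Case-insensitive word match, a stand-in for the match query."""
    words = str(doc[field]).lower().split()
    return str(value).lower() in words

def bool_search(docs, must=(), must_not=(), should=()):
    hits = []
    for doc in docs:
        if not all(matches(doc, f, v) for f, v in must):
            continue  # every must clause has to match
        if any(matches(doc, f, v) for f, v in must_not):
            continue  # any must_not match excludes the document
        score = float(len(must))
        score += sum(1.0 for f, v in should if matches(doc, f, v))
        hits.append((score, doc["lastname"]))
    # sorted() is stable, so documents with equal scores keep input order
    return sorted(hits, key=lambda h: -h[0])

result = bool_search(
    docs,
    must=[("gender", "M"), ("address", "mill")],
    must_not=[("age", 18)],
    should=[("lastname", "Wallace")],
)
print(result)
```

All three documents are returned, and only the one that also satisfies the should clause gets a higher score, which mirrors the shape of the result above.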

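(7) Filter [result filtering]

Not every query needs to produce a score, especially clauses that are used only to filter documents. Elasticsearch detects these situations automatically and optimizes query execution so that no score is computed for them.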

GET bank/_search
{
     
  "query": {
     
    "bool": {
     
      "must": [
        {
     
          "match": {
     
            "address": "mill"
          }
        }
      ],
      "filter": {
     
        "range": {
     
          "balance": {
     
            "gte": "10000",
            "lte": "20000"
          }
        }
      }
    }
  }
}

This first finds all documents matching address=mill, then filters the results by 10000<=balance<=20000.

Query result:

{
     
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 5.4032025,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "970",
        "_score" : 5.4032025,
        "_source" : {
     
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "[email protected]",
          "city" : "Lopezo",
          "state" : "AK"
        }
      }
    ]
  }
}

Each must, should, and must_not element in a Boolean query is referred to as a query clause. How well a document meets the criteria in each must or should clause contributes to the document’s relevance score. The higher the score, the better the document matches your search criteria. By default, Elasticsearch returns documents ranked by these relevance scores.

The criteria in a must_not clause is treated as a filter. It affects whether or not the document is included in the results, but does not contribute to how documents are scored. You can also explicitly specify arbitrary filters to include or exclude documents based on structured data.

With only a filter clause and no must clause, no relevance score is computed:

GET bank/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "balance": {
            "gte": "10000",
            "lte": "20000"
          }
        }
      }
    }
  }
}

Query result:

{
     
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 213,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "20",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 20,
          "balance" : 16418,
          "firstname" : "Elinor",
          "lastname" : "Ratliff",
          "age" : 36,
          "gender" : "M",
          "address" : "282 Kings Place",
          "employer" : "Scentric",
          "email" : "[email protected]",
          "city" : "Ribera",
          "state" : "WA"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "37",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 37,
          "balance" : 18612,
          "firstname" : "Mcgee",
          "lastname" : "Mooney",
          "age" : 39,
          "gender" : "M",
          "address" : "826 Fillmore Place",
          "employer" : "Reversus",
          "email" : "[email protected]",
          "city" : "Tooleville",
          "state" : "OK"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "51",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 51,
          "balance" : 14097,
          "firstname" : "Burton",
          "lastname" : "Meyers",
          "age" : 31,
          "gender" : "F",
          "address" : "334 River Street",
          "employer" : "Bezal",
          "email" : "[email protected]",
          "city" : "Jacksonburg",
          "state" : "MO"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "56",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 56,
          "balance" : 14992,
          "firstname" : "Josie",
          "lastname" : "Nelson",
          "age" : 32,
          "gender" : "M",
          "address" : "857 Tabor Court",
          "employer" : "Emtrac",
          "email" : "[email protected]",
          "city" : "Sunnyside",
          "state" : "UT"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "121",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 121,
          "balance" : 19594,
          "firstname" : "Acevedo",
          "lastname" : "Dorsey",
          "age" : 32,
          "gender" : "M",
          "address" : "479 Nova Court",
          "employer" : "Netropic",
          "email" : "[email protected]",
          "city" : "Islandia",
          "state" : "CT"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "176",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 176,
          "balance" : 18607,
          "firstname" : "Kemp",
          "lastname" : "Walters",
          "age" : 28,
          "gender" : "F",
          "address" : "906 Howard Avenue",
          "employer" : "Eyewax",
          "email" : "[email protected]",
          "city" : "Why",
          "state" : "KY"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "183",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 183,
          "balance" : 14223,
          "firstname" : "Hudson",
          "lastname" : "English",
          "age" : 26,
          "gender" : "F",
          "address" : "823 Herkimer Place",
          "employer" : "Xinware",
          "email" : "[email protected]",
          "city" : "Robbins",
          "state" : "ND"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "222",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 222,
          "balance" : 14764,
          "firstname" : "Rachelle",
          "lastname" : "Rice",
          "age" : 36,
          "gender" : "M",
          "address" : "333 Narrows Avenue",
          "employer" : "Enaut",
          "email" : "[email protected]",
          "city" : "Wright",
          "state" : "AZ"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "227",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 227,
          "balance" : 19780,
          "firstname" : "Coleman",
          "lastname" : "Berg",
          "age" : 22,
          "gender" : "M",
          "address" : "776 Little Street",
          "employer" : "Exoteric",
          "email" : "[email protected]",
          "city" : "Eagleville",
          "state" : "WV"
        }
      },
      {
     
        "_index" : "bank",
        "_type" : "account",
        "_id" : "272",
        "_score" : 0.0,
        "_source" : {
     
          "account_number" : 272,
          "balance" : 19253,
          "firstname" : "Lilly",
          "lastname" : "Morgan",
          "age" : 25,
          "gender" : "F",
          "address" : "689 Fleet Street",
          "employer" : "Biolive",
          "email" : "[email protected]",
          "city" : "Sunbury",
          "state" : "OH"
        }
      }
    ]
  }
}

Every document has "_score" : 0.0.
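The difference between the scored query and the filter-only query is just which clauses appear inside bool. As a minimal sketch, the two request bodies can be assembled in Python and serialized to JSON; the helper name `bool_query` is hypothetical, and nothing here talks to a cluster.

```python
import json

def bool_query(must=None, filters=None):
    """Build a bool query body; empty clause lists are left out."""
    bool_body = {}
    if must:
        bool_body["must"] = must
    if filters:
        bool_body["filter"] = filters
    return {"query": {"bool": bool_body}}

balance_range = {"range": {"balance": {"gte": 10000, "lte": 20000}}}

# Scored query: the match clause contributes to _score, the range only filters.
scored = bool_query(must=[{"match": {"address": "mill"}}], filters=[balance_range])

# Filter-only query: there is no scoring clause at all.
filter_only = bool_query(filters=[balance_range])

print(json.dumps(filter_only, indent=2))
```

The printed body can be pasted after `GET bank/_search` in the Kibana console.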

(8) term

Like match, term matches the value of a field. Use match for full-text (text) fields and term for other, non-text fields.

Avoid using the term query for text fields.

By default, Elasticsearch changes the values of text fields as part of analysis. This can make finding exact matches for text field values difficult.

To search text field values, use the match query instead.

https://www.elastic.co/guide/en/elasticsearch/reference/7.6/query-dsl-term-query.html

Query using term:

GET bank/_search
{
     
  "query": {
     
    "term": {
     
      "address": "mill Road"
    }
  }
}

Query result:

{
     
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

No documents match.

Switching to match, however, returns 32 documents.

In other words, use match for full-text fields and term for other, non-text fields.
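Why term finds nothing on a text field can be sketched with a toy inverted index. The `standard_analyze` function below is only a rough stand-in for the standard analyzer (split on non-word characters, lowercase), and `term_query` / `match_query` are hypothetical helpers, not Elasticsearch APIs. The addresses are taken from the sample results above.

```python
import re

def standard_analyze(text):
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [t.lower() for t in re.findall(r"\w+", text)]

def build_index(docs, field):
    """Inverted index: token -> set of document ids."""
    index = {}
    for doc_id, doc in docs.items():
        for token in standard_analyze(doc[field]):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_query(index, value):
    # term looks up the value exactly as given, with no analysis
    return index.get(value, set())

def match_query(index, text):
    # match analyzes the text first, then ORs the resulting tokens
    ids = set()
    for token in standard_analyze(text):
        ids |= index.get(token, set())
    return ids

docs = {
    970: {"address": "990 Mill Road"},
    136: {"address": "198 Mill Lane"},
    20: {"address": "282 Kings Place"},
}
index = build_index(docs, "address")

print(term_query(index, "mill Road"))   # no indexed token equals "mill Road"
print(match_query(index, "mill Road"))  # tokens "mill" and "road" are looked up
```

The index only ever contains single lowercase tokens such as "mill" and "road", so an unanalyzed lookup of "mill Road" cannot succeed.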

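(9) Aggregation (running aggregations)

Aggregations provide the ability to group data and extract statistics from it. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, a search can return hits and, at the same time, aggregation results that are kept separate from the hits in the response. This is powerful and efficient: you can run a query and several aggregations and get all of their results back in a single request, using a concise, simplified API that avoids extra network round trips.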

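"size": 0 means the search hits are not returned.
aggs runs the aggregations. The aggregation syntax is:

"aggs": {
  "<aggs_name: a name for this aggregation, shown in the result set>": {
    "<AGG_TYPE: the aggregation type (avg, terms, ...)>": {}
  }
}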

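Find the age distribution, average age, and average balance of everyone whose address contains mill, without returning the individual documents: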

GET bank/_search
{
     
  "query": {
     
    "match": {
     
      "address": "Mill"
    }
  },
  "aggs": {
     
    "ageAgg": {
     
      "terms": {
     
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
     
      "avg": {
     
        "field": "age"
      }
    },
    "balanceAvg": {
     
      "avg": {
     
        "field": "balance"
      }
    }
  },
  "size": 0
}

Query result:

{
     
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
     
    "ageAgg" : {
     
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
     
          "key" : 38,
          "doc_count" : 2
        },
        {
     
          "key" : 28,
          "doc_count" : 1
        },
        {
     
          "key" : 32,
          "doc_count" : 1
        }
      ]
    },
    "ageAvg" : {
     
      "value" : 34.0
    },
    "balanceAvg" : {
     
      "value" : 25208.0
    }
  }
}
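The ageAgg and ageAvg numbers above can be reproduced by hand. The sketch below is plain Python, not an Elasticsearch client; `terms_agg` and `avg_agg` are hypothetical helpers that imitate what the terms and avg aggregations compute. The ages come from the ageAgg buckets in the result (38 twice, 28 once, 32 once).

```python
from collections import Counter

# Ages of the four documents matched by the query above.
ages = [38, 38, 28, 32]

def terms_agg(values, size=10):
    """Count occurrences per distinct value, most frequent first."""
    return [{"key": k, "doc_count": c} for k, c in Counter(values).most_common(size)]

def avg_agg(values):
    """Arithmetic mean of the values."""
    return {"value": sum(values) / len(values)}

age_agg = terms_agg(ages)
age_avg = avg_agg(ages)
print(age_agg)
print(age_avg)
```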

A more complex case:
Aggregate by age and compute the average balance of the people in each age bucket.

GET bank/_search
{
     
  "query": {
     
    "match_all": {
     }
  },
  "aggs": {
     
    "ageAgg": {
     
      "terms": {
     
        "field": "age",
        "size": 100
      },
      "aggs": {
     
        "ageAvg": {
     
          "avg": {
     
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}

Output:

{
     
  "took" : 49,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
     
    "ageAgg" : {
     
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
     
          "key" : 31,
          "doc_count" : 61,
          "ageAvg" : {
     
            "value" : 28312.918032786885
          }
        },
        {
     
          "key" : 39,
          "doc_count" : 60,
          "ageAvg" : {
     
            "value" : 25269.583333333332
          }
        },
        {
     
          "key" : 26,
          "doc_count" : 59,
          "ageAvg" : {
     
            "value" : 23194.813559322032
          }
        },
        {
     
          "key" : 32,
          "doc_count" : 52,
          "ageAvg" : {
     
            "value" : 23951.346153846152
          }
        },
        {
     
          "key" : 35,
          "doc_count" : 52,
          "ageAvg" : {
     
            "value" : 22136.69230769231
          }
        },
        {
     
          "key" : 36,
          "doc_count" : 52,
          "ageAvg" : {
     
            "value" : 22174.71153846154
          }
        },
        {
     
          "key" : 22,
          "doc_count" : 51,
          "ageAvg" : {
     
            "value" : 24731.07843137255
          }
        },
        {
     
          "key" : 28,
          "doc_count" : 51,
          "ageAvg" : {
     
            "value" : 28273.882352941175
          }
        },
        {
     
          "key" : 33,
          "doc_count" : 50,
          "ageAvg" : {
     
            "value" : 25093.94
          }
        },
        {
     
          "key" : 34,
          "doc_count" : 49,
          "ageAvg" : {
     
            "value" : 26809.95918367347
          }
        },
        {
     
          "key" : 30,
          "doc_count" : 47,
          "ageAvg" : {
     
            "value" : 22841.106382978724
          }
        },
        {
     
          "key" : 21,
          "doc_count" : 46,
          "ageAvg" : {
     
            "value" : 26981.434782608696
          }
        },
        {
     
          "key" : 40,
          "doc_count" : 45,
          "ageAvg" : {
     
            "value" : 27183.17777777778
          }
        },
        {
     
          "key" : 20,
          "doc_count" : 44,
          "ageAvg" : {
     
            "value" : 27741.227272727272
          }
        },
        {
     
          "key" : 23,
          "doc_count" : 42,
          "ageAvg" : {
     
            "value" : 27314.214285714286
          }
        },
        {
     
          "key" : 24,
          "doc_count" : 42,
          "ageAvg" : {
     
            "value" : 28519.04761904762
          }
        },
        {
     
          "key" : 25,
          "doc_count" : 42,
          "ageAvg" : {
     
            "value" : 27445.214285714286
          }
        },
        {
     
          "key" : 37,
          "doc_count" : 42,
          "ageAvg" : {
     
            "value" : 27022.261904761905
          }
        },
        {
     
          "key" : 27,
          "doc_count" : 39,
          "ageAvg" : {
     
            "value" : 21471.871794871793
          }
        },
        {
     
          "key" : 38,
          "doc_count" : 39,
          "ageAvg" : {
     
            "value" : 26187.17948717949
          }
        },
        {
     
          "key" : 29,
          "doc_count" : 35,
          "ageAvg" : {
     
            "value" : 29483.14285714286
          }
        }
      ]
    }
  }
}

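Find the full age distribution and, within each age bucket, the average balance for M, the average balance for F, and the overall average balance of the bucket: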

GET bank/_search
{
     
  "query": {
     
    "match_all": {
     }
  },
  "aggs": {
     
    "ageAgg": {
     
      "terms": {
     
        "field": "age",
        "size": 100
      },
      "aggs": {
     
        "genderAgg": {
     
          "terms": {
     
            "field": "gender.keyword"
          },
          "aggs": {
     
            "balanceAvg": {
     
              "avg": {
     
                "field": "balance"
              }
            }
          }
        },
        "ageBalanceAvg": {
     
          "avg": {
     
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}

Output:

{
     
  "took" : 119,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
     
    "ageAgg" : {
     
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
     
          "key" : 31,
          "doc_count" : 61,
          "genderAgg" : {
     
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
     
                "key" : "M",
                "doc_count" : 35,
                "balanceAvg" : {
     
                  "value" : 29565.628571428573
                }
              },
              {
     
                "key" : "F",
                "doc_count" : 26,
                "balanceAvg" : {
     
                  "value" : 26626.576923076922
                }
              }
            ]
          },
          "ageBalanceAvg" : {
     
            "value" : 28312.918032786885
          }
        }
        ... (remaining buckets omitted)
      ]
    }
  }
}
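The nesting in this request (terms on age, then inside each bucket a terms on gender plus an avg on balance) corresponds to a group-by inside a group-by. A minimal Python sketch of the same shape follows; the four accounts are hypothetical sample data, not taken from the bank index, and the helper names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data, not taken from the bank index.
accounts = [
    {"age": 31, "gender": "M", "balance": 30000},
    {"age": 31, "gender": "M", "balance": 20000},
    {"age": 31, "gender": "F", "balance": 10000},
    {"age": 39, "gender": "F", "balance": 40000},
]

def avg(values):
    return sum(values) / len(values)

def group_by(rows, field):
    """Group rows by the value of one field, keeping first-seen order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row)
    return groups

def age_gender_balance(rows):
    """terms(age) containing terms(gender) with avg(balance), plus avg(balance) per age."""
    buckets = []
    for age, in_age in group_by(rows, "age").items():
        gender_buckets = [
            {"key": g, "doc_count": len(in_g),
             "balanceAvg": avg([r["balance"] for r in in_g])}
            for g, in_g in group_by(in_age, "gender").items()
        ]
        buckets.append({
            "key": age,
            "doc_count": len(in_age),
            "genderAgg": gender_buckets,
            "ageBalanceAvg": avg([r["balance"] for r in in_age]),
        })
    # order buckets by doc_count, largest first
    return sorted(buckets, key=lambda b: -b["doc_count"])

result = age_gender_balance(accounts)
for bucket in result:
    print(bucket)
```

Each sub-aggregation only sees the documents of its parent bucket, which is why genderAgg and ageBalanceAvg sit side by side under the same age key.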

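3) Mapping

(1) Field types

(2) Mapping

Mapping
A mapping defines how a document and the fields it contains are stored and indexed. For example, a mapping can define:

which string fields should be treated as full-text fields;
which fields contain numbers, dates, or geolocations;
whether all fields in the document can be indexed (the _all setting);
the format of date values;
custom rules that control the mapping of dynamically added fields.

View the mapping: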
GET bank/_mapping

{
       
  "bank" : {
       
    "mappings" : {
       
      "properties" : {
       
        "account_number" : {
       
          "type" : "long"
        },
        "address" : {
       
          "type" : "text",
          "fields" : {
       
            "keyword" : {
       
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "age" : {
       
          "type" : "long"
        },
        "balance" : {
       
          "type" : "long"
        },
        "city" : {
       
          "type" : "text",
          "fields" : {
       
            "keyword" : {
       
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "email" : {
       
          "type" : "text",
          "fields" : {
       
            "keyword" : {
       
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "employer" : {
       
          "type" : "text",
          "fields" : {
       
            "keyword" : {
       
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "firstname" : {
       
          "type" : "text",
          "fields" : {
       
            "keyword" : {
       
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "gender" : {
       
          "type" : "text",
          "fields" : {
       
            "keyword" : {
       
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "lastname" : {
       
          "type" : "text",
          "fields" : {
       
            "keyword" : {
       
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "state" : {
       
          "type" : "text",
          "fields" : {
       
            "keyword" : {
       
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

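Modify the mapping

(3) Changes in newer versions

Elasticsearch 7: the type concept is removed

In a relational database two tables are independent, so columns with the same name in different tables do not affect each other. Elasticsearch does not work this way. It is a search engine built on Lucene, and fields with the same name under different types are ultimately handled by Lucene in the same way.
- Two user_name fields under two different types of the same index are actually treated as the same field, so you must define the same field mapping in both types. Otherwise the same-named fields conflict during processing, which lowers Lucene's efficiency.
- Types were removed to make Elasticsearch process data more efficiently.
In Elasticsearch 7.x the type parameter in the URL is optional. For example, indexing a document no longer requires a document type.
Elasticsearch 8.x no longer supports the type parameter in the URL.
Solution:
Migrate indices from multiple types to a single type, with a separate index for each document type.

Migrate all the type data under an existing index to the designated index. See Data migration.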

Elasticsearch 7.x

Specifying types in requests is deprecated. For instance, indexing a document no longer requires a document type. The new index APIs are PUT {index}/_doc/{id} in case of explicit ids and POST {index}/_doc for auto-generated ids. Note that in 7.0, _doc is a permanent part of the path, and represents the endpoint name rather than the document type.

The include_type_name parameter in the index creation, index template, and mapping APIs will default to false. Setting the parameter at all will result in a deprecation warning.

The _default_ mapping type is removed.

Elasticsearch 8.x

Specifying types in requests is no longer supported.

The include_type_name parameter is removed.
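The user_name conflict described above can be illustrated with a toy model. The sketch assumes, as a simplification, that every field name inside one index maps to a single definition regardless of which type declares it; `merge_type_mappings` and the type names `user` and `tweet` are hypothetical and do not correspond to any Elasticsearch API.

```python
def merge_type_mappings(types):
    """Merge per-type field mappings into one per-index field table.

    Simplified model: inside one index a field name has a single
    definition, whichever type declares it.
    """
    fields = {}
    for type_name, props in types.items():
        for field, definition in props.items():
            if field in fields and fields[field] != definition:
                raise ValueError(
                    f"field '{field}' in type '{type_name}' conflicts with an earlier definition"
                )
            fields[field] = definition
    return fields

# Same definition in both types: the merge succeeds.
ok = merge_type_mappings({
    "user": {"user_name": {"type": "keyword"}},
    "tweet": {"user_name": {"type": "keyword"}, "content": {"type": "text"}},
})
print(ok)

# Different definitions for the same field name: the merge is rejected.
try:
    merge_type_mappings({
        "user": {"user_name": {"type": "keyword"}},
        "tweet": {"user_name": {"type": "text"}},
    })
except ValueError as err:
    print("conflict:", err)
```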

Create a mapping

Create an index and specify its mapping:

PUT /my_index
{
     
  "mappings": {
     
    "properties": {
     
      "age": {
     
        "type": "integer"
      },
      "email": {
     
        "type": "keyword"
      },
      "name": {
     
        "type": "text"
      }
    }
  }
}

Output:

{
     
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "my_index"
}

View the mapping

GET /my_index

Output:

{
     
  "my_index" : {
     
    "aliases" : {
      },
    "mappings" : {
     
      "properties" : {
     
        "age" : {
     
          "type" : "integer"
        },
        "email" : {
     
          "type" : "keyword"
        },
        "employee-id" : {
     
          "type" : "keyword",
          "index" : false
        },
        "name" : {
     
          "type" : "text"
        }
      }
    },
    "settings" : {
     
      "index" : {
     
        "creation_date" : "1588410780774",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "ua0lXhtkQCOmn7Kh3iUu0w",
        "version" : {
     
          "created" : "7060299"
        },
        "provided_name" : "my_index"
      }
    }
  }
}

Add a new field mapping

PUT /my_index/_mapping
{
     
  "properties": {
     
    "employee-id": {
     
      "type": "keyword",
      "index": false
    }
  }
}

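Here "index": false means the new field cannot be searched; it is only a redundant stored field.

Update a mapping

The mapping of an existing field cannot be updated. To change it, create a new index with the correct mapping and migrate the data.

Data migration

First create new_twitters with the correct mapping, then migrate the data as follows (the _reindex endpoint name is fixed):

POST _reindex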
{
     
  "source":{
     
      "index":"twitter"
   },
  "dest":{
     
      "index":"new_twitters"
   }
}

Migrate the data under a type of the old index:

POST _reindex
{
  "source":{
      "index":"twitter",
      "type":"twitter"
   },
  "dest":{
     
      "index":"new_twitters"
   }
}

For more details, see: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/docs-reindex.html

GET /bank/_search

{
     
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
     
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
     
    "total" : {
     
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
     
        "_index" : "bank",
        "_type" : "account", // the type is account
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
     
          "account_number" : 1,
          "balance" : 39225,
          "firstname" : "Amber",
          "lastname" : "Duke",
          "age" : 32,
          "gender" : "M",
          "address" : "880 Holmes Lane",
          "employer" : "Pyrami",
          "email" : "[email protected]",
          "city" : "Brogan",
          "state" : "IL"
        }
      },
      ...

Suppose we want to change the type of age to integer:

PUT /newbank
{
     
  "mappings": {
     
    "properties": {
     
      "account_number": {
     
        "type": "long"
      },
      "address": {
     
        "type": "text"
      },
      "age": {
     
        "type": "integer"
      },
      "balance": {
     
        "type": "long"
      },
      "city": {
     
        "type": "keyword"
      },
      "email": {
     
        "type": "keyword"
      },
      "employer": {
     
        "type": "keyword"
      },
      "firstname": {
     
        "type": "text"
      },
      "gender": {
     
        "type": "keyword"
      },
      "lastname": {
     
        "type": "text",
        "fields": {
     
          "keyword": {
     
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "state": {
     
        "type": "keyword"
      }
    }
  }
}

View the mapping of newbank:

GET /newbank/_mapping

The mapping type of age is now integer.

Migrate the data from bank to newbank:

POST _reindex
{
     
  "source": {
     
    "index": "bank",
    "type": "account"
  },
  "dest": {
     
    "index": "newbank"
  }
}

Output:

#! Deprecation: [types removal] Specifying types in reindex requests is deprecated.
{
     
  "took" : 768,
  "timed_out" : false,
  "total" : 1000,
  "updated" : 0,
  "created" : 1000,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
     
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}
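The total, created, and updated counters in this response can be understood with an in-memory stand-in for the copy. The `reindex` function below is a hypothetical sketch that assumes default behavior (a document whose id already exists in the destination is overwritten and counted as an update); it is not the Elasticsearch implementation. The two documents are abbreviated from the bank sample data.

```python
def reindex(indices, source, dest):
    """Copy every document from indices[source] into indices[dest].

    In-memory stand-in for POST _reindex; `indices` maps an index name
    to a dict of {doc_id: document}. The dest index must already exist.
    """
    created = updated = 0
    for doc_id, doc in indices[source].items():
        if doc_id in indices[dest]:
            updated += 1
        else:
            created += 1
        indices[dest][doc_id] = dict(doc)  # copy, so the indices stay independent
    return {"total": created + updated, "created": created, "updated": updated}

indices = {
    "bank": {
        "1": {"account_number": 1, "firstname": "Amber", "age": 32},
        "970": {"account_number": 970, "firstname": "Forbes", "age": 28},
    },
    "newbank": {},
}

first = reindex(indices, "bank", "newbank")
second = reindex(indices, "bank", "newbank")  # same ids again: counted as updates
print(first)
print(second)
```

In the real response above all 1000 documents were new to newbank, which is why created equals total and updated is 0.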

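View the data in newbank

4) Tokenization

A tokenizer receives a stream of characters, splits it into individual tokens (usually individual words), and outputs a stream of tokens.

For example, the whitespace tokenizer splits text whenever it encounters whitespace. It would split the text "Quick brown fox!" into [Quick, brown, fox!].

The tokenizer is also responsible for recording the order or position of each term (used for phrase and word proximity queries) and the start and end character offsets of the original word that the term represents (used for highlighting search results).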
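A whitespace tokenizer that records positions and character offsets fits in a few lines of Python. This is an illustrative sketch, not Elasticsearch code; the function name `whitespace_tokenize` is made up, and the output keys simply mirror the fields of an _analyze response.

```python
import re

def whitespace_tokenize(text):
    """Split on whitespace and record each token's position and character offsets."""
    tokens = []
    for position, m in enumerate(re.finditer(r"\S+", text)):
        tokens.append({
            "token": m.group(),
            "start_offset": m.start(),
            "end_offset": m.end(),
            "position": position,
        })
    return tokens

for t in whitespace_tokenize("Quick brown fox!"):
    print(t)
```

The offsets point back into the original string, so `text[start_offset:end_offset]` recovers the word to highlight, while position records the word order that phrase queries rely on.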

Elasticsearch provides many built-in tokenizers, which can be used to build custom analyzers.

About analyzers: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/analysis.html

POST _analyze
{
     
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}

Result:

{
     
  "tokens" : [
    {
     
      "token" : "the",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
     
      "token" : "2",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "<NUM>",
      "position" : 1
    },
    {
     
      "token" : "quick",
      "start_offset" : 6,
      "end_offset" : 11,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
     
      "token" : "brown",
      "start_offset" : 12,
      "end_offset" : 17,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
     
      "token" : "foxes",
      "start_offset" : 18,
      "end_offset" : 23,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
     
      "token" : "jumped",
      "start_offset" : 24,
      "end_offset" : 30,
      "type" : "<ALPHANUM>",
      "position" : 5
    },
    {
     
      "token" : "over",
      "start_offset" : 31,
      "end_offset" : 35,
      "type" : "<ALPHANUM>",
      "position" : 6
    },
    {
     
      "token" : "the",
      "start_offset" : 36,
      "end_offset" : 39,
      "type" : "<ALPHANUM>",
      "position" : 7
    },
    {
     
      "token" : "lazy",
      "start_offset" : 40,
      "end_offset" : 44,
      "type" : "<ALPHANUM>",
      "position" : 8
    },
    {
     
      "token" : "dog's",
      "start_offset" : 45,
      "end_offset" : 50,
      "type" : "<ALPHANUM>",
      "position" : 9
    },
    {
     
      "token" : "bone",
      "start_offset" : 51,
      "end_offset" : 55,
      "type" : "<ALPHANUM>",
      "position" : 10
    }
  ]
}

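(1) Install the ik analyzer

By default, text in every language is tokenized with the Standard Analyzer, but it does not handle Chinese well. A Chinese analyzer therefore needs to be installed.

Note: do not use the default elasticsearch-plugin install xxx.zip for automatic installation.
Download the release that matches your es version from https://github.com/medcl/elasticsearch-analysis-ik/releases/download

When Elasticsearch was installed earlier, the container's "/usr/share/elasticsearch/plugins" directory was mapped to the "/mydata/elasticsearch/plugins" directory on the host, so the most convenient approach is to download "elasticsearch-analysis-ik-7.6.2.zip" and unzip it into that folder. After installation, restart the elasticsearch container.

If you don't mind the extra steps, you can also install it as follows.

(1) Check the Elasticsearch version: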

[root@hadoop-104 ~]# curl http://localhost:9200
{
     
  "name" : "0adeb7852e00",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "9gglpP0HTfyOTRAaSe2rIg",
  "version" : {
     
    "number" : "7.6.2",      # the version is 7.6.2
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
    "build_date" : "2020-03-26T06:34:37.794943Z",
    "build_snapshot" : false,
    "lucene_version" : "8.4.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
[root@hadoop-104 ~]#

(2) Enter the plugins directory inside the es container

docker exec -it <container id> /bin/bash

[root@hadoop-104 ~]# docker exec -it elasticsearch /bin/bash
[root@0adeb7852e00 elasticsearch]#

wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.2/elasticsearch-analysis-ik-7.6.2.zip

[root@0adeb7852e00 elasticsearch]# pwd
/usr/share/elasticsearch
# download ik 7.6.2
[root@0adeb7852e00 elasticsearch]# wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.2/elasticsearch-analysis-ik-7.6.2.zip

Unzip the downloaded file

[root@0adeb7852e00 elasticsearch]# unzip elasticsearch-analysis-ik-7.6.2.zip -d ik
Archive:  elasticsearch-analysis-ik-7.6.2.zip
   creating: ik/config/
  inflating: ik/config/main.dic      
  inflating: ik/config/quantifier.dic  
  inflating: ik/config/extra_single_word_full.dic  
  inflating: ik/config/IKAnalyzer.cfg.xml  
  inflating: ik/config/surname.dic   
  inflating: ik/config/suffix.dic    
  inflating: ik/config/stopword.dic  
  inflating: ik/config/extra_main.dic  
  inflating: ik/config/extra_stopword.dic  
  inflating: ik/config/preposition.dic  
  inflating: ik/config/extra_single_word_low_freq.dic  
  inflating: ik/config/extra_single_word.dic  
  inflating: ik/elasticsearch-analysis-ik-7.6.2.jar  
  inflating: ik/httpclient-4.5.2.jar  
  inflating: ik/httpcore-4.4.4.jar   
  inflating: ik/commons-logging-1.2.jar  
  inflating: ik/commons-codec-1.9.jar  
  inflating: ik/plugin-descriptor.properties  
  inflating: ik/plugin-security.policy  
[root@0adeb7852e00 elasticsearch]#
#移动到plugins目录下
[root@0adeb7852e00 elasticsearch]# mv ik plugins/

rm -rf *.zip

[root@0adeb7852e00 elasticsearch]# rm -rf elasticsearch-analysis-ik-7.6.2.zip

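Confirm that the analyzer is installed, for example by running elasticsearch-plugin list from the bin directory inside the container, then restart the Elasticsearch container so that the plugin is loaded.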
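The installation can also be checked from outside the container by asking Elasticsearch which plugins it has loaded through the `_cat/plugins` endpoint. The following is a minimal JDK-only sketch (Java 11+), not part of the original notes: the class name `PluginCheck`, its helper methods, and the base URL are hypothetical, and it assumes the IK plugin reports itself under the component name `analysis-ik`.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class PluginCheck {

    // Returns true if any line of the `_cat/plugins` output lists the given component.
    // Each line is expected to look like: "<node-name> <component> <version>".
    static boolean hasPlugin(String catPluginsOutput, String component) {
        for (String line : catPluginsOutput.split("\\R")) {
            String[] columns = line.trim().split("\\s+");
            if (columns.length >= 2 && columns[1].equals(component)) {
                return true;
            }
        }
        return false;
    }

    // Fetches the plain-text plugin list from a running node (baseUrl is an assumption,
    // e.g. "http://localhost:9200").
    static String fetchPlugins(String baseUrl) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/_cat/plugins"))
                .GET()
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

With the container restarted, `PluginCheck.hasPlugin(PluginCheck.fetchPlugins("http://localhost:9200"), "analysis-ik")` would be expected to return true if the plugin was picked up.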

（2）Test the analyzer

Using the default analyzer:

GET my_index/_analyze
{
  "text":"我是中国人"
}

Observe the result; the default analyzer splits the text into single characters:

{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "中",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "<IDEOGRAPHIC>",
      "position" : 2
    },
    {
      "token" : "国",
      "start_offset" : 3,
      "end_offset" : 4,
      "type" : "<IDEOGRAPHIC>",
      "position" : 3
    },
    {
      "token" : "人",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "<IDEOGRAPHIC>",
      "position" : 4
    }
  ]
}

GET my_index/_analyze
{
  "analyzer": "ik_smart",
  "text":"我是中国人"
}

Output:

{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "中国人",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    }
  ]
}

GET my_index/_analyze
{
  "analyzer": "ik_max_word",
  "text":"我是中国人"
}

Output:

{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "中国人",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "中国",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "国人",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 4
    }
  ]
}
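The same `_analyze` calls can be issued from code instead of the Kibana console. The sketch below builds the request with the JDK's `java.net.http` API (Java 11+); the class `AnalyzeRequestSketch`, its method names, and the base URL passed in are hypothetical, and the JSON escaping only covers the characters most likely to appear in test text. It uses POST, which `_analyze` accepts alongside GET.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class AnalyzeRequestSketch {

    // Escapes quotes, backslashes and common whitespace escapes for a JSON string literal.
    static String jsonEscape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Builds the same body that the console requests above send.
    static String body(String analyzer, String text) {
        return "{\"analyzer\":\"" + jsonEscape(analyzer)
                + "\",\"text\":\"" + jsonEscape(text) + "\"}";
    }

    // Builds a POST <baseUrl>/<index>/_analyze request carrying that body.
    static HttpRequest request(String baseUrl, String index, String analyzer, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(analyzer, text)))
                .build();
    }
}
```

Sending `AnalyzeRequestSketch.request("http://localhost:9200", "my_index", "ik_smart", "我是中国人")` with `HttpClient.newHttpClient().send(...)` should return a token list like the ones shown above, provided the index exists and the IK plugin is loaded.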

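（3）Custom dictionary

Edit IKAnalyzer.cfg.xml in /usr/share/elasticsearch/plugins/ik/config and point remote_ext_dict at the remote dictionary:

<properties>
	<comment>IK Analyzer 扩展配置</comment>
	<entry key="ext_dict"></entry>
	<entry key="ext_stopwords"></entry>
	<entry key="remote_ext_dict">http://192.168.137.14/es/fenci.txt</entry>
</properties>

The original XML, for comparison:

<properties>
	<comment>IK Analyzer 扩展配置</comment>
	<entry key="ext_dict"></entry>
	<entry key="ext_stopwords"></entry>
</properties>

After the change, restart the Elasticsearch container; otherwise the change does not take effect.

After the update, ES applies the new dictionary only to newly indexed data; existing data is not re-analyzed. To re-analyze existing data, run:

POST my_index/_update_by_query?conflicts=proceed

http://192.168.137.14/es/fenci.txt is the URL of the dictionary file served by Nginx.

Before running the example below, install Nginx (see the appendix on installing Nginx), then create the "fenci.txt" file under the "es" directory of the Nginx web root so that it matches the URL above, with the following content:

mkdir -p /mydata/nginx/html/es
echo "樱桃萨其马，带你甜蜜入夏" > /mydata/nginx/html/es/fenci.txt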
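For more than a handful of terms, the dictionary file can be generated rather than typed by hand. The sketch below is a hypothetical helper (the class `DictionaryWriter` is not part of IK or of these notes); it assumes, as the IK plugin's documentation describes, that a dictionary is a UTF-8 text file with one term per line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class DictionaryWriter {

    // Writes one term per line, UTF-8 encoded; blank entries are skipped.
    static void write(Path target, List<String> terms) throws IOException {
        StringBuilder content = new StringBuilder();
        for (String term : terms) {
            String trimmed = term.trim();
            if (!trimmed.isEmpty()) {
                content.append(trimmed).append('\n');
            }
        }
        // Create the parent directory (for example the Nginx web root subfolder) if missing.
        if (target.getParent() != null) {
            Files.createDirectories(target.getParent());
        }
        Files.write(target, content.toString().getBytes(StandardCharsets.UTF_8));
    }
}
```

Calling `DictionaryWriter.write(path, List.of("樱桃萨其马"))` with `path` pointing at the file Nginx serves as fenci.txt would register that word as a single term.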

Test the result:

GET my_index/_analyze
{
  "analyzer": "ik_max_word",
  "text":"樱桃萨其马，带你甜蜜入夏"
}

Output:

{
  "tokens" : [
    {
      "token" : "樱桃",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "萨其马",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "带你",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "甜蜜",
      "start_offset" : 8,
      "end_offset" : 10,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "入夏",
      "start_offset" : 10,
      "end_offset" : 12,
      "type" : "CN_WORD",
      "position" : 4
    }
  ]
}

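4、elasticsearch-Rest-Client

1）9300: TCP

spring-data-elasticsearch: transport-api.jar;
- transport-api.jar differs between Spring Boot versions and may not match the ES version
- it is deprecated in 7.x and scheduled for removal in 8

2）9200: HTTP

JestClient: unofficial and slow to update;
RestTemplate: sends plain HTTP requests; many ES operations have to be wrapped by hand, which is tedious;
HttpClient: same as above;
Elasticsearch-Rest-Client: the official RestClient; it wraps ES operations, has a clearly layered API, and is easy to pick up;
Final choice: Elasticsearch-Rest-Client (elasticsearch-rest-high-level-client);
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high.html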

5、Appendix: Installing Nginx

Start a throwaway nginx instance, only to copy its configuration out:
```
docker run -p80:80 --name nginx -d nginx:1.10   
```

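Copy the configuration files from the container to /mydata/nginx/conf/. Note that docker cp does not expand wildcards, so copy the whole directory:

mkdir -p /mydata/nginx/html
mkdir -p /mydata/nginx/logs
mkdir -p /mydata/nginx/conf
docker container cp nginx:/etc/nginx /mydata/nginx/conf/
# the copy leaves an nginx folder inside conf, so move its contents up into conf
mv /mydata/nginx/conf/nginx/* /mydata/nginx/conf/
rm -rf /mydata/nginx/conf/nginx

Stop the original container: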
```
docker stop nginx
```
Remove the original container:
```
docker rm nginx
```

Create the new Nginx container by running:

docker run -p 80:80 --name nginx \
 -v /mydata/nginx/html:/usr/share/nginx/html \
 -v /mydata/nginx/logs:/var/log/nginx \
 -v /mydata/nginx/conf/:/etc/nginx \
 -d nginx:1.10

Configure nginx to start automatically on boot:
```
docker update nginx --restart=always
```
Create the file "/mydata/nginx/html/index.html" to test whether Nginx can be reached:
```
echo 'hello nginx!' > /mydata/nginx/html/index.html
```
Visit: http://<IP of the host running nginx>:80/index.html

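Integrating ElasticSearch with SpringBoot

1、Import the dependency

The version here must match the installed Elasticsearch version.

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.6.2</version>
</dependency>

spring-boot-dependencies manages the Elasticsearch version as 6.8.7:

    <elasticsearch.version>6.8.7</elasticsearch.version>

so it has to be overridden to 7.6.2 in the project:

    <properties>
        ...
        <elasticsearch.version>7.6.2</elasticsearch.version>
    </properties>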
2、Write the test class

1）Test saving data

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-index.html

    @Test
    public void indexData() throws IOException {
        IndexRequest indexRequest = new IndexRequest("users");

        User user = new User();
        user.setUserName("张三");
        user.setAge(20);
        user.setGender("男");
        String jsonString = JSON.toJSONString(user);
        // set the content to save
        indexRequest.source(jsonString, XContentType.JSON);
        // create the index if needed and save the document
        IndexResponse index = client.index(indexRequest, GulimallElasticSearchConfig.COMMON_OPTIONS);

        System.out.println(index);
    }
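The test above relies on a `User` bean that is not shown. A minimal sketch of what it could look like follows; the field names are inferred from the setters used in the test (`setUserName`, `setAge`, `setGender`), while the field types and the `toString` format are assumptions. In a real project it would more likely be a `public` class in its own file, or a Lombok `@Data` class.

```java
final class User {
    private String userName;
    private Integer age;
    private String gender;

    public String getUserName() { return userName; }
    public void setUserName(String userName) { this.userName = userName; }

    public Integer getAge() { return age; }
    public void setAge(Integer age) { this.age = age; }

    public String getGender() { return gender; }
    public void setGender(String gender) { this.gender = gender; }

    // Readable form for debugging output.
    @Override
    public String toString() {
        return "User{userName='" + userName + "', age=" + age + ", gender='" + gender + "'}";
    }
}
```

JSON serializers such as fastjson typically read the bean through its getters, so the serialized document would carry the keys `userName`, `age` and `gender`.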

[Screenshots: the index before the test and after the test]

2）Test retrieving data

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-search.html

    @Test
    public void searchData() throws IOException {
        // fetch a single document by index name and document id
        GetRequest getRequest = new GetRequest(
                "users",
                "_-2vAHIB0nzmLJLkxKWk");

        GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
        System.out.println(getResponse);
        String index = getResponse.getIndex();
        System.out.println(index);
        String id = getResponse.getId();
        System.out.println(id);
        if (getResponse.isExists()) {
            long version = getResponse.getVersion();
            System.out.println(version);
            String sourceAsString = getResponse.getSourceAsString();
            System.out.println(sourceAsString);
            Map<String, Object> sourceAsMap = getResponse.getSourceAsMap();
            System.out.println(sourceAsMap);
            byte[] sourceAsBytes = getResponse.getSourceAsBytes();
        } else {
            // the document does not exist
        }
    }

Result of querying documents with state="AK":

{
	"took": 1,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 22,   // 22 documents matched
			"relation": "eq"
		},
		"max_score": 3.7952394,
		"hits": [{
			"_index": "bank",
			"_type": "account",
			"_id": "210",
			"_score": 3.7952394,
			"_source": {
				"account_number": 210,
				"balance": 33946,
				"firstname": "Cherry",
				"lastname": "Carey",
				"age": 24,
				"gender": "M",
				"address": "539 Tiffany Place",
				"employer": "Martgo",
				"email": "[email protected]",
				"city": "Fairacres",
				"state": "AK"
			}
		},
		.... // remaining hits omitted
		]
	}
}

Search for everyone whose address contains "mill", and compute their age distribution, average age, and average balance:

GET bank/_search
{
  "query": {
    "match": {
      "address": "Mill"
    }
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
      "avg": {
        "field": "age"
      }
    },
    "balanceAvg": {
      "avg": {
        "field": "balance"
      }
    }
  }
}

Java implementation

    /**
     * Complex search: in the bank index, find everyone whose address contains "mill"
     * and compute their age distribution, average age, and average balance.
     * @throws IOException
     */
    @Test
    public void searchData() throws IOException {
        // 1. Create the search request
        SearchRequest searchRequest = new SearchRequest();

        // 1.1) Specify the index
        searchRequest.indices("bank");
        // 1.2) Build the search conditions
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.query(QueryBuilders.matchQuery("address", "Mill"));

        // 1.2.1) Aggregate by age distribution
        TermsAggregationBuilder ageAgg = AggregationBuilders.terms("ageAgg").field("age").size(10);
        sourceBuilder.aggregation(ageAgg);

        // 1.2.2) Compute the average age
        AvgAggregationBuilder ageAvg = AggregationBuilders.avg("ageAvg").field("age");
        sourceBuilder.aggregation(ageAvg);
        // 1.2.3) Compute the average balance
        AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");
        sourceBuilder.aggregation(balanceAvg);

        System.out.println("Search conditions: " + sourceBuilder);
        searchRequest.source(sourceBuilder);
        // 2. Execute the search
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println("Search result: " + searchResponse);

        // 3. Map the hits to beans
        SearchHits hits = searchResponse.getHits();
        SearchHit[] searchHits = hits.getHits();
        for (SearchHit searchHit : searchHits) {
            String sourceAsString = searchHit.getSourceAsString();
            Account account = JSON.parseObject(sourceAsString, Account.class);
            System.out.println(account);
        }

        // 4. Read the aggregation results
        Aggregations aggregations = searchResponse.getAggregations();

        Terms ageAgg1 = aggregations.get("ageAgg");
        for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
            String keyAsString = bucket.getKeyAsString();
            System.out.println("Age: " + keyAsString + " ==> " + bucket.getDocCount());
        }

        Avg ageAvg1 = aggregations.get("ageAvg");
        System.out.println("Average age: " + ageAvg1.getValue());

        Avg balanceAvg1 = aggregations.get("balanceAvg");
        System.out.println("Average balance: " + balanceAvg1.getValue());
    }

Try comparing the printed search conditions and results with the earlier Elasticsearch query statement and its response.
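To make clear what the three aggregations compute, the sketch below reproduces them on an in-memory list with Java streams. It is an illustration only: `AggregationSketch`, its nested `Account` class (a stand-in with just the fields used here), and the sample data in the usage note are hypothetical, and the address filter is a rough approximation of a `match` query (case-insensitive whole-token comparison).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class AggregationSketch {

    // Simplified stand-in for a document in the bank index.
    static final class Account {
        final String address;
        final int age;
        final long balance;

        Account(String address, int age, long balance) {
            this.address = address;
            this.age = age;
            this.balance = balance;
        }
    }

    // Rough equivalent of match(address, term): keep documents whose address
    // contains the term as a whitespace-separated token, ignoring case.
    static List<Account> matchAddress(List<Account> accounts, String term) {
        String needle = term.toLowerCase();
        return accounts.stream()
                .filter(a -> Arrays.asList(a.address.toLowerCase().split("\\s+")).contains(needle))
                .collect(Collectors.toList());
    }

    // Equivalent of the "terms" aggregation on age: one bucket per distinct age with its
    // document count. Elasticsearch orders buckets by count; the TreeMap here sorts by key.
    static Map<Integer, Long> ageBuckets(List<Account> hits) {
        return hits.stream()
                .collect(Collectors.groupingBy(a -> a.age, TreeMap::new, Collectors.counting()));
    }

    // Equivalent of the "avg" aggregation on age.
    static double averageAge(List<Account> hits) {
        return hits.stream().mapToInt(a -> a.age).average().orElse(Double.NaN);
    }

    // Equivalent of the "avg" aggregation on balance.
    static double averageBalance(List<Account> hits) {
        return hits.stream().mapToLong(a -> a.balance).average().orElse(Double.NaN);
    }
}
```

For example, given three made-up accounts on "Mill" streets aged 38, 28 and 38, `ageBuckets` yields one bucket for 28 with count 1 and one for 38 with count 2, which mirrors the shape of the `ageAgg` buckets in the Elasticsearch response.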

Miscellaneous

1. Kibana console shortcuts

ctrl+home: jump to the top of the document;

ctrl+end: jump to the end of the document.
