1. ES basic concepts, installation and startup, basic usage

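1. What ES is

ES stands for Elasticsearch.

Elasticsearch is a near-real-time distributed search and analytics engine built on Lucene.

Lucene is only a library. Elasticsearch uses Lucene as its core to implement all indexing and search functionality, and ships the Lucene JAR files as a dependency.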

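ES compared with a relational database

Relational database -> ES

database -> index

table -> type (mapping types were deprecated in 6.x and removed in 8.x)

row -> document

column -> field

An ES cluster contains multiple indices (the counterpart of databases).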

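Traditional databases use B-trees as their index structure.

ES and Lucene use an inverted index.

Inverted index: the requirement is to find records by the value of an attribute, that is, to go backwards from value to key, which is why it is called an inverted index.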
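
To make the value-to-key lookup concrete, here is a minimal sketch of an inverted index in Python. It is an illustration only and not how Lucene stores its index: the sample documents and the whitespace tokenizer are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Toy analysis (an assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "quick brown fox",
    2: "lazy brown dog",
    3: "quick red fox",
}

index = build_inverted_index(docs)
print(index["brown"])  # ids of the documents that contain "brown"
print(index["quick"])  # ids of the documents that contain "quick"
```

A lookup by term is then a single dictionary access, instead of a scan over every document.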

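Differences between ES and Solr:

1. Both are built on Lucene and provide distributed inverted-index search.

2. Solr appeared earlier, but ES is more popular.

3. ES is distributed out of the box and needs no additional components. Changes are distributed in real time, an approach called push replication.

4. ES fully supports Apache Lucene's near-real-time search and handles multitenancy without special configuration, whereas Solr needs more advanced configuration.

5. ES uses the gateway concept, which makes backups simpler.

6. Nodes form a peer-to-peer network. When some nodes fail, other nodes are automatically assigned to take over their work.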

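2. Downloading and installing ES

1. Download

Requires JDK 1.7 or later.

Download page: https://www.elastic.co/cn/downloads/

2. Install

There are several ways to install: apt-get, Homebrew, yum, Docker, or downloading the archive, unpacking it and running it directly.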

Installing with Docker
1. docker pull elasticsearch:7.12.1 (download the image)
2. docker run -d --name es -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.12.1 (run it)
If the host is short on memory, add an environment variable at startup to cap the JVM heap (ES is memory-hungry): -e ES_JAVA_OPTS="-Xms64m -Xmx512m"
3. http://VM_IP:9200 (test it; replace VM_IP with the virtual machine's IP address)
4. docker pull kibana:7.12.1 (install Kibana)
5. Prepare the Kibana configuration file
docker inspect followed by the ES container id (for example 7a1a6d3d3423), then find IPAddress in the output
Add the Kibana configuration as follows:
mkdir -p /opt/docker/es/
vim /opt/docker/es/kibana.yml
server.name: kibana
server.host: "0"
elasticsearch.hosts: ["http://IPAddress:9200"]
xpack.monitoring.ui.container.elasticsearch.enabled: true

6. Start Kibana
docker run -d --restart=always --log-driver json-file --log-opt max-size=100m --log-opt max-file=2 --name kibana -e ELASTICSEARCH_HOSTS=http://172.17.0.2:9200 -p 5601:5601 -v /opt/docker/es/kibana.yml:/usr/share/kibana/config/kibana.yml kibana:7.12.1
7. Check
Open http://127.0.0.1:5601

3. Configuration

Edit config/elasticsearch.yml:

(1) cluster.name: clusterName (the cluster name; every node in the same cluster must use the same value)

(2) node.name: node-1 (the name of this node; node names must be unique within a cluster)

(3) network.host: 192.168.1.194 (do not write 127.0.0.1 here)

(4) Settings to guard against split brain

discovery.zen.ping.multicast.enabled: false

discovery.zen.ping_timeout: 120s

client.transport.ping_timeout: 60s

discovery.zen.ping.unicast.hosts: ["192.168.1.191","192.168.1.192","192.168.1.193"]

4. Start

cd elasticsearch-2.4.5/

bin/elasticsearch

Once it has started, open http://ip:9200

Check cluster health: http://ip:9200/_cluster/health?pretty
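
The health check can also be scripted. The sketch below only builds the URL and interprets a response body; it does not contact a cluster. The helper names and the hand-written sample body are assumptions for illustration. The `status` field with the values green, yellow and red is what the health endpoint reports.

```python
import json


def health_url(host, port=9200):
    """Build the cluster health URL shown above."""
    return f"http://{host}:{port}/_cluster/health?pretty"


def is_usable(health_body):
    """Return True when the reported status is green or yellow."""
    status = json.loads(health_body).get("status")
    return status in ("green", "yellow")


# Hand-written sample response body, for illustration only.
sample = '{"cluster_name": "clusterName", "status": "yellow"}'
print(health_url("192.168.1.194"))
print(is_usable(sample))
```

To query a live node, the URL could be fetched with `urllib.request.urlopen` and the decoded body passed to `is_usable`.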

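3. Using ES

1. Create an index

Create an index named myindex:

curl -XPUT http://192.168.78.102:9200/myindex

(Index names must be lowercase, must not start with an underscore, and must not contain commas.)

(When indexing a document without an explicit id, a POST request generates an id automatically, whereas a PUT request fails.)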
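
The three naming rules can be checked before sending the request. This is a minimal sketch that covers only the rules listed in the note; the function name is hypothetical, and Elasticsearch itself enforces further restrictions that are not modelled here.

```python
def is_valid_index_name(name):
    """Check only the three rules stated in the note above."""
    if not name:
        return False
    if name != name.lower():
        return False  # must be lowercase
    if name.startswith("_"):
        return False  # must not start with an underscore
    if "," in name:
        return False  # must not contain a comma
    return True


for candidate in ["myindex", "MyIndex", "_myindex", "my,index"]:
    print(candidate, is_valid_index_name(candidate))
```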

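2. List all indices and show the details of one index

curl -XGET 'http://192.168.78.102:9200/_cat/indices?v&pretty'
(pretty formats a JSON response for readability;
?v adds a header row naming each column of the index details. The URL is quoted so that the shell does not interpret the & character.)
curl -XGET 'http://192.168.78.102:9200/myindex?pretty'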

3. Delete an index

curl -XDELETE http://192.168.78.102:9200/myindex

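4. Create a document

Insert with PUT

curl -XPUT http://192.168.78.102:9200/myindex/employee/1 -H 'Content-Type: application/json' -d '{
"first_name":"kk",
"age":25
}'

(employee is the type, 1 is the id)

Insert with POST

curl -XPOST http://192.168.78.102:9200/myindex/employee -H 'Content-Type: application/json' -d '{
"first_name":"tt",
"age":25
}'

(the id is generated automatically)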
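
The choice between PUT and POST follows from whether an id is supplied. As a rough sketch, the helper below builds the matching request with Python's standard library without sending it. The function name `build_index_request` is hypothetical, and the typed URL layout simply mirrors the curl commands above.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc, doc_id=None):
    """Build (but do not send) a request that indexes one document."""
    body = json.dumps(doc).encode("utf-8")
    if doc_id is None:
        # No id supplied: POST to the type so that an id is generated.
        url = f"{base_url}/{index}/{doc_type}"
        method = "POST"
    else:
        # Explicit id: PUT to the full document path.
        url = f"{base_url}/{index}/{doc_type}/{doc_id}"
        method = "PUT"
    return urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )


req = build_index_request(
    "http://192.168.78.102:9200", "myindex", "employee",
    {"first_name": "kk", "age": 25}, doc_id=1,
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running node.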

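5. Update a document

curl -XPUT http://192.168.78.102:9200/myindex/employee/1 -H 'Content-Type: application/json' -d '{
"first_name":"kk",
"age":26
}'

POST can also modify a document, but the id must be specified.

(Prefer POST for creating and PUT for modifying.)

(When ES performs an update, it first marks the old document as deleted and then adds the new document. The old document does not disappear immediately, but it can no longer be found by a search. As more data is added, ES purges the documents marked as deleted in the background, during segment merges.)
Difference between PUT and POST
PUT overwrites the whole document, whereas a POST to the _update endpoint changes only the fields it is given.
Partial update with POST

curl -XPOST http://localhost:9200/myindex/employee/1/_update -H 'Content-Type: application/json' -d '{
 "doc":{
      "city":"beijing",
      "sex":"male"
      }
}'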
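
The difference between a full overwrite and a partial update can be pictured with plain dictionaries. This is a simplified sketch, not the server's implementation: both function names are hypothetical, and the merge here is shallow, whereas Elasticsearch merges nested objects recursively.

```python
def full_replace(stored, new_doc):
    """PUT-style write: the new body replaces the stored document."""
    return dict(new_doc)


def partial_update(stored, partial_doc):
    """_update-style write: the given fields are merged into the stored document."""
    merged = dict(stored)
    merged.update(partial_doc)
    return merged


stored = {"first_name": "kk", "age": 26}
print(full_replace(stored, {"city": "beijing"}))
print(partial_update(stored, {"city": "beijing", "sex": "male"}))
```

After the full replace, first_name and age are gone; after the partial update they are kept alongside the new fields.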

6. Query

1. Return selected fields

Return first_name and age for the document with id 1:
curl -XGET 'http://localhost:9200/myindex/employee/1?_source=first_name,age'

2. Query all documents

curl -XGET 'http://localhost:9200/myindex/employee/_search?pretty'

3. Query by condition

Find documents whose last_name is kk:
curl -XGET 'http://localhost:9200/myindex/employee/_search?q=last_name:kk'
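
When the search value contains spaces or other reserved characters, the query string needs to be URL-encoded. A minimal sketch using the standard library follows; the helper name `search_url` is hypothetical, and the URL layout mirrors the curl commands above.

```python
from urllib.parse import urlencode


def search_url(base_url, index, doc_type, field, value, source_fields=None):
    """Build a URI search URL of the form .../_search?q=field:value."""
    params = {"q": f"{field}:{value}"}
    if source_fields:
        # Restrict the returned fields, as with ?_source= above.
        params["_source"] = ",".join(source_fields)
    return f"{base_url}/{index}/{doc_type}/_search?{urlencode(params)}"


print(search_url("http://localhost:9200", "myindex", "employee", "last_name", "kk"))
```

Note that `urlencode` percent-encodes the colon and the comma; the server decodes them again when it parses the query string.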

4. DSL query
DSL stands for domain-specific language.

Query by a single condition:
curl -XGET 'http://localhost:9200/myindex/employee/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query":{
      "match":
      {"last_name":"smith"}
  }
}'

Query across multiple fields:
curl -XGET 'http://localhost:9200/myindex/employee/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query":{
      "multi_match":
      {"query":"kk","fields":["last_name","first_name"]}
  }
}'

Compound query
must: AND
must_not: NOT
should: OR
curl -XGET 'http://192.168.78.101:9200/myindex/employee/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query":{"bool":{
      "must":[
          {"match":{"last_name":"kk"}},
          {"term":{"last_name":"kk"}},
          {"match":{"age":37}},
          {"range":{"age":{"gte":30,"lte":40}}}
      ]
  }}
}'
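
A bool query body is easier to keep well-formed when it is assembled programmatically. The sketch below shows how must, must_not and should clauses sit side by side inside one bool object. The helper name `bool_query` is hypothetical, and the clause values are examples only.

```python
import json


def bool_query(must=None, must_not=None, should=None):
    """Assemble a bool query body from lists of clause dicts."""
    bool_body = {}
    if must:
        bool_body["must"] = list(must)          # AND
    if must_not:
        bool_body["must_not"] = list(must_not)  # NOT
    if should:
        bool_body["should"] = list(should)      # OR
    return {"query": {"bool": bool_body}}


body = bool_query(
    must=[{"match": {"last_name": "kk"}}],
    must_not=[{"term": {"age": 37}}],
    should=[{"range": {"age": {"gte": 30, "lte": 40}}}],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d as the request body.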

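(The difference between term and match: match runs the search text through an analyzer and splits it into tokens, whereas term looks for an exact match on the value as given.)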
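
The term and match behaviour can be imitated with a toy index. This is a sketch under simplifying assumptions: the analyzer only lowercases and splits on whitespace, the sample names are made up, and the match lookup returns documents containing any of the query tokens.

```python
def analyze(text):
    """Toy analyzer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Index each document under the tokens produced by the analyzer."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, value):
    """Look the value up exactly as given, with no analysis."""
    return sorted(index.get(value, set()))


def match_query(index, text):
    """Analyze the query text, then return documents matching any token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return sorted(hits)


# Hypothetical sample data.
docs = {1: "John Smith", 2: "Jane Smith", 3: "John Doe"}
index = build_index(docs)
print(term_query(index, "John Smith"))   # no single indexed token equals this string
print(match_query(index, "John Smith"))  # documents containing "john" or "smith"
```

This also shows why a term query on an analyzed text field often finds nothing: the indexed tokens are lowercase, so the unanalyzed value "Smith" does not equal the stored token "smith".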
(Results are scored with Lucene's scoring mechanism, TF/IDF. From Elasticsearch 5.0 onward the default similarity is BM25.)
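
To give a feel for TF/IDF, here is a textbook version of the formula in Python. It is not Lucene's exact scoring function, which uses different normalization; the corpus is made up for the example.

```python
import math


def tf_idf(term, doc_tokens, all_docs_tokens):
    """Textbook TF/IDF: term frequency times log inverse document frequency."""
    if not doc_tokens:
        return 0.0
    tf = doc_tokens.count(term) / len(doc_tokens)
    doc_freq = sum(1 for tokens in all_docs_tokens if term in tokens)
    if doc_freq == 0:
        return 0.0
    idf = math.log(len(all_docs_tokens) / doc_freq)
    return tf * idf


# Hypothetical corpus of three tokenized documents.
corpus = [
    "quick brown fox".split(),
    "lazy brown dog".split(),
    "quick quick fox".split(),
]

# "dog" appears in one document, "brown" in two, so "dog" scores higher.
print(tf_idf("dog", corpus[1], corpus) > tf_idf("brown", corpus[1], corpus))
```

A term scores higher when it is frequent within the document and rare across the corpus.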