Elasticsearch is a Lucene-based full-text search server. It provides a distributed, multi-tenant full-text search engine behind a RESTful web interface, which makes full-text search simple to use.
Download link (Elasticsearch)
Run %ES_HOME%\bin\elasticsearch.bat to start the ES server.
Then open http://localhost:9200/ in a browser.
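If ES started correctly, this URL returns basic node information, roughly like the following (the values are illustrative and vary per installation):

{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "5.2.2"
  },
  "tagline" : "You Know, for Search"
}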
Before starting ES, extract the IK analyzer into the plugins folder; the analyzer can then be tested:
POST _analyze
{
"analyzer":"ik_smart",
"text":"《新闻联播》播发国际锐评:打“香港牌”牵制中国的图谋必败无疑"
}
The IK analyzer offers two modes: ik_smart (coarse-grained, smart segmentation) and ik_max_word (finest-grained segmentation).
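Running the same text through both modes shows the difference (a small comparison; any Chinese phrase works): ik_smart keeps 中华人民共和国 as one coarse token, while ik_max_word additionally emits every sub-word it recognizes.

POST _analyze
{
  "analyzer":"ik_smart",
  "text":"中华人民共和国"
}

POST _analyze
{
  "analyzer":"ik_max_word",
  "text":"中华人民共和国"
}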
Download link (Kibana)
Extract Kibana and edit config/kibana.yml, pointing the host in elasticsearch.url at the running ES instance (port 9200 by default).
Start Kibana via bin\kibana.bat and open http://localhost:5601.
REST is a resource-oriented architectural style. In short: a URL locates a resource, and an HTTP verb (GET, POST, DELETE, PUT) describes the operation on it.
GET /user/1    => get the user with id 1
GET /users     => get the list of users
PUT /user      => add a user, body: {"id":"1","name":"cc"}
POST /user     => update a user, body: {"id":"1","name":"cc"}
DELETE /user/1 => delete the user with id 1
Create an index:
PUT crm
{
"settings": {
"number_of_shards": 5,
"number_of_replicas": 1
}
}
GET _cat/indices?v
GET _cat/indices/crm?v
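With ?v the response carries a header row; the output looks roughly like this (the uuid and sizes are illustrative):

health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   crm   aBcDeFgHiJkLmNoPqRsTuV   5   1          0            0       650b           650b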
Settings such as the number of shards cannot be changed on an existing index; modify them by deleting the index and creating it again:
DELETE crm
PUT crm/employee/1
{
"id":1,
"name":"cc",
"sex":true
}
If no id is given in the path (using POST instead of PUT), ES auto-generates an id string.
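For example (the field values here are just sample data), the generated id comes back in the response's _id field:

POST crm/employee
{
  "id":3,
  "name":"dd",
  "sex":false
}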
GET crm/employee/1
GET crm/employee/1?_source=name,sex
GET crm/employee/_search
PUT crm/employee/2
{
"id":2,
"name":"aabbcc",
"sex":true
}
POST crm/employee/2/_update
{
"doc": {
"sex":false
}
}
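If the target document may not exist yet, the update API also accepts doc_as_upsert, which inserts the doc when the id is absent (a sketch; id 3 and its fields are illustrative):

POST crm/employee/3/_update
{
  "doc": {
    "name":"dd",
    "sex":true
  },
  "doc_as_upsert": true
}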
DELETE crm/employee/2
GET _mget
{
"docs":[
{
"_index":"crm",
"_type":"employee",
"_id":"1"
},{
"_index":"cms",
"_type":"user",
"_id":"1"
}
]
}
GET crm/employee/_mget
{
"ids":["1","2"]
}
GET crm/user/_search?q=age:17&size=2&from=2&sort=id:desc&_source=id,username
This style is not recommended: once there are many conditions, the query string becomes awkward to assemble, so the DSL below is used instead.
A DSL query behaves like fuzzy (full-text, analyzed) matching; a DSL filter behaves like exact matching.
Under the hood ES applies DSL filters before running the DSL query, for efficiency.
In practice the two are usually combined:
GET crm/employee/_search
{
"query":{
"bool": {
"must": [
{"match": {
"name": "carry"
}}
],
"filter": [{
"term": {
"sex": true
}},
{
"range":{
"id":{
"gte":1,
"lte":7
}
}
}]
}
},
"from": 0,
"size": 10,
"_source": ["name","sex"],
"sort": [{"id": "desc"}]
}
Keywords used above: match (analyzed match), term (exact-value match), range with gte/lte (range filter), and bool with must/filter to combine conditions.
ES's document mapping mechanism pins down field types: every field is mapped to one concrete data type.
① Basic field types
Strings: text (analyzed) and keyword (not analyzed), corresponding to Lucene's TextField (analyzed text) and StringField (non-analyzed text).
text defaults to full-text content; keyword defaults to exact, non-analyzed values.
Numbers: long, integer, short, double, float
Date: date
Boolean: boolean
② Complex data types
Object type: object
Array type: array
Geo types: geo_point, geo_shape
When ES infers types dynamically, JSON values map to ES field types as follows:

| JSON type | ES field type |
|---|---|
| Boolean | boolean |
| Integer | long |
| Floating-point number | double |
| Date-formatted string | date |
| String | text (ES 5+ also adds a keyword sub-field) |
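Dynamic mapping is easy to observe (the test index and its fields are made up for illustration): index one document into a brand-new index, then read the inferred mapping back.

PUT test/person/1
{
  "id": 1,
  "born": "2019-11-22",
  "salary": 1.5,
  "active": true,
  "note": "hello"
}

GET test/_mapping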
A custom field mapping can be added after creating the index:
PUT aimall
PUT aimall/goods/_mapping
{
"goods": {
"properties": {
"id": {
"type": "long"
},
"name": {
"type": "text",
"analyzer": "ik_smart",
"search_analyzer": "ik__max__word"
}
}
}
}
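The mapping can be read back to confirm it took effect:

GET aimall/_mapping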
PUT crm
{
"mappings": {
"user": {
"properties": {
"id": {
"type": "integer"
},
"info": {
"type": "text",
"analyzer": "ik_smart",
"search_analyzer": "ik_smart"
}
}
},
"dept": {
"properties": {
"id": {
"type": "integer"
},
.... mapping settings for the other fields
}
}
}
}
{
"id" : 1,
"hobby" : ["football","dabaojian"]
}
Document mapping:
{
"properties": {
"id": {"type": "long"},
"hobby": {"type": "keyword"}
}
}
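Because hobby is a keyword field, an exact term query matches individual array elements (a sketch, assuming the document above was indexed into a hypothetical demo/person type):

GET demo/person/_search
{
  "query": {
    "term": { "hobby": "football" }
  }
}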
{
"id" : 1,
"girl" : {
"name" : "扛把子",
"age" : 22
}
}
Object document mapping:
{
"properties": {
"id": {"type": "long"},
"girl": {
"properties":{
"name": {"type": "keyword"},
"age": {"type": "integer"}
}
}
}
}
{
"id" : 1,
"girl":[{"name":"迪丽热巴","age":23},{"name":"古力娜扎","age":22}]
}
Document mapping:
{
  "properties": {
    "id": { "type": "long" },
    "girl": {
      "properties": {
        "age": { "type": "long" },
        "name": { "type": "text" }
      }
    }
  }
}
Global mapping can be configured in two ways: default settings (_default_) and dynamic templates.
PUT {indexName}
{
"mappings": {
"_default_": {
"_all": {
"enabled": false
}
},
"user": {},
"dept": {
"_all": {
"enabled": true
}
}
  }
}
PUT _template/global_template  // create a template named global_template
{
  "template": "*",  // match every index
  "settings": { "number_of_shards": 1 },  // matched indices get 1 primary shard
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": false  // disable the _all field for every type
      },
      "dynamic_templates": [
        {
          "string_as_text": {
            "match_mapping_type": "string",  // match JSON strings
            "match": "*_text",  // match field names ending in _text
            "mapping": {
              "type": "text",  // map matched string fields to text
              "analyzer": "ik_max_word",
              "search_analyzer": "ik_max_word",
              "fields": {
                "raw": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        {
          "string_as_keyword": {
            "match_mapping_type": "string",  // match JSON strings
            "mapping": {
              "type": "keyword"  // map all other string fields to keyword
            }
          }
        }
      ]
    }
  }
}
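To check that the template applies (the index, type, and field names below are illustrative): create a new index by indexing a document, then inspect the mapping; a field ending in _text should come out as text with ik_max_word, while other string fields become keyword.

PUT demo2/article/1
{
  "title_text": "dynamic template test",
  "author": "cc"
}

GET demo2/_mapping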
Mapping precedence (low -> high): default -> global -> custom.
By default, every node in an Elasticsearch cluster is master-eligible, stores data, and serves queries. In production, if node roles are left unmodified, the cluster is prone to problems such as split-brain under high data volume and high concurrency. Two settings control these roles: node.master and node.data, both true by default.
| Configuration | Node type |
|---|---|
| node.master=true & node.data=false | master node |
| node.master=false & node.data=true | data node |
| node.master=false & node.data=false | load-balancing (client) node |
In a production cluster these responsibilities should be split. It is recommended to set up at least 3 master-eligible nodes that only act as masters and maintain cluster state, then size a set of data nodes to the data volume; these only store data and serve indexing and search. When user requests are frequent, those nodes come under heavy load, so it is also recommended to add a set of client nodes (node.master: false, node.data: false) that only handle user requests, performing request forwarding and load balancing.
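A minimal elasticsearch.yml sketch of the three roles (one fragment per node; combine with the usual cluster settings):

# dedicated master node
node.master: true
node.data: false

# data node
node.master: false
node.data: true

# client / load-balancing node
node.master: false
node.data: false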
In a single-node environment, create an index with 3 primary shards and 3 replica shards.
In a 2-node environment, create an index with 3 primary shards and 3 replica shards; see the sketch below.
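One way to produce that layout (the index name is illustrative): with 3 primaries and number_of_replicas: 1, ES creates 3 replica shards in total. On a single node all 3 replicas stay unassigned, because a replica is never placed on the same node as its primary, so the cluster stays yellow; with a second node the replicas get allocated and the cluster turns green.

PUT test_shards
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}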
- cluster.name
Cluster name, elasticsearch by default; changing it is recommended, because in older multicast-discovery versions nodes on the same network segment with the same cluster name auto-join one cluster, which can tangle up data and operations in production.
- node.name
Node name; must be unique within a cluster so nodes can be told apart.
- node.master
Whether the node is master-eligible, true or false. If the elected master goes down or hangs, a new master is elected from the master-eligible nodes.
- node.data
Whether the node stores data and handles data-related operations, true or false.
- path.data
Data path; multiple paths may be separated by commas.
- path.logs
Log path.
- bootstrap.mlockall
Memory lock, true or false; locks the ES heap (ES_HEAP_SIZE) into RAM so the JVM memory is not swapped out.
- network.host
Host to bind and publish; 0.0.0.0 exposes the node to external networks.
- http.port
HTTP port for external access, 9200 by default, so a node is normally reached at http://ip:9200/.
- transport.tcp.port
Port for communication between cluster nodes, 9300 by default.
- discovery.zen.ping.unicast.hosts
The cluster's node IPs, optionally with ports (default 9300), e.g. ["192.168.1.101","192.168.1.102"].
- discovery.zen.minimum_master_nodes
Minimum number of master-eligible nodes; to prevent split-brain, set it to (master-eligible node count / 2 + 1).
- discovery.zen.ping_timeout
Timeout for master election.
- gateway.recover_after_nodes
With value n, the gateway starts recovering the cluster only after n nodes are up.
- node.max_local_storage_nodes
With value n, at most n nodes may be started on one machine.
- action.destructive_requires_name
Whether deleting indices requires an explicit index name, true or false (so wildcards cannot delete everything).
# shared cluster name
cluster.name: my-ealsticsearch
# node name per instance: node-1 node-2 node-3
node.name: node-?
# bind host (0.0.0.0 would expose the node to external networks)
network.host: 127.0.0.1
# HTTP port per instance: 9201 9202 9203
http.port: 920?
# transport port for inter-node communication per instance: 9301 9302 9303
transport.tcp.port: 930?
# the cluster's node addresses; ports optional, default 9300
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9301","127.0.0.1:9302","127.0.0.1:9303"]
Note: prepare three copies of the ES installation and delete the data directory in each copy. If the machine is short on memory, lower the heap settings in config/jvm.options.
elasticsearch.url: "http://localhost:9201"
Connecting Kibana to any single node is enough to reach the whole cluster; then start Kibana.
Check the cluster:
GET _cat/nodes?v ==> list the nodes
GET _cat/indices?v ==> list the indices
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>5.2.2</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>2.7</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>2.7</version>
</dependency>
import java.net.InetAddress;
import java.net.UnknownHostException;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class ESClientUtil {
public static TransportClient getClient(){
// cluster.name must match the value in elasticsearch.yml;
// sniffing lets the client discover the remaining cluster nodes
Settings settings = Settings.builder()
.put("cluster.name","my-ealsticsearch")
.put("client.transport.sniff", true).build();
TransportClient client = null;
try {
// connect via the transport port (930x), not the HTTP port (920x)
client = new PreBuiltTransportClient(settings)
.addTransportAddress(
new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9301));
} catch (UnknownHostException e) {
e.printStackTrace();
}
return client;
}
}
/**
* @Author: Carry
* @Date: 2019-11-22 18:02
* @Description:
*/
public class ESClusterTest {
TransportClient client;
/**
* Create the client
* */
@Before
public void getClient(){
client = ESClientUtil.getClient();
}
/**
* Close the client
* */
@After
public void close(){
client.close();
}
/**
* Add a document
* */
@Test
public void testAdd() throws Exception{
// target index, type and document id
IndexRequestBuilder index = client.prepareIndex("crm", "department", "1");
// prepare the document data
HashMap<String, Object> map = new HashMap<>();
map.put("id",1);
map.put("name","科技部");
map.put("address","chengdu");
// index the data and get the response
IndexResponse response = index.setSource(map).get();
System.out.println(response);
}
/**
* Get a document
* */
@Test
public void testGet() throws Exception{
GetResponse response = client.prepareGet("crm", "department", "6").get();
// call getSource() on the response to get the document as a Map
Map<String, Object> map = response.getSource();
System.out.println(map);
}
/**
* Update a document
* */
@Test
public void testUpdate() throws Exception{
UpdateRequestBuilder requestBuilder = client.prepareUpdate("crm", "department", "1");
// prepare the fields to change
HashMap<String, Object> map = new HashMap<>();
map.put("name","小卖部");
map.put("address","成都");
UpdateResponse response = requestBuilder.setDoc(map).get();
System.out.println(response);
}
/**
* Delete a document
* */
@Test
public void testDelete() throws Exception{
DeleteRequestBuilder requestBuilder = client.prepareDelete("crm", "department", "1");
DeleteResponse response = requestBuilder.get();
System.out.println(response);
}
/**
* Bulk add
* */
@Test
public void testBulkAdd() throws Exception{
BulkRequestBuilder requestBuilder = client.prepareBulk();
// prepare data
HashMap<String, Object> map1 = new HashMap<>();
map1.put("id",5);
map1.put("name","小卖部");
map1.put("address","成都");
// prepare data
HashMap<String, Object> map2 = new HashMap<>();
map2.put("id",6);
map2.put("name","公关部");
map2.put("address","武汉");
// prepare data
HashMap<String, Object> map3 = new HashMap<>();
map3.put("id",7);
map3.put("name","行政部");
map3.put("address","杭州");
BulkResponse responses = requestBuilder
.add(client.prepareIndex("crm", "department", "5").setSource(map1))
.add(client.prepareIndex("crm", "department", "6").setSource(map2))
.add(client.prepareIndex("crm", "department", "7").setSource(map3))
.get();
Iterator<BulkItemResponse> iterator = responses.iterator();
while (iterator.hasNext()) {
// print the per-item response of each bulk operation
System.out.println(iterator.next().getResponse());
}
}
/**
* Advanced query with paging
* */
@Test
public void testQueryPage() throws Exception{
SearchRequestBuilder searchRequestBuilder = client.prepareSearch("crm");
searchRequestBuilder.setTypes("department");
searchRequestBuilder.setFrom(0);
searchRequestBuilder.setSize(10);
searchRequestBuilder.addSort("id", SortOrder.DESC);
// build the query conditions
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
List<QueryBuilder> must = boolQueryBuilder.must();
// DSL query (analyzed, scored)
must.add(QueryBuilders.matchQuery("name" , "部"));
List<QueryBuilder> filter = boolQueryBuilder.filter();
// DSL filter (exact match, not scored)
filter.add(QueryBuilders.rangeQuery("id").lte(6).gte(1));
filter.add(QueryBuilders.termQuery("address","chengdu"));
/* Mind the analyzer when using term: searching for "成都" with term does not
analyze the search input, but the indexed value "成都" may already have been
tokenized in ES, so the term query can fail to match. */
searchRequestBuilder.setQuery(boolQueryBuilder);
SearchResponse searchResponse = searchRequestBuilder.get();
SearchHits hits = searchResponse.getHits();
System.out.println("总条数:"+hits.getTotalHits());
for (SearchHit hit : hits.getHits()) {
System.out.println(hit.getSource());
}
}
}