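I. Concepts
1.1 Basic concepts
ELK: an acronym formed from the initials of three products: Elasticsearch, Logstash, and Kibana.
Lucene: Apache's full-text search engine toolkit.
Elasticsearch: a document-oriented, schema-free database built on the Lucene full-text search engine. All configuration, monitoring, and data operations go through a RESTful interface, and the data format is JSON. By default it supports automatic node discovery, automatic data replication, automatic distribution and scaling, and automatic load balancing. It searches data sets of tens of millions of documents very efficiently. Elasticsearch can be thought of as Lucene combined with a RESTful interface and distributed technology.
Elasticsearch: HTTP access uses port 9200 by default.
Elasticsearch: TCP (transport protocol) access uses port 9300 by default.
Four ways to operate Elasticsearch:
Kibana: uses HTTP
Native API: uses TCP
REST API: uses HTTP
SDE (Spring Data Elasticsearch): uses TCP
TCP transport is more efficient than HTTP.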
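1.2 Elasticsearch concepts
Index: a logical area where data is stored, similar to a database in a relational database; it is the namespace for documents. In the figure below it is the lake-blue part, where the Index is twitter.
Type: similar to a Table in a relational database; it groups JSON documents that share a similar set of fields. In the figure below it is the yellow part, where the Type is tweet. Fields with the same name in different documents must have the same data type.
Document: the stored entity, similar to a Row in a relational database; a concrete record made up of a set of fields. In the figure below it is the orange part, which contains three fields: user, post_date, and message.
Field: the counterpart of a Column in a relational database; a component of a Document made up of two parts, a name and a value. In the figure below it is the purple part: post_date together with its value is one field.
Mapping: stores the mapping information for fields; different document types have different mappings.
Term: an indivisible word, the smallest unit of search. Different analyzers produce different results for the same content, and therefore different terms.
Token: one occurrence of a Term; it carries the Term's text, its start position in the document, and its type.
Node: corresponds to a database instance in a relational database.
Cluster: a group of service instances made up of multiple nodes.
Shard: relational databases have no such concept; it is the smallest unit of a Lucene search. An index may be spread over several shards, and different shards may live on different nodes. What Elasticsearch calls a shard is one Lucene index, and an Elasticsearch index is a collection of shards. When Elasticsearch runs a search, it sends the request to every shard of the index and then gathers the results from each shard into the final result. The number of primary shards cannot be changed after the index is created!
Replica: a backup copy of a shard; one copy is the primary shard and the others are called replica shards. Elasticsearch uses a push replication model: when you index a document on the primary shard, that shard copies the document to all of its replica shards, and those shards index the document as well.
When a document is indexed, Elasticsearch hashes the document id to decide which shard it belongs to, and then indexes and stores it on that shard.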
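The routing step described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Elasticsearch's implementation: Elasticsearch hashes the routing value (the document id by default) with Murmur3, whereas `String.hashCode()` below is a stand-in, and the class and method names are hypothetical. The sketch also suggests why the number of primary shards is fixed: the same id can land on a different shard once the shard count changes.

```java
public class ShardRoutingSketch {

    // Stand-in for the routing formula: hash(routing value) mod number of primary shards.
    // Assumption: String.hashCode() replaces the Murmur3 hash used by the real engine.
    static int shardFor(String docId, int numberOfPrimaryShards) {
        return Math.floorMod(docId.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        String[] ids = {"1", "2", "3", "IkXNN2wBr0WPOOKNJpRg"};
        for (String id : ids) {
            // The same id is routed differently when the shard count changes from 5 to 6.
            System.out.println(id + " -> shard " + shardFor(id, 5) + " (of 5), shard "
                    + shardFor(id, 6) + " (of 6)");
        }
    }
}
```

With this stand-in hash, id "1" goes to shard 4 when there are 5 shards and to shard 1 when there are 6, so documents indexed under the old count would no longer be found where the new formula looks for them.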
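Correspondence with a relational database:
MySQL | Elasticsearch
Database | Indices (the plural of index)
Table | Type (an index usually holds only one type)
Data (rows) | Document
Constraints (for example, which data type a column stores) | Mapping (defines each field's data type and analyzer)
Column | Field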
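II. Operating the index with Kibana
1. Connecting
2. Operations
Create a type and specify the attributes of each field (data type, whether it is stored, whether it is indexed, which analyzer)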
put ahd/_mapping/goods { "properties":{ "goodsName":{ "type":"text", "analyzer":"ik_max_word", "index":"true", "store":"true" }, "price":{ "type":"double", "index":"true", "store":"false" }, "brand":{ "type":"keyword", "index":"true", "store":"true" } } }
View the created index/mapping
get ahd/_mapping[/goods]
Create an index with 5 shards and 1 replica
put /heima { "settings":{ "number_of_shards":5, "number_of_replicas":1 } }
Create a second index
put ahd2
Create an index together with its fields
put ahd2 { "mappings":{ "goods":{ "properties":{ "goodsname":{ "analyzer":"ik_max_word", "type":"text", "store":"true", "index":"true" }, "price":{ "type":"double", "index":"true", "store":"true" }, "brand":{ "type":"text", "index":"true", "store":"true" } } } } }
Add a document with a specified id
post ahd/goods/1 { "goodsname":"华为p20手机", "brand":"华为", "price":2299 }
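Get a document by id
get ahd/goods/1
Update (indexing again with the same id overwrites the document)
post ahd/goods/1 { "goodsname":"华为p20手机", "brand":"华为", "price":2599 }
Insert a document without specifying an id (Elasticsearch generates one)
post ahd/goods { "goodsname":"小米手机6", "brand":"小米", "price":2500 }
For inserts it is best to use POST; for updates, use PUT.
When an id is specified, PUT and POST have the same effect.
Delete a document by id
delete ahd/goods/IkXNN2wBr0WPOOKNJpRg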
Custom (dynamic) templates
1. First create an index:
put ahd3 { "mappings":{ "goods":{ "properties":{ "image":{ "type":"text", "index":"false", "store":"true" }, "goodsname":{ "analyzer":"ik_max_word", "type":"text", "store":"true", "index":"true" }, "price":{ "type":"double", "index":"true", "store":"true" }, "brand":{ "type":"text", "index":"true", "store":"true" } } } } }
Then add a template to this index definition (by modifying the creation request):
put ahd3 { "mappings":{ "goods":{ "properties":{ "image":{ "type":"text", "index":"false", "store":"true" }, "goodsname":{ "analyzer":"ik_max_word", "type":"text", "store":"true", "index":"true" }, "price":{ "type":"double", "index":"true", "store":"true" }, "brand":{ "type":"text", "index":"true", "store":"true" } }, "dynamic_templates":[ { "mystring":{ "match_mapping_type":"string", "mapping":{ "type":"keyword" } } } ] } } }
When no id is specified, new documents can only be added with POST.
Add a new document to ahd3
post ahd3/goods { "goodsname":"小米6X手机", "price":1199, "image":"http://image.im.com/123.jpg", "brand":"小米" }
View the mapping of the goods type
get ahd3/_mapping/goods
========================= Queries (key topic) =========================
1. Match all
get ahd3/_search { "query":{ "match_all": {} } }
2. term query: exact match
get ahd3/_search { "query":{ "term":{ "goodsname":"小米" } } }
Note: the opening brace { must not be placed on the first line (the request line).
*. Add another document for testing:
post ahd3/goods { "goodsname":"大米", "brand":"吊牌", "price":200, "image":"http://localhost:8080/a.jpg" }
Run the query again:
get ahd3/_search { "query":{ "term":{ "goodsname": "小米" } } }
Insert one more document:
post ahd3/goods { "goodsname":"大米手机", "price":20000, "brand":"大米", "image":"http://baidu.com/a.jpg" }
3. match query (analyzed text) test
get ahd3/_search { "query":{ "match": { "brand":"米" } } }
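The difference between the term and match queries above can be illustrated with a toy inverted index. This is a sketch under stated assumptions rather than how Elasticsearch is implemented: the analyzer here is a hypothetical one that lowercases and splits on whitespace (real analyzers such as ik_max_word behave differently), and all class and method names are made up for the example. A term query looks its input up as-is, while a match query analyzes the input first and then matches any of the resulting terms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TermVsMatchSketch {

    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> invertedIndex = new HashMap<>();

    // Hypothetical toy analyzer: lowercase the text, then split on whitespace.
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    void index(int docId, String text) {
        for (String t : analyze(text)) {
            invertedIndex.computeIfAbsent(t, k -> new TreeSet<>()).add(docId);
        }
    }

    // term query: the input is NOT analyzed; it must equal an indexed term exactly.
    Set<Integer> termQuery(String exactTerm) {
        return invertedIndex.getOrDefault(exactTerm, Collections.emptySet());
    }

    // match query: the input is analyzed first; a document matching any term is returned (OR).
    Set<Integer> matchQuery(String queryText) {
        Set<Integer> result = new TreeSet<>();
        for (String t : analyze(queryText)) {
            result.addAll(termQuery(t));
        }
        return result;
    }

    static TermVsMatchSketch sample() {
        TermVsMatchSketch idx = new TermVsMatchSketch();
        idx.index(1, "Xiaomi 6X phone");
        idx.index(2, "Huawei P20 phone");
        idx.index(3, "rice");
        return idx;
    }

    public static void main(String[] args) {
        TermVsMatchSketch idx = sample();
        System.out.println(idx.termQuery("phone"));         // [1, 2]
        System.out.println(idx.termQuery("Xiaomi phone"));  // [] (no single term equals the phrase)
        System.out.println(idx.matchQuery("Xiaomi phone")); // [1, 2]
    }
}
```

Note that `termQuery("Xiaomi")` also returns nothing in this sketch, because the indexed term was lowercased to "xiaomi"; this mirrors why term queries on analyzed text fields often surprise people.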
2.4 Range query
get ahd3/_search { "query":{ "range":{ "price":{ "lte":1000, "gte":100 } } } }
Add a new document
post ahd3/goods { "goodsname":"appla", "brand":"apple", "price":5000, "image":"http://www.baidu.com/sadf.jpg" }
2.5 Fuzzy query (typo tolerance)
get ahd3/goods/_search { "query":{ "fuzzy":{ "goodsname":{ "value": "apple", "fuzziness": 1 } } } }
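The fuzziness value in the query above is an edit distance: the number of single-character changes needed to turn one term into another. The following sketch shows the idea with the classic Levenshtein distance. It is an illustration only: the class and method names are hypothetical, and it does not treat a swap of two adjacent characters as a single edit, which the real fuzzy query may do.

```java
public class EditDistanceSketch {

    // Classic Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions that turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // A term matches when it is within the allowed number of edits of the query value.
    static boolean fuzzyMatches(String indexedTerm, String queryValue, int fuzziness) {
        return editDistance(indexedTerm, queryValue) <= fuzziness;
    }

    public static void main(String[] args) {
        System.out.println(editDistance("appla", "apple"));    // 1
        System.out.println(fuzzyMatches("appla", "apple", 1)); // true
        System.out.println(fuzzyMatches("大米", "apple", 1));   // false
    }
}
```

This is why the document whose goodsname is "appla" is returned for the value "apple" with fuzziness 1: a single substitution separates the two.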
2.6 Bool compound query
get ahd3/goods/_search { "query":{ "bool": { "must":{ "match":{ "goodsname":"大米" } } } } }
Check that the JSON is written correctly
get ahd3/goods/_search { "query":{ "bool": { "must":[{ "match":{ "goodsname":"大米" } },{ "term":{ "brand":"大米" } } ] } } }
Filtering the returned fields
Show only goodsname
get ahd3/_search { "_source":{ "includes":["goodsname"] } }
Exclude goodsname
get ahd3/_search { "_source":{ "excludes":["goodsname"] } }
3.2 Filtering query results
get ahd3/_search { "query":{ "bool": { "must": { "term":{ "goodsname":"小米" } }, "filter":{ "range": { "price": { "gte": 10, "lte": 20000 } } } } } }
Pagination:
get ahd3/_search { "query":{ "match_all": {} }, "from":2, "size":2 }
Sort in descending order
get ahd3/_search { "query":{ "match_all": {} }, "sort":{ "price":"desc" } }
Highlighting
get ahd3/_search { "query":{ "term": { "goodsname": { "value": "小米" } } }, "highlight":{ "pre_tags":"", "post_tags":"", "fields":{ "goodsname":{} } } }
Aggregation:
get /ahd3/goods/_search { "size":0, "aggs":{ "populor_color":{ "terms": { "field": "price", "size": 10 } } } }
III. Operating the index with the native API (TCP: 9300)
2.1 Import the dependencies
<dependencies>
2.2 Operating the index with the native API
TransportClient client = new PreBuiltTransportClient(Settings.EMPTY)
public class EsManager {
private TransportClient client = null;
@Before
public void init() throws Exception{
client = new PreBuiltTransportClient(Settings.EMPTY)
.addTransportAddress(new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
}
@After
public void end(){
client.close();
}
}
Step 3: various queries
@Test
public void queryTest() throws Exception{
// QueryBuilder queryBuilder = QueryBuilders.matchAllQuery();
// QueryBuilder queryBuilder = QueryBuilders.matchQuery("goodsName","小米手机");
// QueryBuilder queryBuilder = QueryBuilders.termQuery("goodsName","小米");
// FuzzyQueryBuilder queryBuilder = QueryBuilders.fuzzyQuery("goodsName", "大米");
// queryBuilder.fuzziness(Fuzziness.ONE);
// QueryBuilder queryBuilder = QueryBuilders.rangeQuery("price").gte(1000).lte(2000);
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
queryBuilder.must(QueryBuilders.rangeQuery("price").gte(1000).lte(8000));
queryBuilder.mustNot(QueryBuilders.termQuery("goodsName", "华为"));
SearchResponse searchResponse = client.prepareSearch("heima").setQuery(queryBuilder).get();
SearchHits searchHits = searchResponse.getHits();
long totalHits = searchHits.getTotalHits();
System.out.println("Total hits: " + totalHits);
SearchHit[] hits = searchHits.getHits();
for (SearchHit hit : hits) {
String sourceAsString = hit.getSourceAsString();
Goods goods = JSON.parseObject(sourceAsString, Goods.class);
System.out.println(goods);
}
}
IV. Operating the index with the REST API (HTTP: 9200)
3.1 Maven coordinates
3.2 Operating the index with the REST API
1. Initialize the client
private RestHighLevelClient client = null;
2. Prepare the POJO (using Lombok)
@Data
// Create or update: IndexRequest
Item item = new Item(1L,"大米6X手机","手机","小米",1199.0,"http.jpg");
String jsonStr = gson.toJson(item);
IndexRequest request = new IndexRequest("item","docs",item.getId().toString());
request.source(jsonStr, XContentType.JSON);
client.index(request, RequestOptions.DEFAULT);
Updating a document
Use the same index call as for creation above; it both creates and updates.
Getting a document by id
GetRequest request = new GetRequest("item","docs","1");
GetResponse getResponse = client.get(request, RequestOptions.DEFAULT);
String sourceAsString = getResponse.getSourceAsString();
Item item = gson.fromJson(sourceAsString, Item.class);
System.out.println(item);
Deleting a document
DeleteRequest deleteRequest = new DeleteRequest("item","docs","1");
client.delete(deleteRequest,RequestOptions.DEFAULT);
Bulk-adding documents
// Prepare the documents:
List<Item> list = new ArrayList<>();
list.add(new Item(1L, "小米手机7", "手机", "小米", 3299.00,"http://image.leyou.com/13123.jpg"));
list.add(new Item(2L, "坚果手机R1", "手机", "锤子", 3699.00,"http://image.leyou.com/13123.jpg"));
list.add(new Item(3L, "华为META10", "手机", "华为", 4499.00,"http://image.leyou.com/13123.jpg"));
list.add(new Item(4L, "小米Mix2S", "手机", "小米", 4299.00,"http://image.leyou.com/13123.jpg"));
list.add(new Item(5L, "荣耀V10", "手机", "华为", 2799.00,"http://image.leyou.com/13123.jpg"));
BulkRequest bulkRequest = new BulkRequest();
for (Item item : list) {
bulkRequest.add(new IndexRequest("item","docs",item.getId().toString()).source(JSON.toJSONString(item),XContentType.JSON)) ;
}
client.bulk(bulkRequest,RequestOptions.DEFAULT);
Various queries
@Test
public void testQuery() throws Exception{
SearchRequest searchRequest = new SearchRequest("item");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Alternatives: each query(...) call replaces the previous one, so only the last call takes effect.
// Keep the one you want to run and comment out the others.
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.query(QueryBuilders.termQuery("title","小米"));
searchSourceBuilder.query(QueryBuilders.matchQuery("title","小米手机"));
searchSourceBuilder.query(QueryBuilders.fuzzyQuery("title","大米").fuzziness(Fuzziness.ONE));
searchSourceBuilder.query(QueryBuilders.rangeQuery("price").gte(3000).lte(4000));
searchSourceBuilder.query(QueryBuilders.boolQuery().must(QueryBuilders.termQuery("title","手机"))
.must(QueryBuilders.rangeQuery("price").gte(3000).lte(3500)));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
SearchHits searchHits = searchResponse.getHits();
long total = searchHits.getTotalHits();
System.out.println("Total hits: " + total);
SearchHit[] hits = searchHits.getHits();
for (SearchHit hit : hits) {
String sourceAsString = hit.getSourceAsString();
Item item = JSON.parseObject(sourceAsString, Item.class);
System.out.println(item);
}
}
Filtering
1) Filtering the returned source fields
searchSourceBuilder.fetchSource(new String[]{"title","category"},null);
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
2) Filtering query results
searchSourceBuilder.query(QueryBuilders.termQuery("title","手机"));
searchSourceBuilder.postFilter(QueryBuilders.termQuery("brand","小米"));
Pagination
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.from(0); // offset of the first hit to return
searchSourceBuilder.size(3); // number of hits per page
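Because `from` is an offset rather than a page number, a small conversion is usually needed when the caller thinks in pages. The helper below is a hypothetical sketch (the class and method names are not part of any Elasticsearch API) that assumes 1-based page numbers.

```java
public class PageOffsetSketch {

    // Converts a 1-based page number and a page size into the `from` offset.
    static int fromOffset(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be at least 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 3)); // 0: first page of 3 hits, as in from(0)/size(3)
        System.out.println(fromOffset(2, 2)); // 2: second page of 2 hits, as in "from":2, "size":2
    }
}
```

The result would be passed to `searchSourceBuilder.from(...)`, with the page size going to `searchSourceBuilder.size(...)`.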
Sorting
searchSourceBuilder.sort("id", SortOrder.ASC); // argument 1: the field to sort by; argument 2: the sort order
Highlighting
Build the highlight conditions
searchSourceBuilder.query(QueryBuilders.termQuery("title","小米"));
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.preTags("");
highlightBuilder.postTags("");
highlightBuilder.field("title");
searchSourceBuilder.highlighter(highlightBuilder);
Parse the highlighted results
for (SearchHit hit : hits) {
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
HighlightField highlightField = highlightFields.get("title");
String title = highlightField.getFragments()[0].toString();
String sourceAsString = hit.getSourceAsString();
Item item = JSON.parseObject(sourceAsString, Item.class);
item.setTitle(title);
System.out.println(item);
}
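What the highlighted fragment contains can be pictured as the field text with each matched term wrapped in the configured pre and post tags. The sketch below imitates that with plain string replacement. It is an assumption-laden illustration: the tag values in these notes are empty, so `<em>` and `</em>` (Elasticsearch's default highlight tags) are used here, and the class and method names are hypothetical.

```java
public class HighlightSketch {

    // Wraps every literal occurrence of the matched term in the given tags.
    static String highlight(String text, String term, String preTag, String postTag) {
        return text.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        // Assumed tags: <em> and </em>
        System.out.println(highlight("小米手机7", "小米", "<em>", "</em>")); // <em>小米</em>手机7
    }
}
```

This is why the parsing loop above copies the fragment back into the title: the fragment is the title with the tags already inserted.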
Aggregation
Requirement: count the documents per brand
Code that builds the conditions
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.aggregation(AggregationBuilders.terms("brandAvg").field("brand"));
Parse the results:
Aggregations aggregations = searchResponse.getAggregations();
Terms terms = aggregations.get("brandAvg");
List<? extends Terms.Bucket> buckets = terms.getBuckets();
for (Terms.Bucket bucket : buckets) {
System.out.println(bucket.getKeyAsString()+":"+bucket.getDocCount());
}
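Conceptually, the terms aggregation above groups documents by the value of a field and counts each group; every group becomes a bucket with a key and a document count. The following sketch reproduces that grouping in memory with Java streams, using the brands of the five bulk-indexed items. The class and method names are hypothetical, and unlike Elasticsearch (which by default orders buckets by document count, descending) this sketch keeps first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TermsAggregationSketch {

    // Groups the brand values and counts how many documents fall into each bucket.
    static Map<String, Long> countByBrand(List<String> brands) {
        return brands.stream()
                .collect(Collectors.groupingBy(b -> b, LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> brands = Arrays.asList("小米", "锤子", "华为", "小米", "华为");
        // Same output shape as the bucket loop: key:docCount
        countByBrand(brands).forEach((brand, docCount) -> System.out.println(brand + ":" + docCount));
    }
}
```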
V. Operating the index with Spring Data Elasticsearch
1. Preparing the environment
1) Add the dependency
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
2) Create the bootstrap class
@SpringBootApplication
public class EsApplication {
public static void main(String[] args) {
SpringApplication.run(EsApplication.class,args);
}
}
3) Add the configuration file application.yml
spring:
  data:
    elasticsearch:
      cluster-name: leyou-elastic
      cluster-nodes: 127.0.0.1:9301,127.0.0.1:9302,127.0.0.1:9303
4) Create a test class and inject the template provided by SDE
@RunWith(SpringRunner.class)
@SpringBootTest
public class SpringDataEsManager {
@Autowired
private ElasticsearchTemplate elasticsearchTemplate;
}
Kibana: HTTP
Native API: TCP
REST API: HTTP
SDE: TCP
2. Operating the index and mapping
Step 1: prepare a POJO and define how it maps to the index
@Data
@AllArgsConstructor
@NoArgsConstructor
@Document(indexName="leyou",type = "goods",shards = 3,replicas = 1)
public class Goods implements Serializable{
@Field(type = FieldType.Long)
private Long id;
@Field(type = FieldType.Text,analyzer = "ik_max_word",store = true)
private String title; // title
@Field(type = FieldType.Keyword,index = true,store = true)
private String category; // category
@Field(type = FieldType.Keyword,index = true,store = true)
private String brand; // brand
@Field(type = FieldType.Double,index = true,store = true)
private Double price; // price
@Field(type = FieldType.Keyword,index = false,store = true)
private String images; // image URL
}
Step 2: create the index and the mapping
@Test
public void addIndexAndMapping(){
// elasticsearchTemplate.createIndex(Goods.class); // create the index from the annotations on the POJO
elasticsearchTemplate.putMapping(Goods.class); // create the mapping from the annotations on the POJO
}
3. Document operations
// Create or update
// Goods goods = new Goods(1L,"大米6X手机","手机","小米",1199.0,"http.jpg");
// goodsRespository.save(goods); // save or update
// Get by id
// Optional<Goods> optional = goodsRespository.findById(1L);
// Goods goods = optional.get();
// System.out.println(goods);
// Delete
// goodsRespository.deleteById(1L);
// Bulk insert
/* List<Goods> list = new ArrayList<>();
list.add(new Goods(1L, "小米手机7", "手机", "小米", 3299.00,"http://image.leyou.com/13123.jpg"));
list.add(new Goods(2L, "坚果手机R1", "手机", "锤子", 3699.00,"http://image.leyou.com/13123.jpg"));
list.add(new Goods(3L, "华为META10", "手机", "华为", 4499.00,"http://image.leyou.com/13123.jpg"));
list.add(new Goods(4L, "小米Mix2S", "手机", "小米", 4299.00,"http://image.leyou.com/13123.jpg"));
list.add(new Goods(5L, "荣耀V10", "手机", "华为", 2799.00,"http://image.leyou.com/13123.jpg"));
goodsRespository.saveAll(list);*/
4. Queries
4.1 Queries built into goodsRespository
// Iterable
// Iterable
Iterable
for (Goods goods : goodsList) {
System.out.println(goods);
}
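4.2 Custom query methods
Methods declared in the repository interface according to the naming rules can be used directly, without writing an implementation.
public interface GoodsRespository extends ElasticsearchRepository<Goods, Long> {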
public List
public List
public List
public List
public List
}
Usage:
// List
List
for (Goods goods : goodsList) {
System.out.println(goods);
}
5. Combining Spring Data Elasticsearch with native API queries
1) Combining with a native query
@Test
public void testQuery(){
NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
nativeSearchQueryBuilder.withQuery(QueryBuilders.termQuery("title", "小米"));
// nativeSearchQueryBuilder.withQuery(QueryBuilders.matchAllQuery());
// nativeSearchQueryBuilder.withPageable(PageRequest.of(0,3,Sort.by(Sort.Direction.DESC,"price")));
nativeSearchQueryBuilder.addAggregation(AggregationBuilders.terms("brandAvg").field("brand"));
AggregatedPage
Aggregations aggregations = aggregatedPage.getAggregations();
Terms terms = aggregations.get("brandAvg");
List<? extends Terms.Bucket> buckets = terms.getBuckets();
for (Terms.Bucket bucket : buckets) {
System.out.println(bucket.getKeyAsString()+bucket.getDocCount());
}
List<Goods> content = aggregatedPage.getContent();
for (Goods goods : content) {
System.out.println(goods);
}
}
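2) Handling highlighting yourself
You need to define a custom implementation class that processes the highlights
class GoodsHighLightResultMapper implements SearchResultMapper{
@Override
public <T> AggregatedPage<T> mapResults(SearchResponse searchResponse, Class<T> aClass, Pageable pageable) {
List<T> content = new ArrayList<>();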
Aggregations aggregations = searchResponse.getAggregations();
String scrollId = searchResponse.getScrollId();
SearchHits searchHits = searchResponse.getHits();
long total = searchHits.getTotalHits();
float maxScore = searchHits.getMaxScore();
for (SearchHit searchHit : searchHits) {
String sourceAsString = searchHit.getSourceAsString();
T t = JSON.parseObject(sourceAsString, aClass);
Map<String, HighlightField> highlightFields = searchHit.getHighlightFields();
HighlightField highlightField = highlightFields.get("title");
String title = highlightField.getFragments()[0].toString();
try {
BeanUtils.setProperty(t,"title",title);
} catch (Exception e) {
e.printStackTrace();
}
content.add(t);
}
return new AggregatedPageImpl<>(content, pageable, total, aggregations, scrollId, maxScore);
// List
}
}
3) Usage