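First bookmark the official docs (link), then skim other people's notes (notes 1 link, notes 2 link), and pretend I know it all.
Why has ES become the first choice for full-text search? It can store, search, and analyze massive amounts of data quickly. All of this rests on its inverted index, illustrated below:
For the stored records: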
1-红海行动
2-探索红海行动
3-红海特别行动
4-红海记录篇
5-特工红海特别探索
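To make the idea concrete, here is a minimal Java sketch of an inverted index over the five records above. The term list is an assumption made for illustration (a real analyzer such as IK decides how the text is split), and the class and method names are hypothetical; this is not how ES stores its index internally.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class InvertedIndexSketch {
    // Assumed term list; a real analyzer (e.g. IK) would produce the terms.
    static final List<String> TERMS =
            Arrays.asList("红海", "行动", "探索", "特别", "记录篇", "特工");

    // The five records from the text, keyed by document id.
    static Map<Integer, String> sampleDocs() {
        Map<Integer, String> docs = new LinkedHashMap<>();
        docs.put(1, "红海行动");
        docs.put(2, "探索红海行动");
        docs.put(3, "红海特别行动");
        docs.put(4, "红海记录篇");
        docs.put(5, "特工红海特别探索");
        return docs;
    }

    // Map each term to the sorted ids of the documents that contain it.
    static Map<String, SortedSet<Integer>> build(Map<Integer, String> docs) {
        Map<String, SortedSet<Integer>> index = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> doc : docs.entrySet()) {
            for (String term : TERMS) {
                if (doc.getValue().contains(term)) {
                    index.computeIfAbsent(term, t -> new TreeSet<>()).add(doc.getKey());
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Prints one line per term, e.g. "行动 -> [1, 2, 3]"
        build(sampleDocs()).forEach((term, ids) -> System.out.println(term + " -> " + ids));
    }
}
```

A search for 红海行动 would then be split into 红海 and 行动 and answered by looking up the two posting lists instead of scanning every record; records 1, 2 and 3 appear in both lists.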
Installing Elasticsearch in Docker
(1) Pull the Elasticsearch and Kibana images
docker pull elasticsearch:7.4.2
docker pull kibana:7.4.2
(2) Configuration
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/
(3) Start Elasticsearch
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.4.2
Make Elasticsearch start automatically on boot
docker update elasticsearch --restart=always
(4) Start Kibana:
docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.1.100:9200 -p 5601:5601 -d kibana:7.4.2
Note that 192.168.1.100:9200 here is the address of ES.
Make Kibana start automatically on boot
docker update kibana --restart=always
(5) Test
Check the Elasticsearch version info: http://localhost:9200/
Open Kibana: http://192.168.1.100:5601/app/kibana
Create the index and add the field mapping
PUT gulimall_product
{
"mappings": {
"properties": {
"attrs": {
"type": "nested",
"properties": {
"attrId": {
"type": "long"
},
"attrName": {
"type": "keyword"
},
"attrValue": {
"type": "keyword"
}
}
},
"brandId": {
"type": "long"
},
"brandImg": {
"type": "keyword"
},
"brandName": {
"type": "keyword"
},
"catalogId": {
"type": "long"
},
"catalogName": {
"type": "keyword"
},
"hasStock": {
"type": "boolean"
},
"hotScore": {
"type": "long"
},
"saleCount": {
"type": "long"
},
"skuId": {
"type": "long"
},
"skuImg": {
"type": "keyword"
},
"skuPrice": {
"type": "keyword"
},
"skuTitle": {
"type": "text",
"analyzer": "ik_smart"
},
"spuId": {
"type": "keyword"
}
}
}
}
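Notice the "type": "nested" in the mapping. What happens without it? By default ES flattens arrays of objects, so the link between the fields of one object is lost.
To keep each attribute object independent, the field type has to be nested. The nested query is illustrated with the following example:
This is a sample attribute list for one product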
"attrs" : [
{
"attrId" : 10,
"attrName" : "上市年份",
"attrValue" : "2020"
},
{
"attrId" : 11,
"attrName" : "品牌名",
"attrValue" : "mate30"
},
{
"attrId" : 12,
"attrName" : "CPU",
"attrValue" : "麒麟"
},
{
"attrId" : 13,
"attrName" : "屏幕刷新率",
"attrValue" : "120HZ"
}
],
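This field is of the nested type, which maintains the independence of each object in the array.
Now suppose we want the products that match a set of attribute requirements, for example: attribute 12 must be 鲲鹏 and attribute 11 must be xiaomi. The DSL looks like this.
Here is one of the nested queries expanded. It runs against each independent object in the attrs array named by path, so matching two independent objects takes two nested queries.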
"nested": {
// nested query
"path": "attrs",
"query": {
"bool": {
"must": [ // filter by attribute id and attribute value
{
"term": {
"attrs.attrId": {
"value": "12"
}
}
},
{
"terms": {
"attrs.attrValue": [
"鲲鹏"
]
}
}
]
}
}
}
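The difference between the flattened default and the nested type can be simulated in plain Java. The sketch below is only an illustration of the matching semantics, using the sample attribute list from above; the class and method names are hypothetical and nothing here talks to ES.

```java
import java.util.Arrays;
import java.util.List;

class NestedVsFlattenedSketch {
    static final class Attr {
        final long attrId;
        final String attrValue;

        Attr(long attrId, String attrValue) {
            this.attrId = attrId;
            this.attrValue = attrValue;
        }
    }

    // Sample attributes of one product, taken from the text.
    static List<Attr> sampleAttrs() {
        return Arrays.asList(
                new Attr(10, "2020"),
                new Attr(11, "mate30"),
                new Attr(12, "麒麟"),
                new Attr(13, "120HZ"));
    }

    // Flattened view: attrId and attrValue become two unrelated value lists,
    // so the two conditions may be satisfied by different objects.
    static boolean flattenedMatch(List<Attr> attrs, long id, String value) {
        boolean idFound = attrs.stream().anyMatch(a -> a.attrId == id);
        boolean valueFound = attrs.stream().anyMatch(a -> a.attrValue.equals(value));
        return idFound && valueFound;
    }

    // Nested view: both conditions must hold on the same object.
    static boolean nestedMatch(List<Attr> attrs, long id, String value) {
        return attrs.stream().anyMatch(a -> a.attrId == id && a.attrValue.equals(value));
    }

    public static void main(String[] args) {
        List<Attr> attrs = sampleAttrs();
        System.out.println(flattenedMatch(attrs, 12, "mate30")); // true (false positive)
        System.out.println(nestedMatch(attrs, 12, "mate30"));    // false
        System.out.println(nestedMatch(attrs, 12, "麒麟"));       // true
    }
}
```

Attribute 12 is 麒麟, not mate30, yet the flattened check still reports a match because the id and the value come from different objects. That is the false positive the nested type avoids.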
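ES ships with quite a few built-in analyzers, but they do not handle Chinese word segmentation well.
1. Install the IK analyzer
Download the release that matches your ES version from link and copy it into the ES plugins directory, i.e. the directory mounted when ES was started: -v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins. Restart the container afterwards so the plugin is loaded.
2. Custom dictionary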
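The analyzer may not cope well with some popular Chinese expressions, so you can extend the dictionary yourself by editing /mydata/elasticsearch/plugins/ik/config/IKAnalyzer.cfg.xml
<properties>
<comment>IK Analyzer extension configuration</comment>
<entry key="ext_dict"></entry>
<entry key="ext_stopwords"></entry>
<entry key="remote_ext_dict">http://192.168.1.100/es/fenci.txt</entry>
</properties>
PS: http://192.168.1.100/es/fenci.txt is a static resource served by nginx; once nginx is configured it can be accessed directly. More on that later.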
Use the Elasticsearch REST client (elasticsearch-rest-high-level-client) link
<properties>
<elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency>
Configuration class that registers a RestHighLevelClient bean in the container
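import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration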
public class ElasticSearchConfig {
public static final RequestOptions COMMON_OPTIONS;
static {
RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
// builder.addHeader("Authorization", "Bearer " + TOKEN);
// builder.setHttpAsyncResponseConsumerFactory(
// new HttpAsyncResponseConsumerFactory
// .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));
COMMON_OPTIONS = builder.build();
}
@Bean
public RestHighLevelClient esRestClient() {
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(
new HttpHost("localhost", 9200, "http")));
return client;
}
}
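Putting a product on the shelf uploads its attributes to ES, which lays the groundwork for the search service. Listing is handled by the merchant-facing back-office system: the up method in the product service calls the search service, which saves the data to ES.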
com/atguigu/gulimall/search/impl/ProductSaveServiceImpl.java
@Slf4j
@Service("productsaveservice")
public class ProductSaveServiceImpl implements ProductSaveService {
@Autowired
RestHighLevelClient esClient;
@Override
public boolean productStatusUp(List<SkuEsModel> skuEsModelList) throws IOException {
// 1. Create a bulk request that will write to the product index
BulkRequest bulkRequest = new BulkRequest();
// 2. Add one index request per SKU
for (SkuEsModel skuEsModel : skuEsModelList) {
IndexRequest indexRequest = new IndexRequest(EsConstant.PRODUCT_INDEX);
// Use the SKU id as the document id
indexRequest.id(skuEsModel.getSkuId().toString());
indexRequest.source(JSON.toJSONString(skuEsModel), XContentType.JSON);
bulkRequest.add(indexRequest);
}
// 3. Execute the bulk request
BulkResponse bulk = esClient.bulk(bulkRequest, ElasticSearchConfig.COMMON_OPTIONS);
// 4. If anything failed, log only the ids of the failed items
boolean hasFailures = bulk.hasFailures();
if (hasFailures) {
List<String> failedIds = Arrays.stream(bulk.getItems())
.filter(item -> item.isFailed())
.map(item -> item.getId())
.collect(Collectors.toList());
log.error("Product listing failed for SKU ids: {}", failedIds);
}
// Note: the return value is true when at least one item failed
return hasFailures;
}
}
The query is rather long and hard to read when pasted here; it is easier on the eyes in Kibana.
/home/xu/PersonProjects/IdeaProjects/guimail/gulimall-search/src/main/resources/dsl.json
GET gulimall_product/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"skuTitle": "华为" // match on the search keyword
}
}
],
"filter": [
{
"term": {
"catalogId": "225" // filter by category id
}
},
{
"terms": {
"brandId": [ // brand ids
"1",
"5",
"9"
]
}
},
{
"nested": {
// nested query
"path": "attrs",
"query": {
"bool": {
"must": [ // filter by attribute id and attribute value
{
"term": {
"attrs.attrId": {
"value": "8"
}
}
},
{
"terms": {
"attrs.attrValue": [
"2019"
]
}
}
]
}
}
}
},
{
"term": {
// whether the product is in stock
"hasStock": {
"value": "false"
}
}
},
{
"range": {
// price range
"skuPrice": {
"gte": 0,
"lte": 7000
}
}
}
]
}
},
"sort": [ // sorting
{
"skuPrice": {
"order": "desc"
}
}
],
"from": 0,
"size":4,
"highlight": {
// highlight the search keyword
"fields": {
"skuTitle": {
}},
"pre_tags": "",
"post_tags": ""
},
"aggs": {
"brand_agg": {
// aggregate by brand
"terms": {
"field": "brandId",
"size": 10
},
"aggs": {
"brand_name_agg": {
// brand name
"terms": {
"field": "brandName",
"size": 10
}
},
"brand_img_agg": {
// brand image
"terms": {
"field": "brandImg",
"size": 10
}
}
}
},
"catalog_agg": {
// category
"terms": {
"field": "catalogId",
"size": 10
},
"aggs": {
"catalog_name_agg": {
// category name
"terms": {
"field": "catalogName",
"size": 10
}
}
}
},
"attr_agg":{
"nested": {
// nested aggregation
"path": "attrs"
},
"aggs": {
// aggregate by attribute
"attr_id_agg": {
"terms": {
"field": "attrs.attrId",
"size": 10
},
"aggs": {
"attr_name_agg": {
// attribute name
"terms": {
"field": "attrs.attrName",
"size": 10
}
},
"attr_value_agg":{
// attribute value
"terms": {
"field": "attrs.attrValue",
"size": 10
}
}
}
}
}
}
}
}
There is too much code here, so I only show the part that translates the DSL into Java; for the handling of the query results, see the file.
/home/xu/PersonProjects/IdeaProjects/guimail/gulimall-search/src/main/java/com/atguigu/gulimall/search/impl/MallSearchServiceImpl.java
@Autowired
RestHighLevelClient esClient;
// 1. Build the search request
SearchRequest searchRequest = buildSearchRequest(param);
try {
// 2. Execute the search request
SearchResponse response = esClient.search(searchRequest, ElasticSearchConfig.COMMON_OPTIONS);
...
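/**
* Build the search request.
* Covers keyword matching, filtering (by attribute, category, brand, price range, stock), sorting, pagination, highlighting, and aggregations.
*
* @return the assembled SearchRequest
*/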
private SearchRequest buildSearchRequest(SearchParam param){
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder(); // builds the DSL
/**
* Keyword matching and filtering (by attribute, category, brand, price range, stock)
*/
// 1. Build the bool query
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
// 1.1 must: match the keyword against the title
if (!StringUtils.isEmpty(param.getKeyword())) {
boolQuery.must(QueryBuilders.matchQuery("skuTitle", param.getKeyword()));
}
// 1.2 bool - filter: by third-level category id
if (param.getCatalog3Id() != null) {
boolQuery.filter(QueryBuilders.termQuery("catalogId", param.getCatalog3Id()));
}
// 1.2 bool - filter: by brand ids
if (param.getBrandId() != null && param.getBrandId().size() > 0) {
boolQuery.filter(QueryBuilders.termsQuery("brandId", param.getBrandId()));
}
// 1.2 bool - filter: by every requested attribute (I still do not quite get the attrs=1_5寸:8寸 design)
if (param.getAttrs() != null && param.getAttrs().size() > 0) {
for (String attr : param.getAttrs()) {
// attrs=1_5寸:8寸&attrs=2_16G:8G
BoolQueryBuilder nestedboolQuery = QueryBuilders.boolQuery();
String[] s = attr.split("_");
String attrId = s[0]; // id of the attribute to filter on
String[] attrValues = s[1].split(":"); // accepted values for that attribute
nestedboolQuery.must(QueryBuilders.termQuery("attrs.attrId", attrId));
nestedboolQuery.must(QueryBuilders.termsQuery("attrs.attrValue", attrValues));
// every attribute needs its own nested query
NestedQueryBuilder nestedQuery = QueryBuilders.nestedQuery("attrs", nestedboolQuery, ScoreMode.None);
boolQuery.filter(nestedQuery);
}
}
// 1.2 bool - filter: by stock availability
boolQuery.filter(QueryBuilders.termQuery("hasStock", param.getHasStock() == 1));
// 1.2 bool - filter: by price range
/**
* Accepted forms: 1_500 / _500 / 500_
*/
if (!StringUtils.isEmpty(param.getSkuPrice())) {
RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("skuPrice");
String[] s = param.getSkuPrice().split("_");
if (s.length == 2) {
// "1_500" -> ["1", "500"]; "_500" -> ["", "500"], so skip an empty lower bound
if (!s[0].isEmpty()) {
rangeQuery.gte(s[0]);
}
rangeQuery.lte(s[1]);
} else if (s.length == 1 && param.getSkuPrice().endsWith("_")) {
// "500_" -> ["500"]: lower bound only
rangeQuery.gte(s[0]);
}
boolQuery.filter(rangeQuery);
}
// wrap all of the conditions above into the query
sourceBuilder.query(boolQuery);
/**
* Sorting, pagination, highlighting
*/
// 2.1 Sorting
if (!StringUtils.isEmpty(param.getSort())) {
String sort = param.getSort();
// sort=hotScore_asc/desc
String[] s = sort.split("_");
SortOrder order = s[1].equalsIgnoreCase("asc") ? SortOrder.ASC : SortOrder.DESC;
sourceBuilder.sort(s[0], order);
}
// 2.2 Pagination, e.g. with pageSize = 5
// pageNum 1: from 0, size 5 -> [0,1,2,3,4]
// pageNum 2: from 5, size 5
// from = (pageNum - 1) * size
sourceBuilder.from((param.getPageNum() - 1) * EsConstant.PRODUCT_PAGESIZE);
sourceBuilder.size(EsConstant.PRODUCT_PAGESIZE);
// 2.3 Highlighting
if (!StringUtils.isEmpty(param.getKeyword())) {
HighlightBuilder builder = new HighlightBuilder();
builder.field("skuTitle");
builder.preTags("");
builder.postTags("");
sourceBuilder.highlighter(builder);
}
/**
* Aggregations
*/
// 1. Brand aggregation
TermsAggregationBuilder brand_agg = AggregationBuilders.terms("brand_agg");
brand_agg.field("brandId").size(50);
// sub-aggregations of the brand aggregation
brand_agg.subAggregation(AggregationBuilders.terms("brand_name_agg").field("brandName").size(2));
brand_agg.subAggregation(AggregationBuilders.terms("brand_img_agg").field("brandImg").size(2));
// TODO 1. brand aggregation
sourceBuilder.aggregation(brand_agg);
// 2. Category aggregation
TermsAggregationBuilder catalog_agg = AggregationBuilders.terms("catalog_agg").field("catalogId").size(20);
catalog_agg.subAggregation(AggregationBuilders.terms("catalog_name_agg").field("catalogName").size(1));
// TODO 2. category aggregation
sourceBuilder.aggregation(catalog_agg);
// 3. Attribute aggregation attr_agg
NestedAggregationBuilder attr_agg = AggregationBuilders.nested("attr_agg", "attrs");
// collect every attrId in the current result set
TermsAggregationBuilder attr_id_agg = AggregationBuilders.terms("attr_id_agg").field("attrs.attrId");
// the name that belongs to each attr_id
attr_id_agg.subAggregation(AggregationBuilders.terms("attr_name_agg").field("attrs.attrName").size(1));
// the possible attrValue values for each attr_id
attr_id_agg.subAggregation(AggregationBuilders.terms("attr_value_agg").field("attrs.attrValue").size(50));
attr_agg.subAggregation(attr_id_agg);
// TODO 3. attribute aggregation
sourceBuilder.aggregation(attr_agg);
String s = sourceBuilder.toString();
System.out.println("Built DSL: " + s);
SearchRequest searchRequest = new SearchRequest(new String[]{
EsConstant.PRODUCT_INDEX}, sourceBuilder);
return searchRequest;
}
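The string handling in buildSearchRequest is easy to get wrong, so here is a small standalone sketch of just the parsing steps, with no ES dependency. The class and method names are hypothetical, and the page size of 5 in the usage is the example value from the comments above, not the real EsConstant.PRODUCT_PAGESIZE. The attribute format is read as attrId_value1:value2, and the price range is split with a limit of -1 so that a trailing "_" is not silently dropped.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SearchParamSketch {
    // "1_5寸:8寸" -> "1"
    static String attrId(String attr) {
        return attr.split("_", 2)[0];
    }

    // "1_5寸:8寸" -> ["5寸", "8寸"]; empty list when no values are given
    static List<String> attrValues(String attr) {
        String[] parts = attr.split("_", 2);
        if (parts.length < 2 || parts[1].isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(parts[1].split(":"));
    }

    // Returns {lower, upper}; a null entry means that side of the range is open.
    static String[] priceBounds(String skuPrice) {
        // limit -1 keeps trailing empty strings, so "500_" -> ["500", ""]
        String[] parts = skuPrice.split("_", -1);
        String lower = parts.length > 0 && !parts[0].isEmpty() ? parts[0] : null;
        String upper = parts.length > 1 && !parts[1].isEmpty() ? parts[1] : null;
        return new String[] {lower, upper};
    }

    // Page numbers start at 1, so page 1 starts at offset 0.
    static int from(int pageNum, int pageSize) {
        return (pageNum - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(attrId("1_5寸:8寸") + " " + attrValues("1_5寸:8寸")); // 1 [5寸, 8寸]
        System.out.println(Arrays.toString(priceBounds("1_500"))); // [1, 500]
        System.out.println(Arrays.toString(priceBounds("_500")));  // [null, 500]
        System.out.println(Arrays.toString(priceBounds("500_")));  // [500, null]
        System.out.println(from(2, 5)); // 5
    }
}
```

With the default split("_"), "_500" yields ["", "500"] and "500_" yields ["500"], which is why branching on the array length alone cannot tell an open lower bound from a closed one.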