Downloads:
ElasticSearch: https://mirrors.huaweicloud.com/elasticsearch/?C=N&O=D
logstash: https://mirrors.huaweicloud.com/logstash/?C=N&O=D
kibana: https://mirrors.huaweicloud.com/kibana/?C=N&O=D
IK analyzer: https://github.com/medcl/elasticsearch-analysis-ik/releases
cerebro: https://github.com/lmenezes/cerebro/releases
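head plugin: https://github.com/mobz//archive/master.zip
For anyone who wants to learn Elasticsearch, this Baidu Cloud share bundles the Windows zip packages of elasticsearch, head, the IK analyzer and kibana; unzip and they are ready to use!
Link: https://pan.baidu.com/s/11IWFi1FrOUNIx4cqBM3CZA
Access code: 3y4t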
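Introduction: Elasticsearch is an open-source, highly scalable, highly available distributed full-text search engine. It stores and retrieves data in near real time, can scale out to hundreds of servers, and handles PB-scale data (1 PB is roughly 1000 TB). It is written in Java and uses Lucene at its core for indexing and search; its goal is to hide Lucene's complexity behind a RESTful API so that full-text search becomes simple. It covers full-text search, structured search and analytics; its advantage over Solr shows most clearly with real-time indexing and large data volumes.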
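Installation:
Requirement: JDK >= 1.8. Download and unzip; it is ready to use.
Usage:
Directories: bin: startup scripts
config: configuration files
log4j2.properties: logging configuration
jvm.options: JVM settings (-Xms1g starts the JVM with a 1 GB heap)
elasticsearch.yml: Elasticsearch configuration file
lib: bundled jar packages
modules: functional modules (you can add your own)
plugins: plugins
Start: run bin\elasticsearch.bat
Visit: http://127.0.0.1:9200/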
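Installing the head web UI:
Unzip: that is all
Install dependencies: npm install (if it fails, run the commands below)
In cmd, run these two commands:
1. Switch the registry mirror
npm config set registry http://registry.npm.taobao.org
2. Install cnpm
npm install -g cnpm --registry=https://registry.npm.taobao.org
Then cd into the ElasticSearch-head directory and run npm install again; the dependencies should now install successfully.
Start: npm run start
Visit: http://localhost:9100
The page reports a cross-origin (CORS) error at first!
Fix: open elasticsearch.yml, scroll to the end, and enable CORS:
http.cors.enabled: true
http.cors.allow-origin: "*"
Restart Elasticsearch and it works!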
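Usage: think of an index as a database
Installing Kibana (the version must match the ES version)
Background: ELK (elasticsearch, logstash, kibana)
logstash: a central data-flow engine for collecting and analyzing any kind of data; its typical use is as a log collection and analysis framework
Just unzip and double-click kibana.bat
(Chinese UI: set i18n.locale: "zh-CN" in the configuration file, then restart; the translation files are under \x-pack\plugins\translations\translations)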
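Core ES concepts:
ES is document-oriented; everything is JSON
Physical design: behind the scenes ES splits every index into multiple shards, and each shard can be moved between the servers of the cluster
(a single node is already a cluster)
Logical design: an index type contains multiple documents, e.g. document 1 and document 2. When we index a document it can be located through the sequence index > type > document ID; this combination identifies one specific document (note: the ID does not have to be an integer; it is actually a string)
Document: ES is document-oriented; the document is the smallest unit of indexing and searching (comparable to a row in a table)
1. Self-contained: a document holds both the fields and their values, i.e. KEY:VALUE pairs
2. Hierarchical: a document can contain sub-documents (JSON objects)
3. Flexible structure: documents do not depend on a predefined schema (unlike MySQL)
Index: comparable to a database
Type: a logical container for documents
Shard: every index is split into multiple shards (5 by default before version 7.0, 1 since then); each shard is a Lucene index
Node:
Inverted index: ES is built on Lucene's inverted index, which suits fast full-text search. The index consists of the list of all unique terms that occur in the documents, and for every term there is a list of the documents that contain it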
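The inverted index described above can be illustrated with a minimal sketch. It is not how Lucene is implemented; it assumes a naive whitespace tokenizer, and the class name InvertedIndexSketch and the sample sentences are made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class InvertedIndexSketch {
    // Maps each unique term to the sorted IDs of the documents that contain it.
    static Map<String, Set<Integer>> build(List<String> docs) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            // Naive whitespace tokenizer; a real analyzer (for example IK) does far more.
            for (String term : docs.get(docId).toLowerCase().split("\\s+")) {
                if (!term.isEmpty()) {
                    index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = build(List.of(
                "study every day",
                "good good study",
                "day day up"));
        System.out.println(index.get("study")); // [0, 1]
        System.out.println(index.get("day"));   // [0, 2]
    }
}
```

A search for a term is then a single map lookup instead of a scan over every document, which is why this structure suits full-text search.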
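IK analyzer:
Tokenization: splitting a piece of Chinese text into individual keywords
It offers two algorithms:
ik_smart: coarsest-grained split (fewest tokens)
ik_max_word: finest-grained split (every possible token)
Installation: download the release, create a folder named ik inside the elasticsearch plugins directory, and unzip the archive into it,
then restart.
The installed plugins can be listed with elasticsearch-plugin list
Words of your own must be added to the analyzer's dictionary yourself
After adding them, restart ES and Kibana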
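One way to compare the two modes once the plugin is installed is the _analyze API. The sketch below only builds the two requests and does not send them; it assumes Java 11+ (for java.net.http), a node at http://127.0.0.1:9200, and a sample text that needs no JSON escaping. The class name AnalyzeRequestSketch is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class AnalyzeRequestSketch {
    // JSON body for the _analyze API; assumes the text needs no JSON escaping.
    static String body(String analyzer, String text) {
        return "{\"analyzer\":\"" + analyzer + "\",\"text\":\"" + text + "\"}";
    }

    // Builds (but does not send) POST <baseUrl>/_analyze.
    static HttpRequest request(String baseUrl, String analyzer, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(analyzer, text)))
                .build();
    }

    public static void main(String[] args) {
        String text = "中华人民共和国国歌"; // sample text, chosen for illustration
        for (String analyzer : new String[] {"ik_smart", "ik_max_word"}) {
            HttpRequest req = request("http://127.0.0.1:9200", analyzer, text);
            System.out.println(req.method() + " " + req.uri());
            System.out.println(body(analyzer, text));
        }
    }
}
```

To actually run the comparison, each request could be passed to HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString()); ik_max_word should return more tokens than ik_smart for the same text.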
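RESTful style:
Create an index:
Create an index with explicit rules (mappings)
If a document is created without declaring field types, ES infers the types automatically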
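As a rough sketch of creating an index with explicit rules, the request below is a PUT to the index name with a mappings body. The index name test2 and the fields name, age and birthday are made up for illustration, and the body uses the typeless form of ES 7.x; on 6.x the properties would sit under a type name. It assumes Java 11+ and only builds the request without sending it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class CreateIndexRequestSketch {
    // Mapping body in the typeless 7.x form; the field names are illustrative only.
    static final String MAPPING_BODY =
            "{\"mappings\":{\"properties\":{"
            + "\"name\":{\"type\":\"text\"},"
            + "\"age\":{\"type\":\"long\"},"
            + "\"birthday\":{\"type\":\"date\"}}}}";

    // Builds (but does not send) PUT <baseUrl>/<index> with the mapping body.
    static HttpRequest createIndex(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(MAPPING_BODY))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = createIndex("http://127.0.0.1:9200", "test2");
        System.out.println(req.method() + " " + req.uri()); // PUT http://127.0.0.1:9200/test2
        System.out.println(MAPPING_BODY);
    }
}
```

Fields that are not listed in the mapping would still be accepted; ES then infers their types from the first document that contains them.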
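Stay humble and keep learning; there are plenty of experts out there!
GET _cat commands return a lot of information about the current state of ES
Basic operations:
Prefer POST for updates, because PUT overwrites the whole document; fields left empty are overwritten too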
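The difference between the two update styles can be modelled with plain maps. This is only a simulation of the semantics, not client code: put stands for PUT index/type/id, postUpdate stands for a POST to the _update endpoint with a "doc" body, and the merge here is shallow while ES merges nested objects recursively. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class UpdateSemanticsSketch {
    // PUT: the stored document is discarded and replaced by the request body.
    static Map<String, Object> put(Map<String, Object> stored, Map<String, Object> body) {
        return new HashMap<>(body);
    }

    // POST _update with {"doc": ...}: only the supplied fields change, the rest is kept.
    static Map<String, Object> postUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new HashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = Map.of("name", "Wang", "age", 30);
        Map<String, Object> change = Map.of("age", 31);
        System.out.println(put(stored, change).containsKey("name")); // false
        System.out.println(postUpdate(stored, change).get("name"));  // Wang
        System.out.println(postUpdate(stored, change).get("age"));   // 31
    }
}
```

With PUT, the name field is lost because it was not repeated in the body; with the partial update it survives.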
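Queries:
GET mytest/user/1 query by ID
GET mytest/user/_search?q=name:王 conditional query
Multi-condition exact-match query
Boolean query:
must (and): all conditions must match, like where id = 1 and name = xxx
should (or): matching any one condition is enough
must_not (not): the condition must not match
Multi-condition match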
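The request bodies behind these boolean queries can be sketched as follows. The helper builds the JSON by string concatenation, which assumes field names and values need no escaping; the field names id and name come from the where clause above, while the class BoolQuerySketch and its methods are hypothetical.

```java
class BoolQuerySketch {
    // A match clause such as {"match":{"name":"xxx"}}; assumes no JSON escaping is needed.
    static String match(String field, String value) {
        return "{\"match\":{\"" + field + "\":\"" + value + "\"}}";
    }

    // Wraps the clauses in a bool query; occur is "must", "should" or "must_not".
    static String bool(String occur, String... clauses) {
        return "{\"query\":{\"bool\":{\"" + occur + "\":["
                + String.join(",", clauses) + "]}}}";
    }

    public static void main(String[] args) {
        // where id = 1 and name = xxx
        System.out.println(bool("must", match("id", "1"), match("name", "xxx")));
        // where id = 1 or name = xxx
        System.out.println(bool("should", match("id", "1"), match("name", "xxx")));
        // where not name = xxx
        System.out.println(bool("must_not", match("name", "xxx")));
    }
}
```

Each printed body could be sent with GET mytest/user/_search; for exact matching on keyword fields a term clause would take the place of match.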
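Integrating ES with Spring Boot
Open the official documentation
1. Add the Maven dependency
2. Initialize
Create a config class and register the client as a bean
3. Use it
Read the source code
Creating an index in code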
package com.ruoyi;
import com.alibaba.fastjson.JSON;
import com.ruoyi.es.bean.es.Student;
import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexResponse;
import org.elasticsearch.action.admin.indices.get.GetIndexRequest;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.MatchAllQueryBuilder;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.reindex.DeleteByQueryRequest;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Map;
@RunWith(SpringRunner.class)
@SpringBootTest
public class EsTest {
    private RestHighLevelClient client;

    // Open the client before each test
    @Before
    public void init() {
        client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("127.0.0.1", 9200, "http")));
    }

    // Close the client after each test
    @After
    public void close() {
        if (client != null) {
            try {
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    @Test
    public void createIndex() throws Exception {
        // Create the index
        CreateIndexRequest request = new CreateIndexRequest("test3");
        CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
        System.out.print(response);
    }

    // Add one document
    @Test
    public void add() {
        Student student = new Student(1L, "张三", 11L, 1.50, "张三备注", false);
        String str = JSON.toJSONString(student);
        IndexRequest indexRequest = new IndexRequest("studentinfo", "student", "1");
        indexRequest.source(str, XContentType.JSON);
        try {
            IndexResponse indexResponse = client.index(indexRequest, RequestOptions.DEFAULT);
            System.out.println("Response: " + JSON.toJSONString(indexResponse));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Add documents in bulk
    @Test
    public void addList() {
        BulkRequest bulkRequest = new BulkRequest();
        for (long i = 11; i <= 20; i++) {
            // Bulk updates and deletes can be added to the same BulkRequest here
            Student student = new Student(i, "张三" + i, 11 + i, 1.50, "张三备注" + i, false);
            String str = JSON.toJSONString(student);
            IndexRequest indexRequest = new IndexRequest("studentinfo", "student", i + "");
            indexRequest.source(str, XContentType.JSON);
            bulkRequest.add(indexRequest);
        }
        try {
            BulkResponse bulk = client.bulk(bulkRequest, RequestOptions.DEFAULT);
            System.out.println("Response: " + bulk);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @Test
    public void getIndexDoc() throws Exception {
        // Fetch document 7 by ID (this call uses the _doc type, unlike the student type used elsewhere)
        GetRequest request = new GetRequest("studentinfo", "_doc", "7");
        GetResponse response = client.get(request, RequestOptions.DEFAULT);
        System.out.println(response.getSourceAsString());
        System.out.println(response);
    }

    // Delete by ID
    @Test
    public void delete() {
        String id = "1";
        DeleteRequest deleteRequest = new DeleteRequest("studentinfo", "student", id);
        try {
            DeleteResponse delete = client.delete(deleteRequest, RequestOptions.DEFAULT);
            System.out.println("Response: " + delete);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Query by ID
    @Test
    public void get() {
        String id = "1";
        GetRequest getRequest = new GetRequest("studentinfo", "student", id);
        try {
            GetResponse documentFields = client.get(getRequest, RequestOptions.DEFAULT);
            System.out.println("Response: " + documentFields);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // match query
    @Test
    public void getMatch() {
        MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("title", "张");
        commonSearch(matchQueryBuilder);
    }

    // match_all query
    @Test
    public void getMatchAll() {
        MatchAllQueryBuilder matchAllQueryBuilder = QueryBuilders.matchAllQuery();
        commonSearch(matchAllQueryBuilder);
    }

    // Highlight query
    @Test
    public void hightSearch() {
        MatchQueryBuilder queryBuilder = QueryBuilders.matchQuery("name", "张");
        // Build the SearchRequest and name the target index
        SearchRequest searchRequest = new SearchRequest("studentinfo");
        // Build the SearchSourceBuilder that carries the query
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Attach the query to the SearchSourceBuilder
        searchSourceBuilder.query(queryBuilder);
        // Highlight settings
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        // The tag strings are empty here, so matched terms carry no visible marker
        highlightBuilder.preTags("");
        highlightBuilder.postTags("");
        highlightBuilder.field("title");
        // Attach the highlighter; without this call no highlight fields are returned
        searchSourceBuilder.highlighter(highlightBuilder);
        // Put the SearchSourceBuilder into the SearchRequest
        searchRequest.source(searchSourceBuilder);
        try {
            // Send the request
            SearchResponse search = client.search(searchRequest, RequestOptions.DEFAULT);
            // Parse the hits
            SearchHit[] hits = search.getHits().getHits();
            for (SearchHit hit : hits) {
                String sourceAsString = hit.getSourceAsString();
                System.out.println("Response: " + sourceAsString);
                // Read the highlight result
                Map<String, HighlightField> highlightFields = hit.getHighlightFields();
                HighlightField title = highlightFields.get("title");
                // "title" is only present when that field matched the query
                if (title == null) {
                    continue;
                }
                Text[] fragments = title.getFragments();
                for (Text fragment : fragments) {
                    System.out.println("Highlight result: " + fragment);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Shared search helper
    public void commonSearch(QueryBuilder queryBuilder) {
        // Build the SearchRequest and name the target index
        SearchRequest searchRequest = new SearchRequest("studentinfo");
        // Build the SearchSourceBuilder that carries the query
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Attach the query to the SearchSourceBuilder
        searchSourceBuilder.query(queryBuilder);
        /* Sorting */
        searchSourceBuilder.sort("height", SortOrder.DESC);
        searchSourceBuilder.sort("age", SortOrder.ASC);
        // searchSourceBuilder.sort("name", SortOrder.ASC); // sorting on a String (text) field requires fielddata: true on that field; the default is false
        /* Paging; formula: int from = (pageNum - 1) * pageSize */
        searchSourceBuilder.from(0);
        searchSourceBuilder.size(2);
        // Put the SearchSourceBuilder into the SearchRequest
        searchRequest.source(searchSourceBuilder);
        try {
            // Send the request
            SearchResponse search = client.search(searchRequest, RequestOptions.DEFAULT);
            // Parse the hits
            SearchHit[] hits = search.getHits().getHits();
            for (SearchHit hit : hits) {
                String sourceAsString = hit.getSourceAsString();
                System.out.println("Response: " + sourceAsString);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Update
    @Test
    public void update() {
        UpdateRequest updateRequest = new UpdateRequest("studentinfo", "student", "1");
        Student student = new Student(4L, "李四", 22L, 1.75, "李四备注", false);
        String str = JSON.toJSONString(student);
        UpdateRequest doc = updateRequest.doc(str, XContentType.JSON);
        try {
            UpdateResponse update = client.update(doc, RequestOptions.DEFAULT);
            System.out.println("Response: " + update);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Insert one million rows into MySQL
     */
    @Test
    public void test1() {
        System.out.println("Testing...");
        final String url = "jdbc:mysql://127.0.0.1:3306/aa?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&useSSL=true&serverTimezone=GMT%2B8";
        // The driver class name is case-sensitive
        final String name = "com.mysql.cj.jdbc.Driver";
        final String user = "root";
        final String password = "123456";
        Connection conn = null;
        try {
            Class.forName(name); // load the JDBC driver
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
        try {
            conn = DriverManager.getConnection(url, user, password); // open the connection
        } catch (SQLException e) {
            e.printStackTrace();
        }
        if (conn != null) {
            System.out.println("Connection established");
            insert(conn);
        } else {
            System.out.println("Failed to get a connection");
        }
    }

    public void insert(Connection conn) {
        // Start time
        Long begin = new Date().getTime();
        // SQL prefix
        String prefix = "INSERT INTO testTable (accountName,testImage,testRead,createTime) VALUES ";
        try {
            // Holds the SQL suffix (the value tuples)
            StringBuffer suffix = new StringBuffer();
            // Turn off auto-commit so each batch is one transaction
            conn.setAutoCommit(false);
            // A PreparedStatement is preferable to a plain Statement
            PreparedStatement pst = conn.prepareStatement(" "); // placeholder statement; the real SQL is added per batch
            // Outer loop: number of commits
            for (int i = 1; i <= 100; i++) {
                suffix = new StringBuffer();
                // Inner loop: rows per commit
                for (int j = 1; j <= 10000; j++) {
                    // Build one value tuple from a random 10-letter string
                    String string = "";
                    for (int k = 0; k < 10; k++) {
                        char c = (char) ((Math.random() * 26) + 97);
                        string += (c + "");
                    }
                    String name = string;
                    String testImage = string;
                    String testRead = string;
                    SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
                    suffix.append("('" + name + "','" + testImage + "','" + testRead + "','" + sdf.format(new Date()) + "'),");
                }
                // Build the full SQL, dropping the trailing comma
                String sql = prefix + suffix.substring(0, suffix.length() - 1);
                System.out.println("sql===" + sql);
                // Add the SQL to the batch
                pst.addBatch(sql);
                // Execute the batch
                pst.executeBatch();
                // Commit the transaction
                conn.commit();
                // Clear the data from the previous batch
                suffix = new StringBuffer();
            }
            // Close the statement and the connection
            pst.close();
            conn.close();
        } catch (SQLException e) {
            e.printStackTrace();
        }
        // End time
        Long end = new Date().getTime();
        // Elapsed time
        System.out.println("Time to insert 1,000,000 rows: " + (end - begin) / 1000 + " s");
        System.out.println("Insert finished");
    }
}
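The paging formula noted in commonSearch can be wrapped in a small helper whose result is passed to searchSourceBuilder.from(...). This is a minimal sketch assuming 1-based page numbers; the class name PagingSketch is hypothetical.

```java
class PagingSketch {
    // Offset of the first hit on a 1-based page: (pageNum - 1) * pageSize.
    static int from(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        return (pageNum - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(from(1, 2)); // 0
        System.out.println(from(3, 2)); // 4
    }
}
```

With a page size of 2, as in commonSearch, page 3 would start at offset 4.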
Web crawler
Example: JD.com
Highlighted search results