Create a user
groupadd es
useradd Es -g es
chown -R Es:es /home/Es/
Set the password
passwd Es
Switch to the root user: su - root
vim /etc/security/limits.conf
Add the following lines:
* soft nofile 65536
* hard nofile 131072
* soft nproc 4096
* hard nproc 4096
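The entries above can be sanity-checked with a small script before restarting the node. This is only a minimal sketch: `parse_limits`, `MIN_NOFILE` and `MIN_NPROC` are illustrative names, the thresholds are assumptions taken from the values configured in these notes, and the parser assumes every value is numeric (it does not handle `unlimited`).

```python
# Assumed minimums, taken from the soft limits configured above.
MIN_NOFILE = 65536
MIN_NPROC = 4096

SAMPLE = """\
* soft nofile 65536
* hard nofile 131072
* soft nproc 4096
* hard nproc 4096
"""

def parse_limits(text):
    """Parse limits.conf-style lines into {(type, item): value}."""
    limits = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        _domain, kind, item, value = line.split()
        limits[(kind, item)] = int(value)  # assumes numeric values only
    return limits

limits = parse_limits(SAMPLE)
ok = (limits[("soft", "nofile")] >= MIN_NOFILE
      and limits[("soft", "nproc")] >= MIN_NPROC)
print("limits ok:", ok)
```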
Edit the file: JAVA_HOME\jre\lib\i386\jvm.cfg
-server KNOWN
-client IF_SERVER_CLASS -server
-minimal KNOWN
https://blog.csdn.net/zwq56693/article/details/107798089
Install the IK analyzer (it adds Chinese word segmentation)
Tokenization result
POST _analyze
{
"analyzer": "ik_smart",
"text": "我是中国人"
}
{
"tokens": [
{
"token": "我",
"start_offset": 0,
"end_offset": 1,
"type": "CN_CHAR",
"position": 0
},
{
"token": "是",
"start_offset": 1,
"end_offset": 2,
"type": "CN_CHAR",
"position": 1
},
{
"token": "中国人",
"start_offset": 2,
"end_offset": 5,
"type": "CN_WORD",
"position": 2
}
]
}
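To make the response above easier to read, the tokens can be pulled out in position order and checked against the offsets. This is a minimal sketch; `token_texts` is an illustrative helper, and the response text is the one shown above, condensed onto fewer lines.

```python
import json

# Response body from the _analyze call above, condensed.
response_text = """
{"tokens": [
  {"token": "我", "start_offset": 0, "end_offset": 1, "type": "CN_CHAR", "position": 0},
  {"token": "是", "start_offset": 1, "end_offset": 2, "type": "CN_CHAR", "position": 1},
  {"token": "中国人", "start_offset": 2, "end_offset": 5, "type": "CN_WORD", "position": 2}
]}
"""

def token_texts(raw):
    """Return the token strings in position order."""
    tokens = json.loads(raw)["tokens"]
    return [t["token"] for t in sorted(tokens, key=lambda t: t["position"])]

# The offsets index into the original text, so slicing reproduces each token.
text = "我是中国人"
slices = [text[t["start_offset"]:t["end_offset"]]
          for t in json.loads(response_text)["tokens"]]

print(token_texts(response_text))
print(slices)
```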
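Concepts: Elasticsearch is a full-text search engine built on Lucene. At its core it also stores data, and many of its concepts resemble those in MySQL.
Concepts in detail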
Create an index
PUT request: index name (lowercase letters), number of shards, number of replicas
PUT es
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1
}
}
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "es"
}
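The same request can be assembled programmatically. The sketch below only builds the method, path and JSON body; it does not send anything. `create_index_request` and `total_shard_copies` are illustrative names, and the lowercase check covers only the naming rule mentioned in these notes.

```python
import json

def create_index_request(name, shards=1, replicas=1):
    """Return (method, path, body) for creating an index."""
    if name != name.lower():
        raise ValueError("index name must be lowercase: " + name)
    body = {"settings": {"number_of_shards": shards,
                         "number_of_replicas": replicas}}
    return "PUT", "/" + name, json.dumps(body)

def total_shard_copies(shards, replicas):
    """Each primary shard has `replicas` additional copies."""
    return shards * (1 + replicas)

method, path, body = create_index_request("es", shards=1, replicas=1)
print(method, path, body)
print(total_shard_copies(1, 1))
```

With one shard and one replica there are two shard copies in total.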
View the index settings
GET es
{
"es": {
"aliases": {},
"mappings": {},
"settings": {
"index": {
"creation_date": "1596506579933",
"number_of_shards": "1",
"number_of_replicas": "1",
"uuid": "BGMEzq0nS9WU7-c9tutqhw",
"version": {
"created": "6020499"
},
"provided_name": "es"
}
}
}
}
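Mapping configuration
A mapping defines the structure of documents: which fields a document contains, and whether each field is stored, indexed, analyzed, and so on.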
PUT es/_mapping/goods
{
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word"
},
"images": {
"type": "keyword",
"index": false
},
"price": {
"type": "float"
}
}
}
View the mapping: GET /index_name/_mapping
GET es/_mapping/
Field attributes in detail
Auto-generated id
POST /index_name/type_name
{
"key":"value"
}
------------------
POST /es/goods
{
"title":"rongyao手机",
"images":"http://image.junyang.com/12479122.jpg",
"price":2699.00
}
{
"_index": "es",
"_type": "goods",
"_id": "undEwHMB4sJhf15MpqCs",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 2
}
Custom id
POST /index_name/type_name/id
{
...
}
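If a document with this id exists, it is updated
If no document with this id exists, it is created
Documents can also be added when no mapping has been defined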
PUT /es/goods/undEwHMB4sJhf15MpqCs
{
"title":"小米手机",
"images":"http://image.es.com/12479122.jpg",
"price":3899.00,
"stock": 100,
"saleable":true
}
Basic syntax
GET /index_name/_search
{
"query":{
"query_type":{
"field":"value"
}
}
}
Match all documents (match_all)
GET /es/_search
{
"query":{
"match_all": {}
}
}
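Query result
took: time the query took, in milliseconds
timed_out: whether the query timed out
_shards: shard information
hits: summary object for the search results
  total: total number of matching documents
  max_score: highest document score among all results
  hits: array of result documents; each element describes one matching document
    _index: index
    _type: document type
    _id: document id
    _score: document score
    _source: original document data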
Match query (match)
GET /es/_search
{
"query":{
"match":{
"title":{
"query":"小米曲面电视",
"minimum_should_match": "75%"
}
}
}
}
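A percentage in minimum_should_match is converted into a number of terms that must match, rounded down. The sketch below shows that arithmetic for positive percentages only. `required_matches` is an illustrative name, and the assumption that the analyzer splits the query text into three terms is mine, not something these notes state.

```python
def required_matches(term_count, percent):
    """Terms that must match for a positive percentage, rounded down."""
    return term_count * percent // 100

# Assumption: the analyzer produces 3 terms for the query text.
print(required_matches(3, 75))  # 2
print(required_matches(4, 75))  # 3
```

Under that assumption, a document matching any two of the three terms satisfies the 75% condition.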
Multi-field query (multi_match)
multi_match works like match, except that it searches several fields at once
The query below looks for 荣耀 in both the title and subTitle fields
GET /es/_search
{
"query":{
"multi_match": {
"query": "荣耀",
"fields": [ "title", "subTitle" ]
}
}
}
Term query (term)
A term query matches exact values: numbers, dates, booleans, or strings that are not analyzed
GET /es/_search
{
"query":{
"term":{
"price":2699.00
}
}
}
Multi-term exact match (terms)
A terms query works like term but accepts several values. A document matches if the field contains any one of the given values
GET /es/_search
{
"query":{
"terms":{
"price":[2699.00,2899.00,3899.00]
}
}
}
Result filtering
Specify the fields directly
Return only title and price
GET /es/_search
{
"_source": ["title","price"],
"query": {
"term": {
"price": 2699
}
}
}
Specify includes and excludes
includes: the fields to return
excludes: the fields to leave out
GET /es/_search
{
"_source": {
"includes":["title","price"]
},
"query": {
"term": {
"price": 2699
}
}
}
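The effect of includes and excludes on a single document can be pictured with a plain dictionary filter. This is a simplified in-memory sketch, not how Elasticsearch implements it: `filter_source` is an illustrative name and it matches exact field names only, with no wildcards or nested paths.

```python
def filter_source(source, includes=None, excludes=None):
    """Keep fields listed in includes (all if None), then drop excludes."""
    kept = {k: v for k, v in source.items()
            if includes is None or k in includes}
    return {k: v for k, v in kept.items()
            if not excludes or k not in excludes}

doc = {
    "title": "小米手机",
    "images": "http://image.es.com/12479122.jpg",
    "price": 3899.0,
}

print(filter_source(doc, includes=["title", "price"]))
print(filter_source(doc, excludes=["images"]))
```

Both calls return the same two fields here, which is why either form can be used when only one unwanted field needs hiding.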
Boolean combination (bool)
bool combines other queries using must (AND), must_not (NOT), and should (OR)
GET /es/_search
{
"query":{
"bool":{
"must": { "match": { "title": "荣耀" }},
"must_not": { "match": { "title": "电视" }},
"should": { "match": { "title": "手机" }}
}
}
}
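The way the three clauses interact can be modelled in memory. In this sketch a substring test stands in for the match query, which is a deliberate simplification, and the titles are made-up sample data. When must is present, should does not exclude anything; it only raises the score of documents that also match it.

```python
def bool_query(docs, must=(), must_not=(), should=()):
    """Toy model: must and must_not decide inclusion, should only adds score."""
    hits = []
    for title in docs:
        if not all(word in title for word in must):
            continue
        if any(word in title for word in must_not):
            continue
        score = sum(word in title for word in should)
        hits.append((score, title))
    # Highest score first; sorted() is stable, so ties keep input order.
    return [title for score, title in sorted(hits, key=lambda h: -h[0])]

# Hypothetical titles, used only for illustration.
docs = ["荣耀平板", "荣耀电视", "荣耀手机", "小米手机"]
result = bool_query(docs, must=["荣耀"], must_not=["电视"], should=["手机"])
print(result)
```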
Range query (range)
A range query finds numbers or dates that fall within the given interval
GET /es/_search
{
"query":{
"range": {
"price": {
"gte": 1000.0,
"lt": 2800.00
}
}
}
}
Fuzzy query (fuzzy)
GET /es/_search
{
"query": {
"fuzzy": {
"title": {
"value":"appla",
"fuzziness":1
}
}
}
}
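fuzziness is an edit distance: the number of single-character changes allowed between the query term and an indexed term. The sketch below computes the plain Levenshtein distance (insert, delete, substitute). Treat it as an approximation: Elasticsearch by default also counts a swap of two adjacent characters as one edit, which this version does not.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

# "appla" is one substitution away from "apple", so fuzziness 1 is enough.
print(edit_distance("appla", "apple"))
```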
Filter (filter)
filter: holds the filter conditions; they restrict the results without affecting the score
GET /es/_search
{
"query":{
"bool":{
"must":{ "match": { "title": "小米手机" }},
"filter":{
"range":{"price":{"gt":2000.00,"lt":3800.00}}
}
}
}
}
Sorting
1. Single-field sort: sort orders the results by a chosen field, and order sets the direction
GET /es/_search
{
"query": {
"match": {
"title": "小米手机"
}
},
"sort": [
{
"price": {
"order": "desc"
}
}
]
}
2. Multi-field sort
GET /es/_search
{
"query":{
"bool":{
"must":{ "match": { "title": "小米手机" }},
"filter":{
"range":{"price":{"gt":200000,"lt":300000}}
}
}
},
"sort": [
{ "price": { "order": "desc" }},
{ "_score": { "order": "desc" }}
]
}
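With several sort entries, later entries only break ties left by earlier ones. A minimal in-memory sketch of price descending, then _score descending, using made-up hits:

```python
# Hypothetical hits, used only to illustrate the ordering.
hits = [
    {"title": "A", "price": 2699.0, "_score": 0.5},
    {"title": "B", "price": 3899.0, "_score": 0.2},
    {"title": "C", "price": 2699.0, "_score": 0.9},
]

# Negating both keys gives descending order on price, then on _score.
ordered = sorted(hits, key=lambda h: (-h["price"], -h["_score"]))
print([h["title"] for h in ordered])
```

B comes first on price; A and C tie on price, so the higher _score puts C ahead of A.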
Aggregations
Basic concepts
Create the index:
PUT /cars
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"transactions": {
"properties": {
"color": {
"type": "keyword"
},
"make": {
"type": "keyword"
}
}
}
}
}
Import data
POST /cars/transactions/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
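Aggregating into buckets
We split the cars into buckets by their color field
size: the number of hits to return. It is set to 0 here because we only care about the aggregation results, not the matching documents, which makes the request cheaper
aggs: declares an aggregation; short for aggregations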
GET /cars/_search
{
"size" : 0,
"aggs" : {
"popular_colors" : {
"terms" : {
"field" : "color"
}
}
}
}
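What the terms aggregation computes can be reproduced in memory: one bucket per distinct color, each holding a document count. A minimal sketch over the colors of the eight imported cars:

```python
from collections import Counter

# Colors of the eight documents imported above, in order.
colors = ["red", "red", "green", "blue", "green", "red", "red", "blue"]

# One bucket per distinct value, with its document count.
buckets = Counter(colors)
print(buckets.most_common())
```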
Metrics inside buckets
Compute the average price in each bucket
GET /cars/_search
{
"size" : 0,
"aggs" : {
"popular_colors" : {
"terms" : {
"field" : "color"
},
"aggs":{
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
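The nested avg metric is evaluated once per bucket. An in-memory sketch of the same calculation over the imported cars:

```python
from collections import defaultdict

# (color, price) for the eight documents imported above.
cars = [("red", 10000), ("red", 20000), ("green", 30000), ("blue", 15000),
        ("green", 12000), ("red", 20000), ("red", 80000), ("blue", 25000)]

# Group prices by color, then average each group.
prices_by_color = defaultdict(list)
for color, price in cars:
    prices_by_color[color].append(price)

avg_price = {color: sum(p) / len(p) for color, p in prices_by_color.items()}
print(avg_price)
```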
Buckets nested inside buckets
For each color, count how many cars belong to each manufacturer by bucketing again on the make field
GET /cars/_search
{
"size" : 0,
"aggs" : {
"popular_colors" : {
"terms" : {
"field" : "color"
},
"aggs":{
"avg_price": {
"avg": {
"field": "price"
}
},
"maker":{
"terms":{
"field":"make"
}
}
}
}
}
}
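Nesting a terms aggregation inside another yields a two-level grouping. A minimal in-memory sketch, color first and then make:

```python
from collections import Counter, defaultdict

# (color, make) for the eight documents imported above.
cars = [("red", "honda"), ("red", "honda"), ("green", "ford"),
        ("blue", "toyota"), ("green", "toyota"), ("red", "honda"),
        ("red", "bmw"), ("blue", "ford")]

# Outer bucket: color. Inner bucket: make, with a document count.
makers_by_color = defaultdict(Counter)
for color, make in cars:
    makers_by_color[color][make] += 1

for color, makers in makers_by_color.items():
    print(color, dict(makers))
```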
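Stepped buckets (histogram)
A histogram groups a numeric field into fixed-size steps. You specify the step size with interval.
For example, with a price field and an interval of 200, the steps are 0, 200, 400, 600, ...
Here we group the cars by price with an interval of 5000:
min_doc_count is set to 1 so that only buckets with at least one document are returned; empty buckets are filtered out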
GET /cars/_search
{
"size":0,
"aggs":{
"price":{
"histogram": {
"field": "price",
"interval": 5000,
"min_doc_count": 1
}
}
}
}
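Each value lands in the bucket whose key is the value rounded down to a multiple of the interval. The sketch below reproduces that for the imported prices, assuming integer values; `histogram` is an illustrative helper, not an Elasticsearch API.

```python
from collections import Counter

# Prices of the eight documents imported above.
prices = [10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000]

def histogram(values, interval, min_doc_count=0):
    """Bucket key = value rounded down to a multiple of interval."""
    counts = Counter((v // interval) * interval for v in values)
    keys = range(min(counts), max(counts) + interval, interval)
    # Counter returns 0 for missing keys, so empty buckets show up here
    # unless min_doc_count filters them out.
    return {k: counts[k] for k in keys if counts[k] >= min_doc_count}

non_empty = histogram(prices, 5000, min_doc_count=1)
with_empty = histogram(prices, 5000, min_doc_count=0)
print(non_empty)
print(len(with_empty))
```

With min_doc_count at 0 the range from 10000 to 80000 also yields the empty buckets in between, which is why the notes set it to 1.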