{
name: "PuFTRWU", // node name
cluster_name: "elasticsearch", // cluster name (default: elasticsearch)
cluster_uuid: "Fl4hcPzAS9-tyy9TgFPAKw",
version: { // ES version information
number: "6.5.2",
build_flavor: "default",
build_type: "zip",
build_hash: "9434bed",
build_date: "2018-11-29T23:58:20.891072Z",
build_snapshot: false,
lucene_version: "7.5.0",
minimum_wire_compatibility_version: "5.6.0",
minimum_index_compatibility_version: "5.0.0"
},
tagline: "You Know, for Search"
}
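Edit config/elasticsearch.yml to change the ES configuration.
Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch.
It is the graphical interface tool for Elasticsearch.
The mapping defines the type of each field, also stored as key:value pairs. It exists to make searching easier: a search looks in the fields of the corresponding type.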
put /index
{
"settings": {
"number_of_shards": 5,
"number_of_replicas": 1
},
"mappings": {
// define the data structure
"my_type": {
"properties": {
"field": {
"type": ""
}
}
}
}
}
get _cat/health?v
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1544716861 16:01:01 elasticsearch green 1 1 1 1 0 0 0 0 - 100.0%
status:
green: every index has all of its primary shards and replica shards active
yellow: every index has all of its primary shards active, but some replica shards are unavailable
red: some primary shards are unavailable
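The three status values can be sketched as a small Python function. This is only an illustration of the rules listed above, not how ES computes health internally; the function name and the boolean-list inputs are assumptions made for the example.

```python
def cluster_status(primary_active, replica_active):
    """Derive green/yellow/red from per-shard activity flags.

    Both arguments are lists of booleans, one entry per shard
    (hypothetical inputs chosen for this sketch).
    """
    if not all(primary_active):
        return "red"      # at least one primary shard is unavailable
    if not all(replica_active):
        return "yellow"   # all primaries are active, some replicas are not
    return "green"        # every primary and replica shard is active


# One node, 5 primaries with 1 replica each: a replica cannot share a node
# with its primary, so all 5 replicas stay unassigned.
print(cluster_status([True] * 5, [False] * 5))  # yellow
```

This matches what a single-node cluster reports for an index created with the default 5 shards and 1 replica.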
get _cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .kibana_1 XDUVeEKpTX6ylkSX6zDNJA 1 0 3 0 11.9kb 11.9kb
put
delete
put test_index?pretty // create an index; test_index is the index name
get _cat/indices // list the indices
// result
yellow open test_index CiUD-eEMTpmZRfo5R9WcuQ 5 1 0 0 1.1kb 1.1kb
green open .kibana_1 XDUVeEKpTX6ylkSX6zDNJA 1 0 3 0 11.9kb 11.9kb
delete test_index?pretty // delete the index
get _cat/indices
// result
green open .kibana_1 XDUVeEKpTX6ylkSX6zDNJA 1 0 3 0 11.9kb 11.9kb
put /index/type/id // index/type/document id
{
"field" : value
}
While a document is being added, ES automatically creates the index and the type if they do not exist yet
post /index/type
{
"field" : value
}
"_id" : "wl3OvGcBxixzDxFAyNjD"
The id is generated with a GUID algorithm, is 20 characters long, and will not collide
get /index/type/id
Same as creating: use put. The request must carry every field, otherwise the omitted fields are lost
post /index/type/id/_update
{
"doc" : {
"field" : value
}
}
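Both approaches work in a similar way: the original document is marked as deleted and a new document is produced
With a partial update, the read, modify and write-back steps all run inside one shard, which avoids network transfer and improves performance; the operation is also shorter, which narrows the window for conflicts (in short, updating with post is faster than replacing with put!)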
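The difference between the two update styles can be modelled with plain Python dictionaries. This is a toy in-memory sketch, not the ES implementation; `full_replace` and `partial_update` are hypothetical helper names.

```python
def full_replace(store, doc_id, body):
    """Model of put /index/type/id: the stored document becomes exactly `body`."""
    store[doc_id] = dict(body)


def partial_update(store, doc_id, doc):
    """Model of post /index/type/id/_update: merge `doc` into the stored fields."""
    merged = dict(store[doc_id])
    merged.update(doc)
    store[doc_id] = merged


store = {}
full_replace(store, "1", {"studentName": "huping", "studentAge": 20})

partial_update(store, "1", {"studentAge": 18})
print(store["1"])  # {'studentName': 'huping', 'studentAge': 18}

# A full replacement that omits studentName loses that field.
full_replace(store, "1", {"studentAge": 21})
print(store["1"])  # {'studentAge': 21}
```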
delete /index/type/id?pretty
The document is not physically deleted at once; it is marked as deleted and removed later
command 1: add a document
put /user/student/1
{
"studentName": "huping",
"studentAge": 20
}
result 1
{
"_index" : "user",
"_type" : "student",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
command 2: get the document
get /user/student/1
result 2
{
"_index" : "user",
"_type" : "student",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"studentName" : "huping",
"studentAge" : 20
}
}
command 3: update the document. All fields must be included, otherwise data is lost!
put /user/student/1
{
"studentName": "huping",
"studentAge": 21
}
result 3: the version becomes 2 and result is "updated"
{
"_index" : "user",
"_type" : "student",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1
}
command 4: update the document with a partial update (forever 18, no arguments, hehe)
post /user/student/1/_update
{
"doc" : {
"studentAge": 18
}
}
result 4 (the document as returned by a follow-up get /user/student/1)
{
"_index" : "user",
"_type" : "student",
"_id" : "1",
"_version" : 3,
"found" : true,
"_source" : {
"studentName" : "huping",
"studentAge" : 18
}
}
command 5: delete the document
delete /user/student/1
command 6: get the document again
get /user/student/1
result 6
{
"_index" : "user",
"_type" : "student",
"_id" : "1",
"found" : false
}
Search all documents
GET /user/student/_search
Conditional search (q=)
GET /user/student/_search?q=studentName:huping&sort=studentAge:desc
_all field search: look through every field of each document for the string huping
GET /user/student/_search?q=huping
Equivalent to SQL statements
https://blog.csdn.net/jiaminbao/article/details/80105636 (excellent)
http request body
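query and filter
term (used on fields that are not analyzed, exact match) versus match
Combining multiple search conditions (bool)
must (has to match, like = in a database), must_not (must not match, like != in a database), should (not mandatory, like or in a database), filter (filtering)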
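How the four clause types combine can be sketched with a small in-memory evaluator. This is a simplified model that ignores scoring; the rule that should clauses become mandatory only when there is no must or filter clause reflects the default behaviour in query context as I understand it. The function name and predicate-style clauses are assumptions made for the example.

```python
def bool_matches(doc, must=(), must_not=(), should=(), filter_=()):
    """Evaluate a simplified bool query against one document (no scoring).

    Every clause is a predicate that takes the document dict.
    """
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # should clauses are optional unless there is no must/filter clause at all
    if should and not must and not filter_:
        return any(clause(doc) for clause in should)
    return True


students = {
    "1": {"studentName": "huping", "studentAge": 18},
    "2": {"studentName": "qaq", "studentAge": 21},
    "3": {"studentName": "huping ya", "studentAge": 20},
}


def name_has_huping(doc):
    return "huping" in doc["studentName"].split()


def age_10_to_20(doc):
    return 10 <= doc["studentAge"] <= 20


hits = [doc_id for doc_id, doc in students.items()
        if bool_matches(doc, must=[name_has_huping], filter_=[age_10_to_20])]
print(hits)  # ['1', '3']
```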
Query all documents (match_all), return only the given fields (_source), and paginate (from, size)
GET /user/student/_search
{
"query": {
"match_all": {}
},
"_source": "studentName",
"from": 0,
"size": 2
}
Conditional query (match) with sorting (sort)
GET /user/student/_search
{
"query": {
"match": {
"studentName": "huping"
}
},
"sort": [
{
"studentAge": {
"order": "desc"
}
}
]
}
query filter
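Difference between studentName.keyword and studentName
studentName.keyword: the whole field value must equal the input exactly (exact query)
studentName: the analyzed field only has to contain the input terms (full-text matching)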
GET /user/student/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"studentName": "huping"
}
}
],
"filter": {
"range": {
"studentAge": {
"gte": 10,
"lte": 20
}
}
}
}
}
}
full-text search
GET /user/student/_search
{
"query": {
"match": {
"studentName": "huping qaq"
}
}
}
result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "user",
"_type" : "student",
"_id" : "2",
"_score" : 0.2876821,
"_source" : {
"studentName" : "qaq",
"studentAge" : 21
}
},
{
"_index" : "user",
"_type" : "student",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"studentName" : "huping",
"studentAge" : 18
}
},
{
"_index" : "user",
"_type" : "student",
"_id" : "3",
"_score" : 0.2876821,
"_source" : {
"studentName" : "huping ya",
"studentAge" : 20
}
}
]
}
}
The given parameter "huping qaq" is split into the keywords "huping" and "qaq" for the search,
so huping, qaq and huping ya are all matched.
How well each hit matches is measured by its _score.
phrase search
Matches only documents that contain the whole phrase as given; the input is not split into independent keywords (match_phrase)
GET /user/student/_search
{
"query": {
"match_phrase": {
"studentName": "huping qaq"
}
}
}
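The contrast between match and match_phrase can be sketched in Python. The tokenizer here (lowercase, split on whitespace) is a deliberately crude stand-in for a real analyzer, and the function names are hypothetical.

```python
def tokens(text):
    """Very rough analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """True when any query term appears in the field (full-text search)."""
    field_terms = set(tokens(field_value))
    return any(term in field_terms for term in tokens(query))


def match_phrase(field_value, query):
    """True when all query terms appear adjacent and in order (phrase search)."""
    field_terms, query_terms = tokens(field_value), tokens(query)
    n = len(query_terms)
    return any(field_terms[i:i + n] == query_terms
               for i in range(len(field_terms) - n + 1))


names = {"1": "huping", "2": "qaq", "3": "huping ya"}
print([i for i, v in names.items() if match(v, "huping qaq")])         # ['1', '2', '3']
print([i for i, v in names.items() if match_phrase(v, "huping qaq")])  # []
print([i for i, v in names.items() if match_phrase(v, "huping ya")])   # ['3']
```

With the three sample students, the phrase "huping qaq" matches nothing because no studentName contains both words next to each other.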
highlight search
highlight
GET /user/student/_search
{
"query": {
"match": {
"studentName": "huping qaq"
}
},
"highlight": {
"fields": {
"studentName": {}
}
}
}
"hits" : [
{
"_index" : "user",
"_type" : "student",
"_id" : "2",
"_score" : 0.2876821,
"_source" : {
"studentName" : "qaq",
"studentAge" : 21
},
"highlight" : {
"studentName" : [
"qaq"
]
}
},
"aggs": {
"NAME": {
"AGG_TYPE": {}
}
}
AGG_TYPE can be avg, terms, range, and so on
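The number of primary shards is fixed when the index is created (because the document routing algorithm depends on it); the number of replica shards can still be changed after the index is created
A replica shard is a copy of a primary shard and cannot sit on the same node as that primary, otherwise fault tolerance is not guaranteed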
PUT /index_test
{
"settings": {
"number_of_shards": 5,
"number_of_replicas": 1
}
}
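An index is stored across multiple shards; deciding which shard a new document goes to is called routing
Routing algorithm:
shard = hash(routing) % number_of_primary_shards
Every create, read, update and delete request carries a routing value; it defaults to the document id and can be set manually
A request can be sent to any node; the node that receives it routes it to the right primary shard (nodes are peers, which balances the load)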
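The routing formula can be tried out in Python. This is a sketch under one loud assumption: `zlib.crc32` is used only as a stand-in hash for illustration, it is not the hash function ES uses, so the shard numbers will not match a real cluster. The point is the modulo step.

```python
import zlib


def pick_shard(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards (CRC32 as a stand-in hash)."""
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = [str(i) for i in range(1, 101)]

# The same routing value always lands on the same shard.
assert pick_shard("1", 5) == pick_shard("1", 5)

# Changing the primary shard count changes where existing ids map to.
with_5 = {doc_id: pick_shard(doc_id, 5) for doc_id in doc_ids}
with_6 = {doc_id: pick_shard(doc_id, 6) for doc_id in doc_ids}
moved = [doc_id for doc_id in doc_ids if with_5[doc_id] != with_6[doc_id]]
print(f"{len(moved)} of {len(doc_ids)} ids would map to a different shard")
```

Because the result depends on number_of_primary_shards, a document indexed under one shard count could not be found by the same formula under another, which is why the primary shard count is fixed at index creation.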
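_index (index name; lowercase, cannot start with an underscore, cannot contain commas; keeping similar data, whose fields are mostly the same, in one index gives better query performance)
_type (type; uppercase or lowercase, cannot start with an underscore, cannot contain commas; data that differs in a few fields goes into different types)
_id (/index/type/id uniquely identifies a document; the id can be set manually with put or generated by ES with post)
_score (relevance to the search text; the higher the score, the stronger the relevance)
_source (the document data; in a search it can be used to choose which fields to return)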
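_version is 1 when a document is first created and increases by 1 on every later put, post or delete
A delete also increments _version, because the document is not physically removed right away (try it: delete and then put the same id again, and the version information is still there)
external version: instead of the internal _version, an externally supplied version can be used for version control
?version=1 (internal)
the operation succeeds only if version equals _version
?version=1&version_type=external (external)
the operation succeeds only if version > _version
retry_on_conflict (parameter that controls the number of retries after a version conflict)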
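The two version checks can be written down as a small Python function. This is a model of the rules stated above, not ES code; `next_version` and `VersionConflict` are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when the supplied version does not satisfy the check."""


def next_version(current, requested, version_type="internal"):
    """Return the document's new version, or raise VersionConflict."""
    if version_type == "internal":
        # ?version=N succeeds only if N equals the stored _version
        if requested != current:
            raise VersionConflict(f"expected {current}, got {requested}")
        return current + 1
    if version_type == "external":
        # ?version=N&version_type=external succeeds only if N > _version
        if requested <= current:
            raise VersionConflict(f"{requested} is not greater than {current}")
        return requested  # the external version becomes the stored version
    raise ValueError(f"unknown version_type: {version_type}")


print(next_version(1, 1))              # 2
print(next_version(2, 5, "external"))  # 5
```

A stale writer that still holds version 1 after someone else has moved the document to version 2 gets a VersionConflict and has to re-read before retrying.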
Reduces the number of network requests
GET user/student/_mget
{
"docs": [
{
"_id": "1"
},
{
"_id": "2"
}
]
}
POST /_bulk
{"opType" : {"metadata"}}
{"data"}
{"opType" : {"metadata"}}
{"data"}
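opType: delete (needs only the {"metadata"} line, no {"data"} line), create (put, create only), index (put, create or full replacement), update (partial update)
{"metadata"}: {"_index":"", "_type":"", "_id":""}
{"data"}: {"field":"value"}
Advantage: each operation can be sent to a different shard (performance! The operations are not all handled on one node, so with a lot of data the memory of a single node is not the only memory used)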
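The request body for _bulk is newline-delimited JSON: one action line, optionally followed by one data line, and a trailing newline at the end. A minimal Python sketch of building such a body follows; `bulk_body` is a hypothetical helper, and the index, type and ids are taken from the earlier student examples.

```python
import json


def bulk_body(operations):
    """Build a _bulk body from (action, metadata, source) tuples.

    `source` is ignored for delete, which has no data line.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if action != "delete":
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body must end with a newline


operations = [
    ("index", {"_index": "user", "_type": "student", "_id": "1"},
     {"studentName": "huping", "studentAge": 20}),
    ("update", {"_index": "user", "_type": "student", "_id": "1"},
     {"doc": {"studentAge": 18}}),
    ("delete", {"_index": "user", "_type": "student", "_id": "2"}, None),
]
print(bulk_body(operations), end="")
```

Each JSON object has to stay on a single line, which is why the body is assembled line by line instead of pretty-printed.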
consistency parameter (replaced by wait_for_active_shards in ES 5.0 and later)
one (the primary shard is active)
all (all shards are active)
quorum (default; a majority of the shards is active)
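A worked example of what "majority" means, written as Python. The quorum formula, int((primary + number_of_replicas) / 2) + 1, and the rule that quorum is enforced only when number_of_replicas is greater than 1 are taken from my recollection of the older ES documentation for this parameter, so treat them as assumptions to verify. The function name is hypothetical.

```python
def required_active_copies(consistency, number_of_replicas):
    """Number of active copies of one shard needed before a write proceeds."""
    total_copies = 1 + number_of_replicas  # one primary plus its replicas
    if consistency == "one":
        return 1
    if consistency == "all":
        return total_copies
    if consistency == "quorum":
        # Assumed special case: with 0 or 1 replicas the quorum is not enforced,
        # so a single-node cluster can still accept writes.
        if number_of_replicas <= 1:
            return 1
        return total_copies // 2 + 1
    raise ValueError(f"unknown consistency level: {consistency}")


print(required_active_copies("quorum", 3))  # 3
print(required_active_copies("all", 3))     # 4
print(required_active_copies("quorum", 1))  # 1
```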
Searching across multiple indices or types
/_search
/index1,index2/_search
/_all/type1,type2/_search
/index*/_search
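Exact search and full-text search
Exact search: the input is not split
Full-text search: the input is split into keywords, which are then searched
When the inverted index is built, the text is not only tokenized but also normalized (each term is processed: case, synonyms, abbreviations, ...)
Analyzers
standard analyzer (default), simple analyzer, whitespace analyzer, language analyzer
Different analyzers handle special characters (hyphens -, parentheses (), ...) differently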
GET /_analyze
{
"analyzer": "standard",
"text": "text to analyze"
}
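To see how analyzers differ on special characters, here is a rough Python approximation of two of them. These are stand-ins written for illustration, not the real implementations: the whitespace version only splits on whitespace, and the simple version splits on anything that is not a letter and lowercases the result. The standard analyzer is not reproduced here.

```python
import re


def whitespace_analyze(text):
    """Approximation of the whitespace analyzer: split on whitespace only."""
    return text.split()


def simple_analyze(text):
    """Approximation of the simple analyzer: split on non-letters, lowercase."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


text = "Set the shape to semi-transparent"
print(whitespace_analyze(text))  # ['Set', 'the', 'shape', 'to', 'semi-transparent']
print(simple_analyze(text))      # ['set', 'the', 'shape', 'to', 'semi', 'transparent']
```

The hyphenated word stays whole in one case and becomes two terms in the other, so the same query can hit or miss depending on the analyzer. Running the _analyze request against a real cluster shows the authoritative tokens.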
mapping
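When a document is inserted into ES, the mapping (created automatically by ES, dynamic mapping) defines the type of each field
Data types
Simple types
string (replaced by text and keyword since ES 5.x)
byte, short, integer, long
float, double
boolean
date
Complex types
object
A mapping can also be created manually, specifying the analyzer and the field types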
The Java application uses the alias
Use _alias to point the alias index at the old index
Create the new index and migrate the old index data into it through /_bulk
Switch the alias index over to the new index
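The final switch can be done in one request to the _aliases endpoint, which takes a list of remove and add actions. A minimal Python sketch that builds that request body; the alias and index names are hypothetical placeholders, and `alias_switch_actions` is a helper invented for this example.

```python
import json


def alias_switch_actions(alias, old_index, new_index):
    """Body for POST /_aliases that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names, used only for illustration.
body = alias_switch_actions("user_alias", "user_old", "user_new")
print(json.dumps(body, indent=2))
```

Putting both actions in one request means the application never sees a moment where the alias points at no index.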
relevance score algorithm: TF/IDF (term frequency/inverse document frequency)
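A textbook version of TF/IDF can be computed in a few lines of Python. This is a sketch of the general idea only: the scoring formula Lucene actually applies has more factors, and as far as I know recent ES versions default to BM25, so the numbers below will not match a real _score. The two sample sentences and the function name are chosen for the example.

```python
import math

docs = {
    "doc1": "hello nice to meet you",
    "doc2": "hello world",
}


def tf_idf(term, doc_id):
    """Textbook TF-IDF: (term count / doc length) * log(N / docs containing term)."""
    words = docs[doc_id].split()
    tf = words.count(term) / len(words)
    containing = sum(1 for text in docs.values() if term in text.split())
    idf = math.log(len(docs) / containing) if containing else 0.0
    return tf * idf


print(tf_idf("hello", "doc2"))  # 0.0, because "hello" occurs in every document
print(tf_idf("world", "doc2"))  # positive, because "world" is rare
```

A term that appears in every document carries no information for ranking, while a rare term in a short document scores highest.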
doc1: hello, nice to meet you
doc2: hello world
word | doc1 | doc2 |
---|---|---|
hello | * | * |
nice | * | |
to | * | |
meet | * | |
you | * | |
world | | * |
hello -> doc1, doc2
nice -> doc1
world -> doc2
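The table above is an inverted index: each word points to the documents that contain it. A minimal Python sketch of building one for the same two documents; tokenization here is just lowercasing and splitting on commas and whitespace, which is an assumption made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().replace(",", " ").split():
            index[word].add(doc_id)
    return index


docs = {"doc1": "hello,nice to meet you", "doc2": "hello world"}
index = build_inverted_index(docs)

print(sorted(index["hello"]))  # ['doc1', 'doc2']
print(sorted(index["nice"]))   # ['doc1']
print(sorted(index["world"]))  # ['doc2']
```

A search for a word is then a dictionary lookup instead of a scan over every document.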
doc1: {"name":"jack", "age":22}
doc2: {"name":"rose", "age":20}
document | name | age |
---|---|---|
doc1 | jack | 22 |
doc2 | rose | 20 |
Hmm, there are quite a lot of them; they correspond to the DSL
https://juejin.im/post/5b3ac6db6fb9a024fc284e60#heading-28
Reference docs https://www.cnblogs.com/jajian/p/9976900.html
https://blog.csdn.net/jiaminbao/article/category/7314565
Video course https://www.bilibili.com/video/av29521652?from=search&seid=14185113444924508715