Reposted from https://blog.csdn.net/chengyuqiang/article/details/79059958
Mapping parameters overview:
Elasticsearch provides a rich set of mapping parameters.
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/mapping-params.html
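1. analyzer
Specifies the analyzer for a field. It applies at both index time and search time, unless a separate search_analyzer is set. The example below configures the IK analyzer.
Step 1: Create the index and define the mapping
Step 2: Index documents into the index
POST http://192.168.1.74:9200/my_index/it/1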
{
"name" : "中国是个幸福美丽的国度"
}
POST http://192.168.1.74:9200/my_index/it/2
{
"name" : "广东省的省会城市是羊城(广州)"
}
POST http://192.168.1.74:9200/my_index/it/3
{
"name" : "中国最南的岛屿叫曾母暗沙"
}
Step 3: Query
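The request bodies for steps 1 and 3 are not reproduced in this article. As a rough sketch only, assuming the elasticsearch-analysis-ik plugin is installed and its ik_max_word analyzer is the one intended (the helper names and the query text are also assumptions), the two bodies could be built in Python like this:

```python
import json

def build_ik_mapping(field="name", analyzer="ik_max_word"):
    """Hypothetical helper: mapping body for PUT my_index (ES 6.x typed layout)."""
    return {
        "mappings": {
            "it": {
                "properties": {
                    field: {
                        "type": "text",
                        "analyzer": analyzer,         # applied at index time
                        "search_analyzer": analyzer,  # applied to the query text
                    }
                }
            }
        }
    }

def build_match_query(field, text):
    """Hypothetical helper: body for GET my_index/_search."""
    return {"query": {"match": {field: text}}}

mapping_body = build_ik_mapping()
query_body = build_match_query("name", "中国")

print(json.dumps(mapping_body, ensure_ascii=False, indent=2))
print(json.dumps(query_body, ensure_ascii=False))
```

With a mapping of this shape, the match query text is tokenized by the same analyzer that tokenized the stored documents.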
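2. normalizer
A normalizer standardizes the value of a keyword field before it is indexed. The example below converts all characters to lowercase (and folds non-ASCII characters to ASCII).
Step 1: Create the index and define the normalizer in the analysis settings.
Request body: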
{
"settings": {
"analysis": {
"normalizer":{
"my_normalizer":{
"type":"custom",
"char_filter" : [],
"filter" : ["lowercase", "asciifolding"]
}
}
}
},
"mappings": {
"it" : {
"properties" : {
"name" : {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
}
}
}
Step 2: Insert data:
POST http://192.168.1.74:9200/my_index/it/1
{
"name" : "bar"
}
POST http://192.168.1.74:9200/my_index/it/2
{
"name" : "BAR"
}
POST http://192.168.1.74:9200/my_index/it/3
{
"name" : "BaR"
}
Step 3: Query the data:
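The query for this step is not reproduced in this article. To illustrate the effect, here is a small Python simulation. It is only an approximation: my_normalizer below imitates the lowercase and asciifolding filters with the standard library rather than Lucene's actual tables, and term_query is a hypothetical stand-in for a term query on the name field.

```python
import unicodedata

def my_normalizer(value: str) -> str:
    """Rough stand-in for the lowercase + asciifolding filters."""
    lowered = value.lower()
    decomposed = unicodedata.normalize("NFKD", lowered)
    # Drop combining marks so that e.g. "é" becomes "e".
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# The three documents inserted in step 2.
docs = {"1": "bar", "2": "BAR", "3": "BaR"}

# The keyword field stores the normalized value as a single term.
index = {doc_id: my_normalizer(name) for doc_id, name in docs.items()}

def term_query(value: str):
    # The search term goes through the same normalizer before matching.
    needle = my_normalizer(value)
    return sorted(doc_id for doc_id, stored in index.items() if stored == needle)

print(term_query("BAR"))
print(term_query("bar"))
```

In this model both queries return all three documents, because every stored value and every query term is reduced to the same term, bar.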
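3. boost
Assigning a boost value controls the relative weight of each query clause. The default is 1; a boost greater than 1 increases the relative weight of that clause.
Step 1: Create the index and set the boost
Step 2: Index documents
POST http://192.168.1.74:9200/my_index/it/1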
{
"title" : "client me"
}
POST http://192.168.1.74:9200/my_index/it/2
{
"title" : "follow"
}
Step 3: Query the data
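The mapping and query bodies for steps 1 and 3 are not reproduced in this article. The sketch below shows one plausible shape for them, built in Python. The boost value of 2, the query text and the helper name are assumptions made for illustration; the sketch also shows a boost set at query time, which is an alternative to putting it in the mapping.

```python
import json

# Assumed mapping body for PUT my_index: the title field carries a boost of 2.
mapping_body = {
    "mappings": {
        "it": {
            "properties": {
                "title": {"type": "text", "boost": 2}
            }
        }
    }
}

def boosted_match(field, text, boost=1.0):
    """Hypothetical helper: a match query whose clause carries its own boost."""
    return {"query": {"match": {field: {"query": text, "boost": boost}}}}

query_body = boosted_match("title", "client", boost=2)

print(json.dumps(mapping_body, indent=2))
print(json.dumps(query_body, indent=2))
```

Setting the boost in the query keeps the mapping unchanged, so the weight can be tuned without reindexing.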
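4. coerce
The coerce parameter cleans up dirty data; it defaults to true. The integer 5 might arrive as the string "5" or as the floating-point number 5.0, and coercion converts such values to the field's type:
[Example]
(1) Recreate my_index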
DELETE my_index
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"number_one": {
"type": "integer"
},
"number_two": {
"type": "integer",
"coerce": false
}
}
}
}
}
(2) Write a test document; number_one uses the default coerce setting, so the string "10" is accepted
PUT my_index/my_type/1
{
"number_one": "10"
}
{
"_index": "my_index",
"_type": "my_type",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
(3) Write another test document; number_two has coerce disabled, so the string "10" is rejected
PUT my_index/my_type/2
{
"number_two": "10"
}
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "failed to parse [number_two]"
}
],
"type": "mapper_parsing_exception",
"reason": "failed to parse [number_two]",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Integer value passed as String"
}
},
"status": 400
}
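The two outcomes above can be imitated with a toy model in Python. This is a simplified sketch, not Elasticsearch's parser: parse_integer is a hypothetical function, and it only models int and str inputs, which are the cases shown in the example.

```python
def parse_integer(value, coerce=True):
    """Toy model of an integer field; only int and str inputs are modelled."""
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError("only int and str inputs are modelled in this sketch")
    if isinstance(value, int):
        return value
    if not coerce:
        # Mirrors the error reason shown in the response above.
        raise ValueError("Integer value passed as String")
    return int(value)

# number_one: coerce left at its default (true).
print(parse_integer("10"))

# number_two: coerce disabled.
try:
    parse_integer("10", coerce=False)
except ValueError as err:
    print("rejected:", err)
```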
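5. copy_to
The copy_to parameter lets you build a custom _all field. In other words, several fields can be merged into one combined field; for example, first_name and last_name can be copied into a full_name field.
[Example]
(1) Create the index and write a document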
DELETE my_index
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
}
}
}
PUT my_index/my_type/1
{
"first_name": "John",
"last_name": "Smith"
}
(2) Query
GET my_index/_search
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}
{
"took": 22,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.5753642,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "1",
"_score": 0.5753642,
"_source": {
"first_name": "John",
"last_name": "Smith"
}
}
]
}
}
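Note that the _source in the response contains only first_name and last_name: the copied values exist in the index, not in the stored document. The Python sketch below imitates that behaviour. It is a simplified model in which analyze is a whitespace and lowercase tokenizer standing in for the standard analyzer, and all function names are hypothetical.

```python
def analyze(text):
    """Stand-in for the standard analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def index_document(doc, copy_to):
    """Return {field: tokens}; copy_to maps a source field to its target field."""
    indexed = {}
    for field, value in doc.items():
        tokens = analyze(value)
        indexed.setdefault(field, []).extend(tokens)
        target = copy_to.get(field)
        if target:
            # The tokens are also added to the target field in the index only.
            indexed.setdefault(target, []).extend(tokens)
    return indexed

def match_and(indexed, field, query):
    """Match query with operator "and": every query token must be present."""
    terms = set(indexed.get(field, []))
    return all(token in terms for token in analyze(query))

doc = {"first_name": "John", "last_name": "Smith"}
indexed = index_document(doc, {"first_name": "full_name", "last_name": "full_name"})

print(indexed["full_name"])
print(match_and(indexed, "full_name", "John Smith"))
print("full_name" in doc)  # the original document is left untouched
```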
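6. doc_values
doc_values speed up sorting and aggregations. While the inverted index is built, an additional column-oriented structure that maps each document to its value is written alongside it, trading disk space for time. doc_values are enabled by default for fields that support them; they can be disabled for fields that will definitely never be sorted or aggregated on.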
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"status_code": {
"type": "keyword"
},
"session_id": {
"type": "keyword",
"doc_values": false
}
}
}
}
}
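The difference between the two structures can be sketched in a few lines of Python. This is a conceptual illustration only: the three status codes are made-up sample data and the dictionaries are nothing like Lucene's on-disk formats.

```python
from collections import Counter, defaultdict

# Made-up sample data: doc id -> status_code.
status_codes = {1: "200", 2: "404", 3: "200"}

# Inverted index: term -> doc ids. Good for "which documents contain 200?".
inverted_index = defaultdict(list)
for doc_id, code in status_codes.items():
    inverted_index[code].append(doc_id)

# Column-oriented doc values: doc id -> value. Good for "what is doc 3's value?".
doc_values = dict(status_codes)

matching = inverted_index["200"]                                  # search
buckets = Counter(doc_values[doc_id] for doc_id in status_codes)  # aggregation
ordered = sorted(status_codes, key=lambda doc_id: doc_values[doc_id])  # sort

print(matching)
print(dict(buckets))
print(ordered)
```

Sorting and aggregating both need the value for a given document, which is the lookup direction the column store provides and the inverted index does not.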
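7. dynamic
The dynamic parameter controls how newly detected fields are handled. It has three values: true (the default; new fields are added to the mapping), false (new fields are ignored and not indexed, although they still appear in _source), and strict (an exception is thrown and the document is rejected).
[Example]
(1) Create the index
The value is set to strict; non-boolean values must be quoted.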
DELETE my_index
PUT my_index
{
"mappings": {
"my_type": {
"dynamic": "strict",
"properties": {
"title": { "type": "text"}
}
}
}
}
(2) Insert a new document
PUT my_index/my_type/1
{
"title": "test",
"content": "test dynamic"
}
An exception is thrown:
{
"error": {
"root_cause": [
{
"type": "strict_dynamic_mapping_exception",
"reason": "mapping set to strict, dynamic introduction of [content] within [my_type] is not allowed"
}
],
"type": "strict_dynamic_mapping_exception",
"reason": "mapping set to strict, dynamic introduction of [content] within [my_type] is not allowed"
},
"status": 400
}
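The three settings can be compared side by side with a simplified Python model. apply_dynamic is a hypothetical function that tracks only field names; real dynamic mapping also infers field types.

```python
def apply_dynamic(mapped_fields, doc, dynamic="true"):
    """Return (known fields, indexed fields) under a simplified model."""
    known = set(mapped_fields)
    new_fields = [name for name in doc if name not in known]
    if new_fields and dynamic == "strict":
        raise ValueError(
            f"mapping set to strict, dynamic introduction of [{new_fields[0]}] is not allowed"
        )
    if dynamic == "true":
        known.update(new_fields)  # new fields are added to the mapping
    # With "false", unknown fields are simply left out of the index.
    indexed = {name: value for name, value in doc.items() if name in known}
    return known, indexed

doc = {"title": "test", "content": "test dynamic"}

for mode in ("true", "false"):
    known, indexed = apply_dynamic({"title"}, doc, mode)
    print(mode, sorted(known), sorted(indexed))

try:
    apply_dynamic({"title"}, doc, "strict")
except ValueError as err:
    print("strict:", err)
```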
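8. enabled
Elasticsearch indexes all fields by default. For a field with enabled set to false, Elasticsearch skips parsing the field's content entirely: the value can still be retrieved from _source, but it cannot be searched. Because the content is not parsed, the field can hold any JSON value. The setting applies only to the mapping type and to object fields.
[Example]
(1) Create the index and insert a document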
DELETE my_index
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"name":{"enabled": false}
}
}
}
}
PUT my_index/my_type/1
{
"title": "test enabled",
"name":"chengyuqiang"
}
(2) Retrieve the document
GET my_index/my_type/1
{
"_index": "my_index",
"_type": "my_type",
"_id": "1",
"_version": 1,
"found": true,
"_source": {
"title": "test enabled",
"name": "chengyuqiang"
}
}
(3) Search on the field
GET my_index/_search
{
"query": {
"match": {
"name": "chengyuqiang"
}
}
}
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
9. ignore_above
ignore_above sets the maximum length of strings that are indexed and stored for the field; strings longer than the limit are ignored.
DELETE my_index
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"message": {
"type": "keyword",
"ignore_above": 20
}
}
}
}
}
PUT my_index/my_type/1
{
"message": "Syntax error"
}
PUT my_index/my_type/2
{
"message": "Syntax error with some long stacktrace"
}
GET my_index/_search
{
"size":0,
"aggs": {
"messages": {
"terms": {
"field": "message"
}
}
}
}
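The mapping sets ignore_above to 20. The message in the first document is shorter than 20 characters, so it is indexed. The message in the second document is longer than 20 characters, so it is not indexed, and the aggregation returns only "Syntax error". The result is: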
{
"took": 12,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"messages": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Syntax error",
"doc_count": 1
}
]
}
}
}
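Note that hits.total is still 2: both documents are stored, but only one message reaches the index. The filtering step can be sketched in Python as follows, where terms_aggregation is a hypothetical helper that counts the values that would be indexed:

```python
from collections import Counter

def terms_aggregation(messages, ignore_above):
    """Count only the values short enough to be indexed."""
    indexed = [message for message in messages if len(message) <= ignore_above]
    return dict(Counter(indexed))

messages = ["Syntax error", "Syntax error with some long stacktrace"]

for message in messages:
    print(len(message), repr(message))

buckets = terms_aggregation(messages, ignore_above=20)
print(buckets)
```

The first message is 12 characters long and the second is 38, so only the first one ends up in a bucket.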
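10. ignore_malformed
ignore_malformed lets Elasticsearch ignore malformed data. For a userid field, for example, some users may enter an integer while others enter an email address. By default, indexing a value of the wrong data type into a field raises an exception and the whole document is rejected. With ignore_malformed set to true, the exception is suppressed: the malformed field is not indexed, and the other fields are indexed normally.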
DELETE my_index
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"number_one": {
"type": "integer",
"ignore_malformed": true
},
"number_two": {
"type": "integer"
}
}
}
}
}
PUT my_index/my_type/1
{
"text": "Some text value",
"number_one": "foo"
}
PUT my_index/my_type/2
{
"text": "Some text value",
"number_two": "foo"
}
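In the example above, number_one is an integer field with ignore_malformed set to true, so document 1 is written successfully even though its number_one value is a string. number_two is an integer field with ignore_malformed left at its default of false, so document 2 is rejected: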
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "failed to parse [number_two]"
}
],
"type": "mapper_parsing_exception",
"reason": "failed to parse [number_two]",
"caused_by": {
"type": "number_format_exception",
"reason": "For input string: \"foo\""
}
},
"status": 400
}
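The same behaviour can be modelled in Python. This is a simplified sketch: index_document is a hypothetical function, and integer_fields is an assumed structure that maps each integer field to its ignore_malformed flag.

```python
def index_document(doc, integer_fields):
    """Return the fields that get indexed; raise if a malformed value is not ignored."""
    indexed = {}
    for field, value in doc.items():
        if field not in integer_fields:
            indexed[field] = value  # non-integer fields pass through unchanged
            continue
        try:
            indexed[field] = int(value)
        except (TypeError, ValueError):
            if not integer_fields[field]:
                raise ValueError(f"failed to parse [{field}]")
            # ignore_malformed is true: skip this field, keep the document.
    return indexed

integer_fields = {"number_one": True, "number_two": False}

print(index_document({"text": "Some text value", "number_one": "foo"}, integer_fields))

try:
    index_document({"text": "Some text value", "number_two": "foo"}, integer_fields)
except ValueError as err:
    print("rejected:", err)
```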
11. index_options
The index_options parameter controls what information is added to the inverted index for search and highlighting purposes.
Note: the index_options parameter has been deprecated for numeric fields in 6.0.0.
DELETE my_index
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"text": {
"type": "text",
"index_options": "offsets"
}
}
}
}
}
PUT my_index/my_type/1
{
"text": "Quick brown fox"
}
GET my_index/_search
{
"query": {
"match": {
"text": "brown fox"
}
},
"highlight": {
"fields": {
"text": {}
}
}
}
{
"took": 50,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.5753642,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "1",
"_score": 0.5753642,
"_source": {
"text": "Quick brown fox"
},
"highlight": {
"text": [
"Quick <em>brown</em> <em>fox</em>"
]
}
}
]
}
}
12. index
The index parameter specifies whether a field is indexed; a field that is not indexed cannot be searched. It accepts true or false.
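The article gives no example for this parameter. A minimal sketch of a mapping body that uses it, built in Python, might look like the following; the field names title and session_id are assumptions chosen for illustration.

```python
import json

# Assumed mapping body for PUT my_index: session_id is kept in _source
# but is not indexed, so queries on it cannot match.
mapping_body = {
    "mappings": {
        "my_type": {
            "properties": {
                "title": {"type": "text"},
                "session_id": {"type": "keyword", "index": False},
            }
        }
    }
}

request_json = json.dumps(mapping_body, indent=2)
print(request_json)
```

json.dumps writes Python's False as the JSON literal false, which is the form the mapping expects.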
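13. fields
fields (multi-fields) let the same value be indexed in more than one way. For example, a string field can be mapped as text for full-text search and as keyword for sorting and aggregations.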
DELETE my_index
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"city": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
}
}
PUT my_index/my_type/1
{
"city": "New York"
}
PUT my_index/my_type/2
{
"city": "York"
}
GET my_index/_search
{
"query": {
"match": {
"city": "york"
}
},
"sort": {
"city.raw": "asc"
},
"aggs": {
"Cities": {
"terms": {
"field": "city.raw"
}
}
}
}
{
"took": 31,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": null,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "1",
"_score": null,
"_source": {
"city": "New York"
},
"sort": [
"New York"
]
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "2",
"_score": null,
"_source": {
"city": "York"
},
"sort": [
"York"
]
}
]
},
"aggregations": {
"Cities": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "New York",
"doc_count": 1
},
{
"key": "York",
"doc_count": 1
}
]
}
}
}
Notes: The city.raw field is a keyword version of the city field.
The city field can be used for full-text search.
The city.raw field can be used for sorting and aggregations.
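The split between the two sub-fields can be imitated in Python. This is a simplified model: analyze is a lowercase and whitespace tokenizer standing in for the standard analyzer, and the variable names are hypothetical.

```python
def analyze(text):
    """Stand-in for the standard analyzer: lowercase and split on whitespace."""
    return text.lower().split()

cities = {"1": "New York", "2": "York"}

# city (text): analyzed into tokens for full-text search.
text_index = {doc_id: analyze(city) for doc_id, city in cities.items()}

# city.raw (keyword): the exact original string, kept as a single term.
keyword_index = dict(cities)

# match on city: "york" is a token of both documents.
hits = [doc_id for doc_id, tokens in text_index.items() if "york" in tokens]

# sort on city.raw: exact strings are compared, so "New York" precedes "York".
sorted_hits = sorted(hits, key=lambda doc_id: keyword_index[doc_id])

print(sorted_hits)
print([keyword_index[doc_id] for doc_id in sorted_hits])
```

This reproduces the order of the hits in the response above: document 1 (New York) first, then document 2 (York).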