Main topics: Elasticsearch (ES) data types, basic Mapping usage, and how to use complex data types
1、Overview of ES data types
1. Core data types
(1) String types: text, keyword
(2) Numeric types: long, integer, short, byte, double, float, half_float, scaled_float
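① long -- signed 64-bit integer, minimum -2^63, maximum 2^63-1
② integer -- signed 32-bit integer, minimum -2^31, maximum 2^31-1
③ short -- signed 16-bit integer, minimum -32768, maximum 32767
④ byte -- signed 8-bit integer, minimum -128, maximum 127
⑤ double -- double-precision 64-bit IEEE 754 floating-point number
⑥ float -- single-precision 32-bit IEEE 754 floating-point number
⑦ half_float -- half-precision 16-bit IEEE 754 floating-point number
⑧ scaled_float -- floating-point number backed by a long and scaled by a fixed scaling factor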
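The integer bounds above follow directly from the bit width, and scaled_float can be pictured as a rounded long. The following is a minimal Python sketch of that arithmetic, not Elasticsearch code; the helper names `signed_range` and `to_scaled_long` and the scaling factor of 100 are assumptions made for illustration.

```python
def signed_range(bits):
    """Return (min, max) of a two's-complement signed integer with the given bit width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Bit widths of the integer types listed above.
INTEGER_TYPES = {"long": 64, "integer": 32, "short": 16, "byte": 8}


def to_scaled_long(value, scaling_factor):
    """Simplified model of scaled_float: the value times the scaling factor, rounded to an integer."""
    return round(value * scaling_factor)


if __name__ == "__main__":
    for name, bits in INTEGER_TYPES.items():
        low, high = signed_range(bits)
        print(f"{name}: {low} .. {high}")
    # With an assumed scaling factor of 100, a price is kept as a whole number of cents.
    print(to_scaled_long(12.34, 100))
```

With a scaling factor of 100 the value keeps two decimal places; anything finer is lost in the rounding step.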
(3) Date: date
(4) Date with nanosecond resolution: date_nanos
(5) Boolean: boolean
(6) Binary: binary
(7) Range: integer_range, float_range, long_range, double_range, date_range
2. Complex data types
(1) Object: object (for a single JSON object)
(2) Nested: nested (for arrays of JSON objects)
3. Geo data types
(1) Geo-point: geo_point (latitude/longitude points)
(2) Geo-shape: geo_shape (for complex shapes such as polygons)
4. Specialized data types
(1) IP: ip (IPv4 and IPv6 addresses)
(2) Completion: completion (provides auto-complete suggestions)
(3) Token count: token_count (counts the number of tokens in a string)
(4) mapper-murmur3: murmur3 (computes a hash of the value at index time and stores it in the index)
(5) mapper-annotated-text: annotated-text (indexes text containing special markup, typically used to identify named entities)
(6) Percolator: percolator (accepts queries from the query DSL)
(7) Join: join (defines parent/child relations between documents in the same index)
(8) Alias: alias (defines an alias for an existing field)
(9) Rank feature: rank_feature (records a numeric feature to boost hits at query time)
(10) Rank features: rank_features (records numeric features to boost hits at query time)
(11) Dense vector: dense_vector (records dense vectors of float values)
(12) Sparse vector: sparse_vector (records sparse vectors of float values)
(13) Search-as-you-type: search_as_you_type (a text-like field optimized for as-you-type completion queries)
(14) Flattened: flattened (allows an entire JSON object to be indexed as a single field)
(15) Shape: shape (for arbitrary cartesian geometries)
(16) Histogram: histogram (for pre-aggregated numerical values used in percentiles aggregations)
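5. Arrays
Elasticsearch has no dedicated array data type. Any field can hold zero or more values by default, but all values in an array must be of the same data type as the field.
6. Multi-fields
Multi-fields are commonly used to index the same field in different ways for different purposes. For example, a string field can be mapped as text for full-text search and also as keyword for sorting or aggregations. Alternatively, the same text field can be indexed with the standard analyzer, the english analyzer, and the french analyzer.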
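As a sketch of what such a mapping body could look like, the Python snippet below builds one with the `fields` mapping parameter and prints it as JSON. The field name `city` and the sub-field names `raw` and `english` are hypothetical and do not come from these notes.

```python
import json


def multi_field_mapping():
    """Build an index-creation body where one field is indexed three ways.

    "city", "raw" and "english" are illustrative names, not part of the original notes.
    """
    return {
        "mappings": {
            "properties": {
                "city": {
                    "type": "text",  # main field: full-text search
                    "fields": {
                        # exact, un-analyzed value for sorting and aggregations
                        "raw": {"type": "keyword"},
                        # same text analyzed again with the english analyzer
                        "english": {"type": "text", "analyzer": "english"},
                    },
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(multi_field_mapping(), indent=2))
```

Queries would then address the sub-fields with dotted names such as `city.raw` and `city.english`.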
2、Viewing a Mapping
GET /index/_mapping
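3、Creating a Mapping manually
A mapping can be defined manually only when the index is created, or by adding a mapping for a new field; the mapping of an existing field cannot be updated.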
PUT /website
{
  "mappings": {
    "properties": {
      "author_id": {
        "type": "long"
      },
      "title": {
        "type": "text",
        "analyzer": "english"
      },
      "content": {
        "type": "text"
      },
      "post_date": {
        "type": "date"
      },
      "publisher_id": {
        "type": "keyword"
      }
    }
  }
}
4、Adding a field and specifying its mapping
PUT /website/_mappings
{
  "properties": {
    "new_field": {
      "type": "text",
      "analyzer": "english"
    }
  }
}
## Check that the field was added
GET website/_mapping
## Index a document
POST /website/_doc
{
  "author_id": 123123123,
  "title": "english dog",
  "content": "123",
  "post_date": "2020-03-22",
  "publisher_id": 123
}
5、Testing how a mapped field is analyzed
GET /website/_analyze
{
  "field": "content",
  "text": "my-dogs"
}
GET website/_analyze
{
  "field": "new_field",
  "text": "my dogs"
}
## A field type that is not analyzed (keyword)
GET /website/_analyze
{
  "field": "publisher_id",
  "text": "my-dogs"
}
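To make the difference between an analyzed text field and a keyword field concrete, here is a rough Python approximation. It is only a stand-in: the real standard analyzer uses Unicode text segmentation, and the function names `approx_standard_tokens` and `keyword_tokens` are hypothetical. The english analyzer used by `new_field` additionally stems tokens, which this sketch does not attempt.

```python
import re


def approx_standard_tokens(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def keyword_tokens(text):
    """keyword fields are not analyzed: the whole input becomes a single term."""
    return [text]


if __name__ == "__main__":
    # "content" is a text field with the default analyzer, so the hyphen splits the input.
    print(approx_standard_tokens("my-dogs"))
    # "publisher_id" is a keyword field, so the input stays in one piece.
    print(keyword_tokens("my-dogs"))
```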
6、Complex data types (object)
PUT /company/_doc/1
{
  "address": {
    "country": "china",
    "province": "guangdong",
    "city": "guangzhou"
  },
  "name": "jack",
  "age": 27,
  "join_date": "2017-01-01"
}
GET /company/_mapping
Response:
{
  "company" : {
    "mappings" : {
      "properties" : {
        "address" : {
          "properties" : {
            "city" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "country" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "province" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "age" : {
          "type" : "long"
        },
        "join_date" : {
          "type" : "date"
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}
Complex data types are transformed internally before they are indexed.
## object type
{
  "authors": [
    { "age": 26, "name": "Jack White"},
    { "age": 55, "name": "Tom Jones"},
    { "age": 39, "name": "Kitty Smith"}
  ]
}
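Stored format: name and age are separated. The array of objects is flattened from a row-oriented layout (one entry per object) into a column-oriented one (one multi-valued field per property), so the link between each name and its age is lost.
{
  "authors.age": [26, 55, 39],
  "authors.name": ["jack", "white", "tom", "jones", "kitty", "smith"]
}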
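The flattening step can be sketched in a few lines of Python. This is a simplified model rather than what Lucene actually does: the helper name `flatten_object_array` is hypothetical, and the sketch keeps each name as one value instead of splitting it into lowercase tokens as the text analyzer would.

```python
def flatten_object_array(field, objects):
    """Collect each property of an array of objects into one multi-valued, dotted field."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


if __name__ == "__main__":
    authors = [
        {"age": 26, "name": "Jack White"},
        {"age": 55, "name": "Tom Jones"},
        {"age": 39, "name": "Kitty Smith"},
    ]
    # Each property becomes its own list; which age belongs to which name is no longer recorded.
    print(flatten_object_array("authors", authors))
```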
An additional example from the official documentation:
PUT my_index
{
  "mappings": {
    "properties": {
      "manager": {
        "properties": {
          "age": { "type": "integer" },
          "name": { "type": "text" }
        }
      },
      "employees": {
        "type": "nested", // nested type, for arrays of objects (a complex data type)
        "properties": {
          "age": { "type": "integer" },
          "name": { "type": "text" }
        }
      }
    }
  }
}
PUT my_index/_doc/1
{
  "region": "US",
  "manager": {
    "name": "Alice White",
    "age": 30
  },
  "employees": [
    {
      "name": "John Smith",
      "age": 34
    },
    {
      "name": "Peter Brown",
      "age": 26
    }
  ]
}
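Why `employees` is mapped as nested rather than as a plain object can be illustrated with a small Python model of the two matching behaviours. It is a sketch under simplifying assumptions: the function names are hypothetical, and names are compared by exact equality instead of full-text matching.

```python
EMPLOYEES = [
    {"name": "John Smith", "age": 34},
    {"name": "Peter Brown", "age": 26},
]


def matches_as_object(objects, name, age):
    """Flattened object semantics: each condition may be satisfied by a different element."""
    return any(o["name"] == name for o in objects) and any(o["age"] == age for o in objects)


def matches_as_nested(objects, name, age):
    """nested semantics: both conditions must hold within the same element."""
    return any(o["name"] == name and o["age"] == age for o in objects)


if __name__ == "__main__":
    # No employee is both "John Smith" and 26, yet the flattened form still matches.
    print(matches_as_object(EMPLOYEES, "John Smith", 26))
    print(matches_as_nested(EMPLOYEES, "John Smith", 26))
```

In Elasticsearch itself, the per-element behaviour is obtained by querying the nested field with a nested query.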
Elasticsearch 6.x Mapping settings - Juejin https://juejin.im/post/5b799dcb6fb9a019be279bd7
Field datatypes | Elasticsearch Reference [7.6] | Elastic https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html