Elasticsearch 7.5 Field Types Explained

(Side grumble: Youdao Cloud Notes charges for image uploads, so I am just posting this on CSDN.)

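alias / field alias

Overview:
An alias mapping defines an alternate name for a field. The alias can be used in place of the target field in search requests.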

PUT trips
{
  "mappings": {
    "properties": {
      "distance": {
        "type": "long"
      },
      "route_length_miles": {
        "type": "alias",
        "path": "distance" 
      },
      "transit_mode": {
        "type": "keyword"
      }
    }
  }
}

GET _search
{
  "query": {
    "range" : {
      "route_length_miles" : {
        "gte" : 39
      }
    }
  }
}

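Some parts of the search request, as well as the field capabilities API, accept field wildcard patterns.
In those cases the wildcard pattern matches field aliases in addition to concrete fields: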

GET trips/_field_caps?fields=route_*,transit_mode

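Field notes:

  1. Aliases can be used in queries, aggregations, and sort fields (most search features support them)
  2. The target must be a concrete field, not an object or another field alias
  3. The target field must already exist when the alias is created
  4. If nested objects are defined, the alias must have the same nested scope as its target
  5. A field alias can have only one target (several aliases may still point to the same field)
  6. Writes through an alias are not supported: using an alias in an index or update request fails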

arrays

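Overview:
Elasticsearch has no dedicated array datatype. Any field can hold zero or more values, as long as all values in the array have the same datatype.

Array of strings
[ "one", "two" ]
Array of numbers
[ 1, 2 ]
Array of arrays
[ 1, [ 2, 3 ]]
which is equivalent to the flat array
[ 1, 2, 3 ]
Array of objects
[ { "name": "Mary", "age": 12 }, { "name": "John", "age": 10 }]

Arrays of mixed datatypes are not supported
[ 10, "some string" ]
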
PUT my_index/_doc/1
{
  "message": "some arrays in this document...",
  "tags":  [ "elasticsearch", "wow" ], 
  "lists": [ 
    {
      "name": "prog_list",
      "description": "programming list"
    },
    {
      "name": "cool_list",
      "description": "cool stuff list"
    }
  ]
}

PUT my_index/_doc/2 
{
  "message": "no arrays in this document...",
  "tags":  "elasticsearch",
  "lists": {
    "name": "prog_list",
    "description": "programming list"
  }
}

GET my_index/_search
{
  "query": {
    "match": {
      "tags": "elasticsearch" 
    }
  }
}

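Field notes:

  1. The mapping declares only the element type (string, number, and so on); any such field can hold an array without extra configuration
  2. An array may contain null values, which are either replaced by the configured null_value or skipped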
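
The equivalence between [ 1, [ 2, 3 ]] and [ 1, 2, 3 ] noted above can be pictured with a small Python sketch. The helper name flatten_values is made up for illustration; this is not how Elasticsearch implements it, only a model of the observable result.

```python
def flatten_values(values):
    """Recursively flatten nested lists into one flat list of values."""
    flat = []
    for value in values:
        if isinstance(value, list):
            flat.extend(flatten_values(value))  # unwrap the inner array
        else:
            flat.append(value)
    return flat


print(flatten_values([1, [2, 3]]))
print(flatten_values(["one", "two"]))
```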

binary

Overview:
The binary type accepts a binary value as a Base64-encoded string. The field is not stored by default and is not searchable

PUT my_index
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "blob": {
        "type": "binary"
      }
    }
  }
}

PUT my_index/_doc/1
{
  "name": "Some binary blob",
  "blob": "U29tZSBiaW5hcnkgYmxvYg==" 
}

Field notes:

  1. The Base64-encoded binary value must not contain embedded newlines (\n)
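
As a rough client-side sketch of the newline rule, the Python standard library can produce the value used in the example document. base64.b64encode returns a single line, while base64.encodebytes inserts newlines and therefore should be avoided here.

```python
import base64

raw = b"Some binary blob"

# b64encode returns one line with no embedded "\n".
encoded = base64.b64encode(raw).decode("ascii")
print(encoded)

# encodebytes is meant for MIME output and appends/inserts newlines.
mime_encoded = base64.encodebytes(raw)
print(b"\n" in mime_encoded)
```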

boolean

Overview:
Boolean fields accept JSON true and false values, and also strings that are interpreted as true or false

PUT my_index
{
  "mappings": {
    "properties": {
      "is_published": {
        "type": "boolean"
      }
    }
  }
}

POST my_index/_doc/1
{
  "is_published": "true" 
}

GET my_index/_search
{
  "query": {
    "term": {
      "is_published": true 
    }
  }
}

GET my_index/_search
{
  "aggs": {
    "publish_state": {
      "terms": {
        "field": "is_published"
      }
    }
  },
  "script_fields": {
    "is_published": {
      "script": {
        "lang": "painless",
        "source": "doc['is_published'].value"
      }
    }
  }
}

date

Overview:
Stores date values. A date can be a formatted string, or a number of milliseconds since the epoch

PUT my_index
{
  "mappings": {
    "properties": {
      "date": {
        "type":   "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}

PUT my_index/_doc/1
{ "date": "2015-01-01" } 

PUT my_index/_doc/2
{ "date": "2015-01-01 12:10:30" }    // must match one of the formats in the mapping above

PUT my_index/_doc/3
{ "date": 1420070400001 }    // milliseconds since the epoch

GET my_index/_search
{
  "sort": { "date": "asc"} 
}

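Field notes:

  1. Dates are always rendered as strings, even if they were originally supplied as a long in the JSON document
  2. The default format is "strict_date_optional_time||epoch_millis"
  3. strict_date_optional_time is the strict variant of date_optional_time: year, month, and day must be written with 4, 2, and 2 digits respectively, zero-padded on the left where needed. Date strings that do not satisfy this are rejected at index time
  4. An epoch_millis value must lie between Long.MIN_VALUE and Long.MAX_VALUE
  5. Custom patterns such as "yyyy-MM-dd", "yyyyMMdd", "yyyyMMddHHmmss", "yyyy-MM-dd'T'HH:mm:ss", "yyyy-MM-dd'T'HH:mm:ss.SSS", and "yyyy-MM-dd'T'HH:mm:ss.SSSZ" can be used, but a value is accepted only if it matches one of the formats listed in the mapping (separated by ||)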
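
To make notes 3 and 4 more concrete, here is a small Python sketch. It only approximates the date-only part of the strict format with a regular expression (an assumption for illustration, not the parser Elasticsearch uses) and converts the epoch_millis value from document 3 into a readable UTC timestamp.

```python
import re
from datetime import datetime, timedelta, timezone

# Date-only part of the strict format: exactly 4, 2 and 2 digits.
STRICT_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_strict_date(text):
    """True if text looks like a zero-padded yyyy-MM-dd date."""
    return STRICT_DATE.fullmatch(text) is not None


def epoch_millis_to_utc(millis):
    """Convert milliseconds since the epoch to an aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=millis)


print(is_strict_date("2015-01-01"))   # zero-padded
print(is_strict_date("2015-1-1"))     # not zero-padded
print(epoch_millis_to_utc(1420070400001).isoformat())
```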

geo_point

Overview:
A field that stores a latitude/longitude point

PUT my_index
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

PUT my_index/_doc/1
{
  "text": "Geo-point as an object",
  "location": { 
    "lat": 41.12,
    "lon": -71.34
  }
}

PUT my_index/_doc/2
{
  "text": "Geo-point as a string",
  "location": "41.12,-71.34" 
}

PUT my_index/_doc/3
{
  "text": "Geo-point as a geohash",
  "location": "drm3btev3e86" 
}

PUT my_index/_doc/4
{
  "text": "Geo-point as an array",
  "location": [ -71.34, 41.12 ] 
}

PUT my_index/_doc/5
{
  "text": "Geo-point as a WKT POINT primitive",
  "location" : "POINT (-71.34 41.12)" 
}

GET my_index/_search
{
  "query": {
    "geo_bounding_box": { 
      "location": {
        "top_left": {
          "lat": 42,
          "lon": -72
        },
        "bottom_right": {
          "lat": 40,
          "lon": -74
        }
      }
    }
  }
}

Field notes:

  1. Coordinate order depends on the notation: a string is written lat,lon, while an array uses the reverse order [ lon, lat ]
  2. Geo-points can be used to find documents whose point falls inside a boundary, such as the bounding box in the query above
  3. Geo-points can be used to sort documents by distance
  4. A point can be expressed as a geohash. Geohashes are base32-encoded strings of the interleaved bits of the latitude and longitude. Each character in a geohash adds 5 more bits of precision, so the longer the hash, the more precise it is. For indexing purposes geohashes are translated into latitude-longitude pairs. During this process only the first 12 characters are used, so specifying more than 12 characters in a geohash does not increase the precision. The 12 characters provide 60 bits, which should reduce a possible error to less than 2cm
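
The geohash description above can be sketched in Python. This is a minimal decoder written for illustration (the function name decode_geohash is made up, and Elasticsearch's own implementation may differ in detail). It follows the usual geohash convention of 5 bits per base32 character, alternating longitude and latitude bits, and it ignores anything past the 12th character.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode_geohash(geohash):
    """Return the (lat, lon) centre of the cell named by a geohash."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate, starting with longitude
    for char in geohash[:12]:  # only the first 12 characters are used
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # 5 bits per character, high bit first
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


lat, lon = decode_geohash("drm3btev3e86")
print(round(lat, 2), round(lon, 2))
```

Decoding the geohash from document 3 should land on roughly the same point as the other four documents.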

IP

Overview:
An ip field can index and store IPv4 or IPv6 addresses

PUT my_index
{
  "mappings": {
    "properties": {
      "ip_addr": {
        "type": "ip"
      }
    }
  }
}

PUT my_index/_doc/1
{
  "ip_addr": "192.168.1.1"
}

GET my_index/_search
{
  "query": {
    "term": {
      "ip_addr": "192.168.0.0/16"
    }
  }
}

GET my_index/_search
{
  "query": {
    "term": {
      "ip_addr": "2001:db8::/48"
    }
  }
}
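
The two term queries above pass a CIDR block, and a document matches when its stored address falls inside that block. The same membership test can be sketched with Python's standard ipaddress module; the helper name in_block is made up for illustration.

```python
import ipaddress


def in_block(address, cidr):
    """True if the address belongs to the CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


print(in_block("192.168.1.1", "192.168.0.0/16"))
print(in_block("10.0.0.1", "192.168.0.0/16"))
print(in_block("2001:db8::1", "2001:db8::/48"))
```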

join

Overview:
The join datatype is a special field that creates parent/child relations between documents of the same index

PUT my_index
{
  "mappings": {
    "properties": {
      "my_join_field": { 
        "type": "join",
        "relations": {
          "question": "answer" 
          // parent name : child name
        }
      }
    }
  }
}

// Parent document
PUT my_index/_doc/1?refresh
{
  "text": "This is a question",
  "my_join_field": {
    "name": "question" 
  }
}

// Parent document
PUT my_index/_doc/2?refresh
{
  "text": "This is another question",
  "my_join_field": {
    "name": "question"
  }
}

// For a parent document the join field may be given as an object (as above) or as a plain string (as here)
PUT my_index/_doc/1?refresh
{
  "text": "This is a question",
  "my_join_field": "question" 
}

// Same string shorthand for the second parent document
PUT my_index/_doc/2?refresh
{
  "text": "This is another question",
  "my_join_field": "question"
}


// Child document
PUT my_index/_doc/3?routing=1&refresh 
{
  "text": "This is an answer",
  "my_join_field": {
    "name": "answer",  // child relation name
    "parent": "1"   // ID of the parent document
  }
}

// Child document (the routing value must be the parent ID so that both land on the same shard)
PUT my_index/_doc/4?routing=2&refresh
{
  "text": "This is another answer",
  "my_join_field": {
    "name": "answer",
    "parent": "2"
  }
}

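Field notes:

  1. When indexing a child document, the join field must contain both the relation name and the ID of the parent document
  2. Only one join field mapping is allowed per index
  3. Parent and child documents must be indexed on the same shard. This means the same routing value must be provided when getting, deleting, or updating a child document
  4. A document can have multiple children but only one parent
  5. A new relation can be added to an existing join field
  6. A child can also be added to an existing document, but only if that document is already a parent
  7. The join field uses global ordinals to speed up joins. Global ordinals need to be rebuilt after any change to a shard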
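
Note 3 is why the child requests carry a routing parameter. The sketch below models the idea in Python: a document goes to the shard picked by hashing its routing value, which defaults to its own ID. The CRC32 hash and the shard count of 2 are stand-ins chosen for illustration; Elasticsearch uses its own hash function.

```python
import zlib

NUM_SHARDS = 2  # hypothetical shard count


def shard_for(doc_id, routing=None, num_shards=NUM_SHARDS):
    """Pick a shard from the routing value (defaults to the document ID)."""
    key = routing if routing is not None else doc_id
    # Stand-in hash for illustration; not the hash Elasticsearch uses.
    return zlib.crc32(key.encode("utf-8")) % num_shards


parent_shard = shard_for("1")              # parent routed by its own ID
child_shard = shard_for("3", routing="1")  # child indexed with ?routing=1
print(parent_shard == child_shard)
```

Because the child reuses the parent ID as its routing value, both always hash to the same shard, whatever the child's own ID is.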

keyword

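Overview:
Stores string values such as IDs, tags, or status codes. The whole value is indexed as a single term without analysis, so it is found only by its exact value. For analyzed full-text search (for example Chinese text with the IK analyzer), use the text type instead

// Example 1
PUT my_index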
{
  "mappings": {
    "properties": {
      "tags": {
        "type":  "keyword"
      }
    }
  }
}

// Example 2
PUT my_index
{
  "mappings": {
    "properties": {
      "tags": {
        "type":  "keyword",
        "ignore_above":   32766  // strings longer than this many characters are not indexed or stored
      }
    }
  }
}

Field notes:

  1. Suitable for sorting, aggregations, and exact-match (term-level) queries

Numeric types


PUT my_index
{
  "mappings": {
    "properties": {
      "number_of_bytes": {
        "type": "integer"
      },
      "time_in_seconds": {
        "type": "float"
      },
      "price": {
        "type": "scaled_float",
        "scaling_factor": 100
      }
    }
  }
}

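object

Overview: an object can be flat, or it can contain further inner objects

PUT my_index/_doc/1
// the manager field is an object that itself contains an inner name object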
{ 
  "region": "US",
  "manager": { 
    "age":     30,
    "name": { 
      "first": "John",
      "last":  "Smith"
    }
  }
}



// Internally, the document above is indexed as a simple, flat list of key-value pairs:
{
  "region":             "US",
  "manager.age":        30,
  "manager.name.first": "John",
  "manager.name.last":  "Smith"
}
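
The step from the nested document to the flat key-value list can be sketched in Python. The helper name flatten_object is made up for illustration; it only models the resulting dotted field names, not the indexing internals.

```python
def flatten_object(obj, prefix=""):
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_object(value, path))  # descend into inner object
        else:
            flat[path] = value
    return flat


doc = {
    "region": "US",
    "manager": {"age": 30, "name": {"first": "John", "last": "Smith"}},
}
print(flatten_object(doc))
```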


// An explicit mapping for the document above
PUT my_index
{
  "mappings": {
    "properties": { 
      "region": {
        "type": "keyword"
      },
      "manager": { 
        "properties": {
          "age":  { "type": "integer" },
          "name": { 
            "properties": {
              "first": { "type": "text" },
              "last":  { "type": "text" }
            }
          }
        }
      }
    }
  }
}

Range datatypes


PUT range_index
{
  "settings": {
    "number_of_shards": 2
  },
  "mappings": {
    "properties": {
      "expected_attendees": {
        "type": "integer_range"
      },
      "time_frame": {
        "type": "date_range", 
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}

PUT range_index/_doc/1?refresh
// Index a document whose fields hold ranges
{
  "expected_attendees" : { 
    "gte" : 10,
    "lte" : 20
  },
  "time_frame" : { 
    "gte" : "2015-10-31 12:00:00", 
    "lte" : "2015-11-01"
  }
}

// Term query: matches because 12 lies inside the stored range 10 to 20
GET range_index/_search
{
  "query" : {
    "term" : {
      "expected_attendees" : {
        "value": 12
      }
    }
  }
}

// Result
{
  "took": 13,
  "timed_out": false,
  "_shards" : {
    "total": 2,
    "successful": 2,
    "skipped" : 0,
    "failed": 0
  },
  "hits" : {
    "total" : {
        "value": 1,
        "relation": "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "range_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "expected_attendees" : {
            "gte" : 10, "lte" : 20
          },
          "time_frame" : {
            "gte" : "2015-10-31 12:00:00", "lte" : "2015-11-01"
          }
        }
      }
    ]
  }
}

// Range query on the date_range field
GET range_index/_search
{
  "query" : {
    "range" : {
      "time_frame" : { 
        "gte" : "2015-10-31",
        "lte" : "2015-11-01",
        "relation" : "within" 
      }
    }
  }
}

// Result
{
  "took": 13,
  "timed_out": false,
  "_shards" : {
    "total": 2,
    "successful": 2,
    "skipped" : 0,
    "failed": 0
  },
  "hits" : {
    "total" : {
        "value": 1,
        "relation": "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "range_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "expected_attendees" : {
            "gte" : 10, "lte" : 20
          },
          "time_frame" : {
            "gte" : "2015-10-31 12:00:00", "lte" : "2015-11-01"
          }
        }
      }
    ]
  }
}
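
The relation parameter in the range query above decides how the stored range is compared with the query range: within, contains, or intersects (the default). A minimal Python sketch of those three comparisons, using inclusive (low, high) tuples and a made-up helper name, might look like this:

```python
def range_matches(doc_range, query_range, relation="intersects"):
    """Compare two inclusive (low, high) ranges using a relation name."""
    d_lo, d_hi = doc_range
    q_lo, q_hi = query_range
    if relation == "within":      # stored range lies entirely inside the query range
        return q_lo <= d_lo and d_hi <= q_hi
    if relation == "contains":    # stored range entirely covers the query range
        return d_lo <= q_lo and q_hi <= d_hi
    if relation == "intersects":  # the two ranges overlap somewhere
        return d_lo <= q_hi and q_lo <= d_hi
    raise ValueError(f"unknown relation: {relation}")


expected_attendees = (10, 20)
print(range_matches(expected_attendees, (12, 12)))            # like the term query for 12
print(range_matches(expected_attendees, (5, 25), "within"))
print(range_matches(expected_attendees, (12, 15), "within"))
```

The same comparisons apply to dates once both ends are converted to comparable values.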

// Add an ip_range field to the existing mapping
PUT range_index/_mapping
{
  "properties": {
    "ip_whitelist": {
      "type": "ip_range"
    }
  }
}

// Index an IP range written in CIDR notation
PUT range_index/_doc/2
{
  "ip_whitelist" : "192.168.0.0/16"
}

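Rank feature / rank features datatypes

Overview: rank_feature and rank_features fields index numeric features so that they can later be used in rank_feature queries to boost the relevance score of documents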

PUT my_index
{
  "mappings": {
    "properties": {
      "pagerank": {
        "type": "rank_feature" 
      },
      "url_length": {
        "type": "rank_feature",
        "positive_score_impact": false 
      }
    }
  }
}

PUT my_index/_doc/1
{
  "pagerank": 8,
  "url_length": 22
}

GET my_index/_search
{
  "query": {
    "rank_feature": {
      "field": "pagerank"
    }
  }
}

// rank_features: a map of feature names to numeric values (decimal values are allowed)
PUT my_index
{
  "mappings": {
    "properties": {
      "topics": {
        "type": "rank_features" 
      }
    }
  }
}

PUT my_index/_doc/1
{
  "topics": { 
    "politics": 20,
    "economics": 50.8
  }
}

PUT my_index/_doc/2
{
  "topics": {
    "politics": 5.2,
    "sports": 80.1
  }
}

GET my_index/_search
{
  "query": {
    "rank_feature": {
      "field": "topics.politics"
    }
  }
}


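Field notes:

  1. rank_feature fields support only single-valued fields and strictly positive values. Multi-valued fields and negative values are rejected
  2. rank_feature fields do not support querying, sorting, or aggregating. They can be used only in rank_feature queries
  3. rank_feature fields keep only 9 significant bits of precision, which translates to a relative error of about 0.4%

Rank features that correlate negatively with the score should set positive_score_impact to false (the default is true). The rank_feature query uses this setting to modify the scoring formula so that the score decreases with the value of the feature instead of increasing. For example, in web search, URL length is a commonly used feature that correlates negatively with the score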
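
The figure of about 0.4% in note 3 follows from the arithmetic: with 9 significant bits, the relative error of a truncated value stays below 2^-8, which is roughly 0.39%. The Python sketch below demonstrates only that arithmetic. The helper truncate_significant_bits is made up for illustration and is not the encoding Lucene actually uses.

```python
import math


def truncate_significant_bits(value, bits=9):
    """Keep only the top `bits` significant binary digits of a positive number."""
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent, 0.5 <= mantissa < 1
    scale = 2 ** bits
    return math.ldexp(math.floor(mantissa * scale) / scale, exponent)


for value in (8, 22, 50.8, 80.1):
    stored = truncate_significant_bits(value)
    relative_error = (value - stored) / value
    print(value, stored, relative_error)

print(2 ** -8)  # upper bound on the relative error
```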

search_as_you_type

Overview:
A text-like field optimized for search-as-you-type completion

PUT my_index
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "search_as_you_type",
        "max_shingle_size": 3
      }
    }
  }
}

PUT my_index/_doc/1?refresh
{
  "my_field": "quick brown fox jump lazy dog"
}

GET my_index/_search
{
  "query": {
    "multi_match": {
      "query": "brown f",
      "type": "bool_prefix",
      "fields": [
        "my_field",
        "my_field._2gram",
        "my_field._3gram"
      ]
    }
  }
}

{
  "took" : 44,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.8630463,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.8630463,
        "_source" : {
          "my_field" : "quick brown fox jump lazy dog"
        }
      }
    ]
  }
}

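Field notes:

  1. Prefix completion: matches terms starting at the beginning of the input
  2. Infix completion: matches terms at any position within the input
  3. max_shingle_size accepts an integer from 2 to 4 (the default is 3)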
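
The _2gram and _3gram subfields queried above hold shingles, that is, runs of adjacent words. The Python sketch below builds them with a plain whitespace split, which is a simplification of the real analyzer chain, and shows why the partial input "brown f" finds the document.

```python
def shingles(text, size):
    """Return all runs of `size` adjacent whitespace-separated words."""
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


text = "quick brown fox jump lazy dog"
bigrams = shingles(text, 2)
trigrams = shingles(text, 3)
print(bigrams)
print(trigrams)

# The typed prefix "brown f" is the start of the shingle "brown fox".
print(any(shingle.startswith("brown f") for shingle in bigrams))
```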

text

PUT my_index
{
  "mappings": {
    "properties": {
      "full_name": {
        "type":  "text"
      }
    }
  }
}

Field notes:

  1. text fields are analyzed and are meant for full-text search
  2. For sorting and aggregations, use a keyword field instead

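token_count

Overview:
A field of type token_count is really an integer field that accepts string values, analyzes them, and then indexes the number of tokens in the string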

PUT my_index
{
  "mappings": {
    "properties": {
      "name": { 
        "type": "text",
        "fields": {
          "length": { 
            "type":     "token_count",
            "analyzer": "standard"
          }
        }
      }
    }
  }
}

PUT my_index/_doc/1
{ "name": "John Smith" }

PUT my_index/_doc/2
{ "name": "Rachel Alice Williams" }

// This query matches only document 2, whose name contains three tokens
GET my_index/_search
{
  "query": {
    "term": {
      "name.length": 3 
    }
  }
}
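
As a rough illustration of what name.length holds, the Python sketch below counts tokens by treating each run of word characters as one token. That is only an approximation of the standard analyzer, assumed here for simple English names.

```python
import re


def token_count(text):
    """Count tokens, treating each run of word characters as one token."""
    # Rough stand-in for the standard analyzer, good enough for plain names.
    return len(re.findall(r"\w+", text))


print(token_count("John Smith"))
print(token_count("Rachel Alice Williams"))
```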
