ElasticSearch 6.x Study Notes: 13. Mapping Meta-Fields

    • 13.1 Metadata Overview
    • 13.2 _index
    • 13.3 _type
    • 13.4 _id
    • 13.5 _uid
    • 13.6 _source
    • 13.7 _size
    • 13.8 _all
    • 13.9 _field_names
    • 13.10 _routing

Original article: https://blog.csdn.net/chengyuqiang/article/details/79054153

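13.1 Metadata Overview

Official documentation for mapping meta-fields:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-fields.html#_document_source_meta_fields

Mapping meta-fields are the fields in a mapping that describe the document itself. They fall roughly into five groups: document identity metadata, document source metadata, indexing metadata, routing metadata, and custom metadata.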

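Category                     Meta-field      Description
Document identity metadata   _index          The index the document belongs to
                             _id             The document's ID
                             _type           The mapping type the document belongs to
                             _uid            Composed of the _type and _id fields
Document source metadata     _source         The document's original JSON
                             _size           Size of the entire _source field, in bytes
Indexing metadata            _all            Automatically combines the values of all fields
                             _field_names    Indexes the name of every field
Routing metadata             _parent         Defines parent-child relationships between documents; deprecated
                             _routing        Stores a document on a specific shard according to its routing value
Custom metadata              _meta           Holds custom metadata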

The important meta-fields are described in more detail below.

13.2 _index

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-index-field.html
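When querying several indices at once, you sometimes need to restrict the query to specific index names. The _index field makes this convenient: index names can be used in term and terms queries, aggregations, scripts, and sorting.

_index is a virtual field that is not actually added to the Lucene index. It supports term and terms queries (and therefore match, query_string, and simple_query_string), but not prefix, wildcard, regexp, or fuzzy queries.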
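
As a minimal sketch of the uses listed above, the following Python snippet builds a search body that filters, aggregates, and sorts on _index. The index names index_1 and index_2 and the helper name index_search_body are hypothetical; the resulting JSON is meant to be sent as the body of a request such as GET index_1,index_2/_search.

```python
import json


def index_search_body(index_names):
    """Build a search body that filters, aggregates and sorts on _index."""
    return {
        # Keep only documents that live in one of the named indices.
        "query": {"terms": {"_index": list(index_names)}},
        # Count matching documents per index.
        "aggs": {"indices": {"terms": {"field": "_index", "size": 10}}},
        # Order hits by index name.
        "sort": [{"_index": {"order": "asc"}}],
    }


# Hypothetical index names, used only for illustration.
body = index_search_body(["index_1", "index_2"])
print(json.dumps(body, indent=2))
```

Only exact index names are used in the query, in line with the restriction that prefix, wildcard, regexp, and fuzzy queries are not supported on this field.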

13.3 _type

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-type-field.html
Deprecated in 6.0.0.
The name of the document's mapping type. It is indexed automatically and can be used in queries, aggregations, sorting, and scripts.

13.4 _id

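The document's ID, supplied at index time. In 6.x it is indexed, so it can be used in queries (as in the terms query below) and in scripts; aggregating or sorting on it is discouraged.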

PUT my_index

PUT my_index/my_type/1
{
  "text": "Document with ID 1"
}

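PUT my_index/my_type/2?refresh=true
{
  "text": "Document with ID 2"
}

GET my_index/_search
{
  "query": {
    "terms": {
      "_id": [ "1", "2" ]
    }
  }
}

Search result as recorded in the original run, where the second request was sent with & instead of ?, so that document was stored under the ID 2&refresh=true and only document 1 matched (with the corrected request above, both documents match):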

{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "Document with ID 1"
        }
      }
    ]
  }
}

This was not the case before 6.0: those versions supported multiple types, so _type and _id were merged into a composite primary key called _uid.

13.5 _uid

Deprecated in 6.0. Now that types have been removed, a document is uniquely identified by its _id.
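
To make the composite key concrete, here is a small Python sketch of how a pre-6.0 style _uid can be assembled from a type and an ID and taken apart again. It assumes the type#id layout used by those versions; the helper names make_uid and split_uid are hypothetical and not part of any ElasticSearch client.

```python
def make_uid(doc_type, doc_id):
    """Join a mapping type and a document ID into a pre-6.0 style _uid."""
    return f"{doc_type}#{doc_id}"


def split_uid(uid):
    """Split a _uid on its first '#'.

    Assumption: the type name itself contains no '#', so everything
    after the first '#' is the document ID.
    """
    doc_type, _, doc_id = uid.partition("#")
    return doc_type, doc_id


uid = make_uid("my_type", "1")
print(uid)
print(split_uid(uid))
```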

13.6 _source

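The _source field holds the original JSON document body that was passed at index time. The _source field itself is not indexed (and so is not searchable), but it is stored so that it can be returned when the document is fetched.
The _source field is enabled by default; that is, the original document is stored by default.

If a field is very large (a whole novel, say), or the application only needs to search on the field, get back the document ID, and read the original text from somewhere else, there is no need to keep _source. Disabling the _source meta-field makes ElasticSearch store only the inverted index, without the original field values.

Example: disabling _source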

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "_source": {
        "enabled": false
      }
    }
  }
}
PUT my_index/my_type/1
{
  "text": "This is a document"
}

GET my_index/my_type/1

The response contains no _source data:

{
  "_index": "my_index",
  "_type": "my_type",
  "_id": "1",
  "_version": 1,
  "found": true
}

Example: including or excluding selected fields

DELETE my_index

PUT my_index
{
  "mappings": {
    "blog":{
      "_source":{
        "includes":["title","url"],
        "excludes":["content"]
      },
      "properties": {
        "title":{
          "type":"text"
        },
        "content":{
          "type":"text"
        },
        "url":{
          "type":"text"
        }
      }
    }  
  }
}

PUT my_index/blog/1
{
  "title":"yum源",
  "content":"CentOS更换国内yum源",
  "url":"http://url.cn/53788351"
}

PUT my_index/blog/2
{
  "title":"Ambari",
  "content":"CentOS7.x下的Ambari2.4源码编译",
  "url":"http://url.cn/53844169"
}

GET my_index/blog/1

The response contains only the title and url fields:

{
  "_index": "my_index",
  "_type": "blog",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "title": "yum源",
    "url": "http://url.cn/53788351"
  }
}
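
The effect of includes and excludes can be approximated in a few lines of Python. This is only a sketch under simplifying assumptions: it looks at top-level field names only, and it uses fnmatch as a stand-in for ElasticSearch's own wildcard matching. The function name filter_source is hypothetical.

```python
from fnmatch import fnmatchcase


def filter_source(doc, includes=(), excludes=()):
    """Return the part of doc that would be kept in _source.

    A field is kept when it matches an include pattern (or no include
    patterns are given) and matches no exclude pattern.
    """
    kept = {}
    for field, value in doc.items():
        included = not includes or any(fnmatchcase(field, p) for p in includes)
        excluded = any(fnmatchcase(field, p) for p in excludes)
        if included and not excluded:
            kept[field] = value
    return kept


doc = {
    "title": "yum源",
    "content": "CentOS更换国内yum源",
    "url": "http://url.cn/53788351",
}
stored = filter_source(doc, includes=["title", "url"], excludes=["content"])
print(stored)
```

The content field is still indexed and searchable in the real index; only its original value is left out of the stored _source.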

13.7 _size

The size of the entire _source field, in bytes.
This requires a plugin, installed with bin/elasticsearch-plugin install mapper-size:

[es@node1 ~]$ cd /opt/elasticsearch-6.1.1/
[es@node1 elasticsearch-6.1.1]$ bin/elasticsearch-plugin install mapper-size
-> Downloading mapper-size from elastic
[=================================================] 100%   
-> Installed mapper-size
[es@node1 elasticsearch-6.1.1]$

ElasticSearch must then be restarted for the mapper-size plugin to take effect.

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "_size": {
        "enabled": true
      }
    }
  }
}
PUT my_index/my_type/1
{
  "text": "This is a document"
}

PUT my_index/my_type/2
{
  "text": "This is another document"
}

Documents can then be filtered on the _size meta-field in a query:

GET my_index/_search
{
  "query": {
    "range": {
      "_size": { 
        "gt": 10
      }
    }
  }
}

Search result:

{
  "took": 148,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_score": 1,
        "_source": {
          "text": "This is another document"
        }
      },
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "This is a document"
        }
      }
    ]
  }
}
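
Both documents match because each source is longer than 10 bytes. The Python sketch below mimics that check by measuring the UTF-8 byte length of each request body. It assumes the bodies were sent as compact JSON; the exact value recorded by the plugin depends on the bytes actually sent, including whitespace. The helper name source_size is hypothetical.

```python
def source_size(raw_source):
    """Byte length of a raw JSON request body encoded as UTF-8."""
    return len(raw_source.encode("utf-8"))


# Compact forms of the two example documents (an assumption about how
# they were serialized; pretty-printed bodies would be larger).
docs = {
    "1": '{"text":"This is a document"}',
    "2": '{"text":"This is another document"}',
}

# Equivalent of the range query {"_size": {"gt": 10}}.
matching = [doc_id for doc_id, raw in docs.items() if source_size(raw) > 10]
print(matching)
```

Because the size is counted in bytes rather than characters, non-ASCII text such as Chinese takes several bytes per character.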

Note: the mapper-size plugin can be removed with bin/elasticsearch-plugin remove mapper-size.

13.8 _all

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-all-field.html#enabling-all-field

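_all can no longer be enabled for indices created in 6.0 and later; use a custom field with the copy_to mapping parameter instead (see section 14.6 copy-to).

_all is a catch-all field that concatenates the values of all other fields, separated by spaces. It is analyzed and indexed, but not stored. It is useful when you want the documents that contain some keyword without naming a specific field to search.

According to the official documentation, the _all meta-field is disabled by default and can be turned on with "_all": {"enabled": true}. A test: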

PUT myindex
{
  "mappings": {
    "mytype": {
      "_all": {"enabled": true},
      "properties": {
        "title": { 
          "type": "text"
        },
        "content": { 
          "type": "text"
        }
      }
    }
  }
}

The index creation fails with the error: "Enabling [_all] is disabled in 6.0. As a replacement, you can use [copy_to] on mapping fields to create your own catch all field."
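
The replacement suggested by the error message can be sketched as a mapping in which every text field copies its value into one custom catch-all field. The Python snippet below builds such a request body for the 6.x typed mapping layout used in these notes. The field name full_text and the helper name catch_all_mapping are hypothetical choices; copy_to itself is the mapping parameter named in the error.

```python
import json


def catch_all_mapping(type_name, text_fields, catch_all="full_text"):
    """Build an index-creation body whose text fields copy into one field."""
    properties = {
        name: {"type": "text", "copy_to": catch_all} for name in text_fields
    }
    # The catch-all field itself is an ordinary text field.
    properties[catch_all] = {"type": "text"}
    return {"mappings": {type_name: {"properties": properties}}}


body = catch_all_mapping("mytype", ["title", "content"])
print(json.dumps(body, indent=2))
```

Searching the full_text field then plays the role that searching _all played in earlier versions.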

13.9 _field_names

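The _field_names field indexes the name of every field in a document that holds any value other than null. The exists query uses this field to find documents that do or do not have a non-null value for a particular field.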

PUT my_index

PUT my_index/my_type/1
{
  "title": "This is a document"
}

PUT my_index/my_type/2
{
  "title": "This is another document",
  "body": "This document has a body"
}

GET my_index/_search
{
  "query": {
    "terms": {
      "_field_names": ["body"]
    }
  }
}

The same lookup can be written as an exists query:

GET my_index/_search
{
  "query": {
      "exists" : { "field" : "body" }
  }
}

Because the first document has no body field, only the second document is returned.
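
The idea behind _field_names can be illustrated with a short Python sketch that collects the names of the non-null fields of each document and then answers an exists-style question from that list. It is a simplification that only looks at top-level fields and treats None as null; the helper names field_names and exists are hypothetical.

```python
def field_names(doc):
    """Names of the fields in doc that hold a non-null value."""
    return sorted(name for name, value in doc.items() if value is not None)


def exists(doc, field):
    """Exists-style check answered from the collected field names."""
    return field in field_names(doc)


docs = {
    "1": {"title": "This is a document"},
    "2": {"title": "This is another document", "body": "This document has a body"},
}

hits = [doc_id for doc_id, doc in docs.items() if exists(doc, "body")]
print(hits)
```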

13.10 _routing

A document is routed to a particular shard of an index with the following formula:
shard_num = hash(_routing) % num_primary_shards

The default value of _routing is the document's _id.
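
The formula can be tried out with a small Python sketch. Note the assumption: zlib.crc32 is used here only as a deterministic stand-in for ElasticSearch's internal hash function, so the shard numbers it produces will not match a real cluster. The helper names shard_for and shard_for_doc are hypothetical.

```python
import zlib


def shard_for(routing, num_primary_shards):
    """shard_num = hash(_routing) % num_primary_shards, with a stand-in hash."""
    digest = zlib.crc32(routing.encode("utf-8"))
    return digest % num_primary_shards


def shard_for_doc(doc_id, num_primary_shards, routing=None):
    """Fall back to the document ID when no custom routing value is given."""
    key = routing if routing is not None else doc_id
    return shard_for(key, num_primary_shards)


# Documents with different IDs but the same routing value share a shard.
print(shard_for_doc("1", 5, routing="user1"))
print(shard_for_doc("2", 5, routing="user1"))
```

Because the result depends on num_primary_shards, the same routing value can map to a different shard if the number of primary shards differs.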

Custom routing patterns can be implemented by specifying a custom routing value per document.

PUT my_index/my_type/1?routing=user1&refresh=true 
{
  "title": "This is a document"
}

GET my_index/my_type/1?routing=user1

Response:

{
  "_index": "my_index",
  "_type": "my_type",
  "_id": "1",
  "_version": 1,
  "_routing": "user1",
  "found": true,
  "_source": {
    "title": "This is a document"
  }
}

The value of the _routing field can be used in queries:

GET my_index/_search
{
  "query": {
    "terms": {
      "_routing": [ "user1" ] 
    }
  }
}

Search result:

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 1,
        "_routing": "user1",
        "_source": {
          "title": "This is a document"
        }
      }
    ]
  }
}
