google译自:https://www.elastic.co/guide/en/elasticsearch/reference/7.6/removal-of-types.html#removal-of-types
include_type_name参数设为true
设置为true后即可用原有方式put。
对于我们的用户来说,这是一个很大的变化,因此我们试图使其尽可能轻松。更改将按以下方式推出:
Elasticsearch 5.6.0
index.mapping.single_type: true
在索引上 设置将启用按索引的单一类型行为,该行为将在6.0中强制执行。join
字段替换。Elasticsearch 6.x
_doc
,因此索引API具有与7.0中相同的路径: PUT {index}/_doc/{id}
和POST {index}/_doc
_type
名称不能再与组合_id
以形成_uid
字段。该_uid
字段已成为该_id
字段的别名。join
字段。_default_
映射类型已弃用。include_type_name
),该参数指示请求和响应是否应包含类型名称。它默认为true
,并且应设置为明确的值以准备升级到7.0。不设置include_type_name
将导致弃用警告。没有显式类型的索引将使用虚拟类型名称_doc
。Elasticsearch 7.x
type
。新索引API适用PUT {index}/_doc/{id}
于显式ID和POST {index}/_doc
自动生成的ID。请注意,在7.0中,它_doc
是路径的永久部分,并且表示端点名称而不是文档类型。include_type_name
在创建索引,索引模板,地图API参数默认为false
。完全设置该参数将导致弃用警告。_default_
映射类型被去除。Elasticsearch 8.x
include_type_name
参数已删除。当然,一个集群中可以存在多少个主要分片是有限制的,因此您可能不想只花几千个文档就浪费整个分片。在这种情况下,您可以实现自己的自定义type
字段,该字段的工作方式与旧的类似_type
。
You can achieve the same thing by adding a custom type
field as follows:
假如索引twitter下,原有user和tweet两个type。可以改成twitter里加一个属性flag,flag有两个值,用于搜索时区分user和tweet。
PUT twitter { "mappings": { "_doc": { "properties": { "type": { "type": "keyword" }, //type 即上文提到的flag属性。 "name": { "type": "text" }, "user_name": { "type": "keyword" }, "email": { "type": "keyword" }, "content": { "type": "text" }, "tweeted_at": { "type": "date" } } } } }
PUT twitter/_doc/user-kimchy
{
"type": "user", // user类型
"name": "Shay Banon",
"user_name": "kimchy",
"email": "[email protected]"
}
PUT twitter/_doc/tweet-1
{
"type": "tweet", // tweet类型
"user_name": "kimchy",
"tweeted_at": "2017-10-24T09:00:00Z",
"content": "Types are going away"
}
GET twitter/_search
{
"query": {
"bool": {
"must": {
"match": {
"user_name": "kimchy"
}
},
"filter": {
"match": {
"type": "tweet" // 过滤
}
}
}
}
}
Removal of mapping types
Indices created in Elasticsearch 7.0.0 or later no longer accept a _default_
mapping. Indices created in 6.x will continue to function as before in Elasticsearch 6.x. Types are deprecated in APIs in 7.0, with breaking changes to the index creation, put mapping, get mapping, put template, get template and get field mappings APIs.
Since the first release of Elasticsearch, each document has been stored in a single index and assigned a single mapping type. A mapping type was used to represent the type of document or entity being indexed, for instance a twitter
index might have a user
type and a tweet
type.
Each mapping type could have its own fields, so the user
type might have a full_name
field, a user_name
field, and an email
field, while the tweet
type could have a content
field, a tweeted_at
field and, like the user
type, a user_name
field.
Each document had a _type
meta-field containing the type name, and searches could be limited to one or more types by specifying the type name(s) in the URL:
GET twitter/user,tweet/_search
{
"query": {
"match": {
"user_name": "kimchy"
}
}
}
The _type
field was combined with the document’s _id
to generate a _uid
field, so documents of different types with the same _id
could exist in a single index.
Mapping types were also used to establish a parent-child relationship between documents, so documents of type question
could be parents to documents of type answer
.
Initially, we spoke about an “index” being similar to a “database” in an SQL database, and a “type” being equivalent to a “table”.
This was a bad analogy that led to incorrect assumptions. In an SQL database, tables are independent of each other. The columns in one table have no bearing on columns with the same name in another table. This is not the case for fields in a mapping type.
In an Elasticsearch index, fields that have the same name in different mapping types are backed by the same Lucene field internally. In other words, using the example above, the user_name
field in the user
type is stored in exactly the same field as the user_name
field in the tweet
type, and both user_name
fields must have the same mapping (definition) in both types.
This can lead to frustration when, for example, you want deleted
to be a date
field in one type and a boolean
field in another type in the same index.
On top of that, storing different entities that have few or no fields in common in the same index leads to sparse data and interferes with Lucene’s ability to compress documents efficiently.
For these reasons, we have decided to remove the concept of mapping types from Elasticsearch.
Index per document typeedit
The first alternative is to have an index per document type. Instead of storing tweets and users in a single twitter
index, you could store tweets in the tweets
index and users in the user
index. Indices are completely independent of each other and so there will be no conflict of field types between indices.
This approach has two benefits:
Each index can be sized appropriately for the number of documents it will contain: you can use a smaller number of primary shards for users
and a larger number of primary shards for tweets
.
Custom type fieldedit
Of course, there is a limit to how many primary shards can exist in a cluster so you may not want to waste an entire shard for a collection of only a few thousand documents. In this case, you can implement your own custom type
field which will work in a similar way to the old _type
.
Let’s take the user
/tweet
example above. Originally, the workflow would have looked something like this:
PUT twitter
{
"mappings": {
"user": {
"properties": {
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" }
}
},
"tweet": {
"properties": {
"content": { "type": "text" },
"user_name": { "type": "keyword" },
"tweeted_at": { "type": "date" }
}
}
}
}
PUT twitter/user/kimchy
{
"name": "Shay Banon",
"user_name": "kimchy",
"email": "[email protected]"
}
PUT twitter/tweet/1
{
"user_name": "kimchy",
"tweeted_at": "2017-10-24T09:00:00Z",
"content": "Types are going away"
}
GET twitter/tweet/_search
{
"query": {
"match": {
"user_name": "kimchy"
}
}
}
You can achieve the same thing by adding a custom type
field as follows:
PUT twitter
{
"mappings": {
"_doc": {
"properties": {
"type": { "type": "keyword" },
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" },
"content": { "type": "text" },
"tweeted_at": { "type": "date" }
}
}
}
}
PUT twitter/_doc/user-kimchy
{
"type": "user",
"name": "Shay Banon",
"user_name": "kimchy",
"email": "[email protected]"
}
PUT twitter/_doc/tweet-1
{
"type": "tweet",
"user_name": "kimchy",
"tweeted_at": "2017-10-24T09:00:00Z",
"content": "Types are going away"
}
GET twitter/_search
{
"query": {
"bool": {
"must": {
"match": {
"user_name": "kimchy"
}
},
"filter": {
"match": {
"type": "tweet"
}
}
}
}
}
|
The explicit |
Parent/Child without mapping typesedit
Previously, a parent-child relationship was represented by making one mapping type the parent, and one or more other mapping types the children. Without types, we can no longer use this syntax. The parent-child feature will continue to function as before, except that the way of expressing the relationship between documents has been changed to use the new join
field.
This is a big change for our users, so we have tried to make it as painless as possible. The change will roll out as follows:
Elasticsearch 5.6.0
index.mapping.single_type: true
on an index will enable the single-type-per-index behaviour which will be enforced in 6.0.join
field replacement for parent-child is available on indices created in 5.6.Elasticsearch 6.x
_doc
, so that index APIs have the same path as they will have in 7.0: PUT {index}/_doc/{id}
and POST {index}/_doc
_type
name can no longer be combined with the _id
to form the _uid
field. The _uid
field has become an alias for the _id
field.join
field instead._default_
mapping type is deprecated.include_type_name
) which indicates whether requests and responses should include a type name. It defaults to true
, and should be set to an explicit value to prepare to upgrade to 7.0. Not setting include_type_name
will result in a deprecation warning. Indices which don’t have an explicit type will use the dummy type name _doc
.Elasticsearch 7.x
type
. The new index APIs are PUT {index}/_doc/{id}
in case of explicit ids and POST {index}/_doc
for auto-generated ids. Note that in 7.0, _doc
is a permanent part of the path, and represents the endpoint name rather than the document type.include_type_name
parameter in the index creation, index template, and mapping APIs will default to false
. Setting the parameter at all will result in a deprecation warning._default_
mapping type is removed.Elasticsearch 8.x
include_type_name
parameter is removed.The Reindex API can be used to convert multi-type indices to single-type indices. The following examples can be used in Elasticsearch 5.6 or Elasticsearch 6.x. In 6.x, there is no need to specify index.mapping.single_type
as that is the default.
Index per document typeedit
This first example splits our twitter
index into a tweets
index and a users
index:
PUT users
{
"settings": {
"index.mapping.single_type": true
},
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "text"
},
"user_name": {
"type": "keyword"
},
"email": {
"type": "keyword"
}
}
}
}
}
PUT tweets
{
"settings": {
"index.mapping.single_type": true
},
"mappings": {
"_doc": {
"properties": {
"content": {
"type": "text"
},
"user_name": {
"type": "keyword"
},
"tweeted_at": {
"type": "date"
}
}
}
}
}
POST _reindex
{
"source": {
"index": "twitter",
"type": "user"
},
"dest": {
"index": "users",
"type": "_doc"
}
}
POST _reindex
{
"source": {
"index": "twitter",
"type": "tweet"
},
"dest": {
"index": "tweets",
"type": "_doc"
}
}
Custom type fieldedit
This next example adds a custom type
field and sets it to the value of the original _type
. It also adds the type to the _id
in case there are any documents of different types which have conflicting IDs:
PUT new_twitter
{
"mappings": {
"_doc": {
"properties": {
"type": {
"type": "keyword"
},
"name": {
"type": "text"
},
"user_name": {
"type": "keyword"
},
"email": {
"type": "keyword"
},
"content": {
"type": "text"
},
"tweeted_at": {
"type": "date"
}
}
}
}
}
POST _reindex
{
"source": {
"index": "twitter"
},
"dest": {
"index": "new_twitter"
},
"script": {
"source": """
ctx._source.type = ctx._type;
ctx._id = ctx._type + '-' + ctx._id;
ctx._type = '_doc';
"""
}
}
In Elasticsearch 7.0, each API will support typeless requests, and specifying a type will produce a deprecation warning.
Typeless APIs work even if the target index contains a custom type. For example, if an index has the custom type name my_type
, we can add documents to it using typeless index
calls, and load documents with typeless get
calls.
Index APIsedit
Index creation, index template, and mapping APIs support a new include_type_name
URL parameter that specifies whether mapping definitions in requests and responses should contain the type name. The parameter defaults to true
in version 6.8 to match the pre-7.0 behavior of using type names in mappings. It defaults to false
in version 7.0 and will be removed in version 8.0.
It should be set explicitly in 6.8 to prepare to upgrade to 7.0. To avoid deprecation warnings in 6.8, the parameter can be set to either true
or false
. In 7.0, setting include_type_name
at all will result in a deprecation warning.
See some examples of interactions with Elasticsearch with this option set to false
:
PUT index?include_type_name=false
{
"mappings": {
"properties": {
"foo": {
"type": "keyword"
}
}
}
}
Copy as cURLView in Console
|
Mappings are included directly under the |
PUT index/_mappings?include_type_name=false
{
"properties": {
"bar": {
"type": "text"
}
}
}
Copy as cURLView in Console
|
Mappings are included directly under the |
GET index/_mappings?include_type_name=false
Copy as cURLView in Console
The above call returns
{
"index": {
"mappings": {
"properties": {
"foo": {
"type": "keyword"
},
"bar": {
"type": "text"
}
}
}
}
}
|
Mappings are included directly under the |
Document APIsedit
In 7.0, index APIs must be called with the {index}/_doc
path for automatic generation of the _id
and {index}/_doc/{id}
with explicit ids.
PUT index/_doc/1
{
"foo": "baz"
}
Copy as cURLView in Console
{
"_index": "index",
"_id": "1",
"_type": "_doc",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
Similarly, the get
and delete
APIs use the path {index}/_doc/{id}
:
GET index/_doc/1
Copy as cURLView in Console
In 7.0, _doc
represents the endpoint name instead of the document type. The _doc
component is a permanent part of the path for the document index
, get
, and delete
APIs going forward, and will not be removed in 8.0.
For API paths that contain both a type and endpoint name like _update
, in 7.0 the endpoint will immediately follow the index name:
POST index/_update/1
{
"doc" : {
"foo" : "qux"
}
}
GET /index/_source/1
Copy as cURLView in Console
Types should also no longer appear in the body of requests. The following example of bulk indexing omits the type both in the URL, and in the individual bulk commands:
POST _bulk
{ "index" : { "_index" : "index", "_id" : "3" } }
{ "foo" : "baz" }
{ "index" : { "_index" : "index", "_id" : "4" } }
{ "foo" : "qux" }
Copy as cURLView in Console
Search APIsedit
When calling a search API such _search
, _msearch
, or _explain
, types should not be included in the URL. Additionally, the _type
field should not be used in queries, aggregations, or scripts.
Types in responsesedit
The document and search APIs will continue to return a _type
key in responses, to avoid breaks to response parsing. However, the key is considered deprecated and should no longer be referenced. Types will be completely removed from responses in 8.0.
Note that when a deprecated typed API is used, the index’s mapping type will be returned as normal, but that typeless APIs will return the dummy type _doc
in the response. For example, the following typeless get
call will always return _doc
as the type, even if the mapping has a custom type name like my_type
:
PUT index/my_type/1
{
"foo": "baz"
}
GET index/_doc/1
Copy as cURLView in Console
{
"_index" : "index",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"found": true,
"_source" : {
"foo" : "baz"
}
}
Index templatesedit
It is recommended to make index templates typeless by re-adding them with include_type_name
set to false
. Under the hood, typeless templates will use the dummy type _doc
when creating indices.
In case typeless templates are used with typed index creation calls or typed templates are used with typeless index creation calls, the template will still be applied but the index creation call decides whether there should be a type or not. For instance in the below example, index-1-01
will have a type in spite of the fact that it matches a template that is typeless, and index-2-01
will be typeless in spite of the fact that it matches a template that defines a type. Both index-1-01
and index-2-01
will inherit the foo
field from the template that they match.
PUT _template/template1
{
"index_patterns":[ "index-1-*" ],
"mappings": {
"properties": {
"foo": {
"type": "keyword"
}
}
}
}
PUT _template/template2?include_type_name=true
{
"index_patterns":[ "index-2-*" ],
"mappings": {
"type": {
"properties": {
"foo": {
"type": "keyword"
}
}
}
}
}
PUT index-1-01?include_type_name=true
{
"mappings": {
"type": {
"properties": {
"bar": {
"type": "long"
}
}
}
}
}
PUT index-2-01
{
"mappings": {
"properties": {
"bar": {
"type": "long"
}
}
}
}
Copy as cURLView in Console
In case of implicit index creation, because of documents that get indexed in an index that doesn’t exist yet, the template is always honored. This is usually not a problem due to the fact that typeless index calls work on typed indices.
Mixed-version clustersedit
In a cluster composed of both 6.8 and 7.0 nodes, the parameter include_type_name
should be specified in index APIs like index creation. This is because the parameter has a different default between 6.8 and 7.0, so the same mapping definition will not be valid for both node versions.
Typeless document APIs such as bulk
and update
are only available as of 7.0, and will not work with 6.8 nodes. This also holds true for the typeless versions of queries that perform document lookups, such as terms
.