Adding a new field to all documents in Elasticsearch

  • There are roughly three approaches
    • scan and scroll through all documents, then update them
    • the _update_by_query API
    • re-index into a new index and add the fields along the way, using the /_reindex API
  • First, enable script execution by editing elasticsearch.yml
script.engine.groovy.inline.aggs: on
script.engine.groovy.inline.search: on
script.engine.groovy.inline.update: on

or
script.inline: true

  • scan and scroll all documents
    • use /_search?scroll to fetch the docs
    • perform your operation
    • send /_bulk update requests, or update each document by its ID:
POST bt/bt/<doc_id>/_update
{
    "script" : "ctx._source.new_field = \"value_of_new_field\""
}
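The bulk step above can be sketched in Python. This is a minimal sketch, not a definitive implementation: it only builds the newline-delimited body for a /_bulk request and leaves the HTTP call out. The function name `build_bulk_update_body` and the sample hits are made up for illustration, and the `_type` key assumes an older Elasticsearch version that still uses mapping types, as the bt/bt paths in this post do.

```python
import json


def build_bulk_update_body(hits, field, value):
    """Return an NDJSON _bulk body with one partial-document update per hit."""
    lines = []
    for hit in hits:
        # Action line: identifies the document to update.
        action = {
            "update": {
                "_index": hit["_index"],
                "_type": hit["_type"],  # assumption: a version with mapping types
                "_id": hit["_id"],
            }
        }
        # Payload line: partial document merged into the existing _source.
        payload = {"doc": {field: value}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(payload))
    # The _bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical hits, shaped like the "hits.hits" array of a scroll response.
sample_hits = [
    {"_index": "bt", "_type": "bt", "_id": "1"},
    {"_index": "bt", "_type": "bt", "_id": "2"},
]
body = build_bulk_update_body(sample_hits, "new_field", "value_of_new_field")
print(body)
```

Each scroll page would be turned into one such body and sent with POST /_bulk. Using a partial "doc" instead of a script means this variant does not depend on the scripting settings above.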
  • _update_by_query API
POST bt/bt/_update_by_query
{
  "script": {
    "inline": "if (ctx._source.bt0 == null || ctx._source.bt1==null) { ctx._source.btNew1=null } else { ctx._source.btNew1 = ctx._source.bt0 + ctx._source.bt1 }"
  }
}
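Before running a script like this against every document, it can help to check the null-handling logic on a few sample sources. Below is a rough Python mirror of the inline script, offered as a sketch only: the function name `add_bt_new1` is made up, and it assumes a missing field behaves like null in the script.

```python
def add_bt_new1(source):
    """Mirror of the inline script: sum bt0 and bt1, or store None if either is missing."""
    updated = dict(source)  # copy, so the input document is left untouched
    bt0 = source.get("bt0")
    bt1 = source.get("bt1")
    if bt0 is None or bt1 is None:
        updated["btNew1"] = None
    else:
        updated["btNew1"] = bt0 + bt1
    return updated


print(add_bt_new1({"bt0": 1, "bt1": 2}))  # {'bt0': 1, 'bt1': 2, 'btNew1': 3}
print(add_bt_new1({"bt0": 1}))            # {'bt0': 1, 'btNew1': None}
```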
  • reindex
    https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-reindex.html
POST _reindex
{
  "source": {
    "index": "bt",
    "type": "bt"
  },

  "dest": {
    "index": "new_bt1"
  },
  "script": {
    "inline": "if (ctx._source.bt0 == null || ctx._source.bt1==null) { ctx._source.btPlus=null } else { ctx._source.btPlus = ctx._source.bt0 + ctx._source.bt1 }"
  }
}
  • Difference between _update_by_query and _reindex
    • Just like _update_by_query, _reindex gets a snapshot of the source index but its target must be a different index so version conflicts are unlikely.
    • Unlike _update_by_query, the script is allowed to modify the document’s metadata.

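In practice, _update_by_query and _reindex are implemented in much the same way. I have not benchmarked them, but their performance should be similar. Both run as tasks, so Elasticsearch's task API can be used to monitor their progress.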
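As a sketch of how the task API output might be used: a request such as GET _tasks?detailed=true&actions=*byquery (or actions=*reindex) lists the running tasks, and each task's status object carries counters such as total, updated, created and deleted. Progress can be estimated by adding up the last three and comparing the sum with total. The helper name `estimate_progress` and the sample status below are made up for illustration.

```python
def estimate_progress(status):
    """Fraction of documents processed so far, based on a task's status counters."""
    total = status.get("total", 0)
    if total == 0:
        return 0.0  # nothing counted yet, or an empty source
    done = (
        status.get("updated", 0)
        + status.get("created", 0)
        + status.get("deleted", 0)
    )
    return done / total


# Hypothetical status object, shaped like the "status" field of a task entry.
sample_status = {"total": 6154, "updated": 3500, "created": 0, "deleted": 0, "batches": 4}
print("progress: {:.1%}".format(estimate_progress(sample_status)))
```

The task is finished once the sum of the three counters equals total.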
