Elasticsearch ingest node

An ingest node runs a pipeline of processors that pre-process documents before they are indexed. In my view it can replace some of what Logstash does.


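Whether a node acts as an ingest node is set in elasticsearch.yml. Ingest is enabled on every node by default; the following setting disables it on a node: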

node.ingest: false
To run a document through a pipeline, pass the pipeline parameter on an index request or a bulk request:
PUT my-index/my-type/my-id?pipeline=my_pipeline_id
{
  "foo": "bar"
}
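
The request above can be assembled from a client roughly as follows. This is only a sketch in Python that formats the path and body; it does not contact a cluster. The helper name index_request is hypothetical, and the index, type, id, and pipeline values are the placeholders from the request above.

```python
import json
from urllib.parse import quote, urlencode


def index_request(index, doc_type, doc_id, pipeline, source):
    """Return (method, path, body) for an index request that names a pipeline.

    Hypothetical helper for illustration; it only formats strings.
    """
    # Percent-encode each path segment so unusual ids stay valid in a URL.
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    # The pipeline is chosen with the "pipeline" query parameter.
    query = urlencode({"pipeline": pipeline})
    return "PUT", f"{path}?{query}", json.dumps(source)


method, path, body = index_request(
    "my-index", "my-type", "my-id", "my_pipeline_id", {"foo": "bar"}
)
print(method, path)
print(body)
```

The method, path, and body can then be handed to whatever HTTP client is in use.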



{
  "description" : "...",
  "processors" : [ ... ]
}                 
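In the pipeline definition above, description says what the pipeline does and processors is the list of processors, which run in the order listed.
Put pipeline API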


Creates a new pipeline or updates an existing one.


PUT _ingest/pipeline/my-pipeline-id
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "set" : {
        "field": "foo",
        "value": "bar"
      }
    }
  ]
}       
Changes to a pipeline take effect immediately.
Get pipeline API


GET _ingest/pipeline/my-pipeline-id


Response:
{
  "my-pipeline-id" : {
    "description" : "describe pipeline",
    "processors" : [
      {
        "set" : {
          "field" : "foo",
          "value" : "bar"
        }
      }
    ]
  }
}


Delete pipeline API


DELETE _ingest/pipeline/my-pipeline-id
Simulate pipeline API
Simulate a pipeline that is defined in the request body:
POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    // pipeline definition here
  },
  "docs" : [
    { /** first document **/ },
    { /** second document **/ },
    // ...
  ]
}
Simulate an existing pipeline:


POST _ingest/pipeline/my-pipeline-id/_simulate
{
  "docs" : [
    { /** first document **/ },
    { /** second document **/ },
    // ...
  ]
}


Example


POST _ingest/pipeline/_simulate
{
  "pipeline" :
  {
    "description": "_description",
    "processors": [
      {
        "set" : {
          "field" : "field2",
          "value" : "_value"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_type": "type",
      "_id": "id",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_index": "index",
      "_type": "type",
      "_id": "id",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}
Response:
{
   "docs": [
      {
         "doc": {
            "_id": "id",
            "_ttl": null,
            "_parent": null,
            "_index": "index",
            "_routing": null,
            "_type": "type",
            "_timestamp": null,
            "_source": {
               "field2": "_value",
               "foo": "bar"
            },
            "_ingest": {
               "timestamp": "2016-01-04T23:53:27.186+0000"
            }
         }
      },
      {
         "doc": {
            "_id": "id",
            "_ttl": null,
            "_parent": null,
            "_index": "index",
            "_routing": null,
            "_type": "type",
            "_timestamp": null,
            "_source": {
               "field2": "_value",
               "foo": "rab"
            },
            "_ingest": {
               "timestamp": "2016-01-04T23:53:27.186+0000"
            }
         }
      }
   ]
}
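
To make the response above easier to follow, here is a minimal local sketch in Python of what the set processor does to each document's _source. It is not Elasticsearch code and does not call the simulate API. The function names apply_set and simulate_locally are hypothetical, and only the set processor is handled.

```python
import copy


def apply_set(source, field, value):
    """Return a copy of source with field set to value (mimics a set processor)."""
    updated = copy.deepcopy(source)
    updated[field] = value
    return updated


def simulate_locally(pipeline, docs):
    """Apply each set processor, in order, to every document's _source."""
    results = []
    for doc in docs:
        source = doc["_source"]
        for processor in pipeline["processors"]:
            # Each processor is a one-key dict such as {"set": {...options...}}.
            name, options = next(iter(processor.items()))
            if name != "set":
                raise ValueError(f"unsupported processor in this sketch: {name}")
            source = apply_set(source, options["field"], options["value"])
        results.append({**doc, "_source": source})
    return results


pipeline = {
    "description": "_description",
    "processors": [{"set": {"field": "field2", "value": "_value"}}],
}
docs = [
    {"_index": "index", "_type": "type", "_id": "id", "_source": {"foo": "bar"}},
    {"_index": "index", "_type": "type", "_id": "id", "_source": {"foo": "rab"}},
]
results = simulate_locally(pipeline, docs)
for result in results:
    print(result["_source"])
```

Both documents keep their original foo value and gain field2, which matches the _source objects in the response above.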
Accessing data in pipelines
Processors can read and modify a document's source fields as well as its metadata.


Accessing a source field:
{
  "set": {
    "field": "_source.my_field",
    "value": 582.1
  }
}
Setting a metadata field, in this case the document _id:
{
  "set": {
    "field": "_id",
    "value": "1"
  }
}
The metadata fields _index, _type, _id, _routing, and _parent can all be accessed by a processor.
Accessing ingest metadata:
{
  "set": {
    "field": "received",
    "value": "{{_ingest.timestamp}}"
  }
}
Accessing fields and metadata fields in templates:
{
  "set": {
    "field": "field_c",
    "value": "{{field_a}} {{field_b}}"
  }
}


{
  "set": {
    "field": "_index",
    "value": "{{geoip.country_iso_code}}"
  }
}
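
The template examples above substitute field values into {{...}} placeholders. The Python sketch below is a simplified stand-in for that substitution, written only to show the idea; it is not the template engine Elasticsearch uses. The names lookup and render and the sample document values are made up for illustration.

```python
import re

# Matches {{name}} or {{ dotted.name }} with optional surrounding spaces.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def lookup(doc, dotted_path):
    """Follow a dotted path such as 'geoip.country_iso_code' through nested dicts."""
    value = doc
    for key in dotted_path.split("."):
        value = value[key]
    return value


def render(template, doc):
    """Replace every placeholder in template with the matching value from doc."""
    return _PLACEHOLDER.sub(lambda m: str(lookup(doc, m.group(1))), template)


# Sample document invented for this sketch.
doc = {
    "field_a": "hello",
    "field_b": "world",
    "geoip": {"country_iso_code": "US"},
}
field_c = render("{{field_a}} {{field_b}}", doc)
index_name = render("{{geoip.country_iso_code}}", doc)
print(field_c)
print(index_name)
```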


Handling failures in pipelines


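The following example renames the foo field to bar. If the document has no foo field, the on_failure handler stores an error message in the document so that it can be analyzed later in Elasticsearch.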





{
  "description" : "my first pipeline with handled exceptions",
  "processors" : [
    {
      "rename" : {
        "field" : "foo",
        "target_field" : "bar",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "field \"foo\" does not exist, cannot rename to \"bar\""
            }
          }
        ]
      }
    }
  ]
}


Ignoring failures


{
  "description" : "my first pipeline with handled exceptions",
  "processors" : [
    {
      "rename" : {
        "field" : "foo",
        "target_field" : "bar",
        "ignore_failure" : true
      }
    }
  ]
}


Accessing the error message in a pipeline


{
  "description" : "my first pipeline with handled exceptions",
  "processors" : [
    {
      "rename" : {
        "field" : "foo",
        "target_field" : "bar",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    }
  ]
}
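
The three failure-handling variants above (a fixed error value, ignoring the failure, and recording the failure message) can be mimicked locally with the Python sketch below. It only illustrates the control flow and is not how Elasticsearch implements processors. The function names and the wording of the error message are invented here, and checking ignore_failure first is a choice made for this sketch, not a statement about Elasticsearch's behavior.

```python
def rename(doc, field, target_field):
    """Move doc[field] to doc[target_field]; fail if the field is missing."""
    if field not in doc:
        # Message wording is made up for this sketch.
        raise KeyError(f"field [{field}] doesn't exist")
    doc[target_field] = doc.pop(field)


def run(doc, field, target_field, on_failure=None, ignore_failure=False):
    """Apply rename to a copy of doc using one of the failure modes.

    on_failure: optional callable(doc, message) invoked when rename fails.
    ignore_failure: when True, a failure leaves the document unchanged.
    """
    result = dict(doc)
    try:
        rename(result, field, target_field)
    except KeyError as exc:
        if ignore_failure:
            return result
        if on_failure is None:
            raise
        on_failure(result, exc.args[0])
    return result


def record_error(doc, message):
    # Plays the role of the set processor that writes the "error" field.
    doc["error"] = message


renamed = run({"foo": 1}, "foo", "bar")
ignored = run({"other": 1}, "foo", "bar", ignore_failure=True)
handled = run({"other": 1}, "foo", "bar", on_failure=record_error)
print(renamed, ignored, handled)
```

With no handler and ignore_failure left off, the exception propagates, which corresponds to the request failing.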


Available processors


https://www.elastic.co/guide/en/elasticsearch/reference/current/ingest-processors.html
