Developing a Logstash Filter Plugin

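Logstash is an open source data collection engine with real-time pipelining capabilities. In the ELK Stack, the more lightweight Filebeat is usually chosen to collect logs and ship them to Logstash for processing; Logstash then sends the processed logs to the configured destination (Elasticsearch, Kafka, etc.).
A Logstash event passes through the pipeline inputs → filters → outputs, and custom plugins can be written for all three stages. This article covers how to develop a filter plugin, the stage where custom requirements are most common.
Installing Logstash is not covered in detail here. Download link: https://www.elastic.co/downloads/logstash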

Generating the filter plugin

cd into the Logstash root directory and use bin/logstash-plugin to generate a filter plugin template:

bin/logstash-plugin generate --type filter --name test  --path vendor/localgems

vendor/localgems can be replaced with a path of your own.
The directory structure of the generated filter plugin looks like this:

$ tree logstash-filter-test
├── Gemfile
├── LICENSE
├── README.md
├── Rakefile
├── lib
│   └── logstash
│       └── filters
│           └── test.rb
├── logstash-filter-test.gemspec
└── spec
    ├── filters
    │   └── test_spec.rb
    └── spec_helper.rb

A first look at the filter plugin

Code structure

Logstash plugins are written in Ruby. The file lib/logstash/filters/test.rb contains the following:

# encoding: utf-8
require "logstash/filters/base"
require "logstash/namespace"

# This test filter will replace the contents of the default
# message field with whatever you specify in the configuration.
#
# It is only intended to be used as an example.
class LogStash::Filters::Test < LogStash::Filters::Base

  # Setting the config_name here is required. This is how you
  # configure this filter from your Logstash config.
  #
  # filter {
  #   test {
  #     message => "My message..."
  #   }
  # }
  #
  config_name "test"
  
  # Replace the message with this value.
  config :message, :validate => :string, :default => "Hello World!"
  

  public
  def register
    # Add instance variables 
  end # def register

  public
  def filter(event)

    if @message
      # Replace the event message with our message as configured in the
      # config file.
      event.set("message", @message)
    end

    # filter_matched should go in the last line of our successful code
    filter_matched(event)
  end # def filter
end # class LogStash::Filters::Test

UTF-8 encoding

Logstash relies on UTF-8 encoding, so the following line must be added at the top of the plugin code:

# encoding: utf-8

require

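By default the template code requires "logstash/filters/base" and "logstash/namespace". If the plugin depends on other code or gems, add the corresponding require statements here; the MySQL query code later in this article is an example.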

Plugin name

The plugin name is configured as follows:

config_name "test"

test is the plugin name, which is the name used inside the filter block of the Logstash configuration.

Plugin parameters

Plugin parameters are configured as follows:

config :message, :validate => :string, :default => "Hello World!"

message is an optional parameter of the test plugin; its default value is "Hello World!". The general form of a parameter declaration is:

config :variable_name, :validate => :variable_type, :default => "Default value", :required => boolean, :deprecated => boolean, :obsolete => string
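  • :variable_name: the parameter name
  • :validate: validates the parameter type, e.g. :string, :password, :boolean, :number, :array, :hash, :path
  • :required: whether the parameter must be set
  • :default: the default value
  • :deprecated: whether the parameter is deprecated
  • :obsolete: declares that the parameter is no longer used, usually with a message describing how to upgrade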

Plugin methods

A Logstash filter plugin must implement two methods: register and filter.
The register method looks like this:

  public
  def register
    # Add instance variables 
  end # def register

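The register method acts as an initializer and does not need to be called manually. Inside it you can use configuration variables such as @message and initialize your own instance variables.
The filter method looks like this: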

  public
  def filter(event)

    if @message
      # Replace the event message with our message as configured in the
      # config file.
      event.set("message", @message)
    end

    # filter_matched should go in the last line of our successful code
    filter_matched(event)
  end # def filter

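The filter method holds the plugin's data-processing logic. The event variable wraps the data flowing through the pipeline, and its contents can be accessed through the event API; for details see https://www.elastic.co/guide/en/logstash/5.1/event-api.html . The last statement calls filter_matched, which ensures that the Logstash options add_field, remove_field, add_tag, and remove_tag are applied correctly.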
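
As a rough illustration of the get/set style of the event API described above, the sketch below uses a hash-backed stand-in class so that it runs with plain Ruby outside Logstash. FakeEvent, uppercase_message, and the original_message field are hypothetical names made up for this example; only the get and set calls mirror the real event API, and the stand-in handles top-level field names only.

```ruby
# Hypothetical stand-in for LogStash::Event so the sketch runs with plain Ruby.
# Only top-level field names are supported here.
class FakeEvent
  def initialize(data = {})
    @data = data.dup
  end

  # Same shape as the real event API: read a field by name.
  def get(field)
    @data[field]
  end

  # Same shape as the real event API: write a field by name.
  def set(field, value)
    @data[field] = value
  end
end

# Hypothetical filter body: keep a copy of the original message in another
# field, then overwrite "message" with an upper-cased version.
def uppercase_message(event)
  original = event.get("message")
  return event if original.nil?

  event.set("original_message", original)
  event.set("message", original.upcase)
  event
end

event = uppercase_message(FakeEvent.new("message" => "hello logstash"))
puts event.get("message")
puts event.get("original_message")
```

Inside a real plugin the same get and set calls would be made on the event passed to filter, followed by filter_matched(event).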

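Using other libraries in the plugin

This section uses querying MySQL from within the plugin as an example. MySQL is accessed through JDBC, which requires installing jdbc-mysql, as follows.
Add the Logstash environment variables: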

export LOGSTASH_HOME=/opt/logstash-5.2.1
export PATH=$PATH:$LOGSTASH_HOME/vendor/jruby/bin

Install jdbc-mysql:

gem install jdbc-mysql

Use sequel (for its code and documentation, see vendor/bundle/jruby/1.9/gems/sequel-4.43.0) to work with MySQL. First add a dependency on sequel to the logstash-filter-test.gemspec file:

# Gem dependencies
s.add_runtime_dependency "logstash-core-plugin-api", "~> 2.0"
s.add_runtime_dependency 'sequel'
s.add_development_dependency 'logstash-devutils'

Then require the relevant code in test.rb:

require "sequel"
require "sequel/adapters/jdbc"

Add a :jdbc_driver_library configuration parameter to test.rb to specify the path of the JDBC driver library. On my machine the path is "/usr/local/lib/ruby/gems/2.3.0/gems/jdbc-mysql-5.1.40/lib/mysql-connector-java-5.1.40-bin.jar".

config :jdbc_driver_library, :validate => :string, :required => true

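The register method does two things: it initializes several instance variables, and it requires the JDBC driver library. Briefly, @logger is used for logging; @connection_retry_attempts and @connection_retry_attempts_wait_time control database connection retries; @connection_wait_timeout sets the MySQL session timeout so that connections to MySQL do not pile up. This is a belt-and-braces measure: normally MySQL has a global timeout configured, and we also disconnect explicitly once the query finishes (see the fetch_info method), so @connection_wait_timeout only takes effect when the disconnect fails and MySQL's own timeout is too long.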

public
def register
  # Add instance variables 
  @logger = self.logger
  @connection_retry_attempts = 5
  @connection_retry_attempts_wait_time = 1
  @connection_wait_timeout = 10
  begin
    require @jdbc_driver_library
  rescue => e
    @logger.error("Failed to load #{@jdbc_driver_library}", :exception => e)
  end
end # def register

Create the db instance:

private 
def create_db(conn_str)
  db = nil
  retry_attempts = @connection_retry_attempts
  while retry_attempts > 0 do
    retry_attempts -= 1
    begin
      tmp_db = Sequel.connect(conn_str)
    rescue Sequel::PoolTimeout => e
      if retry_attempts <= 0
        @logger.error("Failed to connect to database. 5 second timeout exceeded. Tried #{@connection_retry_attempts} times.")
        raise e
      else
        @logger.error("Failed to connect to database. 5 second timeout exceeded. Trying again.")  
      end
    rescue Sequel::Error => e
      if retry_attempts <= 0
        @logger.error("Unable to connect to database. Tried #{@connection_retry_attempts} times", :error_message => e.message)
        raise e
      else
        @logger.error("Unable to connect to database. Trying again", :error_message => e.message)
      end
    else
      db = tmp_db
      break
    end
    sleep(@connection_retry_attempts_wait_time)
  end
  db
end
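
The retry loop in create_db could also be factored into a small reusable helper. The following is only a sketch under stated assumptions: with_retries and its parameters are hypothetical names, and StandardError stands in for Sequel::Error so that the example runs without the sequel gem.

```ruby
# Hypothetical helper: run the block up to `attempts` times, sleeping `wait`
# seconds between failed tries, and re-raise the last error once all fail.
def with_retries(attempts, wait, error_class = StandardError)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue error_class => e
    raise e if tries >= attempts
    sleep(wait)
    retry
  end
end

# Usage: the block fails twice and succeeds on the third attempt.
calls = 0
result = with_retries(5, 0) do |attempt|
  calls += 1
  raise "connection refused" if attempt < 3
  "connected on attempt #{attempt}"
end
puts result
puts calls
```

In the plugin, the block would presumably wrap Sequel.connect(conn_str), with Sequel::Error passed as the error class and the two retry instance variables passed as attempts and wait.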

Query the data:

private
def fetch_info(db, sql, key)
  all_info = {}
  retry_attempts = @connection_retry_attempts
  while retry_attempts > 0 do
    retry_attempts -= 1
    begin
      db.fetch(sql) do |row|
        all_info[row[key]] = row
      end
      db.run "set wait_timeout = " + @connection_wait_timeout.to_s
    rescue Sequel::DatabaseConnectionError, Sequel::DatabaseError => e
      if retry_attempts <= 0
        @logger.warn("Exception when executing JDBC query", :exception => e)
        raise e
      else
        @logger.error("Failed to execute query. Trying again.", :error_message => e.message)
      end
    else
      break
    end
    sleep(@connection_retry_attempts_wait_time)
  end
  db.disconnect()
  all_info
end

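The create_db and fetch_info methods can now be used in register and filter as needed.
Note: querying MySQL is used here only as an example. When processing Logstash events, consider the impact on performance and throughput.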
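
One possible way to limit the performance impact mentioned in the note is to cache lookup results and refresh them periodically instead of querying MySQL for every event. The sketch below is an assumption and not part of the original plugin: TtlCache, its ttl_seconds parameter, and the loader block are hypothetical names, and the loader stands in for a call such as fetch_info.

```ruby
# Hypothetical cache: keeps the last loaded value and reloads it only after
# `ttl_seconds` have elapsed. A Mutex guards reloads because filters may be
# run by several worker threads.
class TtlCache
  def initialize(ttl_seconds, &loader)
    @ttl_seconds = ttl_seconds
    @loader = loader
    @mutex = Mutex.new
    @value = nil
    @loaded_at = nil
  end

  # Return the cached value, calling the loader when it is missing or stale.
  def get(now = Time.now)
    @mutex.synchronize do
      if @loaded_at.nil? || (now - @loaded_at) >= @ttl_seconds
        @value = @loader.call
        @loaded_at = now
      end
      @value
    end
  end
end

# Usage: the loader block stands in for an expensive database query.
loads = 0
cache = TtlCache.new(60) do
  loads += 1
  { "42" => { "name" => "example" } }
end

start = Time.now
cache.get(start)        # first call loads the data
cache.get(start + 10)   # within the TTL, served from the cache
cache.get(start + 61)   # TTL expired, the loader runs again
puts loads
```

With this shape, the cache could be created once in register and read in filter, so most events would be served from memory; the right TTL depends on how stale the looked-up data is allowed to be.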

Configuring the custom plugin in Logstash

cd into the Logstash root directory and add the following line to the Gemfile:

gem "logstash-filter-test", :path => "vendor/localgems/logstash-filter-test"

Starting Logstash

Start Logstash with our custom test plugin configured, as follows:

bin/logstash -e 'input { beats { port => "5043" } } filter { test { jdbc_driver_library => "/usr/local/lib/ruby/gems/2.3.0/gems/jdbc-mysql-5.1.40/lib/mysql-connector-java-5.1.40-bin.jar" } } output { stdout { codec => rubydebug }}'

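Alternatively, put the same content as the -e argument above into a configuration file and start Logstash with that file.
For details on starting Logstash, see https://www.elastic.co/guide/en/logstash/5.1/running-logstash-command-line.html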
