Logstash Learning 01

Table of Contents

    • How to start Logstash
    • 1. Terminology
    • (1) @metadata
    • (2) field
    • (3) field reference
    • (4) input plugin
    • (5) filter plugin
    • (6) output plugin
    • (7) Others
    • 2. A concrete Logstash configuration example
    • 3. References

How to start Logstash

# cd into the bin directory of the extracted Logstash folder
PS C:\Users\hs> cd D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin
# logstash -f specifies the path to the config file
PS D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin> .\logstash -f D:\lihua\iot\iot-engine\code\hx-iot-engine-starter\src\main\resources\logstash.conf

1. Terminology

(1) @metadata

A special field for storing content that you do not want to include in output events. For example, the @metadata field can be used to create temporary fields for use in conditionals.
Example:

filter {
  mutate { add_field => { "show" => "This data will be in the output" } }
  mutate { add_field => { "[@metadata][test]" => "Hello" } }
  mutate { add_field => { "[@metadata][no_show]" => "This data will not be in the output" } }
}

Logstash console output:

{
    "@timestamp" => 2016-06-30T02:46:48.565Z,
    # Fields nested under @metadata never flow into the output; they are temporary fields whose lifetime is limited to the filter stage
     "@metadata" => {
           "test" => "Hello",
        "no_show" => "This data will not be in the output"
    },
      "@version" => "1",
          "host" => "example.com",
          "show" => "This data will be in the output",
       "message" => "asdf"
}
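(For @metadata to show up in the console like this, the stdout output has to be configured as stdout { codec => rubydebug { metadata => true } }; by default no output emits the @metadata field.)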

Feel free to use the @metadata field whenever you need a temporary field that you do not want to appear in the final output.
Note: in mutate { add_field => { "[@metadata][test]" => "Hello" } } the field name is [@metadata][test]; when you reference it you cannot write [test], you must write the full path [@metadata][test].
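A minimal sketch of referencing a @metadata sub-field by its full path in a conditional (the field name source_type is hypothetical):

filter {
  mutate { add_field => { "[@metadata][source_type]" => "apache" } }
  # [source_type] alone would not match; the full path is required
  if [@metadata][source_type] == "apache" {
    mutate { add_tag => "apache_log" }
  }
}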

(2) field

An event attribute. For example, each event in an Apache access log has attributes such as a status code (200, 404), a request path ("/", "index.html"), an HTTP verb (GET, POST), a client IP address, and so on. Logstash uses the term "fields" to refer to these attributes.

  1. What fields look like in practice:
    (screenshot of a sample event omitted)

  2. Creating (declaring) a field: add_field

    • Value type: hash
    • Default value: {}
    • Purpose: adds fields to an event

Example:

input {
    file { add_field => { "show" => "This data will be in the output" } }
}
filter {
    mutate { add_field => { "show" => "This field flows into the output" } }
    mutate { add_field => { "[@metadata][test]" => "Hello" } }
    mutate { add_field => { "[@metadata][no_show]" => "This field is temporary and does not flow into the output" } }
}

Note:
1. All three plugin types (input, filter, output) can create fields, as long as the specific plugin provides the add_field option.
2. If the field being created already exists, the field is converted to an array and the new value is appended as an element, as shown below:

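For instance, if the input and the filter above both add a field named show, the resulting event would contain roughly the following (a sketch of rubydebug output, not a captured run):

    "show" => [
        [0] "This data will be in the output",
        [1] "This field flows into the output"
    ]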

  3. Using (referencing) a field
    Field references are usually enclosed in square brackets ([]), e.g. [fieldname]. If you are referring to a top-level field, you can omit the [] and simply use the field name. To refer to a nested field, specify the full path to the field: [top-level field][nested field]

    1. Referencing a field in conditionals
      See the official docs for details.
      In conditionals, reference a field as [fieldname]:
    filter {
      # if the value of field foo is contained in field foobar
      if [foo] in [foobar] {
        # append the element "field in field" to the tags array
        mutate { add_tag => "field in field" }
      }
      if [foo] in "foo" {
        mutate { add_tag => "field in string" }
      }
      if "hello" in [greeting] {
        mutate { add_tag => "string in field" }
      }
      if [foo] in ["hello", "world", "foo"] {
        mutate { add_tag => "field in list" }
      }
      if [missing] in [alsomissing] {
        mutate { add_tag => "shouldnotexist" }
      }
      if !("foo" in ["hello", "world"]) {
        mutate { add_tag => "shouldexist" }
      }
    }
    

    Note: [foo] on its own tests whether the field foo exists, for example:

    output {
      # if [loglevel] exists (is not empty) and its value is "ERROR"
      if [loglevel] and [loglevel] == "ERROR" {
        pagerduty {
        ...
        }
      }
    }
    
    
    2. Referencing a field in string output
      In strings, reference a field as %{fieldname} (sprintf format).
      See the official docs.
    output {
        elasticsearch {
            hosts => ["192.168.1.83:9200"]
            index => "jmqttlogs-%{type}-%{logger}-%{loglevel}-%{+YYYY.MM}"
        }
        stdout { codec => rubydebug }
    }
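
    Note that %{+YYYY.MM} is a sprintf date token: it is expanded from the event's @timestamp (here producing one index per month), not from a field literally named +YYYY.MM.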
    

(3) field reference

A reference to an event field. Such references may appear in the filter or output blocks of a Logstash config file. Field references are usually enclosed in square brackets ([]), e.g. [fieldname]. If you are referring to a top-level field, you can omit the [] and simply use the field name. To refer to a nested field, specify the full path to the field: [top-level field][nested field]

You can think of this concept as essentially the same as field above.
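As a minimal illustration of a nested reference (the event shape here is hypothetical):

filter {
  # matches an event shaped like { "response": { "status": 404 } }
  if [response][status] == 404 {
    mutate { add_tag => "not_found" }
  }
}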

(4) input plugin

A Logstash plugin that reads event data from a specific source. Input plugins are the first stage of the Logstash event-processing pipeline. Popular input plugins include file, syslog, redis, and beats.

input {
    # file input plugin
    file {
        # plugin-specific options; see the official docs for the full list
        path => ["/jmqttlogs/*.log","/jmqttlogs/"]
        type => "test"
        exclude => ["brokerLog.log","remotingLog.log"]
    }
    # beats input plugin
    beats {
        # plugin-specific options; see the official docs for the full list
        port => 5044
    }
    # tcp input plugin
    tcp {
        # plugin-specific options; see the official docs for the full list
        port => 12345
        codec => json
    }
}

Logstash provides the following input plugins (see the official docs for details):

| Plugin | Description | Github repository |
| --- | --- | --- |
| azure_event_hubs | Receives events from Azure Event Hubs | azure_event_hubs |
| beats | Receives events from the Elastic Beats framework | logstash-input-beats |
| cloudwatch | Pulls events from the Amazon Web Services CloudWatch API | logstash-input-cloudwatch |
| couchdb_changes | Streams events from CouchDB’s _changes URI | logstash-input-couchdb_changes |
| dead_letter_queue | Reads events from Logstash’s dead letter queue | logstash-input-dead_letter_queue |
| elastic_agent | Receives events from the Elastic Agent framework | logstash-input-beats (shared) |
| elasticsearch | Reads query results from an Elasticsearch cluster | logstash-input-elasticsearch |
| exec | Captures the output of a shell command as an event | logstash-input-exec |
| file | Streams events from files | logstash-input-file |
| ganglia | Reads Ganglia packets over UDP | logstash-input-ganglia |
| gelf | Reads GELF-format messages from Graylog2 as events | logstash-input-gelf |
| generator | Generates random log events for test purposes | logstash-input-generator |
| github | Reads events from a GitHub webhook | logstash-input-github |
| google_cloud_storage | Extracts events from files in a Google Cloud Storage bucket | logstash-input-google_cloud_storage |
| google_pubsub | Consumes events from a Google Cloud PubSub service | logstash-input-google_pubsub |
| graphite | Reads metrics from the graphite tool | logstash-input-graphite |
| heartbeat | Generates heartbeat events for testing | logstash-input-heartbeat |
| http | Receives events over HTTP or HTTPS | logstash-input-http |
| http_poller | Decodes the output of an HTTP API into events | logstash-input-http_poller |
| imap | Reads mail from an IMAP server | logstash-input-imap |
| irc | Reads events from an IRC server | logstash-input-irc |
| java_generator | Generates synthetic log events | core plugin |
| java_stdin | Reads events from standard input | core plugin |
| jdbc | Creates events from JDBC data | logstash-integration-jdbc |
| jms | Reads events from a JMS broker | logstash-input-jms |
| jmx | Retrieves metrics from remote Java applications over JMX | logstash-input-jmx |
| kafka | Reads events from a Kafka topic | logstash-integration-kafka |
| kinesis | Receives events through an AWS Kinesis stream | logstash-input-kinesis |
| log4j | Reads events over a TCP socket from a Log4j SocketAppender object | logstash-input-log4j |
| lumberjack | Receives events using the Lumberjack protocol | logstash-input-lumberjack |
| meetup | Reads events from the Meetup API | logstash-input-meetup |
| pipe | Streams events from a long-running command pipe | logstash-input-pipe |
| puppet_facter | Receives facts from a Puppet server | logstash-input-puppet_facter |
| rabbitmq | Pulls events from a RabbitMQ exchange | logstash-integration-rabbitmq |
| redis | Reads events from a Redis instance | logstash-input-redis |
| relp | Receives RELP events over a TCP socket | logstash-input-relp |
| rss | Reads RSS/Atom feed data as events | logstash-input-rss |
| s3 | Streams events from files in an S3 bucket | logstash-input-s3 |
| s3-sns-sqs | Reads logs from AWS S3 buckets using SQS | logstash-input-s3-sns-sqs |
| salesforce | Creates events based on a Salesforce SOQL query | logstash-input-salesforce |
| snmp | Polls network devices using Simple Network Management Protocol (SNMP) | logstash-input-snmp |
| snmptrap | Creates events based on SNMP trap messages | logstash-input-snmptrap |
| sqlite | Creates events based on rows in an SQLite database | logstash-input-sqlite |
| sqs | Pulls events from an Amazon Web Services Simple Queue Service queue | logstash-input-sqs |
| stdin | Reads events from standard input | logstash-input-stdin |
| stomp | Creates events received with the STOMP protocol | logstash-input-stomp |
| syslog | Reads syslog messages as events | logstash-input-syslog |
| tcp | Reads events from a TCP socket | logstash-input-tcp |
| twitter | Reads events from the Twitter Streaming API | logstash-input-twitter |
| udp | Reads events over UDP | logstash-input-udp |
| unix | Reads events over a UNIX socket | logstash-input-unix |
| varnishlog | Reads from the varnish cache shared memory log | logstash-input-varnishlog |
| websocket | Reads events from a websocket | logstash-input-websocket |
| wmi | Creates events based on the results of a WMI query | logstash-input-wmi |
| xmpp | Receives events over the XMPP/Jabber protocol | logstash-input-xmpp |

Note: although input plugins, filter plugins, and output plugins are called plugins, you do not need to install them separately; Logstash already bundles them.
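You can confirm what ships with your installation using the logstash-plugin tool in the same bin directory; the --group flag filters by plugin type (input, filter, output, codec):

# list all bundled/installed input plugins
PS D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin> .\logstash-plugin list --group input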

(5) filter plugin

A Logstash plugin that performs intermediate processing on an event. Typically, filters act on event data after it has been ingested through an input, mutating, enriching, and/or modifying the data according to configuration rules. Filters are often applied conditionally depending on characteristics of the event. Popular filter plugins include grok, mutate, drop, clone, and geoip. The filter stage is optional.
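As a minimal sketch of conditional filtering (the loglevel field here is hypothetical), the following drops DEBUG events and tags everything else:

filter {
  if [loglevel] == "DEBUG" {
    # drop discards the event entirely
    drop { }
  } else {
    mutate { add_tag => "kept" }
  }
}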
Logstash provides the following filter plugins (see the official docs for details):

| Plugin | Description | Github repository |
| --- | --- | --- |
| age | Calculates the age of an event by subtracting the event timestamp from the current timestamp | logstash-filter-age |
| aggregate | Aggregates information from several events originating with a single task | logstash-filter-aggregate |
| alter | Performs general alterations to fields that the mutate filter does not handle | logstash-filter-alter |
| bytes | Parses string representations of computer storage sizes, such as "123 MB" or "5.6gb", into their numeric value in bytes | logstash-filter-bytes |
| cidr | Checks IP addresses against a list of network blocks | logstash-filter-cidr |
| cipher | Applies or removes a cipher to an event | logstash-filter-cipher |
| clone | Duplicates events | logstash-filter-clone |
| csv | Parses comma-separated value data into individual fields | logstash-filter-csv |
| date | Parses dates from fields to use as the Logstash timestamp for an event | logstash-filter-date |
| de_dot | Computationally expensive filter that removes dots from a field name | logstash-filter-de_dot |
| dissect | Extracts unstructured event data into fields using delimiters | logstash-filter-dissect |
| dns | Performs a standard or reverse DNS lookup | logstash-filter-dns |
| drop | Drops all events | logstash-filter-drop |
| elapsed | Calculates the elapsed time between a pair of events | logstash-filter-elapsed |
| elasticsearch | Copies fields from previous log events in Elasticsearch to current events | logstash-filter-elasticsearch |
| environment | Stores environment variables as metadata sub-fields | logstash-filter-environment |
| extractnumbers | Extracts numbers from a string | logstash-filter-extractnumbers |
| fingerprint | Fingerprints fields by replacing values with a consistent hash | logstash-filter-fingerprint |
| geoip | Adds geographical information about an IP address | logstash-filter-geoip |
| grok | Parses unstructured event data into fields | logstash-filter-grok |
| http | Provides integration with external web services/REST APIs | logstash-filter-http |
| i18n | Removes special characters from a field | logstash-filter-i18n |
| java_uuid | Generates a UUID and adds it to each processed event | core plugin |
| jdbc_static | Enriches events with data pre-loaded from a remote database | logstash-integration-jdbc |
| jdbc_streaming | Enriches events with your database data | logstash-integration-jdbc |
| json | Parses JSON events | logstash-filter-json |
| json_encode | Serializes a field to JSON | logstash-filter-json_encode |
| kv | Parses key-value pairs | logstash-filter-kv |
| memcached | Provides integration with external data in Memcached | logstash-filter-memcached |
| metricize | Takes complex events containing a number of metrics and splits these up into multiple events, each holding a single metric | logstash-filter-metricize |
| metrics | Aggregates metrics | logstash-filter-metrics |
| mutate | Performs mutations on fields | logstash-filter-mutate |
| prune | Prunes event data based on a list of fields to blacklist or whitelist | logstash-filter-prune |
| range | Checks that specified fields stay within given size or length limits | logstash-filter-range |
| ruby | Executes arbitrary Ruby code | logstash-filter-ruby |
| sleep | Sleeps for a specified time span | logstash-filter-sleep |
| split | Splits multi-line messages, strings, or arrays into distinct events | logstash-filter-split |
| syslog_pri | Parses the PRI (priority) field of a syslog message | logstash-filter-syslog_pri |
| threats_classifier | Enriches security logs with information about the attacker’s intent | logstash-filter-threats_classifier |
| throttle | Throttles the number of events | logstash-filter-throttle |
| tld | Extracts the top-level domain (TLD) from a field containing a domain name | logstash-filter-tld |
| translate | Replaces field contents based on a hash or YAML file | logstash-filter-translate |
| truncate | Truncates fields longer than a given length | logstash-filter-truncate |
| urldecode | Decodes URL-encoded fields | logstash-filter-urldecode |
| useragent | Parses user agent strings into fields | logstash-filter-useragent |
| uuid | Adds a UUID to events | logstash-filter-uuid |
| wurfl_device_detection | Enriches logs with device information such as brand, model, OS | logstash-filter-wurfl_device_detection |
| xml | Parses XML into fields | logstash-filter-xml |

(6) output plugin

A Logstash plugin that writes event data to a specific destination. Outputs are the final stage of the event pipeline. Popular output plugins include elasticsearch, file, graphite, and statsd.
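Outputs can likewise be applied conditionally. A minimal sketch (the field name and file path are hypothetical): route error events to a dedicated file while still printing everything to the console:

output {
  if [loglevel] == "ERROR" {
    # write only error events to a separate file
    file { path => "D:/logs/errors.log" }
  }
  # print every event to the console
  stdout { codec => rubydebug }
}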

Logstash provides the following output plugins (see the official docs for details):

| Plugin | Description | Github repository |
| --- | --- | --- |
| app_search | Sends events to the Elastic App Search solution | logstash-integration-elastic_enterprise_search |
| boundary | Sends annotations to Boundary based on Logstash events | logstash-output-boundary |
| circonus | Sends annotations to Circonus based on Logstash events | logstash-output-circonus |
| cloudwatch | Aggregates and sends metric data to AWS CloudWatch | logstash-output-cloudwatch |
| csv | Writes events to disk in a delimited format | logstash-output-csv |
| datadog | Sends events to DataDogHQ based on Logstash events | logstash-output-datadog |
| datadog_metrics | Sends metrics to DataDogHQ based on Logstash events | logstash-output-datadog_metrics |
| dynatrace | Sends events to Dynatrace based on Logstash events | logstash-output-dynatrace |
| elastic_app_search | Sends events to the Elastic App Search solution | logstash-integration-elastic_enterprise_search |
| elastic_workplace_search | Sends events to the Elastic Workplace Search solution | logstash-integration-elastic_enterprise_search |
| elasticsearch | Stores logs in Elasticsearch | logstash-output-elasticsearch |
| email | Sends email to a specified address when output is received | logstash-output-email |
| exec | Runs a command for a matching event | logstash-output-exec |
| file | Writes events to files on disk | logstash-output-file |
| ganglia | Writes metrics to Ganglia’s gmond | logstash-output-ganglia |
| gelf | Generates GELF formatted output for Graylog2 | logstash-output-gelf |
| google_bigquery | Writes events to Google BigQuery | logstash-output-google_bigquery |
| google_cloud_storage | Uploads log events to Google Cloud Storage | logstash-output-google_cloud_storage |
| google_pubsub | Uploads log events to Google Cloud Pubsub | logstash-output-google_pubsub |
| graphite | Writes metrics to Graphite | logstash-output-graphite |
| graphtastic | Sends metric data on Windows | logstash-output-graphtastic |
| http | Sends events to a generic HTTP or HTTPS endpoint | logstash-output-http |
| influxdb | Writes metrics to InfluxDB | logstash-output-influxdb |
| irc | Writes events to IRC | logstash-output-irc |
| java_stdout | Prints events to the STDOUT of the shell | core plugin |
| juggernaut | Pushes messages to the Juggernaut websockets server | logstash-output-juggernaut |
| kafka | Writes events to a Kafka topic | logstash-integration-kafka |
| librato | Sends metrics, annotations, and alerts to Librato based on Logstash events | logstash-output-librato |
| loggly | Ships logs to Loggly | logstash-output-loggly |
| lumberjack | Sends events using the lumberjack protocol | logstash-output-lumberjack |
| metriccatcher | Writes metrics to MetricCatcher | logstash-output-metriccatcher |
| mongodb | Writes events to MongoDB | logstash-output-mongodb |
| nagios | Sends passive check results to Nagios | logstash-output-nagios |
| nagios_nsca | Sends passive check results to Nagios using the NSCA protocol | logstash-output-nagios_nsca |
| opentsdb | Writes metrics to OpenTSDB | logstash-output-opentsdb |
| pagerduty | Sends notifications based on preconfigured services and escalation policies | logstash-output-pagerduty |
| pipe | Pipes events to another program’s standard input | logstash-output-pipe |
| rabbitmq | Pushes events to a RabbitMQ exchange | logstash-integration-rabbitmq |
| redis | Sends events to a Redis queue using the RPUSH command | logstash-output-redis |
| redmine | Creates tickets using the Redmine API | logstash-output-redmine |
| riak | Writes events to the Riak distributed key/value store | logstash-output-riak |
| riemann | Sends metrics to Riemann | logstash-output-riemann |
| s3 | Sends Logstash events to the Amazon Simple Storage Service | logstash-output-s3 |
| sink | Discards any events received | core plugin |
| sns | Sends events to Amazon’s Simple Notification Service | logstash-output-sns |
| solr_http | Stores and indexes logs in Solr | logstash-output-solr_http |
| sqs | Pushes events to an Amazon Web Services Simple Queue Service queue | logstash-output-sqs |
| statsd | Sends metrics using the statsd network daemon | logstash-output-statsd |
| stdout | Prints events to the standard output | logstash-output-stdout |
| stomp | Writes events using the STOMP protocol | logstash-output-stomp |
| syslog | Sends events to a syslog server | logstash-output-syslog |
| tcp | Writes events over a TCP socket | logstash-output-tcp |
| timber | Sends events to the Timber.io logging service | logstash-output-timber |
| udp | Sends events over UDP | logstash-output-udp |
| webhdfs | Sends Logstash events to HDFS using the webhdfs REST API | logstash-output-webhdfs |
| websocket | Publishes messages to a websocket | logstash-output-websocket |
| workplace_search | Sends events to the Elastic Workplace Search solution | logstash-integration-elastic_enterprise_search |
| xmpp | Posts events over XMPP | logstash-output-xmpp |
| zabbix | Sends events to a Zabbix server | logstash-output-zabbix |

(7) Others

See the glossary in the official documentation.

2. A concrete Logstash configuration example

Note: do not write comments into the config file you actually run, or loading may fail (Logstash does support # comments, so the failure is most likely an encoding issue with non-ASCII comment text). The comments in the example below are for explanation only; remove them before running.

# input configuration
input {
    # file input plugin: the data source is a file, here .log log files
    file {
        # File paths; only absolute paths are allowed, not relative ones.
        # One detail: if you need to exclude files, you must also supply a directory path.
        # This option is required.
        path => ["D:/lihua/javacode/jmqtt/iot-jmqtt/code/jmqttlogs/*.log","D:/lihua/javacode/jmqtt/iot-jmqtt/code/jmqttlogs/"]
        # Files to exclude; must be used together with path.
        exclude => ["brokerLog.log","remotingLog.log"]
        # type is a field, not required; its value has no fixed meaning and can be set freely
        type => "test"
    }
}
# filter configuration: filters (processes, parses) the data from the input
filter {
    # log-parsing plugin; its usage is covered in detail later
    grok {
        # the match pattern; it can be generated with the official online tool
        match => { "message" => "(?<timestamp>%{TIMESTAMP_ISO8601}) \[%{LOGLEVEL:loglevel}\] (?<logger>[A-Za-z0-9$_.]+) – %{GREEDYDATA:messagebody}$" }
    }
    # JSON parsing plugin: detects and parses JSON contained in the log
    json {
        # the field to parse; after parsing, the JSON properties become fields
        source => "messagebody"
    }
    # mutation plugin, commonly used to transform field values, e.g. to lowercase
    mutate {
        # lowercase the given fields; note: Elasticsearch index names cannot contain uppercase letters
        lowercase => [ "logger","loglevel" ]
        # remove unneeded fields; the counterpart of add_field, and most plugins provide both options
        remove_field => ["path","timestamp"]
    }
}
# output configuration
output {
    # elasticsearch output plugin: stores the logs in Elasticsearch
    elasticsearch {
        # Elasticsearch address
        hosts => ["192.168.1.83:9200"]
        # target index; created if it does not exist. Note: index names cannot contain uppercase letters
        index => "jmqttlogs-%{type}-%{logger}-%{loglevel}-%{+YYYY.MM}"
    }
    # console output plugin; with this configured the Logstash console prints debug output
    stdout { codec => rubydebug }
}
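Before starting the pipeline, you can have Logstash validate the configuration syntax and exit without running it; --config.test_and_exit is a standard Logstash CLI flag:

# run from the Logstash bin directory, pointing -f at your own config path
PS D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin> .\logstash -f logstash.conf --config.test_and_exit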

3. References

Link
