Table of Contents
1 Introduction
1.1 Inputs
1.1.1 Input plugins
1.2 Filters
1.3 Output
1.3.1 Output plugins
1.3.2 CSV output plugin example
1.4 Logstash features
1.4.1 Plug and play
1.4.2 Extensibility
1.4.3 Durability and security
1.4.4 Monitoring
1.4.5 Management and inspection
2 Installation
2.1 Download and upload
2.2 Extract
2.3 Configuration
2.3.1 Starting directly
2.3.2 Using a configuration file
2.4 Startup
3 Common configuration examples
3.1 flow-es.conf
3.2 flow-kafka.conf
3.3 kafka-es.conf
3.4 logstash.conf
Logstash is an open-source, server-side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to your favorite "stash".
Implementation language: JRuby
Ingest data of all shapes, sizes, and sources
Data is often scattered or siloed across many systems in many formats. Logstash supports a variety of inputs that pull in events from a multitude of common sources, all at the same time. Easily ingest from your logs, metrics, web applications, data stores, and various AWS services, all in a continuous, streaming fashion.
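For illustration, a minimal input section might combine a file input and a Beats listener; the path and port below are placeholder values, not taken from this document:

input {
  # Tail application log files (illustrative path)
  file {
    path => "/var/log/app/*.log"
    start_position => "beginning"
  }
  # Accept events shipped by Filebeat or other Beats agents on port 5044
  beats {
    port => 5044
  }
}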
https://www.elastic.co/guide/en/logstash/current/input-plugins.html
An input plugin enables a specific source of events to be read by Logstash.
The following input plugins are available below. For a list of Elastic supported plugins, please consult the Support Matrix.
Plugin | Description | Github repository
beats | Receives events from the Elastic Beats framework | logstash-input-beats
cloudwatch | Pulls events from the Amazon Web Services CloudWatch API | logstash-input-cloudwatch
couchdb_changes | Streams events from CouchDB's _changes URI | logstash-input-couchdb_changes
dead_letter_queue | Reads events from Logstash's dead letter queue | logstash-input-dead_letter_queue
elasticsearch | Reads query results from an Elasticsearch cluster | logstash-input-elasticsearch
exec | Captures the output of a shell command as an event | logstash-input-exec
file | Streams events from files | logstash-input-file
ganglia | Reads Ganglia packets over UDP | logstash-input-ganglia
gelf | Reads GELF-format messages from Graylog2 as events | logstash-input-gelf
generator | Generates random log events for test purposes | logstash-input-generator
github | Reads events from a GitHub webhook | logstash-input-github
google_pubsub | Consume events from a Google Cloud PubSub service | logstash-input-google_pubsub
graphite | Reads metrics from the graphite tool | logstash-input-graphite
heartbeat | Generates heartbeat events for testing | logstash-input-heartbeat
http | Receives events over HTTP or HTTPS | logstash-input-http
http_poller | Decodes the output of an HTTP API into events | logstash-input-http_poller
imap | Reads mail from an IMAP server | logstash-input-imap
irc | Reads events from an IRC server | logstash-input-irc
jdbc | Creates events from JDBC data | logstash-input-jdbc
jms | Reads events from a Jms Broker | logstash-input-jms
jmx | Retrieves metrics from remote Java applications over JMX | logstash-input-jmx
kafka | Reads events from a Kafka topic | logstash-input-kafka
kinesis | Receives events through an AWS Kinesis stream | logstash-input-kinesis
log4j | Reads events over a TCP socket from a Log4j SocketAppender object | logstash-input-log4j
lumberjack | Receives events using the Lumberjack protocol | logstash-input-lumberjack
meetup | Captures the output of command line tools as an event | logstash-input-meetup
pipe | Streams events from a long-running command pipe | logstash-input-pipe
puppet_facter | Receives facts from a Puppet server | logstash-input-puppet_facter
rabbitmq | Pulls events from a RabbitMQ exchange | logstash-input-rabbitmq
redis | Reads events from a Redis instance | logstash-input-redis
relp | Receives RELP events over a TCP socket | logstash-input-relp
rss | Captures the output of command line tools as an event | logstash-input-rss
s3 | Streams events from files in a S3 bucket | logstash-input-s3
salesforce | Creates events based on a Salesforce SOQL query | logstash-input-salesforce
snmptrap | Creates events based on SNMP trap messages | logstash-input-snmptrap
sqlite | Creates events based on rows in an SQLite database | logstash-input-sqlite
sqs | Pulls events from an Amazon Web Services Simple Queue Service queue | logstash-input-sqs
stdin | Reads events from standard input | logstash-input-stdin
stomp | Creates events received with the STOMP protocol | logstash-input-stomp
syslog | Reads syslog messages as events | logstash-input-syslog
tcp | Reads events from a TCP socket | logstash-input-tcp
twitter | Reads events from the Twitter Streaming API | logstash-input-twitter
udp | Reads events over UDP | logstash-input-udp
unix | Reads events over a UNIX socket | logstash-input-unix
varnishlog | Reads from the varnish cache shared memory log | logstash-input-varnishlog
websocket | Reads events from a websocket | logstash-input-websocket
wmi | Creates events based on the results of a WMI query | logstash-input-wmi
xmpp | Receives events over the XMPP/Jabber protocol | logstash-input-xmpp
https://www.elastic.co/guide/en/logstash/current/filter-plugins.html
Parse and transform your data on the fly
As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them to converge on a common format for easier, faster analysis and business value.
Logstash dynamically transforms and prepares your data regardless of format or complexity. A rich library of filters makes the possibilities virtually endless.
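As a rough sketch of what a filter stage can look like (the grok pattern and field names are illustrative and not taken from this document), an Apache-style access log could be structured like this:

filter {
  # Parse the raw line into named fields using a stock grok pattern
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Use the parsed request timestamp as the event's @timestamp
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  # Normalize a field name for downstream consumers
  mutate {
    rename => { "clientip" => "client_ip" }
  }
}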
Choose your stash, transport your data
While Elasticsearch is our go-to output that opens up a world of search and analytics possibilities, it is not the only one available.
Logstash has a variety of outputs that let you route data wherever you want, giving you the flexibility to unlock a wealth of downstream use cases.
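A simple sketch of routing the same events to more than one destination at once (the hosts and index name are placeholders):

output {
  # Index events into Elasticsearch for search and analytics
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
  # Also print events to the console while debugging the pipeline
  stdout { codec => rubydebug }
}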
https://www.elastic.co/guide/en/logstash/current/output-plugins.html
An output plugin sends event data to a particular destination. Outputs are the final stage in the event pipeline.
The following output plugins are available below. For a list of Elastic supported plugins, please consult the Support Matrix.
Plugin | Description | Github repository
boundary | Sends annotations to Boundary based on Logstash events | logstash-output-boundary
circonus | Sends annotations to Circonus based on Logstash events | logstash-output-circonus
cloudwatch | Aggregates and sends metric data to AWS CloudWatch | logstash-output-cloudwatch
csv | Writes events to disk in a delimited format | logstash-output-csv
datadog | Sends events to DataDogHQ based on Logstash events | logstash-output-datadog
datadog_metrics | Sends metrics to DataDogHQ based on Logstash events | logstash-output-datadog_metrics
elasticsearch | Stores logs in Elasticsearch | logstash-output-elasticsearch
email | Sends email to a specified address when output is received | logstash-output-email
exec | Runs a command for a matching event | logstash-output-exec
file | Writes events to files on disk | logstash-output-file
ganglia | Writes metrics to Ganglia's gmond | logstash-output-ganglia
gelf | Generates GELF formatted output for Graylog2 | logstash-output-gelf
google_bigquery | Writes events to Google BigQuery | logstash-output-google_bigquery
graphite | Writes metrics to Graphite | logstash-output-graphite
graphtastic | Sends metric data on Windows | logstash-output-graphtastic
http | Sends events to a generic HTTP or HTTPS endpoint | logstash-output-http
influxdb | Writes metrics to InfluxDB | logstash-output-influxdb
irc | Writes events to IRC | logstash-output-irc
juggernaut | Pushes messages to the Juggernaut websockets server | logstash-output-juggernaut
kafka | Writes events to a Kafka topic | logstash-output-kafka
librato | Sends metrics, annotations, and alerts to Librato based on Logstash events | logstash-output-librato
loggly | Ships logs to Loggly | logstash-output-loggly
lumberjack | Sends events using the lumberjack protocol | logstash-output-lumberjack
metriccatcher | Writes metrics to MetricCatcher | logstash-output-metriccatcher
mongodb | Writes events to MongoDB | logstash-output-mongodb
nagios | Sends passive check results to Nagios | logstash-output-nagios
nagios_nsca | Sends passive check results to Nagios using the NSCA protocol | logstash-output-nagios_nsca
opentsdb | Writes metrics to OpenTSDB | logstash-output-opentsdb
pagerduty | Sends notifications based on preconfigured services and escalation policies | logstash-output-pagerduty
pipe | Pipes events to another program's standard input | logstash-output-pipe
rabbitmq | Pushes events to a RabbitMQ exchange | logstash-output-rabbitmq
redis | Sends events to a Redis queue using the RPUSH command | logstash-output-redis
redmine | Creates tickets using the Redmine API | logstash-output-redmine
riak | Writes events to the Riak distributed key/value store | logstash-output-riak
riemann | Sends metrics to Riemann | logstash-output-riemann
s3 | Sends Logstash events to the Amazon Simple Storage Service | logstash-output-s3
sns | Sends events to Amazon's Simple Notification Service | logstash-output-sns
solr_http | Stores and indexes logs in Solr | logstash-output-solr_http
sqs | Pushes events to an Amazon Web Services Simple Queue Service queue | logstash-output-sqs
statsd | Sends metrics using the statsd network daemon | logstash-output-statsd
stdout | Prints events to the standard output | logstash-output-stdout
stomp | Writes events using the STOMP protocol | logstash-output-stomp
syslog | Sends events to a syslog server | logstash-output-syslog
tcp | Writes events over a TCP socket | logstash-output-tcp
timber | Sends events to the Timber.io logging service | logstash-output-timber
udp | Sends events over UDP | logstash-output-udp
webhdfs | Sends Logstash events to HDFS using the webhdfs REST API | logstash-output-webhdfs
websocket | Publishes messages to a websocket | logstash-output-websocket
xmpp | Posts events over XMPP | logstash-output-xmpp
zabbix | Sends events to a Zabbix server | logstash-output-zabbix
CSV output configuration options
Setting | Input type | Required
create_if_deleted | boolean | No
csv_options | hash | No
dir_mode | number | No
fields | array | Yes
file_mode | number | No
filename_failure | string | No
flush_interval | number | No
gzip | boolean | No
path | string | Yes
spreadsheet_safe | boolean | No
output {
  file {
    path => ...
    codec => line { format => "custom format: %{message}"}
  }
}
The path to the file to write. Event fields can be used here,
like `/var/log/logstash/%{host}/%{application}`
One may also utilize the path option for date-based log
rotation via the joda time format. This will use the event
timestamp.
E.g.: `path => "./test-%{+YYYY-MM-dd}.txt"` to create
`./test-2013-05-29.txt`
If an absolute path is used, it cannot start with a dynamic string. For example, /%{myfield}/ and /test-%{myfield}/ are not valid paths.
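Putting the two required options together, a minimal csv output might look like the following sketch (the field names and path are illustrative):

output {
  csv {
    # Event fields to write, in column order
    fields => ["timestamp", "host", "message"]
    # Destination file; date math in the path works as described above
    path => "/tmp/export-%{+YYYY-MM-dd}.csv"
  }
}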
Common Options
The following configuration options are supported by all output plugins:
Setting | Input type | Required
codec | codec | No
enable_metric | boolean | No
id | string | No
output {
  csv {
    id => "my_plugin_id"
  }
}
Get faster insights with the Elastic Stack
Logstash modules orchestrate a turnkey ingestion-to-visualization experience for popular data sources such as ArcSight and NetFlow. With the ability to instantly deploy ingestion pipelines and sophisticated dashboards, your data exploration starts within minutes.
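For example, the bundled NetFlow module can be launched from the command line; --setup loads the module's index template and dashboards, and -M overrides a module variable (the port value below is only an example):

# Run the NetFlow module and load its dashboards into Kibana
bin/logstash --modules netflow --setup -M "netflow.var.input.udp.port=2055"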
Craft and configure your pipeline, your way
Logstash has a pluggable framework featuring over 200 plugins. Mix, match, and orchestrate different inputs, filters, and outputs to work in pipeline harmony.
Ingesting from a custom application? Don't see a plugin you need? Logstash plugins are easy to build. There is a well-documented plugin development API and a plugin generator to help you start and share your creations.
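For instance, the plugin generator that ships with Logstash can scaffold a new plugin skeleton; the plugin name and path below are made up for illustration:

# Generate the skeleton of a new input plugin in the current directory
bin/logstash-plugin generate --type input --name myservice --path .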
Trust in a pipeline that delivers
If a Logstash node happens to fail, Logstash guarantees at-least-once delivery for your in-flight events with its persistent queue. Events that cannot be processed successfully can be shunted to a dead letter queue for introspection and replay. Because it can absorb throughput, Logstash scales through ingestion spikes without requiring an external queueing layer.
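A minimal sketch of the corresponding logstash.yml settings, assuming a recent Logstash release (the size limit is an arbitrary example):

# logstash.yml: buffer in-flight events on disk and keep failed events for replay
queue.type: persisted
queue.max_bytes: 1gb
dead_letter_queue.enable: true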
Whether you are running 10 or 1000 Logstash instances, you can fully secure your ingestion pipeline. Incoming data from Beats and other inputs can be encrypted over the wire, and Logstash integrates fully with a secured Elasticsearch cluster.
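As a sketch, wire encryption for a Beats input is enabled through the plugin's SSL options (the certificate and key paths are placeholders):

input {
  beats {
    port => 5044
    ssl => true
    # Paths to the server certificate and private key (placeholders)
    ssl_certificate => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.key"
  }
}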
Full visibility into your deployment
Logstash pipelines are often multipurpose and can become complex, so a deep understanding of pipeline performance, availability, and bottlenecks is invaluable. With the monitoring and pipeline viewer features, you can easily observe and study an active Logstash node or a full deployment.
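Assuming an X-Pack-enabled stack (setting names vary slightly between releases), monitoring data can be shipped from logstash.yml roughly like this; the host is a placeholder:

# logstash.yml: send node and pipeline metrics to a monitoring cluster
xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.hosts: ["http://localhost:9200"]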
Centrally manage your deployment from a single UI
Take charge of your Logstash deployment with the pipeline management UI, which makes orchestrating and managing your pipelines a breeze. The management controls also integrate seamlessly with the built-in security features to prevent any accidental rewiring.
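Assuming centralized pipeline management is licensed and enabled in Kibana, a node opts in via logstash.yml along these lines (the pipeline id and hosts are placeholders):

# logstash.yml: fetch pipeline definitions from central management instead of local files
xpack.management.enabled: true
xpack.management.elasticsearch.hosts: ["http://localhost:9200"]
xpack.management.pipeline.id: ["main"]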
https://www.elastic.co/guide/en/logstash/current/index.html
First, download Logstash and upload it to the server.
Logstash is written in JRuby, so a JDK must be installed first.
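Before extracting the archive, it is worth confirming a JDK is on the PATH, for example:

# Verify the installed Java version
java -version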
tar -zxvf logstash-2.3.1.tar.gz -C /bigdata/
# Read events from stdin and print them to stdout with the default codec
bin/logstash -e 'input { stdin {} } output { stdout{} }'
# The same pipeline, pretty-printing each event with the rubydebug codec
bin/logstash -e 'input { stdin {} } output { stdout{codec => rubydebug} }'
# Send events to a single Elasticsearch node and echo them to stdout
bin/logstash -e 'input { stdin {} } output { elasticsearch {hosts => ["172.16.0.14:9200"]} stdout{} }'
# Send events to multiple Elasticsearch nodes
bin/logstash -e 'input { stdin {} } output { elasticsearch {hosts => ["172.16.0.15:9200", "172.16.0.16:9200"]} stdout{} }'
# Send events to a Kafka topic and pretty-print them locally
bin/logstash -e 'input { stdin {} } output { kafka { topic_id => "test" bootstrap_servers => "172.16.0.11:9092,172.16.0.12:9092,172.16.0.13:9092"} stdout{codec => rubydebug} }'
vi logstash.conf
input {
  file {
    type => "gamelog"
    path => "/log/*/*.log"
    # Look for new files matching the glob every 10 seconds
    discover_interval => 10
    # Read pre-existing files from the beginning on the first run
    start_position => "beginning"
  }
}
output {
  elasticsearch {
    # One index per day
    index => "gamelog-%{+YYYY.MM.dd}"
    hosts => ["172.16.0.14:9200", "172.16.0.15:9200", "172.16.0.16:9200"]
  }
}
# Start Logstash
bin/logstash -f logstash.conf
input {
  file {
    type => "flow"
    path => "/var/nginx_logs/*.log"
    discover_interval => 5
    start_position => "beginning"
  }
}
output {
  if [type] == "flow" {
    elasticsearch {
      index => "flow-%{+YYYY.MM.dd}"
      hosts => ["172.16.0.14:9200", "172.16.0.15:9200", "172.16.0.16:9200"]
    }
  }
}
input {
  file {
    path => "/export/data/logs/*.log"
    discover_interval => 5
    start_position => "beginning"
  }
}
output {
  kafka {
    topic_id => "accesslog"
    codec => plain {
      format => "%{message}"
      charset => "UTF-8"
    }
    bootstrap_servers => "192.168.175.128:9092,192.168.175.133:9092,192.168.175.130:9092"
  }
}
input {
  kafka {
    type => "level-one"
    auto_offset_reset => "smallest"
    codec => plain {
      charset => "GB2312"
    }
    group_id => "es"
    topic_id => "test"
    zk_connect => "172.16.0.11:2181,172.16.0.12:2181,172.16.0.13:2181"
  }
}
filter {
  mutate {
    # Split the raw message on spaces into an array of tokens
    split => { "message" => " " }
    # Promote selected tokens to named fields; array elements are
    # referenced with the %{[field][N]} syntax
    add_field => {
      "event_type" => "%{[message][3]}"
      "current_map" => "%{[message][4]}"
      "current_X" => "%{[message][5]}"
      "current_y" => "%{[message][6]}"
      "user" => "%{[message][7]}"
      "item" => "%{[message][8]}"
      "item_id" => "%{[message][9]}"
      "current_time" => "%{[message][12]}"
    }
    remove_field => [ "message" ]
  }
}
output {
  elasticsearch {
    index => "level-one-%{+YYYY.MM.dd}"
    codec => plain {
      charset => "GB2312"
    }
    hosts => ["172.16.0.14:9200", "172.16.0.15:9200", "172.16.0.16:9200"]
  }
}
input {
  file {
    type => "syslog"
    path => "/var/log/messages"
    discover_interval => 10
    start_position => "beginning"
  }
  file {
    type => "gamelog"
    path => "/log/*/*.log"
    discover_interval => 10
    start_position => "beginning"
  }
}
output {
  if [type] == "syslog" {
    elasticsearch {
      index => "syslog-%{+YYYY.MM.dd}"
      hosts => ["172.16.0.14:9200", "172.16.0.15:9200", "172.16.0.16:9200"]
    }
  }
  if [type] == "gamelog" {
    elasticsearch {
      index => "gamelog-%{+YYYY.MM.dd}"
      hosts => ["172.16.0.14:9200", "172.16.0.15:9200", "172.16.0.16:9200"]
    }
  }
}