ELK

1. Introduction to ELK

  • What is ELK?
    ELK is short for Elasticsearch, Logstash, and Kibana. The trio used to be called the ELK Stack and is now called the Elastic Stack, with Beats added to take collection work off Logstash.

  • What is ELK mainly used for?
    Centralized log analysis for large distributed systems.

  • Why centralize log analysis?
    When something breaks in production, we locate the problem by reading logs. In a large distributed system, how would you read logs scattered across many machines when something goes wrong?

  • A complete centralized logging system needs these core capabilities:

  • Collection: gather log data from many kinds of sources
  • Transport/aggregation: route and aggregate log streams and deliver them into central storage
  • Transformation: transform and process the collected log data
  • Storage: persist the log data
  • Analysis: support analysis through a UI
  • Alerting: provide error reporting and monitoring

ELK offers a complete, fully open-source solution whose components work together seamlessly and serve a wide range of scenarios efficiently. It is currently one of the mainstream logging stacks.

  • ELK architecture (1): the original architecture

  • ELK architecture (2): the architecture that uses Beats for collection


2. Filebeat

  • What is Beats?
    A family of lightweight data shippers, responsible for collecting data from target sources.
    Official introduction: https://www.elastic.co/cn/products/beats

  • How the Filebeat log file shipper works

  • Installing Filebeat, using 7.17.7 as the example

Download page: https://www.elastic.co/cn/downloads/past-releases/filebeat-7-17-7

Then download the build for your platform. I used the 7.17.7 linux-x86_64 build, available at:
https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.17.7-linux-x86_64.tar.gz

Upload it to the Linux host and extract it:

tar -zxvf filebeat-7.17.7-linux-x86_64.tar.gz

Finally, rename the directory. I only do this because the original name is long; skip it if you don't mind:

mv filebeat-7.17.7-linux-x86_64 filebeat-7.17.7
  • A quick Filebeat test that writes results to the console
  1. Edit filebeat.yml. Remember to back the file up first; backing up before editing a config should be second nature, because restoring without a backup is painful.
# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

# filestream is an input for collecting log messages from files.
- type: filestream

  # Unique ID among all inputs, an ID is required. 
  # This is the default value; change it to anything you like (I left it alone)
  id: my-filestream-id

  # Change to true to enable this input configuration.
  # This defaults to false; be sure to set it to true, otherwise this input is ignored
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  # Important: this configures where the logs to collect live; * works as a wildcard
  paths:
    - /var/log/filebeat/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1
  #
####################################################  
# The block of settings in between can be left at their defaults
####################################################
#
# ================================== Outputs ===================================

# Configure what output to use when sending the data collected by the beat.

# ---------------------------- Elasticsearch Output ----------------------------
#  This is the default output; comment it out for now
#output.elasticsearch:
  # Array of hosts to connect to.
  #  hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

# ------------------------------ Logstash Output -------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"


# ------------------------------ Console Output -------------------------------
# This is the part we add
output.console:
  pretty: true

Then save and exit.

  2. Start Filebeat:
./filebeat -e
  3. Append a line to a log file under the configured path /var/log/filebeat/*.log:
cd /var/log/
mkdir filebeat
cd filebeat
echo "this is a test log+++++++++++++++" >> 01.log
  4. The result:
[elastic@lazyfennec filebeat-7.17.7]$ ./filebeat -e
2022-11-16T22:51:39.959+0800    INFO    instance/beat.go:697    Home path: [/home/elastic/es7/filebeat-7.17.7] Config path: [/home/elastic/es7/filebeat-7.17.7] Data path: [/home/elastic/es7/filebeat-7.17.7/data] Logs path: [/home/elastic/es7/filebeat-7.17.7/logs] Hostfs Path: [/]
2022-11-16T22:51:39.960+0800    INFO    instance/beat.go:705    Beat ID: 7be4fff8-9b20-48a8-9793-7aacbb903281
2022-11-16T22:51:42.976+0800    WARN    [add_cloud_metadata]    add_cloud_metadata/provider_aws_ec2.go:79   read token request for getting IMDSv2 token returns empty: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers). No token in the metadata request will be used.
2022-11-16T22:51:42.978+0800    INFO    [seccomp]   seccomp/seccomp.go:124  Syscall filter successfully installed
2022-11-16T22:51:42.982+0800    INFO    [beat]  instance/beat.go:1051   Beat info   {"system_info": {"beat": {"path": {"config": "/home/elastic/es7/filebeat-7.17.7", "data": "/home/elastic/es7/filebeat-7.17.7/data", "home": "/home/elastic/es7/filebeat-7.17.7", "logs": "/home/elastic/es7/filebeat-7.17.7/logs"}, "type": "filebeat", "uuid": "7be4fff8-9b20-48a8-9793-7aacbb903281"}}}
2022-11-16T22:51:42.984+0800    INFO    [beat]  instance/beat.go:1060   Build info  {"system_info": {"build": {"commit": "2b200bdbf5d85553b8f02c8709142b01dfd1082d", "libbeat": "7.17.7", "time": "2022-10-17T16:55:51.000Z", "version": "7.17.7"}}}
2022-11-16T22:51:42.985+0800    INFO    [beat]  instance/beat.go:1063   Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":1,"version":"go1.18.5"}}}
2022-11-16T22:51:42.987+0800    INFO    [beat]  instance/beat.go:1067   Host info   {"system_info": {"host": {"architecture":"x86_64","boot_time":"2022-11-16T13:07:01+08:00","containerized":false,"name":"lazyfennec","ip":["127.0.0.1/8","::1/128","192.168.1.9/24","2408:825c:6e2:b8b9:97ac:336e:7bb7:af46/64","fe80::3932:ba1d:da41:e31d/64"],"kernel_version":"5.14.10-300.fc35.x86_64","mac":["08:00:27:73:59:06"],"os":{"type":"linux","family":"redhat","platform":"fedora","name":"Fedora Linux","version":"35 (Workstation Edition)","major":35,"minor":0,"patch":0},"timezone":"CST","timezone_offset_sec":28800,"id":"2dd7538b08ab45fa97d32ced12db623b"}}}
2022-11-16T22:51:42.989+0800    INFO    [beat]  instance/beat.go:1096   Process info    {"system_info": {"process": {"capabilities": {"inheritable":null,"permitted":null,"effective":null,"bounding":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read","38","39","40"],"ambient":null}, "cwd": "/home/elastic/es7/filebeat-7.17.7", "exe": "/home/elastic/es7/filebeat-7.17.7/filebeat", "name": "filebeat", "pid": 9332, "ppid": 7624, "seccomp": {"mode":"filter","no_new_privs":true}, "start_time": "2022-11-16T22:51:38.150+0800"}}}
2022-11-16T22:51:42.991+0800    INFO    instance/beat.go:291    Setup Beat: filebeat; Version: 7.17.7
2022-11-16T22:51:42.992+0800    INFO    [publisher] pipeline/module.go:113  Beat name: lazyfennec
2022-11-16T22:51:43.111+0800    WARN    beater/filebeat.go:202  Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2022-11-16T22:51:43.132+0800    INFO    [monitoring]    log/log.go:142  Starting metrics logging every 30s
2022-11-16T22:51:43.133+0800    INFO    instance/beat.go:456    filebeat start running.
2022-11-16T22:51:43.134+0800    INFO    memlog/store.go:119 Loading data file of '/home/elastic/es7/filebeat-7.17.7/data/registry/filebeat' succeeded. Active transaction id=0
2022-11-16T22:51:43.135+0800    INFO    memlog/store.go:124 Finished loading transaction log file for '/home/elastic/es7/filebeat-7.17.7/data/registry/filebeat'. Active transaction id=11
2022-11-16T22:51:43.137+0800    WARN    beater/filebeat.go:411  Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2022-11-16T22:51:43.145+0800    INFO    [registrar] registrar/registrar.go:109  States Loaded from registrar: 1
2022-11-16T22:51:43.150+0800    INFO    [crawler]   beater/crawler.go:71    Loading Inputs: 1
2022-11-16T22:51:43.151+0800    INFO    [crawler]   beater/crawler.go:117   starting input, keys present on the config: [filebeat.inputs.0.enabled filebeat.inputs.0.id filebeat.inputs.0.paths.0 filebeat.inputs.0.type]
2022-11-16T22:51:43.182+0800    INFO    [crawler]   beater/crawler.go:148   Starting input (ID: 10875199066285727974)
2022-11-16T22:51:43.182+0800    INFO    [crawler]   beater/crawler.go:106   Loading and starting Inputs completed. Enabled inputs: 1
2022-11-16T22:51:43.182+0800    INFO    [input.filestream]  compat/compat.go:113    Input 'filestream' starting {"id": "my-filestream-id"}
2022-11-16T22:51:43.182+0800    INFO    cfgfile/reload.go:164   Config reloader started
2022-11-16T22:51:45.978+0800    INFO    [add_cloud_metadata]    add_cloud_metadata/add_cloud_metadata.go:101    add_cloud_metadata: hosting provider type not detected.
2022-11-16T22:52:13.139+0800    INFO    [monitoring]    log/log.go:184  Non-zero metrics in the last 30s    {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"id":"session-3.scope"},"memory":{"id":"session-3.scope","mem":{"usage":{"bytes":991010816}}}},"cpu":{"system":{"ticks":80,"time":{"ms":83}},"total":{"ticks":230,"time":{"ms":241},"value":230},"user":{"ticks":150,"time":{"ms":158}}},"handles":{"limit":{"hard":65536,"soft":65536},"open":12},"info":{"ephemeral_id":"b424e54f-7dc5-4082-8732-ebf650bc514b","uptime":{"ms":33645},"version":"7.17.7"},"memstats":{"gc_next":20313288,"memory_alloc":11016552,"memory_sys":32850952,"memory_total":55476336,"rss":100814848},"runtime":{"goroutines":33}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"reloads":1,"scans":2},"output":{"events":{"active":0},"type":"console"},"pipeline":{"clients":2,"events":{"active":0},"queue":{"max_events":4096}}},"registrar":{"states":{"current":0}},"system":{"cpu":{"cores":1},"load":{"1":0.07,"15":0.05,"5":0.04,"norm":{"1":0.07,"15":0.05,"5":0.04}}}}}}
{
  "@timestamp": "2022-11-16T14:52:17.191Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.7"
  },
  "host": {
    "os": {
      "name": "Fedora Linux",
      "kernel": "5.14.10-300.fc35.x86_64",
      "type": "linux",
      "platform": "fedora",
      "version": "35 (Workstation Edition)",
      "family": "redhat"
    },
    "id": "2dd7538b08ab45fa97d32ced12db623b",
    "containerized": false,
    "ip": [
      "192.168.1.9",
      "2408:825c:6e2:b8b9:97ac:336e:7bb7:af46",
      "fe80::3932:ba1d:da41:e31d"
    ],
    "name": "lazyfennec",
    "mac": [
      "08:00:27:73:59:06"
    ],
    "hostname": "lazyfennec",
    "architecture": "x86_64"
  },
  "agent": {
    "type": "filebeat",
    "version": "7.17.7",
    "hostname": "lazyfennec",
    "ephemeral_id": "b424e54f-7dc5-4082-8732-ebf650bc514b",
    "id": "7be4fff8-9b20-48a8-9793-7aacbb903281",
    "name": "lazyfennec"
  },
  "log": {
    "offset": 72,
    "file": {
      "path": "/var/log/filebeat/01.log"
    }
  },
  "message": "this is a test log ++++++++++++++++",
  "input": {
    "type": "filestream"
  },
  "ecs": {
    "version": "1.12.0"
  }
}
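Each event Filebeat prints is a JSON document like the one above. As a minimal Python sketch (the event below is trimmed from the real output), this is how a downstream consumer would pull out the few fields it usually cares about:

```python
import json

# A trimmed version of the Filebeat console event shown above.
raw_event = """
{
  "@timestamp": "2022-11-16T14:52:17.191Z",
  "log": {"offset": 72, "file": {"path": "/var/log/filebeat/01.log"}},
  "message": "this is a test log ++++++++++++++++",
  "input": {"type": "filestream"}
}
"""

event = json.loads(raw_event)

# Most consumers only need the raw line and where it came from:
print(event["message"])              # the original log line
print(event["log"]["file"]["path"])  # the file it was harvested from
```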

3. Logstash

Should the log message be stored in ES as a single text field, or parsed into multiple meaningful fields that make statistics and analysis easier?

  • The role of Logstash
    https://www.elastic.co/cn/products/logstash

Logstash is an open-source server-side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to the storage of your choice (in ELK, that means Elasticsearch).

  • How a Logstash pipeline works
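Conceptually, the filter stage of the pipeline is what answers the question above: it splits one opaque text field into several named fields. A rough Python sketch of what a grok-style filter does (the log line format here is invented for illustration, not taken from any real application):

```python
import re

# A made-up application log line in the form "LEVEL timestamp message".
line = "ERROR 2022-11-16T23:00:00 connection refused to db01:5432"

# Roughly what a Logstash grok/dissect filter does: named captures
# turn one string into fields that can be aggregated on in Elasticsearch.
pattern = re.compile(r"(?P<level>\w+)\s+(?P<ts>\S+)\s+(?P<msg>.*)")
fields = pattern.match(line).groupdict()

print(fields["level"])  # ERROR
print(fields["msg"])    # connection refused to db01:5432
```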


  • Installing Logstash: extract and run

Download page: https://www.elastic.co/cn/downloads/past-releases/logstash-7-17-7

Then download the build for your platform. I used the 7.17.7 linux-x86_64 build, available at:
https://artifacts.elastic.co/downloads/logstash/logstash-7.17.7-linux-x86_64.tar.gz

Upload it to the Linux host and extract it:

tar -zxvf logstash-7.17.7-linux-x86_64.tar.gz

Then run a quick test:

cd logstash-7.17.7/bin/  
./logstash -e 'input { stdin { } } output { stdout {} }'

Output:

[elastic@lazyfennec bin]$ ./logstash -e 'input { stdin { } } output { stdout {} }'
Using JAVA_HOME defined java: /etc/softwares/jdk1.8
WARNING: Using JAVA_HOME while Logstash distribution comes with a bundled JDK.
DEPRECATION: The use of JAVA_HOME is now deprecated and will be removed starting from 8.0. Please configure LS_JAVA_HOME instead.
Sending Logstash logs to /home/elastic/es7/logstash-7.17.7/logs which is now configured via log4j2.properties
[2022-11-16T23:16:53,131][INFO ][logstash.runner          ] Log4j configuration path used is: /home/elastic/es7/logstash-7.17.7/config/log4j2.properties
[2022-11-16T23:16:53,212][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.17.7", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 Java HotSpot(TM) 64-Bit Server VM 25.333-b02 on 1.8.0_333-b02 +indy +jit [linux-x86_64]"}
[2022-11-16T23:16:53,221][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[2022-11-16T23:16:55,008][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2022-11-16T23:16:55,150][INFO ][logstash.agent           ] No persistent UUID file found. Generating new UUID {:uuid=>"92ab1157-0581-4496-b0d0-8b03d5ca1e45", :path=>"/home/elastic/es7/logstash-7.17.7/data/uuid"}
[2022-11-16T23:17:02,061][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2022-11-16T23:17:05,985][INFO ][org.reflections.Reflections] Reflections took 339 ms to scan 1 urls, producing 119 keys and 419 values 
[2022-11-16T23:17:14,441][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, "pipeline.sources"=>["config string"], :thread=>"#"}
[2022-11-16T23:17:17,382][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>2.91}
[2022-11-16T23:17:17,689][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[2022-11-16T23:17:18,111][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
Hello Logstash # typed by hand; pressing Enter produces the output below
{
      "@version" => "1",
       "message" => "Hello Logstash",
    "@timestamp" => 2022-11-16T15:18:00.640Z,
          "host" => "lazyfennec"
}

4. Wiring up Filebeat, Logstash, and ES (X-Pack is assumed to be disabled throughout; enabling it adds some complexity I haven't verified yet, so let's keep it simple for now)

  1. Configure the Logstash input to receive from Filebeat
    Reference: https://www.elastic.co/guide/en/logstash/7.17/advanced-pipeline.html

    The page above contains a sample configuration. Copy it into a new file, logstash-7.17.7/config/beats2es.conf. Note that for now it only prints events to the console; later we will point the output at ES.
input {
    beats {
        port => "5044"
    }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
    stdout { codec => rubydebug }
}
  2. Start Logstash
    The --config.reload.automatic flag makes Logstash reload the configuration automatically whenever it changes:
bin/logstash -f config/beats2es.conf --config.reload.automatic
  3. Reconfigure Filebeat's output to send to Logstash
    Edit filebeat.yml: comment out the console output, uncomment the Logstash output, and set hosts to the IP and port of the Logstash server. The page below explains the options in detail; note that hosts accepts multiple entries (it is plural, after all).
    Filebeat input reference: https://www.elastic.co/guide/en/beats/filebeat/7.17/filebeat-input-log.html
# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.1.9:5044"]

# ------------------------------ Console Output -------------------------------
#output.console:
#  pretty: true
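Since hosts is a list, Filebeat can also spread events across several Logstash instances. A hedged sketch (the second IP is a placeholder, not part of this setup; loadbalance is off by default, in which case Filebeat picks one host and sticks to it):

```yaml
output.logstash:
  hosts: ["192.168.1.9:5044", "192.168.1.10:5044"]
  # Distribute batches across all listed hosts instead of using just one.
  loadbalance: true
```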
  4. Start Filebeat
    Be sure to start it after Logstash is up: Filebeat sends collected events to the configured port, and if nothing is listening there yet, problems can occur.
./filebeat -e
  5. Test it
    Under the directory configured in filebeat.yml, run the commands below, which simply append a line to a file:
# Change into the log directory
cd /var/log/filebeat/
# Append the line below to 01.log
echo "This is a log test++++++++++++++++++++++" >> 01.log

Then switch back to the Logstash console, where the new event is printed.

  6. Wiring Logstash to Elasticsearch
    Edit logstash-7.17.7/config/beats2es.conf: comment out stdout { codec => rubydebug } and add the elasticsearch output. See https://www.elastic.co/guide/en/logstash/7.17/advanced-pipeline.html for details.
input {
    beats {
        port => "5044"
    }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
#    stdout { codec => rubydebug }
    elasticsearch {
        hosts => [ "192.168.1.9:9200" ]
    }
}

Then wait for the configuration to reload and start Kibana. (We haven't covered Kibana yet? It's simple: download it, extract it, and tweak a couple of settings.) Here is the relevant part of kibana.yml:

# Replace the IPs below with the address of your own server
server.host: "192.168.1.9"

# The URLs of the Elasticsearch instances to use for all your queries.
elasticsearch.hosts: ["http://192.168.1.9:9200"]

# Kibana uses an index in Elasticsearch to store saved searches, visualizations and
# dashboards. Kibana creates a new index if the index doesn't already exist.
kibana.index: ".kibana"
  • Test it
cd /var/log/filebeat/
echo "this is a log for elk test++++++++++++++++" >> 01.log

Open Kibana and look at the indices.



Then open Dev Tools and run:

GET /logstash-2022.11.16-000001/_search

The results appear.
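The bare _search above returns everything in the index. To look for specific log lines, a match query on the message field can be used in Dev Tools (the query value here is just an example):

```
GET /logstash-2022.11.16-000001/_search
{
  "query": {
    "match": { "message": "elk" }
  }
}
```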


5. Kibana

Kibana user guide: https://www.elastic.co/guide/cn/kibana/current/index.html


6. Architecture notes

  • The architecture type used above:



    Running many Logstash instances is resource-hungry, which is not ideal.

  • A better approach is to send events to a Kafka cluster first and have Logstash consume from it; that way Kafka buffers the traffic and smooths out spikes before they reach Logstash.



    Kafka itself is out of scope here; I'll cover it in a later post.
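As a rough sketch of that variant (broker addresses and the topic name are placeholders, not from this setup): Filebeat would publish to Kafka, and Logstash would consume from it:

```yaml
# filebeat.yml -- send events to a Kafka cluster instead of Logstash
output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092"]
  topic: "filebeat-logs"
```

```
# beats2es.conf -- replace the beats input with a kafka input
input {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092"
    topics => ["filebeat-logs"]
  }
}
```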


If you found this helpful, likes and comments are welcome; follow me and check my profile for more.
