Scenario and log volume
The company operates in an IoT setting: devices report GPS data every 10 seconds, and the backend developers log the handling of every uploaded record. On top of that, they wrote a number of stateless background services with Quartz that run every 10 s, and several environments (development, test, production) write logs at the same time. Log volume averages about 3,000 messages/s with peaks around 5,000/s. With rotation in place, three days of logs come to roughly 300 GB, which the original minimal installation could no longer handle.
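As a rough sanity check on those figures (averages only, not a benchmark):

3,000 msg/s × 86,400 s/day × 3 days ≈ 778 million messages
300 GB ÷ 778 million messages ≈ 0.4 KB per stored message (including index overhead)

which is plausible for short GPS/status log lines.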
As an aside, here is the docker-compose.yml for that minimal installation, for reference:

version: '2'
services:
  # MongoDB: https://hub.docker.com/_/mongo/
  mongodb:
    image: mongo:3
    volumes:
      - mongo_data:/data/db
  # Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docker.html
  elasticsearch:
    # The image is a bit slow to download, but make sure to use exactly this one
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.3
    volumes:
      - es_data:/usr/share/elasticsearch/data
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      # Disable X-Pack security: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/security-settings.html#general-security-settings
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
  # Graylog: https://hub.docker.com/r/graylog/graylog/
  graylog:
    image: graylog/graylog:2.4.0-1
    volumes:
      - graylog_journal:/usr/share/graylog/data/journal
    environment:
      # CHANGE ME!
      - GRAYLOG_PASSWORD_SECRET=somepasswordpepper
      # Password: admin (generate with: echo -n yourpassword | shasum -a 256)
      - GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
      - GRAYLOG_WEB_ENDPOINT_URI=http://127.0.0.1:9000/api
    links:
      - mongodb:mongo
      - elasticsearch
    depends_on:
      - mongodb
      - elasticsearch
    ports:
      # Graylog web interface and REST API
      - 9000:9000
      # Syslog TCP
      - 514:514
      # Syslog UDP
      - 514:514/udp
      # GELF TCP
      - 12201:12201
      # GELF UDP
      - 12201:12201/udp

# Volumes for persisting data, see https://docs.docker.com/engine/admin/volumes/volumes/
volumes:
  mongo_data:
    driver: local
  es_data:
    driver: local
  graylog_journal:
    driver: local
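With Docker and Docker Compose installed, the stack comes up from the directory containing this file with:

docker-compose up -d

after which the Graylog web interface is reachable at http://127.0.0.1:9000 (admin / admin, per the password hash above). As noted above, this minimal setup is fine for getting familiar with Graylog, but it could not keep up with the 3,000-5,000 msg/s load.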
Usage notes
Before using it, be sure to read the Architectural considerations page of the documentation to understand the design:
    • Graylog nodes should have a focus on CPU power. These also serve the user interface to the browser.
    • Elasticsearch nodes should have as much RAM as possible and the fastest disks you can get. Everything depends on I/O speed here.
    • MongoDB is storing meta information and configuration data and doesn’t need many resources.
In short: Graylog, which ingests the logs and writes them into Elasticsearch, is mostly CPU-bound, while Elasticsearch is RAM- and I/O-bound. Knowing this largely sets the direction for the tuning described later.
Three-node setup steps (reference)
Preparation (CentOS 7.3)
yum -y update
yum install java-1.8.0-openjdk-headless.x86_64  # install the JDK
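Graylog and Elasticsearch both run on the JVM, so it is worth confirming on each node that the headless JDK actually landed (exact output wording varies by build):

java -version   # should report an openjdk 1.8.0 runtime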
Setting up a MongoDB 3.6 replica set
First, install MongoDB on every node.

1. Add the MongoDB yum repository

   # /etc/yum.repos.d/mongodb-org-3.6.repo
   [mongodb-org-3.6]
   name=MongoDB Repository
   baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.6/x86_64/
   gpgcheck=1
   enabled=1
   gpgkey=https://www.mongodb.org/static/pgp/server-3.6.asc

2. Refresh the repositories and install

   yum update -y && yum install -y mongodb-org

3. Key settings in /etc/mongod.conf

   Keep the MongoDB data off the system partition and give the directory to the mongod user:
   chown -R mongod:mongod /data/mongo

   storage:
     dbPath: /data/mongo

   Configure the replica set name (replSetName):

   replication:
     replSetName: repl

   Configure bindIp with the LAN IP, so that it is reachable from the other nodes:

   net:
     port: 27017
     bindIp: 192.168.168.242

4. Start the MongoDB service

   systemctl start mongod

5. Initialize the replica set (the _id passed to rs.initiate must match the replSetName configured above; an adapted command for this cluster is sketched after this list)

   # start the mongo shell and run:
   rs.initiate( {
      _id : "rs0",
      members: [
         { _id: 0, host: "mongodb0.example.net:27017" },
         { _id: 1, host: "mongodb1.example.net:27017" },
         { _id: 2, host: "mongodb2.example.net:27017" }
      ]
   })

6. Inspect the replica set configuration

   rs.conf()

7. Check which node is the replica set PRIMARY

   rs.status()
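Since this cluster's replSetName is repl and the three nodes sit on the LAN addresses used throughout this post, the actual initiation command looks more like the following sketch (which IP gets which member _id is an arbitrary choice for illustration):

rs.initiate( {
   _id : "repl",
   members: [
      { _id: 0, host: "192.168.168.240:27017" },
      { _id: 1, host: "192.168.168.241:27017" },
      { _id: 2, host: "192.168.168.242:27017" }
   ]
})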
Setting up an Elasticsearch 5.6 cluster

1. Add the Elasticsearch yum repository (packages are signed with the Elasticsearch public GPG key)

   # /etc/yum.repos.d/elasticsearch.repo
   [elasticsearch-5.x]
   name=Elasticsearch repository for 5.x packages
   baseurl=https://artifacts.elastic.co/packages/5.x/yum
   gpgcheck=1
   gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
   enabled=1
   autorefresh=1
   type=rpm-md

2. Install Elasticsearch

   yum install elasticsearch
3. Cluster configuration in /etc/elasticsearch/elasticsearch.yml (a consolidated file is sketched after this list)

   The IPs of all cluster members, used for unicast discovery:
   discovery.zen.ping.unicast.hosts: ["192.168.168.240", "192.168.168.241", "192.168.168.242"]

   The IP this node listens on:
   network.host: 192.168.168.242

   Cluster name:
   cluster.name: ES_PROD

   Node name:
   node.name: ${HOSTNAME}

   Data directory (create it first and hand it to the elasticsearch user):
   mkdir -p /data/elasticsearch/data
   chown -R elasticsearch:elasticsearch /data/elasticsearch
   path.data: /data/elasticsearch/data

   Log directory (same preparation):
   mkdir -p /data/elasticsearch/logs
   chown -R elasticsearch:elasticsearch /data/elasticsearch
   path.logs: /data/elasticsearch/logs

4. Edit the Elasticsearch systemd service file

   # delete the following lines from the ExecStart command:
   -Edefault.path.logs=${LOG_DIR} \
   -Edefault.path.data=${DATA_DIR} \

5. Start the service

   systemctl start elasticsearch

6. Health check

   curl -XGET 'http://192.168.168.240:9200/_cluster/state?pretty'
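Pulling the pieces from step 3 together, each node's /etc/elasticsearch/elasticsearch.yml ends up looking roughly like this (only network.host differs per node; the commented line is a common extra safeguard for a three-node cluster, not something from the original setup):

cluster.name: ES_PROD
node.name: ${HOSTNAME}
network.host: 192.168.168.242
discovery.zen.ping.unicast.hosts: ["192.168.168.240", "192.168.168.241", "192.168.168.242"]
path.data: /data/elasticsearch/data
path.logs: /data/elasticsearch/logs
# discovery.zen.minimum_master_nodes: 2   # often set on 3-node clusters to avoid split brain

Besides dumping the full cluster state, the health endpoint gives a quicker pass/fail answer; once all three nodes have joined it should report "status" : "green" and "number_of_nodes" : 3:

curl -XGET 'http://192.168.168.240:9200/_cluster/health?pretty'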
Setting up Graylog 2.4 multi-node
A multi-node Graylog setup only needs to keep one instance for web access. Log inputs can be pointed at different nodes; when one node comes under too much pressure, part of the log traffic has to be moved to idler nodes by hand (load balancing at this layer is still to be investigated).

1. Install the Graylog 2.4 repository

   rpm -Uvh https://packages.graylog2.org/repo/packages/graylog-2.4-repository_latest.rpm

2. Install graylog-server

   yum install graylog-server

3. Multi-node configuration in /etc/graylog/server/server.conf (the per-node differences are sketched after this list)

   In a multi-node installation, keep is_master = true on exactly one node and set it to false on the others:
   is_master = true

   Set the root user's timezone to China (Shanghai) time:
   root_timezone = PRC

   The secret used to encrypt/salt passwords; any random string, but it must be identical on every node:
   password_secret = somepasswordpepper

   The SHA-256 hash of the admin password (here: admin), generated with echo -n yourpassword | shasum -a 256 (note this belongs in root_password_sha2, not password_secret):
   root_password_sha2 = 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918

   In multi-node mode, make sure this IP is reachable from the other nodes:
   rest_listen_uri = http://192.168.168.242:9000/api/

   Defaults to the same value as above:
   rest_transport_uri = http://192.168.168.242:9000/api/

   The base URL for access from outside; behind the nginx reverse proxy this can be set to http://192.168.168.242:9000/:
   web_listen_uri = http://121.196.213.107:50089/

   The Elasticsearch cluster hosts (comma-separated):
   elasticsearch_hosts = http://192.168.168.240:9200,http://192.168.168.241:9200,http://192.168.168.242:9200

   Keep only 72 hours of logs (1-hour indices, at most 72 of them):
   elasticsearch_max_time_per_index = 1h
   elasticsearch_max_number_of_indices = 72

   The MongoDB connection string for the replica set:
   mongodb_uri = mongodb://192.168.168.242:27017,192.168.168.241:27017,192.168.168.240:27017/graylog?replicaSet=repl

4. Increase the Graylog JVM heap size

   # /etc/sysconfig/graylog-server
   GRAYLOG_SERVER_JAVA_OPTS="-Xms4g -Xmx4g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow"

5. Start the service

   systemctl start graylog-server
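On the other two Graylog nodes only a handful of values change; a sketch for the node at 192.168.168.241 (which node carries which IP is an assumption for illustration):

is_master = false
rest_listen_uri = http://192.168.168.241:9000/api/
rest_transport_uri = http://192.168.168.241:9000/api/
# password_secret and root_password_sha2 must stay identical across all nodes

After each node starts, its REST API should answer on the rest_listen_uri configured above. A quick check (the exact fields are version-dependent, but a small JSON document including the server version is expected; if your version requires authentication here, an HTTP 401 still confirms the node is listening):

curl http://192.168.168.242:9000/api/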
Nginx reverse proxy configuration

1. Install Nginx

   yum install nginx

2. Nginx configuration

   server
   {
       listen 50098 default_server;
       listen [::]:50098 default_server ipv6only=on;
       server_name 47.97.188.62:50098;

       location / {
           proxy_set_header Host $http_host;
           proxy_set_header X-Forwarded-Host $host;
           proxy_set_header X-Forwarded-Server $host;
           proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
           proxy_set_header X-Graylog-Server-URL http://$server_name/api;
           proxy_pass http://192.168.168.242:9000;
       }
   }
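Once nginx has been reloaded, the proxy can be checked from outside with the public address used in the server block; a response served by the Graylog web interface, rather than an nginx error page, means proxy_pass is doing its job:

curl -I http://47.97.188.62:50098/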
Points to consider for later optimization

1. More CPU for the Graylog nodes, to raise message processing throughput.
2. A few more UDP input ports, which also raises message processing throughput (a quick smoke test for a new input is sketched after this list).
3. More memory for Elasticsearch, for more buffering and less I/O pressure.
4. Faster disks.
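On the second point: after adding another GELF UDP input in the Graylog web interface, a hand-written GELF message fired at the new port with netcat makes an easy smoke test. The port 12202 and the target node below are illustrative assumptions, not values from the original setup:

echo -n '{"version":"1.1","host":"smoke-test","short_message":"hello graylog","level":6}' | nc -w1 -u 192.168.168.242 12202

The message should then turn up in a Graylog search for source:smoke-test.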