ELK Production Deployment in Practice (1)

For the detailed documentation, see my notes.

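### Issues to address before collecting logs:

1. Developers cannot log in to production servers to view detailed logs.

2. Every system produces its own logs, so log data is scattered and hard to find.

3. Log volume is large and queries are slow.

4. Log data arrives with significant delay.

5. Server clocks are not synchronized, which leads to incorrect dates.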


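### Problems to solve

1. All kinds of logs should be quick and convenient to view.

2. Logs are only examined while a failure is being handled; there is no complete log-based alerting mechanism.

3. With many nodes, logs are scattered and harder to collect, and there is no uniform convention for storage paths.

4. Runtime logs and error logs need fixed storage locations.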


### Deployment environment

[root@master_agent logfile]# cat /etc/redhat-release 

CentOS Linux release 7.0.1406 (Core) 



### ELKStack

- [ ] Workflow: collect -> store -> search + statistics + visualization -> alerting and data analysis


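### Elasticsearch is a search server built on Lucene that provides a distributed, multi-tenant full-text search engine.


### Elasticsearch features

- [ ] Cluster

- A cluster consists of multiple nodes, one of which is the master node. The master is chosen by election and is responsible for managing the cluster internally.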



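## Introduction to ELKStack and its architecture

- [x] The most common requirements for logs are collection, storage, querying, and visualization, and the open-source community has a project for each: Logstash (collection), Elasticsearch (storage and search), and Kibana (visualization). Combined, these three are known as ELKStack, that is, the Elasticsearch, Logstash, and Kibana technology stack. A typical architecture is shown in the figure below:

![ELKStack architecture](http://note.youdao.com/yws/api/personal/file/910BE9E6CA6F4D90B8A709959FBF34D7?method=download&shareKey=b54685b1119f1d6c39403b71f7d96eb7)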



## Deploying ELKStack

#### Elasticsearch requires a Java runtime, so install Java directly with yum.

1. Install Java and verify the installation

```

[root@master_agent ~]#  yum install java -y

[root@master_agent ~]# java -version

openjdk version "1.8.0_131"

OpenJDK Runtime Environment (build 1.8.0_131-b12)

OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)

[root@master_agent ~]# 


```

2. Download and import the GPG key


[root@master_agent ~]# rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch




3. Create the yum repository file

[root@master_agent ~]# vim /etc/yum.repos.d/elasticsearch.repo


[elasticsearch-2.x]

name=Elasticsearch repository for 2.x packages

baseurl=http://packages.elastic.co/elasticsearch/2.x/centos

gpgcheck=1

gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch

enabled=1
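
The repository definition above can also be written non-interactively, which may help when several nodes have to be provisioned the same way. The following is a minimal sketch: the function name `write_es_repo` and its optional target-path argument are assumptions made for illustration, while the file contents are exactly the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes the Elasticsearch 2.x repo definition to the given
# path (defaults to the path edited in this article).
write_es_repo() {
    local target="${1:-/etc/yum.repos.d/elasticsearch.repo}"
    cat > "$target" <<'EOF'
[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=http://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
EOF
}

# Example (as root): write_es_repo && yum makecache
```

Passing a path as the first argument writes the file somewhere else, which makes it easy to inspect the result before touching `/etc/yum.repos.d`.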

             


4. Configure limits (needed for the yum-installed package)

[root@master_agent ~]# vim /etc/security/limits.conf




elasticsearch soft memlock unlimited

elasticsearch hard memlock unlimited
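
Editing the file by hand works, but re-running a provisioning step should not duplicate the two lines. Below is a small sketch of an idempotent append; `add_memlock_limits` is a hypothetical name, and the optional file argument exists only so the function can be tried on a scratch file. Note that `limits.conf` is applied to PAM login sessions; when the service is started by systemd, the unit's `LimitMEMLOCK` setting may also need to be raised, which this sketch does not cover.

```shell
#!/usr/bin/env bash
# Hypothetical helper: appends the memlock limits for the elasticsearch user to a
# limits file, skipping any line that is already present (safe to re-run).
add_memlock_limits() {
    local file="${1:-/etc/security/limits.conf}"
    local kind line
    for kind in soft hard; do
        line="elasticsearch ${kind} memlock unlimited"
        # -x matches whole lines only, -F treats the pattern as a fixed string
        grep -qxF "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
    done
}

# Example (as root): add_memlock_limits
```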





5. Install Elasticsearch (together with Logstash and Kibana)


[root@master_agent ~]# yum install -y elasticsearch logstash kibana



## Configuring Elasticsearch

[root@master_agent ~]# vim /etc/elasticsearch/elasticsearch.yml 


[root@master_agent ~]# mkdir -p /data/es-data ## create the data directory

cluster.name: elkcluster  ## cluster name

node.name: elk-server01   ## node name

path.data: /data/es-data  ## data directory

path.logs: /var/log/elasticsearch/    ## log directory

bootstrap.memory_lock: true    ## lock memory so that swap is not used

network.host: 172.16.1.200    ## address to listen on

http.port: 9200   ## HTTP listening port



### Review the modified configuration



[root@master_agent ~]# grep '^[a-z]' /etc/elasticsearch/elasticsearch.yml 

cluster.name: elkcluster

node.name: elk-server01

path.data: /data/es-data

path.logs: /var/log/elasticsearch/

bootstrap.memory_lock: true

network.host: 172.16.1.200

http.port: 9200

action.destructive_requires_name: false

discovery.zen.ping.unicast.hosts: ["172.16.1.200", "172.16.1.201"]


[root@master_agent ~]# 
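
The `grep` above shows the active settings; the same file can also be checked mechanically before the service is started. The following is a rough sketch under stated assumptions: `yml_get` and `check_es_config` are hypothetical helper names, and the parsing only handles flat top-level `key: value` lines like the ones in this file. It is not a general YAML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the value of a top-level "key: value" line from a
# flat YAML file such as elasticsearch.yml, dropping any trailing "# comment".
yml_get() {
    local key="$1" file="$2"
    local pattern
    # Escape the dots in the key so they match literally in the regex.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    sed -n "s/^${pattern}:[[:space:]]*//p" "$file" \
        | sed -e 's/[[:space:]]\{1,\}#.*$//' -e 's/[[:space:]]*$//' \
        | head -n 1
}

# Hypothetical check: fails if any setting edited in this article is missing.
check_es_config() {
    local file="${1:-/etc/elasticsearch/elasticsearch.yml}" key
    for key in cluster.name node.name path.data path.logs network.host http.port; do
        if [ -z "$(yml_get "$key" "$file")" ]; then
            echo "missing setting: $key" >&2
            return 1
        fi
    done
}

# Example: check_es_config && yml_get path.data /etc/elasticsearch/elasticsearch.yml
```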



### Start Elasticsearch


```
[root@master_agent ~]#  systemctl start elasticsearch
```


### Check the service status (the first start failed)



[root@master_agent ~]#  systemctl status elasticsearch

● elasticsearch.service - Elasticsearch

   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: disabled)

   Active: failed (Result: exit-code) since Mon 2017-06-26 14:52:39 EDT; 2s ago

     Docs: http://www.elastic.co

  Process: 20760 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -Des.pidfile=${PID_DIR}/elasticsearch.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR} -Des.default.path.conf=${CONF_DIR} (code=exited, status=1/FAILURE)

  Process: 20759 ExecStartPre=/usr/share/elasticsearch/bin/elasticsearch-systemd-pre-exec (code=exited, status=0/SUCCESS)

 Main PID: 20760 (code=exited, status=1/FAILURE)


Jun 26 14:52:39 master_agent elasticsearch[20760]: at org.elasticsearch.bootstrap.Security.createPermissions(Security.java:212)

Jun 26 14:52:39 master_agent elasticsearch[20760]: at org.elasticsearch.bootstrap.Security.configure(Security.java:118)

Jun 26 14:52:39 master_agent elasticsearch[20760]: at org.elasticsearch.bootstrap.Bootstrap.setupSecurity(Bootstrap.java:212)

Jun 26 14:52:39 master_agent elasticsearch[20760]: at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:183)

Jun 26 14:52:39 master_agent elasticsearch[20760]: at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)

Jun 26 14:52:39 master_agent elasticsearch[20760]: at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)

Jun 26 14:52:39 master_agent elasticsearch[20760]: Refer to the log for complete error details.

Jun 26 14:52:39 master_agent systemd[1]: elasticsearch.service: main process exited, code=exited, status=1/FAILURE

Jun 26 14:52:39 master_agent systemd[1]: Unit elasticsearch.service entered failed state.

Jun 26 14:52:39 master_agent systemd[1]: elasticsearch.service failed.

[root@master_agent ~]# 



### Fix

- [x] Check the log

- [root@master_agent ~]# tail -100 /var/log/elasticsearch/elkcluster.log 

- [x] Change the ownership of the data directory

- [root@master_agent ~]# chown -R elasticsearch:elasticsearch /data/es-data/
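
Because the failed start was resolved by fixing ownership of the data directory, a small pre-start check can catch the same problem on other nodes. This is only a sketch: `dir_owned_by` is a hypothetical name, and it relies on GNU `stat -c`, which is what CentOS ships.

```shell
#!/usr/bin/env bash
# Hypothetical pre-start check: succeeds only if the directory exists and is
# owned by the expected user. Uses GNU stat (-c), as found on CentOS.
dir_owned_by() {
    local dir="$1" user="$2"
    [ -d "$dir" ] || return 1
    [ "$(stat -c '%U' "$dir")" = "$user" ]
}

# Example (as root), using the path and user from this article:
#   dir_owned_by /data/es-data elasticsearch \
#       || chown -R elasticsearch:elasticsearch /data/es-data
```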


### Start again and check the status


[root@master_agent ~]#  systemctl start elasticsearch

[root@master_agent ~]#  systemctl status elasticsearch

● elasticsearch.service - Elasticsearch

   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: disabled)

   Active: active (running) since Mon 2017-06-26 14:56:06 EDT; 2s ago

     Docs: http://www.elastic.co

  Process: 20802 ExecStartPre=/usr/share/elasticsearch/bin/elasticsearch-systemd-pre-exec (code=exited, status=0/SUCCESS)

 Main PID: 20803 (java)

   CGroup: /system.slice/elasticsearch.service

           └─20803 /bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFrac...



### Check that the ports are listening



[root@master_agent ~]#  systemctl restart elasticsearch

[root@master_agent ~]# ss -ntlp

State       Recv-Q Send-Q                                        Local Address:Port                                          Peer Address:Port 

LISTEN      0      50                                                        *:3306                                                     *:*      users:(("mysqld",19594,14))

LISTEN      0      128                                                       *:22                                                       *:*      users:(("sshd",1372,3))

LISTEN      0      50                                      ::ffff:172.16.1.201:9200                                                    :::*      users:(("java",21205,91))

LISTEN      0      50                                      ::ffff:172.16.1.201:9300                                                    :::*      users:(("java",21205,79))

LISTEN      0      128                                                      :::22                                                      :::*      users:(("sshd",1372,4))

[root@master_agent ~]# 
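
Reading the `ss` table by eye is fine on one host; for a scripted check, the output can be filtered for the two Elasticsearch ports (9200 for HTTP, 9300 for the transport layer). A minimal sketch follows, assuming the `ss -ntl` column layout shown above; `port_listening` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `ss -ntl` style output on stdin and succeeds if a
# LISTEN socket is bound to the given port (column 4 is "Local Address:Port").
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Example: ss -ntl | port_listening 9200 && ss -ntl | port_listening 9300
```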



### Check the result in a browser



http://172.16.1.201:9200/

The response looks like this:

{

  "name" : "elk-server01",

  "cluster_name" : "elkcluster",

  "cluster_uuid" : "bkJEwJGARXq2Ki2xWa0oTQ",

  "version" : {

    "number" : "2.4.6",

    "build_hash" : "5376dca9f70f3abef96a77f4bb22720ace8240fd",

    "build_timestamp" : "2017-07-18T12:17:44Z",

    "build_snapshot" : false,

    "lucene_version" : "5.5.4"

  },

  "tagline" : "You Know, for Search"

}
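
The same check can be done from the command line instead of a browser. The sketch below pulls `version.number` out of a response like the one above; `es_version_from_json` is a hypothetical helper, and it uses a plain `sed` match rather than a real JSON parser, so treat it as a quick sanity check only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extracts version.number from the root-endpoint JSON read
# on stdin. A plain sed match, not a JSON parser.
es_version_from_json() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example: curl -s http://172.16.1.201:9200/ | es_version_from_json
```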