Installing and Upgrading Consul in Production

Consul directory layout

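  • Consul installation directory: /usr/local/consul-VERSION (VERSION is the version number), symlinked to /usr/local/consul after installation
  • Consul binary directory: /usr/local/consul/bin
  • Consul configuration directory: /usr/local/consul/consul.d
  • Consul data directory: /usr/local/consul/data
  • Consul log directory: /usr/local/consul/logs
  • Consul snapshot directory: /usr/local/consul/snapshot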

The complete directory structure is as follows:

~]# tree -C -L 1 /usr/local/consul
/usr/local/consul
├── bin
├── consul.d
├── data
├── logs
└── snapshot

Installing Consul

Create the directories

~]# mkdir -p /usr/local/consul-1.5.1
~]# ln -s /usr/local/consul-1.5.1/ /usr/local/consul
~]# mkdir -p /usr/local/consul/{bin,consul.d,data,logs,snapshot}
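The three commands above can be wrapped in a small function so that the same layout is reproducible for any version. This is a minimal sketch: the function name make_consul_layout and its PREFIX argument are assumptions made for illustration and are not part of Consul.

```shell
# Hypothetical helper (not part of Consul): create the versioned install
# directory with its planned subdirectories, then point the "consul"
# symlink at it.
# Usage: make_consul_layout PREFIX VERSION
#   e.g. make_consul_layout /usr/local 1.5.1
make_consul_layout() (
    prefix="$1"
    version="$2"
    target="${prefix}/consul-${version}"

    for sub in bin consul.d data logs snapshot; do
        mkdir -p "${target}/${sub}" || exit 1
    done
    # -n replaces an existing symlink instead of creating a link inside
    # the directory it currently points to
    ln -sfn "${target}" "${prefix}/consul"
)
```

Creating the subdirectories under the versioned path first, and the symlink last, means /usr/local/consul never points at a half-built tree.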

Download and extract the package

Download the package from the official release site:

~]# wget https://releases.hashicorp.com/consul/1.5.1/consul_1.5.1_linux_amd64.zip

Extract it:

~]# unzip consul_1.5.1_linux_amd64.zip -d /usr/local/consul/bin/
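Before extracting a downloaded archive on a production host, it is reasonable to check it against the SHA256SUMS file published next to each release. The sketch below assumes that file has been downloaded as well (for 1.5.1, consul_1.5.1_SHA256SUMS from the same release directory); the function name verify_zip is hypothetical.

```shell
# Hypothetical helper: compare the SHA-256 of ZIP with the entry for the
# same file name in a SHA256SUMS file. Succeeds only if they match.
# Usage: verify_zip ZIP SUMS_FILE
verify_zip() (
    zip="$1"
    sums="$2"
    name=$(basename "$zip")

    # Expected hash: first column of the line whose second column is the file name
    expected=$(awk -v f="$name" '$2 == f { print $1 }' "$sums")
    # Actual hash of the downloaded archive
    actual=$(sha256sum "$zip" | awk '{ print $1 }')

    [ -n "$expected" ] && [ "$expected" = "$actual" ]
)
```

A typical call would be verify_zip consul_1.5.1_linux_amd64.zip consul_1.5.1_SHA256SUMS && unzip ..., so that extraction is skipped when the hashes differ.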

Export the environment variable

~]# echo 'export PATH=$PATH:/usr/local/consul/bin' > /etc/profile.d/consul.sh
~]# source /etc/profile.d/consul.sh

Configuring Consul

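Consul can be started as a foreground process either by passing the startup options on the command line or by pointing it at a configuration directory. It can be run in the background with nohup, systemd, or similar tools; here we use systemd.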

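In production, starting Consul from JSON or HCL configuration files is strongly recommended: configuration files expose parameters and features that the command line does not, such as enabling Prometheus metrics.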

Starting via systemd with all options on the command line

Consul startup options

The startup options live in /etc/sysconfig/consul. Adjust the values as needed.

Consul server options
############ consul-1.5.1 options #############
~]# cat /etc/sysconfig/consul
# Consul start options
OPTIONS="\
-server \
-datacenter=dc1 \
-node=consul-server-1 \
-bind=10.114.0.59 \
-client=0.0.0.0 \
-config-dir=/usr/local/consul/consul.d/ \
-data-dir=/usr/local/consul/data/ \
-bootstrap-expect=3 \
-join=10.114.0.59 \
-rejoin \
-ui \
-pid-file=/run/consul-server.pid \
-log-file=/usr/local/consul/logs/ \
-log-level=info"
Consul client options
############ consul-1.5.1 options #############
~]# cat /etc/sysconfig/consul
# Consul start options
OPTIONS="\
-datacenter=dc1 \
-node=consul-client-1 \
-bind=10.114.0.61 \
-client=0.0.0.0 \
-config-dir=/usr/local/consul/consul.d/ \
-data-dir=/usr/local/consul/data/ \
-join=10.114.0.59 \
-ui \
-pid-file=/run/consul-client.pid \
-log-file=/usr/local/consul/logs/ \
-log-level=info"

Consul systemd unit (CentOS 7)

The unit file is /usr/lib/systemd/system/consul.service. Adjust the values as needed. It must be installed on both servers and clients.

~]# cat /usr/lib/systemd/system/consul.service
[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/sysconfig/consul

[Service]
User=root
Group=root
EnvironmentFile=/etc/sysconfig/consul
# GOMAXPROCS is not set here: systemd does not expand $(nproc), and the Go
# runtime already defaults GOMAXPROCS to the number of CPUs.
ExecStart=/usr/local/consul/bin/consul agent $OPTIONS
ExecReload=/usr/local/consul/bin/consul reload
KillMode=process
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Reload systemd after modifying the unit file:

~]# systemctl daemon-reload

Starting Consul

Start the servers and the clients:
~]# systemctl start consul
~]# systemctl enable consul

Starting via systemd with a configuration directory

Configuration files may be written in JSON or HCL.

Configuration file template

~]# cat config.example 
# The configuration file must be in JSON or HCL format.

# Consul server agents typically require a superset of configuration required by 
# Consul client agents. We will specify common configuration used by all Consul agents 
# in consul.hcl and server specific configuration in server.hcl.

#####     JSON configuration settings    ####
#############################################

#####     General configuration         #####
# Create a configuration file at /usr/local/consul/consul.d/consul.json
# Add this configuration to the consul.json configuration file:
{
    "datacenter": "dc1",
    "client_addr": "0.0.0.0",
    "bind_addr": "{{ GetInterfaceIP \"eth0\" }}",
    "data_dir": "/usr/local/consul/data",
    "retry_interval": "20s",
    "retry_join": ["10.114.0.59","10.114.0.60"],
    "enable_local_script_checks": true,
    "log_file": "/usr/local/consul/logs/",
    "log_level": "info",
    "pid_file": "/var/run/consul.pid",
    "performance": {
        "raft_multiplier": 1
    },
    "telemetry": {
        "prometheus_retention_time": "120s",
        "disable_hostname": true
    }
}

#####     Server configuration          #####
# Create a configuration file at /usr/local/consul/consul.d/server.json.
# Add this configuration to the server.json configuration file:
{
    "node_name": "consul-server-1",
    "bootstrap_expect": 2,
    "server": true,
    "ui": true
}

#####   HCL configuration settings     ######
#############################################
# HCL Syntax: https://github.com/hashicorp/hcl#syntax
#
#####     General configuration         #####
# Create a configuration file at /usr/local/consul/consul.d/consul.hcl
# Add this configuration to the consul.hcl configuration file:
datacenter = "dc1"
bind_addr = "{{ GetInterfaceIP \"eth0\" }}"
client_addr = "0.0.0.0"
data_dir = "/usr/local/consul/data"
retry_interval = "20s"
retry_join = ["10.114.0.59","10.114.0.60"]
enable_local_script_checks = true
log_file = "/usr/local/consul/logs/"
pid_file = "/var/run/consul.pid"
log_level = "info"
performance {
  raft_multiplier = 1
}
telemetry {
  prometheus_retention_time = "120s"
  disable_hostname = true
}
#####     Server configuration         #####

# Create a configuration file at /usr/local/consul/consul.d/server.hcl.
# Add this configuration to the server.hcl configuration file:
node_name = "consul-server-1"
bootstrap_expect = 2
server = true
ui = true

Consul configuration files

Consul server configuration
]# cat consul.json
{
    "datacenter": "dc1",
    "client_addr": "0.0.0.0",
    "bind_addr": "{{ GetInterfaceIP \"eth0\" }}",
    "data_dir": "/usr/local/consul/data",
    "retry_interval": "20s",
    "retry_join": ["10.111.67.1","10.111.67.2","10.111.67.3","10.111.67.4","10.111.67.5"],
    "enable_local_script_checks": true,
    "log_file": "/usr/local/consul/logs/",
    "log_level": "debug",
    "enable_debug": true,
    "pid_file": "/var/run/consul.pid",
    "performance": {
        "raft_multiplier": 1
    },
    "telemetry": {
        "prometheus_retention_time": "120s",
        "disable_hostname": true
    }
}
]# cat server.json
{
    "node_name": "consul-server-1",
    "bootstrap_expect": 3,
    "server": true,
    "ui": true
}
Consul client configuration
]# cat consul.json
{
    "datacenter": "dc1",
    "client_addr": "0.0.0.0",
    "bind_addr": "{{ GetInterfaceIP \"eth0\" }}",
    "data_dir": "/usr/local/consul/data",
    "retry_interval": "20s",
    "retry_join": ["10.111.67.1","10.111.67.2","10.111.67.3","10.111.67.4","10.111.67.5"],
    "enable_local_script_checks": true,
    "log_file": "/usr/local/consul/logs/",
    "log_level": "info",
    "pid_file": "/var/run/consul.pid",
    "performance": {
        "raft_multiplier": 1
    },
    "telemetry": {
        "prometheus_retention_time": "300s",
        "disable_hostname": true
    }
}
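In the files above, consul.json is shared by all agents, while the server.json shown for the server differs from one server to the next only in node_name. A small generator can keep those per-node files consistent. The sketch below is an assumption-laden illustration: render_server_json is a hypothetical function, and it emits only the four keys that appear in the server.json above.

```shell
# Hypothetical helper: print a server.json for one server node.
# Usage: render_server_json NODE_NAME BOOTSTRAP_EXPECT
render_server_json() {
    cat <<EOF
{
    "node_name": "$1",
    "bootstrap_expect": $2,
    "server": true,
    "ui": true
}
EOF
}
```

For example, render_server_json consul-server-2 3 > /usr/local/consul/consul.d/server.json would produce the file for a second server in a three-server cluster. The result can then be checked with consul validate /usr/local/consul/consul.d before the agent is started.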

Consul systemd unit (CentOS 7)

]# cat /usr/lib/systemd/system/consul.service 
[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/usr/local/consul/consul.d/consul.json

[Service]
ExecStart=/usr/local/consul/bin/consul agent -config-dir=/usr/local/consul/consul.d/
ExecReload=/usr/local/consul/bin/consul reload
KillMode=process
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Reload systemd after modifying the unit file:

~]# systemctl daemon-reload

Consul init script (CentOS 6)

~]# cat /etc/init.d/consul 
#!/bin/sh
#
# consul - this script starts and stops the consul daemon
#
# chkconfig:   - 86 16
# description:  Consul is a distributed service mesh to connect, secure, \
#               and configure services across any runtime platform and public or private cloud.
# processname: consul
# config:      /usr/local/consul/consul.d/consul.json
# pidfile:     /var/run/consul.pid

# Source function library.
. /etc/rc.d/init.d/functions

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
[ "$NETWORKING" = "no" ] && exit 0

consul="/usr/local/consul/bin/consul"
prog=$(basename $consul)

lockfile="/var/lock/subsys/${prog}"
pidfile="/var/run/${prog}.pid"

CONSUL_CONF_DIR="/usr/local/consul/consul.d"
CONSUL_CONF_FILE="${CONSUL_CONF_DIR}/consul.json"

start() {
    [ -x $consul ] || exit 4
    [ -f $CONSUL_CONF_FILE ] || exit 5
    echo -n $"Starting $prog: "
    nohup $consul agent -config-dir=$CONSUL_CONF_DIR &>/dev/null &
    retval=$?
    if [ $retval -eq 0 ];then
        echo_success
        echo
        touch $lockfile
    else
        echo_failure
        echo         
    fi
    return $retval
}

stop() {
    echo -n $"Stopping $prog: "
    killproc -p $pidfile $prog
    retval=$?
    echo
    [ $retval -eq 0 ] && rm -f $lockfile
    return $retval
}

restart() {
    stop
    start
}

reload() {
    echo -n $"Reloading $prog: "
    #killproc -p $pidfile $prog -HUP
    echo
    $consul reload &>/dev/null
}

configtest() {
    $consul validate $CONSUL_CONF_DIR
}

rh_status() {
    status $prog
}

case "$1" in
    start)
        rh_status && exit 0
        $1
        ;;
    stop)
        rh_status || exit 0
        $1
        ;;
    restart|configtest)
        $1
        ;;
    reload)
        rh_status || exit 7
        $1
        ;;
    status)
        rh_$1
        ;;
    *)
        echo $"Usage: $0 {start|stop|reload|configtest|status|restart}"
        exit 2
esac

Starting Consul

# CentOS 7
~]# systemctl start consul
~]# systemctl enable consul
# CentOS 6
~]# /etc/init.d/consul start
~]# chkconfig --add consul
~]# chkconfig consul on
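After the agents have been started, it helps to confirm that the servers have elected a leader before registering services. The following is a rough sketch that polls the agent's HTTP API endpoint /v1/status/leader, which returns an empty JSON string while there is no leader. The function names, the CONSUL_URL and WAIT_INTERVAL variables, and the default of 30 attempts are all assumptions chosen for illustration.

```shell
# Hypothetical helper: ask the local agent which server is the Raft leader.
# CONSUL_URL is an assumed variable; it defaults to the local HTTP API.
leader_query() {
    curl -fsS "${CONSUL_URL:-http://127.0.0.1:8500}/v1/status/leader"
}

# Hypothetical helper: poll until a leader is reported or attempts run out.
# Usage: wait_for_leader [ATTEMPTS]
wait_for_leader() (
    tries="${1:-30}"
    while [ "$tries" -gt 0 ]; do
        leader=$(leader_query 2>/dev/null)
        case "$leader" in
            ''|'""')
                # No answer yet, or an empty string: no leader elected
                ;;
            *)
                echo "leader: $leader"
                exit 0
                ;;
        esac
        tries=$((tries - 1))
        sleep "${WAIT_INTERVAL:-2}"
    done
    exit 1
)
```

Running wait_for_leader 30 after systemctl start consul gives the cluster roughly a minute to converge with the defaults used here; a non-zero exit status indicates that no leader appeared in time.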

Snapshot backups

Take snapshots regularly so that the cluster can be restored from one if its data is lost.

Backup script

~]# cat /scripts/consul-backup.sh 
#!/bin/bash

# DATE
DATE=`date +%F_%H`

# Consul Home
CONSUL_HOME="/usr/local/consul"
CONSUL_BAK_DIR="$CONSUL_HOME/snapshot"

[ -d $CONSUL_BAK_DIR ] || mkdir -p $CONSUL_BAK_DIR
$CONSUL_HOME/bin/consul snapshot save $CONSUL_BAK_DIR/${DATE}.snap
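Because the script above writes a new file for every run, the snapshot directory grows without bound. One possible addition, sketched here under assumptions, is a pruning step; the function name prune_snapshots and any particular retention period are illustrative and do not come from the original script.

```shell
# Hypothetical helper: delete *.snap files in DIR last modified more than
# DAYS days ago. Only the top level of DIR is examined.
# Usage: prune_snapshots DIR DAYS
prune_snapshots() {
    find "$1" -maxdepth 1 -type f -name '*.snap' -mtime +"$2" -delete
}
```

Calling prune_snapshots "$CONSUL_BAK_DIR" 7 at the end of the backup script would keep about a week of snapshots; the right retention depends on how far back a restore may need to reach.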

Scheduled job

Run the script periodically with cron:

~]# crontab -l
# Consul backup         Consul          henry             2019-05-14
0 * * * * /bin/bash /scripts/consul-backup.sh >/dev/null 2>&1

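Upgrading Consul

Upgrade strategy:

Because the whole system depends on Consul, upgrade with care. Rehearse the upgrade several times in a test environment before touching production. Upgrades fall into the three cases below; identify which one applies before you start.

◆ Version-specific upgrades. Check the upgrade-specific page for notes on the version you are upgrading to. For example, upgrading directly from a version earlier than 0.5.1 to 0.6 requires the consul-migrate tool to migrate the data.

◆ Incompatible upgrades. Run consul -v on the new version to see which protocol versions it remains compatible with. If it is incompatible with the current version, upgrade in two steps: first upgrade the whole cluster with -protocol=<old protocol version>, then remove the -protocol option from the start command and restart every node.

◆ Standard upgrades. If neither case above applies, a standard upgrade is all you need: stop the old agent, then start the new one. Most upgrades are in fact standard upgrades.

The recommended order is to upgrade the server followers first, then the server leader, and finally all the clients.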
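The recommended order can be turned into a simple sort. The sketch below assumes an input of one node per line in the form "NAME ROLE", where ROLE is follower, leader, or client; this format and the function name upgrade_order are hypothetical. The roles themselves can be read from consul operator raft list-peers (for servers) and consul members (for clients).

```shell
# Hypothetical helper: read "NAME ROLE" lines on stdin and print the node
# names in upgrade order: followers, then the leader, then clients.
# Nodes with an unrecognized role are placed last.
upgrade_order() {
    awk '
        { rank = 9 }
        $2 == "follower" { rank = 1 }
        $2 == "leader"   { rank = 2 }
        $2 == "client"   { rank = 3 }
        { print rank, $1 }
    ' | sort -s -k1,1n | awk '{ print $2 }'
}
```

For instance, feeding it the lines "c1 client", "s1 leader", "s2 follower", and "s3 follower" yields s2, s3, s1, c1. The stable sort keeps nodes of the same role in their input order.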