etcd 集群部署

Etcd 是 CoreOS 基于 Raft 开发的分布式 key-value 存储,可用于服务发现、共享配置以及一致性保障(如数据库选主、分布式锁等)。

Etcd 主要功能

基本的 key-value 存储
监听机制
key 的过期及续约机制,用于监控和服务发现
原子 CAS 和 CAD,用于分布式锁和 leader 选举

环境准备

操作系统 IP 地址 HostName
CentOS7.x-86_x64 10.0.52.13 k8s.master
CentOS7.x-86_x64 10.0.52.14 k8s.node1
CentOS7.x-86_x64 10.0.52.6 k8s.node2

etcd 部署

官方文档

https://github.com/etcd-io/etcd

下载地址

https://github.com/etcd-io/etcd/releases/download/v3.3.10/etcd-v3.3.10-linux-amd64.tar.gz

安装必要软件

本次实验你将会安装一些实用的命令行工具,这包括 cfssl、cfssljson

安装 CFSSL

cfssl 下载地址:
链接:https://pan.baidu.com/s/1NVnjpgnO7sTBf4frvuNWKA
提取码:dawo

chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo

验证 cfssl 的版本为 1.2.0 或是更高

[root@k8s ~]# cfssl version
Version: 1.2.0
Revision: dev
Runtime: go1.6

配置 CA 并创建 TLS 证书

我们将使用 CloudFlare's PKI 工具 cfssl 来配置 PKI Infrastructure,然后使用它去创建 Certificate Authority(CA), 并为 etcd创建 TLS 证书。

  • 创建目录
mkdir /opt/etcd/{bin,cfg,ssl} -p
cd /opt/etcd/ssl/
  • etcd ca配置
cat << EOF | tee ca-config.json
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "etcd": {
         "expiry": "87600h",
         "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ]
      }
    }
  }
}
EOF
  • etcd ca证书
cat << EOF | tee ca-csr.json
{
    "CN": "etcd CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing"
        }
    ]
}
EOF
  • 生成 CA 凭证和私钥:
cfssl gencert -initca ca-csr.json | cfssljson -bare ca 

[root@k8s ssl]# ls
ca-config.json  ca-csr.json
[root@k8s ssl]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca
2019/05/23 10:17:05 [INFO] generating a new CA key and certificate from CSR
2019/05/23 10:17:05 [INFO] generate received request
2019/05/23 10:17:05 [INFO] received CSR
2019/05/23 10:17:05 [INFO] generating key: rsa-2048
2019/05/23 10:17:06 [INFO] encoded CSR
2019/05/23 10:17:06 [INFO] signed certificate with serial number 20205354135917049449942525304158394005878373639
[root@k8s ssl]# ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem
[root@k8s ssl]# 

结果将生成以下两个文件:

ca-key.pem
ca.pem
  • etcd server证书
cat << EOF | tee server-csr.json
{
    "CN": "etcd",
    "hosts": [
    "10.0.52.14",
    "10.0.52.13",
    "10.0.52.6"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing"
        }
    ]
}
EOF
  • 生成server证书:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd server-csr.json | cfssljson -bare server

[root@k8s ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd server-csr.json | cfssljson -bare server
2019/05/23 10:23:47 [INFO] generate received request
2019/05/23 10:23:47 [INFO] received CSR
2019/05/23 10:23:47 [INFO] generating key: rsa-2048
2019/05/23 10:23:47 [INFO] encoded CSR
2019/05/23 10:23:47 [INFO] signed certificate with serial number 199680945753638449140210062880512592204940964699
2019/05/23 10:23:47 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
[root@k8s ssl]# ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem  server.csr  server-csr.json  server-key.pem  server.pem
[root@k8s ssl]# 

结果将生成以下两个文件:

server-key.pem
server.pem
  • etcd安装
  1. 解压缩
[root@k8s ~]# tar zxf etcd-v3.3.10-linux-amd64.tar.gz 
[root@k8s ~]# cd etcd-v3.3.10-linux-amd64
[root@k8s etcd-v3.3.10-linux-amd64]# ls
Documentation  etcd  etcdctl  README-etcdctl.md  README.md  READMEv2-etcdctl.md
[root@k8s etcd-v3.3.10-linux-amd64]# cp etcd etcdctl /opt/etcd/bin/
[root@k8s etcd-v3.3.10-linux-amd64]# 

  1. 配置etcd主文件
cat << EOF | tee /opt/etcd/cfg/etcd.conf
#[Member]
ETCD_NAME="etcd01"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.0.52.13:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.0.52.13:2379,http://127.0.0.1:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.0.52.13:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.0.52.13:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://10.0.52.13:2380,etcd02=https://10.0.52.14:2380,etcd03=https://10.0.52.6:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

#[Security]
ETCD_CERT_FILE="/opt/etcd/ssl/server.pem"
ETCD_KEY_FILE="/opt/etcd/ssl/server-key.pem"
ETCD_TRUSTED_CA_FILE="/opt/etcd/ssl/ca.pem"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_PEER_CERT_FILE="/opt/etcd/ssl/server.pem"
ETCD_PEER_KEY_FILE="/opt/etcd/ssl/server-key.pem"
ETCD_PEER_TRUSTED_CA_FILE="/opt/etcd/ssl/ca.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"
EOF
  1. 配置etcd启动文件
cat << EOF | tee /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
--name=\${ETCD_NAME} \
--data-dir=\${ETCD_DATA_DIR} \
--listen-peer-urls=\${ETCD_LISTEN_PEER_URLS} \
--listen-client-urls=\${ETCD_LISTEN_CLIENT_URLS} \
--advertise-client-urls=\${ETCD_ADVERTISE_CLIENT_URLS} \
--initial-advertise-peer-urls=\${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
--initial-cluster=\${ETCD_INITIAL_CLUSTER} \
--initial-cluster-token=\${ETCD_INITIAL_CLUSTER_TOKEN} \
--initial-cluster-state=\${ETCD_INITIAL_CLUSTER_STATE}  \
--cert-file=\${ETCD_CERT_FILE} \
--key-file=\${ETCD_KEY_FILE} \
--peer-cert-file=\${ETCD_PEER_CERT_FILE} \
--peer-key-file=\${ETCD_PEER_KEY_FILE} \
--trusted-ca-file=\${ETCD_TRUSTED_CA_FILE} \
--client-cert-auth=\${ETCD_CLIENT_CERT_AUTH} \
--peer-client-cert-auth=\${ETCD_PEER_CLIENT_CERT_AUTH} \
--peer-trusted-ca-file=\${ETCD_PEER_TRUSTED_CA_FILE}
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
  1. 启动
systemctl daemon-reload
systemctl enable etcd
systemctl restart etcd
  1. 其他节点重复如上操作,注意启动前etcd02、etcd03同样配置如下
etcd02
[root@k8s cfg]# cat etcd.conf 
#[Member]
ETCD_NAME="etcd02"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.0.52.14:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.0.52.14:2379,http://127.0.0.1:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.0.52.14:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.0.52.14:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://10.0.52.13:2380,etcd02=https://10.0.52.14:2380,etcd03=https://10.0.52.6:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

#[Security]
ETCD_CERT_FILE="/opt/etcd/ssl/server.pem"
ETCD_KEY_FILE="/opt/etcd/ssl/server-key.pem"
ETCD_TRUSTED_CA_FILE="/opt/etcd/ssl/ca.pem"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_PEER_CERT_FILE="/opt/etcd/ssl/server.pem"
ETCD_PEER_KEY_FILE="/opt/etcd/ssl/server-key.pem"
ETCD_PEER_TRUSTED_CA_FILE="/opt/etcd/ssl/ca.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"

etcd03
[root@k8s cfg]# cat etcd.conf 
#[Member]
ETCD_NAME="etcd03"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.0.52.6:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.0.52.6:2379,http://127.0.0.1:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.0.52.6:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.0.52.6:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://10.0.52.13:2380,etcd02=https://10.0.52.14:2380,etcd03=https://10.0.52.6:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

#[Security]
ETCD_CERT_FILE="/opt/etcd/ssl/server.pem"
ETCD_KEY_FILE="/opt/etcd/ssl/server-key.pem"
ETCD_TRUSTED_CA_FILE="/opt/etcd/ssl/ca.pem"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_PEER_CERT_FILE="/opt/etcd/ssl/server.pem"
ETCD_PEER_KEY_FILE="/opt/etcd/ssl/server-key.pem"
ETCD_PEER_TRUSTED_CA_FILE="/opt/etcd/ssl/ca.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"

  1. 服务检查
[root@k8s cfg]# /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/ca.pem --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem --endpoints="https://10.0.52.13:2379,https://10.0.52.14:2379,https://10.0.52.6:2379" cluster-health
member 11babd38de9e1f0f is healthy: got healthy result from https://10.0.52.13:2379
member 22436a037c5adb3b is healthy: got healthy result from https://10.0.52.14:2379
member a5e80429e983b681 is healthy: got healthy result from https://10.0.52.6:2379
cluster is healthy


[root@k8s cfg]# /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/ca.pem --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem --endpoints="https://10.0.52.13:2379,https://10.0.52.14:2379,https://10.0.52.6:2379" member list
11babd38de9e1f0f: name=etcd01 peerURLs=https://10.0.52.13:2380 clientURLs=https://10.0.52.13:2379 isLeader=false
22436a037c5adb3b: name=etcd02 peerURLs=https://10.0.52.14:2380 clientURLs=https://10.0.52.14:2379 isLeader=true
a5e80429e983b681: name=etcd03 peerURLs=https://10.0.52.6:2380 clientURLs=https://10.0.52.6:2379 isLeader=false

可能遇到的问题

  1. 各个主机应该关闭firewalld服务,否则会出现类似提示无法获取某个节点健康状态的提示:
[root@k8s cfg]#  /opt/etcd/bin/etcdctl --ca-file=/opt/etcd/ssl/ca.pem --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem --endpoints="https://10.0.52.13:2379,https://10.0.52.14:2379,https://10.0.52.6:2379" cluster-health
member 11babd38de9e1f0f is healthy: got healthy result from https://10.0.52.13:2379
failed to check the health of member 22436a037c5adb3b on https://10.0.52.14:2379: Get https://10.0.52.14:2379/health: dial tcp 10.0.52.14:2379: i/o timeout
member 22436a037c5adb3b is unreachable: [https://10.0.52.14:2379] are all unreachable
member a5e80429e983b681 is healthy: got healthy result from https://10.0.52.6:2379
  1. request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
[root@k8s cfg]# journalctl -u etcd -f
-- Logs begin at Thu 2019-05-23 14:29:05 CST. --
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:09 k8s.master etcd[13366]: publish error: etcdserver: request timed out
May 23 15:59:10 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:10 k8s.master etcd[13366]: request sent was ignored (cluster ID mismatch: peer[102b996c4aa7e55a]=4fb7ed98f0f6d1a7, local=4c0b05dc1b530742)
May 23 15:59:10 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:10 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:10 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)
May 23 15:59:10 k8s.master etcd[13366]: request cluster ID mismatch (got 4fb7ed98f0f6d1a7 want 4c0b05dc1b530742)

解决办法: 删除缓存,并重启etcd
[root@k8s cfg]# rm -rf /var/lib/etcd/default.etcd
[root@k8s cfg]# systemctl restart etcd

  1. rejected connection from "10.0.52.14:45458" (error "x509: certificate is valid for 10.0.52.12, 10.0.52.13, 10.0.52.6, not 10.0.52.14", ServerName "", IPAddresses ["10.0.52.12" "10.0.52.13" "10.0.52.6"], DNSNames [])
出现 x509的错误,请检查证书配置,上面的错误是,证书设置成了10.0.52.12,改成10.0.52.14即可.修改之后
systemctl daemon-reload
systemctl restart etcd

谢谢

你可能感兴趣的:(etcd 集群部署)