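Abstract: This article walks through issuing TLS certificates with the CFSSL toolkit and deploying a production-grade, highly available ETCD cluster. It covers the full certificate lifecycle, cluster configuration tuning, and security hardening, and applies to Kubernetes and other distributed systems.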
Node IP | Role | Hostname | Certificate SAN entries |
---|---|---|---|
192.167.14.228 | ETCD Master | etcd-1 | IP: 192.167.14.228, 192.167.14.229, 192.167.14.246 |
192.167.14.229 | ETCD Backup | etcd-2 | DNS: etcd-cluster.local |
192.167.14.246 | ETCD Backup | etcd-3 | |
Port | Protocol | Purpose |
---|---|---|
2379 | HTTPS | Client communication |
2380 | HTTPS | Peer communication between nodes |
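Three members is the smallest cluster size that survives the loss of a node, because etcd's Raft consensus needs a majority (quorum) of members to commit writes. The arithmetic can be sketched as follows; the function names `quorum` and `fault_tolerance` are illustrative and not part of any etcd tool.

```shell
# Quorum for an n-member etcd cluster is floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while the cluster stays writable.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

For the three nodes in the table above, the quorum is 2, so the cluster keeps accepting writes with one node down.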
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 \
https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 \
https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl* && mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
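Before generating any certificates, it can help to confirm that all three binaries are actually on `PATH`. A minimal sketch, assuming a POSIX shell; `missing_tools` is a hypothetical helper name.

```shell
# Print the name of every listed command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools cfssl cfssljson cfssl-certinfo)
if [ -n "$missing" ]; then
  echo "not installed or not on PATH:" $missing
else
  cfssl version   # prints the installed CFSSL version
fi
```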
mkdir -p ~/etcd_tls && cd ~/etcd_tls
# CA configuration file
cat > ca-config.json <<EOF
{
"signing": {
"default": {"expiry": "876000h"},
"profiles": {
"kubernetes": {
"expiry": "876000h",
"usages": ["signing", "key encipherment", "server auth", "client auth"]
}
}
}
}
EOF
# CA certificate signing request (CSR)
cat > ca-csr.json <<EOF
{
"CN": "Kubernetes",
"key": {"algo": "rsa", "size": 2048},
"names": [{"C": "CN", "L": "Xi'an", "O": "k8s", "OU": "Cluster"}]
}
EOF
# Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
cat > etcd-csr.json <<EOF
{
"CN": "etcd",
"hosts": [
"192.167.14.228",
"192.167.14.229",
"192.167.14.246",
"etcd-cluster.local"
],
"key": {"algo": "rsa", "size": 2048},
"names": [{"C": "CN", "L": "Xi'an", "O": "k8s", "OU": "ETCD"}]
}
EOF
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
-config=ca-config.json -profile=kubernetes \
etcd-csr.json | cfssljson -bare etcd
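Peers and clients validate the address they connect to against the certificate's SAN list, so it is worth confirming that every entry from `etcd-csr.json` ended up in `etcd.pem`. The sketch below assumes `openssl` is installed; `cert_has_san` is an illustrative helper, not a CFSSL or etcd command.

```shell
# Succeed if the certificate lists the given IP or DNS name as a SAN.
cert_has_san() {
  cert_file=$1
  wanted=$2
  openssl x509 -in "$cert_file" -noout -text \
    | grep -A1 'Subject Alternative Name' \
    | tail -n 1 \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -qxF -e "DNS:$wanted" -e "IP Address:$wanted"
}

# Check every host from etcd-csr.json (skipped if etcd.pem is not in this directory).
if [ -f etcd.pem ]; then
  for host in 192.167.14.228 192.167.14.229 192.167.14.246 etcd-cluster.local; do
    if cert_has_san etcd.pem "$host"; then
      echo "ok: $host"
    else
      echo "MISSING: $host"
    fi
  done
fi
```

Any `MISSING` line means the CSR `hosts` list needs correcting and the certificate needs to be reissued.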
ETCD_VER=v3.5.9
wget https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar -zxvf etcd-${ETCD_VER}-linux-amd64.tar.gz
mkdir -p /opt/etcd/{bin,cfg,ssl}
mv etcd-${ETCD_VER}-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/
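The etcd service reads its certificates from `/opt/etcd/ssl`, so the files generated in `~/etcd_tls` have to be placed there on every node. A minimal sketch of that step; `install_certs` is a hypothetical helper, and restricting the key to mode 600 is a suggested precaution rather than something the original steps require.

```shell
# Copy the CA cert, etcd cert and etcd key from src_dir into dst_dir
# and restrict access to the private key.
install_certs() {
  src_dir=$1
  dst_dir=$2
  mkdir -p "$dst_dir" || return 1
  for f in ca.pem etcd.pem etcd-key.pem; do
    cp "$src_dir/$f" "$dst_dir/" || return 1
  done
  chmod 600 "$dst_dir/etcd-key.pem"
}

# Example (run as root on the node that holds ~/etcd_tls):
#   install_certs ~/etcd_tls /opt/etcd/ssl
```

The other two nodes need the same three files, for example copied over with `scp`. The CA private key `ca-key.pem` is not needed by etcd itself and is best kept off the cluster nodes.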
cat > /opt/etcd/cfg/etcd.conf <<EOF
# Loaded by systemd via EnvironmentFile; etcd reads the ETCD_* variables.
# Values are for etcd-1; change ETCD_NAME and the node IP on etcd-2 and etcd-3.
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://192.167.14.228:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.167.14.228:2379,https://127.0.0.1:2379"
#[Cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.167.14.228:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.167.14.228:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.167.14.228:2380,etcd-2=https://192.167.14.229:2380,etcd-3=https://192.167.14.246:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
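Each node needs its own copy of this file with its own member name and IP. Assuming the file is consumed through systemd's `EnvironmentFile` (which hands `ETCD_`-prefixed environment variables to etcd), the per-node variants can be generated rather than edited by hand. `render_etcd_conf` is a hypothetical helper name.

```shell
# Print an etcd environment file for one member.
# $1 = member name, $2 = that member's IP address.
render_etcd_conf() {
  name=$1
  ip=$2
  cluster="etcd-1=https://192.167.14.228:2380,etcd-2=https://192.167.14.229:2380,etcd-3=https://192.167.14.246:2380"
  cat <<EOF
#[Member]
ETCD_NAME="$name"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://$ip:2380"
ETCD_LISTEN_CLIENT_URLS="https://$ip:2379,https://127.0.0.1:2379"
#[Cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://$ip:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://$ip:2379"
ETCD_INITIAL_CLUSTER="$cluster"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
}

# Example: preview the file for the second node.
render_etcd_conf etcd-2 192.167.14.229
```

Redirecting the output to `/opt/etcd/cfg/etcd.conf` on the matching node gives that node its configuration; only the name and the four node-specific URLs differ between members.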
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=ETCD KeyValue Store
Documentation=https://etcd.io
After=network.target
[Service]
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
--cert-file=/opt/etcd/ssl/etcd.pem \
--key-file=/opt/etcd/ssl/etcd-key.pem \
--peer-cert-file=/opt/etcd/ssl/etcd.pem \
--peer-key-file=/opt/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/opt/etcd/ssl/ca.pem \
--peer-trusted-ca-file=/opt/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now etcd
ETCDCTL_API=3 /opt/etcd/bin/etcdctl \
--cacert=/opt/etcd/ssl/ca.pem \
--cert=/opt/etcd/ssl/etcd.pem \
--key=/opt/etcd/ssl/etcd-key.pem \
--endpoints="https://192.167.14.228:2379,https://192.167.14.229:2379,https://192.167.14.246:2379" \
endpoint health --write-out=table
Expected output:
+-----------------------------+--------+-------------+-------+
|          ENDPOINT           | HEALTH |    TOOK     | ERROR |
+-----------------------------+--------+-------------+-------+
| https://192.167.14.228:2379 |   true | 14.567345ms |       |
| https://192.167.14.229:2379 |   true | 15.234512ms |       |
| https://192.167.14.246:2379 |   true | 16.789123ms |       |
+-----------------------------+--------+-------------+-------+
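Every later `etcdctl` call needs the same TLS flags and endpoint list, so a small wrapper saves retyping them. This is a convenience sketch; `etcdctl_tls`, `ETCDCTL_BIN`, and `ETCD_ENDPOINTS` are names introduced here for illustration.

```shell
# Path to etcdctl; override ETCDCTL_BIN if it lives elsewhere.
ETCDCTL_BIN="${ETCDCTL_BIN:-/opt/etcd/bin/etcdctl}"
ETCD_ENDPOINTS="https://192.167.14.228:2379,https://192.167.14.229:2379,https://192.167.14.246:2379"

# Run etcdctl with the cluster's TLS flags; extra arguments pass through.
etcdctl_tls() {
  ETCDCTL_API=3 "$ETCDCTL_BIN" \
    --cacert=/opt/etcd/ssl/ca.pem \
    --cert=/opt/etcd/ssl/etcd.pem \
    --key=/opt/etcd/ssl/etcd-key.pem \
    --endpoints="$ETCD_ENDPOINTS" \
    "$@"
}

# Examples (these need a running cluster):
#   etcdctl_tls member list --write-out=table
#   etcdctl_tls endpoint status --write-out=table
```

`member list` shows which members have joined, and `endpoint status` additionally reports which member currently holds the Raft leader role.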
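# Require client certificate authentication (etcd startup flag)
--client-cert-auth=true
# Check certificate validity dates; plan to rotate certificates regularly (for example, yearly)
openssl x509 -in /opt/etcd/ssl/etcd.pem -noout -dates
# Raise the backend storage quota
--quota-backend-bytes=8589934592 # 8 GiB
# Tune logging
--log-level=warn
--logger=zap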
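Reading the dates by eye does not scale to a scheduled check. One possible approach, assuming `openssl` is available, uses its `-checkend` option to turn "expires soon" into an exit status; `cert_expires_within` is a hypothetical helper name, and the 30-day threshold in the example is an arbitrary choice.

```shell
# Succeed if the certificate expires within the given number of days
# (also succeeds if the file cannot be read, so problems are not missed).
cert_expires_within() {
  cert_file=$1
  days=$2
  # -checkend exits 0 when the cert is still valid N seconds from now,
  # so the result is inverted here.
  ! openssl x509 -in "$cert_file" -noout -checkend $(( days * 86400 )) >/dev/null 2>&1
}

# Example for a cron job:
#   cert_expires_within /opt/etcd/ssl/etcd.pem 30 && echo "renew etcd.pem soon"
```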
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.167.14.0/24" port port="2379-2380" protocol="tcp" accept'
firewall-cmd --reload
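Symptom | Diagnostic command | Resolution |
---|---|---|
Node cannot join the cluster | journalctl -u etcd -f | Check that the certificate SANs match the node IPs |
Client connection times out | telnet | Verify firewall and SELinux policy |
Out of storage space | du -sh /var/lib/etcd/member/ | Remove old snapshots or expand storage |
Certificate expired | cfssl-certinfo -cert etcd.pem | Reissue the certificate and perform a rolling restart of the cluster |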
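For the connection-timeout case, a scripted reachability check can stand in for an interactive `telnet` session. The sketch below assumes `bash` (for its `/dev/tcp` redirection) and coreutils `timeout` are installed; `port_open` and the 3-second limit are illustrative choices.

```shell
# Succeed if a TCP connection to host:port opens within 3 seconds.
# Relies on bash's /dev/tcp redirection and coreutils' timeout.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example: probe the client and peer ports of one node.
#   for port in 2379 2380; do
#     port_open 192.167.14.229 "$port" && echo "$port reachable" || echo "$port unreachable"
#   done
```

A port that is unreachable from another node but reachable locally usually points at the firewall rule or SELinux policy rather than at etcd itself.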
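With this walkthrough you have covered how to build and maintain an enterprise-grade ETCD cluster. Run disaster recovery drills regularly to confirm that the cluster stays highly available.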