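etcd is an open-source project started by the CoreOS team in June 2013. Its goal is to provide a highly available distributed key-value store. etcd uses the Raft protocol as its consensus algorithm and is implemented in Go.
Official website:
https://etcd.io
GitHub repository:
https://github.com/etcd-io/etcd
Official hardware recommendations:
https://etcd.io/docs/v3.5/op-guide/hardware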
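Few etcd deployments require a lot of CPU capacity. Typical clusters need two to four cores to run smoothly. Heavily loaded etcd deployments, serving thousands of clients or tens of thousands of requests per second, tend to be CPU bound because etcd can serve requests from memory. Such heavy deployments usually need eight to sixteen dedicated cores.
etcd has a relatively small memory footprint, but its performance still depends on having enough memory. An etcd server aggressively caches key-value data and spends most of the rest of its memory tracking watchers. Typically 8GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB of memory accordingly.
Fast disks are the most critical factor for the performance and stability of an etcd deployment.
A slow disk increases etcd request latency and can hurt cluster stability. Because etcd's consensus protocol depends on persistently storing metadata to a log, a majority of etcd cluster members must write every request to disk. In addition, etcd incrementally checkpoints its state to disk so that it can truncate this log. If these writes take too long, heartbeats may time out and trigger an election, undermining the stability of the cluster. In general, a benchmarking tool such as fio can be used to tell whether a disk is fast enough for etcd.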
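etcd is very sensitive to disk write latency. Typically 50 sequential IOPS (for example, a 7200 RPM disk) is required. For heavily loaded clusters, 500 sequential IOPS (for example, a typical local SSD or a high-performance virtualized block device) is recommended. Note that most cloud providers publish concurrent IOPS rather than sequential IOPS; the published concurrent IOPS can be 10 times the sequential IOPS. To measure the actual sequential IOPS, use a disk benchmarking tool such as diskbench or fio.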
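A minimal sketch of such a disk check, assuming fio is installed and the test directory sits on the same disk that etcd will use. The helper names, the job name, and the write size (2300 bytes) and total size (22 MiB) are illustrative assumptions, not values taken from the text above.

```shell
#!/usr/bin/env bash
# Print the fio arguments, one per line, for a sequential-write test that
# calls fdatasync after every write (similar to how a write-ahead log is written).
# $1 = directory to test; it must already exist.
fio_wal_args() {
  printf '%s\n' \
    --name=etcd-wal-test \
    "--directory=$1" \
    --rw=write \
    --ioengine=sync \
    --fdatasync=1 \
    --size=22m \
    --bs=2300
}

# Run the benchmark when fio is available.
run_fio_wal_test() {
  local dir="$1" args
  if ! command -v fio >/dev/null 2>&1; then
    echo "fio is not installed" >&2
    return 1
  fi
  mapfile -t args < <(fio_wal_args "$dir")
  fio "${args[@]}"
}
```

Calling run_fio_wal_test with a scratch directory runs the job; the fsync/fdatasync latency percentiles in the fio report are the numbers to look at. A commonly cited target is a 99th percentile below 10 ms.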
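etcd requires only modest disk bandwidth, but more disk bandwidth buys faster recovery when a failed member has to catch up with the cluster. Typically 10MB/s will recover 100MB of data within 15 seconds. For large clusters, 100MB/s or higher is recommended in order to recover 1GB of data within 15 seconds.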
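These figures follow from dividing the data size by the available bandwidth. A small sketch of that arithmetic (the function name is illustrative, and the estimate ignores network and apply overhead):

```shell
#!/usr/bin/env bash
# Estimate the seconds needed to transfer a given amount of data.
# $1 = data size in MB, $2 = usable bandwidth in MB/s; rounds up to whole seconds.
recovery_seconds() {
  local size_mb="$1" bw_mb_s="$2"
  echo $(( (size_mb + bw_mb_s - 1) / bw_mb_s ))
}

recovery_seconds 100 10     # 100 MB at 10 MB/s
recovery_seconds 1024 100   # 1 GB at 100 MB/s
```

Both results stay below the 15-second target quoted above.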
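When possible, back etcd's storage with an SSD. An SSD usually provides lower write latency with less variance than a spinning disk, which improves the stability and reliability of etcd. If spinning disks are used, get the fastest disks available (15,000 RPM). For both spinning disks and SSDs, RAID 0 is also an effective way to increase disk speed. With at least three cluster members, the mirroring and/or parity variants of RAID are unnecessary; etcd's consistent replication already provides high availability.
Multi-member etcd deployments benefit from a fast and reliable network. Because etcd is designed to be both consistent and partition tolerant, an unreliable network with partitioning outages leads to poor availability. Low latency ensures that etcd members can communicate quickly. High bandwidth reduces the time needed to recover a failed etcd member. 1GbE is sufficient for common etcd deployments. For large etcd clusters, a 10GbE network reduces the mean time to recovery.
Deploy etcd members within a single data center when possible, to avoid latency overhead and reduce the likelihood of partitioning events. If a failure domain in another data center is required, choose a data center close to the existing one. See also the tuning documentation for more information on cross-data-center deployments.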
Number of pods that can be supported for a given hardware configuration:
Official documentation:
https://etcd.io/docs/v3.5/op-guide/maintenance/
etcd has the following properties:
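Note: this deployment does not use an etcd configuration file; every option is passed as a command-line flag in etcd.service. The trailing # notes in the listing are reading aids only and must be removed in a real unit file: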
[root@etcd01 ~]# cat /etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
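WorkingDirectory=/var/lib/etcd #data directory
ExecStart=/usr/local/bin/etcd \ #path to the etcd binary
--name=etcd-192.168.17.140 \ #name of this member
--cert-file=/etc/kubernetes/ssl/etcd.pem \ #certificate (public key) for client connections
--key-file=/etc/kubernetes/ssl/etcd-key.pem \ #private key for client connections
--peer-cert-file=/etc/kubernetes/ssl/etcd.pem \ #certificate (public key) for peer connections
--peer-key-file=/etc/kubernetes/ssl/etcd-key.pem \ #private key for peer connections
--trusted-ca-file=/etc/kubernetes/ssl/ca.pem \ #CA certificate
--peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--initial-advertise-peer-urls=https://192.168.17.140:2380 \ #peer URL this member advertises to the rest of the cluster
--listen-peer-urls=https://192.168.17.140:2380 \ #URL to listen on for traffic between cluster members
--listen-client-urls=https://192.168.17.140:2379,http://127.0.0.1:2379 \ #URLs to listen on for client traffic
--advertise-client-urls=https://192.168.17.140:2379 \ #client URL this member advertises; kube-apiserver connects to it
--initial-cluster-token=etcd-cluster-0 \ #token used when bootstrapping the cluster; must be identical on every member
--initial-cluster=etcd-192.168.17.140=https://192.168.17.140:2380,etcd-192.168.17.141=https://192.168.17.141:2380,etcd-192.168.17.142=https://192.168.17.142:2380 \ #all members of the cluster
--initial-cluster-state=new \ #new when bootstrapping a cluster, existing when joining a cluster that already exists
--data-dir=/var/lib/etcd \ #data directory path
--wal-dir= \ #write-ahead log directory; defaults to a location under the data directory
--snapshot-count=50000 \ #number of committed transactions that triggers a snapshot to disk
#etcd tuning parameters:
--auto-compaction-retention=10 \ #history compaction: the first compaction runs after 10 hours, then every 10 hours * 10% = 1 hour; compaction raises CPU load and may congest the network
--auto-compaction-mode=periodic \ #periodic compaction
--max-request-bytes=10485760 \ #request size limit: the largest single request etcd accepts, in bytes; the default is 1.5 MiB and the officially recommended maximum is 10 MiB (10485760/1024/1024 = 10)
--quota-backend-bytes=8589934592 #storage size limit for the backend database; the default is 2 GB, and values above 8 GB produce a warning at startup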
Restart=always
RestartSec=15
LimitNOFILE=65536
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
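The longer flag values in this unit file are easy to mistype. A small sketch that derives them, assuming the etcd-<ip> member naming and peer port 2380 used above; both function names are hypothetical helpers:

```shell
#!/usr/bin/env bash
# Build the --initial-cluster value from a space-separated list of member IPs.
build_initial_cluster() {
  local out="" ip
  for ip in $1; do
    out+="${out:+,}etcd-${ip}=https://${ip}:2380"
  done
  echo "$out"
}

# Convert MiB to bytes for --max-request-bytes and --quota-backend-bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

build_initial_cluster "192.168.17.140 192.168.17.141 192.168.17.142"
mib_to_bytes 10      # request size limit, 10 MiB
mib_to_bytes 8192    # backend quota, 8 GiB
```

The printed values match the --initial-cluster, --max-request-bytes and --quota-backend-bytes settings in the listing.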
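Cluster defragmentation (after etcd has been running for a long time its data on disk is no longer contiguous; defragmenting rewrites it in order, which restores sequential I/O and releases the free space):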
ETCDCTL_API=3 /usr/local/bin/etcdctl defrag --cluster --endpoints=https://192.168.17.140:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem
[root@etcd01 ~]# ETCDCTL_API=3 /usr/local/bin/etcdctl defrag --cluster --endpoints=https://192.168.17.140:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem
Finished defragmenting etcd member[https://192.168.17.141:2379]
Finished defragmenting etcd member[https://192.168.17.142:2379]
Finished defragmenting etcd member[https://192.168.17.140:2379]
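Defragmentation only releases space that compaction has already freed, so the two are often run together. A hedged sketch, assuming the endpoint and certificate paths used in this document and that the JSON printed by endpoint status contains a "revision" field; the function names are illustrative:

```shell
#!/usr/bin/env bash
# Extract the current store revision from `endpoint status --write-out=json` output (stdin).
extract_revision() {
  grep -o '"revision":[0-9]*' | head -n 1 | cut -d: -f2
}

# Compact history up to the current revision, then defragment every member.
compact_and_defrag() {
  local ep="https://192.168.17.140:2379"
  local tls=(--cacert=/etc/kubernetes/ssl/ca.pem
             --cert=/etc/kubernetes/ssl/etcd.pem
             --key=/etc/kubernetes/ssl/etcd-key.pem)
  local rev
  rev="$(ETCDCTL_API=3 etcdctl --endpoints="$ep" "${tls[@]}" \
           endpoint status --write-out=json | extract_revision)"
  [ -n "$rev" ] || { echo "could not read revision" >&2; return 1; }
  ETCDCTL_API=3 etcdctl --endpoints="$ep" "${tls[@]}" compaction "$rev" &&
  ETCDCTL_API=3 etcdctl --endpoints="$ep" "${tls[@]}" defrag --cluster
}
```

Nothing runs until compact_and_defrag is called. A member cannot serve reads or writes while it is being defragmented, so this is best scheduled for a quiet period.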
etcd stores only metadata, so the data volume is typically just a few GB.
[root@etcd01 ~]# ll /var/lib/etcd/
total 0
drwx------ 4 root root 29 May 4 10:28 member
[root@etcd01 ~]# ll /var/lib/etcd/member/
total 0
drwx------ 2 root root 200 May 4 11:06 snap
drwx------ 2 root root 109 May 4 10:28 wal
[root@etcd01 ~]# ll /var/lib/etcd/member/snap/
total 4236
-rw-r--r-- 1 root root 7601 Apr 19 01:00 0000000000000001-0000000000000003.snap
-rw-r--r-- 1 root root 10318 Apr 22 17:34 0000000000000010-000000000000c354.snap
-rw-r--r-- 1 root root 10318 Apr 22 23:06 0000000000000011-00000000000186a5.snap
-rw-r--r-- 1 root root 11489 May 4 11:06 0000000000000017-00000000000249f6.snap
-rw------- 1 root root 4292608 May 4 15:18 db
[root@etcd01 ~]# ll /var/lib/etcd/member/wal
total 187504
-rw------- 1 root root 64000616 Apr 22 23:42 0000000000000000-0000000000000000.wal
-rw------- 1 root root 64000000 May 4 15:18 0000000000000001-0000000000019b5d.wal
-rw------- 1 root root 64000000 May 4 10:28 0.tmp
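snap: data directory; it holds the snapshot files (*.snap) and the backend database file db
wal: write-ahead log; when data is inserted, the log entry is written first and the data is saved afterwards. If the log entry was not written successfully, the data is treated as not written, and the log can be used to recover data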
[root@etcd01 ~]# etcdctl --help
NAME:
etcdctl - A simple command line client for etcd3.
USAGE:
etcdctl [flags]
VERSION:
3.5.1
API VERSION:
3.5
COMMANDS:
alarm disarm Disarms all alarms
alarm list Lists all alarms
auth disable Disables authentication
auth enable Enables authentication
auth status Returns authentication status
check datascale Check the memory usage of holding data for different workloads on a given server endpoint.
check perf Check the performance of the etcd cluster
compaction Compacts the event history in etcd
defrag Defragments the storage of the etcd members with given endpoints
del Removes the specified key or range of keys [key, range_end)
elect Observes and participates in leader election
endpoint hashkv Prints the KV history hash for each endpoint in --endpoints
endpoint health Checks the healthiness of endpoints specified in `--endpoints` flag
endpoint status Prints out the status of endpoints specified in `--endpoints` flag
get Gets the key or a range of keys
help Help about any command
lease grant Creates leases
lease keep-alive Keeps leases alive (renew)
lease list List all active leases
lease revoke Revokes leases
lease timetolive Get lease information
lock Acquires a named lock
make-mirror Makes a mirror at the destination etcd cluster
member add Adds a member into the cluster
member list Lists all members in the cluster
member promote Promotes a non-voting member in the cluster
member remove Removes a member from the cluster
member update Updates a member in the cluster
move-leader Transfers leadership to another etcd cluster member.
put Puts the given key into the store
role add Adds a new role
role delete Deletes a role
role get Gets detailed information of a role
role grant-permission Grants a key to a role
role list Lists all roles
role revoke-permission Revokes a key from a role
snapshot restore Restores an etcd member snapshot to an etcd directory
snapshot save Stores an etcd node backend snapshot to a given file
snapshot status [deprecated] Gets backend snapshot status of a given file
txn Txn processes all the requests in one transaction
user add Adds a new user
user delete Deletes a user
user get Gets detailed information of a user
user grant-role Grants a role to a user
user list Lists all users
user passwd Changes password of user
user revoke-role Revokes a role from a user
version Prints the version of etcdctl
watch Watches events stream on keys or prefixes
OPTIONS:
--cacert="" verify certificates of TLS-enabled secure servers using this CA bundle
--cert="" identify secure client using this TLS certificate file
--command-timeout=5s timeout for short running command (excluding dial timeout)
--debug[=false] enable client-side debug logging
--dial-timeout=2s dial timeout for client connections
-d, --discovery-srv="" domain name to query for SRV records describing cluster endpoints
--discovery-srv-name="" service name to query when using DNS discovery
--endpoints=[127.0.0.1:2379] gRPC endpoints
-h, --help[=false] help for etcdctl
--hex[=false] print byte strings as hex encoded strings
--insecure-discovery[=true] accept insecure SRV records describing cluster endpoints
--insecure-skip-tls-verify[=false] skip server certificate verification (CAUTION: this option should be enabled only for testing purposes)
--insecure-transport[=true] disable transport security for client connections
--keepalive-time=2s keepalive time for client connections
--keepalive-timeout=6s keepalive timeout for client connections
--key="" identify secure client using this TLS key file
--password="" password for authentication (if this option is used, --user option shouldn't include password)
--user="" username[:password] for authentication (prompt if password is not supplied)
-w, --write-out="simple" set the output format (fields, json, protobuf, simple, table)
export NODE_IPS="192.168.17.140 192.168.17.141 192.168.17.142"
for ip in ${NODE_IPS};do ETCDCTL_API=3 /usr/local/bin/etcdctl --endpoints=https://${ip}:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem endpoint health;done
[root@etcd01 ~]# export NODE_IPS="192.168.17.140 192.168.17.141 192.168.17.142"
[root@etcd01 ~]# for ip in ${NODE_IPS};do ETCDCTL_API=3 /usr/local/bin/etcdctl --endpoints=https://${ip}:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem endpoint health;done
https://192.168.17.140:2379 is healthy: successfully committed proposal: took = 6.786219ms
https://192.168.17.141:2379 is healthy: successfully committed proposal: took = 10.078909ms
https://192.168.17.142:2379 is healthy: successfully committed proposal: took = 8.635907ms
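The health output can also be checked automatically. A minimal sketch that counts healthy endpoints and compares the count with the Raft majority; the function names are hypothetical, and the parsing assumes the "is healthy" wording shown in the transcript:

```shell
#!/usr/bin/env bash
# Majority needed for a cluster of N voting members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Count endpoints reported healthy in `etcdctl endpoint health` output (stdin).
count_healthy() {
  grep -c 'is healthy' || true
}

# Succeed when the healthy endpoints still form a majority.
# $1 = total number of members; health output is read from stdin.
has_quorum() {
  local total="$1" healthy
  healthy="$(count_healthy)"
  [ "$healthy" -ge "$(quorum "$total")" ]
}
```

A possible use is to redirect the loop's output with 2>&1 and pipe it into has_quorum 3, since some etcdctl versions print the health lines on stderr. With three members the cluster tolerates one failure, because two healthy members still form a majority.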
Display detailed member information as a table:
export NODE_IPS="192.168.17.140 192.168.17.141 192.168.17.142"
for ip in ${NODE_IPS};do ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table endpoint status --endpoints=https://${ip}:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem;done
#add the --write-out=table option to endpoint status to display the details as a table
[root@etcd01 ~]# export NODE_IPS="192.168.17.140 192.168.17.141 192.168.17.142"
[root@etcd01 ~]# for ip in ${NODE_IPS};do ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table endpoint status --endpoints=https://${ip}:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem;done
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.17.140:2379 | f0b00c5fba82edde | 3.5.1 | 2.0 MB | false | false | 24 | 196500 | 196500 | |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.17.141:2379 | 31b79580a6603995 | 3.5.1 | 1.9 MB | true | false | 24 | 196500 | 196500 | |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.17.142:2379 | bd6bb6e56a019be8 | 3.5.1 | 2.0 MB | false | false | 24 | 196500 | 196500 | |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
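# IS LEADER: whether this member is the leader; writes generally go through the leader and are then replicated to the other members
# IS LEARNER: whether this member is a learner, that is, a non-voting member that is still synchronizing with the cluster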
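Passing all endpoints in a single --endpoints value yields one table instead of three, and the leader can then be picked out of it. A sketch under the assumption that the table keeps the column order shown above; the helper names are illustrative:

```shell
#!/usr/bin/env bash
# Join space-separated IPs into a comma-separated --endpoints value (client port 2379).
join_endpoints() {
  local out="" ip
  for ip in $1; do
    out+="${out:+,}https://${ip}:2379"
  done
  echo "$out"
}

# Print the endpoint whose IS LEADER column is true, reading table output from stdin.
# Splitting on '|' makes field 2 the ENDPOINT column and field 6 the IS LEADER column.
find_leader() {
  awk -F'|' '$6 ~ /true/ { gsub(/ /, "", $2); print $2 }'
}

join_endpoints "192.168.17.140 192.168.17.141 192.168.17.142"
```

The joined string is meant to be passed as --endpoints to etcdctl endpoint status --write-out=table, whose output can then be piped into find_leader.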
[root@etcd01 ~]# etcdctl get / --prefix --keys-only
[root@etcd01 ~]# etcdctl get / --prefix --keys-only |grep nginx
/calico/resources/v3/projectcalico.org/workloadendpoints/default/node02-k8s-nginx-eth0
/registry/deployments/test/nginx-deployment
/registry/pods/default/nginx
/registry/services/specs/test/nginx-service
[root@etcd01 ~]#
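The key listing can be summarized per resource type. A minimal sketch, assuming Kubernetes keys follow the /registry/<resource>/... layout visible in the output above; the function name is hypothetical:

```shell
#!/usr/bin/env bash
# Count Kubernetes objects per resource type from `--keys-only` output (stdin).
# Keys outside /registry (for example /calico/...) and blank lines are skipped.
count_by_resource() {
  awk -F/ '$2 == "registry" && NF > 2 { n[$3]++ } END { for (r in n) print r, n[r] }' | sort
}
```

Piping etcdctl get / --prefix --keys-only into count_by_resource prints one line per resource type with its object count.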
[root@etcd01 ~]# etcdctl get /registry/pods/default/nginx
/registry/pods/default/nginx
k8s
v1Pod
(the rest of the value is the Pod object serialized as protobuf; etcdctl prints the raw bytes, so the output is mostly unreadable and is omitted here. Readable fragments include the pod name nginx, the namespace default, the pod IP 10.200.140.73, the host IP 192.168.17.151 and the image nginx:latest)
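Because the stored value is binary, a quick way to skim it is to drop the non-printable bytes. This is only a rough sketch, not a protobuf decoder, and the function name is hypothetical:

```shell
#!/usr/bin/env bash
# Keep only printable ASCII characters and newlines from a binary value on stdin.
printable_only() {
  LC_ALL=C tr -cd '[:print:]\n'
}
```

One way to use it is to pipe etcdctl get /registry/pods/default/nginx --print-value-only into printable_only. For a structured view of the same object, kubectl get pod nginx -o yaml remains the supported route.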
[root@etcd01 ~]# etcdctl del /registry/pods/default/nginx
1
[root@master01 ssl]# kubectl get pods
NAME READY STATUS RESTARTS AGE
dujie-test1 1/1 Running 1 (20h ago) 11d
nginx 1/1 Running 2 (5h45m ago) 17d
[root@master01 ssl]#
[root@master01 ssl]# kubectl get pods
NAME READY STATUS RESTARTS AGE
dujie-test1 1/1 Running 1 ( ago) 7d7h
#watch a key on node1 of the etcd cluster; the key does not have to exist yet and can be created later
[root@etcd01 ~]# ETCDCTL_API=3 etcdctl watch /data
#modify the data on etcd node2 and verify that etcd node1 sees the change
[root@etcd02 ~]# ETCDCTL_API=3 etcdctl put /data "data v1"
OK
[root@etcd02 ~]# ETCDCTL_API=3 etcdctl put /data "data v2"
OK
[root@etcd01 ~]# ETCDCTL_API=3 etcdctl watch /data
PUT
/data
data v1
PUT
/data
data v2
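The watch output arrives as three lines per event (event type, key, value), which is awkward to log. A minimal sketch that folds each event into one line, assuming the three-line layout shown above and values without embedded newlines; the function name is illustrative:

```shell
#!/usr/bin/env bash
# Turn three-line watch output (event type, key, value) into one line per event.
format_watch_events() {
  local type key value
  while IFS= read -r type && IFS= read -r key && IFS= read -r value; do
    printf '%s %s=%s\n' "$type" "$key" "$value"
  done
}
```

For example, ETCDCTL_API=3 etcdctl watch /data can be piped into format_watch_events, which would print one line such as PUT /data=data v1 for each change.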