Kubernetes 1.8.0 Test Environment Installation and Deployment
Date: 2017-12-11
Before working through this zookeeper example, you should at least understand these concepts: StatefulSets, PodDisruptionBudgets, PodAntiAffinity, pv, pvc and storageclass.
See the relevant official docs; I will also add explanations where needed later in this document.
This example uses the Dynamic Volume Provisioning feature, which requires a configured storageclass. Without Dynamic Volume Provisioning, the corresponding PVCs must be created and bound before the StatefulSet is created. (Note: a StatefulSet creates its PVCs from a template, so the PVC names are fixed, following the volumeclaimname-statefulsetname-ordinal pattern. For example, with a StatefulSet named zk, a volume named datadir and 3 replicas, you would pre-create three PVCs sized to match the template request (10Gi in this example), named datadir-zk-0, datadir-zk-1 and datadir-zk-2. A sketch of pre-creating them by hand follows below.)
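For illustration only, a sketch of pre-creating those three claims in the non-dynamic case, assuming suitable PVs (or a usable StorageClass) exist to satisfy them; the names follow the convention described above:
for i in 0 1 2; do
cat <<EOF | kubectl create -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datadir-zk-$i
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 10Gi
EOF
done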
The Dynamic Volume Provisioning setup itself was covered earlier in 《kubernetes-1.8.0》15-addon-vSphere Cloud Provider. Here it is modified slightly to create a default StorageClass, so that any request that does not specify a storageClassName is provisioned from this StorageClass.
Create the default StorageClass:
default-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: default
  annotations: {
    "storageclass.kubernetes.io/is-default-class" : "true"
  }
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: zeroedthick
  datastore: local_datastore_47
reclaimPolicy: Retain
annotations: the storageclass.kubernetes.io/is-default-class annotation must be set to "true", marking this StorageClass as the default.
Enable apiserver support:
On all three masters, edit the /etc/kubernetes/apiserver configuration file and add DefaultStorageClass to the ADMISSION_CONTROL field.
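A sketch of what the edited line might look like; the exact variable name and the other admission plugins listed here depend on how the apiserver was installed and are illustrative only, the point being that DefaultStorageClass appears in the list:
# /etc/kubernetes/apiserver (fragment, illustrative)
KUBE_ADMISSION_CONTROL="--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,DefaultStorageClass"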
Restart the apiserver:
$ systemctl daemon-reload
$ systemctl restart kube-apiserver
Apply the yaml:
$ kubectl create -f default-sc.yaml
Check the created StorageClass:
[root@node-131 zookeeper]# kubectl get sc
NAME PROVISIONER
default (default) kubernetes.io/vsphere-volume
fast kubernetes.io/vsphere-volume
At this point the base environment is essentially ready. One extra reminder: for vSphere volume provisioning, use a shared datastore wherever possible. If you use host-local datastores, all of the VMs must live on the same datastore; otherwise some pods will later fail to mount their PVCs.
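Optionally, dynamic provisioning can be sanity-checked with a throwaway claim before deploying zookeeper. A minimal sketch, using a hypothetical claim name test-claim; since it sets no storageClassName, it should be provisioned from the default StorageClass created above and can be deleted afterwards:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi
If kubectl get pvc test-claim shows the claim Bound shortly after creation, the default class is working.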
zookeeper.yaml
apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  ports:
  - port: 2181
    name: client
  selector:
    app: zk
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1beta2 # for versions before 1.8.0 use apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - zk
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: kubernetes-zookeeper
        imagePullPolicy: Always
        image: "gcr.mirrors.ustc.edu.cn/google_containers/kubernetes-zookeeper:1.0-3.4.10"
        resources:
          requests:
            memory: "1Gi"
            cpu: "0.5"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        command:
        - sh
        - -c
        - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=60 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=40000 \
          --min_session_timeout=4000 \
          --log_level=INFO"
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
This yaml defines two Services, one PodDisruptionBudget and one StatefulSet.
clusterIP: None: marks zk-hs as a Headless Service. The StatefulSet pods are addressed by fixed hostnames (podname.servicename.namespace.svc.cluster.local.), which bypasses the clusterIP virtual-IP load balancing and reaches the pods directly.
PodDisruptionBudget: a protection mechanism that guarantees a minimum number of pod instances carrying a given label. For quorum systems such as zk or etcd, this prevents an administrator mistake or an autoscaler scale-down from dropping the pod count below the minimum and breaking the cluster. In this example, the maximum number of unavailable pods with the label app: zk (maxUnavailable) is 1: deleting one pod is fine, but trying to delete a second one before the first is Running again will be rejected, because maxUnavailable is 1.
StatefulSet:
podManagementPolicy: Parallel: start or terminate all pods at the same time. The default, OrderedReady, requires pods to start strictly in order 0 ~ N-1.
spec.affinity: this part configures affinity scheduling. spec.affinity.podAntiAffinity declares anti-affinity; requiredDuringSchedulingIgnoredDuringExecution makes it a hard requirement, i.e. the anti-affinity rule must be honored. The labelSelector that follows means that if a node already runs a pod whose app label includes zk, no further zk pod will be scheduled onto that node.
volumeClaimTemplates: a PVC request template; a 10Gi PVC is requested automatically for each pod. To use a specific (non-default) StorageClass, specify storageClassName inside volumeClaimTemplates to declare which StorageClass the PVCs should be created from, as in the sketch below.
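A minimal sketch of that, assuming the fast StorageClass shown in the kubectl get sc output above (any provisionable class works):
volumeClaimTemplates:
- metadata:
    name: datadir
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: fast
    resources:
      requests:
        storage: 10Gi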
Apply the yaml:
$ kubectl apply -f zookeeper.yaml
Check that the PVCs were created and Bound:
[root@node-131 zookeeper]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
...
datadir-zk-0 Bound pvc-fc2918cf-dd83-11e7-8e94-005056bc80ed 10Gi RWO default 18h
datadir-zk-1 Bound pvc-fc2ac19e-dd83-11e7-8e94-005056bc80ed 10Gi RWO default 18h
datadir-zk-2 Bound pvc-fc2c2889-dd83-11e7-8e94-005056bc80ed 10Gi RWO default 18h
...
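The dynamically provisioned volumes backing these claims can be inspected as well (standard kubectl; output omitted here):
$ kubectl get pv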
Watch the pods come up:
kubectl get pods -w -l app=zk
NAME READY STATUS RESTARTS AGE
zk-0 0/1 Pending 0 0s
zk-0 0/1 Pending 0 0s
zk-0 0/1 ContainerCreating 0 0s
zk-0 0/1 Running 0 19s
zk-0 1/1 Running 0 40s
zk-1 0/1 Pending 0 0s
zk-1 0/1 Pending 0 0s
zk-1 0/1 ContainerCreating 0 0s
zk-1 0/1 Running 0 18s
zk-1 1/1 Running 0 40s
zk-2 0/1 Pending 0 0s
zk-2 0/1 Pending 0 0s
zk-2 0/1 ContainerCreating 0 0s
zk-2 0/1 Running 0 19s
zk-2 1/1 Running 0 40s
Once zk-0 through zk-2 are all Running, the basic deployment is complete.
Check the hostnames:
[root@node-131 zookeeper]# for i in 0 1 2; do kubectl exec zk-$i -- hostname; done
zk-0
zk-1
zk-2
Check the myid of each of the three pods:
[root@node-131 zookeeper]# for i in 0 1 2; do echo "myid zk-$i";kubectl exec zk-$i -- cat /var/lib/zookeeper/data/myid; done
myid zk-0
1
myid zk-1
2
myid zk-2
3
Check the FQDN of each of the three pods:
[root@node-131 zookeeper]# for i in 0 1 2; do kubectl exec zk-$i -- hostname -f; done
zk-0.zk-hs.default.svc.cluster.local.
zk-1.zk-hs.default.svc.cluster.local.
zk-2.zk-hs.default.svc.cluster.local.
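These names resolve from any pod through the headless service. A quick check, assuming a busybox image is pullable in this environment (illustrative only):
$ kubectl run -i --tty dns-test --image=busybox --restart=Never --rm -- nslookup zk-0.zk-hs.default.svc.cluster.local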
Check the zookeeper configuration:
[root@node-131 zookeeper]# kubectl exec zk-0 -- cat /opt/zookeeper/conf/zoo.cfg
#This file was autogenerated DO NOT EDIT
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/data/log
tickTime=2000
initLimit=10
syncLimit=5
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInteval=12
server.1=zk-0.zk-hs.default.svc.cluster.local.:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local.:2888:3888
server.3=zk-2.zk-hs.default.svc.cluster.local.:2888:3888
Testing zookeeper functionality
Write data on one member and read it back from another.
Create the key/value pair hello/world on zk-0:
[root@node-131 zookeeper]# kubectl exec zk-0 zkCli.sh create /hello world
Connecting to localhost:2181
...
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
Node already exists: /hello
Read the value of the key hello on zk-1:
[root@node-131 zookeeper]# kubectl exec zk-1 zkCli.sh get /hello
Connecting to localhost:2181
...
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
world
cZxid = 0x100000002
ctime = Sun Dec 10 11:24:11 UTC 2017
mZxid = 0x100000002
mtime = Sun Dec 10 11:24:11 UTC 2017
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
Testing data persistence
Delete the statefulset:
[root@node-131 zookeeper]# kubectl delete statefulset zk
statefulset "zk" deleted
Recreate the statefulset:
[root@node-131 zookeeper]# kubectl apply -f zookeeper.yaml
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
service "zk-hs" configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
service "zk-cs" configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
poddisruptionbudget "zk-pdb" configured
statefulset "zk" created
[root@node-131 zookeeper]# kubectl get pods -w -l app=zk
NAME READY STATUS RESTARTS AGE
zk-0 0/1 ContainerCreating 0 12s
zk-1 0/1 ContainerCreating 0 12s
zk-2 0/1 ContainerCreating 0 12s
zk-2 0/1 Running 0 14s
zk-1 0/1 Running 0 15s
zk-1 1/1 Running 0 29s
zk-2 1/1 Running 0 31s
On zk-2, check whether the key hello can still be read:
[root@node-131 zookeeper]# kubectl exec zk-2 zkCli.sh get /hello
Connecting to localhost:2181
...
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
world
cZxid = 0x100000002
ctime = Sun Dec 10 11:24:11 UTC 2017
mZxid = 0x100000002
mtime = Sun Dec 10 11:24:11 UTC 2017
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
The key can still be read, which shows the data persisted across deleting the statefulset. The three PVCs were dynamically created when the statefulset was first created; when the statefulset is recreated (rescheduled), those same three PVCs are mounted back onto the corresponding directory (/var/lib/zookeeper).
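To confirm that the claims outlived the statefulset, list them by name:
$ kubectl get pvc datadir-zk-0 datadir-zk-1 datadir-zk-2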
Verify that the configuration persisted:
$ kubectl get sts zk -o yaml
...
command:
- sh
- -c
- "start-zookeeper \
  --servers=3 \
  --data_dir=/var/lib/zookeeper/data \
  --data_log_dir=/var/lib/zookeeper/data/log \
  --conf_dir=/opt/zookeeper/conf \
  --client_port=2181 \
  --election_port=3888 \
  --server_port=2888 \
  --tick_time=2000 \
  --init_limit=10 \
  --sync_limit=5 \
  --heap=512M \
  --max_client_cnxns=60 \
  --snap_retain_count=3 \
  --purge_interval=12 \
  --max_session_timeout=40000 \
  --min_session_timeout=4000 \
  --log_level=INFO"
...
Check the logging configuration:
[root@node-131 ~]# kubectl exec zk-0 cat /usr/etc/zookeeper/log4j.properties
zookeeper.root.logger=CONSOLE
zookeeper.console.threshold=INFO
log4j.rootLogger=${zookeeper.root.logger}
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=${zookeeper.console.threshold}
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n
This file is generated by the zkGenConfig.sh script, which controls zookeeper logging; logs are rotated by time and size (logrotate). Because the root logger writes to the CONSOLE appender, recent log output can also be pulled straight from the container, as in the example below.
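For example, pulling the 20 most recent log lines from zk-0:
$ kubectl logs zk-0 --tail 20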
Check the related security context:
The earlier yaml includes:
securityContext:
  runAsUser: 1000
  fsGroup: 1000
This means the zookeeper process runs as an unprivileged user: inside the pod's container, UID 1000 maps to the zookeeper user and GID 1000 to the zookeeper group.
Check the zookeeper processes: they run as the zookeeper user rather than root (one way to confirm is shown below).
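A simple way to check the process owner (the full listing also appears later in this walkthrough):
$ kubectl exec zk-0 -- ps -ef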
Likewise, by default the pod's PV is mounted at the zookeeper server's data directory with access only for root. The security context above is what lets the zookeeper process access that data directory:
Check the data directory permissions:
[root@node-131 ~]# kubectl exec -ti zk-0 -- ls -ld /var/lib/zookeeper/data
drwxrwsr-x 4 zookeeper zookeeper 4096 Dec 10 08:27 /var/lib/zookeeper/data
The data directory is owned by the zookeeper user and group. Because fsGroup is set to 1000, the group ownership of the pod's PV is automatically set to the zookeeper group, which gives the zookeeper process access to the data directory.
Managing the zookeeper process:
The zookeeper documentation mentions that you may want a supervisory process that watches the state of every zookeeper server process in the ensemble and promptly restarts failed ones in a distributed environment. In Kubernetes, the platform itself can play that watchdog role instead of an external tool.
Upgrading the zookeeper component online
The update strategy was already specified in the yaml above:
updateStrategy:
  type: RollingUpdate
You can use kubectl patch to update the pods' cpu request (lowering it from 0.5 to 0.3):
kubectl patch sts zk --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value":"0.3"}]'
statefulset "zk" patched
Watch the update with kubectl rollout status:
[root@node-131 ~]# kubectl rollout status sts/zk
Waiting for 1 pods to be ready...
Waiting for 1 pods to be ready...
waiting for statefulset rolling update to complete 1 pods at revision zk-7c9f9fc76b...
Waiting for 1 pods to be ready...
Waiting for 1 pods to be ready...
waiting for statefulset rolling update to complete 2 pods at revision zk-7c9f9fc76b...
Waiting for 1 pods to be ready...
Waiting for 1 pods to be ready...
statefulset rolling update complete 3 pods at revision zk-7c9f9fc76b...
Check the rollout history:
[root@node-131 ~]# kubectl rollout history sts/zk
statefulsets "zk"
REVISION
1
2
Verify the effect of the upgrade:
[root@node-131 ~]# kubectl get pod zk-0 -o yaml
...
resources:
  requests:
    cpu: 300m
    memory: 1Gi
...
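To check all three replicas at once, a one-liner in the same for-loop style used elsewhere in this walkthrough; the jsonpath expression prints the cpu request of the first container (the echo just adds a newline):
for i in 0 1 2; do kubectl get pod zk-$i -o jsonpath='{.spec.containers[0].resources.requests.cpu}'; echo; done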
Rolling back the upgrade:
[root@node-131 ~]# kubectl rollout undo sts/zk
statefulset "zk" rolled back
Verify the rollback:
[root@node-131 ~]# kubectl rollout history sts/zk
statefulsets "zk"
REVISION
2
3
[root@node-131 ~]# kubectl get pod zk-0 -o yaml
...
resources:
  requests:
    cpu: 500m
    memory: 1Gi
...
Handling failed processes:
As mentioned earlier, no special external tooling needs to be deployed in a Kubernetes cluster, because Kubernetes has built-in health checks and corresponding recovery policies. In this statefulset the Restart Policy is Always, so as soon as a health check fails, the pod is restarted.
To test this, first look at the zk processes:
[root@node-131 ~]# kubectl exec zk-0 -- ps -ef
UID PID PPID C STIME TTY TIME CMD
zookeep+ 1 0 0 08:02 ? 00:00:00 sh -c start-zookeeper --servers=3 --data_dir=/var/lib/zookeeper/data --data_log_dir=/var/lib/zookeeper/data/log --conf_dir=/opt/zookeeper/conf --client_port=2181 --election_port=3888 --server_port=2888 --tick_time=2000 --init_limit=10 --sync_limit=5 --heap=512M --max_client_cnxns=60 --snap_retain_count=3 --purge_interval=12 --max_session_timeout=40000 --min_session_timeout=4000 --log_level=INFO
zookeep+ 7 1 0 08:02 ? 00:00:01 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.root.logger=INFO,CONSOLE -cp /usr/bin/../build/classes:/usr/bin/../build/lib/*.jar:/usr/bin/../share/zookeeper/zookeeper-3.4.10.jar:/usr/bin/../share/zookeeper/slf4j-log4j12-1.6.1.jar:/usr/bin/../share/zookeeper/slf4j-api-1.6.1.jar:/usr/bin/../share/zookeeper/netty-3.10.5.Final.jar:/usr/bin/../share/zookeeper/log4j-1.2.16.jar:/usr/bin/../share/zookeeper/jline-0.9.94.jar:/usr/bin/../src/java/lib/*.jar:/usr/bin/../etc/zookeeper: -Xmx512M -Xms512M -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /usr/bin/../etc/zookeeper/zoo.cfg
Start a watch on the zk pods from one node, then kill the zookeeper process in zk-0 from another node:
[root@node-131 ~]# kubectl exec zk-0 -- pkill java
The watch on node.132 shows:
[root@node-132 ~]# kubectl get pod -w -l app=zk
NAME READY STATUS RESTARTS AGE
zk-0 1/1 Running 0 8m
zk-1 1/1 Running 0 9m
zk-2 1/1 Running 0 11m
zk-0 0/1 Error 0 8m
zk-0 0/1 Running 1 8m
zk-0 1/1 Running 1 8m
Testing the health check:
Judging cluster health purely by whether the process is alive is clearly not enough; there are plenty of situations where the process is alive but unresponsive or otherwise unhealthy. A liveness probe lets you tell Kubernetes that your application is unhealthy and should be restarted.
The liveness probe in this example is:
livenessProbe:
  exec:
    command:
    - sh
    - -c
    - "zookeeper-ready 2181"
The probe uses a simple script that sends the four-letter command ruok to test the server's health:
#!/usr/bin/env bash
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# zkOk.sh uses the ruok ZooKeeper four letter word to determine if the instance
# is healthy. The $? variable will be set to 0 if server responds that it is
# healthy, or 1 if the server fails to respond.
OK=$(echo ruok | nc 127.0.0.1 $1)
if [ "$OK" == "imok" ]; then
    exit 0
else
    exit 1
fi
Verify inside zk-0 that ruok is answered with imok, which means the server is healthy; one way to do this is sketched below.
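A minimal way to send the four-letter command by hand, reusing the same nc invocation the probe script uses (illustrative):
$ kubectl exec zk-0 -- sh -c 'echo ruok | nc 127.0.0.1 2181'
This should print imok.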
Now break things: delete zookeeper-ready on one node, and watch the status from another:
[root@node-131 ~]# kubectl exec zk-0 -- rm /opt/zookeeper-3.4.10/bin/zookeeper-ready
Watching from node.132:
[root@node-132 ~]# kubectl get pod -w -l app=zk
NAME READY STATUS RESTARTS AGE
zk-0 1/1 Running 1 28m
zk-1 1/1 Running 0 29m
zk-2 1/1 Running 0 30m
zk-0 0/1 Running 1 28m
zk-0 0/1 Running 2 29m
zk-0 1/1 Running 2 29m
zk-0 was restarted automatically, even though its zk process itself was fine.
Besides liveness, you can also use a readiness probe:
readinessProbe:
  exec:
    command:
    - sh
    - -c
    - "zookeeper-ready 2181"
  initialDelaySeconds: 15
  timeoutSeconds: 5
liveness: this probe decides when to restart a pod; it is about whether the process is alive at all.
readiness: this probe decides when to route traffic to a pod. The process may be up but still reading large configuration files or data; during that phase the pod is alive, yet no traffic should be sent to it until loading completes.
Tolerance of node failure:
For a three-node zk ensemble, at least 2 zk servers must be running to keep the cluster healthy. To guard against operator mistakes and unreasonable pod placement, you need sensible planning (anti-affinity scheduling) plus a PDB.
First look at the anti-affinity scheduling (podAntiAffinity) used in this example.
Check which nodes the zk pods are currently running on:
[root@node-132 ~]# for i in 0 1 2; do kubectl get pod zk-$i --template {{.spec.nodeName}}; echo ""; done
node.132
node.131
node.134
Why are the three pods spread evenly across three nodes? Because of this section:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: "app"
              operator: In
              values:
              - zk
        topologyKey: "kubernetes.io/hostname"
requiredDuringSchedulingIgnoredDuringExecution: this tells kube-scheduler never to place two pods carrying the app: zk label within the same domain defined by topologyKey.
topologyKey: kubernetes.io/hostname: this defines that domain as an individual node. You can inspect the label on your nodes, as shown below.
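A quick look at the label that topologyKey refers to (standard kubectl; -L adds the label value as an extra column):
$ kubectl get nodes -L kubernetes.io/hostname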
Next, cordon and drain a node and test the effect on the cluster. As seen above, the three zk pods run on node.131, node.132 and node.134.
Cordon node.134:
$ kubectl cordon node.134
Check the pdb status:
[root@node-132 ~]# kubectl get pdb zk-pdb
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
zk-pdb N/A 1 1 1d
max-unavailable: states that at most 1 pod of the zk statefulset may be unavailable.
Watch the status in one terminal:
$ kubectl get pods -w -l app=zk
In another terminal, check which node each pod is currently on:
[root@node-132 ~]# for i in 0 1 2; do kubectl get pod zk-$i --template {{.spec.nodeName}}; echo ""; done
node.132
node.131
node.134
Use kubectl drain to cordon and drain the node that zk-2 runs on (node.134); draining means terminating all pods on that node:
[root@node-132 ~]# kubectl drain $(kubectl get pod zk-2 --template {{.spec.nodeName}}) --ignore-daemonsets --force --delete-local-data
node "node.134" already cordoned
WARNING: Ignoring DaemonSet-managed pods: calico-node-kt5fk, node-exporter-b9wwq; Deleting pods with local storage: elasticsearch-logging-0, monitoring-influxdb-78c4cffd8f-bfjz7, alertmanager-main-1
...
pod "zk-2" evicted
...
node "node.134" drained
The cluster has 4 machines, so after node.134 is drained, zk-2 should automatically be rescheduled onto node.133:
[root@node-131 ~]# kubectl get pods -w -l app=zk
NAME READY STATUS RESTARTS AGE
zk-0 1/1 Running 2 58m
zk-1 1/1 Running 0 59m
zk-2 1/1 Running 0 1h
zk-2 1/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Pending 0 0s
zk-2 0/1 Pending 0 0s
zk-2 0/1 ContainerCreating 0 0s
zk-2 0/1 Running 0 20s
zk-2 1/1 Running 0 34s
Once zk-2 is Running again:
[root@node-132 ~]# for i in 0 1 2; do kubectl get pod zk-$i --template {{.spec.nodeName}}; echo ""; done
node.132
node.131
node.133
Next, drain the node that zk-1 runs on:
[root@node-132 ~]# kubectl drain $(kubectl get pod zk-1 --template {{.spec.nodeName}}) --ignore-daemonsets --force --delete-local-data
node "node.131" cordoned
WARNING: Deleting pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: pvpod, pvpod-sc, test-vmdk; Ignoring DaemonSet-managed pods: calico-node-fgcnz, node-exporter-qgfv5; Deleting pods with local storage: elasticsearch-logging-0, alertmanager-main-0, grafana-7d966ff57-lzwqs, prometheus-k8s-1
...
pod "zk-1" evicted
...
node "node.131" drained
After node.131 is drained, the watching terminal shows:
[root@node-131 ~]# kubectl get pods -w -l app=zk
NAME READY STATUS RESTARTS AGE
zk-0 1/1 Running 2 58m
zk-1 1/1 Running 0 59m
zk-2 1/1 Running 0 1h
zk-2 1/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Pending 0 0s
zk-2 0/1 Pending 0 0s
zk-2 0/1 ContainerCreating 0 0s
zk-2 0/1 Running 0 20s
zk-2 1/1 Running 0 34s
zk-1 1/1 Terminating 0 1h
zk-1 0/1 Terminating 0 1h
zk-1 0/1 Terminating 0 1h
zk-1 0/1 Terminating 0 1h
zk-1 0/1 Pending 0 0s
zk-1 0/1 Pending 0 0s
Check zk-1's events:
[root@node-132 ~]# kubectl describe pod zk-1
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 15s (x13 over 2m) default-scheduler No nodes are available that match all of the predicates: MatchInterPodAffinity (2), NodeUnschedulable (2).
Next, try to drain the node that zk-2 now runs on:
[root@node-132 ~]# kubectl drain $(kubectl get pod zk-2 --template {{.spec.nodeName}}) --ignore-daemonsets --force --delete-local-data
node "node.133" cordoned
...
There are pending pods when an error occurred: Cannot evict pod as it would violate the pod's disruption budget.
pod/zk-2
Test whether the cluster can still serve requests:
[root@node-132 ~]# kubectl exec zk-0 zkCli.sh get /hello
Connecting to localhost:2181
...
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
world
cZxid = 0x100000002
ctime = Sun Dec 10 11:24:11 UTC 2017
mZxid = 0x100000002
mtime = Sun Dec 10 11:24:11 UTC 2017
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
Bring node.134 back (uncordon it):
[root@node-132 ~]# kubectl uncordon node.134
node "node.134" uncordoned
On the watching terminal, zk-1 now has somewhere to go:
[root@node-131 ~]# kubectl get pods -w -l app=zk
NAME READY STATUS RESTARTS AGE
zk-0 1/1 Running 2 58m
zk-1 1/1 Running 0 59m
zk-2 1/1 Running 0 1h
zk-2 1/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Terminating 0 1h
zk-2 0/1 Pending 0 0s
zk-2 0/1 Pending 0 0s
zk-2 0/1 ContainerCreating 0 0s
zk-2 0/1 Running 0 20s
zk-2 1/1 Running 0 34s
zk-1 1/1 Terminating 0 1h
zk-1 0/1 Terminating 0 1h
zk-1 0/1 Terminating 0 1h
zk-1 0/1 Terminating 0 1h
zk-1 0/1 Pending 0 0s
zk-1 0/1 Pending 0 0s
zk-1 0/1 Pending 0 6m
zk-1 0/1 Pending 0 18m
zk-1 0/1 ContainerCreating 0 18m
zk-1 0/1 Running 0 18m
zk-1 1/1 Running 0 18m
Finally, uncordon the nodes that were drained earlier:
[root@node-132 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
node.131 Ready,SchedulingDisabled <none> 3d v1.8.0
node.132 Ready <none> 3d v1.8.0
node.133 Ready,SchedulingDisabled <none> 3d v1.8.0
node.134 Ready <none> 3d v1.8.0
[root@node-132 ~]# kubectl uncordon node.131
node "node.131" uncordoned
[root@node-132 ~]# kubectl uncordon node.133
node "node.133" uncordoned
[root@node-132 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
node.131 Ready <none> 3d v1.8.0
node.132 Ready <none> 3d v1.8.0
node.133 Ready <none> 3d v1.8.0
node.134 Ready <none> 3d v1.8.0
With that, the zookeeper exercise is officially done!
Other posts in this series:
01-Environment preparation
02-etcd cluster setup
03-kubectl management tool
04-master setup
05-node setup
06-addon-calico
07-addon-kubedns
08-addon-dashboard
09-addon-kube-prometheus
10-addon-EFK
11-addon-Harbor
12-addon-ingress-nginx
13-addon-traefik
References:
https://kubernetes.io/docs/tutorials/stateful-application/zookeeper/