Strong self-healing is one of the defining features of a container orchestration engine like Kubernetes. The default form of self-healing is to automatically restart containers that fail. Beyond that, users can configure finer-grained health checks with the Liveness and Readiness probe mechanisms to decide when a failed container should be restarted and when a container is ready to receive traffic.
Let's start with the default health check mechanism in Kubernetes: every container runs a process when it starts, specified by the CMD or ENTRYPOINT of its Dockerfile. If that process exits with a non-zero return code, the container is considered to have failed, and Kubernetes restarts it according to the Pod's restartPolicy.
Below we simulate a container failure:
[k8s@server1 ~]$ cat healthcheck.yml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: healthcheck
  name: healthcheck
spec:
  restartPolicy: OnFailure
  containers:
  - name: healthcheck
    image: busybox
    args:
    - /bin/sh
    - -c
    - sleep 10; exit 1
The Pod's restartPolicy is set to OnFailure; the default is Always.
sleep 10; exit 1 simulates the container failing 10 seconds after it starts.
Create the Pod and watch it for a few minutes; you can see that the container keeps being restarted:
[k8s@server1 ~]$ kubectl apply -f healthcheck.yml
pod/healthcheck created
[k8s@server1 ~]$ kubectl get pod healthcheck
NAME READY STATUS RESTARTS AGE
healthcheck 1/1 Running 0 8s
[k8s@server1 ~]$ kubectl get pod healthcheck
NAME READY STATUS RESTARTS AGE
healthcheck 0/1 CrashLoopBackOff 4 3m34s ## restarted four times
In the example above, the container process returns a non-zero exit code, so Kubernetes considers the container to have failed and restarts it.
In many cases, however, something goes wrong but the process does not exit. For example, a web server returning 500 Internal Server Error may be overloaded or deadlocked on a resource, yet the httpd process itself keeps running. Restarting the container is often the most direct and effective remedy, so how can the Health Check mechanism handle this kind of scenario?
The answer is the Liveness probe.
A Liveness probe lets you define your own condition for whether a container is healthy. If the probe fails, Kubernetes restarts the container.
[k8s@server1 ~]$ cat liveness.yml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness
spec:
  restartPolicy: OnFailure
  containers:
  - name: liveness
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 10
      periodSeconds: 5
The startup command first creates the file /tmp/healthy and deletes it 30 seconds later. In our setup, the container is considered healthy as long as /tmp/healthy exists; once the file is gone, the container is treated as failed.
The livenessProbe section defines how the Liveness probe is executed: the exec probe runs cat /tmp/healthy inside the container and treats a non-zero exit code as a failure; initialDelaySeconds: 10 tells Kubernetes to wait 10 seconds after the container starts before running the first probe (this should normally be longer than the application's startup time); periodSeconds: 5 runs the probe every 5 seconds. By default, the container is restarted after three consecutive failures.
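The core/v1 Probe spec also accepts a few tuning fields that this example leaves at their defaults. A minimal sketch with illustrative values:
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 1        # each probe attempt times out after 1 second (the default)
  successThreshold: 1      # one success marks the container healthy again (must be 1 for liveness)
  failureThreshold: 3      # restart only after 3 consecutive failures (the default)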
Create the Pod:
[k8s@server1 ~]$ kubectl apply -f liveness.yml
pod/liveness created
For the first 30 seconds the probe succeeds and the container is healthy:
[k8s@server1 ~]$ kubectl describe pod liveness
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/liveness to server3
Normal Pulling 25s kubelet, server3 Pulling image "busybox"
Normal Pulled 20s kubelet, server3 Successfully pulled image "busybox"
Normal Created 20s kubelet, server3 Created container liveness
Normal Started 20s kubelet, server3 Started container liveness
About 35 seconds later, kubectl describe pod liveness shows that the Liveness probe has started to fail:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/liveness to server3
Normal Pulled 110s (x3 over 4m22s) kubelet, server3 Successfully pulled image "busybox"
Normal Created 110s (x3 over 4m22s) kubelet, server3 Created container liveness
Normal Started 110s (x3 over 4m22s) kubelet, server3 Started container liveness
Warning Unhealthy 67s (x9 over 3m52s) kubelet, server3 Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal Killing 67s (x3 over 3m42s) kubelet, server3 Container liveness failed liveness probe, will be restarted
Normal Pulling 36s (x4 over 4m27s) kubelet, server3 Pulling image "busybox"
[k8s@server1 ~]$ kubectl get pod liveness
NAME READY STATUS RESTARTS AGE
liveness 1/1 Running 3 4m25s
The RESTARTS column shows that the container has already been restarted several times. Besides the Liveness probe, the Kubernetes Health Check mechanism also includes the Readiness probe. A Liveness probe tells Kubernetes when to restart a container to achieve self-healing; a Readiness probe tells Kubernetes when a container can be added to a Service's load-balancing pool and serve external traffic.
The configuration syntax of a Readiness probe is exactly the same as that of a Liveness probe:
[k8s@server1 ~]$ cat readiness.yml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: readiness
  name: readiness
spec:
  restartPolicy: OnFailure
  containers:
  - name: readiness
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 10
      periodSeconds: 5
Create the Pod:
[k8s@server1 ~]$ kubectl apply -f readiness.yml
pod/readiness created
[k8s@server1 ~]$ kubectl get pod readiness
NAME READY STATUS RESTARTS AGE
readiness 1/1 Running 0 46s
[k8s@server1 ~]$ kubectl get pod readiness
NAME READY STATUS RESTARTS AGE
readiness 0/1 Running 0 46s
The READY state of Pod readiness went through the following changes: just after creation it is 0/1; once the first probe succeeds, it becomes 1/1; about 30 seconds in, /tmp/healthy is deleted, the probe starts failing, and READY drops back to 0/1. Note that, unlike a failed Liveness probe, a failed Readiness probe does not restart the container (RESTARTS stays at 0).
kubectl describe pod readiness also shows the Readiness probe failures in its event log:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/readiness to server3
Normal Pulling 2m8s kubelet, server3 Pulling image "busybox"
Normal Pulled 2m4s kubelet, server3 Successfully pulled image "busybox"
Normal Created 2m4s kubelet, server3 Created container readiness
Normal Started 2m4s kubelet, server3 Started container readiness
Warning Unhealthy 3s (x19 over 93s) kubelet, server3 Readiness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Comparing the Liveness probe with the Readiness probe:
(1) The two probes are configured in exactly the same way; the difference lies in what happens when a probe fails.
(2) A failed Liveness probe causes Kubernetes to restart the container, while a failed Readiness probe marks the container as not ready so that it no longer receives requests forwarded by the Service; the container is not restarted.
(3) The two probes are evaluated independently, so they can be used separately or together on the same container, as sketched below.
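A minimal sketch of a container that declares both probes side by side (the container name, image, and command here are illustrative, not part of the examples above):
containers:
- name: web
  image: busybox
  args:
  - /bin/sh
  - -c
  - touch /tmp/healthy; sleep 30000
  livenessProbe:             # restart the container if this probe fails
    exec:
      command:
      - cat
      - /tmp/healthy
    initialDelaySeconds: 10
    periodSeconds: 5
  readinessProbe:            # stop sending Service traffic to the Pod if this probe fails
    exec:
      command:
      - cat
      - /tmp/healthy
    initialDelaySeconds: 10
    periodSeconds: 5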
For a multi-replica application, when you scale up, the new replicas are added as backends to the Service's load balancer and handle client requests together with the existing replicas.
Applications usually need a preparation phase at startup, such as loading cached data or connecting to a database, so there is a gap between the moment a container starts and the moment it can actually serve requests. A Readiness probe lets us tell whether a container is ready and avoid sending requests to backends that are not yet prepared:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      run: web
  template:
    metadata:
      labels:
        run: web
    spec:
      containers:
      - name: web
        image: httpd
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            scheme: HTTP
            path: /healthy
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: httpd2-svc
spec:
  selector:
    run: web
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 80
(1) The readinessProbe section uses a probe method different from exec: httpGet. Kubernetes considers the probe successful when the HTTP request returns a status code of at least 200 and below 400.
(2) scheme specifies the protocol; HTTP (the default) and HTTPS are supported.
(3) path specifies the request path.
(4) port specifies the port.
The effect of the above configuration: starting 10 seconds after the container starts, Kubernetes sends an HTTP GET request to /healthy on port 8080 every 5 seconds; while the probe succeeds, the container is considered ready and serves requests behind httpd2-svc; once it fails (three consecutive failures by default), the container is marked not ready and stops receiving Service traffic. (This assumes the web application actually exposes a /healthy endpoint on port 8080; the stock httpd image listens on port 80, so adjust the image or the ports if you want to run this example as-is.)
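To see which Pods are actually serving behind the Service at any moment, list its endpoints; only Pods whose Readiness probe is currently passing are included (output omitted here):
[k8s@server1 ~]$ kubectl get endpoints httpd2-svc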
Another important application of Health Check is the Rolling Update.
Imagine a multi-replica application that is running normally, and you roll out an update (for example, a newer image). Kubernetes starts new replicas, and then one of the following happens:
(1) In the normal case, a new replica needs 10 seconds of preparation and cannot serve requests before that.
(2) Because of a configuration mistake, the new replicas never finish their preparation (for example, they cannot connect to the backend database).
What happens if no Health Check is configured?
Because the new replicas do not exit abnormally (the process keeps running), the default health check considers the containers ready, so Kubernetes gradually replaces the existing replicas with new ones. The result: once all the old replicas have been replaced, the whole application can no longer process requests or provide service. On an important production system the consequences would be severe.
With Health Check configured correctly, a new replica is added to the Service only after it passes the Readiness probe; if it never passes, the existing replicas are not all replaced and the business keeps running normally.
Let's walk through an example of using Health Check in a Rolling Update.
The configuration file app.v1.yml below simulates an application with 10 replicas:
[k8s@server1 ~]$ cat app.v1.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 10
  selector:
    matchLabels:
      run: app
  template:
    metadata:
      labels:
        run: app
    spec:
      containers:
      - name: app
        image: busybox
        args:
        - /bin/sh
        - -c
        - sleep 10; touch /tmp/healthy; sleep 30000
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10
          periodSeconds: 5
After 10 seconds the replicas pass the Readiness probe:
[k8s@server1 ~]$ kubectl get deployment app
NAME READY UP-TO-DATE AVAILABLE AGE
app 10/10 10 10 30s
Next, perform a rolling update using the configuration file app.v2.yml:
[k8s@server1 ~]$ cat app.v2.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 10
  selector:
    matchLabels:
      run: app
  template:
    metadata:
      labels:
        run: app
    spec:
      containers:
      - name: app
        image: busybox
        args:
        - /bin/sh
        - -c
        - sleep 3000
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10
          periodSeconds: 5
Obviously, since /tmp/healthy does not exist in the new replicas, they can never pass the Readiness probe:
[k8s@server1 ~]$ kubectl apply -f app.v2.yml --record
deployment.apps/app configured
[k8s@server1 ~]$ kubectl get deployment app
NAME READY UP-TO-DATE AVAILABLE AGE
app 8/10 5 8 55s
[k8s@server1 ~]$ kubectl get pod
NAME READY STATUS RESTARTS AGE
app-5bb6568bb9-5g426 0/1 Running 0 17s
app-5bb6568bb9-75xc2 0/1 Running 0 17s
app-5bb6568bb9-v8lfw 0/1 Running 0 17s
app-5bb6568bb9-wlt74 0/1 Running 0 17s
app-5bb6568bb9-ww2nk 0/1 Running 0 17s
app-6d76c4459d-2pf7z 1/1 Running 0 68s
app-6d76c4459d-62dzv 1/1 Terminating 0 68s
app-6d76c4459d-cft7n 1/1 Running 0 68s
app-6d76c4459d-jxpnc 1/1 Running 0 68s
app-6d76c4459d-sz5mx 1/1 Running 0 68s
app-6d76c4459d-t2f6k 1/1 Running 0 68s
app-6d76c4459d-vl27p 1/1 Terminating 0 68s
app-6d76c4459d-vvn6p 1/1 Running 0 68s
app-6d76c4459d-wvm6b 1/1 Running 0 68s
app-6d76c4459d-xfz8d 1/1 Running 0 68s
Look first at the kubectl get pod output:
(1) Judging from the AGE column, the last 5 Pods are the new replicas, currently NOT READY.
(2) The old replicas have been reduced from the original 10 to 8.
Now look at the kubectl get deployment app output:
(1) READY 8/10 means 10 replicas are desired and 8 of them are currently ready, namely the 8 old replicas.
(2) UP-TO-DATE 5 means 5 replicas have been updated to the new version, namely the 5 new replicas.
(3) AVAILABLE 8 means 8 replicas are currently available, namely the 8 old replicas.
(4) Counting the Pods listed by kubectl get pod, there are 13 in total: 8 old replicas plus 5 new ones.
By design, the new replicas will never pass the Readiness probe, so this state persists indefinitely.
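You can also watch the stalled rollout directly: kubectl rollout status keeps waiting for the new replicas to become ready and, with the default progress deadline, eventually reports that the deployment has exceeded it (output omitted here):
[k8s@server1 ~]$ kubectl rollout status deployment app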
What we simulated above is a rolling update that failed. Fortunately, Health Check shielded us from the defective replicas while keeping most of the old ones, so the business was not affected by the failed update.
The next question to answer is: why were 5 new replicas created while only 2 old replicas were destroyed?
The reason is that a rolling update controls the pace of replacement through the parameters maxSurge and maxUnavailable.
maxSurge
This parameter caps how far the total number of replicas may exceed DESIRED during a rolling update. It can be an absolute number (for example 3) or a percentage, rounded up; the default is 25%.
In the example above, DESIRED is 10, so the maximum total is roundUp(10 + 10 * 25%) = 13, which matches the 13 Pods (8 old + 5 new) we observed.
maxUnavailable
This parameter caps the proportion of replicas that may be unavailable relative to DESIRED during a rolling update. It can be an absolute number (for example 3) or a percentage, rounded down; the default is 25%.
In the example above, DESIRED is 10, so at least 10 - roundDown(10 * 25%) = 8 replicas must remain available, which is why AVAILABLE shows 8.
The larger maxSurge is, the more new replicas are created up front; the larger maxUnavailable is, the more old replicas are destroyed up front.
Ideally, the rolling update in this example would proceed as follows:
(1) Create 3 new replicas, bringing the total to 13.
(2) Destroy 2 old replicas, dropping the number of available replicas to 8.
(3) Once those 2 old replicas are gone, create 2 more new replicas, keeping the total at 13.
(4) As new replicas pass the Readiness probe, the number of available replicas rises above 8.
(5) More old replicas can then be destroyed, bringing the available count back down to 8.
(6) Destroying old replicas brings the total below 13, which allows more new replicas to be created.
(7) The process repeats until all the old replicas have been replaced and the rolling update is complete.
In our case, however, the process is stuck at step (4): the new replicas never pass the Readiness probe.
The first few steps can be traced in the event log of kubectl describe deployment app:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 10m deployment-controller Scaled up replica set app-6d76c4459d to 10
Normal ScalingReplicaSet 9m56s deployment-controller Scaled up replica set app-5bb6568bb9 to 3
Normal ScalingReplicaSet 9m56s deployment-controller Scaled down replica set app-6d76c4459d to 8
Normal ScalingReplicaSet 9m56s deployment-controller Scaled up replica set app-5bb6568bb9 to 5
If a rolling update fails, you can roll back to the previous version with kubectl rollout undo:
[k8s@server1 ~]$ kubectl rollout history deployment app
deployment.apps/app
REVISION CHANGE-CAUSE
1 kubectl apply --filename=app.v1.yml --record=true
2 kubectl apply --filename=app.v2.yml --record=true
[k8s@server1 ~]$ kubectl rollout undo deployment app --to-revision=1
deployment.apps/app rolled back
[k8s@server1 ~]$ kubectl get deployment app
NAME READY UP-TO-DATE AVAILABLE AGE
app 10/10 10 10 12m
To customize maxSurge and maxUnavailable, set them under spec.strategy.rollingUpdate:
[k8s@server1 ~]$ cat app.v2.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  strategy:
    rollingUpdate:
      maxSurge: 35%
      maxUnavailable: 35%
  replicas: 10
  selector:
    matchLabels:
      run: app
  template:
    metadata:
      labels:
        run: app
    spec:
      containers:
      - name: app
        image: busybox
        args:
        - /bin/sh
        - -c
        - sleep 3000
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10
          periodSeconds: 5
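Applying the same formulas to these values (a quick check, not taken from an actual run): with DESIRED = 10, maxSurge: 35% allows at most roundUp(10 + 10 * 35%) = 14 replicas in total during the update, and maxUnavailable: 35% requires at least 10 - roundDown(10 * 35%) = 7 replicas to remain available.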
To summarize: we discussed the two Kubernetes health check mechanisms, the Liveness probe and the Readiness probe, and practiced applying health checks in Scale Up and Rolling Update scenarios.