Author: [email protected]
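Abstract: The rolling update mechanism of a Kubernetes Deployment differs from the ReplicationController rolling update. A Deployment rollout also provides rollout progress queries, rollout history, and rollback, which makes it the first choice for rolling out applications on Kubernetes. This post walks through the features that are easy to overlook or misunderstand.
Take the frontend Deployment below as an example, and pay particular attention to .spec.minReadySeconds, .spec.strategy.rollingUpdate.maxSurge, and .spec.strategy.rollingUpdate.maxUnavailable.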
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: frontend
spec:
  minReadySeconds: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 3
      maxUnavailable: 2
  replicas: 25
  template:
    metadata:
      labels:
        app: guestbook
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: gcr.io/google-samples/gb-frontend:v4
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
          # If your cluster config does not include a dns service, then to
          # instead access environment variables to find service host
          # info, comment out the 'value: dns' line above, and uncomment the
          # line below:
          # value: env
        ports:
        - containerPort: 80
.spec.minReadySeconds: a newly created Pod is considered Available only after it has been Ready for at least .spec.minReadySeconds seconds.
.spec.strategy.rollingUpdate.maxSurge: specifies the maximum number of Pods that can be created over the desired number of Pods. The value cannot be 0 if maxUnavailable is 0. It can be an integer or a percentage, and defaults to 25% of the desired Pods. When the new ReplicaSet is scaled up, a percentage is converted to the allowed maxSurge by rounding up (for example, 3.4 becomes 4).
.spec.strategy.rollingUpdate.maxUnavailable: specifies the maximum number of Pods that can be unavailable during the update process. The value cannot be 0 if maxSurge is 0. It can be an integer or a percentage, and defaults to 25% of the desired Pods. When the old ReplicaSet is scaled down, a percentage is converted to the allowed maxUnavailable by rounding down (for example, 3.6 becomes 3).
Therefore, during a Deployment rollout the number of Available Pods must never drop below desired pods number - maxUnavailable, and the total number of Pods must never exceed desired pods number + maxSurge.
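The rounding rules and the two bounds above can be written as a small calculation. The following is a minimal Python sketch, not the Deployment controller's code: the function names resolve and rollout_bounds are made up for illustration, and rejecting the case where both values resolve to 0 is a simplification of the rule quoted above.

```python
def resolve(value, desired, round_up):
    """Turn an integer or a percentage string such as "25%" into a Pod count."""
    if isinstance(value, int):
        return value
    scaled = int(value.rstrip("%")) * desired
    # Ceiling or floor of scaled / 100, using integer arithmetic only.
    return -(-scaled // 100) if round_up else scaled // 100


def rollout_bounds(desired, max_surge="25%", max_unavailable="25%"):
    """Return (minimum Available Pods, maximum total Pods) during a rollout."""
    surge = resolve(max_surge, desired, round_up=True)               # rounds up
    unavailable = resolve(max_unavailable, desired, round_up=False)  # rounds down
    if surge == 0 and unavailable == 0:
        # Simplification: this sketch just refuses the combination.
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return desired - unavailable, desired + surge


print(rollout_bounds(25, 3, 2))  # (23, 28), the frontend example
print(rollout_bounds(25))        # (19, 32), with both values at 25%
```

With the frontend settings, at least 23 Pods must stay Available and at most 28 Pods may exist at any moment of the rollout.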
Note: A Deployment’s rollout is triggered if and only if the Deployment’s pod template (that is, .spec.template) is changed, for example if the labels or container images of the template are updated. Other updates, such as scaling the Deployment, do not trigger a rollout.
Let's continue with the Deployment above and consider the most common case, updating the image (releasing a new version):
kubectl set image deploy frontend php-redis=gcr.io/google-samples/gb-frontend:v3 --record
Running set image changes the Deployment's Pod template, which triggers a rollout. We only consider the RollingUpdate strategy here (Kubernetes also supports the Recreate strategy). Use kubectl get rs -w to watch the ReplicaSets change.
[root@master03 ~]# kubectl get rs -w
NAME DESIRED CURRENT READY AGE
frontend-3114648124 25 25 25 14m
frontend-3099797709 0 0 0 1h
frontend-3099797709 0 0 0 1h
frontend-3099797709 3 0 0 1h
frontend-3114648124 23 25 25 17m
frontend-3099797709 5 0 0 1h
frontend-3114648124 23 25 25 17m
frontend-3114648124 23 23 23 17m
frontend-3099797709 5 0 0 1h
frontend-3099797709 5 3 0 1h
frontend-3099797709 5 5 0 1h
frontend-3099797709 5 5 1 1h
frontend-3114648124 22 23 23 17m
frontend-3099797709 5 5 2 1h
frontend-3114648124 22 23 23 17m
frontend-3114648124 22 22 22 17m
frontend-3099797709 6 5 2 1h
frontend-3114648124 21 22 22 17m
frontend-3099797709 6 5 2 1h
frontend-3114648124 21 22 22 17m
frontend-3099797709 7 5 2 1h
frontend-3099797709 7 6 2 1h
frontend-3114648124 21 21 21 17m
frontend-3099797709 7 6 2 1h
frontend-3099797709 7 7 2 1h
frontend-3099797709 7 7 2 1h
frontend-3099797709 7 7 3 1h
frontend-3099797709 7 7 4 1h
frontend-3114648124 20 21 21 17m
frontend-3099797709 8 7 4 1h
frontend-3114648124 20 21 21 17m
frontend-3114648124 20 20 20 17m
frontend-3099797709 8 7 4 1h
frontend-3099797709 8 8 4 1h
frontend-3099797709 8 8 5 1h
frontend-3114648124 19 20 20 17m
frontend-3099797709 9 8 5 1h
frontend-3114648124 19 20 20 17m
frontend-3099797709 9 8 5 1h
frontend-3099797709 9 9 5 1h
frontend-3114648124 19 19 19 17m
frontend-3099797709 9 9 5 1h
frontend-3114648124 18 19 19 18m
frontend-3099797709 10 9 5 1h
frontend-3114648124 18 19 19 18m
frontend-3099797709 10 9 5 1h
frontend-3114648124 18 18 18 18m
frontend-3099797709 10 10 5 1h
frontend-3099797709 10 10 5 1h
frontend-3114648124 18 18 18 18m
frontend-3099797709 10 10 6 1h
frontend-3099797709 10 10 6 1h
frontend-3114648124 17 18 18 18m
frontend-3114648124 17 18 18 18m
frontend-3099797709 11 10 6 1h
frontend-3099797709 11 10 6 1h
frontend-3114648124 17 17 17 18m
frontend-3099797709 11 11 6 1h
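Notes:
1. frontend-3114648124 is the original RS (call it OldRS) and frontend-3099797709 is the RS being rolled out (call it NewRS; it may itself be an existing RS if the same Pod template was deployed before).
2. maxSurge=3, maxUnavailable=2, desired replicas=25.
   - Upper bound on the total number of Pods: desired replicas + maxSurge (28).
   - Lower bound on the number of Available Pods: desired replicas - maxUnavailable (23).
   - Whenever the total number of Pods leaves headroom below desired replicas + maxSurge, more new Pods are created.
   - This continues until the number of new Pods reaches desired replicas and all of them are Ready; the remaining old Pods are then deleted, and the rollout is complete.
Now consider this case: a user starts a rolling update and, without waiting for it to finish, issues another rolling update. What happens to the rollout in the background? Does it turn into a mess?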
Let's look at this with the same example:
# While deploy frontend is running stably on v2 (frontend-888714875):
[root@master03 ~]# kubectl get rs -w
NAME DESIRED CURRENT READY AGE
==== Run: kubectl set image deploy frontend php-redis=gcr.io/google-samples/gb-frontend:v3 --record ====
---- Note: v3 --> frontend-776431694 ----
frontend-776431694 0 0 0 6h
frontend-776431694 0 0 0 6h
frontend-776431694 3 0 0 6h
frontend-888714875 23 25 25 5h
frontend-776431694 5 0 0 6h
frontend-888714875 23 25 25 5h
frontend-888714875 23 23 23 5h
frontend-776431694 5 0 0 6h
frontend-776431694 5 3 0 6h
frontend-776431694 5 5 0 6h
frontend-776431694 5 5 1 6h
frontend-776431694 5 5 2 6h
frontend-776431694 5 5 3 6h
frontend-776431694 5 5 4 6h
frontend-776431694 5 5 4 6h
frontend-888714875 22 23 23 5h
frontend-776431694 6 5 4 6h
frontend-888714875 22 23 23 5h
frontend-888714875 22 22 22 5h
frontend-776431694 6 5 4 6h
frontend-776431694 6 6 4 6h
frontend-776431694 6 6 4 6h
frontend-888714875 19 22 22 5h
frontend-776431694 9 6 4 6h
frontend-888714875 19 22 22 5h
frontend-776431694 9 6 4 6h
frontend-888714875 19 19 19 5h
frontend-776431694 9 9 4 6h
frontend-888714875 19 19 19 5h
==== Run: kubectl set image deploy frontend php-redis=gcr.io/google-samples/gb-frontend:v4 --record ====
---- Note: v4 --> frontend-3099797709 ----
frontend-3099797709 0 0 0 6h
frontend-3099797709 0 0 0 6h
frontend-776431694 4 9 4 6h
frontend-3099797709 5 0 0 6h
frontend-3099797709 5 0 0 6h
frontend-3099797709 5 5 0 6h
frontend-776431694 4 9 4 6h
frontend-776431694 4 4 4 6h
frontend-3099797709 5 5 0 6h
frontend-3099797709 5 5 1 6h
frontend-3099797709 5 5 2 6h
frontend-3099797709 5 5 3 6h
frontend-3099797709 5 5 4 6h
frontend-3099797709 5 5 4 6h
frontend-776431694 2 4 4 6h
frontend-3099797709 7 5 4 6h
frontend-776431694 2 4 4 6h
frontend-776431694 2 2 2 6h
frontend-776431694 2 2 2 6h
frontend-3099797709 7 5 4 6h
frontend-776431694 0 2 2 6h
frontend-3099797709 7 7 4 6h
frontend-776431694 0 2 2 6h
frontend-3099797709 9 7 4 6h
frontend-776431694 0 0 0 6h
frontend-3099797709 9 7 4 6h
frontend-3099797709 9 9 4 6h
frontend-776431694 0 0 0 6h
frontend-3099797709 9 9 4 6h
frontend-3099797709 9 9 5 6h
frontend-3099797709 9 9 6 6h
frontend-3099797709 9 9 7 6h
frontend-888714875 17 19 19 5h
frontend-3099797709 11 9 7 6h
frontend-888714875 17 19 19 5h
frontend-888714875 17 17 17 5h
frontend-3099797709 11 9 7 6h
frontend-888714875 16 17 17 5h
frontend-3099797709 11 11 7 6h
frontend-3099797709 12 11 7 6h
frontend-888714875 16 17 17 5h
frontend-888714875 16 16 16 5h
frontend-3099797709 12 11 7 6h
frontend-3099797709 12 12 7 6h
frontend-3099797709 12 12 8 6h
frontend-3099797709 12 12 8 6h
frontend-888714875 15 16 16 5h
frontend-3099797709 13 12 8 6h
frontend-888714875 15 16 16 5h
frontend-888714875 15 15 15 5h
frontend-3099797709 13 12 8 6h
frontend-3099797709 13 13 8 6h
frontend-3099797709 13 13 8 6h
frontend-3099797709 13 13 9 6h
frontend-3099797709 13 13 10 6h
frontend-888714875 14 15 15 5h
frontend-3099797709 14 13 10 6h
frontend-888714875 14 15 15 5h
frontend-888714875 14 14 14 5h
frontend-3099797709 14 13 10 6h
frontend-888714875 14 14 14 5h
frontend-3099797709 14 14 11 6h
frontend-3099797709 14 14 12 6h
frontend-3099797709 14 14 12 6h
frontend-3099797709 14 14 12 6h
frontend-888714875 11 14 14 5h
frontend-3099797709 17 14 12 6h
frontend-888714875 11 14 14 5h
frontend-3099797709 17 14 12 6h
frontend-888714875 11 11 11 5h
frontend-3099797709 17 17 12 6h
frontend-888714875 11 11 11 5h
frontend-3099797709 17 17 12 6h
frontend-3099797709 17 17 13 6h
frontend-3099797709 17 17 14 6h
frontend-3099797709 17 17 14 6h
frontend-888714875 10 11 11 5h
frontend-3099797709 18 17 14 6h
frontend-888714875 10 11 11 5h
frontend-888714875 10 10 10 5h
frontend-3099797709 18 17 14 6h
frontend-3099797709 18 18 14 6h
frontend-3099797709 18 18 15 6h
frontend-888714875 9 10 10 5h
frontend-3099797709 18 18 16 6h
frontend-888714875 9 10 10 5h
frontend-3099797709 19 18 16 6h
frontend-3099797709 19 18 16 6h
frontend-888714875 9 9 9 5h
frontend-888714875 7 9 9 5h
frontend-3099797709 19 18 16 6h
frontend-888714875 7 9 9 5h
frontend-3099797709 21 18 16 6h
frontend-888714875 7 9 9 5h
frontend-3099797709 21 19 16 6h
frontend-888714875 7 7 7 5h
frontend-3099797709 21 21 16 6h
frontend-888714875 7 7 7 5h
frontend-3099797709 21 21 17 6h
frontend-3099797709 21 21 18 6h
frontend-3099797709 21 21 18 6h
frontend-888714875 5 7 7 5h
frontend-888714875 5 7 7 5h
frontend-3099797709 23 21 18 6h
frontend-888714875 5 5 5 5h
frontend-3099797709 23 21 18 6h
frontend-3099797709 23 23 18 6h
frontend-3099797709 23 23 18 6h
frontend-3099797709 23 23 19 6h
frontend-3099797709 23 23 20 6h
frontend-3099797709 23 23 20 6h
frontend-888714875 3 5 5 5h
frontend-3099797709 25 23 20 6h
frontend-888714875 3 5 5 5h
frontend-888714875 3 3 3 5h
frontend-3099797709 25 23 20 6h
frontend-888714875 3 3 3 5h
frontend-3099797709 25 25 20 6h
frontend-3099797709 25 25 21 6h
frontend-3099797709 25 25 22 6h
frontend-3099797709 25 25 22 6h
frontend-888714875 2 3 3 5h
frontend-888714875 2 3 3 5h
frontend-888714875 2 2 2 5h
frontend-888714875 2 2 2 5h
frontend-3099797709 25 25 23 6h
frontend-888714875 1 2 2 5h
frontend-888714875 1 2 2 5h
frontend-888714875 1 1 1 5h
frontend-3099797709 25 25 23 6h
frontend-888714875 0 1 1 5h
frontend-888714875 0 1 1 5h
frontend-888714875 0 0 0 5h
frontend-3099797709 25 25 24 6h
frontend-3099797709 25 25 25 6h
frontend-3099797709 25 25 25 6h
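Notes:
The frontend Deployment was running stably on v2 (RS: frontend-888714875) when kubectl set image triggered a rolling update to v3 (RS: frontend-776431694). When the v3 RS had been scaled up to a desired count of 9, with 4 Pods Ready, the user ran kubectl set image again and triggered a rolling update to v4 (RS: frontend-3099797709). Note that in my own experiments I had created the v4 RS first, then the v3 RS, then the v2 RS, so ordered by creation time from newest to oldest the RSs are v2 --> v3 --> v4. In the output, the v3 RS (the older of the two old RSs) is scaled down to 0 first, and only then does the v2 RS start to shrink.
Imagine a more complex scenario: while the v4 rollout is still replacing the half-finished v3 RS, the user triggers yet another rolling update, to v5. What happens then?
Don't worry, the principle is the same. A Deployment rolling update always replaces the oldest RS first and then works its way through the newer ones; once the newest of the old RSs has been scaled down to 0, the rollout is finished.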
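The oldest-first ordering described above can be modelled in a few lines. This is a simplified Python sketch of that rule only, not the controller's implementation; the function plan_scale_down and the budget argument (how many old Pods may be removed in one step) are hypothetical, and the ages are read from the AGE column of the watch output.

```python
def plan_scale_down(old_replica_sets, budget):
    """Spread `budget` Pod deletions over old ReplicaSets, oldest first.

    old_replica_sets: list of (name, age_hours, replicas) tuples
    budget: number of old Pods that may be removed in this step (hypothetical)
    """
    plan = []
    # Largest age first, so the earliest-created ReplicaSet is drained first.
    ordered = sorted(old_replica_sets, key=lambda rs: rs[1], reverse=True)
    for name, _age, replicas in ordered:
        if budget <= 0:
            break
        removed = min(replicas, budget)
        if removed > 0:
            plan.append((name, removed))
            budget -= removed
    return plan


old = [("frontend-888714875", 5, 19), ("frontend-776431694", 6, 4)]
print(plan_scale_down(old, 6))
# [('frontend-776431694', 4), ('frontend-888714875', 2)]
```

The v3 RS (frontend-776431694) is emptied before any Pod is taken from the v2 RS, which matches the order seen in the watch output.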
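Many people probably still believe that, at any point during a rolling update, running kubectl rollout pause deploy/frontend stops the in-progress rollout immediately, and that kubectl rollout resume deploy/frontend then continues the unfinished rolling update. That is completely wrong!
kubectl rollout pause only prevents the next rollout from being triggered. What does that mean? In the scenario described above, the rollout that is already running does not stop; it keeps rolling as normal until it completes. The next time the user triggers a rollout, the Deployment does not actually start the rolling update. It waits until the user runs kubectl rollout resume, and only then does the process really start.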
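First, a word about --record:
Setting the kubectl flag --record to true allows you to record current command in the annotations of the resources being created or updated.
By default, every revision produced this way, including the command recorded by kubectl xxxx --record, is persisted in etcd by Kubernetes, which consumes resources. More importantly, over time kubectl get rs will return hundreds or even thousands of stale RSs, and the output becomes very hard to read.
In production it is best to set the Deployment's .spec.revisionHistoryLimit to cap the number of revisions that are retained, for example 15. A rollback usually only needs to reach one of the most recent few revisions.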
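One way to apply such a limit to an existing Deployment is kubectl patch. The sketch below only builds the command line in Python so the JSON is quoted correctly; the helper name history_limit_patch_command is made up for illustration, and the limit of 15 is just the example value from the text.

```python
import json
import shlex


def history_limit_patch_command(deployment, limit):
    """Build a kubectl patch command that sets .spec.revisionHistoryLimit."""
    patch = json.dumps({"spec": {"revisionHistoryLimit": limit}})
    # shlex.quote wraps the JSON in single quotes so a shell passes it as one argument.
    return "kubectl patch deployment {} -p {}".format(
        shlex.quote(deployment), shlex.quote(patch)
    )


print(history_limit_patch_command("frontend", 15))
# kubectl patch deployment frontend -p '{"spec": {"revisionHistoryLimit": 15}}'
```

Setting the same field directly in the Deployment manifest under spec has the same effect and keeps the value under version control.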
The following command lists all recorded revisions of a Deployment:
$ kubectl rollout history deployment/nginx-deployment
deployments "nginx-deployment"
REVISION CHANGE-CAUSE
1 kubectl create -f docs/user-guide/nginx-deployment.yaml --record
2 kubectl set image deployment/nginx-deployment nginx=nginx:1.9.1
3 kubectl set image deployment/nginx-deployment nginx=nginx:1.91
Then run rollout undo to roll back to the revision given by --to-revision.
kubectl rollout undo deployment/nginx-deployment --to-revision=2
deployment "nginx-deployment" rolled back
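Each revision recorded in rollout history corresponds one-to-one to a ReplicaSet. If you manually delete a ReplicaSet, the corresponding rollout history entry is deleted as well, which means you can no longer roll back to that revision.
The mapping between rollout history and ReplicaSets can be read from the revision field in the output of kubectl describe rs $RSNAME; that revision is the one shown by rollout history.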
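The same mapping can be built for all ReplicaSets at once from the deployment.kubernetes.io/revision annotation, which is where the revision is stored on each ReplicaSet. The Python sketch below assumes input shaped like the items list of kubectl get rs -o json; the function name and the sample revision numbers are made up for illustration.

```python
REVISION_KEY = "deployment.kubernetes.io/revision"


def revisions_by_replicaset(items):
    """Map revision number -> ReplicaSet name from ReplicaSet objects."""
    result = {}
    for item in items:
        meta = item["metadata"]
        revision = meta.get("annotations", {}).get(REVISION_KEY)
        if revision is not None:  # skip ReplicaSets not managed by a Deployment
            result[int(revision)] = meta["name"]
    return result


# Hypothetical sample data; real input would come from `kubectl get rs -o json`.
sample = [
    {"metadata": {"name": "frontend-888714875", "annotations": {REVISION_KEY: "7"}}},
    {"metadata": {"name": "frontend-776431694", "annotations": {REVISION_KEY: "8"}}},
    {"metadata": {"name": "standalone-rs"}},
]
print(revisions_by_replicaset(sample))
# {7: 'frontend-888714875', 8: 'frontend-776431694'}
```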
By running rollout undo with --to-revision, a user can roll the Deployment back to a specific revision.
kubectl rollout undo deploy frontend --to-revision=7
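Watching the RSs behind the Deployment shows that a rollback also proceeds as a rolling update and obeys the same maxSurge and maxUnavailable constraints. It does not delete all Pods at once and then create all the new Pods at once.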
[root@master01 ~]# kubectl get rs -w
NAME DESIRED CURRENT READY AGE
frontend-888714875 3 0 0 23h
frontend-776431694 8 10 10 23h
frontend-888714875 5 0 0 23h
frontend-776431694 8 10 10 23h
frontend-776431694 8 8 8 23h
frontend-888714875 5 0 0 23h
frontend-888714875 5 3 0 23h
frontend-888714875 5 5 0 23h
frontend-888714875 5 5 1 23h
frontend-888714875 5 5 2 23h
frontend-888714875 5 5 4 23h
frontend-776431694 6 8 8 23h
frontend-888714875 5 5 4 23h
frontend-888714875 5 5 5 23h
frontend-776431694 6 8 8 23h
frontend-888714875 7 5 5 23h
frontend-776431694 6 6 6 23h
frontend-776431694 3 6 6 23h
frontend-888714875 10 5 5 23h
frontend-776431694 3 6 6 23h
frontend-776431694 3 3 3 23h
frontend-888714875 10 5 5 23h
frontend-776431694 3 3 3 23h
frontend-888714875 10 7 5 23h
frontend-888714875 10 10 5 23h
frontend-888714875 10 10 6 23h
frontend-888714875 10 10 7 23h
frontend-888714875 10 10 8 23h
frontend-888714875 10 10 8 23h
frontend-888714875 10 10 9 23h
frontend-888714875 10 10 9 23h
frontend-888714875 10 10 9 23h
frontend-776431694 0 3 3 23h
frontend-776431694 0 3 3 23h
frontend-776431694 0 0 0 23h
frontend-888714875 10 10 10 23h
frontend-888714875 10 10 10 23h
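The claim that the rollback respects both bounds can be checked by replaying the watch output. The following Python sketch is an illustration under several assumptions: desired replicas is taken to be 10 (inferred from the output), the state before the first row is assumed to be 10 Ready Pods in frontend-776431694, the READY column is used as a stand-in for Available, and check_watch is a made-up helper name. Only an excerpt of the rows is replayed.

```python
def check_watch(rows, desired, max_surge, max_unavailable, initial):
    """Replay `kubectl get rs -w` rows and return the rows that break a bound.

    rows:    "NAME DESIRED CURRENT READY AGE" lines, header excluded
    initial: {rs_name: (current, ready)} before the first row (an assumption)
    """
    state = dict(initial)
    violations = []
    for row in rows:
        name, _desired, current, ready, _age = row.split()
        state[name] = (int(current), int(ready))
        total_current = sum(c for c, _ in state.values())
        total_ready = sum(r for _, r in state.values())
        too_many = total_current > desired + max_surge
        too_few = total_ready < desired - max_unavailable
        if too_many or too_few:
            violations.append(row)
    return violations


rollback_rows = [
    "frontend-888714875 3 0 0 23h",
    "frontend-776431694 8 10 10 23h",
    "frontend-776431694 8 8 8 23h",
    "frontend-888714875 5 3 0 23h",
    "frontend-888714875 5 5 0 23h",
    "frontend-888714875 5 5 5 23h",
    "frontend-776431694 6 6 6 23h",
]
initial = {"frontend-776431694": (10, 10), "frontend-888714875": (0, 0)}
print(check_watch(rollback_rows, 10, 3, 2, initial))  # []
```

An empty list means that, for the replayed rows, the total never exceeded 13 Pods and the Ready count never dropped below 8.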
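This post covered the features of Deployment rolling update that are easy to overlook or misunderstand. If your reaction after reading it is "Well, of course that is how it works!", then you already know the Deployment Controller very well.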