Pod Affinity Scheduling

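Preface:
Because affinity gives finer control over Pod scheduling, we have gradually used it in place of nodeSelector. It comes in two forms: node affinity and pod affinity.
1) Node affinity: offers the same hard constraint as nodeSelector, plus soft constraints that can carry weights.
2) Pod affinity: schedules a Pod according to the labels of Pods already running on a node (rather than the node's own labels), so both a node condition and a Pod condition have to match.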

1. Node Affinity

1.1 Preset node labels

Some labels are preset on every node and can be used directly.

  • View the labels on a master
[root@DoM01 ~]# kubectl get node dom03 --show-labels
NAME    STATUS   ROLES    AGE   VERSION   LABELS
dom03   Ready    master   52d   v1.15.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=dom03,kubernetes.io/os=linux,node-role.kubernetes.io/master=
  • That format is hard to read, so let's use describe instead
[root@DoM01 ~]# kubectl describe node dom01
Name:               dom01
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=dom01
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
……

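Notes:
"beta.kubernetes.io/arch=amd64" and "beta.kubernetes.io/os=linux" are deprecated (since v1.14, with removal originally targeted for v1.18); use kubernetes.io/arch and kubernetes.io/os instead.
The label names are self-explanatory, so no further explanation is needed.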
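
The preset labels can be referenced directly in a manifest. As a minimal sketch (the Pod name arch-demo is a made-up placeholder, not from the original), the following uses nodeSelector to pin a Pod to linux/amd64 nodes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: arch-demo            # hypothetical name, for illustration only
  namespace: test
spec:
  # nodeSelector is a plain key/value match; every entry must match the node
  nodeSelector:
    kubernetes.io/os: linux
    kubernetes.io/arch: amd64
  containers:
  - name: arch-demo
    image: nginx
```

The non-beta label keys are used here because the beta.* variants are deprecated.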

1.2 Custom labels

1.2.1 Add a label to a node

  • Syntax
# kubectl label node <node-name> <key>=<value>
  • Example
[root@DoM01 ~]# kubectl label node don01 zone=east
node/don01 labeled
  • Verify
    As shown below, don01 now carries the label zone=east
[root@DoM01 ~]# kubectl describe node don01
Name:               don01
Roles:              
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=don01
                    kubernetes.io/os=linux
                    zone=east

1.2.2 Modify a label

Note: add the --overwrite flag

# kubectl label node don01 zone=south --overwrite

1.2.3 Delete a label

Note: to delete the label whose key is zone, append a minus sign to the key

# kubectl label node don01 zone-

1.3 Required

  • Overview:
    requiredDuringSchedulingIgnoredDuringExecution is a hard constraint: a Pod can be scheduled onto a node only if the node satisfies it (much like nodeSelector).

  • Example

apiVersion: v1
kind: Pod
metadata:
  name: nginxtest
  namespace: test
spec:
  affinity:
    # node affinity scheduling
    nodeAffinity:
      # hard requirement: must be met at scheduling time
      requiredDuringSchedulingIgnoredDuringExecution:
        # the terms used to select nodes
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - "east"
  containers:
    - name: nginxtest
      image: harbocto.boe.com.cn/public/nginx
  • Create and check the result
[root@DoM01 test]# kubectl create -f nginx.yml
pod/nginxtest created
[root@DoM01 test]# kubectl get pod -n test -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
nginxtest   1/1     Running   0          91s   10.244.5.166   don03              

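Note: as the output shows, the Pod was scheduled onto don03, a node labeled zone=east.

  • Change the scheduling
    Edit the yml file so that it targets zone=south nodes, then update the Pod as follows: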
[root@DoM01 test]# kubectl apply -f nginx.yml
The Pod "nginxtest" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)
  core.PodSpec{
        ……

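Note: the error above leads to two important rules.

  • Rules
    1) The affinity of an existing Pod cannot be modified in place; delete the Pod and create it again with the new affinity.
    2) If a node's labels change after scheduling so that the node no longer satisfies the rule, the change is ignored (this is the IgnoredDuringExecution part) and the Pod keeps running on that node.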
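
Since a bare Pod's affinity is immutable, one common way around delete-and-recreate is to let a controller own the Pod. The sketch below is an assumption-based illustration, not part of the original test: it wraps the same node affinity in a Deployment (the name nginxtest-deploy and the app label are made up). Editing the affinity in the template and running kubectl apply again makes the Deployment replace its Pods with ones scheduled under the new rule.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginxtest-deploy     # hypothetical name
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginxtest         # hypothetical label tying Pods to this Deployment
  template:
    metadata:
      labels:
        app: nginxtest
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: zone
                operator: In
                values:
                - "south"    # change this value and re-apply to trigger a rollout
      containers:
      - name: nginxtest
        image: harbocto.boe.com.cn/public/nginx
```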

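1.4 Preferred

  • Overview
    preferredDuringSchedulingIgnoredDuringExecution is a soft constraint: the scheduler tries to satisfy the specified rules first, and when there are several rules each one can carry a weight (1 to 100).

  • Example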

apiVersion: v1
kind: Pod
metadata:
  name: nginxtest
  namespace: test
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 60
          preference:
            matchExpressions:
            - key: zone
              operator: In
              values:
              - "east"
        - weight: 80
          preference:
            matchExpressions:
            - key: zone
              operator: In
              values:
              - "south"
  containers:
    - name: nginxtest
      image: harbocto.boe.com.cn/public/nginx  
  • Create and check the result
[root@DoM01 test]# kubectl create -f nginx.yml
pod/nginxtest created
[root@DoM01 test]# kubectl get pod -n test  -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
nginxtest   1/1     Running   0          23s   10.244.7.190   don05              

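Note: zone=east only has weight 60 (lower than the 80 given to south), yet the Pod can still land on an east node, because the weights are a preference added into the scheduler's overall node score, not a guarantee.

  • Change the weight of east to 20, delete the Pod, and create it again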
[root@DoM01 test]# kubectl create -f nginx.yml
pod/nginxtest created
[root@DoM01 ~]# kubectl get pod -n test  -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
nginxtest   1/1     Running   0          18m   10.244.3.33   don01              
[root@DoM01 ~]#

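Note: this time the Pod was scheduled onto a zone=south node, which shows the effect of the weights.

1.5 Notes

  • If both nodeSelector and nodeAffinity are set, a node must satisfy both.
  • If several nodeSelectorTerms are given, satisfying any one of them is enough.
  • If a single nodeSelectorTerms entry lists several expressions under matchExpressions, all of them must be satisfied.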
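
The three rules above can be seen side by side in one manifest. This is a minimal sketch under assumed names (the Pod name nginxtest-terms is hypothetical); the zone label is the custom one from section 1.2:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginxtest-terms      # hypothetical name
  namespace: test
spec:
  # rule 1: nodeSelector AND nodeAffinity must both hold
  nodeSelector:
    kubernetes.io/os: linux
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # rule 2: the two terms below are ORed, one match is enough
        # rule 3: inside a term, all expressions are ANDed
        - matchExpressions:          # term A: zone=east AND arch=amd64
          - key: zone
            operator: In
            values:
            - "east"
          - key: kubernetes.io/arch
            operator: In
            values:
            - "amd64"
        - matchExpressions:          # term B: zone=south
          - key: zone
            operator: In
            values:
            - "south"
  containers:
  - name: nginxtest-terms
    image: harbocto.boe.com.cn/public/nginx
```

A linux node is eligible if it is either (east and amd64) or south.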

2. Pod Affinity

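Description:
Pod affinity decides whether pod2 is deployed on a given group of nodes (or a single node) based on the labels of pod1.
The key of the node label that defines such a group is called the topologyKey.
So pod2 needs two pieces of information to express its affinity:
(1) the scope within which it applies (topologyKey); (2) which Pod to stay close to (that Pod's labels).
topologyKey takes only a key, not a value: all nodes that share the same value for that key form one group, and affinity is evaluated within that group.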
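
To make the role of topologyKey concrete, here is a minimal sketch that uses the custom node label zone from section 1.2 as the topology key (the Pod name nginxtest-zone is hypothetical). With this rule the new Pod only has to land in the same zone group as a Pod labeled security=S1, not necessarily on the same node:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginxtest-zone       # hypothetical name
  namespace: test
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:       # (2) which Pod to stay close to
          matchExpressions:
          - key: security
            operator: In
            values:
            - "S1"
        # (1) the scope: only the key is given; nodes with the same
        # value for "zone" form one group
        topologyKey: zone
  containers:
  - name: nginxtest-zone
    image: harbocto.boe.com.cn/public/nginx
```

Using kubernetes.io/hostname as the key instead shrinks each group to a single node, which is what the examples in this section do.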

2.1 Pod Affinity

Description:
As with node affinity, there are two kinds:
"requiredDuringSchedulingIgnoredDuringExecution"
"preferredDuringSchedulingIgnoredDuringExecution"

2.1.1 required

  • Reference (target) Pod
apiVersion: v1
kind: Pod
metadata:
  name: nginx-flag
  namespace: test
  labels:
    security: "S1"
    app: "nginx-flag"
spec:
  containers:
    - name: nginx-flag
      image: nginx
  • Pod with affinity to the reference Pod
apiVersion: v1
kind: Pod
metadata:
  name: nginxtest
  namespace: test
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - "S1"
        topologyKey: kubernetes.io/hostname
  containers:
  - name: nginxtest
    image: harbocto.boe.com.cn/public/nginx

  • Create and check the result
[root@DoM01 test]# kubectl create -f nginx-flag.yml
[root@DoM01 test]# kubectl create -f nginx.yml
pod/nginxtest created
[root@DoM01 test]# kubectl get pod -n test -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
nginx-flag   1/1     Running   1          14m   10.244.7.191   don05              
nginxtest    1/1     Running   0          58s   10.244.7.192   don05              

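Note: nginxtest was scheduled onto the same node as nginx-flag.

  • No suitable node
    If nginxtest is created while nginx-flag is not running, the scheduler cannot find a suitable node and the Pod stays in Pending.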
[root@DoM01 test]# kubectl create -f nginx.yml
pod/nginxtest created
[root@DoM01 test]# kubectl get pod -n test
NAME        READY   STATUS    RESTARTS   AGE
nginxtest   0/1     Pending   0          3s

As shown below, once nginx-flag is started, nginxtest is scheduled onto the node where nginx-flag runs.

[root@DoM01 test]# kubectl create -f nginx-flag.yml
pod/nginx-flag created
[root@DoM01 test]# kubectl get pod -n test -o wide
NAME         READY   STATUS    RESTARTS   AGE     IP             NODE    NOMINATED NODE   READINESS GATES
nginx-flag   1/1     Running   0          17s     10.244.5.169   don03              
nginxtest    1/1     Running   0          2m33s   10.244.5.168   don03              

2.1.2 preferred

  • Reference (target) Pod
    Same as above
  • Pod with affinity to the reference Pod
    Create nginx.yml as follows:
apiVersion: v1
kind: Pod
metadata:
  name: nginxtest
  namespace: test
spec:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - "S1"
          topologyKey: kubernetes.io/hostname
      - weight: 80
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - "S2"
          topologyKey: kubernetes.io/hostname
  containers:
    - name: nginxtest
      image: harbocto.boe.com.cn/public/nginx
  • Create and check
[root@DoM01 test]# kubectl create -f nginx.yml
pod/nginxtest created
[root@DoM01 test]# kubectl get pod -n test  -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
nginx-flag   1/1     Running   0          56s   10.244.3.34   don01              
nginxtest    1/1     Running   0          8s    10.244.3.35   don01              

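As shown above, nginxtest was scheduled onto the node where nginx-flag runs.

  • Affinity is mutual

Test:
Start another Pod, nginx-flag-02, labeled security=S2. From the manifest above, nginxtest prefers it with weight 80; nginx-flag-02 declares no affinity of its own, yet it ends up choosing the node of nginxtest.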

apiVersion: v1
kind: Pod
metadata:
  name: nginx-flag-02
  namespace: test
  labels:
    security: "S2"
    app: "nginx-flag"
spec:
  containers:
    - name: nginx-flag-02
      image: harbocto.boe.com.cn/public/nginx

Create nginx-flag-02 while nginxtest is still running:

[root@DoM01 test]# kubectl create -f nginx-flag02.yml
pod/nginx-flag-02 created
[root@DoM01 test]# kubectl get pod -n test  -o wide
NAME            READY   STATUS    RESTARTS   AGE     IP            NODE    NOMINATED NODE   READINESS GATES
nginx-flag      1/1     Running   0          3m18s   10.244.3.34   don01              
nginx-flag-02   1/1     Running   0          6s      10.244.3.36   don01              
nginxtest       1/1     Running   0          2m30s   10.244.3.35   don01              

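As shown above:
The Pods stick together: nginx-flag-02 also started on don01. (Judging by node resources, without the affinity nginx-flag-02 would have been started on don03.)

  • Delete nginxtest, then delete nginx-flag-02. Create nginx-flag-02 again and, as expected, it is scheduled onto don03.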
[root@DoM01 test]# kubectl delete -n test pod nginxtest
pod "nginxtest" deleted
[root@DoM01 test]# kubectl delete -n test pod nginx-flag-02
pod "nginx-flag-02" deleted
[root@DoM01 test]# kubectl create -f nginx-flag02.yml
pod/nginx-flag-02 created
[root@DoM01 test]# kubectl get pod -n test  -o wide
NAME            READY   STATUS    RESTARTS   AGE     IP             NODE    NOMINATED NODE   READINESS GATES
nginx-flag      1/1     Running   0          5m36s   10.244.3.34    don01              
nginx-flag-02   1/1     Running   0          2s      10.244.5.191   don03              
  • Test the affinity weights
[root@DoM01 test]# kubectl create -f nginx.yml
pod/nginxtest created
[root@DoM01 test]# kubectl get pod -n test -o wide
NAME            READY   STATUS    RESTARTS   AGE     IP             NODE    NOMINATED NODE   READINESS GATES
nginx-flag      1/1     Running   0          4h24m   10.244.3.34    don01              
nginx-flag-02   1/1     Running   0          4h19m   10.244.5.191   don03              
nginxtest       1/1     Running   0          14s     10.244.5.194   don03              

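Note: as shown above, nginxtest was scheduled onto the node of nginx-flag-02, the Pod it prefers with the higher weight.

2.2 Pod Anti Affinity

  • Reference Pods
    Copy the earlier reference Pod so that four of the five nodes are occupied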
apiVersion: v1
kind: Pod
metadata:
  name: nginx-flag-01
  namespace: test
  labels:
    security: "S1"
    app: "nginx-flag"
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - "don01"
  containers:
    - name: nginx-flag-01
      image: harbocto.boe.com.cn/public/nginx
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-flag-02
  namespace: test
  labels:
    security: "S2"
    app: "nginx-flag"
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - "don02"
  containers:
    - name: nginx-flag-02
      image: harbocto.boe.com.cn/public/nginx
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-flag-03
  namespace: test
  labels:
    security: "S3"
    app: "nginx-flag"
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - "don03"
  containers:
    - name: nginx-flag-03
      image: harbocto.boe.com.cn/public/nginx
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-flag-04
  namespace: test
  labels:
    security: "S4"
    app: "nginx-flag"
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - "don04"
  containers:
    - name: nginx-flag-04
      image: harbocto.boe.com.cn/public/nginx
  • Pod with anti-affinity
apiVersion: v1
kind: Pod
metadata:
  name: nginxtest-02
  namespace: test
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - "nginx-flag"
        topologyKey: kubernetes.io/hostname
  containers:
  - name: nginxtest-02
    image: harbocto.boe.com.cn/public/nginx
  • Create and check
[root@DoM01 test]# kubectl get pod -n test -o wide
NAME            READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
nginx-flag-01   1/1     Running   0          15m   10.244.3.38    don01              
nginx-flag-02   1/1     Running   0          15m   10.244.4.35    don02              
nginx-flag-03   1/1     Running   0          15m   10.244.5.201   don03              
nginx-flag-04   1/1     Running   0          15m   10.244.6.27    don04              
nginxtest-02    1/1     Running   0          14m   10.244.7.210   don05              

As shown above, nginxtest-02 was scheduled onto the one remaining node.
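
A frequent use of the same mechanism is to keep replicas of one workload apart. The following is a minimal sketch, not taken from the original test: a Deployment (the name nginx-spread and its app label are hypothetical) whose Pods repel each other, so at most one replica runs per node:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-spread         # hypothetical name
  namespace: test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-spread
  template:
    metadata:
      labels:
        app: nginx-spread    # the label the anti-affinity rule matches on
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "nginx-spread"
            # one group per node, so two replicas never share a node
            topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx-spread
        image: harbocto.boe.com.cn/public/nginx
```

Because the rule is required, any replica beyond the number of eligible nodes stays Pending; switching to preferredDuringSchedulingIgnoredDuringExecution would relax that.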

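2.3 Notes

  • topologyKey
    1) For required anti-affinity (requiredDuringSchedulingIgnoredDuringExecution), topologyKey must not be empty.
    2) For preferred anti-affinity (preferredDuringSchedulingIgnoredDuringExecution), an empty topologyKey is treated as the combination of the keys below (per the documentation of the v1.15-era release used here; current releases reject an empty topologyKey):
    kubernetes.io/hostname
    failure-domain.beta.kubernetes.io/zone
    failure-domain.beta.kubernetes.io/region
    3) If the LimitPodHardAntiAffinityTopology admission controller is enabled, required anti-affinity is limited to the topologyKey kubernetes.io/hostname.

  • Namespace restriction
    1) Location: the namespaces field sits at the same level as topologyKey.
    2) namespaces omitted or left empty: only Pods in the namespace of the Pod that declares the affinity are matched.
    3) Matching all namespaces: newer releases offer namespaceSelector at the same level, where an empty selector ({}) matches every namespace.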
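
As an illustration of where the namespaces field goes, here is a minimal sketch (the Pod name nginxtest-ns and the namespace other-ns are hypothetical) that looks for security=S1 Pods in two explicitly listed namespaces:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginxtest-ns         # hypothetical name
  namespace: test
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - "S1"
        # namespaces is a sibling of labelSelector and topologyKey;
        # the label selector is evaluated only against Pods in these namespaces
        namespaces:
        - test
        - other-ns           # hypothetical second namespace
        topologyKey: kubernetes.io/hostname
  containers:
  - name: nginxtest-ns
    image: harbocto.boe.com.cn/public/nginx
```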

