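There are four main ways to schedule a Pod onto a specific Node: nodeName, nodeSelector, taints and tolerations, and affinity/anti-affinity.
nodeName schedules the Pod onto the node with the given name. nodeSelector schedules the Pod onto a node that carries the given label.
With nodeName scheduling, nodeSelector scheduling, and taint (Taints) and toleration (Tolerations) scheduling already available, why is affinity/anti-affinity scheduling still needed?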
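It exists to handle more flexible and more complex scheduling scenarios. Some workloads want two Pods placed on the same node; others, for isolation and high availability, want two Pods placed on different nodes; and others want a Pod placed on one of a specific set of nodes.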
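Taints and tolerations are named among the four methods but have no manifest in this walkthrough, so a minimal sketch may help for reference. The taint key/value dedicated=ops and the Pod name webapp-tolerant are hypothetical, chosen only for illustration.

```yaml
# Assumption: the node was tainted beforehand with something like
#   kubectl taint nodes ops-worker-1 dedicated=ops:NoSchedule
# (dedicated=ops is a made-up key/value for this sketch)
apiVersion: v1
kind: Pod
metadata:
  name: webapp-tolerant        # hypothetical name
  namespace: default
spec:
  containers:
  - name: webapp
    image: nginx
  tolerations:
  - key: dedicated             # must match the taint key
    operator: Equal
    value: ops                 # must match the taint value
    effect: NoSchedule         # must match the taint effect
```

A toleration only allows the Pod onto the tainted node; it does not attract the Pod there. To pin the Pod to that node as well, it would need to be combined with nodeSelector or node affinity.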
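label is a very important concept in K8S: whatever the scenario, anything that involves selecting or filtering is basically matched on label fields. Affinity scheduling also matches on label fields.
Node affinity scheduling is illustrated in the figure below; Pod affinity scheduling and Pod anti-affinity scheduling work in a similar way.
[Figure: Node affinity scheduling]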
Affinity literally means closeness, and it is the term used to describe affinity scheduling.
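Affinity scheduling: describes an association between a Node (or a Pod) and a Pod. A Pod can be placed on Nodes that carry a matching label, or it can follow the placement of other Pods.
Anti-affinity scheduling: the opposite policy between two Pods. If Pod A is placed on node1, then Pod B will never be scheduled onto node1 (when the rule is a required one).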
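Affinity and anti-affinity scheduling come in three kinds: Node affinity, Pod affinity, and Pod anti-affinity.
Node affinity and Pod affinity alike have two ways to express a rule: requiredDuringSchedulingIgnoredDuringExecution (hard) and preferredDuringSchedulingIgnoredDuringExecution (soft).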
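Required means mandatory and Preferred means first choice, which suits fields that state how strictly a selection rule applies. Both field names are long, so let us break one down: RequiredDuringSchedulingIgnoredDuringExecution splits into RequiredDuringScheduling and IgnoredDuringExecution.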
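RequiredDuringScheduling: the rule must be satisfied (Required) before the Pod is scheduled onto a node.
IgnoredDuringExecution: Pods already running on a node are not re-checked against the rule. If a label that a Pod requires is removed from its node while the Pod is running, so that the node no longer satisfies the Pod's node affinity rule, the Pod still keeps running on that node.
Affinity expressions use the following operators (operator): In, NotIn, Exists, DoesNotExist, Gt, and Lt (Gt and Lt are available for node affinity only).
None of these operators is a dedicated "reject this node" function, but NotIn and DoesNotExist can be used to achieve exclusion in practice.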
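As a sketch of how exclusion can be expressed with the operators, the manifest below uses NotIn and DoesNotExist in a hard node affinity rule. The Pod name webapp-exclude is hypothetical; the node name ops-worker-1 and the special-app label are borrowed from the examples later in this article.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-exclude         # hypothetical name
  namespace: default
spec:
  containers:
  - name: webapp
    image: nginx
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # Expressions inside one matchExpressions list are ANDed together
        - matchExpressions:
          # keep the Pod off the node named ops-worker-1
          - key: kubernetes.io/hostname
            operator: NotIn
            values:
            - ops-worker-1
          # and off any node that carries a special-app label at all
          - key: special-app
            operator: DoesNotExist
```

Multiple entries under nodeSelectorTerms are ORed, while the expressions within a single entry are ANDed, so both conditions above must hold for a node to be eligible.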
topologyKey is often translated as "topology key", but in essence it is a scope.
topologyKey names a label key; all Nodes that carry that key with the same value fall into the same scope.
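To illustrate the scope idea, the sketch below widens the scope from a single node to a zone. It assumes the nodes carry the well-known label topology.kubernetes.io/zone; the node labels shown in this article do not include it, so it would have to be added first. The Pod name webapp-zone is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-zone            # hypothetical name
  namespace: default
spec:
  containers:
  - name: webapp
    image: nginx
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      # Scope is the zone: any node whose zone label has the same value
      # as the node running a Pod labeled app=webapp-3 is acceptable.
      # With kubernetes.io/hostname the scope would shrink to one node.
      - topologyKey: topology.kubernetes.io/zone
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - webapp-3
```

If no node carries the zone label, a required rule like this would most likely leave the Pod Pending, which is why the label assumption matters.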
For hands-on examples of nodeName and nodeSelector scheduling, see these CSDN blog posts: "Kubernetes系列-Pod的定向调度" and "Kubernetes系列-部署pod到集群中的指定node".
For example, to schedule a Pod onto the node whose nodeName is ops-worker-2:
$ vim webapp.yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp
  namespace: default
  labels:
    app: webapp
spec:
  # bind the Pod directly to this node
  nodeName: 'ops-worker-2'
  containers:
  - name: webapp
    image: nginx
    ports:
    - containerPort: 80
$ kubectl apply -f webapp.yaml
pod/webapp created
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
webapp 1/1 Running 0 8s 172.25.50.142 ops-worker-2
For example, to schedule a Pod onto a node that carries the label "special-app=specialwebapp".
Add the "special-app=specialwebapp" label to node ops-worker-1:
$ kubectl label node ops-worker-1 special-app=specialwebapp
node/ops-worker-1 labeled
Check the node details:
$ kubectl describe node ops-worker-1
Name: ops-worker-1
Roles:
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
env=uat
kubernetes.io/arch=amd64
kubernetes.io/hostname=ops-worker-1
kubernetes.io/os=linux
special-app=specialwebapp
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 10.220.43.204/20
projectcalico.org/IPv4IPIPTunnelAddr: 172.25.78.64
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sun, 17 Dec 2023 15:32:04 +0800
Taints:
Unschedulable: false
Lease:
HolderIdentity: ops-worker-1
AcquireTime:
RenewTime: Mon, 22 Jan 2024 21:59:33 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sun, 17 Dec 2023 15:32:48 +0800 Sun, 17 Dec 2023 15:32:48 +0800 CalicoIsUp Calico is running on this node
MemoryPressure False Mon, 22 Jan 2024 21:59:30 +0800 Sun, 17 Dec 2023 15:32:04 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Mon, 22 Jan 2024 21:59:30 +0800 Sun, 17 Dec 2023 15:32:04 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Mon, 22 Jan 2024 21:59:30 +0800 Sun, 17 Dec 2023 15:32:04 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Mon, 22 Jan 2024 21:59:30 +0800 Sun, 17 Dec 2023 15:32:54 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 10.220.43.204
Hostname: ops-worker-1
Capacity:
cpu: 8
ephemeral-storage: 103080204Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 15583444Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 94998715850
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 15481044Ki
pods: 110
System Info:
Machine ID: c72f33a969d84fac8d6f7b35c035bafa
System UUID: e2ef28e5-4140-41a9-807d-78ecf09efb8d
Boot ID: 879480b6-2f5a-45e5-9b31-4c7aab3caa33
Kernel Version: 4.19.91-27.6.al7.x86_64
OS Image: Alibaba Cloud Linux (Aliyun Linux) 2.1903 LTS (Hunting Beagle)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.21
Kubelet Version: v1.21.9
Kube-Proxy Version: v1.21.9
PodCIDR: 172.25.1.0/24
PodCIDRs: 172.25.1.0/24
Non-terminated Pods: (11 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default nginx-node1-6c7874c7b8-q2swk 0 (0%) 0 (0%) 0 (0%) 0 (0%) 9d
kube-system calico-kube-controllers-5d4b78db86-gmvg5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 36d
kube-system calico-kube-controllers-5d4b78db86-qvrnk 0 (0%) 0 (0%) 0 (0%) 0 (0%) 5d23h
kube-system calico-node-jk7zc 250m (3%) 0 (0%) 0 (0%) 0 (0%) 36d
kube-system coredns-59d64cd4d4-zr4hd 100m (1%) 0 (0%) 70Mi (0%) 170Mi (1%) 36d
kube-system kube-proxy-rm64j 0 (0%) 0 (0%) 0 (0%) 0 (0%) 36d
kube-system metrics-server-54cc454bdd-ds4zp 0 (0%) 0 (0%) 0 (0%) 0 (0%) 12d
kube-system vpa-admission-controller-54d7b4896d-75g5d 50m (0%) 200m (2%) 200Mi (1%) 500Mi (3%) 8d
kube-system vpa-admission-controller-558664548-fbhzt 50m (0%) 200m (2%) 200Mi (1%) 500Mi (3%) 5d23h
kube-system vpa-recommender-84d88664b8-4kdn5 50m (0%) 200m (2%) 500Mi (3%) 1000Mi (6%) 12d
kube-system vpa-updater-5545848b57-lq5sf 50m (0%) 200m (2%) 500Mi (3%) 1000Mi (6%) 5d23h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 550m (6%) 800m (10%)
memory 1470Mi (9%) 3170Mi (20%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
The Pod manifest:
$ vim webapp2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-2
  namespace: default
  labels:
    app: webapp-2
spec:
  # only nodes carrying this label are eligible
  nodeSelector:
    special-app: specialwebapp
  containers:
  - name: webapp-2
    image: nginx
    ports:
    - containerPort: 80
$ kubectl apply -f webapp2.yaml
pod/webapp-2 created
Check which node the Pod was scheduled onto:
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
webapp-2 0/1 ContainerCreating 0 11s ops-worker-1
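The Pod was scheduled onto ops-worker-1, the node labeled special-app=specialwebapp.
Node affinity scheduling concerns the relationship between a Node and a Pod.
Define the Pod-to-Node hard affinity manifest, pod_node_required_affinity.yaml, with the following content: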
$ vim pod_node_required_affinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-3
  namespace: default
  labels:
    app: webapp-3
spec:
  containers:
  - name: webapp-3
    image: nginx
    ports:
    - containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: app
            operator: In
            values:
            - backend
$ kubectl apply -f pod_node_required_affinity.yaml
pod/webapp-3 created
Add a label to the ops-master-3 node (until some node carries app=backend, webapp-3 stays Pending):
$ kubectl label node ops-master-3 app=backend
node/ops-master-3 labeled
Check the labels on the ops-master-3 node:
$ kubectl get node ops-master-3 --show-labels
NAME STATUS ROLES AGE VERSION LABELS
ops-master-3 Ready control-plane,master 36d v1.21.9 app=backend,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,env=uat,kubernetes.io/arch=amd64,kubernetes.io/hostname=ops-master-3,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
Check the scheduling result:
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
webapp-3 1/1 Running 0 90s 172.25.186.68 ops-master-3
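Soft (preferred) node affinity lists one or more rules, each with a weight; the scheduler favors nodes that match higher-weight rules but can still place the Pod elsewhere. The yaml file is as follows: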
$ vim pod_node_preferred_affinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-4
  namespace: default
  labels:
    app: webapp-4
spec:
  containers:
  - name: webapp-4
    image: nginx
    ports:
    - containerPort: 80
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        preference:
          matchExpressions:
          - key: app2
            operator: Exists
      - weight: 20
        preference:
          matchExpressions:
          - key: app
            operator: In
            values:
            - backend2
Label node ops-master-2 with app2=backend:
$ kubectl label node ops-master-2 app2=backend
node/ops-master-2 labeled
$ kubectl apply -f pod_node_preferred_affinity.yaml
pod/webapp-4 created
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
webapp-4 1/1 Running 0 5s 172.25.78.133 ops-master-2
The Pod was scheduled onto ops-master-2, the node that matches the weight-80 rule (the app2 label exists).
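Hard and soft node affinity can also be combined in one Pod. The sketch below is not part of the tested examples: it assumes the required rule should only narrow candidates to Linux nodes (via the kubernetes.io/os label visible in the node output earlier) while the preferred rule ranks them. The Pod name webapp-combined is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-combined        # hypothetical name
  namespace: default
spec:
  containers:
  - name: webapp
    image: nginx
  affinity:
    nodeAffinity:
      # Hard rule: filters the candidate nodes
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
      # Soft rule: ranks the remaining candidates (weight range is 1-100)
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        preference:
          matchExpressions:
          - key: app2
            operator: Exists
```

If no Linux node carries the app2 label, the Pod should still be scheduled, because only the required part is mandatory.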
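Pod affinity scheduling concerns the relationship between Pods.
For example, if Pod1 follows Pod2 and Pod2 is scheduled onto node B, then Pod1 is scheduled onto node B as well.
So two Pods are needed. Pod1 is webapp-3 from the example above, placed on the ops-master-3 node by Node hard affinity. Then deploy Pod2, which follows Pod1 and is therefore also scheduled onto the ops-master-3 node.
Prepare Pod2's manifest, pod_pod_required_affinity.yaml, as follows: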
$ vim pod_pod_required_affinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-5
  namespace: default
  labels:
    app: webapp-5
spec:
  containers:
  - name: webapp-5
    image: nginx
    ports:
    - containerPort: 80
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - webapp-3
$ kubectl apply -f pod_pod_required_affinity.yaml
pod/webapp-5 created
Check the scheduling result:
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
webapp-3 1/1 Running 0 18m 172.25.186.68 ops-master-3
webapp-4 1/1 Running 0 4m51s 172.25.78.133 ops-master-2
webapp-5 1/1 Running 0 8s 172.25.186.69 ops-master-3
webapp-3 and webapp-5 were scheduled onto the same node.
Soft Pod affinity is similar to hard Pod affinity, with weights added.
$ vim webapp-6.yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-6
  namespace: default
  labels:
    app: webapp-6
spec:
  containers:
  - name: webapp-6
    image: nginx
    ports:
    - containerPort: 80
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      # prefer nodes already running Pods that carry the app2 label
      - weight: 40
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app2
              operator: Exists
          topologyKey: kubernetes.io/hostname
      # prefer, more strongly, the node running the Pod labeled app=webapp-4
      - weight: 60
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - webapp-4
          topologyKey: kubernetes.io/hostname
$ kubectl apply -f webapp-6.yaml
pod/webapp-6 created
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-node1-6c7874c7b8-d6cnw 1/1 Running 0 6d 172.25.78.131 ops-master-2
nginx-node1-6c7874c7b8-q2swk 1/1 Running 0 9d 172.25.78.80 ops-worker-1
nginx-test-6b7c99bbb-b6smk 0/1 Pending 0 6d
nginx-test-6b7c99bbb-jd5xt 0/1 Pending 0 6d
webapp 1/1 Running 0 63m 172.25.50.142 ops-worker-2
webapp-1 1/1 Running 0 56m 172.25.50.143 ops-worker-2
webapp-2 1/1 Running 0 51m 172.25.78.85 ops-worker-1
webapp-3 1/1 Running 0 46m 172.25.186.68 ops-master-3
webapp-4 1/1 Running 0 33m 172.25.78.133 ops-master-2
webapp-5 1/1 Running 0 28m 172.25.186.69 ops-master-3
webapp-6 0/1 ContainerCreating 0 3s ops-master-2
Pod anti-affinity (hard) uses podAntiAffinity: the new Pod must not land on a node that already runs a Pod labeled app=webapp.
$ vim webapp-8.yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-8
  namespace: default
  labels:
    app: webapp-8
spec:
  containers:
  - name: webapp
    image: nginx
    ports:
    - containerPort: 80
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - webapp
$ kubectl apply -f webapp-8.yaml
pod/webapp-8 created
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-node1-6c7874c7b8-d6cnw 1/1 Running 0 6d 172.25.78.131 ops-master-2
nginx-node1-6c7874c7b8-q2swk 1/1 Running 0 9d 172.25.78.80 ops-worker-1
nginx-test-6b7c99bbb-b6smk 0/1 Pending 0 6d
nginx-test-6b7c99bbb-jd5xt 0/1 Pending 0 6d
webapp 1/1 Running 0 66m 172.25.50.142 ops-worker-2
webapp-1 1/1 Running 0 59m 172.25.50.143 ops-worker-2
webapp-2 1/1 Running 0 55m 172.25.78.85 ops-worker-1
webapp-3 1/1 Running 0 49m 172.25.186.68 ops-master-3
webapp-4 1/1 Running 0 36m 172.25.78.133 ops-master-2
webapp-5 1/1 Running 0 31m 172.25.186.69 ops-master-3
webapp-6 1/1 Running 0 3m13s 172.25.78.134 ops-master-2
webapp-8 1/1 Running 0 5s 172.25.186.78 ops-master-3
webapp-8 was not scheduled onto the same node as webapp.
Soft anti-affinity is similar to hard anti-affinity, with weights added; it is not tested here.
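Although soft anti-affinity was not tested, a manifest for it might look like the sketch below. It is untested against the cluster in this article; the Pod name webapp-9 is hypothetical, and the rule reuses the app=webapp label from the hard anti-affinity example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-9               # hypothetical name
  namespace: default
  labels:
    app: webapp-9
spec:
  containers:
  - name: webapp-9
    image: nginx
    ports:
    - containerPort: 80
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      # Try to avoid nodes running Pods labeled app=webapp,
      # but still schedule there if no other node fits.
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - webapp
```

The difference from the hard version is the fallback: with a preferred rule the Pod would not stay Pending merely because every node already runs a matching Pod.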