该篇文章已经被专栏《从零开始学k8s》收录
pod 自身的亲和性调度有两种表示形式
podaffinity:pod 和 pod 更倾向腻在一起,把相近的 pod 结合到相近的位置,比如同一区域,同一机架,这样的话 pod 和 pod 之间更好通信,比方说有两个机房,这两个机房部署的集群有 1000 台主机,那么我们希望把 nginx 和 tomcat 都部署同一个地方的 node 节点上,可以提高通信效率。
podunaffinity:pod 和 pod 更倾向不腻在一起,如果部署两套程序,那么这两套程序更倾向于反亲和性,这样相互之间不会有影响。
第一个 pod 随机选则一个节点,做为评判后续的 pod 能否到达这个 pod 所在的节点上的运行方式,这就称为 pod 亲和性;我们怎么判定哪些节点是相同位置的,哪些节点是不同位置的。我们在定义 pod 亲和性时需要有一个前提,哪些 pod 在同一个位置,哪些 pod 不在同一个位置,这个位置是怎么定义的,标准是什么?以节点名称为标准,这个节点名称相同的表示是同一个位置,节点名称不相同的表示不是一个位置,或者其他方式。
[root@k8smaster ~]# kubectl explain pods.spec.affinity.podAffinity
KIND: Pod
VERSION: v1
RESOURCE: podAffinity <Object>
DESCRIPTION:
Describes pod affinity scheduling rules (e.g. co-locate this pod in the
same node, zone, etc. as some other pod(s)).
Pod affinity is a group of inter pod affinity scheduling rules.
FIELDS:
preferredDuringSchedulingIgnoredDuringExecution <[]Object> #软亲和性
requiredDuringSchedulingIgnoredDuringExecution <[]Object> #硬亲和性
[root@k8smaster ~]# kubectl explain pods.spec.affinity.podAffinity.requiredDuringSchedulingIgnoredDuringExecution
KIND: Pod
VERSION: v1
RESOURCE: requiredDuringSchedulingIgnoredDuringExecution <[]Object>
DESCRIPTION:
If the affinity requirements specified by this field are not met at
scheduling time, the pod will not be scheduled onto the node. If the
affinity requirements specified by this field cease to be met at some point
during pod execution (e.g. due to a pod label update), the system may or
may not try to eventually evict the pod from its node. When there are
multiple elements, the lists of nodes corresponding to each podAffinityTerm
are intersected, i.e. all terms must be satisfied.
Defines a set of pods (namely those matching the labelSelector relative to
the given namespace(s)) that this pod should be co-located (affinity) or
not co-located (anti-affinity) with, where co-located is defined as running
on a node whose value of the label with key <topologyKey> matches that of
any node on which a pod of the set of pods is running
FIELDS:
labelSelector <Object> #我们要判断 pod 跟别的 pod 亲和,跟哪个 pod 亲和,需要靠 labelSelector,通过 labelSelector选则一组能作为亲和对象的 pod 资源
namespaces <[]string> #labelSelector 需要选则一组资源,那么这组资源是在哪个名称空间中呢,通过 namespace 指定,如果不指定 namespaces,那么就是当前创建 pod 的名称空间
topologyKey <string> -required- #位置拓扑的键,这个是必须字段
怎么判断是不是同一个位置
rack=rack1
row=row1
使用 rack 的键是同一个位置
使用 row 的键是同一个位置
[root@k8smaster ~]# kubectl explain pods.spec.affinity.podAffinity.requiredDuringSchedulingIgnoredDuringExecution.labelSelector
KIND: Pod
VERSION: v1
RESOURCE: labelSelector <Object>
DESCRIPTION:
A label query over a set of resources, in this case pods.
A label selector is a label query over a set of resources. The result of
matchLabels and matchExpressions are ANDed. An empty label selector matches
all objects. A null label selector matches no objects.
FIELDS:
matchExpressions <[]Object>
matchLabels <map[string]string>
例:pod 节点亲和性
#定义两个 pod,第一个 pod 做为基准,第二个 pod 跟着它走
[root@k8smaster ~]# kubectl delete pods pod-first
pod "pod-first" deleted
[root@k8smaster ~]# vim pod-required-affinity-demo.yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-first
labels:
app2: myapp2
tier: frontend
spec:
containers:
- name: myapp
image: nginx
---
apiVersion: v1
kind: Pod
metadata:
name: pod-second
labels:
app: backend
tier: db
spec:
containers:
- name: busybox
image: busybox:latest
imagePullPolicy: IfNotPresent
command: ["sh","-c","sleep 3600"]
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- {key: app2, operator: In, values: ["myapp2"]}
topologyKey: kubernetes.io/hostname
busybox如果不写sleep会自动关闭,sleep是等待多长时间。然后下面就是pod亲和性,硬亲和性,然后通过label(标签选择器选择),之前我们也通过帮助文档看过了选择器里面的字段(也可以深入看match里面有啥)
下面的key意思是选择app2=myapp2的标签做亲和性,如果不写第二个pod会找不到和哪个做亲和性。
最后一行可以直接用nodes里的已有标签来位置拓扑的键。kubectl get nodes --show-labels查看标签。
kubernetes.io/hostname标签对应的是具体的k8snode节点名,如果frist调度到node2或者node1,第二个pod也跟着调度到哪个节点(根据主机名做位置)
#上面表示创建的 pod 必须与拥有 app2=myapp2 标签的 pod 在一个节点上
[root@k8smaster ~]# kubectl apply -f pod-required-affinity-demo.yaml
pod/pod-first created
pod/pod-second created
[root@k8smaster ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
pod-first 1/1 Running 0 26s 10.244.1.21 k8snode2 <none>
pod-second 1/1 Running 0 26s 10.244.1.22 k8snode2 <none>
#上面说明第一个 pod 调度到哪,第二个 pod 也调度到哪,这就是 pod 节点亲和性
[root@k8smaster ~]# kubectl delete -f pod-required-affinity-demo.yaml
pod "pod-first" deleted
pod "pod-second" deleted
定义两个 pod,第一个 pod 做为基准,第二个 pod 跟它调度节点相反 同样基于node名字作为基准
[root@k8smaster ~]# vim pod-required-anti-affinity-demo.yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-first
labels:
app1: myapp1
tier: frontend
spec:
containers:
- name: myapp
image: nginx
---
apiVersion: v1
kind: Pod
metadata:
name: pod-second
labels:
app: backend
tier: db
spec:
containers:
- name: busybox
image: busybox
imagePullPolicy: IfNotPresent
command: ["sh","-c","sleep 3600"]
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- {key: app1, operator: In, values: ["myapp1"]}
topologyKey: kubernetes.io/hostname
[root@k8smaster ~]# kubectl apply -f pod-required-anti-affinity-demo.yaml
pod/pod-first created
pod/pod-second created
[root@k8smaster ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
pod-first 1/1 Running 0 21s 10.244.1.23 k8snode2 <none>
pod-second 1/1 Running 0 21s 10.244.2.20 k8snode <none>
#两个 pod 不在一个 node 节点上,这就是 pod 节点反亲和性
[root@k8smaster ~]# kubectl delete -f pod-required-anti-affinity-demo.yaml
pod "pod-first" deleted
pod "pod-second" deleted
#例3:换一个 topologykey
[root@k8smaster ~]# kubectl label nodes k8snode zone=foo --overwrite
node/k8snode not labeled
[root@k8smaster ~]# kubectl label nodes k8snode2 zone=foo --overwrite
node/k8snode2 labeled
#然后去pp node 和当前目录下都删除掉k8s的pod。
[root@k8smaster pp]# kubectl delete -f .
resourcequota "mem-cpu-quota" deleted
pod "pod-test" deleted
[root@k8smaster node]# kubectl delete -f .
pod "demo-pod-1" deleted
pod "demo-pod" deleted
pod "pod-node-affinity-demo-2" deleted
pod "pod-node-affinity-demo" deleted
[root@k8smaster node]# kubectl get pods
NAME READY STATUS RESTARTS AGE
[root@k8smaster node]# vim pod-first-required-anti-affinity-demo-1.yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-first
labels:
app3: myapp3
tier: frontend
spec:
containers:
- name: myapp
image: nginx
[root@k8smaster node]# vim pod-second-required-anti-affinity-demo-1.yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-second
labels:
app: backend
tier: db
spec:
containers:
- name: busybox
image: busybox
imagePullPolicy: IfNotPresent
command: ["sh","-c","sleep 3600"]
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- {key: app3, operator: In, values: ["myapp3"]}
topologyKey: zone
#如果写在一起,可能启动顺序会有错误,比如第二个pod先启动。不管pod调度到哪个节点,都都是以zone标签作为位置。
[root@k8smaster node]# kubectl apply -f pod-first-required-anti-affinity-demo-1.yaml
pod/pod-first created
[root@k8smaster node]# kubectl apply -f pod-second-required-anti-affinity-demo-1.yaml
pod/pod-second created
[root@k8smaster node]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
pod-first 1/1 Running 0 21s 10.244.1.23 k8snode2 <none>
pod-second 0/1 pending 0 21s <none> <none> <none>
第二个节点现是 pending,因为两个节点是同一个位置(因为配置了一样的zone标签,如果pod1调度到有zone标签的node上,那么第二个pod就永远不会调度到有zone标签的node上,因为我们要求的是反亲和性)现在没有不是同一个位置的了,所以就会处于 pending 状态,如果在反亲和性这个位置把 required 改成 preferred,那么也会运行。
podaffinity:pod 节点亲和性,pod 倾向于哪个 pod
nodeaffinity:node 节点亲和性,pod 倾向于哪个 node
创作不易,如果觉得内容对你有帮助,麻烦给个三连关注支持一下我!如果有错误,请在评论区指出,我会及时更改!
目前正在更新的系列:从零开始学k8s
感谢各位的观看,文章掺杂个人理解,如有错误请联系我指出~