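Node affinity constrains which nodes a Pod can be scheduled onto, based on the labels on those nodes. There are two types of node affinity:
`requiredDuringSchedulingIgnoredDuringExecution`: the scheduler places the Pod only when the rule is satisfied. This is commonly called hard affinity.
`preferredDuringSchedulingIgnoredDuringExecution`: the scheduler tries to find a node that satisfies the rule. If no node matches, it still schedules the Pod. This is called soft affinity.
[root@master ~]# kubectl explain pod.spec.affinity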
KIND: Pod
VERSION: v1
FIELD: affinity
DESCRIPTION:
If specified, the pod's scheduling constraints
Affinity is a group of affinity scheduling rules.
FIELDS:
nodeAffinity # node affinity
Describes node affinity scheduling rules for the pod.
podAffinity # pod affinity
Describes pod affinity scheduling rules (e.g. co-locate this pod in the same
node, zone, etc. as some other pod(s)).
podAntiAffinity # pod anti-affinity
Describes pod anti-affinity scheduling rules (e.g. avoid putting this pod in
the same node, zone, etc. as some other pod(s)).
[root@master ~]# kubectl explain pod.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution
KIND: Pod
VERSION: v1
FIELD: requiredDuringSchedulingIgnoredDuringExecution
DESCRIPTION:
If the affinity requirements specified by this field are not met at
scheduling time, the pod will not be scheduled onto the node. If the
affinity requirements specified by this field cease to be met at some point
during pod execution (e.g. due to an update), the system may or may not try
to eventually evict the pod from its node.
A node selector represents the union of the results of one or more label
queries over a set of nodes; that is, it represents the OR of the selectors
represented by the node selector terms.
FIELDS:
nodeSelectorTerms <[]NodeSelectorTerm> -required-
Required. A list of node selector terms. The terms are ORed.
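You can list several `nodeSelectorTerms` entries under `nodeAffinity`; a node only has to satisfy one of them to satisfy the affinity rule (the terms are ORed). Within a single `nodeSelectorTerms` entry you can list several `matchExpressions`; a node must satisfy all of them to match that term (the expressions are ANDed).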
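The OR/AND behavior described above can be sketched with a manifest like the following. It is a minimal illustration, not taken from the original walkthrough: the Pod name and the `zone` and `disktype` label keys and values are assumptions chosen for the example.

```yaml
# Hypothetical Pod; the label keys and values are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: affinity-or-and-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # Term 1: both expressions must hold on the same node (AND).
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
          - key: disktype
            operator: In
            values:
            - ssd
        # Term 2: an alternative; a node matching only this term also qualifies (OR).
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - bar
  containers:
  - name: nginx
    image: nginx
```

With this sketch, a node labeled `zone=bar` qualifies, and so does a node labeled both `zone=foo` and `disktype=ssd`, while a node labeled only `zone=foo` does not.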
[root@master ~]# kubectl explain pod.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms
KIND: Pod
VERSION: v1
FIELD: nodeSelectorTerms <[]NodeSelectorTerm>
DESCRIPTION:
Required. A list of node selector terms. The terms are ORed.
A null or empty node selector term matches no objects. The requirements of
them are ANDed. The TopologySelectorTerm type implements a subset of the
NodeSelectorTerm.
FIELDS:
matchExpressions <[]NodeSelectorRequirement>
A list of node selector requirements by node's labels.
matchFields <[]NodeSelectorRequirement>
A list of node selector requirements by node's fields.
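matchExpressions: matches on node labels. Each expression has a key, an operator, and values. For example, with key `zone`, operator `In`, and values `foo` and `bar`, the Pod is scheduled only onto nodes whose `zone` label is `foo` or `bar`.
matchFields: matches on node fields rather than labels, using the same key, operator, and values structure. For example, the key `metadata.name` selects nodes by name.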
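As a rough illustration of `matchFields`, the sketch below pins a Pod to a node by its name instead of by a label. The Pod name is hypothetical, and `node-1` is assumed to be the name of an existing node, as in the cluster used later in this article.

```yaml
# Hypothetical Pod that selects a node by the metadata.name field.
apiVersion: v1
kind: Pod
metadata:
  name: match-fields-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name   # a node field, not a label key
            operator: In
            values:
            - node-1             # assumed node name
  containers:
  - name: nginx
    image: nginx
```

Unlike `matchExpressions`, this needs no `kubectl label` step, because every node already has a name.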
[root@master ~]# kubectl explain pod.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms.matchExpressions
KIND: Pod
VERSION: v1
FIELD: matchExpressions <[]NodeSelectorRequirement>
DESCRIPTION:
A list of node selector requirements by node's labels.
A node selector requirement is a selector that contains values, a key, and
an operator that relates the key and values.
FIELDS:
key -required-
The label key that the selector applies to.
operator -required- # matching rule
Represents a key's relationship to a set of values. Valid operators are In,
NotIn, Exists, DoesNotExist. Gt, and Lt.
Possible enum values:
- `"DoesNotExist"`
- `"Exists"`
- `"Gt"`
- `"In"`
- `"Lt"`
- `"NotIn"`
values <[]string>
An array of string values. If the operator is In or NotIn, the values array
must be non-empty. If the operator is Exists or DoesNotExist, the values
array must be empty. If the operator is Gt or Lt, the values array must have
a single element, which will be interpreted as an integer. This array is
replaced during a strategic merge patch.
Use the `operator` field to tell Kubernetes which logical operator to apply when it interprets a rule. It can be one of `In`, `NotIn`, `Exists`, `DoesNotExist`, `Gt`, and `Lt`.
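The following operators can be used in the `operator` field of both `nodeAffinity` and `podAffinity`.
Operator | Behavior |
---|---|
In | The label value is in the supplied set of strings |
NotIn | The label value is not in the supplied set of strings |
Exists | A label with this key exists on the object |
DoesNotExist | No label with this key exists on the object |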
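The following operators can only be used with `nodeAffinity`.
Operator | Behavior |
---|---|
Gt | The supplied value is parsed as an integer, and it must be less than the integer parsed from the node's label value (that is, the label value is greater than the supplied value) |
Lt | The supplied value is parsed as an integer, and it must be greater than the integer parsed from the node's label value (that is, the label value is less than the supplied value) |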
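`Gt` and `Lt` cannot be used with non-integer values. If the supplied value does not parse as an integer, the Pod cannot be scheduled. `Gt` and `Lt` are also not available for `podAffinity`.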
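To make the operator rules concrete, here is a small sketch that combines `Exists`, `NotIn`, and `Gt` in one term. The Pod name and the label keys `gpu` and `cpu-cores` are hypothetical; only `zone` appears elsewhere in this article.

```yaml
# Hypothetical Pod; all three expressions must match the same node (AND).
apiVersion: v1
kind: Pod
metadata:
  name: operator-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          # Exists: the node has a "gpu" label with any value; values must be omitted.
          - key: gpu
            operator: Exists
          # NotIn: the node's "zone" label is not "foo".
          - key: zone
            operator: NotIn
            values:
            - foo
          # Gt: the node's "cpu-cores" label, parsed as an integer, is greater than 8.
          # Exactly one value is allowed, and it is written as a string.
          - key: cpu-cores
            operator: Gt
            values:
            - "8"
  containers:
  - name: nginx
    image: nginx
```

Quoting `"8"` matters here because `values` is a list of strings, even though `Gt` interprets the entry as an integer.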
requiredDuringSchedulingIgnoredDuringExecution (hard affinity)
[root@master pod]# cat pod-nodeaffinity.yml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  labels:
    app: nginx
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard affinity
        nodeSelectorTerms:                              # match on nodes
        - matchExpressions:                             # label match expressions
          - key: kubernetes/test-pod                    # label key that must exist on the node
            operator: "In"                              # matching logic
            values:                                     # accepted label values
            - "node-1"
            - "testing"
  containers:
  - name: nginx
    image: daocloud.io/library/nginx
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80
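With this manifest, a node must carry a label whose key is `kubernetes/test-pod` and whose value is `node-1` or `testing`. No node has that label yet, so the Pod stays Pending:
[root@master pod]# kubectl apply -f pod-nodeaffinity.yml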
pod/test-pod created
[root@master pod]# kubectl get pod
NAME READY STATUS RESTARTS AGE
test-pod 0/1 Pending 0 3s
[root@master pod]# kubectl describe pod test-pod
Name: test-pod
Namespace: default
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 11s default-scheduler 0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
Add the label to any one of the worker nodes:
[root@master ~]# kubectl label node node-1 kubernetes/test-pod=node-1
node/node-1 labeled
[root@master ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
test-pod 1/1 Running 0 3m29s
preferredDuringSchedulingIgnoredDuringExecution (soft affinity)
[root@master pod]# cat pod-nodeaffinity-pre.yml
apiVersion: v1
kind: Pod
metadata:
  name: pod-2
  namespace: default
  labels:
    type: app
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: kubernetes/test-app
            operator: "In"
            values:
            - "type"
            - "type-2"
        weight: 1        # weight; the higher the value, the higher the priority
      - preference:
          matchExpressions:
          - key: zone
            operator: "In"
            values:
            - "foo"
            - "bar"
        weight: 10
  containers:
  - image: daocloud.io/library/nginx
    imagePullPolicy: IfNotPresent
    name: nginx-2
[root@master pod]# kubectl apply -f pod-nodeaffinity-pre.yml
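The node should preferably carry a label with key `kubernetes/test-app` and value `type` or `type-2` (weight 1), or a `zone` label with value `foo` or `bar` (weight 10). If no node has such a label, that is fine: the Pod is still scheduled, and the node is picked by the scheduler's other scoring rules. When several nodes match, the weights decide: the scheduler adds up the weights of the preferences each node satisfies and favors the node with the larger total.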
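The weight arithmetic can be sketched as follows. The node names, the labels assigned to them, and the `disktype` key are hypothetical and exist only to show how the totals add up; in a real cluster this total is combined with the scheduler's other scoring, so the final choice can differ.

```yaml
# Hypothetical nodes and labels, used only to illustrate the arithmetic:
#   node-a: zone=foo                 -> 10
#   node-b: disktype=ssd             -> 5
#   node-c: zone=foo, disktype=ssd   -> 10 + 5 = 15 (most preferred by this rule)
#   node-d: neither label            -> 0 (still schedulable)
apiVersion: v1
kind: Pod
metadata:
  name: weight-demo
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 10
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
      - weight: 5
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: nginx
    image: nginx
```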
Using hard and soft affinity together
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  containers:
  - name: with-node-affinity
    image: nginx
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - dev
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
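This node affinity rule says that the Pod may only be placed on nodes labeled `zone=dev`, and that among those nodes, ones labeled `disktype=ssd` are preferred.
If you specify both `nodeSelector` and `nodeAffinity`, both must be satisfied for the Pod to be scheduled onto a candidate node.
If you specify multiple `nodeSelectorTerms` under a `nodeAffinity` type, the Pod can be scheduled onto a node that satisfies any one of the terms.
If you specify multiple `matchExpressions` within a single `nodeSelectorTerms` entry, the Pod can be scheduled onto a node only if all of the `matchExpressions` are satisfied.
If you remove or change the labels of the node a Pod is running on, the Pod is not removed. Affinity selection applies only at scheduling time.
The `weight` field in `preferredDuringSchedulingIgnoredDuringExecution` ranges from 1 to 100, and a larger value means a higher priority. For each node, the scheduler sums the weights of the `matchExpressions` preferences that the node satisfies and combines that total with its other scoring to choose the node for the Pod.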
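The first note above, combining `nodeSelector` with `nodeAffinity`, might look like the sketch below. The Pod name is hypothetical; the `disktype` and `zone` labels reuse the keys from the previous example.

```yaml
# Hypothetical Pod: a node must satisfy BOTH nodeSelector and the required affinity.
apiVersion: v1
kind: Pod
metadata:
  name: selector-and-affinity-demo
spec:
  # Condition 1: the node must be labeled disktype=ssd.
  nodeSelector:
    disktype: ssd
  affinity:
    nodeAffinity:
      # Condition 2: the node's zone label must be dev or test.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - dev
            - test
  containers:
  - name: nginx
    image: nginx
```

A node labeled only `zone=dev` or only `disktype=ssd` is filtered out in this sketch, and the Pod stays Pending until some node carries both labels.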