Kubernetes treats the Pod, not the individual container, as the smallest deployable unit. To deploy an application, you must package it as a container running inside a Pod: an application runs in a container, but in Kubernetes every container belongs to a Pod. In practice Pods are rarely created directly; instead, higher-level workload resources and their controllers manage Pod replicas. Because those workload resources still use Pod templates to create the underlying Pods, Pod configuration remains unavoidable, so it is worth mastering how to create and manage Pods directly.
A Pod is the smallest deployable and manageable compute unit in Kubernetes. It encapsulates one or more tightly coupled containers and gives them a shared runtime environment. Key points:
- Shared network: containers in the same Pod share one IP address and port space, and can communicate with each other directly over localhost.
- Shared storage: Pod-level volumes (such as emptyDir and configMap) can be shared by all containers in the Pod.

YAML definition example:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: web
spec:
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80
  - name: sidecar
    image: busybox
    command: ["sh", "-c", "tail -f /dev/null"]
Run kubectl explain pod to see detailed documentation for each field.

Phases and status
A Pod's lifecycle passes through the following phases:
Status | Description |
---|---|
Pending | The Pod has been accepted, but scheduling or image pulls have not finished |
Running | Containers have been created and at least one is running |
Succeeded | All containers terminated normally (exit code 0) |
Failed | At least one container terminated abnormally (non-zero exit code) |
Unknown | The Pod's status cannot be obtained (usually due to a node communication failure) |
Key flows
- Graceful termination: the preStop hook runs, SIGTERM is sent to the containers, and the containers are force-killed after the grace period expires.
- Use Job or CronJob to run one-off tasks.
- Let controllers such as Deployment and StatefulSet manage Pods, providing self-healing and rolling updates.
- Set requests and limits to prevent resource contention.
- Use livenessProbe and readinessProbe to keep the application available.
- Use imagePullPolicy to control image update behavior (Always, IfNotPresent, Never).

The Pod is the cornerstone of Kubernetes orchestration: by abstracting a shared environment between containers, it simplifies deploying complex applications. Understanding its lifecycle, composition, and use cases is key to designing highly available, scalable services.
The following is a brief overview of Pods and containers.

Single-container Pod (the container runs inside the Pod's shared environment and its processes communicate over localhost).
Use case: simple applications where a single container does the job (such as running an Nginx web server).
Advantage: simple to deploy and light on resources; this is the most common usage pattern in Kubernetes.
Example:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:latest
Multi-container Pod
Use case: containers that must cooperate closely, for example a main application paired with a log-collection sidecar (as in the example below).
Advantage: shared network and storage reduce the overhead of cross-container communication.
Example:
spec:
  containers:
  - name: app
    image: my-app
  - name: log-agent
    image: fluentd
    volumeMounts:
    - name: logs
      mountPath: /var/log
  volumes:
  - name: logs
    emptyDir: {}
Init containers
Purpose: run initialization tasks before the main containers start (such as downloading dependencies or waiting for a database to become ready); the main containers start only after every init container has exited successfully.
Characteristics: init containers run one at a time, in order, and each must complete before the next begins.
Example:
spec:
  initContainers:
  - name: init-db
    image: busybox
    command: ['sh', '-c', 'until nslookup mysql-service; do echo waiting; sleep 2; done']
  containers:
  - name: app
    image: my-app
The following is a detailed look at defining a Pod with a YAML configuration file.

Core fields of a Pod YAML definition
apiVersion: v1        # Kubernetes API version (Pods belong to the core API, always v1)
kind: Pod             # resource type, here Pod
metadata:             # metadata identifying the Pod
  name: my-pod        # Pod name (unique within a namespace)
  namespace: default  # namespace the Pod belongs to (defaults to default)
  labels:             # labels used to select and group resources
    app: web
    env: dev
spec:                 # the Pod's specification
  containers:         # container list (required; at least one container)
  - name: nginx       # container name (unique within the Pod)
    image: nginx:1.25 # container image (required)
    imagePullPolicy: IfNotPresent  # image pull policy (Always/Never/IfNotPresent)
    ports:            # ports the container exposes (optional; informational only, does not actually open ports)
    - containerPort: 80
      protocol: TCP
    resources:        # resource requests and limits
      requests:
        memory: "128Mi"
        cpu: "0.5"
      limits:
        memory: "256Mi"
        cpu: "1"
    volumeMounts:     # mount volumes at paths inside the container
    - name: logs-volume
      mountPath: /var/log/nginx
  volumes:            # Pod-level volumes (mountable by all containers)
  - name: logs-volume
    emptyDir: {}      # an ephemeral empty directory used as the volume
Key field notes

metadata
- name: the Pod's unique name within its namespace (lowercase letters, digits, and -, starting and ending with an alphanumeric character).
- namespace: the namespace the Pod belongs to (default: default).
- labels: key-value labels used to select and manage Pods (e.g. kubectl get pods -l app=web).

spec
- containers: the list of containers in the Pod (the core of the configuration):
  - name: container name (unique within the Pod).
  - image: container image (e.g. nginx:latest, my-registry/app:v1).
  - imagePullPolicy: image pull policy:
    - Always: always pull from the registry (the default when the image tag is latest).
    - IfNotPresent: pull only when the image is not present locally.
    - Never: use only a local image.
  - ports: declares the ports the container listens on (informational only; the ports actually opened are determined by the container process).
  - resources: resource quotas (to avoid contention):
    - requests: the minimum resources the container needs to start (used by the scheduler).
    - limits: the runtime ceiling (a container exceeding it is throttled or terminated).
- volumes and volumeMounts:
  - volumes: Pod-level volumes (e.g. emptyDir, configMap, persistentVolumeClaim).
  - volumeMounts: mounts a volume at a given path inside a container (e.g. a log directory or configuration files).

Complete example: a multi-container Pod
apiVersion: v1
kind: Pod
metadata:
  name: web-app
  labels:
    app: frontend
spec:
  containers:
  - name: nginx
    image: nginx:1.25
    ports:
    - containerPort: 80
    volumeMounts:
    - name: config
      mountPath: /etc/nginx/conf.d
  - name: log-collector
    image: fluentd:latest
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  volumes:
  - name: config
    configMap:        # Nginx configuration stored in a ConfigMap
      name: nginx-config
  - name: logs
    emptyDir: {}      # ephemeral volume (data is lost when the Pod is deleted)
Steps
1. Save the configuration to a file (e.g. pod.yaml).
2. Create the Pod:
kubectl apply -f pod.yaml
3. Check the Pod's status:
kubectl get pods -o wide
kubectl describe pod web-app
Notes
- In production, manage Pods with a Deployment or StatefulSet (rolling updates, self-healing).
- Use kubectl explain pod to look up detailed field documentation.

The Pod is the smallest schedulable unit in Kubernetes, and its lifecycle from creation to termination involves several key phases and mechanisms, detailed below.
A Pod's overall state is described by its phase field, which has five possible values:
Phase | Description |
---|---|
Pending | The Pod has been accepted, but scheduling or image pulls have not finished; possible causes include insufficient resources, scheduling delays, or node failures. |
Running | The Pod has been scheduled to a node and at least one container is running (including main containers that started after the init containers completed). |
Succeeded | All containers terminated normally (exit code 0); typical for one-off tasks such as batch jobs. |
Failed | At least one container terminated abnormally (non-zero exit code or resource exhaustion) and will not be restarted. |
Unknown | The Pod's status cannot be obtained, usually because of a node communication failure or an unreachable API server. |
Key lifecycle mechanisms:
- .spec.schedulingGates can delay scheduling until a condition is met (such as a dependent resource becoming ready).
- The resources field sets CPU/memory requests and limits to prevent resource contention.
- On termination the Pod is given a grace period (terminationGracePeriodSeconds); once it expires, the containers are killed forcibly.

.spec.restartPolicy defines how containers are handled when they exit abnormally:
Policy | Description |
---|---|
Always | Always restart the container (the default; suits long-running services) |
OnFailure | Restart only when the container exits abnormally (non-zero exit code); suits task-style workloads |
Never | Never restart the container; leave Pod lifecycle management to a higher-level controller |

Example configuration:
spec:
  restartPolicy: OnFailure
  containers:
  - name: app
    image: my-app
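As an illustration of these semantics, the restart decision can be sketched as a small function. This is a simplification, not the kubelet's actual code: the real logic also applies exponential backoff and reacts to liveness-probe failures.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Illustrative restartPolicy semantics: decide whether a container
    that exited with exit_code would be restarted under the given policy."""
    if policy == "Always":
        return True                # restart regardless of exit code
    if policy == "OnFailure":
        return exit_code != 0      # restart only on abnormal exit
    return False                   # "Never": leave the container terminated

# A container that failed (exit code 1) is restarted under Always and
# OnFailure, but not under Never.
for policy in ("Always", "OnFailure", "Never"):
    print(policy, should_restart(policy, 1))
```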
A Pod's state is further refined by its conditions field:

Condition type | Description |
---|---|
PodScheduled | The Pod has been scheduled to a node |
Initialized | All init containers have finished |
ContainersReady | All main containers are ready |
Ready | The Pod can receive traffic (requires the readiness probe to pass) |
Best practices:
- Set requests and limits to avoid resource contention and keep the cluster stable.
- Use a preStop hook so the service goes offline smoothly (e.g. deregistering from a service registry).

A complete lifecycle example:

apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  initContainers:
  - name: init-db
    image: busybox
    command: ["sh", "-c", "until nslookup mysql; do sleep 2; done"]
  containers:
  - name: nginx
    image: nginx:latest
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "echo 'Started at $(date)' > /usr/share/nginx/html/start.html"]
      preStop:
        exec:
          command: ["nginx", "-s", "quit"]
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 20
  terminationGracePeriodSeconds: 60
Pod lifecycle management is at the core of Kubernetes orchestration, covering scheduling, initialization, health checks, graceful termination, and other key mechanisms. Understanding what triggers each phase and how to configure it improves application stability and maintainability. In practice, combine controllers and probes to automate operations and failure recovery.
Probe type | Purpose | On failure | Probe method (examples) | Key parameters (example) |
---|---|---|---|---|
Liveness Probe | Checks whether the container is alive (e.g. crashed process, deadlock). | Restart the container | httpGet, exec, tcpSocket | initialDelaySeconds: 15, periodSeconds: 10, failureThreshold: 3 |
Readiness Probe | Checks whether the container is ready to receive traffic. | Remove the Pod from Service traffic | httpGet: path=/ready | successThreshold: 1, timeoutSeconds: 5 |
Startup Probe | Checks whether the application has finished starting (for slow-starting services). | Restart the container; the other probes run only after it succeeds | tcpSocket: port=8080 | failureThreshold: 30, periodSeconds: 5 (allows up to 30×5 = 150 s to start) |
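The timing parameters combine into a worst-case detection window: an initial delay, then failureThreshold consecutive failed probes spaced periodSeconds apart. A quick sketch of the arithmetic (the helper name is ours, not part of any Kubernetes API):

```python
def max_detection_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Upper bound, in seconds, from container start until a probe is
    marked failed: the initial delay plus failure_threshold probe periods."""
    return initial_delay + period * failure_threshold

# Startup probe from the table: up to 30 failures, 5 s apart -> 150 s allowed.
print(max_detection_seconds(0, 5, 30))    # 150
# Liveness probe from the table: 15 s delay, then 3 failures 10 s apart.
print(max_detection_seconds(15, 10, 3))   # 45
```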
Additional notes
- Execution order: the Startup Probe runs first; the Liveness Probe and Readiness Probe start only after it succeeds.
- Parameter defaults:
  - initialDelaySeconds: 0 (probe immediately)
  - periodSeconds: 10 (probe every 10 seconds)
  - timeoutSeconds: 1 (a probe taking longer than 1 second counts as a failure)
  - failureThreshold: 3 (three consecutive failures mark the probe as failed)

The following example creates a Pod with two containers that share a volume used for communication between them.
(1) Create the Pod configuration file
[root@master ~]# vim two-containers-pod.yaml
[root@master ~]# cat two-containers-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-containers-pod
spec:
  # Pod-level settings
  restartPolicy: Never
  volumes:              # volume for sharing data
  - name: shared-data
    emptyDir: {}
  containers:
  # first container
  - name: nginx-container
    image: nginx
    volumeMounts:       # mount the shared volume
    - name: shared-data
      mountPath: /usr/share/nginx/html  # mount path
  # second container
  - name: busybox-container
    image: busybox
    volumeMounts:       # mount the shared volume
    - name: shared-data
      mountPath: /pod-data  # mount path
    # container start command and arguments
    command: ["/bin/sh"]
    args: ["-c", "echo Hello from the busybox container > /pod-data/index.html"]
[root@master ~]#
The configuration defines a shared volume named shared-data for the Pod. It is an emptyDir volume: it exists for as long as the Pod exists and is deleted only when the Pod is deleted.
Both containers mount the volume. The first container runs an nginx server and mounts the shared volume at /usr/share/nginx/html;
the second runs BusyBox and mounts it at /pod-data.
Note that the second container runs its start command, writes a message to the specified index.html file, and then exits. Because it shares the volume with the first container, the file lands in the nginx server's document root.
(2) Create the Pod from this configuration file.
[root@master ~]# kubectl apply -f two-containers-pod.yaml
pod/two-containers-pod created
(3) View the Pod and its containers, with output in YAML format
[root@master ~]# kubectl get pod two-containers-pod --output=yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/containerID: 105c87f280ac172e6204ec63850269d8fe691d1588220ad58cec5c515eef2fcf
    cni.projectcalico.org/podIP: 10.244.166.142/32
    cni.projectcalico.org/podIPs: 10.244.166.142/32
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"two-containers-pod","namespace":"default"},"spec":{"containers":[{"image":"nginx","name":"nginx-container","volumeMounts":[{"mountPath":"/usr/share/nginx/html","name":"shared-data"}]},{"args":["-c","echo Hello from the busybox container \u003e /pod-data/index.html"],"command":["/bin/sh"],"image":"busybox","name":"busybox-container","volumeMounts":[{"mountPath":"/pod-data","name":"shared-data"}]}],"restartPolicy":"Never","volumes":[{"emptyDir":{},"name":"shared-data"}]}}
  creationTimestamp: "2025-03-30T11:52:36Z"
  name: two-containers-pod
  namespace: default
  resourceVersion: "23136"
  uid: b8fb5a02-6fe9-4dfc-a096-c1e4d2a49370
spec:
  containers:
  - image: nginx
    imagePullPolicy: Always
    name: nginx-container
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /usr/share/nginx/html
      name: shared-data
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-mr94t
      readOnly: true
  - args:
    - -c
    - echo Hello from the busybox container > /pod-data/index.html
    command:
    - /bin/sh
    image: busybox
    imagePullPolicy: Always
    name: busybox-container
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /pod-data
      name: shared-data
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-mr94t
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: node1
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: shared-data
  - name: kube-api-access-mr94t
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2025-03-30T11:52:36Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2025-03-30T11:52:36Z"
    message: 'containers with unready status: [busybox-container]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2025-03-30T11:52:36Z"
    message: 'containers with unready status: [busybox-container]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2025-03-30T11:52:36Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://7ac5b9a00e3b01b3f15cd8c3f4c234f02fc39152aa393a13d86cb43a92a09dd8
    image: busybox:latest
    imageID: docker-pullable://busybox@sha256:37f7b378a29ceb4c551b1b5582e27747b855bbfaa73fa11914fe0df028dc581f
    lastState: {}
    name: busybox-container
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:       # terminated
        containerID: docker://7ac5b9a00e3b01b3f15cd8c3f4c234f02fc39152aa393a13d86cb43a92a09dd8
        exitCode: 0
        finishedAt: "2025-03-30T11:52:52Z"
        reason: Completed
        startedAt: "2025-03-30T11:52:52Z"
  - containerID: docker://0bce9a982c62efc5d22ca94d4a770a3690d1a62b2cb98c35141986b8b9d4f4b5
    image: nginx:latest
    imageID: docker-pullable://nginx@sha256:124b44bfc9ccd1f3cedf4b592d4d1e8bddb78b51ec2ed5056c52d3692baebc19
    lastState: {}
    name: nginx-container
    ready: true
    restartCount: 0
    started: true
    state:
      running:          # running
        startedAt: "2025-03-30T11:52:39Z"
  hostIP: 192.168.10.31
  phase: Running
  podIP: 10.244.166.142
  podIPs:
  - ip: 10.244.166.142
  qosClass: BestEffort
  startTime: "2025-03-30T11:52:36Z"
Notice that the busybox container has terminated while the nginx container is still running.
(4) Enter the nginx container's shell and send a request to the nginx server with curl
[root@master ~]# kubectl exec -it two-containers-pod -c nginx-container -- /bin/bash
root@two-containers-pod:/# curl localhost
Hello from the busybox container
root@two-containers-pod:/# exit
exit
Because the busybox container created the index.html file in the nginx document root, the file can be served here.
(5) Requesting the Pod's IP address with curl also serves the index.html file.
[root@master ~]# curl 10.244.166.142
Hello from the busybox container
[root@master ~]#
(6) Delete the Pod with the kubectl delete -f command
[root@master ~]# kubectl delete -f two-containers-pod.yaml
pod "two-containers-pod" deleted
[root@master ~]# kubectl get pod two-containers-pod --output=yaml
Error from server (NotFound): pods "two-containers-pod" not found
When defining a Pod, you can set the amount of resources each container needs, i.e. its resource quota, so that no container monopolizes resources and starves the others.
Kubernetes sets a container's quota through its resources field (.spec.containers[].resources), which has two subfields, requests and limits, for the lower and upper bounds of the quota.
In practice the two main resources are CPU and memory. CPU is measured in CPU units: 1 CPU equals one physical CPU core or one virtual core. CPU quantities may be integers or decimals, or be expressed in millicores (m); 1 CPU equals 1000m, and Kubernetes does not allow CPU resources finer than 1m. Memory is measured in bytes, written either as a plain integer or with a quantity suffix such as E, P, T, G, M, or k; the corresponding power-of-two suffixes Ei, Pi, Ti, Gi, Mi, and Ki may also be used.
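To make these units concrete, here is a minimal Python sketch that converts quantity strings into base units. It is illustrative only, not the actual Kubernetes quantity parser, and it handles only the suffixes listed above:

```python
def parse_cpu(q: str) -> float:
    """Return a CPU quantity in whole cores ("100m" -> 0.1, "0.5" -> 0.5)."""
    if q.endswith("m"):
        return int(q[:-1]) / 1000  # millicores: 1000m == 1 CPU
    return float(q)

def parse_memory(q: str) -> int:
    """Return a memory quantity in bytes ("200Mi" -> 209715200, "1G" -> 10**9)."""
    binary = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
    decimal = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}
    # Check two-letter binary suffixes first so "Mi" is not mistaken for "M".
    for suffix, factor in {**binary, **decimal}.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)  # plain integer: already bytes

print(parse_cpu("100m"))      # 0.1
print(parse_memory("200Mi"))  # 209715200
```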
The following example creates a Pod with two containers and sets CPU and memory quotas for each; the second container runs the stress program for load testing. stress is a Linux stress-testing tool that can exercise CPU, memory, disk, and more.
(1) Create the Pod configuration file
[root@master ~]# vim resources-limit-pod.yaml
[root@master ~]# cat resources-limit-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: resources-limit-pod
spec:
  containers:
  - name: nginx
    image: nginx
    resources:        # resource quotas
      limits:         # limits (upper bound)
        cpu: 200m     # CPU limit
        memory: 400Mi # memory limit
      requests:       # requests (lower bound)
        cpu: 100m
        memory: 200Mi
  - name: stress
    image: polinux/stress
    resources:        # resource quotas
      limits:         # limits (upper bound)
        memory: "200Mi"
      requests:       # requests (lower bound)
        memory: "100Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]
The last two lines are the second container's start command, which runs stress to occupy 150 MB of memory. The --vm option sets the number of worker processes, --vm-bytes the amount of memory each allocates, and --vm-hang the number of seconds to hold the allocation before freeing it.
(2) Create the Pod from this configuration file
[root@master ~]# kubectl apply -f resources-limit-pod.yaml
pod/resources-limit-pod created
(3) Verify that the containers in the Pod are running; both come up normally
[root@master ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
resources-limit-pod 2/2 Running 0 3m15s
(4) View the Pod details; each container's CPU and memory settings match the definition.
[root@master ~]# kubectl get pod resources-limit-pod --output=yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/containerID: 0132686d82d318297fc120bdc1aa9eb0b9220c6bdd8671e58a96df1c52c4ddb5
    cni.projectcalico.org/podIP: 10.244.166.143/32
    cni.projectcalico.org/podIPs: 10.244.166.143/32
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"resources-limit-pod","namespace":"default"},"spec":{"containers":[{"image":"nginx","name":"nginx","resources":{"limits":{"cpu":"200m","memory":"400Mi"},"requests":{"cpu":"100m","memory":"200Mi"}}},{"args":["--vm","1","--vm-bytes","150M","--vm-hang","1"],"command":["stress"],"image":"polinux/stress","name":"stress","resources":{"limits":{"memory":"200Mi"},"requests":{"memory":"100Mi"}}}]}}
  creationTimestamp: "2025-03-30T12:15:00Z"
  name: resources-limit-pod
  namespace: default
  resourceVersion: "25229"
  uid: 43751441-ee62-465b-80e8-85e9f5f693fd
spec:
  containers:
  - image: nginx
    imagePullPolicy: Always
    name: nginx
    resources:
      limits:
        cpu: 200m
        memory: 400Mi
      requests:
        cpu: 100m
        memory: 200Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-lsjhl
      readOnly: true
  - args:
    - --vm
    - "1"
    - --vm-bytes
    - 150M
    - --vm-hang
    - "1"
    command:
    - stress
    image: polinux/stress
    imagePullPolicy: Always
    name: stress
    resources:
      limits:
        memory: 200Mi
      requests:
        memory: 100Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-lsjhl
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: node1
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-lsjhl
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2025-03-30T12:15:00Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2025-03-30T12:15:50Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2025-03-30T12:15:50Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2025-03-30T12:15:00Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://6b98d30bc2f6a52f5902b68125adacb9efe76bfd15651ffc8549fc9fe57299b3
    image: nginx:latest
    imageID: docker-pullable://nginx@sha256:124b44bfc9ccd1f3cedf4b592d4d1e8bddb78b51ec2ed5056c52d3692baebc19
    lastState: {}
    name: nginx
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2025-03-30T12:15:06Z"
  - containerID: docker://799b744bea3112ae168924495b22689d54c3133f2393c9217f4f9d53f8e6e974
    image: polinux/stress:latest
    imageID: docker-pullable://polinux/stress@sha256:b6144f84f9c15dac80deb48d3a646b55c7043ab1d83ea0a697c09097aaad21aa
    lastState: {}
    name: stress
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2025-03-30T12:15:50Z"
  hostIP: 192.168.10.31
  phase: Running
  podIP: 10.244.166.143
  podIPs:
  - ip: 10.244.166.143
  qosClass: Burstable
  startTime: "2025-03-30T12:15:00Z"
[root@master ~]#
(5) Delete the Pod to restore the lab environment
[root@master ~]# kubectl delete -f resources-limit-pod.yaml
pod "resources-limit-pod" deleted
[root@master ~]# kubectl get pod
No resources found in default namespace.
[root@master ~]# kubectl get pod resources-limit-pod --output=yaml
Error from server (NotFound): pods "resources-limit-pod" not found
When a node has enough free resources, a container can use the resources it requests, but it is never allowed to use more than its limit. A container allocated more than its limit becomes a candidate for termination, and if it keeps consuming resources beyond the limit, it is killed. The following steps test and verify this.
(1) Modify the Pod configuration above, changing the last line to
args: ["--vm", "1", "--vm-bytes", "500M", "--vm-hang", "1"]
The stress container will now try to allocate 500 MB of memory, far above its 200 MiB limit.
(2) Save the configuration file and create a new Pod from it
[root@master ~]# cp resources-limit-pod.yaml resources-limit-pod-new.yml
[root@master ~]# vi resources-limit-pod-new.yml
[root@master ~]# cat resources-limit-pod-new.yml
apiVersion: v1
kind: Pod
metadata:
  name: resources-limit-pod
spec:
  containers:
  - name: nginx
    image: nginx
    resources:        # resource quotas
      limits:         # limits (upper bound)
        cpu: 200m     # CPU limit
        memory: 400Mi # memory limit
      requests:       # requests (lower bound)
        cpu: 100m
        memory: 200Mi
  - name: stress
    image: polinux/stress
    resources:        # resource quotas
      limits:         # limits (upper bound)
        memory: "200Mi"
      requests:       # requests (lower bound)
        memory: "100Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "500M", "--vm-hang", "1"]
[root@master ~]# kubectl apply -f resources-limit-pod-new.yml
pod/resources-limit-pod created
(3) Monitor the Pod's status with the following command
[root@master ~]# kubectl get pod -w
NAME READY STATUS RESTARTS AGE
resources-limit-pod 1/2 OOMKilled 2 (30s ago) 49s
resources-limit-pod 1/2 CrashLoopBackOff 2 (16s ago) 53s
resources-limit-pod 1/2 OOMKilled 3 (31s ago) 68s
resources-limit-pod 1/2 CrashLoopBackOff 3 (16s ago) 84s
resources-limit-pod 1/2 OOMKilled 4 (47s ago) 115s
resources-limit-pod 1/2 CrashLoopBackOff 4 (15s ago) 2m9s
resources-limit-pod 1/2 OOMKilled 5 (95s ago) 3m29s
resources-limit-pod 1/2 CrashLoopBackOff 5 (14s ago) 3m43s
^C[root@master ~]#
After waiting a while, press Ctrl+C to stop watching.
The output shows that one container (stress) is terminated, restarted, terminated again, and restarted again; by default a terminated container can be restarted, just like any other container managed by the runtime. The other container (nginx) stays up normally throughout.
(4) View the Pod details
[root@master ~]# kubectl describe pod resources-limit-pod
Name:             resources-limit-pod
Namespace:        default
Priority:         0
Service Account:  default
Node:             node1/192.168.10.31
Start Time:       Sun, 30 Mar 2025 20:33:58 +0800
Labels:           <none>
Annotations:      cni.projectcalico.org/containerID: be049fbfabe0d5ecf16f33fa74bbcd5672757e255e0e6a39dbd00a1240f87f7b
                  cni.projectcalico.org/podIP: 10.244.166.144/32
                  cni.projectcalico.org/podIPs: 10.244.166.144/32
Status:           Running
IP:               10.244.166.144
IPs:
  IP:  10.244.166.144
Containers:
  nginx:
    Container ID:   docker://8696e221f38e5a6c771d51831d6ed86d21d21a94714a06d7706ae1cb860b744d
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:124b44bfc9ccd1f3cedf4b592d4d1e8bddb78b51ec2ed5056c52d3692baebc19
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Sun, 30 Mar 2025 20:34:00 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     200m
      memory:  400Mi
    Requests:
      cpu:     100m
      memory:  200Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-69nzk (ro)
  stress:
    Container ID:  docker://09e63ba04e8a8771bec9b61a0240b1660a99bf8ac42e491e42269b7ea72c9595
    Image:         polinux/stress
    Image ID:      docker-pullable://polinux/stress@sha256:b6144f84f9c15dac80deb48d3a646b55c7043ab1d83ea0a697c09097aaad21aa
    Port:          <none>
    Host Port:     <none>
    Command:
      stress
    Args:
      --vm
      1
      --vm-bytes
      500M
      --vm-hang
      1
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    1
      Started:      Sun, 30 Mar 2025 20:37:27 +0800
      Finished:     Sun, 30 Mar 2025 20:37:27 +0800
    Ready:          False
    Restart Count:  5
    Limits:
      memory:  200Mi
    Requests:
      memory:  100Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-69nzk (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-69nzk:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  4m32s                  default-scheduler  Successfully assigned default/resources-limit-pod to node1
  Normal   Pulling    4m31s                  kubelet            Pulling image "nginx"
  Normal   Pulled     4m30s                  kubelet            Successfully pulled image "nginx" in 1.209s (1.209s including waiting)
  Normal   Created    4m30s                  kubelet            Created container nginx
  Normal   Started    4m30s                  kubelet            Started container nginx
  Normal   Pulled     4m22s                  kubelet            Successfully pulled image "polinux/stress" in 8.251s (8.251s including waiting)
  Normal   Pulled     4m14s                  kubelet            Successfully pulled image "polinux/stress" in 7.913s (7.913s including waiting)
  Normal   Pulled     3m55s                  kubelet            Successfully pulled image "polinux/stress" in 3.821s (3.821s including waiting)
  Normal   Pulling    3m26s (x4 over 4m30s)  kubelet            Pulling image "polinux/stress"
  Normal   Created    3m24s (x4 over 4m22s)  kubelet            Created container stress
  Normal   Started    3m24s (x4 over 4m22s)  kubelet            Started container stress
  Warning  BackOff    3m24s (x5 over 4m12s)  kubelet            Back-off restarting failed container stress in pod resources-limit-pod_default(f3f8166f-7ba2-4e1a-b967-a406b51430fb)
  Normal   Pulled     3m24s                  kubelet            Successfully pulled image "polinux/stress" in 2.069s (2.069s including waiting)
The result shows the stress container was killed because it ran out of memory (Reason: OOMKilled).
(5) Delete the Pod to restore the lab environment
[root@master ~]# kubectl delete -f resources-limit-pod-new.yml
pod "resources-limit-pod" deleted
[root@master ~]# kubectl get pod
No resources found in default namespace.
Kubernetes liveness probes implement health checks: by testing whether a container responds normally, Kubernetes decides whether to restart it. Defining a liveness probe in a Pod lets Kubernetes automatically detect whether the Pod is running properly. The following example uses the HTTP GET method to demonstrate container health checks.
(1)创建pod配置文件
[root@master ~]# vim liveness-probe-pod.yaml
[root@master ~]# cat liveness-probe-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-probe-pod
spec:
  containers:
  - name: liveness-probe
    image: nginx
    livenessProbe:            # liveness probe definition
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 10 # start probing 10 s after the container starts
      timeoutSeconds: 2       # the container must respond within 2 s, or the probe fails
      periodSeconds: 30       # probe interval: once every 30 s
      successThreshold: 1     # one success marks the probe successful
      failureThreshold: 3     # three consecutive failures mark it failed
[root@master ~]#
The probe sends HTTP GET requests to the container's port 80; if a request fails, Kubernetes restarts the container. The file customizes the probe: probing begins 10 seconds after the container starts; a container that does not respond within 2 seconds fails the probe; probes run every 30 seconds; and after 3 consecutive failures the container is restarted.
(2) Create the Pod from the configuration file
[root@master ~]# kubectl apply -f liveness-probe-pod.yaml
pod/liveness-probe-pod created
(3) View the Pod's details
[root@master ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
liveness-probe-pod 1/1 Running 0 118s
[root@master ~]# kubectl describe pod liveness-probe-pod
Name:             liveness-probe-pod
Namespace:        default
Priority:         0
Service Account:  default
Node:             node1/192.168.10.31
Start Time:       Sun, 30 Mar 2025 20:47:08 +0800
Labels:           <none>
Annotations:      cni.projectcalico.org/containerID: 711ec9d867e754f658537ac8e0ea66e51a41db795cd55aab978ec7dd8654e502
                  cni.projectcalico.org/podIP: 10.244.166.145/32
                  cni.projectcalico.org/podIPs: 10.244.166.145/32
Status:           Running
IP:               10.244.166.145
IPs:
  IP:  10.244.166.145
Containers:
  liveness-probe:
    Container ID:   docker://6159e1ef29854643decf37b0df2cd4d5d82f6e316e476ae07118bbc7d725178b
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:124b44bfc9ccd1f3cedf4b592d4d1e8bddb78b51ec2ed5056c52d3692baebc19
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Sun, 30 Mar 2025 20:47:11 +0800
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:80/ delay=10s timeout=2s period=30s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pw9z2 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-pw9z2:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m14s  default-scheduler  Successfully assigned default/liveness-probe-pod to node1
  Normal  Pulling    2m14s  kubelet            Pulling image "nginx"
  Normal  Pulled     2m12s  kubelet            Successfully pulled image "nginx" in 2.054s (2.054s including waiting)
  Normal  Created    2m12s  kubelet            Created container liveness-probe
  Normal  Started    2m12s  kubelet            Started container liveness-probe
The Pod is currently Running and its restart count (Restart Count) is 0, which shows it has not been restarted and the container has stayed healthy the whole time. A restart count greater than 0 would mean the container has been restarted, i.e. it has an "unhealthy" episode in its history.
(4) Delete the Pod to restore the lab environment
[root@master ~]# kubectl delete pod liveness-probe-pod
pod "liveness-probe-pod" deleted
[root@master ~]# kubectl get pod
No resources found in default namespace.
The example above shows the most common probe method: an HTTP GET request is sent to the container, and if the probe receives a "2xx" or "3xx" response, the container is considered healthy.
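That success criterion can be stated in one line. The function below is purely illustrative (it is not part of any Kubernetes API); it captures the rule that any status code from 200 up to, but not including, 400 counts as healthy:

```python
def http_probe_healthy(status_code: int) -> bool:
    """An HTTP GET probe succeeds when the response status is 2xx or 3xx."""
    return 200 <= status_code < 400

print(http_probe_healthy(200))  # True  (OK)
print(http_probe_healthy(302))  # True  (redirects still count as healthy)
print(http_probe_healthy(500))  # False (server error -> probe failure)
```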
An environment variable is a variable set in a Pod container's runtime environment, allowing flexible container configuration. When creating a Pod, environment variables are set through the .spec.containers[].env and .spec.containers[].envFrom fields of the configuration file.
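As a minimal sketch in the same style as the earlier examples (the Pod name and variable name here are illustrative, not from the text above), an environment variable is set per container under env:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: env-demo            # illustrative name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "echo $GREETING && sleep 3600"]
    env:                    # per-container environment variables
    - name: GREETING        # illustrative variable name
      value: "Hello from the environment"
```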