Learning Kubernetes: The Job Controller

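I. Understanding the Job Controller
  The Job controller schedules Pod objects to run one-off tasks. When the process in a container exits normally, it is not restarted; instead, the Pod is placed in the "Completed" state. If the process terminates because of an error, the configuration determines whether it is restarted. A Pod that has not finished and is terminated unexpectedly because its node failed is rescheduled.
  The Job controller creates Pods according to the definition in the Job spec and keeps monitoring their status until they finish successfully. If a Pod fails, the user-defined restartPolicy (only OnFailure and Never are supported, not Always) determines how the task is retried: with OnFailure the failed container is restarted inside the same Pod, while with Never the controller creates a new Pod to retry the task.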
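  In practice, some tasks need to run more than once, and users can configure them to run serially or in parallel. In summary, there are two kinds of such Job objects:
  1) Serial Job with a single work queue: several one-off tasks are executed one after another until the desired number of completions is reached. This can also be seen as a Job with a parallelism of 1, where only one Pod exists at any given moment.
  2) Parallel Job with multiple work queues: here you set the number of work queues. Each queue can run exactly one task, or a limited number of queues can work through a larger number of tasks (fewer queues than total tasks), which amounts to running several serial queues side by side. .spec.parallelism defines the degree of parallelism and .spec.completions defines the total number of tasks. Setting .spec.parallelism to 1 together with .spec.completions makes the Job controller run the tasks serially; setting .spec.parallelism to 2 or more runs multiple queues in parallel. If .spec.completions is greater than .spec.parallelism, each queue runs several tasks in sequence (the multi-queue serial mode).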
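  As a minimal sketch of the serial mode described above, the manifest below sets .spec.parallelism to 1 and .spec.completions to 5, so the controller should run five Pods one after another. The Job name and the sleep duration are hypothetical values chosen for illustration; they do not come from the lab that follows.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-serial-example   # hypothetical name, for illustration only
spec:
  completions: 5             # total number of Pods that must finish successfully
  parallelism: 1             # at most one Pod at a time, so the runs are serial
  template:
    spec:
      containers:
      - name: myjob
        image: alpine
        command: ["/bin/sh", "-c", "sleep 10"]   # assumed placeholder workload
      restartPolicy: Never
```

  Raising parallelism to 2 or more while keeping completions larger than parallelism would turn the same manifest into the multi-queue mode.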
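  Once its Pods have finished, a Job no longer consumes compute resources, although the Job and its Pod objects remain until they are deleted. Users can keep them as needed or remove them with the resource delete command. However, if the containerized application of a Job can never finish successfully and its restartPolicy is set to restart it (OnFailure), the Job may be stuck in an endless loop of restarts and errors. Fortunately, the Job controller provides two fields to curb this:
  1) .spec.activeDeadlineSeconds: the Job's deadline, which specifies the maximum time it may stay active; a Job that exceeds it is terminated;
  2) .spec.backoffLimit: the number of retries allowed before the Job is marked as failed; the default is 6.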
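  The two fields above could be combined roughly as follows. This is only a sketch: the Job name, the limits of 3 retries and 100 seconds, and the deliberately failing command are assumptions made for illustration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-deadline-example   # hypothetical name, for illustration only
spec:
  backoffLimit: 3              # mark the Job as failed after 3 retries (default is 6)
  activeDeadlineSeconds: 100   # terminate the Job once it has been active for 100 seconds
  template:
    spec:
      containers:
      - name: myjob
        image: alpine
        command: ["/bin/sh", "-c", "exit 1"]   # assumed workload that always fails
      restartPolicy: Never
```

  With a workload that always fails, whichever limit is reached first should stop the Job instead of letting it retry indefinitely.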

II. Job Controller Lab

1) Write the YAML file that creates the Job

]# cat job.yml 
apiVersion: batch/v1
kind: Job
metadata:
  name: job-example
spec:
  template:
    spec:
      containers:
      - name: myjob
        image: alpine
        command: ["/bin/sh","-c","sleep 120"]
      restartPolicy: Never
      
]# kubectl apply -f job.yml 
job.batch/job-example created

2) View the Pod and Job details

]# kubectl get jobs -o wide 
NAME          COMPLETIONS   DURATION   AGE   CONTAINERS   IMAGES   SELECTOR
job-example   0/1           59s        59s   myjob        alpine   controller-uid=9da99a63-6e39-4e7e-a974-fc0dd6e6216a

]# kubectl get pods -o wide 
NAME                READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
job-example-hqc7l   1/1     Running   0          65s   10.244.2.23   node2   <none>           <none>

]# kubectl get jobs -o wide 
NAME          COMPLETIONS   DURATION   AGE     CONTAINERS   IMAGES   SELECTOR
job-example   1/1           2m9s       2m23s   myjob        alpine   controller-uid=9da99a63-6e39-4e7e-a974-fc0dd6e6216a

After the 120 seconds have elapsed, the Job's COMPLETIONS column changes to 1/1, which means the Job has completed.

3) View the Job's detailed information

]# kubectl describe job job-example
Name:           job-example
Namespace:      default
Selector:       controller-uid=9da99a63-6e39-4e7e-a974-fc0dd6e6216a
Labels:         controller-uid=9da99a63-6e39-4e7e-a974-fc0dd6e6216a
                job-name=job-example
Annotations:    Parallelism:  1
Completions:    1
Start Time:     Mon, 10 Aug 2020 09:08:20 +0800
Completed At:   Mon, 10 Aug 2020 09:10:29 +0800
Duration:       2m9s
Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=9da99a63-6e39-4e7e-a974-fc0dd6e6216a
           job-name=job-example
  Containers:
   myjob:
    Image:      alpine
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
      -c
      sleep 120
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age    From            Message
  ----    ------            ----   ----            -------
  Normal  SuccessfulCreate  3m56s  job-controller  Created pod: job-example-hqc7l
  Normal  Completed         107s   job-controller  Job completed

4) Create a parallel-queue Job

]# cat job-muilt.yml 
apiVersion: batch/v1
kind: Job
metadata:
  name: job-mulit
spec:
  completions: 6
  parallelism: 2
  template:
    spec:
      containers:
      - name: myjob
        image: alpine
        command: ["/bin/sh", "-c", "sleep 20"]
      restartPolicy: OnFailure

]# kubectl apply -f job-muilt.yml

5) Watch the Job status

]# kubectl get job -o wide -w 
NAME        COMPLETIONS   DURATION   AGE   CONTAINERS   IMAGES   SELECTOR
job-mulit   0/6                      0s    myjob        alpine   controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41
job-mulit   0/6           0s         0s    myjob        alpine   controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41

job-mulit   1/6           22s        22s   myjob        alpine   controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41
job-mulit   2/6           25s        25s   myjob        alpine   controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41

job-mulit   3/6           44s        44s   myjob        alpine   controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41
job-mulit   4/6           46s        46s   myjob        alpine   controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41

job-mulit   5/6           72s        72s   myjob        alpine   controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41
job-mulit   6/6           73s        73s   myjob        alpine   controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41


]# kubectl get pods -o wide  -w 
NAME              READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
job-mulit-fj7q8   0/1     Pending   0          0s    <none>   <none>   <none>           <none>
job-mulit-fj7q8   0/1     Pending   0          0s    <none>   node2    <none>           <none>
job-mulit-cbcgv   0/1     Pending   0          0s    <none>   <none>   <none>           <none>
job-mulit-cbcgv   0/1     Pending   0          0s    <none>   node1    <none>           <none>
job-mulit-fj7q8   0/1     ContainerCreating   0          0s    <none>   node2    <none>           <none>
job-mulit-cbcgv   0/1     ContainerCreating   0          0s    <none>   node1    <none>           <none>
job-mulit-cbcgv   1/1     Running             0          2s    10.244.1.70   node1    <none>           <none>
job-mulit-fj7q8   1/1     Running             0          6s    10.244.2.27   node2    <none>           <none>

job-mulit-cbcgv   0/1     Completed           0          22s   10.244.1.70   node1    <none>           <none>
job-mulit-sbmth   0/1     Pending             0          0s    <none>        <none>   <none>           <none>
job-mulit-sbmth   0/1     Pending             0          0s    <none>        node1    <none>           <none>
job-mulit-sbmth   0/1     ContainerCreating   0          0s    <none>        node1    <none>           <none>
job-mulit-sbmth   1/1     Running             0          2s    10.244.1.71   node1    <none>           <none>
job-mulit-fj7q8   0/1     Completed           0          25s   10.244.2.27   node2    <none>           <none>
job-mulit-xln8h   0/1     Pending             0          0s    <none>        <none>   <none>           <none>
job-mulit-xln8h   0/1     Pending             0          0s    <none>        node2    <none>           <none>
job-mulit-xln8h   0/1     ContainerCreating   0          0s    <none>        node2    <none>           <none>
job-mulit-xln8h   1/1     Running             0          2s    10.244.2.28   node2    <none>           <none>

job-mulit-sbmth   0/1     Completed           0          22s   10.244.1.71   node1    <none>           <none>
job-mulit-jkv78   0/1     Pending             0          0s    <none>        <none>   <none>           <none>
job-mulit-jkv78   0/1     Pending             0          0s    <none>        node1    <none>           <none>
job-mulit-jkv78   0/1     ContainerCreating   0          0s    <none>        node1    <none>           <none>
job-mulit-xln8h   0/1     Completed           0          21s   10.244.2.28   node2    <none>           <none>
job-mulit-x49rj   0/1     Pending             0          0s    <none>        <none>   <none>           <none>
job-mulit-x49rj   0/1     Pending             0          0s    <none>        node2    <none>           <none>
job-mulit-x49rj   0/1     ContainerCreating   0          0s    <none>        node2    <none>           <none>
job-mulit-jkv78   1/1     Running             0          8s    10.244.1.72   node1    <none>           <none>
job-mulit-x49rj   1/1     Running             0          7s    10.244.2.29   node2    <none>           <none>
job-mulit-jkv78   0/1     Completed           0          28s   10.244.1.72   node1    <none>           <none>
job-mulit-x49rj   0/1     Completed           0          27s   10.244.2.29   node2    <none>           <none>

Watching the Job and Pod status shows that the Pods of this Job run and finish two at a time, which matches the parallelism of 2 that we defined.

6) View the Job's detailed information

]# kubectl describe jobs job-mulit
Name:           job-mulit
Namespace:      default
Selector:       controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41
Labels:         controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41
                job-name=job-mulit
Annotations:    Parallelism:  2
Completions:    6
Start Time:     Mon, 10 Aug 2020 09:19:19 +0800
Completed At:   Mon, 10 Aug 2020 09:20:32 +0800
Duration:       73s
Pods Statuses:  0 Running / 6 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=86b6fb38-0e4f-4d3b-a0aa-7502c49b7d41
           job-name=job-mulit
  Containers:
   myjob:
    Image:      alpine
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
      -c
      sleep 20
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age    From            Message
  ----    ------            ----   ----            -------
  Normal  SuccessfulCreate  3m58s  job-controller  Created pod: job-mulit-fj7q8
  Normal  SuccessfulCreate  3m58s  job-controller  Created pod: job-mulit-cbcgv
  Normal  SuccessfulCreate  3m36s  job-controller  Created pod: job-mulit-sbmth
  Normal  SuccessfulCreate  3m33s  job-controller  Created pod: job-mulit-xln8h
  Normal  SuccessfulCreate  3m14s  job-controller  Created pod: job-mulit-jkv78
  Normal  SuccessfulCreate  3m12s  job-controller  Created pod: job-mulit-x49rj
  Normal  Completed         2m45s  job-controller  Job completed
