部署的pod处于CrashLoopBackOff状态

1 问题描述

使用命令kubectl create -f myubuntu_deploy.yaml --record生成pod,结果显示pod处于CrashLoopBackOff状态。

CrashLoopBackOff 告诉我们,Kubernetes 正在尽力启动这个 Pod,但是一个或多个容器已经挂了,或者正被删除。

This is what I keep getting:

[root@centos-master ~]# kubectl get pods
NAME               READY     STATUS             RESTARTS   AGE
nfs-server-h6nw8   1/1       Running            0          1h
nfs-web-07rxz      0/1       CrashLoopBackOff   8          16m
nfs-web-fdr9h      0/1       CrashLoopBackOff   8          16m

Below is output from "describe pods" kubectl describe pods

Events:
  FirstSeen LastSeen    Count   From                SubobjectPath       Type        Reason      Message
  --------- --------    -----   ----                -------------       --------    ------      -------
  16m       16m     1   {default-scheduler }                    Normal      Scheduled   Successfully assigned nfs-web-fdr9h to centos-minion-2
  16m       16m     1   {kubelet centos-minion-2}   spec.containers{web}    Normal      Created     Created container with docker id 495fcbb06836
  16m       16m     1   {kubelet centos-minion-2}   spec.containers{web}    Normal      Started     Started container with docker id 495fcbb06836
  16m       16m     1   {kubelet centos-minion-2}   spec.containers{web}    Normal      Started     Started container with docker id d56f34ae4e8f
  16m       16m     1   {kubelet centos-minion-2}   spec.containers{web}    Normal      Created     Created container with docker id d56f34ae4e8f
  16m       16m     2   {kubelet centos-minion-2}               Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "web" with CrashLoopBackOff: "Back-off 10s restarting failed container=web pod=nfs-web-fdr9h_default(461c937d-d870-11e6-98de-005056040cc2)"

I have two pods: nfs-web-07rxz, nfs-web-fdr9h, but if I do "kubectl logs nfs-web-07rxz" or with "-p" option I don't see any log in both pods.

[root@centos-master ~]# kubectl logs nfs-web-07rxz -p
[root@centos-master ~]# kubectl logs nfs-web-07rxz

This is my replicationController yaml file: replicationController yaml file

apiVersion: v1 kind: ReplicationController metadata:   name: nfs-web spec:   replicas: 2   selector:
    role: web-frontend   template:
    metadata:
      labels:
        role: web-frontend
    spec:
      containers:
      - name: web
        image: eso-cmbu-docker.artifactory.eng.vmware.com/demo-container:demo-version3.0
        ports:
          - name: web
            containerPort: 80
        securityContext:
          privileged: true

My Docker image was made from this simple docker file:

FROM ubuntu
RUN apt-get update
RUN apt-get install -y nginx
RUN apt-get install -y nfs-common

I am running my kubernetes cluster on CentOs-1611, kube version:

[root@centos-master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

If I run the docker image by "docker run" I was able to run the image without any issue, only through kubernetes I got the crash.

Can someone help me out, how can I debug without seeing any log?

The entire dokcerfile is just one command "FROM ubuntu" and it is still crashing

2 解决方法

you need to have your Dockerfile have a Command to run or have your ReplicationController specify a command.

The pod is crashing because it starts up then immediately exits, thus Kubernetes restarts and the cycle continues.

查看了我制作镜像的Dockerfile,是dockerfile文件中最后的CMD命令出错。

修改后执行命令重新生成镜像:

docker build -t mynginx:1.13.9 .

执行命令:kubectl create -f nginx_deploy.yaml --record生成pod

deployment文件:

root@master:~/deployment# cat nginx_deploy.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: mynginx:1.13.9
        ports:
        - containerPort: 80

3 常用操作

使用以下命令可以看到目前集群里的信息:

master

 
      
1
2
3
4
5
6
 
      
kubectl get po # 查看目前所有的pod
kubectl get rs # 查看目前所有的replica set
kubectl get deployment # 查看目前所有的deployment
kubectl describe po my-nginx # 查看my-nginx pod的详细状态
kubectl describe rs my-nginx # 查看my-nginx replica set的详细状态
kubectl describe deployment my-nginx # 查看my-nginx deployment的详细状态
7     kubectl get eventskubectl get events查看相关事件

8  kubectl delete deployment my-nginx

参考:

1 https://stackoverflow.com/questions/41604499/my-kubernetes-pods-keep-crashing-with-crashloopbackoff-but-i-cant-find-any-lo


你可能感兴趣的:(kubernetes)