Fixing k8s pod/weave-net-xx*** CrashLoopBackOff: weave-kube keeps restarting

A machine in our k8s test environment had its kernel upgraded to 4.12 to get around a devicemapper problem. After the reboot it ran into a different issue: the weave-kube container on that host kept restarting. Logs below:

  • docker logs da256187d80b
INFO: 2017/08/10 07:16:08.232950 Command line options: map[datapath:datapath ipalloc-init:consensus=2 name:76:e7:10:71:17:69 status-addr:0.0.0.0:6782 docker-api: http-addr:127.0.0.1:6784 ipalloc-range:10.32.0.0/12 nickname:iZuf6agwpkr1k313nwz91iZ no-dns:true port:6783]
INFO: 2017/08/10 07:16:08.248006 Communication between peers is unencrypted.
INFO: 2017/08/10 07:16:08.253298 Our name is 76:e7:10:71:17:69(iZuf6agwpkr1k313nwz91iZ)
INFO: 2017/08/10 07:16:08.253340 Launch detected - using supplied peer list: [10.12.0.100 10.12.0.252]
INFO: 2017/08/10 07:16:08.259119 [allocator 76:e7:10:71:17:69] Initialising with persisted data
INFO: 2017/08/10 07:16:08.259251 Sniffing traffic on datapath (via ODP)
INFO: 2017/08/10 07:16:08.274114 ->[10.12.0.100:6783] attempting connection
INFO: 2017/08/10 07:16:08.274394 ->[10.12.0.252:6783] attempting connection
INFO: 2017/08/10 07:16:08.287067 ->[10.12.0.252:59991] connection accepted
INFO: 2017/08/10 07:16:08.287968 ->[10.12.0.252:59991|76:e7:10:71:17:69(iZuf6agwpkr1k313nwz91iZ)]: connection shutting down due to error: cannot connect to ourself
INFO: 2017/08/10 07:16:08.288367 ->[10.12.0.100:6783|5a:82:11:b4:24:b9(iZuf60abyslw0fgs3ar2vmZ)]: connection ready; using protocol version 2
INFO: 2017/08/10 07:16:08.293217 overlay_switch ->[5a:82:11:b4:24:b9(iZuf60abyslw0fgs3ar2vmZ)] using fastdp
INFO: 2017/08/10 07:16:08.296917 ->[10.12.0.100:6783|5a:82:11:b4:24:b9(iZuf60abyslw0fgs3ar2vmZ)]: connection added (new peer)
INFO: 2017/08/10 07:16:08.306995 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2017/08/10 07:16:08.307805 Listening for metrics requests on 0.0.0.0:6782
INFO: 2017/08/10 07:16:08.334478 ->[10.12.0.252:6783|76:e7:10:71:17:69(iZuf6agwpkr1k313nwz91iZ)]: connection shutting down due to error: cannot connect to ourself
INFO: 2017/08/10 07:16:08.362250 EMSGSIZE on send, expecting PMTU update (IP packet was 60028 bytes, payload was 60020 bytes)
INFO: 2017/08/10 07:16:08.362542 overlay_switch ->[5a:82:11:b4:24:b9(iZuf60abyslw0fgs3ar2vmZ)] using sleeve
INFO: 2017/08/10 07:16:08.362628 ->[10.12.0.100:6783|5a:82:11:b4:24:b9(iZuf60abyslw0fgs3ar2vmZ)]: connection fully established
INFO: 2017/08/10 07:16:08.383371 sleeve ->[10.12.0.100:6783|5a:82:11:b4:24:b9(iZuf60abyslw0fgs3ar2vmZ)]: Effective MTU verified at 1438
INFO: 2017/08/10 07:16:08.383408 overlay_switch ->[5a:82:11:b4:24:b9(iZuf60abyslw0fgs3ar2vmZ)] using fastdp
exit status 1: iptables: No chain/target/match by that name.
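The last line is the actual failure: the launch script exits because an iptables call fails. A quick way to confirm this on the host itself (the WEAVE chain name below is weave's usual default in the nat table, an assumption rather than something taken from the log):

iptables-save | grep -i weave   # any weave rules left after the reboot?
iptables -t nat -L WEAVE -n     # a missing chain prints the same "No chain/target/match by that name." error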
  • kubectl get pods -n kube-system -o wide
  • kubectl describe pods weave-net-xx690 -n kube-system
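The same details shown in the screenshots can be pulled as text with kubectl; the pod name comes from the describe command above, and the name=weave-net label and the weave container name come from the DaemonSet further down:

kubectl -n kube-system get pods -l name=weave-net -o wide
kubectl -n kube-system get pod weave-net-xx690 -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
kubectl -n kube-system logs weave-net-xx690 -c weave --previous   # logs of the crashed container instance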

  • Searching weave's GitHub issues turned up the one below. The crash appears to be related to ulogd and is a weave bug that was fixed in 1.8.1:
https://github.com/weaveworks/weave/issues/2653
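As a quick sanity check of the ulogd theory (these commands are just a check on the host, not something from the issue):

ps -ef | grep [u]logd                 # is ulogd running on this host?
iptables-save | grep -E 'ULOG|NFLOG'  # netfilter logging targets that ulogd typically hooks into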
  • The deployment was using http://k8s.oss-cn-shanghai.aliyuncs.com/kube/weave-kube-1.7.2, so switch the images to 1.8.2:
registry.cn-hangzhou.aliyuncs.com/kargo/weave-kube:1.8.2
registry.cn-hangzhou.aliyuncs.com/kargo/weave-npc:1.8.2
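If you would rather not re-apply the whole manifest, the images can also be bumped in place with kubectl set image (the DaemonSet and container names are taken from the yaml below):

kubectl -n kube-system set image daemonset/weave-net \
    weave=registry.cn-hangzhou.aliyuncs.com/kargo/weave-kube:1.8.2 \
    weave-npc=registry.cn-hangzhou.aliyuncs.com/kargo/weave-npc:1.8.2

Either way, the running weave-net pods keep the old image until they are deleted; see the note after the yaml.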
  • Workaround: kill ulogd on the affected host, then restart docker
>ps -ef |grep ulogd
root      1708  1600  0 15:33 ?        00:00:00 /usr/sbin/ulogd -v
root     11068  1055  0 15:47 pts/0    00:00:00 grep --color=auto ulogd

>kill -9 1708

>systemctl restart docker
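Killing the process only lasts until the next boot. If ulogd is not actually needed on this host, stopping and disabling its service keeps it from coming back (the unit may be named ulogd or ulogd2 depending on the distro, so adjust accordingly):

systemctl stop ulogd       # or ulogd2, depending on the package
systemctl disable ulogd
systemctl restart docker   # then restart docker as above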

  • The weave-kube.yaml used:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: weave-net
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        name: weave-net
      annotations:
        scheduler.alpha.kubernetes.io/tolerations: |
          [
            {
              "key": "dedicated",
              "operator": "Equal",
              "value": "master",
              "effect": "NoSchedule"
            }
          ]
    spec:
      hostNetwork: true
      hostPID: true
      containers:
        - name: weave
          #image: registry.cn-hangzhou.aliyuncs.com/google-containers/weave-kube:1.7.2
          image: registry.cn-hangzhou.aliyuncs.com/kargo/weave-kube:1.8.2
          command:
            - /home/weave/launch.sh
          env:
            - name: IPALLOC_RANGE
              value: 10.32.0.0/12
          livenessProbe:
            initialDelaySeconds: 32
            httpGet:
              host: 127.0.0.1
              path: /status
              port: 6784
          securityContext:
            privileged: true
          volumeMounts:
            - name: weavedb
              mountPath: /weavedb
            - name: cni-bin
              mountPath: /opt
            - name: cni-bin2
              mountPath: /host_home
            - name: cni-conf
              mountPath: /etc
          resources:
            requests:
              cpu: 10m
        - name: weave-npc
          #image: registry.cn-hangzhou.aliyuncs.com/google-containers/weave-npc:1.7.2
          image: registry.cn-hangzhou.aliyuncs.com/kargo/weave-npc:1.8.2
          resources:
            requests:
              cpu: 10m
          securityContext:
            privileged: true
      restartPolicy: Always
      volumes:
        - name: weavedb
          emptyDir: {}
        - name: cni-bin
          hostPath:
            path: /opt
        - name: cni-bin2
          hostPath:
            path: /home
        - name: cni-conf
          hostPath:
            path: /etc
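Note that a DaemonSet at this API version does not roll its pods automatically: after applying the updated manifest, the existing weave-net pods keep running the old image until they are deleted. A minimal sequence, assuming the manifest is saved locally as weave-kube.yaml:

kubectl apply -f weave-kube.yaml
kubectl -n kube-system delete pod -l name=weave-net        # the DaemonSet recreates them with the 1.8.2 images
kubectl -n kube-system get pods -l name=weave-net -o wide  # confirm they come back Running instead of CrashLoopBackOff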
  • misc
kubectl apply -f http://k8s.oss-cn-shanghai.aliyuncs.com/kube/weave-kube-1.7.2
kubectl apply -f https://git.io/weave-kube
