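Today, while deploying a k8s cluster, I found that CoreDNS kept failing to start, reporting an error similar to the one in the title. These notes record the main troubleshooting steps:
1. First, under normal conditions every Node should be able to reach 10.96.0.1:443 (in this cluster, the ClusterIP of the API server Service). On the node hosting the CoreDNS Pod, run: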
# curl https://10.96.0.1
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html
curl performs SSL certificate verification by default, using a "bundle"
of Certificate Authority (CA) public keys (CA certs). If the default
bundle file isn't adequate, you can specify an alternate file
using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
the bundle, the certificate verification probably failed due to a
problem with the certificate (it might be expired, or the name might
not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
the -k (or --insecure) option.
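The content of the response does not matter: any response, including the certificate error above, means 10.96.0.1:443 is reachable. If there is no response at all (for example, the connection is refused or times out), go to step 2.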
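The check in step 1 can be sketched as a small shell helper. This is only a sketch under assumptions: the function names `classify_curl_exit` and `check_apiserver` and the 5-second timeout are made up for illustration. It relies on curl's documented exit codes, where 60 means the TLS handshake ran but the CA was not trusted (so the address is reachable), 7 means the connection failed, and 28 means the operation timed out.

```shell
#!/bin/sh
# Map a curl exit code to a reachability verdict (hypothetical helper).
classify_curl_exit() {
  case "$1" in
    0|60) echo "reachable" ;;    # 60: server answered, only the CA is untrusted
    7|28) echo "unreachable" ;;  # 7: connect failed, 28: timed out
    *)    echo "unknown" ;;      # anything else needs a manual look
  esac
}

# Probe the Service address and print the verdict.
# The default address is the one used in this post; pass another as $1.
check_apiserver() {
  addr="${1:-10.96.0.1}"
  rc=0
  curl --silent --output /dev/null --max-time 5 "https://${addr}" || rc=$?
  classify_curl_exit "$rc"
}
```

Running `check_apiserver` on the node hosting the CoreDNS Pod prints `unreachable` when the request cannot connect or times out, which is the case that leads to step 2.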
2. Check the kube-proxy logs on that Node and use them to determine whether the iptables rules were set up correctly:
E1017 09:28:33.524808 1 proxier.go:688] Failed to ensure that filter chain KUBE-EXTERNAL-SERVICES exists: error creating chain "KUBE-EXTERNAL-SERVICES": exit status 3: modprobe: ERROR: could not insert 'ip6_tables': Exec format error
ip6tables v1.6.0: can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?)
Perhaps ip6tables or your kernel needs to be upgraded.
E1017 09:29:03.528199 1 proxier.go:688] Failed to ensure that filter chain KUBE-EXTERNAL-SERVICES exists: error creating chain "KUBE-EXTERNAL-SERVICES": exit status 3: modprobe: ERROR: could not insert 'ip6_tables': Exec format error
ip6tables v1.6.0: can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?)
Perhaps ip6tables or your kernel needs to be upgraded.
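If the logs show errors, follow the error messages and fix the Node's configuration to resolve the iptables failure. In the log above, for example, the ip6_tables kernel module cannot be loaded, so kube-proxy fails to create its chains.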
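To make step 2 easier to repeat, the error lines can be filtered out of a longer kube-proxy log. The helper below is a minimal sketch: the name `proxier_errors` is hypothetical, and it assumes the log lines follow the format shown above (an `E` severity prefix followed by the date, then `proxier.go:<line>]`).

```shell
#!/bin/sh
# Read a kube-proxy log on stdin and keep only error-level lines
# that come from proxier.go (hypothetical helper).
proxier_errors() {
  # "|| true" keeps the exit status at 0 when no line matches.
  grep -E '^E[0-9]{4} .*proxier\.go:[0-9]+\]' || true
}
```

A possible usage, where `<kube-proxy-pod>` is a placeholder for the kube-proxy Pod running on the affected Node: `kubectl -n kube-system logs <kube-proxy-pod> | proxier_errors`. For the specific error shown above, `lsmod | grep ip6_tables` shows whether the module is loaded, and `modprobe ip6_tables` reproduces the load failure directly on the Node.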