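After the server ran low on memory, the host was shut down, given more memory, and rebooted. Rancher was unreachable afterwards.
Solution:
1. Delete the faulty ingress rule (it had to be this one because there was no other candidate; if you have many rules, delete them in reverse order)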
[root@i-5wa2ciao ~]# kubectl delete ingress -nkaishidongle test-ingrress
ingress.extensions "test-ingrress" deleted
2. Recreate the ingress controller pods
[root@i-5wa2ciao ~]# kubectl delete po nginx-ingress-controller-89827 nginx-ingress-controller-pdvzj nginx-ingress-controller-zd7fd -ningress-nginx
pod "nginx-ingress-controller-89827" deleted
pod "nginx-ingress-controller-pdvzj" deleted
pod "nginx-ingress-controller-zd7fd" deleted
3. Verify
[root@i-5wa2ciao ~]# kubectl get po -ningress-nginx
NAME READY STATUS RESTARTS AGE
default-http-backend-598b7d7dbd-mbw6n 1/1 Running 0 41m
nginx-ingress-controller-d44jn 1/1 Running 0 9m29s
nginx-ingress-controller-dr5gr 1/1 Running 0 9m25s
nginx-ingress-controller-glf4x 1/1 Running 0 9m19s
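The verification step can also be scripted rather than read by eye. The following is a minimal sketch, not part of the original procedure: `unhealthy_pods` is a hypothetical helper name, and it assumes the default column layout of `kubectl get po -n <namespace>` (NAME, READY, STATUS, ...) with a header line.

```shell
# Read `kubectl get po -n <namespace>` output on stdin and print every pod
# that is not fully ready or not in the Running state.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE (with header).
unhealthy_pods() {
    awk 'NR > 1 {
        split($2, ready, "/")   # READY looks like "1/1" or "0/1"
        if (ready[1] != ready[2] || $3 != "Running") print $1, $2, $3
    }'
}
```

Usage: `kubectl get po -ningress-nginx | unhealthy_pods`. Empty output means every pod in the namespace is Running and ready.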
Troubleshooting process:
- List the ingress rules and confirm that the rule for the Rancher domain exists
[root@i-5wa2ciao ~]# kubectl get ingress -A
NAMESPACE NAME CLASS HOSTS ADDRESS PORTS AGE
cattle-system rancher merancher.enncloud.cn 80, 443 140d
kaishidongle test-ingrress lmnginx.enncloud.cn 80 101d
- Check the status of the ingress pods
[root@i-5wa2ciao ~]# kubectl get po -A|grep ingress
ingress-nginx default-http-backend-598b7d7dbd-mbw6n 1/1 Running 0 7m49s
ingress-nginx nginx-ingress-controller-89827 0/1 CrashLoopBackOff 6 7m44s
ingress-nginx nginx-ingress-controller-pdvzj 0/1 CrashLoopBackOff 6 7m41s
ingress-nginx nginx-ingress-controller-zd7fd 0/1 CrashLoopBackOff 6 7m40s
- The ingress controller pods are in CrashLoopBackOff, so use describe to look at the errors
[root@i-5wa2ciao .kube]# kubectl describe po -ningress-nginx nginx-ingress-controller-48288
.........
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 101s default-scheduler Successfully assigned ingress-nginx/nginx-ingress-controller-48288 to rancher-40-181
Warning Unhealthy 17s (x6 over 87s) kubelet Liveness probe failed: HTTP probe failed with statuscode: 500
Normal Killing 17s (x2 over 67s) kubelet Container nginx-ingress-controller failed liveness probe, will be restarted
Warning Unhealthy 11s (x8 over 91s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Normal Pulled 4s (x3 over 101s) kubelet Container image "rancher/nginx-ingress-controller:nginx-0.35.0-rancher2" already present on machine
Normal Created 4s (x3 over 101s) kubelet Created container nginx-ingress-controller
Normal Started 4s (x3 over 101s) kubelet Started container nginx-ingress-controller
- The events give no useful information, so check the ingress controller logs
I0616 09:39:18.759940 6 status.go:86] new leader elected: nginx-ingress-controller-48288
I0616 09:39:18.766025 6 status.go:208] runningAddresses: pod [nginx-ingress-controller-48288] on [rancher-40-181] is not ready
I0616 09:39:18.766039 6 status.go:208] runningAddresses: pod [nginx-ingress-controller-4knqx] on [rancher-40-185] is not ready
I0616 09:39:18.766044 6 status.go:208] runningAddresses: pod [nginx-ingress-controller-7kl82] on [rancher-40-179] is not ready
E0616 09:39:18.816189 6 controller.go:153] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2022/06/16 09:39:18 [emerg] 33#33: "proxy_http_version" directive is duplicate in /tmp/nginx-cfg111270477:554
nginx: [emerg] "proxy_http_version" directive is duplicate in /tmp/nginx-cfg111270477:554
nginx: configuration file /tmp/nginx-cfg111270477 test failed
-------------------------------------------------------------------------------
W0616 09:39:18.816207 6 queue.go:130] requeuing initial-sync, err
-------------------------------------------------------------------------------
Error: exit status 1
2022/06/16 09:39:18 [emerg] 33#33: "proxy_http_version" directive is duplicate in /tmp/nginx-cfg111270477:554
nginx: [emerg] "proxy_http_version" directive is duplicate in /tmp/nginx-cfg111270477:554
nginx: configuration file /tmp/nginx-cfg111270477 test failed
-------------------------------------------------------------------------------
W0616 09:39:22.082672 6 controller.go:1163] SSL certificate for server "merancher.enncloud.cn" is about to expire (2022-06-20 08:01:06 +0000 UTC)
I0616 09:39:22.082752 6 controller.go:141] Configuration changes detected, backend reload required.
E0616 09:39:22.120857 6 controller.go:153] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2022/06/16 09:39:22 [emerg] 40#40: "proxy_http_version" directive is duplicate in /tmp/nginx-cfg838461768:554
nginx: [emerg] "proxy_http_version" directive is duplicate in /tmp/nginx-cfg838461768:554
nginx: configuration file /tmp/nginx-cfg838461768 test failed
-------------------------------------------------------------------------------
W0616 09:39:22.120873 6 queue.go:130] requeuing cattle-monitoring-system/pushprox-kube-proxy-client, err
-------------------------------------------------------------------------------
Error: exit status 1
2022/06/16 09:39:22 [emerg] 40#40: "proxy_http_version" directive is duplicate in /tmp/nginx-cfg838461768:554
nginx: [emerg] "proxy_http_version" directive is duplicate in /tmp/nginx-cfg838461768:554
nginx: configuration file /tmp/nginx-cfg838461768 test failed
-------------------------------------------------------------------------------
W0616 09:39:25.416024 6 controller.go:1163] SSL certificate for server "merancher.enncloud.cn" is about to expire (2022-06-20 08:01:06 +0000 UTC)
I0616 09:39:25.416103 6 controller.go:141] Configuration changes detected, backend reload required.
E0616 09:39:25.452786 6 controller.go:153] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2022/06/16 09:39:25 [emerg] 48#48: "proxy_http_version" directive is duplicate in /tmp/nginx-cfg224385031:554
nginx: [emerg] "proxy_http_version" directive is duplicate in /tmp/nginx-cfg224385031:554
nginx: configuration file /tmp/nginx-cfg224385031 test failed
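The same error repeats on every reload attempt, which makes the log hard to scan. As a rough sketch (the helper name `emerg_errors` is made up for illustration), the log can be reduced to its distinct nginx `[emerg]` messages. The temporary config file name changes on each reload, so it is masked before de-duplicating.

```shell
# Reduce controller logs on stdin to the distinct nginx [emerg] messages.
# The numeric suffix of /tmp/nginx-cfgNNN differs per reload, so mask it
# to let identical errors collapse into one line.
emerg_errors() {
    grep -F 'nginx: [emerg]' |
        sed -E 's#/tmp/nginx-cfg[0-9]+#/tmp/nginx-cfg*#' |
        sort -u
}
```

Usage: `kubectl logs -ningress-nginx nginx-ingress-controller-48288 | emerg_errors`. For a container that has already been restarted, adding `--previous` to `kubectl logs` shows the log of the previous instance.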
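Research showed the cause: the ingress of a previously deployed service was faulty. It kept ingresses deployed later from taking effect, and when nginx restarted it pulled the bad ingress configuration and failed to start, so nginx could not proxy any service.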
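Rather than deleting rules by trial and error, it may be possible to search for the offending one directly. The sketch below assumes that the duplicated directive (`proxy_http_version` in the log) appears literally somewhere in an Ingress manifest, for example in a `nginx.ingress.kubernetes.io/configuration-snippet` annotation. The original notes do not confirm that this was the case here, and `find_ingress_with_directive` and the `KUBECTL` override are hypothetical names introduced for illustration.

```shell
# Print "namespace/name" for every ingress whose manifest mentions the given
# nginx directive. Assumption: the directive text appears literally in the
# manifest (e.g. inside a snippet annotation).
# KUBECTL may be overridden for dry runs; it defaults to the real kubectl.
find_ingress_with_directive() {
    directive="$1"
    "${KUBECTL:-kubectl}" get ingress -A --no-headers |
    while read -r ns name _rest; do
        # </dev/null keeps the inner command from consuming the loop input.
        if "${KUBECTL:-kubectl}" get ingress -n "$ns" "$name" -o yaml </dev/null |
            grep -- "$directive" >/dev/null; then
            echo "$ns/$name"
        fi
    done
}
```

Usage: `find_ingress_with_directive proxy_http_version`. If nothing is printed, the assumption does not hold and deleting rules one at a time, as described in this note, remains the fallback.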
Reference: online article 1
Last-resort fix for nginx ingress (the same steps as the solution at the top of this note)
The page is accessible again.