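I have been running a MongoDB cluster of five machines (192.168.40.80 ~ 84) with 5 shards, each with 3 replica set members. It had always run fine, but recently the service became very unstable: show dbs kept reporting a shard 4 error, and every so often a machine went down under excessive load.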
Today I happened to look at the MongoDB logs and found that the machines involved in shard 4 were all reporting the same errors:
[rsHealthPoll] couldn't connect to 192.168.40.83:29022: couldn't connect to server 192.168.40.83:29022
[rsHealthPoll] replset info 192.168.40.80:29022 thinks that we are down
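A quick way to confirm errors like these from the shell is to probe the replica set port from each member. The following is only a minimal sketch: the function names check_port and probe_hosts are my own, and it assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# Probe a TCP port using bash's built-in /dev/tcp redirection.
# Returns 0 if the connection succeeds, non-zero otherwise.
# check_port and probe_hosts are hypothetical helper names.
check_port() {
    local host=$1 port=$2 wait_secs=${3:-2}
    timeout "$wait_secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Probe the same port on several hosts and print one line per host.
probe_hosts() {
    local port=$1; shift
    local host
    for host in "$@"; do
        if check_port "$host" "$port"; then
            echo "$host:$port reachable"
        else
            echo "$host:$port unreachable"
        fi
    done
}

# Example (addresses and port taken from the log lines above):
# probe_hosts 29022 192.168.40.80 192.168.40.83
```

Running the probe from each member in turn shows which direction of the connection is being refused, which helps tell a firewall problem apart from a mongod process that is simply not running.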
Connections were failing, so I checked the iptables rules on mongodb04:
[root@mongodb04 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED
ACCEPT     icmp --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
ACCEPT     tcp  --  anywhere             anywhere            state NEW tcp dpt:ssh
REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
The fix was to stop iptables and keep it from starting at boot:
service iptables stop
chkconfig --level 2345 iptables off
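Turning the firewall off entirely is the bluntest fix. A narrower alternative, which I have not tested on this cluster, would be to accept only the mongod ports. The sketch below just prints the commands rather than running them. It assumes the REJECT rule in the INPUT chain is what blocks port 29022, and that the other shards listen on ports of their own, which would need to be added to the argument list. The function name print_accept_rules is hypothetical.

```shell
#!/usr/bin/env bash
# Print (do not execute) iptables commands that accept TCP connections
# on the given ports. -I INPUT inserts each rule at the top of the chain,
# so it is evaluated before the catch-all REJECT rule.
# print_accept_rules is a hypothetical helper name.
print_accept_rules() {
    local port
    for port in "$@"; do
        echo "iptables -I INPUT -p tcp --dport $port -j ACCEPT"
    done
    # With RHEL/CentOS-style init scripts, this persists the rules across reboots.
    echo "service iptables save"
}

# 29022 is the only port that appears in the logs above.
print_accept_rules 29022
```

The printed commands can be reviewed and then run as root on each member. Printing first keeps the sketch safe to try on a machine where the firewall should not be touched yet.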