Troubleshooting: the glusterd service fails to start

1. Background

One of the GlusterFS server hosts was manually rebooted, and afterwards the glusterd service would not start.

root@public_gfs1[/var/log/glusterfs]#systemctl start glusterd
A dependency job for glusterd.service failed. See 'journalctl -xe' for details.

2. Check the details

root@public_gfs1[/var/log/glusterfs]#journalctl -xe
8月 28 10:57:49 public_gfs1 bash[30357]: user=root,ppid=25026,from=,pwd=/var/log/glusterfs,command:2019-08-28 10:57:49 systemctl sta
8月 28 10:57:56 public_gfs1 polkitd[755]: Registered Authentication Agent for unix-process:30373:456395 (system bus name :1.62 [/usr
8月 28 10:57:56 public_gfs1 systemd[1]: rpcbind.socket failed to listen on sockets: Address family not supported by protocol
8月 28 10:57:56 public_gfs1 systemd[1]: Failed to listen on RPCbind Server Activation Socket.
-- Subject: Unit rpcbind.socket has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rpcbind.socket has failed.
-- 
-- The result is failed.
8月 28 10:57:56 public_gfs1 systemd[1]: Dependency failed for RPC bind service.
-- Subject: Unit rpcbind.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rpcbind.service has failed.
-- 
-- The result is dependency.
8月 28 10:57:56 public_gfs1 systemd[1]: Dependency failed for GlusterFS, a clustered file-system server.
-- Subject: Unit glusterd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit glusterd.service has failed.
-- 
-- The result is dependency.
8月 28 10:57:56 public_gfs1 systemd[1]: Job glusterd.service/start failed with result 'dependency'.
8月 28 10:57:56 public_gfs1 polkitd[755]: Unregistered Authentication Agent for unix-process:30373:456395 (system bus name :1.62, ob
8月 28 10:57:56 public_gfs1 systemd[1]: Job rpcbind.service/start failed with result 'dependency'.
8月 28 10:57:56 public_gfs1 systemd[1]: Starting RPCbind Server Activation Socket.
-- Subject: Unit rpcbind.socket has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rpcbind.socket has begun starting up.
8月 28 10:57:56 public_gfs1 bash[30382]: user=root,ppid=25026,from=,pwd=/var/log/glusterfs,command:2019-08-28 10:57:56 systemctl sta

The lines that matter in this output are:

8月 28 10:57:56 public_gfs1 systemd[1]: rpcbind.socket failed to listen on sockets: Address family not supported by protocol
8月 28 10:57:56 public_gfs1 systemd[1]: Failed to listen on RPCbind Server Activation Socket.
8月 28 10:57:56 public_gfs1 systemd[1]: Job rpcbind.service/start failed with result 'dependency'.
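Picking these lines out of a long journal by eye is error-prone. A minimal sketch of a filter that keeps only the systemd failure messages seen above is shown here; `failure_lines` is a hypothetical helper name, and the three patterns are taken from this log only, so other failures may be worded differently.

```shell
# Print only the systemd failure lines from journal text read on stdin.
# failure_lines is a hypothetical helper, not part of systemd.
failure_lines() {
    # The patterns come from the messages in this log; "|| true" keeps the
    # exit status at 0 when nothing matches.
    grep -E 'failed to listen on sockets|Dependency failed for|failed with result' || true
}

# Usage on the affected host (not executed here):
#   journalctl -b --no-pager | failure_lines
```

Reading from stdin keeps the helper independent of how the journal text is obtained, so it also works on a saved log file.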

It looks like glusterd depends on an RPC service, so I checked that service's status.

root@public_gfs1[/var/log/glusterfs]#systemctl status rpcbind
● rpcbind.service - RPC bind service
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; indirect; vendor preset: enabled)
   Active: inactive (dead)

8月 28 09:42:49 public_gfs1 systemd[1]: Dependency failed for RPC...
8月 28 09:42:49 public_gfs1 systemd[1]: Job rpcbind.service/start...
8月 28 10:41:21 public_gfs1 systemd[1]: Dependency failed for RPC...
8月 28 10:41:21 public_gfs1 systemd[1]: Job rpcbind.service/start...
8月 28 10:57:56 public_gfs1 systemd[1]: Dependency failed for RPC...
8月 28 10:57:56 public_gfs1 systemd[1]: Job rpcbind.service/start...
Hint: Some lines were ellipsized, use -l to show in full.
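The status output only shows that rpcbind is dead. To confirm the dependency chain rather than infer it from the log, the `Requires=` property of each unit can be inspected with `systemctl show`. The sketch below assumes the dependency is declared through `Requires=` (the original session does not show the unit files); `required_units` is a hypothetical helper name.

```shell
# List the units named in the Requires= property, one per line.
# Reads `systemctl show` output on stdin.
# required_units is a hypothetical helper, not part of systemd.
required_units() {
    sed -n 's/^Requires=//p' | tr -s ' ' '\n'
}

# Usage on the affected host (not executed here):
#   systemctl show -p Requires glusterd.service | required_units
#   systemctl show -p Requires rpcbind.service  | required_units
```

If rpcbind.service appears in the first list and rpcbind.socket in the second, that would match the failure order in the journal: socket first, then rpcbind, then glusterd.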

3. Try starting that service:

root@public_gfs1[/var/log/glusterfs]#systemctl start rpcbind
A dependency job for rpcbind.service failed. See 'journalctl -xe' for details.

Check the detailed error message:

root@public_gfs1[/var/log/glusterfs]#journalctl -xe
-- Subject: Unit user-0.slice has begun shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit user-0.slice has begun shutting down.
8月 28 11:01:09 public_gfs1 bash[31138]: user=root,ppid=25026,from=,pwd=/var/log/glusterfs,command:2019-08-28 11:01:09 systemctl sta
8月 28 11:01:35 public_gfs1 polkitd[755]: Registered Authentication Agent for unix-process:31239:478275 (system bus name :1.67 [/usr
8月 28 11:01:35 public_gfs1 polkitd[755]: Unregistered Authentication Agent for unix-process:31239:478275 (system bus name :1.67, ob
8月 28 11:01:35 public_gfs1 bash[31248]: user=root,ppid=25026,from=,pwd=/var/log/glusterfs,command:2019-08-28 11:01:35 systemctl sto
8月 28 11:01:37 public_gfs1 bash[31253]: user=root,ppid=25026,from=,pwd=/var/log/glusterfs,command:2019-08-28 11:01:37 systemctl sta
8月 28 11:01:47 public_gfs1 polkitd[755]: Registered Authentication Agent for unix-process:31302:479486 (system bus name :1.68 [/usr
8月 28 11:01:47 public_gfs1 systemd[1]: rpcbind.socket failed to listen on sockets: Address family not supported by protocol
8月 28 11:01:47 public_gfs1 systemd[1]: Failed to listen on RPCbind Server Activation Socket.
-- Subject: Unit rpcbind.socket has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rpcbind.socket has failed.
-- 
-- The result is failed.
8月 28 11:01:47 public_gfs1 systemd[1]: Dependency failed for RPC bind service.
-- Subject: Unit rpcbind.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rpcbind.service has failed.
-- 
-- The result is dependency.
8月 28 11:01:47 public_gfs1 systemd[1]: Job rpcbind.service/start failed with result 'dependency'.
8月 28 11:01:47 public_gfs1 systemd[1]: Starting RPCbind Server Activation Socket.
-- Subject: Unit rpcbind.socket has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rpcbind.socket has begun starting up.
8月 28 11:01:47 public_gfs1 polkitd[755]: Unregistered Authentication Agent for unix-process:31302:479486 (system bus name :1.68, ob
8月 28 11:01:47 public_gfs1 bash[31311]: user=root,ppid=25026,from=,pwd=/var/log/glusterfs,command:2019-08-28 11:01:47 systemctl sta

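4. After some searching, I found reports that rpcbind cannot start when IPv6 is disabled through kernel parameters. That is consistent with the "Address family not supported by protocol" message logged for rpcbind.socket.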

vi /etc/sysctl.conf

My existing configuration was:

net.ipv6.conf.all.disable_ipv6 = 1

The net.ipv6.conf.default.disable_ipv6 parameter was not set at all.

Both parameters need to be set as follows:

net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0

Save and exit.
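Before applying the change, it can help to confirm what the file now contains. The following is a minimal sketch that prints the uncommented `disable_ipv6` lines for `all` and `default`; `ipv6_disable_settings` is a hypothetical helper name. It reads a single file, so files under /etc/sysctl.d would need to be checked separately.

```shell
# Print the active (uncommented) disable_ipv6 settings for "all" and
# "default" from a sysctl-style file. Defaults to /etc/sysctl.conf.
# ipv6_disable_settings is a hypothetical helper name.
ipv6_disable_settings() {
    file=${1:-/etc/sysctl.conf}
    # "|| true" keeps the exit status at 0 when no setting is present.
    grep -E '^[[:space:]]*net\.ipv6\.conf\.(all|default)\.disable_ipv6[[:space:]]*=' "$file" || true
}

# Usage (not executed here):
#   ipv6_disable_settings
```

After the edit described above, both lines should be printed with the value 0.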

5. Apply the changed kernel parameters.

sysctl -p

Alternatively, reboot the host.
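To check that the running kernel actually picked up the new values, the corresponding files under /proc/sys can be read. This is a sketch only; `ipv6_runtime_state` is a hypothetical helper name, and its optional argument exists purely so the function can be pointed at a test directory instead of /proc/sys.

```shell
# Print the runtime disable_ipv6 value for "all" and "default".
# Optional first argument: sysctl root directory (default /proc/sys).
# ipv6_runtime_state is a hypothetical helper name.
ipv6_runtime_state() {
    root=${1:-/proc/sys}
    for scope in all default; do
        f="$root/net/ipv6/conf/$scope/disable_ipv6"
        if [ -r "$f" ]; then
            printf '%s=%s\n' "$scope" "$(cat "$f")"
        else
            # The file is missing or unreadable.
            printf '%s=unavailable\n' "$scope"
        fi
    done
}

# Usage (not executed here):
#   ipv6_runtime_state
```

A value of 0 for both scopes means IPv6 is enabled at runtime, which is the state the next step relies on.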

6. Start the rpcbind service again

root@public_gfs1[/var/log/glusterfs]#systemctl start rpcbind
root@public_gfs1[/var/log/glusterfs]#systemctl status rpcbind
● rpcbind.service - RPC bind service
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; indirect; vendor preset: enabled)
   Active: active (running) since 三 2019-08-28 11:05:03 CST; 3s ago
  Process: 32091 ExecStart=/sbin/rpcbind -w $RPCBIND_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 32092 (rpcbind)
   CGroup: /system.slice/rpcbind.service
           └─32092 /sbin/rpcbind -w

8月 28 11:05:03 public_gfs1 systemd[1]: Starting RPC bind service...
8月 28 11:05:03 public_gfs1 systemd[1]: Started RPC bind service.
root@public_gfs1[/var/log/glusterfs]#systemctl stop glusterd 
root@public_gfs1[/var/log/glusterfs]#systemctl status glusterd 
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:glusterd(8)

8月 28 09:42:49 public_gfs1 systemd[1]: Dependency failed for GlusterFS, a clustered file-system server.
8月 28 09:42:49 public_gfs1 systemd[1]: Job glusterd.service/start failed with result 'dependency'.
8月 28 10:41:21 public_gfs1 systemd[1]: Dependency failed for GlusterFS, a clustered file-system server.
8月 28 10:41:21 public_gfs1 systemd[1]: Job glusterd.service/start failed with result 'dependency'.
8月 28 10:57:56 public_gfs1 systemd[1]: Dependency failed for GlusterFS, a clustered file-system server.
8月 28 10:57:56 public_gfs1 systemd[1]: Job glusterd.service/start failed with result 'dependency'.

7. Start the glusterd service

root@public_gfs1[/var/log/glusterfs]#systemctl start glusterd 
root@public_gfs1[/var/log/glusterfs]#systemctl status glusterd 
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: active (running) since 三 2019-08-28 11:05:47 CST; 4s ago
     Docs: man:glusterd(8)
  Process: 32275 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 32276 (glusterd)
   CGroup: /system.slice/glusterd.service
           ├─32276 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
           ├─32292 /usr/sbin/glusterfsd -s 10.138.121.40 --volfile-id centervol.10.138.121.40.usr-data-center -p /var/run/gluster...
           ├─32303 /usr/sbin/glusterfsd -s 10.138.121.40 --volfile-id commvol.10.138.121.40.usr-data-comm -p /var/run/gluster/vol...
           ├─32314 /usr/sbin/glusterfsd -s 10.138.121.40 --volfile-id memberfs.10.138.121.40.usr-data-memberfs -p /var/run/gluste...
           ├─32325 /usr/sbin/glusterfsd -s 10.138.121.40 --volfile-id orabaktmpvol.10.138.121.40.usr-data-orabaktemp -p /var/run/...
           ├─32336 /usr/sbin/glusterfsd -s 10.138.121.40 --volfile-id payvol.10.138.121.40.usr-data-pay -p /var/run/gluster/vols/...
           ├─32347 /usr/sbin/glusterfsd -s 10.138.121.40 --volfile-id picvol.10.138.121.40.usr-data-pic -p /var/run/gluster/vols/...
           ├─32369 /usr/sbin/glusterfsd -s 10.138.121.40 --volfile-id tourpicvol.10.138.121.40.usr-data-tourpic -p /var/run/glust...
           └─32384 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid...

8月 28 11:05:46 public_gfs1 systemd[1]: Starting GlusterFS, a clustered file-system server...
8月 28 11:05:47 public_gfs1 systemd[1]: Started GlusterFS, a clustered file-system server.

 
