1. Method 1, taken from the web: http://www.linuxidc.com/Linux/2013-09/90321.htm

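Simulating and repairing a DRBD split brain
Note: this continues the previous experiment. NOD2 is currently the primary node and NOD1 is the secondary node.
1. Disconnect the primary node. Shutting it down, cutting its network, or reconfiguring it with a different IP all work; here the network is cut.
2. Check the state of both nodes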
[root@nod2 ~]# drbd-overview
  0:drbd/0  WFConnection Primary/Unknown UpToDate/DUnknown C r----- /mnt ext4 2.0G 68M 1.9G 4%
[root@nod1 ~]# drbd-overview
  0:drbd/0  StandAlone Secondary/Unknown UpToDate/DUnknown r-----
###### As shown above, the two nodes can no longer communicate; NOD2 is primary and NOD1 is secondary
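
The connection state is the second column of each drbd-overview line, so it can be pulled out for scripting. A minimal sketch, assuming the line layout shown above; conn_state is a hypothetical helper name, not a DRBD command:

```shell
# Print the connection-state field (2nd column) of one drbd-overview line.
# conn_state is a hypothetical helper, not part of DRBD.
conn_state() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Sample lines copied from the output above.
conn_state "  0:drbd/0  WFConnection Primary/Unknown UpToDate/DUnknown C r----- /mnt ext4 2.0G 68M 1.9G 4%"
conn_state "  0:drbd/0  StandAlone Secondary/Unknown UpToDate/DUnknown r-----"
```

On a live node the argument would come from "$(drbd-overview)" instead of a pasted sample.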

3. Promote NOD1 to primary and mount the resource

[root@nod1 ~]# drbdadm primary drbd
[root@nod1 ~]# drbd-overview
  0:drbd/0  StandAlone Primary/Unknown UpToDate/DUnknown r-----
[root@nod1 ~]# mount /dev/drbd0 /mnt/
[root@nod1 ~]# mount | grep drbd0
/dev/drbd0 on /mnt type ext4 (rw)

4. Once the original primary node is repaired and comes back online, a split brain occurs

[root@nod2 ~]# tail -f /var/log/messages
Sep 19 01:56:06 nod2 kernel: d-con drbd: Terminating drbd_a_drbd
Sep 19 01:56:06 nod2 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
Sep 19 01:56:06 nod2 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
Sep 19 01:56:06 nod2 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
Sep 19 01:56:06 nod2 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
Sep 19 01:56:06 nod2 kernel: d-con drbd: conn( NetworkFailure -> Disconnecting )
Sep 19 01:56:06 nod2 kernel: d-con drbd: error receiving ReportState, e: -5 l: 0!
Sep 19 01:56:06 nod2 kernel: d-con drbd: Connection closed
Sep 19 01:56:06 nod2 kernel: d-con drbd: conn( Disconnecting -> StandAlone )
Sep 19 01:56:06 nod2 kernel: d-con drbd: receiver terminated
Sep 19 01:56:06 nod2 kernel: d-con drbd: Terminating drbd_r_drbd
Sep 19 01:56:18 nod2 kernel: block drbd0: role( Primary -> Secondary )
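
The key line in this log is "Split-Brain detected but unresolved". A minimal sketch of checking for it, assuming the kernel messages land in /var/log/messages as above; has_split_brain is a hypothetical helper name:

```shell
# Succeed (exit 0) if the log text on stdin contains DRBD's split-brain message.
# has_split_brain is a hypothetical helper, not part of DRBD.
has_split_brain() {
    grep -q 'Split-Brain detected'
}

# Sample line copied from the log above.
sample='Sep 19 01:56:06 nod2 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!'
if printf '%s\n' "$sample" | has_split_brain; then
    echo "split brain reported"
fi
```

Against a real log this would be called as: has_split_brain < /var/log/messages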

5. Check the role of both nodes again

[root@nod1 ~]# drbdadm role drbd
Primary/Unknown
[root@nod2 ~]# drbdadm role drbd
Primary/Unknown
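
Both nodes reporting Primary/Unknown is the symptom to look for. A minimal sketch of that comparison, assuming the two "drbdadm role" outputs have already been collected from each node (for example over ssh); both_primary_unknown is a hypothetical helper name:

```shell
# Succeed when both nodes report themselves as Primary with an unknown peer.
# Arguments: the "drbdadm role <resource>" output captured on each node.
# both_primary_unknown is a hypothetical helper, not part of DRBD.
both_primary_unknown() {
    [ "$1" = "Primary/Unknown" ] && [ "$2" = "Primary/Unknown" ]
}

if both_primary_unknown "Primary/Unknown" "Primary/Unknown"; then
    echo "both nodes are Primary and cannot see each other"
fi
```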

6. Check the connection state between NOD1 and NOD2

[root@nod1 ~]# drbd-overview
  0:drbd/0  StandAlone Primary/Unknown UpToDate/DUnknown r----- /mnt ext4 2.0G 68M 1.9G 4%
[root@nod2 ~]# drbd-overview
  0:drbd/0  WFConnection Primary/Unknown UpToDate/DUnknown C r----- /mnt ext4 2.0G 68M 1.9G 4%
###### As shown above, while a node is in the StandAlone state the primary and secondary nodes do not communicate

7. Check the DRBD service status

[root@nod1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-05-27 04:30:21
m:res   cs          ro               ds                 p       mounted  fstype
0:drbd  StandAlone  Primary/Unknown  UpToDate/DUnknown  r-----  ext4
[root@nod2 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-05-27 04:30:21
m:res   cs            ro               ds                 p  mounted  fstype
0:drbd  WFConnection  Primary/Unknown  UpToDate/DUnknown  C  /mnt     ext4
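
In the nod1 row above the p and mounted columns are empty, so the later columns shift left; only the first four fields (resource, cs, ro, ds) are safe to read by position. A minimal sketch, assuming the row layout shown above; status_fields is a hypothetical helper name:

```shell
# Print "resource cs ro ds" from one device row of "service drbd status".
# Only the first four columns are used, because later ones may be empty.
# status_fields is a hypothetical helper, not part of DRBD.
status_fields() {
    printf '%s\n' "$1" | awk '{ split($1, id, ":"); print id[2], $2, $3, $4 }'
}

# Sample rows copied from the output above.
status_fields "0:drbd  StandAlone  Primary/Unknown  UpToDate/DUnknown  r-----  ext4"
status_fields "0:drbd  WFConnection  Primary/Unknown  UpToDate/DUnknown  C  /mnt     ext4"
```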

8. Recovery steps on NOD1, the standby node (its changes made during the split are discarded)
[root@nod1 ~]# umount /mnt/
[root@nod1 ~]# drbdadm disconnect drbd
drbd: Failure: (162) Invalid configuration request
additional info from kernel:
unknown connection
Command 'drbdsetup disconnect ipv4:192.168.137.225:7789 ipv4:192.168.137.222:7789' terminated with exit code 10
[root@nod1 ~]# drbdadm secondary drbd
[root@nod1 ~]# drbd-overview
  0:drbd/0  StandAlone Secondary/Unknown UpToDate/DUnknown r-----
[root@nod1 ~]# drbdadm connect --discard-my-data drbd
###### After the three steps above, checking again shows the resource is still not usable
[root@nod1 ~]# drbd-overview
  0:drbd/0  WFConnection Secondary/Unknown UpToDate/DUnknown C r-----
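
The commands that actually took effect on NOD1 can be collected into one helper. This is a dry-run sketch that only prints them; victim_steps and its two arguments are hypothetical names, and the failed disconnect call is left out because the node was already StandAlone:

```shell
# Print the commands run on the node whose changes are discarded.
# $1 = DRBD resource name, $2 = mount point.
# victim_steps is a hypothetical helper; nothing is executed here.
victim_steps() {
    res=$1
    mnt=$2
    echo "umount $mnt"
    echo "drbdadm secondary $res"
    echo "drbdadm connect --discard-my-data $res"
}

victim_steps drbd /mnt/
```

Printing first makes it possible to review the sequence before running anything, since --discard-my-data throws away this node's changes.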

9. The connection must also be re-established from NOD2

[root@nod2 ~]# drbdadm connect drbd
###### Check the connection state of the nodes
[root@nod2 ~]# drbd-overview
  0:drbd/0  Connected Primary/Secondary UpToDate/UpToDate C r----- /mnt ext4 2.0G 68M 1.9G 4%
[root@nod1 ~]# drbd-overview
  0:drbd/0  Connected Secondary/Primary UpToDate/UpToDate C r-----
###### As shown above, the cluster is back to its normal running state
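
"Normal" here means a connection state of Connected and a disk state of UpToDate/UpToDate. A minimal sketch of that check on one drbd-overview line, assuming the layout shown above; is_healthy is a hypothetical helper name:

```shell
# Succeed when a drbd-overview line shows Connected and UpToDate/UpToDate.
# Field 2 is the connection state, field 4 the disk states.
# is_healthy is a hypothetical helper, not part of DRBD.
is_healthy() {
    printf '%s\n' "$1" |
        awk '$2 == "Connected" && $4 == "UpToDate/UpToDate" { ok = 1 } END { exit !ok }'
}

# Sample line copied from the output above.
if is_healthy "  0:drbd/0  Connected Secondary/Primary UpToDate/UpToDate C r-----"; then
    echo "resource is connected and in sync"
fi
```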





2. Method 2: http://www.j3j5.com/post-128.html

Error Message:

After DRBD is started on Node1 and Node2, the state stays at Secondary/Unknown

#service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by [email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha WFConnection Secondary/Unknown Inconsistent/DUnknown C
Ans:

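1. Node1 and Node2 have not opened the port DRBD uses. Open that port, or stop the iptables service first.
2. A split brain may have occurred; this usually happens during an HA failover. To fix it:
On one node (the one whose changes will be discarded), run: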
drbdadm secondary resource
drbdadm connect --discard-my-data resource
On the other node, run:
drbdadm connect resource
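
The status output in this method is from DRBD 8.3.8, while Method 1 ran 8.4.3. To the best of my knowledge the 8.3 user guide writes the discard step as "drbdadm -- --discard-my-data connect resource", whereas 8.4 uses the form shown above. A minimal sketch that prints the matching form for a version string; discard_cmd is a hypothetical helper name and nothing is executed:

```shell
# Print the discard-and-reconnect command for the given DRBD version.
# $1 = version string such as 8.3.8, $2 = resource name.
# Assumption: 8.3.x needs the "--" separator form; anything else uses the 8.4 form.
# discard_cmd is a hypothetical helper, not part of DRBD.
discard_cmd() {
    ver=$1
    res=$2
    case "$ver" in
        8.3.*) echo "drbdadm -- --discard-my-data connect $res" ;;
        *)     echo "drbdadm connect --discard-my-data $res" ;;
    esac
}

discard_cmd 8.3.8 ha
discard_cmd 8.4.3 drbd
```

The version string is the one printed on the "version:" line of "service drbd status".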
Q5.1: Failure: (104) Can not open backing device