DRBD Failure Handling

drbd1 is the primary node and drbd2 is the secondary node.

 

1. Status under normal conditions:

 

[root@drbd1 ~]# cat /proc/drbd 

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9 

 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----

    ns:2144476 nr:0 dw:36468 dr:2115769 al:14 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

 

[root@drbd2 ~]# cat /proc/drbd 

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9 

 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----

    ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
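
The fields that matter in the output above are cs (connection state), ro (roles, local/peer) and ds (disk states, local/peer). As a minimal sketch, the helper below pulls one of those values out of /proc/drbd-style text. The name drbd_field is hypothetical and not part of DRBD, and the sample line is copied from the drbd1 output above.

```shell
# drbd_field is a hypothetical helper, not part of DRBD.
# It reads /proc/drbd-style text on stdin and prints the value of one
# key (cs, ro or ds) from the status line of the given minor number.
drbd_field() {
    awk -v minor="$1:" -v key="$2" '
        $1 == minor {
            for (i = 2; i <= NF; i++) {
                n = index($i, ":")
                if (n > 0 && substr($i, 1, n - 1) == key) {
                    print substr($i, n + 1)
                    exit
                }
            }
        }'
}

# Sample status line copied from the healthy drbd1 output.
status=' 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----'

printf '%s\n' "$status" | drbd_field 0 cs
printf '%s\n' "$status" | drbd_field 0 ro
printf '%s\n' "$status" | drbd_field 0 ds
```

On a live node the same helper could be fed the real file, for example: drbd_field 0 cs < /proc/drbd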

 

2. After drbd1 fails

 

Status of drbd1:

[root@drbd1 ~]# cat /proc/drbd 

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9 

 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----

    ns:4 nr:102664 dw:102668 dr:157 al:1 bm:8 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

 

Status of drbd2:

[root@drbd2 ~]# cat /proc/drbd 

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9 

 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r----

    ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

 

3. Recovery procedure:

 

a. Promote the secondary node to the primary role

[root@drbd2 ~]# drbdsetup /dev/drbd0 primary -o

[root@drbd2 ~]# cat /proc/drbd 

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9 

 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated C r----

    ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

 

Mount the device:

[root@drbd2 /]# mount /dev/drbd0 /data1

[root@drbd2 data1]# ll

total 10272

-rw-r--r-- 1 root root 10485760 Feb 13 11:26 aa.img

drwx------ 2 root root    16384 Feb 13 11:25 lost+found

 

At this point drbd2 takes over the service and starts accepting writes.
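
Before forcing a promotion like the one in this step, it can help to confirm that the local copy is actually usable. The following is only a sketch of such a guard: safe_to_promote is a hypothetical function, not a DRBD command, and the rule it applies (local role Secondary, local disk UpToDate) is an assumption about what a cautious operator would check.

```shell
# safe_to_promote is a hypothetical check, not a DRBD command.
# $1 = ro: value (local/peer), $2 = ds: value (local/peer)
safe_to_promote() {
    # Keep only the part before the slash, which describes the local node.
    [ "${1%%/*}" = "Secondary" ] && [ "${2%%/*}" = "UpToDate" ]
}

# Values taken from the drbd2 status shown after drbd1 failed.
if safe_to_promote "Secondary/Unknown" "UpToDate/DUnknown"; then
    echo "local copy is UpToDate; promotion looks reasonable"
else
    echo "local copy is not UpToDate; do not promote blindly"
fi
```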

 

After the original primary, drbd1, comes back up:

[root@drbd1 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9 

 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----

    ns:2144476 nr:0 dw:36484 dr:2115769 al:14 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8

 

drbd1 is in the StandAlone state. In this state it makes no attempt to connect to drbd2.
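
For reference, the three connection states that appear in this walkthrough can be told apart as in the sketch below. explain_cs is a hypothetical helper name; the descriptions paraphrase the meaning of each cs: value and cover only the states seen here.

```shell
# explain_cs is a hypothetical helper that maps a cs: value from
# /proc/drbd to a short description. Only states seen in this
# walkthrough are covered explicitly.
explain_cs() {
    case "$1" in
        Connected)    echo "connected to the peer" ;;
        WFConnection) echo "waiting for the peer to become visible on the network" ;;
        StandAlone)   echo "no network configuration; will not reconnect until 'drbdadm connect' is run" ;;
        *)            echo "other state: $1" ;;
    esac
}

explain_cs StandAlone
explain_cs WFConnection
```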

 

Let's check the log:

 

[root@drbd1 ~]# tailf /var/log/messages

Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0

Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)

Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( WFReportParams -> Disconnecting ) 

Feb 13 16:14:27 drbd1 kernel: block drbd0: error receiving ReportState, l: 4!

Feb 13 16:14:27 drbd1 kernel: block drbd0: asender terminated

Feb 13 16:14:27 drbd1 kernel: block drbd0: Terminating drbd0_asender

Feb 13 16:14:27 drbd1 kernel: block drbd0: Connection closed

Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( Disconnecting -> StandAlone ) 

Feb 13 16:14:27 drbd1 kernel: block drbd0: receiver terminated

Feb 13 16:14:27 drbd1 kernel: block drbd0: Terminating drbd0_receiver

 

Split brain has occurred!
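
The tell-tale line in the log is the kernel invoking the split-brain helper. As a rough sketch, the check below looks for that line in log text. has_split_brain is a hypothetical function name, and the pattern is based only on the message format shown in the log excerpt above.

```shell
# has_split_brain is a hypothetical helper: it succeeds if the log text
# on stdin contains the split-brain helper invocation for minor $1.
has_split_brain() {
    grep -Eq "helper command: /sbin/drbdadm split-brain minor-$1( |\$)"
}

# Two lines copied from the log excerpt above.
log='Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( Disconnecting -> StandAlone ) '

if printf '%s\n' "$log" | has_split_brain 0; then
    echo "split brain detected on minor 0"
fi
```

On a live node the function could read the real log instead, for example: has_split_brain 0 < /var/log/messages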

 

Solution:

 

1> First, demote drbd1 to the secondary role and reconnect it, discarding its local changes:

[root@drbd1 ~]# drbdadm secondary r0

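[root@drbd1 ~]# drbdadm -- --discard-my-data connect r0  ## This tells DRBD that the data on this node (now the secondary) is not authoritative: its local changes are discarded and it resynchronizes from the primary.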

 

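2> Then run the following on drbd2 (required if drbd2 has also dropped to StandAlone; a node still in WFConnection reconnects on its own):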

[root@drbd2 /]# drbdadm connect r0

 

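drbd1 can now reconnect to drbd2. The data written on drbd2 during the outage is preserved, while any changes made only on drbd1 after the split are discarded: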

[root@drbd1 ~]# cat /proc/drbd      

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9 

 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----

    ns:0 nr:20592 dw:20592 dr:0 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
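
The manual recovery above can be summarized per node. The sketch below only prints the commands and does not run them. recovery_commands is a hypothetical helper, "victim" and "survivor" are labels chosen here for the node whose changes are discarded (drbd1) and the node whose data is kept (drbd2), and the command syntax is the DRBD 8.3 form used in this article.

```shell
# recovery_commands is a hypothetical helper. It prints (does not run)
# the commands used in this article for one side of the recovery.
# $1 = "victim" (changes discarded) or "survivor" (data kept)
# $2 = resource name, e.g. r0
recovery_commands() {
    case "$1" in
        victim)
            echo "drbdadm secondary $2"
            echo "drbdadm -- --discard-my-data connect $2"
            ;;
        survivor)
            echo "drbdadm connect $2"
            ;;
        *)
            echo "unknown role: $1" >&2
            return 1
            ;;
    esac
}

echo "# on drbd1:"
recovery_commands victim r0
echo "# on drbd2:"
recovery_commands survivor r0
```

Printing the commands instead of executing them keeps the decision about which node loses its changes with the operator, since --discard-my-data is destructive on the node where it runs.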

