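Environment:

    Two Huawei storage arrays: a primary Huawei 18500 and a backup 5300. The multipath software is Huawei's power path.

    The operating system is RHEL 6.5, clustered with RHCS. The application is ORACLE, running in active/standby mode.

    The mirror volume is already built on the /dev/sdb and /dev/sdc LUNs from the two arrays (this continues the previous post, "Converting an lvm2 linear volume to a mirror volume").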


Symptom:

    After rebooting both nodes, the names /dev/sdb and /dev/sdc turned out to be swapped between the two nodes.


On node1:

Disk /dev/sdb: 644.2 GB, 644245094400 bytes

255 heads, 63 sectors/track, 78325 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000


Disk /dev/sdc: 536.9 GB, 536870912000 bytes

255 heads, 63 sectors/track, 65270 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000


On node2:

Disk /dev/sdb: 536.9 GB, 536870912000 bytes

255 heads, 63 sectors/track, 65270 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000


Disk /dev/sdc: 644.2 GB, 644245094400 bytes

255 heads, 63 sectors/track, 78325 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000


The names are in the correct order on node2, so the service is running on node2.

Cluster Status for db @ Wed Apr 18 17:00:02 2018

Member Status: Quorate

Member Name                                                     ID   Status

------ ----                                                     ---- ------

node1                                                           1 Online, rgmanager

node2                                                           2 Online, Local, rgmanager

Service Name                                                     Owner (Last)                                                     State         

------- ----                                                     ----- ------                                                     -----         

service:db-oracle                                              node2                                                        started   


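The names produced by multipath aggregation now have to be made correct on both nodes. udev is used for this; the procedure is as follows:

step1: freeze the RHCS service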

[root@node1 ~]# clustat

Cluster Status for db @ Thu Apr 19 14:49:00 2018

Member Status: Quorate

Member Name                                                     ID   Status

------ ----                                                     ---- ------

node1                                                           1 Online, Local, rgmanager

node2                                                           2 Online, rgmanager

Service Name                                                     Owner (Last)                                                     State         

------- ----                                                     ----- ------                                                     -----         

service:db-oracle                                              node2                                                        started       


[root@node1 ~]# clusvcadm -Z db-oracle

Local machine freezing service:db-oracle...Success

[root@node1 ~]# clustat

Cluster Status for db @ Thu Apr 19 14:49:19 2018

Member Status: Quorate

Member Name                                                     ID   Status

------ ----                                                     ---- ------

node1                                                           1 Online, Local, rgmanager

node2                                                           2 Online, rgmanager

Service Name                                                     Owner (Last)                                                     State         

------- ----                                                     ----- ------                                                     -----         

service:db-oracle                                              node2                                                        started    [Z]

[root@node1 ~]#
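
As a small sketch, the freeze could also be confirmed from a script by looking for the [Z] flag in clustat output. The helper name is_frozen is hypothetical, and the sample line is abbreviated from the output above; on a live node one would pipe in clustat itself.

```shell
# Hypothetical helper: succeeds if the named service carries the [Z]
# (frozen) flag in clustat-style output read from stdin.
is_frozen() {
    awk -v svc="service:$1" '
        $1 == svc && /\[Z\]/ { found = 1 }
        END { exit !found }
    '
}

# Sample line abbreviated from the clustat output above.
sample='service:db-oracle    node2    started    [Z]'

if printf '%s\n' "$sample" | is_frozen db-oracle; then
    echo "db-oracle is frozen"
else
    echo "db-oracle is NOT frozen"
fi
```

On a cluster node the call would look like `clustat | is_frozen db-oracle`.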


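step2: use udev to bind the LUN names on both nodes

Note:

    Multipath software has to provide two things: 1. aggregate the paths and present one unique name per LUN on each node; 2. keep that name identical on both nodes.

    /dev/sdb and /dev/sdc on both nodes are the names produced by multipath aggregation. Huawei's iSCSI multipath software only gets the first part right: it generates unique names, but their order is not consistent across nodes. In other words, it does not deliver what multipath software is supposed to, so the problem has to be fixed by hand.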


Check the attributes of the two LUNs on both nodes:

[root@node1 ~]# udevadm info --query=env --path=/block/sdb

UDEV_LOG=3

DEVPATH=/devices/up_primary/up_adapter/host15/target15:0:0/15:0:0:1/block/sdb

MAJOR=8

MINOR=16

DEVNAME=/dev/sdb

......

ID_MODEL=XSG1

......

ID_REVISION=4303

......

ID_SERIAL=36a8ca7b100808f4e5f3ad82e00000068

......


[root@node1 ~]# udevadm info --query=env --path=/block/sdc

UDEV_LOG=3

DEVPATH=/devices/up_primary/up_adapter/host15/target15:0:1/15:0:1:1/block/sdc

MAJOR=8

MINOR=32

DEVNAME=/dev/sdc

......

ID_MODEL=HVS85T

......

ID_REVISION=4101

......

ID_SERIAL=364482e5100a340a72a7803c900000079

......


[root@node2 ~]# udevadm info --query=env --path=/block/sdb

UDEV_LOG=3

DEVPATH=/devices/up_primary/up_adapter/host15/target15:0:0/15:0:0:1/block/sdb

MAJOR=8

MINOR=16

DEVNAME=/dev/sdb

......

ID_MODEL=HVS85T

......

ID_REVISION=4101

......

ID_SERIAL=364482e5100a340a72a7803c900000079

......


[root@node2 ~]# udevadm info --query=env --path=/block/sdc

UDEV_LOG=3

DEVPATH=/devices/up_primary/up_adapter/host15/target15:0:1/15:0:1:1/block/sdc

MAJOR=8

MINOR=32

DEVNAME=/dev/sdc

......

ID_MODEL=XSG1

......

ID_REVISION=4303

......

ID_SERIAL=36a8ca7b100808f4e5f3ad82e00000068

......


Comparing the output shows that:

    ID_SERIAL of /dev/sdb on node1 equals ID_SERIAL of /dev/sdc on node2

    ID_SERIAL of /dev/sdc on node1 equals ID_SERIAL of /dev/sdb on node2


node1

/dev/sdb

ID_SERIAL=36a8ca7b100808f4e5f3ad82e00000068

/dev/sdc

ID_SERIAL=364482e5100a340a72a7803c900000079


node2

/dev/sdb

ID_SERIAL=364482e5100a340a72a7803c900000079

/dev/sdc

ID_SERIAL=36a8ca7b100808f4e5f3ad82e00000068
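
The comparison above can be sketched as a small script. This is only an illustration: get_serial is a hypothetical helper, and the sample input is trimmed from the udevadm output shown earlier rather than read from a live device.

```shell
# Hypothetical helper: print the ID_SERIAL value found in
# `udevadm info --query=env` output read from stdin.
# The trailing "=" keeps ID_SERIAL_SHORT from matching.
get_serial() {
    sed -n 's/^ID_SERIAL=//p' | head -n 1
}

# On a live node this would be, for example:
#   udevadm info --query=env --path=/block/sdb | get_serial
node1_sdb=$(printf 'DEVNAME=/dev/sdb\nID_MODEL=XSG1\nID_SERIAL=36a8ca7b100808f4e5f3ad82e00000068\n' | get_serial)
node2_sdb=$(printf 'DEVNAME=/dev/sdb\nID_MODEL=HVS85T\nID_SERIAL=364482e5100a340a72a7803c900000079\n' | get_serial)

if [ "$node1_sdb" = "$node2_sdb" ]; then
    echo "/dev/sdb is the same LUN on both nodes"
else
    echo "/dev/sdb refers to different LUNs on node1 and node2"
fi
```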


step3: shut down the database

SYS@nbora2> shutdown immediate;

[oracle@node2 ~]$ lsnrctl stop


step4: with the unique identifiers found in step2, and the database and listener stopped in step3, udev can now be configured

On node1:

[root@node1 ~]# more /etc/udev/rules.d/99-z-multipath.rules

KERNEL=="sd*",SUBSYSTEM=="block",ENV{ID_MODEL}=="HVS85T",ENV{ID_REVISION}=="4101",ENV{ID_SERIAL}=="364482e5100a340a72a7803c900000079",NAME="sdb"

KERNEL=="sd*",SUBSYSTEM=="block",ENV{ID_MODEL}=="XSG1",ENV{ID_REVISION}=="4303",ENV{ID_SERIAL}=="36a8ca7b100808f4e5f3ad82e00000068",NAME="sdc"


node2 uses the same rules: copy /etc/udev/rules.d/99-z-multipath.rules into /etc/udev/rules.d on node2.
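
To avoid typing mistakes in the long serial numbers, the two rule lines could be generated rather than written by hand. A minimal sketch, assuming a hypothetical helper named make_rule; it only prints rules in the same shape as the file above and writes nothing to disk.

```shell
# Hypothetical helper: print one udev rule that pins a LUN to a fixed name.
# Arguments: ID_MODEL, ID_REVISION, ID_SERIAL, desired device name.
make_rule() {
    printf 'KERNEL=="sd*",SUBSYSTEM=="block",ENV{ID_MODEL}=="%s",ENV{ID_REVISION}=="%s",ENV{ID_SERIAL}=="%s",NAME="%s"\n' \
        "$1" "$2" "$3" "$4"
}

# Values taken from the udevadm output in step2.
make_rule HVS85T 4101 364482e5100a340a72a7803c900000079 sdb
make_rule XSG1   4303 36a8ca7b100808f4e5f3ad82e00000068 sdc
```

Redirecting the output to a file and copying that file to both nodes keeps the two rule sets identical.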


Then run the following command on each node to re-apply all udev rules:

    start_udev


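Note:

    At first I matched only on ENV{ID_SERIAL} in the udev rules on both nodes, and after a reboot the names were still mixed up. I then added two more attributes, ENV{ID_MODEL} and ENV{ID_REVISION}, on both nodes; with all three, each LUN is identified uniquely on both nodes.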


[root@node1 ~]# fdisk -l /dev/sdb

Disk /dev/sdb: 536.9 GB, 536870912000 bytes

255 heads, 63 sectors/track, 65270 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000


[root@node1 ~]# fdisk -l /dev/sdc

Disk /dev/sdc: 644.2 GB, 644245094400 bytes

255 heads, 63 sectors/track, 78325 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000


[root@node2 rules.d]# fdisk -l /dev/sdb

Disk /dev/sdb: 536.9 GB, 536870912000 bytes

255 heads, 63 sectors/track, 65270 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000


[root@node2 rules.d]# fdisk -l /dev/sdc

Disk /dev/sdc: 644.2 GB, 644245094400 bytes

255 heads, 63 sectors/track, 78325 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000

As shown, /dev/sdb and /dev/sdc are now fully consistent across the two nodes.
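
The same check can be sketched as a script that reduces fdisk output to name and size pairs and compares the two nodes. disk_sizes is a hypothetical helper that assumes the RHEL 6 `fdisk -l` header format shown above. Equal sizes are only a coarse signal; ID_SERIAL remains the stronger proof of identity.

```shell
# Hypothetical helper: print "<device> <bytes>" for every disk header line
# in `fdisk -l` output (RHEL 6 format) read from stdin.
disk_sizes() {
    awk '/^Disk \/dev\// { sub(/:$/, "", $2); print $2, $5 }'
}

# Header lines copied from the fdisk output above.
node1=$(printf 'Disk /dev/sdb: 536.9 GB, 536870912000 bytes\nDisk /dev/sdc: 644.2 GB, 644245094400 bytes\n' | disk_sizes)
node2=$(printf 'Disk /dev/sdb: 536.9 GB, 536870912000 bytes\nDisk /dev/sdc: 644.2 GB, 644245094400 bytes\n' | disk_sizes)

if [ "$node1" = "$node2" ]; then
    echo "device names and sizes agree on both nodes"
else
    echo "device names and sizes differ between nodes"
fi
```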


step5: unfreeze the cluster service

[root@node2 ~]# clusvcadm -U db-oracle

Then test a relocation:

clusvcadm -r db-oracle -m node1


[root@node1 ~]# clustat

Cluster Status for db @ Thu Apr 19 15:17:53 2018

Member Status: Quorate

Member Name                                                     ID   Status

------ ----                                                     ---- ------

node1                                                           1 Online, Local, rgmanager

node2                                                           2 Online, rgmanager

Service Name                                                     Owner (Last)                                                     State         

------- ----                                                     ----- ------                                                     -----         

service:db-oracle                                              node1                                                        started  


The relocation works; the service moves to node1 without problems.


Check the LVM information on both nodes:

[root@node1 ~]# lvs -a -o name,copy_percent,devices vg_oracle

  LV                   Cpy%Sync Devices                                    

  lv_oracle              100.00 lv_oracle_mimage_0(0),lv_oracle_mimage_1(0)

  [lv_oracle_mimage_0]          /dev/sdb(0)                                

  [lv_oracle_mimage_1]          /dev/sdc(0)                                

  [lv_oracle_mlog]              /dev/sdc(127999)                         

[root@node2 ~]# lvs -a -o name,copy_percent,devices vg_oracle

  LV                   Cpy%Sync Devices                                    

  lv_oracle              100.00 lv_oracle_mimage_0(0),lv_oracle_mimage_1(0)

  [lv_oracle_mimage_0]          /dev/sdb(0)                                

  [lv_oracle_mimage_1]          /dev/sdc(0)                                

  [lv_oracle_mlog]              /dev/sdc(127999)            
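
As a sketch of how this could be checked automatically, the mirror legs can be pulled out of the lvs output and tested for sitting on two different devices. mirror_legs is a hypothetical helper; it assumes the column layout of the `lvs -a -o name,copy_percent,devices` output above, where the Cpy%Sync column is empty for the mimage rows.

```shell
# Hypothetical helper: print the backing device of each mirror image
# ([..._mimage_N] rows) from lvs output read from stdin.
mirror_legs() {
    awk '$1 ~ /_mimage_[0-9]+\]$/ { sub(/\(.*/, "", $2); print $2 }'
}

# Rows copied from the lvs output above.
legs=$(printf '  [lv_oracle_mimage_0]          /dev/sdb(0)\n  [lv_oracle_mimage_1]          /dev/sdc(0)\n  [lv_oracle_mlog]              /dev/sdc(127999)\n' | mirror_legs)

# Count distinct devices among the mirror legs.
distinct=$(printf '%s\n' "$legs" | sort -u | wc -l)

if [ "$distinct" -eq 2 ]; then
    echo "mirror legs are on two different devices"
else
    echo "mirror legs share a device"
fi
```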

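At this point the linear volume has been converted to a mirror volume, the LUN naming mix-up caused by Huawei's multipath software is resolved, and the cluster relocation test is complete.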