Synchronizing emcpower device names across the nodes of a multi-node cluster

 
1) Environment
OS: Red Hat Enterprise Linux 4.6 x86
Cluster: RHCS, 2 nodes
Multipath software: EMC PowerPath 5.1 for Linux
Storage: EMC AX4-5, EMC CX300
The AX4-5 presents one LUN to the hosts; the CX300 presents two LUNs.
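2) Problem description
After the LUN mappings were configured on the arrays, both node servers were rebooted one after the other. Both nodes detected the mapped LUNs. Running fdisk -l to check the device names assigned by the OS showed that the two nodes did not agree:
node1 saw emcpowera, emcpowerc, emcpowerd
node2 saw emcpowera, emcpowerb, emcpowerc
Judging by the LUN sizes, the devices correspond as follows:
node1 emcpowera = node2 emcpowera
node1 emcpowerc = node2 emcpowerb
node1 emcpowerd = node2 emcpowerc
Because the two nodes are to form a cluster, configuring shared storage requires that both nodes use the same device name for each LUN.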
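3) Analysis and troubleshooting
The device names on node2 are as expected. node1 is the odd one: for some reason emcpowerb is missing.
On node1, run powermt display dev=all
emcpadm getfreepseudos -n 5 shows that emcpowerb on node1 is not in the list of free names.
The business system was about to go live, so there was no time for deeper analysis. Two approaches came to mind: first, remove the paths detected on node1 and reboot to see whether that fixes it; second, manually rename the devices on node2 to match node1.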
Approach 1:
powermt remove dev=all // remove the currently detected paths
powermt config  // rediscover the paths
powermt display dev=all
reboot
The problem persisted; this did not fix it.
Approach 2:
On node2:
emcpadm getfreepseudos // shows that emcpowerd is free
emcpadm renamepseudo -s emcpowerc -t emcpowerd
emcpadm renamepseudo -s emcpowerb -t emcpowerc
powermt save
reboot
Both nodes now see emcpowera, emcpowerc, and emcpowerd. Problem solved.
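The order of the two renames in Approach 2 matters: emcpowerc has to be moved to emcpowerd before emcpowerb can take the name emcpowerc. As a rough sketch of that ordering rule only, the hypothetical helper below (plan_renames is not part of PowerPath) reads "source target" pairs and prints the rename commands in descending source order. It only prints and never runs anything, it covers only the case where names are shifted upward, and it assumes the rename subcommand is spelled renamepseudo on your PowerPath version (older releases may use a different spelling).

```shell
# Print (do not run) rename commands for "source target" pairs read from stdin.
# Pairs are emitted in descending source order, so that when names are shifted
# upward (b->c, c->d) each target name has already been vacated.
# Assumption: "emcpadm renamepseudo" is the rename subcommand on this version.
plan_renames() {
    sort -r | while read -r src tgt; do
        # Skip blank or incomplete lines.
        [ -n "$src" ] && [ -n "$tgt" ] || continue
        echo "emcpadm renamepseudo -s $src -t $tgt"
    done
}
```

For example, feeding it the two lines "emcpowerb emcpowerc" and "emcpowerc emcpowerd" prints the emcpowerc rename first. Review the printed commands before running them by hand.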
 
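4) Conclusion
node1 had an emcpowerb device during earlier testing. After that device was removed, the PowerPath configuration database was not updated, so emcpowerb still appeared to be in use.
Later I found related articles suggesting that the problem can also be tackled by forcibly deleting the PowerPath configuration files. The steps are as follows:
Stop the PowerPath service
# /etc/init.d/PowerPath stop
Back up the current configuration files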
# cp /etc/powermt.custom /etc/powermt.custom.old_config
# cp /etc/emcp_devicesDB.dat /etc/emcp_devicesDB.dat.old_config
# cp /etc/emcp_devicesDB.idx /etc/emcp_devicesDB.idx.old_config
Delete the PowerPath configuration files
# rm /etc/powermt.custom /etc/emcp_devicesDB.dat /etc/emcp_devicesDB.idx
Start the PowerPath service again
# /etc/init.d/PowerPath start
Save the PowerPath configuration
# powermt save
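The backup step above can be wrapped in a small function so that no file is forgotten before the delete. This is only a sketch: backup_pp_config is a hypothetical helper name, the three file names are the ones listed in the steps above, and the optional directory argument exists purely so the function can be tried on a scratch directory before pointing it at /etc.

```shell
# Copy each PowerPath config file found in a directory to <name>.old_config.
# The directory defaults to /etc; files that do not exist are skipped.
backup_pp_config() {
    dir=${1:-/etc}
    for f in powermt.custom emcp_devicesDB.dat emcp_devicesDB.idx; do
        if [ -f "$dir/$f" ]; then
            # -p keeps the original timestamps and permissions on the copy.
            cp -p "$dir/$f" "$dir/$f.old_config" || return 1
        fi
    done
}
```

Run it as root after stopping PowerPath, with no argument to operate on /etc, and check that the .old_config copies exist before removing the originals.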
5) References
root cause 1
In some cases, during installation of PowerPath and device reconfiguration, a server may skip a few "emcpower" devices due to devices that were removed.  PowerPath keeps track of devices and makes sure that the emcpower device names remain the same regardless of the underlying Linux /dev/sd# device.
Fix: steps for powerpath 4.x
1) Make sure all I/O is stopped and all of the file systems to the array are unmounted.
2) Stop PowerPath
# /etc/init.d/PowerPath stop
3) Make a backup copy of the current PowerPath custom file just in case
# cp /etc/powermt.custom /etc/powermt.custom.old_config
4) Make a backup copy of the current PowerPath config dat file...just in case
# cp /etc/emcp_devicesDB.dat /etc/emcp_devicesDB.dat.old_config
5) Make a backup copy of the current PowerPath config idx file...just in case
# cp /etc/emcp_devicesDB.idx /etc/emcp_devicesDB.idx.old_config
6) Remove the old config files
# rm /etc/powermt.custom /etc/emcp_devicesDB.dat /etc/emcp_devicesDB.idx
7) Remove the /etc/emc/archive directory.
# rm -r /etc/emc/archive
8) Start PowerPath
# /etc/init.d/PowerPath start
9) Save the new configuration
# powermt save
In some cases with PowerPath 4.x this process will clean up the PowerPath devices but they still will not be discovered in Bus-Target-LUN order so if you are trying to synchronize emcpower device numbers between two cluster nodes it may not work.  In this case it is recommended that you present the devices to the node one at a time in the order you want them to appear.
root cause 2
Devices were not added to the nodes in the same order
Fix: steps for powerpath 4.x
       Use the emcpadm command to change the emcpower pseudo devices to the desired names.
In order to "fix" the discrepancy between the two nodes the emcpadm command can be used.
1 Use the command below in order to determine the emcpower devices that are already in use
# emcpadm getused
PowerPath pseudo device names in use:
        Pseudo Device Name      Major# Minor#
                emcpowera         232      0
                emcpowerb         232     16
                emcpowerc         232     32
                emcpowerd         232     48
                emcpowere         232     64
                emcpowerg         232     96
2 Use the command below in order to determine the emcpower devices that are available
# emcpadm getfree -n 5 -b emcpowera
PowerPath pseudo device names not in use:
        Pseudo Device Name      Major# Minor#
                emcpowerf         232     80
                emcpowerh         232    112
                emcpoweri         232    128
                emcpowerj         232    144
                emcpowerk         232    160
3 Use the command below to rename a device
# emcpadm rename -s emcpowerg -t emcpowerf  
4 The "emcpadm getused" command can now be used again to check the devices after the rename
# emcpadm getused
PowerPath pseudo device names in use:
        Pseudo Device Name      Major# Minor#
                emcpowera         232      0
                emcpowerb         232     16
                emcpowerc         232     32
                emcpowerd         232     48
                emcpowere         232     64
                emcpowerf         232     80
5 Note In order to make sure that the actual volumes match between the two cluster nodes the "powermt display dev=all" command can be used from each node in the cluster for comparison.
# powermt display dev=all
Pseudo name=emcpowerf
CLARiiON ID=WRE00021500573 [Linux103]
Logical device ID=6006016022470A0084D8358B528BD911 [LUN 10]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats -
## HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs errors
==============================================================================
  2 lpfc                      sdg        SP A0     active  alive      0      0
  3 lpfc                      sdm        SP B0     active  alive      0      0
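To make the node-to-node comparison less error-prone than reading the full listings side by side, the output can be reduced to one line per device. The function below is a minimal sketch, not an EMC tool: pp_map is a hypothetical name, and it assumes the "Pseudo name=" and "Logical device ID=" line layout shown in the sample output above.

```shell
# List "pseudo-name logical-device-ID" pairs from `powermt display dev=all`
# output read on stdin, sorted by logical device ID.
pp_map() {
    awk '
        # Remember the most recent pseudo device name.
        /^Pseudo name=/ { sub(/^Pseudo name=/, ""); name = $1 }
        # Pair it with the logical device ID that follows.
        /^Logical device ID=/ {
            sub(/^Logical device ID=/, "")
            print name, $1
        }
    ' | sort -k2
}
```

On each node, run something like `powermt display dev=all | pp_map > /tmp/$(hostname).map`, copy the files to one machine, and diff them. Any line that differs is a LUN whose emcpower name does not match between the nodes.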
 
