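AIX rootvg errors: failure caused by incorrectly removing a PV
System environment:
Operating system: AIX 5300-08
A couple of days ago, while doing AIX system maintenance, a customer ran into the following case:
Symptoms:
1. Listing rootvg shows one PV in the missing state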
[root@aix199 /]#lsvg -p rootvg
rootvg:
PV_NAME           PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
hdisk0            active     524         0          00..00..00..00..00
0516-304 : Unable to find device id 000681aa29d5ceff in the Device
        Configuration Database.
000681aa29d5ceff  missing    480         184        70..00..00..18..96
[root@aix199 /]#
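The check above can be scripted. What follows is a minimal sketch, assuming the `lsvg -p` column layout shown in this transcript (the second column holds the PV state). The function name `list_missing_pvs` is hypothetical and is not an AIX command.

```shell
# Print the identifier of every PV that `lsvg -p` reports as "missing".
# Reads `lsvg -p <vg>` output on stdin. Header and 0516-304 message lines
# are skipped because their second field is never the word "missing".
list_missing_pvs() {
    awk '$2 == "missing" { print $1 }'
}
```

On the affected host this could be called as `lsvg -p rootvg 2>&1 | list_missing_pvs`, which for the output above yields the PVID 000681aa29d5ceff.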
2. Creating a directory under /home fails with an I/O error (creating a regular file works)
[root@aix199 /]#mkdir /home/aaa
mkdir: cannot create /home/aaa.
/home/aaa: I/O error
[root@aix199 /]#df -m
Filesystem    MB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4       10880.00  10012.31    8%     5890     1% /
/dev/hd2       10560.00   8354.95   21%    45691     2% /usr
/dev/hd9var     5120.00   4769.67    7%     1145     1% /var
/dev/hd3        9856.00   9404.13    5%      440     1% /tmp
/dev/hd10opt    5120.00   4693.67    9%     4800     1% /opt
/dev/lv00       5120.00   4959.23    4%       18     1% /var/adm/csd
/dev/lv_soft    9600.00   8029.12   17%     2522     1% /soft
/dev/hd1        5120.00   4877.34    5%      213     1% /home
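-- So lack of space is not the cause.
Preliminary analysis: rootvg had two PVs, and one of them was removed incorrectly, which caused the errors above.
Solution:
1. Try to remove the missing PV from rootvg the normal way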
[root@aix199 /]#reducevg rootvg 000681aa29d5ceff
0516-016 ldeletepv: Cannot delete physical volume with allocated
        partitions. Use either migratepv to move the partitions or
        reducevg with the -d option to delete the partitions.
0516-884 reducevg: Unable to remove physical volume 000681aa29d5ceff.
[root@aix199 /]#reducevg -d rootvg 000681aa29d5ceff
0516-914 rmlv: Warning, all data belonging to logical volume
        lv00 on physical volume 000681aa29d5ceff will be destroyed.
rmlv: Do you wish to continue? y(es) n(o)? y
0516-1008 rmlv: Logical volume lv00 must be closed. If the logical
        volume contains a filesystem, the umount command will close
        the LV device.
0516-1008 rmlv: Logical volume hd9var must be closed. If the logical
        volume contains a filesystem, the umount command will close
        the LV device.
0516-1008 rmlv: Logical volume hd10opt must be closed. If the logical
        volume contains a filesystem, the umount command will close
        the LV device.
0516-884 reducevg: Unable to remove physical volume 000681aa29d5ceff.
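Some LVs in rootvg have PPs allocated on the missing PV, including the hd9var and hd10opt logical volumes. Until the allocation records for those PPs are cleaned up, the missing PV cannot be removed.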
[root@aix199 /]#lsvg -l rootvg
rootvg:
LV NAME     TYPE     LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5         boot     1     1     1    closed/syncd  N/A
hd6         paging   8     8     1    open/syncd    N/A
hd8         jfslog   1     1     1    open/syncd    N/A
hd4         jfs      170   170   1    open/syncd    /
hd2         jfs      165   165   1    open/syncd    /usr
0516-1147 : Warning - logical volume hd9var may be partially mirrored.
hd9var      jfs      80    81    3    open/stale    /var
hd3         jfs      154   154   1    open/syncd    /tmp
hd1         jfs      80    80    2    open/syncd    /home
hd10opt     jfs      80    80    2    open/syncd    /opt
lv00        jfs      80    80    2    open/syncd    /var/adm/csd
[root@aix199 /]#
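Since only hdisk0 is still present, any LV whose PVs column is greater than 1 has partitions somewhere other than hdisk0. As a rough sketch, assuming the `lsvg -l` column layout shown above, such LVs can be listed with a small filter. The function name `lvs_on_multiple_pvs` is hypothetical, and in a volume group with more than two disks a hit does not by itself prove the LV touches the missing PV.

```shell
# List LVs that span more than one PV, from `lsvg -l <vg>` output on stdin.
# Data rows have at least 7 fields with a numeric 5th field (PVs); the header
# and the 0516-1147 warning line fail the numeric test and are ignored.
lvs_on_multiple_pvs() {
    awk 'NF >= 7 && $5 ~ /^[0-9]+$/ && $5 + 0 > 1 { print $1 }'
}
```

For the listing above, `lsvg -l rootvg 2>&1 | lvs_on_multiple_pvs` would name hd9var, hd1, hd10opt and lv00 as the candidates to inspect with `lslv -m`.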
2. Check how the PPs of hd9var are allocated
[root@aix199 /]#lslv -m hd9var
hd9var:/var
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0215 hdisk0
0002  0504 hdisk0
0003  0509 hdisk0
0004  0510 hdisk0
0005  0513 hdisk0
0006  0514 hdisk0
0007  0521 hdisk0
0008  0522 hdisk0
0516-304 : Unable to find device id 000681aa29d5ceff in the Device
        Configuration Database.
0009  0217 000681aa29d5ceff
0516-304 : Unable to find device id 000681aa29d5ceff in the Device
        Configuration Database.
0010  0218 000681aa29d5ceff
......
The output shows that hd9var has only 8 PPs on hdisk0; all the remaining PPs are allocated on the missing PV.
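Counting the rows by eye is error-prone with 80 entries. A minimal sketch of tallying PPs per PV follows, assuming the `lslv -m` layout shown above and an unmirrored LV (only the PP1/PV1 columns are counted). The function name `count_pps_per_pv` is hypothetical.

```shell
# Tally first-copy PPs per PV from `lslv -m <lv>` output on stdin.
# Only rows whose first field is a plain LP number are counted, so the
# title line, the column header and 0516-304 messages are skipped.
count_pps_per_pv() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 3 { n[$3]++ }
         END { for (pv in n) print pv, n[pv] }' | LC_ALL=C sort
}
```

It could be run as `lslv -m hd9var 2>&1 | count_pps_per_pv`, printing one line per PV with the number of PPs found on it.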
3. Write the PP allocation map to a temporary file
[root@aix199 /]#lquerylv -L `getlvodm -l hd9var` -r >/tmp/mapfile
Note: the command uses backquotes (command substitution), not single quotes.
Review the PP allocation map, then edit it
[root@aix199 /]#cat /tmp/mapfile
0009affa94970f34 215 1
0009affa94970f34 504 2
0009affa94970f34 509 3
0009affa94970f34 510 4
......
000681aa29d5ceff 286 78
000681aa29d5ceff 287 79
000681aa29d5ceff 288 80
[root@aix199 /]#
Note: 80 PPs in total.
Keep only the PPs to be removed in the file (the first 8 PPs are on hdisk0, so their lines are deleted):
[root@aix199 /]#cat /tmp/mapfile
000681aa29d5ceff 217 9
000681aa29d5ceff 218 10
......
000681aa29d5ceff 287 79
000681aa29d5ceff 288 80
[root@aix199 /]#
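The manual edit of the map file can also be done with a filter. This is only a sketch, assuming the three-column map format shown above (PVID, PP number, LP number). The function name `keep_pvid_entries` and the output file name used in the usage note are hypothetical.

```shell
# Keep only the map entries that live on the given PVID.
# $1 = PVID of the missing PV; the full map (lquerylv -r output) is on stdin.
# Appending "" forces a string comparison, so an all-digit PVID is never
# compared as a floating-point number.
keep_pvid_entries() {
    awk -v pvid="$1" '($1 "") == (pvid "")'
}
```

For example, `keep_pvid_entries 000681aa29d5ceff < /tmp/mapfile > /tmp/mapfile.missing` would leave only the entries on the missing PV, which avoids accidentally keeping an hdisk0 line.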
4. Remove the PPs allocated on the missing PV
[root@aix199 /]#wc -l /tmp/mapfile
73 /tmp/mapfile
[root@aix199 /]#lreducelv -l `getlvodm -l hd9var` -s 73 /tmp/mapfile
[root@aix199 /]#
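Because `lreducelv` acts directly on the LV, it may be worth checking the map file before running it. The sketch below only prints the command; it does not execute it. It assumes, as the transcript suggests, that the value passed to `-s` equals the number of lines in the map file. The function name `print_reduce_cmd` is hypothetical.

```shell
# Print (do not run) the lreducelv command for a map file, after checking
# that the file is non-empty and every entry belongs to the expected PVID.
# $1 = LV name, $2 = PVID of the missing PV, $3 = map file
# The body runs in a subshell, so its variables do not leak to the caller.
print_reduce_cmd() (
    lv=$1; pvid=$2; map=$3
    # Count entries on any other PVID; blank lines are counted as bad too.
    bad=$(awk -v pvid="$pvid" '($1 "") != (pvid "") { n++ } END { print n + 0 }' "$map")
    count=$(awk 'END { print NR }' "$map")
    if [ "$count" -eq 0 ] || [ "$bad" -ne 0 ]; then
        echo "refusing: $map is empty or has $bad entries not on $pvid" >&2
        exit 1
    fi
    printf 'lreducelv -l `getlvodm -l %s` -s %s %s\n' "$lv" "$count" "$map"
)
```

Calling `print_reduce_cmd hd9var 000681aa29d5ceff /tmp/mapfile` prints the command line for review; if an hdisk0 entry was left in the file, the function refuses and returns a non-zero status instead.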
5. Check the state of hd9var
[root@aix199 /]#lslv hd9var
LOGICAL VOLUME:     hd9var                 VOLUME GROUP:   rootvg
LV IDENTIFIER:      0008570c00004c0000000144684ecb4c.6 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs                    WRITE VERIFY:   off
MAX LPs:            512                    PP SIZE:        64 megabyte(s)
COPIES:             1                      SCHED POLICY:   parallel
LPs:                8                      PPs:            8
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       center                 UPPER BOUND:    32
MOUNT POINT:        /var                   LABEL:          /var
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
[root@aix199 /]#getlvcb -AT hd9var
         AIX LVCB
         intrapolicy = c
         copies = 1
         interpolicy = m
         lvid = 0008570c00004c0000000144684ecb4c.6
         lvname = hd9var
         label = /var
         machine id = 8570C4C00
         number lps = 8
         relocatable = y
         strict = y
         stripe width = 0
         stripe size in exponent = 0
         type = jfs
         upperbound = 32
         fs =
         time created  = Tue Feb 25 09:10:14 2014
         time modified = Thu Mar 6 12:21:22 2014
Save the configuration:
[root@aix199 /]#savebase
[root@aix199 /]#lsvg -l rootvg
rootvg:
LV NAME     TYPE     LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5         boot     1     1     1    closed/syncd  N/A
hd6         paging   8     8     1    open/syncd    N/A
hd8         jfslog   1     1     1    open/syncd    N/A
hd4         jfs      170   170   1    open/syncd    /
hd2         jfs      165   165   1    open/syncd    /usr
hd9var      jfs      8     8     1    open/syncd    /var
hd3         jfs      154   154   1    open/syncd    /tmp
hd1         jfs      80    80    2    closed/syncd  /home
hd10opt     jfs      80    80    2    open/syncd    /opt
lv00        jfs      3     3     1    closed/syncd  /var/adm/csd
Handle hd10opt the same way:
[root@aix199 /]#lslv -m hd10opt
hd10opt:/opt
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0218 hdisk0
0002  0229 hdisk0
0003  0505 hdisk0
0004  0506 hdisk0
0005  0507 hdisk0
0006  0511 hdisk0
0007  0512 hdisk0
0008  0517 hdisk0
0009  0518 hdisk0
0010  0523 hdisk0
0011  0524 hdisk0
0516-304 : Unable to find device id 000681aa29d5ceff in the Device
        Configuration Database.
0012  0193 000681aa29d5ceff
0516-304 : Unable to find device id 000681aa29d5ceff in the Device
        Configuration Database.
0013  0194 000681aa29d5ceff
0516-304 : Unable to find device id 000681aa29d5ceff in the Device
        Configuration Database.
0014  0195 000681aa29d5ceff
......
[root@aix199 /]#cat /tmp/mapfile
0009affa94970f34 218 1
0009affa94970f34 229 2
0009affa94970f34 505 3
......
000681aa29d5ceff 190 78
000681aa29d5ceff 191 79
000681aa29d5ceff 192 80
[root@aix199 /]#
Keep only the PPs to be removed in the temporary file:
[root@aix199 /]#cat /tmp/mapfile
000681aa29d5ceff 193 12
000681aa29d5ceff 194 13
000681aa29d5ceff 195 14
......
000681aa29d5ceff 190 78
000681aa29d5ceff 191 79
000681aa29d5ceff 192 80
[root@aix199 /]#wc -l /tmp/mapfile
69 /tmp/mapfile
[root@aix199 /]#
[root@aix199 /]#lreducelv -l `getlvodm -l hd10opt` -s 69 /tmp/mapfile
[root@aix199 /]#lslv -m hd10opt
hd10opt:/opt
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0218 hdisk0
0002  0229 hdisk0
0003  0505 hdisk0
0004  0506 hdisk0
0005  0507 hdisk0
0006  0511 hdisk0
0007  0512 hdisk0
0008  0517 hdisk0
0009  0518 hdisk0
0010  0523 hdisk0
0011  0524 hdisk0
6. Try again to remove the missing PV from rootvg
[root@aix199 /]#lsvg -p rootvg
rootvg:
PV_NAME           PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
hdisk0            active     524         0          00..00..00..00..00
0516-304 : Unable to find device id 000681aa29d5ceff in the Device
        Configuration Database.
000681aa29d5ceff  missing    480         480        96..96..96..96..96
[root@aix199 /]#reducevg -d rootvg 000681aa29d5ceff
0516-304 putlvodm: Unable to find device id 000681aa29d5ceff0000000000000000 in the
        Device Configuration Database.
0516-896 reducevg: Warning, cannot remove physical volume 000681aa29d5ceff from
        Device Configuration Database.
[root@aix199 /]#lsvg -p rootvg
rootvg:
PV_NAME           PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
hdisk0            active     524         0          00..00..00..00..00
Note: the missing PV has now been removed.
7. Fix the I/O error
[root@aix199 /]#mkdir /home/aaa
mkdir: cannot create /home/aaa.
/home/aaa: I/O error
Note: creating a directory under /home still fails.
[root@aix199 /]#lslv -m hd1
hd1:/home
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0217 hdisk0
0002  0515 hdisk0
0003  0516 hdisk0
[root@aix199 /]#lsvg rootvg
VOLUME GROUP:       rootvg                   VG IDENTIFIER:  0008570c00004c0000000144684ecb4c
VG STATE:           active                   PP SIZE:        64 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      524 (33536 megabytes)
MAX LVs:            256                      FREE PPs:       0 (0 megabytes)
LVs:                10                       USED PPs:       524 (33536 megabytes)
OPEN LVs:           8                        QUORUM:         2 (Enabled)
TOTAL PVs:          1                        VG DESCRIPTORS: 2
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         1                        AUTO ON:        yes
MAX PPs per VG:     32512
MAX PPs per PV:     1016                     MAX PVs:        32
LTG size (Dynamic): 2048 kilobyte(s)         AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
[root@aix199 /]#lsvg -l rootvg
rootvg:
LV NAME     TYPE     LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5         boot     1     1     1    closed/syncd  N/A
hd6         paging   8     8     1    open/syncd    N/A
hd8         jfslog   1     1     1    open/syncd    N/A
hd4         jfs      170   170   1    open/syncd    /
hd2         jfs      165   165   1    open/syncd    /usr
hd9var      jfs      8     8     1    open/syncd    /var
hd3         jfs      154   154   1    open/syncd    /tmp
hd1         jfs      3     3     1    open/syncd    /home
hd10opt     jfs      11    11    1    open/syncd    /opt
lv00        jfs      3     3     1    closed/syncd  /var/adm/csd
[root@aix199 /]#df -m
Filesystem    MB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4       10880.00  10012.24    8%     5890     1% /
/dev/hd2       10560.00   8354.95   21%    45691     2% /usr
/dev/hd9var     5120.00   4769.67    7%     1145     1% /var
/dev/hd3        9856.00   9404.13    5%      441     1% /tmp
/dev/hd10opt    5120.00   4693.67    9%     4800     1% /opt
/dev/lv_soft    9600.00   8029.12   17%     2522     1% /soft
/dev/hd1        5120.00   4877.34    5%      213     1% /home
The output shows that the LV behind /home now has only 3 PPs (192 MB at 64 MB per PP), yet df still reports 5120 MB for /home. The /home filesystem must therefore still be referencing space on the missing PV.
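The mismatch is simple arithmetic: LV capacity is the number of LPs times the PP size. A small sketch of that check follows; the function names `lv_capacity_mb` and `fs_exceeds_lv` are hypothetical, and the inputs are assumed to be copied by hand from `lsvg -l`, `lsvg` and `df -m`.

```shell
# Capacity in MB that an LV can back: LPs * PP size in MB.
# $1 = number of LPs, $2 = PP size in MB
lv_capacity_mb() {
    echo $(( $1 * $2 ))
}

# Succeed when the filesystem size reported by df -m is larger than the LV.
# $1 = LPs, $2 = PP size in MB, $3 = "MB blocks" value from df -m (e.g. 5120.00)
fs_exceeds_lv() {
    # Drop the decimal part so the shell can compare integers.
    [ "${3%%.*}" -gt "$(lv_capacity_mb "$1" "$2")" ]
}
```

With the values above, `lv_capacity_mb 3 64` gives 192, and `fs_exceeds_lv 3 64 5120.00` succeeds, which flags hd1 as a filesystem larger than the LV that holds it.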
8. Back up /home, then remove the /home filesystem so it can be recreated
[root@aix199 /]#umount /home
[root@aix199 /]#smit rmfs
Removing the /home filesystem also removes the hd1 logical volume.
[root@aix199 /]#lsvg -l rootvg
rootvg:
LV NAME     TYPE     LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5         boot     1     1     1    closed/syncd  N/A
hd6         paging   8     8     1    open/syncd    N/A
hd8         jfslog   1     1     1    open/syncd    N/A
hd4         jfs      170   170   1    open/syncd    /
hd2         jfs      165   165   1    open/syncd    /usr
hd9var      jfs      8     8     1    open/syncd    /var
hd3         jfs      154   154   1    open/syncd    /tmp
hd10opt     jfs      11    11    1    open/syncd    /opt
lv00        jfs      3     3     1    open/syncd    /var/adm/csd
9. Recreate the hd1 logical volume and mount it on /home
[root@aix199 /]#lsvg rootvg
VOLUME GROUP:       rootvg                   VG IDENTIFIER:  0008570c00004c0000000144684ecb4c
VG STATE:           active                   PP SIZE:        64 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      524 (33536 megabytes)
MAX LVs:            256                      FREE PPs:       3 (192 megabytes)
LVs:                9                        USED PPs:       521 (33344 megabytes)
OPEN LVs:           8                        QUORUM:         2 (Enabled)
TOTAL PVs:          1                        VG DESCRIPTORS: 2
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         1                        AUTO ON:        yes
MAX PPs per VG:     32512
MAX PPs per PV:     1016                     MAX PVs:        32
LTG size (Dynamic): 2048 kilobyte(s)         AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
[root@aix199 /]#lsvg -l rootvg
rootvg:
LV NAME     TYPE     LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5         boot     1     1     1    closed/syncd  N/A
hd6         paging   8     8     1    open/syncd    N/A
hd8         jfslog   1     1     1    open/syncd    N/A
hd4         jfs      170   170   1    open/syncd    /
hd2         jfs      165   165   1    open/syncd    /usr
hd9var      jfs      8     8     1    open/syncd    /var
hd3         jfs      154   154   1    open/syncd    /tmp
hd1         jfs      2     2     1    closed/syncd  N/A
hd10opt     jfs      11    11    1    open/syncd    /opt
lv00        jfs      3     3     1    open/syncd    /var/adm/csd
[root@aix199 /]#mount /home
Verification:
[root@aix199 /]#ls /home
lost+found
[root@aix199 /]#mkdir /home/aa
[root@aix199 /]#ls -l /home
total 16
drwxr-sr-x    2 root     sys             512 May  4 17:54 aa
drwxrwx---    2 root     system          512 May  4 17:53 lost+found
[root@aix199 /]#df -m
Filesystem    MB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4       10880.00  10012.22    8%     5893     1% /
/dev/hd2       10560.00   8354.95   21%    45691     2% /usr
/dev/hd9var     5120.00   4769.67    7%     1145     1% /var
/dev/hd3        9856.00   9404.12    5%      441     1% /tmp
/dev/hd10opt    5120.00   4693.67    9%     4800     1% /opt
/dev/lv00       5120.00   4959.23    4%       18     1% /var/adm/csd
/dev/lv_soft    9600.00   8029.12   17%     2522     1% /soft
/dev/hd1         128.00    123.94    4%       18     1% /home
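At this point the problem is solved. When adding a PV to rootvg, always use a local disk rather than a disk on the storage array, and when removing a PV, follow the correct procedure.
Reference: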
http://www.doc88.com/p-687602374822.html