ASM disks in the FORCING state

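Conclusion:
If one failgroup of a diskgroup goes offline and stays offline longer than the time defined by disk_repair_time, ASM drops that failgroup's disks.
If, after the drop, fewer failgroups remain than the redundancy level requires (2 for normal, 3 for high), or the remaining space cannot hold the redundant copies, the rebalance cannot complete and the dropped disk is left in the FORCING state.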
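
As a quick check for this condition, a query along the following lines could list any disks left in the FORCING state. This is only a sketch: it relies on the standard V$ASM_DISK and V$ASM_DISKGROUP views and should be run on the ASM instance; the column aliases are chosen here for readability.

```sql
-- List every ASM disk currently in the FORCING state,
-- together with the diskgroup it belongs to.
select g.name as diskgroup,
       d.name as disk_name,
       d.failgroup,
       d.mode_status,
       d.state
  from v$asm_disk d
  join v$asm_diskgroup g on g.group_number = d.group_number
 where d.state = 'FORCING';
```

The DISK_NAME value (or the original name it replaced) is what you need when adding the replacement disk.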

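If a disk is in the FORCING state, specify a name when you add the replacement disk with ALTER DISKGROUP; the name must be the one the FORCING disk originally had.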

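For example, disk SSDDG_0001 of diskgroup ssddg was dropped and its state is FORCING. After the diskgroup is restored with the following command, the state of SSDDG_0001 returns to NORMAL: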

alter diskgroup ssddg add failgroup rac2 disk '/dev/raw/raw3' name SSDDG_0001 force;

Experiment procedure:

The test subject is a diskgroup with 3 failgroups. For the purposes of the experiment, disk_repair_time is set to 5 minutes:

SQL>  select name,failgroup,path from v$asm_disk where group_number=(select group_number from v$asm_diskgroup where  name='SSDDG');
NAME                FAILG PATH
------------------- ----- ---------------
SSDDG_0001          RAC2  /dev/raw/raw3
SSDDG_0000          RAC1  /dev/raw/raw2
SSDDG_0001_0        RAC3  /dev/raw/raw5

ASMCMD> lsattr -G ssddg -lm
Group_Name  Name                     Value       RO  Sys  
SSDDG       access_control.enabled   FALSE       N   Y    
SSDDG       access_control.umask     066         N   Y    
SSDDG       au_size                  1048576     Y   Y    
SSDDG       cell.smart_scan_capable  FALSE       N   N    
SSDDG       compatible.asm           11.2.0.3.0  N   Y    
SSDDG       compatible.rdbms         11.2.0.3    N   Y    
SSDDG       content.type             data        N   Y    
SSDDG       disk_repair_time         5m          N   Y    
SSDDG       idp.boundary             auto        N   Y    
SSDDG       idp.type                 dynamic     N   Y    
SSDDG       sector_size              512         Y   Y
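
The post does not show how the 5-minute value was set. A minimal sketch, assuming the standard ALTER DISKGROUP ... SET ATTRIBUTE syntax was used, followed by a check against V$ASM_ATTRIBUTE:

```sql
-- Assumed statement (not shown in the original post):
-- shorten the repair window to 5 minutes for the test.
alter diskgroup ssddg set attribute 'disk_repair_time' = '5m';

-- Confirm the value from the ASM instance.
select a.name, a.value
  from v$asm_attribute a
  join v$asm_diskgroup g on g.group_number = a.group_number
 where g.name = 'SSDDG'
   and a.name = 'disk_repair_time';
```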

The diskgroup holds very little data:

ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     20480    18245                0           18245              0             N  DATADG/
MOUNTED  EXTERN  N         512   4096  1048576      2048     1653                0            1653              0             Y  OCRVOTE/
MOUNTED  NORMAL  N         512   4096  1048576     24576    24237             8192            8022              0             N  SSDDG/

Experiment 1:
Manually take disk SSDDG_0001_0 offline:

SQL> select name,failgroup,path,mode_status,state from v$asm_disk where group_number=(select group_number from v$asm_diskgroup where  name='SSDDG');
NAME                FAILG PATH            MODE_STATUS    STATE
------------------- ----- --------------- -------------- ----------------
SSDDG_0001_0        RAC3                  OFFLINE        NORMAL
SSDDG_0001          RAC2  /dev/raw/raw3   ONLINE         NORMAL
SSDDG_0000          RAC1  /dev/raw/raw2   ONLINE         NORMAL
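
The statement used to take the disk offline is not shown in the post. One way to do it, offered here as an assumption rather than the author's exact command, is ALTER DISKGROUP ... OFFLINE DISK. While the disk is offline, the REPAIR_TIMER column of V$ASM_DISK counts down the seconds remaining before ASM drops it:

```sql
-- Assumed statement: take the single disk of failgroup RAC3 offline.
alter diskgroup ssddg offline disk SSDDG_0001_0;

-- REPAIR_TIMER is the number of seconds left before the offline disk
-- is dropped automatically.
select name, failgroup, mode_status, state, repair_timer
  from v$asm_disk
 where group_number = (select group_number
                         from v$asm_diskgroup
                        where name = 'SSDDG');
```

Re-running the second query shows the timer decreasing until the drop begins.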

After 5 minutes the disk is dropped:

NAME                FAILG PATH            MODE_STATUS    STATE
------------------- ----- --------------- -------------- ----------------
_DROPPED_0001_SSDDG RAC3                  OFFLINE        FORCING
SSDDG_0001          RAC2  /dev/raw/raw3   ONLINE         NORMAL
SSDDG_0000          RAC1  /dev/raw/raw2   ONLINE         NORMAL

Final result after the drop:

NAME                FAILG PATH            MODE_STATUS    STATE
------------------- ----- --------------- -------------- ----------------
SSDDG_0001          RAC2  /dev/raw/raw3   ONLINE         NORMAL
SSDDG_0000          RAC1  /dev/raw/raw2   ONLINE         NORMAL

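Conclusion: after the offline disk is dropped, the diskgroup still has 2 failgroups and enough space for normal redundancy, so no record of the dropped disk remains.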

Experiment 2: starting from the end state of Experiment 1, take one more failgroup offline. Final result after the drop:

NAME                FAILG PATH            MODE_STATUS    STATE
------------------- ----- --------------- -------------- ----------------
_DROPPED_0002_SSDDG RAC2                  OFFLINE        FORCING
SSDDG_0000          RAC1  /dev/raw/raw2   ONLINE         NORMAL

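Experiment 3: same environment as Experiment 1, but first create a 10 GB datafile on ssddg, so that once one failgroup is dropped the diskgroup no longer has enough space for normal redundancy. (Assuming the three disks are the same size, the 24576 MB total reported by lsdg means about 8 GB per failgroup; a mirrored 10 GB file needs roughly 20 GB of raw space, while two failgroups provide only about 16 GB.) Take one failgroup offline. Final result after the drop: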

NAME                FAILG PATH            MODE_STATUS    STATE
------------------- ----- --------------- -------------- ----------------
_DROPPED_0002_SSDDG RAC3                  OFFLINE        FORCING
SSDDG_0000          RAC1  /dev/raw/raw2   ONLINE         NORMAL
SSDDG_0001          RAC2  /dev/raw/raw3   ONLINE         NORMAL

Output of asmcmd lsop:

[grid@node1 ~]$ asmcmd lsop
Group_Name  Dsk_Num  State  Power  EST_WORK  EST_RATE  EST_TIME  
SSDDG       REBAL    ERRS   1
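
The same information is available from SQL. As a sketch, assuming the standard V$ASM_OPERATION view on the ASM instance, the following query shows the rebalance state for SSDDG; ERROR_CODE holds the Oracle error that stopped the operation, if any:

```sql
-- Show the state of any rebalance operation for diskgroup SSDDG.
select o.group_number,
       o.operation,
       o.state,
       o.power,
       o.error_code
  from v$asm_operation o
  join v$asm_diskgroup g on g.group_number = o.group_number
 where g.name = 'SSDDG';
```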

Alert log contents:

Thu Mar 31 15:31:00 2016
ERROR: ORA-15041 thrown in ARB0 for group number 3
Errors in file /oracle/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_22412.trc:
ORA-15041: diskgroup "SSDDG" space exhausted
Thu Mar 31 15:31:00 2016
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 3/0xf7e87524 (SSDDG)
