NetApp 3050 Disk Replacement Procedure
1. Check the current state first
netapp-db-B> sysconfig -r
...... Current state
Spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare 2a.29 2a 1 13 FC:A - FCAL 10000 136000/278528000 137422/281442144
spare 1a.45 1a 2 13 FC:A - FCAL 10000 272000/557056000 280104/573653840
...... Original state
Spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare 2a.48 2a 3 0 FC:A - FCAL 10000 136000/278528000 137104/280790184
spare 2a.29 2a 1 13 FC:A - FCAL 10000 136000/278528000 137422/281442144
spare 1a.45 1a 2 13 FC:A - FCAL 10000 272000/557056000 280104/573653840
Compared with the original state, one hot spare (2a.48) is missing.
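The before/after comparison above can also be done mechanically. The following is a minimal sketch, assuming the two spare listings have been saved as text; the function names `spare_devices` and `missing_spares` are hypothetical helpers, not part of any NetApp tooling.

```python
def spare_devices(listing: str) -> set:
    """Collect device names (e.g. '2a.48') from rows whose first field is 'spare'."""
    devices = set()
    for line in listing.splitlines():
        fields = line.split()
        # Data rows look like: spare 2a.29 2a 1 13 FC:A - FCAL ...
        # Header lines start with 'Spare' (capital S) and are skipped.
        if len(fields) >= 2 and fields[0] == "spare":
            devices.add(fields[1])
    return devices


def missing_spares(original: str, current: str) -> set:
    """Return spares present in the original listing but absent from the current one."""
    return spare_devices(original) - spare_devices(current)


if __name__ == "__main__":
    original = """Spare disks
spare 2a.48 2a 3 0 FC:A - FCAL 10000 136000/278528000 137104/280790184
spare 2a.29 2a 1 13 FC:A - FCAL 10000 136000/278528000 137422/281442144
spare 1a.45 1a 2 13 FC:A - FCAL 10000 272000/557056000 280104/573653840
"""
    current = """Spare disks
spare 2a.29 2a 1 13 FC:A - FCAL 10000 136000/278528000 137422/281442144
spare 1a.45 1a 2 13 FC:A - FCAL 10000 272000/557056000 280104/573653840
"""
    print(missing_spares(original, current))
```

With the two listings shown in this step, the sketch reports 2a.48 as the spare that is no longer present.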
Verify with another command:
netapp-db-B> aggr status -s
Spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare 2a.29 2a 1 13 FC:A - FCAL 10000 136000/278528000 137422/281442144
spare 1a.45 1a 2 13 FC:A - FCAL 10000 272000/557056000 280104/573653840
Run the following commands:
netapp-db-B> rdfile /etc/messages
netapp-db-B> rdfile /etc/messages.0
netapp-db-B> rdfile /etc/messages.1
netapp-db-B> rdfile /etc/messages.2
netapp-db-B> rdfile /etc/messages.3
netapp-db-B> rdfile /etc/messages.4
The following messages appear:
Tue Jan 1 00:28:16 CST [xxzx-netapp-db-B: ems.engine.inputSuppress:error]: Event 'scsi.path.excessiveErrors' suppressed 2 times since Wed Aug 31 02:26:16 CST 2011.
Tue Jan 1 00:28:16 CST [xxzx-netapp-db-B: scsi.path.excessiveErrors:error]: Excessive errors encountered by adapter 2a on disk device 2a.48.
Tue Jan 1 00:31:54 CST [xxzx-netapp-db-B: ispfc_init_0:error]: Failed to build FCAL map on Fibre Channel adapter 2a -- resetting adapter.
Tue Jan 1 00:32:00 CST [xxzx-netapp-db-B: raid.config.filesystem.disk.missing:info]: File system Disk /aggr0/plex0/rg0/2a.48 Shelf 3 Bay 0 [NETAPP X274_HPYTA146F10 NA03] S/N [V5WZYZAA] is missing. (the failed disk's model and serial number are both reported)
Tue Jan 1 00:32:00 CST [xxzx-netapp-db-B: fmmb.lock.disk.remove:info]: Disk ?.? removed from local mailbox set.
Tue Jan 1 00:32:00 CST [xxzx-netapp-db-B: raid.rg.recons.missing:notice]: RAID group /aggr0/plex0/rg0 is missing 1 disk(s).
Tue Jan 1 00:32:00 (reconstruction start time) CST [xxzx-netapp-db-B: raid.rg.recons.info:notice]: Spare disk 2a.23 will be used to reconstruct one missing disk in RAID group /aggr0/plex0/rg0.
Tue Jan 1 00:32:00 CST [xxzx-netapp-db-B: esh.bypass.err.disk:error]: Disk id 48 on channels 2a/PARTNER shelf id 3 ESH A bay 0 Bypassed at request of partner
Tue Jan 1 00:32:00 CST [xxzx-netapp-db-B: raid.rg.recons.start:notice]: /aggr0/plex0/rg0: starting reconstruction, using disk 2a.23
Tue Jan 1 00:32:03 CST [xxzx-netapp-db-B: fmmb.current.lock.disk:info]: Disk 2a.32 is a local HA mailbox (quorum) disk.
Tue Jan 1 00:32:08 CST [xxzx-netapp-db-B: fmmb.current.lock.disk:info]: Disk 2a.16 is a local HA mailbox disk.
Tue Jan 1 00:32:08 CST [xxzx-netapp-db-B: fmmb.current.lock.disk:info]: Disk 2a.32 is a local HA mailbox disk.
Tue Jan 1 00:32:17 CST [xxzx-netapp-db-B: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from xxzx-netapp-db-B (DISK_FAIL - Bypassed by ESH) WARNING
Tue Jan 1 00:32:55 CST [xxzx-netapp-db-B: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from xxzx-netapp-db-B (FILESYSTEM DISK MISSING) WARNING
Tue Jan 1 02:00:00 CST [xxzx-netapp-db-B: kern.uptime.filer:info]: 2:00am up 748 days, 13:18 0 NFS ops, 165356 CIFS ops, 460 HTTP ops, 3634619546 FCP ops, 181212908 iSCSI ops
Tue Jan 1 03:00:00 CST [xxzx-netapp-db-B: kern.uptime.filer:info]: 3:00am up 748 days, 14:18 0 NFS ops, 165356 CIFS ops, 460 HTTP ops, 3634817890 FCP ops, 181212915 iSCSI ops
Tue Jan 1 03:51:44 CST [xxzx-netapp-db-B: raid.rg.recons.done:notice]: /aggr0/plex0/rg0: reconstruction completed for 2a.23 in 3:19:43.02 (reconstruction end time)
Tue Jan 1 03:51:45 CST [xxzx-netapp-db-B: fmmb.lock.disk.remove:info]: Disk 2a.16 removed from local mailbox set.
Tue Jan 1 03:51:45 CST [xxzx-netapp-db-B: fmmb.current.lock.disk:info]: Disk 2a.32 is a local HA mailbox disk.
Tue Jan 1 03:51:45 CST [xxzx-netapp-db-B: fmmb.current.lock.disk:info]: Disk 2a.23 is a local HA mailbox disk.
Important: confirm that the log shows reconstruction has completed. Do not replace the disk until it has.
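The check above can be sketched as a small script that pairs each `raid.rg.recons.start` event with a later `raid.rg.recons.done` event for the same RAID group. This is a minimal sketch, assuming the rdfile output has been saved as text and concatenated in chronological order (oldest rotated file first); the function name `pending_reconstructions` is a hypothetical helper, and the patterns are based only on the message wording shown in the log above.

```python
import re

# Patterns follow the wording of the log lines quoted in this procedure.
_START = re.compile(r"raid\.rg\.recons\.start:\w+\]: (\S+): starting reconstruction")
_DONE = re.compile(r"raid\.rg\.recons\.done:\w+\]: (\S+): reconstruction completed")


def pending_reconstructions(messages: str) -> set:
    """Return RAID groups that started reconstruction with no later 'done' event."""
    pending = set()
    for line in messages.splitlines():
        started = _START.search(line)
        if started:
            pending.add(started.group(1))
            continue
        done = _DONE.search(line)
        if done:
            pending.discard(done.group(1))
    return pending


if __name__ == "__main__":
    log = (
        "Tue Jan 1 00:32:00 CST [xxzx-netapp-db-B: raid.rg.recons.start:notice]: "
        "/aggr0/plex0/rg0: starting reconstruction, using disk 2a.23\n"
        "Tue Jan 1 03:51:44 CST [xxzx-netapp-db-B: raid.rg.recons.done:notice]: "
        "/aggr0/plex0/rg0: reconstruction completed for 2a.23 in 3:19:43.02\n"
    )
    still_running = pending_reconstructions(log)
    print("safe to replace" if not still_running else f"wait for: {still_running}")
```

An empty result only means that no unfinished reconstruction appears in the text supplied; it does not replace reading the log itself.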
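2. Replace the disk.
On the physical NetApp unit, pull out the disk whose amber fault LED is lit, wait a few seconds, then insert the new disk. Watch the LED change from flashing amber to green.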
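Before pulling a disk, it can help to cross-check the shelf, bay and serial number reported in /etc/messages against the label on the physical drive. The following is a minimal sketch that extracts those fields from `raid.config.filesystem.disk.missing` messages; the function name `missing_disks` is a hypothetical helper, and the pattern assumes the message layout shown in the log in step 1.

```python
import re

# Layout assumed from the log line in step 1:
# ... raid.config.filesystem.disk.missing:info]: File system Disk <path>
#     Shelf <n> Bay <n> [<model>] S/N [<serial>] is missing.
_MISSING = re.compile(
    r"raid\.config\.filesystem\.disk\.missing:\w+\]: File system Disk (\S+) "
    r"Shelf (\d+) Bay (\d+) \[(.+?)\] S/N \[(\S+?)\] is missing"
)


def missing_disks(messages: str) -> list:
    """Return one dict per 'disk missing' message found in the log text."""
    found = []
    for match in _MISSING.finditer(messages):
        path, shelf, bay, model, serial = match.groups()
        found.append({
            "device": path.rsplit("/", 1)[-1],  # last path component, e.g. '2a.48'
            "raid_path": path,
            "shelf": int(shelf),
            "bay": int(bay),
            "model": model,
            "serial": serial,
        })
    return found


if __name__ == "__main__":
    line = (
        "Tue Jan 1 00:32:00 CST [xxzx-netapp-db-B: "
        "raid.config.filesystem.disk.missing:info]: File system Disk "
        "/aggr0/plex0/rg0/2a.48 Shelf 3 Bay 0 "
        "[NETAPP X274_HPYTA146F10 NA03] S/N [V5WZYZAA] is missing."
    )
    for disk in missing_disks(line):
        print(disk)
```

For the log line in step 1, this yields device 2a.48 in shelf 3, bay 0, with serial number V5WZYZAA.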
3. Verify
netapp-db-B> aggr status -s
......
Spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare 2a.48 2a 3 0 FC:A - FCAL 10000 136000/278528000 137104/280790184
spare 2a.29 2a 1 13 FC:A - FCAL 10000 136000/278528000 137422/281442144
spare 1a.45 1a 2 13 FC:A - FCAL 10000 272000/557056000 280104/573653840
Note: 2a.48 is back in the spare list.
netapp-db-B> vol status -s
spare 2a.48 2a 3 0 (not zeroed)
spare 2a.29 2a 1 13
spare 1a.45 1a 2 13
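Note: if a spare shows the "not zeroed" state (a factory-blank disk does not show it), run the following command. This command carries risk; use it with caution.
netapp-db-B> disk zero spares (zeroes the spare disks)
......
netapp-db-B> vol status -s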
spare 2a.48 2a 3 0 (zero 6%)
spare 2a.29 2a 1 13
spare 1a.45 1a 2 13
Zeroing in progress...
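To track which spares still carry a status note while zeroing runs, the `vol status -s` output can be scanned for a trailing parenthesised remark on each spare row. This is a minimal sketch, assuming the abbreviated row layout shown above (the exact wording of the note may differ between Data ONTAP releases); `spare_notes` and `spares_with_notes` are hypothetical helper names.

```python
import re

# A trailing "(...)" at the end of a spare row is treated as its status note.
_NOTE = re.compile(r"\(([^)]*)\)\s*$")


def spare_notes(listing: str) -> dict:
    """Map each spare device to its trailing note, or '' if the row has none."""
    notes = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "spare":
            continue  # skip headers and non-spare rows
        match = _NOTE.search(line)
        notes[fields[1]] = match.group(1) if match else ""
    return notes


def spares_with_notes(listing: str) -> list:
    """Return the spares that still show a note (not yet zeroed, or zeroing)."""
    return sorted(dev for dev, note in spare_notes(listing).items() if note)


if __name__ == "__main__":
    output = """spare 2a.48 2a 3 0 (zero 6%)
spare 2a.29 2a 1 13
spare 1a.45 1a 2 13
"""
    print(spare_notes(output))
    print(spares_with_notes(output))
```

Once no spare row carries a note, the listing matches the state this procedure treats as fully zeroed.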
This completes the disk replacement procedure.