Symptoms

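A friend bought a NetApp FAS3020C controller, a DS14MK2 expansion shelf, and 13 FC 146 GB disks.

After cabling everything and powering on, the boot sequence reported the following errors: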

NetApp Release 7.3.2P5: Sat Jan 30 08:39:53 PST 2010
Copyright (c) 1992-2009 NetApp.
Starting boot on Thu Feb 23 06:36:20 GMT 2012
Thu Feb 23 06:36:58 GMT [fci.initialization.failed:error]: Initialization failed on Fibre Channel adapter 0d.
Thu Feb 23 06:36:58 GMT [kern.version.change:notice]: Data ONTAP kernel version was changed from NetApp Release 7.2.5.1P6 to NetApp Release 7.3.2P5.
Thu Feb 23 06:37:01 GMT [disk.init.badSectorSize:error]: Disk 0b.35 has an unexpected sector size (512 bytes) and cannot be used.
Thu Feb 23 06:37:01 GMT [disk.init.badSectorSize:error]: Disk 0b.44 has an unexpected sector size (512 bytes) and cannot be used.
Thu Feb 23 06:37:01 GMT [disk.init.badSectorSize:error]: Disk 0b.45 has an unexpected sector size (512 bytes) and cannot be used.

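Pressing Ctrl+C to enter the special boot menu and choosing 4a made the system panic. After booting into maintenance mode, sysconfig showed no disks.

With the shelf connected to a known-good controller, sysconfig -d and sysconfig -v could see the disks: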

fas3020c> sysconfig -d
Device          HA    SHELF BAY CHAN    Disk Vital Product Information
----------      --------------- -----   ------------------------------
0a.60           0a    ?   ?     FC:A    3KS04DBW00007534USN4
0a.61           0a    ?   ?     FC:A    3KS1F59600007548RWT7
0b.32           0b    ?   ?     FC:A    Not available.
0b.33           0b    ?   ?     FC:A    Not available.

fas3020c> sysconfig -v
    NetApp Release 7.3.2P5: Sat Jan 30 08:39:53 PST 2010
    System ID: 0101193720 (fas3020c); partner ID: 0101193773 ()
    System Serial Number: 3070394 (fas3020c)
    System Rev: B0
    System Storage Configuration: Single-Path HA
    System ACP Connectivity: No Connectivity
    slot 0: System Board 2797 MHz (System Board X A2)
                Model Name:         FAS3020
                Part Number:        110-00081
                Revision:           A2
                Serial Number:      0385445
                Firmware release:   CFE 3.1.0
                Agent FW version    15
                LCD FW version      1.7
                Processors:         2
                Processor type:     Intel Xeon
                Memory Size:        2048 MB
    Remote LAN Module           Status: Online
        Part Number:        110-00057
        Revision:           C0
        Serial Number:      380739
        Firmware Version:   3.0
        Mgmt MAC Address:   00:A0:98:05:41:B1
        Ethernet Link:      down
        Using DHCP:         no
        IP Address:         192.168.115.170
        Netmask:            255.255.255.0
        Gateway:            192.168.115.254
    slot 0: Dual 10/100/1000 Ethernet Controller VI
        e0a MAC Address:    00:a0:98:06:a9:ba (auto-100tx-fd-up)
        e0b MAC Address:    00:a0:98:06:a9:bb (auto-unknown-cfg_down)
        e0c MAC Address:    00:a0:98:06:a9:b8 (auto-unknown-cfg_down)
        e0d MAC Address:    00:a0:98:06:a9:b9 (auto-unknown-cfg_down)
        Device Type:        Rev 3
    slot 0: FC Host Adapter 0a (Dual-channel, QLogic 2322 rev. 3, 64-bit, L-port, )
        Firmware rev:    3.3.27
        Host Loop Id:    7    FC Node Name:    5:00a:098200:016943
        Cacheline size:    16    FC Packet size:    2048
        SRAM parity:    Yes    External GBIC:    No
        Link Data Rate:    1 Gbit
                 ID     Vendor   Model            FW    Size
                 60: NETAPP   X274_S10K7146F10 NA00 136.0GB (280790184 520B/sect)
                 61: NETAPP   X274_S10K7146F10 NA03 136.0GB (280790184 520B/sect)
    slot 0: FC Host Adapter 0b (Dual-channel, QLogic 2322 rev. 3, 64-bit, L-port, )
        Firmware rev:    3.3.27
        Host Loop Id:    7    FC Node Name:    5:00a:098300:016943
        Cacheline size:    16    FC Packet size:    2048
        SRAM parity:    Yes    External GBIC:    No
        Link Data Rate:    1 Gbit
                 ID     Vendor   Model            FW    Size
                 32: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 33: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 34: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 35: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 36: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 37: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 38: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 40: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 41: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 42: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 43: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 44: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
                 45: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 512B/sect) (Found in scan)
    slot 0: FC Host Adapter 0c (Dual-channel, QLogic 2322 rev. 3, 64-bit, L-port, )
        Firmware rev:    3.3.27
        Host Loop Id:    0    FC Node Name:    5:00a:098000:016943
        Cacheline size:    16    FC Packet size:    2048
        SRAM parity:    Yes    External GBIC:    No
        Link Data Rate:    1 Gbit
    slot 0: FC Host Adapter 0d (Dual-channel, QLogic 2322 rev. 3, 64-bit, L-port, )
        Firmware rev:    3.3.27
        Host Loop Id:    0    FC Node Name:    5:00a:098100:016943
        Cacheline size:    16    FC Packet size:    2048
        SRAM parity:    Yes    External GBIC:    No
        Link Data Rate:    1 Gbit
    slot 0: SCSI Host Adapter 0e (Adaptec AIC7892, )
        HW Version 0.0
        Ultra160 SCSI, Low Voltage Differential
    slot 0: ATA/IDE Adapter 0f (0x000001f0)
                0f.0                 NACF1GBJ U-A11            12-15-04 977MB 512B/sect (STI1J78606326165829)
    slot 3: NVRAM (NetApp NVRAM V)
                Revision:           B1
                Serial Number:      530424
                Memory Size:        512 MB
                Battery Status:     OK (3726 mV)
                Charger Status:     OFF
                Running Firmware:   5 (2.0.0)
                Cluster Interconnect Port 1:  disconnected
                Cluster Interconnect Port 2:  disconnected

Problem analysis

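The array formats FC/SAS disks with 520 bytes per sector (SATA disks use 512 bytes per sector). These disks currently report a 512-byte sector size, so the array controller refuses to use them. Most likely the expansion shelf was previously attached to a host, which formatted and used the disks. The question is therefore: how can these disks be reformatted to 520 bytes per sector?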
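To see at a glance which disks are affected, the `sysconfig -v` output can be saved to a file and filtered. The following is only a sketch: the function name `list_bad_sector_disks` is my own, and it assumes the output layout matches the listing above (an `FC Host Adapter <name>` line followed by `NETAPP` disk lines).

```shell
#!/bin/sh
# list_bad_sector_disks (hypothetical helper): read saved `sysconfig -v`
# output on stdin and print "<adapter>.<id>" for every NETAPP disk line
# that reports 512B/sect instead of the expected 520B/sect.
list_bad_sector_disks() {
    awk '
        /FC Host Adapter/ {
            # "slot 0: FC Host Adapter 0b (Dual-channel, ..." -> adapter "0b"
            for (i = 1; i < NF; i++) if ($i == "Adapter") adapter = $(i + 1)
        }
        /NETAPP/ && /512B\/sect/ {
            id = $1
            sub(/:$/, "", id)      # "32:" -> "32"
            print adapter "." id
        }
    '
}
```

For example, `list_bad_sector_disks < sysconfig-v.txt` (with `sysconfig-v.txt` being whatever file the console output was saved to) lists the disks that still need reformatting.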

Low-level formatting the disks to 520 bytes/sector

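  1. Find a tool

  2. Download and install

http://sg.danny.cz/sg/sg3_utils.html

I downloaded version 1.29; the link below gives the installation steps:

http://www.linuxfromscratch.org/blfs/view/svn/general/sg3_utils.html

./configure --prefix=/usr && make
make install

  3. Format

Scan for disks: connect the expansion shelf to the host's FC HBA. The system detects 13 disks, as shown below: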

[root@localhost ~]# cd /dev
[root@localhost dev]# ls sd*
sda   sdb   sdc   sdd   sde   sdf   sdg   sdh   sdi   sdj   sdk   sdl   sdm
sda1  sdb1  sdc1  sdd1  sde1  sdf1  sdg1  sdh1  sdi1  sdj1  sdk1  sdl1  sdm1
[root@localhost dev]# fdisk -l

Disk /dev/hda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1          13      104391   83  Linux
/dev/hda2              14        9729    78043770   8e  Linux LVM

Disk /dev/sda: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       17478   140392003+  42  SFS

Disk /dev/sdb: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       17478   140392003+  42  SFS

Disk /dev/sdc: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdd: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sde: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdf: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdg: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdh: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdh1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdi: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdi1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdj: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdj1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdk: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdk1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdl: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdl1               1       17478   140392003+   7  HPFS/NTFS

Disk /dev/sdm: 143.7 GB, 143764574208 bytes
255 heads, 63 sectors/track, 17478 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdm1               1       17478   140392003+   7  HPFS/NTFS
[root@localhost dev]#
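Since the format step is destructive, it helps to build the device list from the `fdisk -l` output rather than typing it by hand. This is a sketch under the assumption that `fdisk -l` prints `Disk /dev/sdX: ..., <bytes> bytes` lines exactly as above; `shelf_disks` is a name I made up for illustration.

```shell
#!/bin/sh
# shelf_disks (hypothetical helper): read `fdisk -l` output on stdin and
# print every device whose size in bytes equals the first argument.
shelf_disks() {
    awk -v want="$1" '
        $1 == "Disk" && $NF == "bytes" && $(NF - 1) == want {
            dev = $2
            sub(/:$/, "", dev)     # "/dev/sda:" -> "/dev/sda"
            print dev
        }
    '
}

# Usage (as root): fdisk -l | shelf_disks 143764574208
```

On this host the system disk is /dev/hda (80026361856 bytes), so filtering by the 143764574208-byte size leaves only the shelf disks.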

Install sg3_utils:

[root@localhost sg3_utils-1.29]# ./configure --prefix=/usr &&
> make
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk

……..
[root@localhost sg3_utils-1.29]# make install
Making install in include
make[1]: Entering directory `/root/sg3_utils-1.29/include'
make[2]: Entering directory `/root/sg3_utils-1.29/include'
make[2]: Nothing to be done for `install-exec-am'.
test -z "/usr/local/include/scsi" || /bin/mkdir -p "/usr/local/include/scsi"

……..


Low-level format:

[root@localhost /]# sg_format --format --size=520 /dev/sda
    NETAPP    X274_S10K7146F10  NA01   peripheral_type: disk [0x0]
Mode Sense (block descriptor) data, prior to changes:
  Number of blocks=280790184 [0x10bc84a8]
  Block size=512 [0x200]

A FORMAT will commence in 10 seconds
    ALL data on /dev/sda will be DESTROYED
        Press control-C to abort
A FORMAT will commence in 5 seconds
    ALL data on /dev/sda will be DESTROYED
        Press control-C to abort

Format has started
Format in progress, 0% done
Format in progress, 0% done
Format in progress, 0% done
Format in progress, 1% done
Format in progress, 2% done
Format in progress, 2% done
Format in progress, 3% done
Format in progress, 3% done
Format in progress, 4% done

Format in progress, 97% done

completed
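The transcript above formats a single disk. For the remaining disks, the same command can be generated per device. The sketch below deliberately only prints the commands so they can be reviewed first; `format_cmds` is a hypothetical helper, and it assumes every /dev/sdX passed in really is a shelf disk (true on this host, where the system disk is /dev/hda, but not on most machines).

```shell
#!/bin/sh
# format_cmds (hypothetical helper): print, but do not run, one sg_format
# command per whole-disk device name. Partitions such as /dev/sda1 and any
# other unexpected names are skipped with a warning on stderr.
format_cmds() {
    for dev in "$@"; do
        case "$dev" in
            /dev/sd[a-z]) echo "sg_format --format --size=520 $dev" ;;
            *) echo "skipping unexpected device name: $dev" >&2 ;;
        esac
    done
}

# Usage: format_cmds /dev/sdb /dev/sdc > todo.sh
# Review todo.sh, then run it with `sh todo.sh` to start formatting.
```

Each format ran for a long time in my case, so running the generated commands one after another takes a while; I have not tried running several in parallel.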

[root@localhost /]# sginfo -a /dev/sda
INQUIRY response (cmd: 0x12)
----------------------------
Device Type                        0
Vendor:                    NETAPP 
Product:                   X274_S10K7146F10
Revision level:            NA01

Serial Number '3KS27B2B0000760513WT'

Read-Write Error Recovery mode page (0x1)
-----------------------------------------
AWRE                               0

Connect to the NetApp and use the formatted disks:

Attach the disks, now formatted with 520-byte sectors, to the NetApp controller:

*> sysconfig -v
Tue Mar  6 12:10:36 GMT [shelf.config.spha:info]: System is using single path HA attached storage only.
        NetApp Release 7.3.2P5: Sat Jan 30 08:39:53 PST 2010
        System ID: 0101193720 (); partner ID: 0101193773 ()
        System Serial Number: 3070394 ()
        System Rev: B0
        System Storage Configuration: Single-Path HA
        System ACP Connectivity: No Connectivity
        slot 0: System Board 2797 MHz (System Board X A2)
                Model Name:         FAS3020
                Part Number:        110-00081
                Revision:           A2
                Serial Number:      0385445
                Firmware release:   CFE 3.1.0
                Agent FW version    15
                LCD FW version      1.7
                Processors:         1
                Processor type:     Intel Xeon
                Memory Size:        2048 MB
        slot 0: Dual 10/100/1000 Ethernet Controller VI
                e0a MAC Address:    00:a0:98:06:a9:ba (auto-unknown-cfg_down)
                e0b MAC Address:    00:a0:98:06:a9:bb (auto-unknown-cfg_down)
                e0c MAC Address:    00:a0:98:06:a9:b8 (auto-unknown-cfg_down)
                e0d MAC Address:    00:a0:98:06:a9:b9 (auto-unknown-cfg_down)
                Device Type:        Rev 3
        slot 0: FC Host Adapter 0a (Dual-channel, QLogic 2322 rev. 3, 64-bit, L-port, )
                Firmware rev:   3.3.27
                Host Loop Id:   7       FC Node Name:   5:00a:098200:016943
                Cacheline size: 16      FC Packet size: 2048
                SRAM parity:    Yes     External GBIC:  No
                Link Data Rate: 1 Gbit
                 ID     Vendor   Model            FW    Size
                 32: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 520B/sect)
                 33: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 520B/sect)
                 34: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 520B/sect)
                 35: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184 520B/sect)
                 36: NETAPP   X274_S10K7146F10 NA01 136.0GB (280790184

NetApp Release 7.3.2P5: Sat Jan 30 08:39:53 PST 2010
Copyright (c) 1992-2009 NetApp.
Starting boot on Tue Mar  6 12:25:39 GMT 2012
Tue Mar  6 12:26:21 GMT [diskown.isEnabled:info]: software ownership has been enabled for this system
Tue Mar  6 12:26:23 GMT [config.noPartnerDisks:CRITICAL]: No disks were detected for the partner; this node will be unable to takeover correctly


(1)  Normal boot.
(2)  Boot without /etc/rc.
(3)  Change password.
(4)  Initialize owned disks (6 disks are owned by this filer).
(4a) Same as option 4, but create a flexible root volume.
(5)  Maintenance mode boot.

Selection (1-5)? 4a
The system has 6 disks assigned whereas it needs 3 to boot, will try to assign the required number.
Zero disks and install a new file system? Tue Mar  6 12:26:28 GMT [monitor.chassisPower.degraded:notice]: Chassis power is degraded:
Tue Mar  6 12:26:39 GMT [ses.status.psError:CRITICAL]: DS14-Mk2-FC shelf 2 on channel 0a power error for Power supply 1: critical status; power supply failed. This module is on the rear side of the shelf, at the left.
Tue Mar  6 12:26:39 GMT [ses.psu.powerReqError:CRITICAL]: Not enough power supplies are present in channel 0a disk shelf 2 to satisfy disk drive and shelf power requirements.
yes
This will erase all the data on the disks, are you sure? yes
Zeroing disks takes about 40 minutes.
Tue Mar  6 12:27:11 GMT [coredump.spare.none:info]: No sparecore disk was found.
...........................................................................

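Background notes

    1. BCS/ZCS: to keep stored data highly reliable, the array offers two kinds of sector-level checksum. Block checksum (BCS) uses 520 bytes per sector (512 + 8): the 8 checksum bytes protect the 512 data bytes. Zone checksum (ZCS) uses an (8+1) × 512 layout: the ninth sector stores checksum information for the preceding 8 sectors. The checksum is written together with the data. On read, the data is verified against it, and on a mismatch the stripe is recovered at the RAID level.
    2. SCSI
    3. SFS/HPFS file systems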
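As a quick check on item 1, the block count in the outputs above ties the two sector sizes together. This is just arithmetic on figures already shown in this post (the variable and function names are mine), and it assumes a shell with 64-bit arithmetic:

```shell
#!/bin/sh
# Block count reported by both sysconfig -v and sg_format for each disk.
blocks=280790184

# bytes_at: total bytes on the disk for a given sector size in bytes.
bytes_at() {
    echo $(( blocks * $1 ))
}

bytes_at 512   # 143764574208, the size fdisk printed for every shelf disk
bytes_at 520   # 146010895680, the raw size once each sector carries 8 checksum bytes
```

The block count stays the same before and after the format; only the bytes per block change from 512 to 520.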

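IBM TotalStorage SAN File System (SFS): IBM SFS abstracts multiple independent file systems into a single shared file system, addressing the file and data management problems of traditional SAN architectures. Note, however, that the SFS label fdisk printed for two of the disks is simply its name for partition type 0x42; on disks previously used by a Windows host, that type usually marks a dynamic disk rather than an IBM SAN File System volume.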

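High Performance File System (HPFS): HPFS is the file system used in Microsoft's LAN Manager and in IBM's LAN Server and OS/2 products. In OS/2 it is plain HPFS, while LAN Manager and LAN Server use HPFS386, an improved version of HPFS. HPFS provides long file names and performance enhancements that the File Allocation Table (FAT) of the DOS file system lacks. HPFS can also address larger hard drives, offers more organizational features, and improves file system security. HPFS386 adds 32-bit access to the HPFS file system, as well as fault tolerance and security. fdisk uses the single label HPFS/NTFS for partition type 7, so the partitions listed earlier may equally well be NTFS.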