https://yq.aliyun.com/articles/18067
摘要: 由于扇区损坏导致多路径设备failed. 现象如下 : # dmesg : device-mapper: multipath: Failing path 8:128. qla2xxx 0000:07:00.
device-mapper: multipath: Failing path 8:128.
qla2xxx 0000:07:00.1: scsi(1:0:0:1): Mid-layer underflow detected (40000 of 40000 bytes)...returning error status.
qla2xxx 0000:07:00.1: scsi(1:0:0:1): Mid-layer underflow detected (40000 of 40000 bytes)...returning error status.
qla2xxx 0000:07:00.1: scsi(1:0:0:1): Mid-layer underflow detected (40000 of 40000 bytes)...returning error status.
qla2xxx 0000:07:00.1: scsi(1:0:0:1): Mid-layer underflow detected (40000 of 40000 bytes)...returning error status.
qla2xxx 0000:07:00.1: scsi(1:0:0:1): Mid-layer underflow detected (40000 of 40000 bytes)...returning error status.
qla2xxx 0000:07:00.1: scsi(1:0:0:1): Mid-layer underflow detected (40000 of 40000 bytes)...returning error status.
sd 1:0:0:1: SCSI error: return code = 0x00070000
end_request: I/O error, dev sda, sector 3870947799
device-mapper: multipath: Failing path 8:0.
sd 0:0:0:1: SCSI error: return code = 0x08000002
sdi: Current: sense key: Medium Error
Add. Sense: Unrecovered read error
Info fld=0x0
end_request: I/O error, dev sdi, sector 3870947799
msa2vd01vol01 (3600c0ff000xxxxxxf810394c01000000) dm-0 HP,MSA2312fc
[size=1.8T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][enabled]
\_ 0:0:0:1 sdi 8:128 [failed][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 1:0:0:1 sda 8:0 [failed][ready]
echo 1 > /sys/block/sda/device/delete
echo 1 > /sys/block/sdi/device/delete
echo "- - -" > /sys/class/scsi_host/${HBA}/scan, 如下
echo "- - -" > /sys/class/scsi_host/host0/scan
echo "- - -" > /sys/class/scsi_host/host1/scan
: checker msg is "readsector0 checker reports path is down"
reload: msa2vd01vol01 (3600c0ff000xxxxxxf810394c01000000) HP,MSA2312fc
[size=1.8T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=50][undef]
\_ 0:0:0:1 sda 8:0 [undef][ready]
\_ round-robin 0 [prio=10][undef]
\_ 1:0:0:1 sde 8:64 [undef][ready]
Vendor: HP Model: MSA2312fc Rev: M112
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 3902343744 512-byte hdwr sectors (1998000 MB)
sda: Write Protect is off
sda: Mode Sense: 93 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 3902343744 512-byte hdwr sectors (1998000 MB)
sda: Write Protect is off
sda: Mode Sense: 93 00 00 08
SCSI device sda: drive cache: write back
sda: sda1
sd 0:0:0:1: Attached scsi disk sda
sd 0:0:0:1: Attached scsi generic sg1 type 0
Vendor: HP Model: MSA2312fc Rev: M112
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sde: 3902343744 512-byte hdwr sectors (1998000 MB)
sde: Write Protect is off
sde: Mode Sense: 93 00 00 08
SCSI device sde: drive cache: write back
SCSI device sde: 3902343744 512-byte hdwr sectors (1998000 MB)
sde: Write Protect is off
sde: Mode Sense: 93 00 00 08
SCSI device sde: drive cache: write back
sde: sde1
sd 1:0:0:1: Attached scsi disk sde
sd 1:0:0:1: Attached scsi generic sg6 type 0
msa2vd01vol01 (3600c0ff000xxxxxxf810394c01000000) dm-0 HP,MSA2312fc
[size=1.8T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][enabled]
\_ 0:0:0:1 sda 8:0 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 1:0:0:1 sde 8:64 [active][ready]
fsck -v -p /dev/mapper/msa2vd01vol01p1
Filesystem volume name:
Last mounted on:
Filesystem UUID: 63d08b38-6214-4a28-8f3e-0b8fd9c1734e
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 243908608
Block count: 487791627
Reserved block count: 24389581
Free blocks: 117876429
Free inodes: 243257871
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 907
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16384
Inode blocks per group: 512
Filesystem created: Wed Jul 21 19:10:23 2010
Last mount time: Tue May 14 07:38:24 2013
Last write time: Tue May 14 07:38:24 2013
Mount count: 1
Maximum mount count: 36
Last checked: Mon May 13 22:19:35 2013
Check interval: 15552000 (6 months)
Next check after: Sat Nov 9 22:19:35 2013
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
Default directory hash: tea
Group 14766: (Blocks 483852288-483885055)
Block bitmap at 483852288 (+0), Inode bitmap at 483852289 (+1)
Inode table at 483852290-483852801 (+2)
1027 free blocks, 16384 free inodes, 0 directories
Free blocks: 483855860-483856886
Free inodes: 241926145-241942528
badblocks -v -b 4096 -o ./badblocks.20130514 /dev/sda 483868480 483868460
# cat badblocks.20130514
483868476
# dd if=/dev/sda of=./badblock483868476.data bs=4096 skip=483868476 count=1
dd: reading `/dev/sda': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.003665 seconds, 0.0 kB/s
dd if=./badblock483868476.data of=/dev/sda seek=483868476 bs=4096 count=1
badblocks -f -v -w -b 4096 /dev/sda 483868477 483868476
/dev/sda is apparently in use by the system; badblocks forced anyway.
Checking for bad blocks in read-write mode
From block 483868476 to 483868477
Testing with pattern 0xaa: done
Reading and comparing: done
Testing with pattern 0x55: done
Reading and comparing: done
Testing with pattern 0xff: done
Reading and comparing: done
Testing with pattern 0x00: done
Reading and comparing: done
Pass completed, 0 bad blocks found.
# badblocks -v -b 4096 /dev/sda 483868480 483868460
Checking blocks 483868460 to 483868480
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found.
ZERO_DAMAGED_PAGES = true
ERROR: ZLIB uncompress failed (detail: 'The compressed data was corrupted', output buffer length 32568, compressed length 12136, (block count 3651)) (gp_compress.c:355) (seg4 slice1 dw-host37-if2:51001 pid=5024)