Notes from Repairing a Corrupted OpenStack Volume (System + Data)

Problem Overview

Cloud environment: OpenStack Mitaka

Guest OS: CentOS 7.2

Volume backend: Cinder backed by Ceph

Due to a series of complicated reasons (- _ -), the backing volume of one VM was corrupted (the volume is the boot disk and also holds a large amount of data), and the VM could no longer boot into its operating system. Because nova rescue only supports instances booted directly from an image, the instance could not be rescued that way.
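For reference, the normal rescue path would be a single CLI call, but since it only applies to image-backed instances it was not an option here (the server name is a placeholder):

nova rescue my-broken-vm        # or: openstack server rescue my-broken-vm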

 

Resolution steps:

Since the damaged volume could not boot a working instance, the plan was to create a temporary VM, attach the problem volume to it, and then try to mount the volume from inside the guest; a rough sketch of the attach step is shown below.
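Assuming the temporary instance and the damaged volume are named temp-rescue-vm and broken-boot-volume (placeholder names, not from the original environment), the attach step with the OpenStack CLI would look roughly like this:

openstack server add volume temp-rescue-vm broken-boot-volume
# inside the guest, lsblk should then show the extra virtio disk (here /dev/vdd)

With the volume attached, try mounting its partition onto a directory inside the temporary VM: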

mount /dev/vdd1 /temp

The mount failed with an error:

mount: wrong fs type, bad option, bad superblock on /dev/vdd1,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

Checking the kernel log:

[root@host-192-168-111-11 /]# dmesg | tail
[ 6874.548652] pci 0000:00:0a.0: reg 0x10: [io  0x0000-0x003f]
[ 6874.548726] pci 0000:00:0a.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 6874.550200] pci 0000:00:0a.0: BAR 1: assigned [mem 0xc0000000-0xc0000fff]
[ 6874.559723] pci 0000:00:0a.0: BAR 0: assigned [io  0x1000-0x103f]
[ 6874.563685] virtio-pci 0000:00:0a.0: enabling device (0000 -> 0003)
[ 6874.625262] virtio-pci 0000:00:0a.0: virtio_pci: leaving for legacy driver
[ 6874.630530] virtio-pci 0000:00:0a.0: irq 31 for MSI/MSI-X
[ 6874.630565] virtio-pci 0000:00:0a.0: irq 32 for MSI/MSI-X
[ 6874.634169]  vdd: vdd1
[ 6903.305531] XFS (vdd1): Filesystem has duplicate UUID fc1bfc5d-a5d1-4c3c-afda-167500654723 - can't mount

This happens because the filesystems on vda1 and vdd1 carry the same UUID: both VMs were created from the same image, so XFS refuses to mount the second filesystem with a duplicate UUID.
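The duplicate can be confirmed with blkid, and XFS also accepts a nouuid mount option that skips the UUID check; a hedged sketch (device names as in this environment):

blkid /dev/vda1 /dev/vdd1        # both partitions report the same filesystem UUID
mount -o nouuid /dev/vdd1 /temp  # XFS-specific workaround for a duplicate UUID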

In this case the temporary VM was instead recreated from a different image, and the mount step was repeated:

[root@host-192-168-1-10 ~]# mount /dev/vdc1 /harbor-bak/
mount: mount /dev/vdc1 on /harbor-bak failed: Structure needs cleaning

This indicates the filesystem is corrupted. After some research, xfs_repair was used to repair it; a read-only check can be run first to gauge the damage, as sketched below.
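A hedged sketch of a non-destructive dry run before the real repair (device name as in this case):

xfs_repair -n /dev/vdc1   # -n: no-modify mode, only report the problems found

The actual repair was then run with -L, which zeroes the metadata log; any changes still sitting in the log are discarded, so it is a last resort when the log cannot be replayed by mounting: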

xfs_repair -L /dev/vdc1

The output of xfs_repair -L was as follows:

Phase 1 - find and verify superblock...
        - reporting progress in intervals of 15 minutes
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
block (1,1703-1706) multiply claimed by bno space tree, state - 2
block (1,1703-1706) multiply claimed by cnt space tree, state - 2
agi unlinked bucket 61 is 1405 in ag 1 (inode=8390013)
sb_fdblocks 254960792, counted 254969223
        - 16:45:31: scanning filesystem freespace - 501 of 501 allocation groups done
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - 16:45:31: scanning agi unlinked lists - 501 of 501 allocation groups done
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 180
        - agno = 120


......


        - agno = 497
        - agno = 498
        - agno = 499
        - agno = 500
        - agno = 412
        - 16:45:32: check for inodes claiming duplicate blocks - 61760 of 61760 inodes done
Phase 5 - rebuild AG headers and trees...
        - 16:45:32: rebuild AG headers and trees - 501 of 501 allocation groups done
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 8390013, moving to lost+found
Phase 7 - verify and correct link counts...
        - 16:45:33: verify and correct link counts - 501 of 501 allocation groups done
done

Now retry the mount:

mount /dev/vdc1 /harbor-bak/

Success: the mount works and the directory contents are accessible.
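Since xfs_repair moved a disconnected inode into lost+found (inode 8390013 in the output above), it is worth checking that directory after remounting; a small sketch, assuming the mount point above:

ls -l /harbor-bak/lost+found   # recovered but unlinked files land here, named by inode number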

From there it was just a matter of copying the files off the volume and finishing the rest of the recovery.
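For completeness, a rough sketch of pulling the recovered data off the volume (the destination path is a placeholder, not from the original post):

rsync -aHAX /harbor-bak/ /backup/recovered-data/   # preserve permissions, hard links, ACLs, and xattrs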
