Restoring Volume State from the Database in OpenStack

Problem

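In OpenStack it is easy for the database to drift out of sync with the real state of the system. Almost every operation is carried out in several steps: the API accepts the request, the scheduler dispatches it, and a specific node performs the work. Each step may update the database. If any step fails, it raises an exception that aborts the whole chain, and the database is left in whatever state the last successful step wrote. Deleting an instance is a typical example: if nova-compute fails, the instance may stay in the deleting state forever.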
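
The failure mode described above can be illustrated with a toy model. This is only a sketch: the `delete_instance` function, the `StepFailed` exception, and the `status` field are hypothetical names invented for illustration and are not nova code. It shows how a record keeps the intermediate state written by the last successful step when a later step raises.

```python
class StepFailed(Exception):
    """Raised by a step in the chain that cannot complete (hypothetical)."""


def delete_instance(record, compute_ok=True):
    """Walk a simplified three-step delete chain, updating `record` as it goes."""
    # Step 1: the API layer accepts the request and marks the record.
    record["status"] = "deleting"
    # Step 2: the compute node does the real work and may fail.
    if not compute_ok:
        raise StepFailed("compute node could not destroy the instance")
    # Step 3: reached only when every earlier step succeeded.
    record["status"] = "deleted"
    return record


# Stand-in for a database row; the id is borrowed from the examples in this post.
record = {"id": 70, "status": "active"}
try:
    delete_instance(record, compute_ok=False)
except StepFailed as exc:
    print("operation aborted:", exc)

# Nothing rolled the record back, so it still carries the intermediate state.
print(record)
```

Because no step undoes the earlier write, the only ways out are to retry the operation or to correct the stored state by hand, which is the approach taken in the rest of this post.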

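I ran into exactly this kind of problem. I had a volume attached to an instance, but for some unknown reason the link between the volume and the instance was broken: checking with tgtadm on the nova-volume node showed that no client was connected to the volume any more. The database record, however, was still there, which had two consequences:
1) The instance could not be deleted; the operation failed with "Stderr: 'iscsiadm: No records found'".
2) The volume could not be detached from the instance.

The only option left was to edit the database directly.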

Tables related to volumes

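The tables in the database that are directly related to volumes are the ones queried below: volumes, volume_metadata, iscsi_targets, block_device_mapping, and sm_volume.
[Figure: tables related to volumes]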

How the data changes during volume operations

After creating a volume:

select * from volumes where id = 40\G
*************************** 1. row ***************************
         created_at: 2012-10-29 07:00:23
         updated_at: 2012-10-29 07:00:25
         deleted_at: NULL
            deleted: 0
                 id: 40
             ec2_id: NULL
            user_id: 397dd3be88b6492caa88521502b07617
         project_id: c6159a4f3dd34a2b83527499a40dbd2b
               host: store2.sigsit.org
               size: 20
  availability_zone: nova
        instance_id: NULL
         mountpoint: NULL
        attach_time: NULL
             status: available
      attach_status: detached
       scheduled_at: 2012-10-29 07:00:23
        launched_at: 2012-10-29 07:00:25
      terminated_at: NULL
       display_name: test
display_description: 
  provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
      provider_auth: NULL
        snapshot_id: NULL
     volume_type_id: NULL

select * from volume_metadata where volume_id = 40\G

select * from iscsi_targets where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-09-24 09:00:36
updated_at: 2012-10-29 07:00:24
deleted_at: NULL
   deleted: 0
        id: 205
target_num: 5
      host: store2.sigsit.org
 volume_id: 40

select * from block_device_mapping where volume_id = 40\G

select * from sm_volume where id = 40\G
    

After attaching the volume to an instance:

select * from volumes where id = 40\G
*************************** 1. row ***************************
         created_at: 2012-10-29 07:00:23
         updated_at: 2012-10-29 11:55:36
         deleted_at: NULL
            deleted: 0
                 id: 40
             ec2_id: NULL
            user_id: 397dd3be88b6492caa88521502b07617
         project_id: c6159a4f3dd34a2b83527499a40dbd2b
               host: store2.sigsit.org
               size: 20
  availability_zone: nova
        instance_id: 70
         mountpoint: /dev/vdc
        attach_time: NULL
             status: in-use
      attach_status: attached
       scheduled_at: 2012-10-29 07:00:23
        launched_at: 2012-10-29 07:00:25
      terminated_at: NULL
       display_name: test
display_description: 
  provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
      provider_auth: NULL
        snapshot_id: NULL
     volume_type_id: NULL

select * from volume_metadata where volume_id = 40\G

select * from iscsi_targets where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-09-24 09:00:36
updated_at: 2012-10-29 07:00:24
deleted_at: NULL
   deleted: 0
        id: 205
target_num: 5
      host: store2.sigsit.org
 volume_id: 40

select * from block_device_mapping where volume_id = 40\G
*************************** 1. row ***************************
           created_at: 2012-10-29 11:55:36
           updated_at: NULL
           deleted_at: NULL
              deleted: 0
                   id: 49
          instance_id: 70
          device_name: /dev/vdc
delete_on_termination: 0
         virtual_name: NULL
          snapshot_id: NULL
            volume_id: 40
          volume_size: NULL
            no_device: NULL
      connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}

select * from sm_volume where id = 40\G

select * from instances where id = 70\G
*************************** 1. row ***************************
              created_at: 2012-09-10 02:32:36
              updated_at: 2012-09-12 10:43:48
              deleted_at: NULL
                 deleted: 0
                      id: 70
             internal_id: NULL
                 user_id: 397dd3be88b6492caa88521502b07617
              project_id: c6159a4f3dd34a2b83527499a40dbd2b
               image_ref: 6c239063-9d2a-41ce-9612-bfe3564cc203
               kernel_id: 
              ramdisk_id: 
             server_name: NULL
            launch_index: 0
                key_name: NULL
                key_data: NULL
             power_state: 1
                vm_state: active
               memory_mb: 1024
                   vcpus: 1
                hostname: jiangyong-win7
                    host: stack6.sigsit.org
               user_data: 
          reservation_id: r-h1yqckm4
            scheduled_at: 2012-09-10 02:32:37
             launched_at: 2012-09-10 02:32:48
           terminated_at: NULL
            display_name: jiangyong-win7
     display_description: jiangyong-win7
       availability_zone: NULL
                  locked: 0
                 os_type: NULL
             launched_on: stack6.sigsit.org
        instance_type_id: 19
                 vm_mode: NULL
                    uuid: 333f6afa-9009-40f7-a493-20b2382628b1
            architecture: NULL
        root_device_name: /dev/vda
            access_ip_v4: NULL
            access_ip_v6: NULL
            config_drive: 
              task_state: NULL
default_ephemeral_device: NULL
     default_swap_device: NULL
                progress: 0
        auto_disk_config: NULL
      shutdown_terminate: 1
       disable_terminate: 0
                 root_gb: 0
            ephemeral_gb: 0
               cell_name: NULL
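
The connection_info column in block_device_mapping is a JSON string, and decoding it makes the attachment details easier to read. The sketch below uses the value copied from the row above; the `summarize_connection` helper is a hypothetical name. In this data the suffix of the target name (00000028) appears to be the volume id 40 written in hexadecimal, which the code checks.

```python
import json

# connection_info copied from the block_device_mapping row for volume 40.
RAW = (
    '{"driver_volume_type": "iscsi", "data": {"device_path": '
    '"/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-'
    'iqn.2010-10.org.openstack:volume-00000028-lun-1", '
    '"target_discovered": false, '
    '"target_iqn": "iqn.2010-10.org.openstack:volume-00000028", '
    '"target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}'
)


def summarize_connection(raw):
    """Extract the portal, target name, and LUN from a connection_info string."""
    data = json.loads(raw)["data"]
    host, port = data["target_portal"].rsplit(":", 1)
    # Text after the last "-" in the target name, e.g. "00000028".
    suffix = data["target_iqn"].rsplit("-", 1)[1]
    return {
        "portal_host": host,
        "portal_port": int(port),
        "iqn": data["target_iqn"],
        "lun": data["target_lun"],
        "volume_id": data["volume_id"],
        # Interpret the suffix as a hexadecimal number.
        "id_from_iqn": int(suffix, 16),
    }


summary = summarize_connection(RAW)
print(summary)
```

The portal and target name are the same values stored in the provider_location column of the volumes table, so the two rows can be cross-checked against each other.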
    

After detaching the volume from the instance:

select * from volumes where id = 40\G
*************************** 1. row ***************************
         created_at: 2012-10-29 07:00:23
         updated_at: 2012-10-29 11:58:36
         deleted_at: NULL
            deleted: 0
                 id: 40
             ec2_id: NULL
            user_id: 397dd3be88b6492caa88521502b07617
         project_id: c6159a4f3dd34a2b83527499a40dbd2b
               host: store2.sigsit.org
               size: 20
  availability_zone: nova
        instance_id: NULL
         mountpoint: NULL
        attach_time: NULL
             status: available
      attach_status: detached
       scheduled_at: 2012-10-29 07:00:23
        launched_at: 2012-10-29 07:00:25
      terminated_at: NULL
       display_name: test
display_description: 
  provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
      provider_auth: NULL
        snapshot_id: NULL
     volume_type_id: NULL

select * from volume_metadata where volume_id = 40\G

select * from iscsi_targets where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-09-24 09:00:36
updated_at: 2012-10-29 07:00:24
deleted_at: NULL
   deleted: 0
        id: 205
target_num: 5
      host: store2.sigsit.org
 volume_id: 40

select * from block_device_mapping where volume_id = 40\G
*************************** 1. row ***************************
           created_at: 2012-10-29 11:55:36
           updated_at: NULL
           deleted_at: 2012-10-29 11:58:36
              deleted: 1
                   id: 49
          instance_id: 70
          device_name: /dev/vdc
delete_on_termination: 0
         virtual_name: NULL
          snapshot_id: NULL
            volume_id: 40
          volume_size: NULL
            no_device: NULL
      connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}

select * from sm_volume where id = 40\G

select * from instances where id = 70\G
*************************** 1. row ***************************
              created_at: 2012-09-10 02:32:36
              updated_at: 2012-09-12 10:43:48
              deleted_at: NULL
                 deleted: 0
                      id: 70
             internal_id: NULL
                 user_id: 397dd3be88b6492caa88521502b07617
              project_id: c6159a4f3dd34a2b83527499a40dbd2b
               image_ref: 6c239063-9d2a-41ce-9612-bfe3564cc203
               kernel_id: 
              ramdisk_id: 
             server_name: NULL
            launch_index: 0
                key_name: NULL
                key_data: NULL
             power_state: 1
                vm_state: active
               memory_mb: 1024
                   vcpus: 1
                hostname: jiangyong-win7
                    host: stack6.sigsit.org
               user_data: 
          reservation_id: r-h1yqckm4
            scheduled_at: 2012-09-10 02:32:37
             launched_at: 2012-09-10 02:32:48
           terminated_at: NULL
            display_name: jiangyong-win7
     display_description: jiangyong-win7
       availability_zone: NULL
                  locked: 0
                 os_type: NULL
             launched_on: stack6.sigsit.org
        instance_type_id: 19
                 vm_mode: NULL
                    uuid: 333f6afa-9009-40f7-a493-20b2382628b1
            architecture: NULL
        root_device_name: /dev/vda
            access_ip_v4: NULL
            access_ip_v6: NULL
            config_drive: 
              task_state: NULL
default_ephemeral_device: NULL
     default_swap_device: NULL
                progress: 0
        auto_disk_config: NULL
      shutdown_terminate: 1
       disable_terminate: 0
                 root_gb: 0
            ephemeral_gb: 0
               cell_name: NULL
    

After attaching the volume to the same instance again:

select * from volumes where id = 40\G
*************************** 1. row ***************************
         created_at: 2012-10-29 07:00:23
         updated_at: 2012-10-29 12:00:32
         deleted_at: NULL
            deleted: 0
                 id: 40
             ec2_id: NULL
            user_id: 397dd3be88b6492caa88521502b07617
         project_id: c6159a4f3dd34a2b83527499a40dbd2b
               host: store2.sigsit.org
               size: 20
  availability_zone: nova
        instance_id: 70
         mountpoint: /dev/vdc
        attach_time: NULL
             status: in-use
      attach_status: attached
       scheduled_at: 2012-10-29 07:00:23
        launched_at: 2012-10-29 07:00:25
      terminated_at: NULL
       display_name: test
display_description: 
  provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
      provider_auth: NULL
        snapshot_id: NULL
     volume_type_id: NULL

select * from volume_metadata where volume_id = 40\G

select * from iscsi_targets where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-09-24 09:00:36
updated_at: 2012-10-29 07:00:24
deleted_at: NULL
   deleted: 0
        id: 205
target_num: 5
      host: store2.sigsit.org
 volume_id: 40

select * from block_device_mapping where volume_id = 40\G
*************************** 1. row ***************************
           created_at: 2012-10-29 11:55:36
           updated_at: NULL
           deleted_at: 2012-10-29 11:58:36
              deleted: 1
                   id: 49
          instance_id: 70
          device_name: /dev/vdc
delete_on_termination: 0
         virtual_name: NULL
          snapshot_id: NULL
            volume_id: 40
          volume_size: NULL
            no_device: NULL
      connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}
*************************** 2. row ***************************
           created_at: 2012-10-29 12:00:32
           updated_at: NULL
           deleted_at: NULL
              deleted: 0
                   id: 50
          instance_id: 70
          device_name: /dev/vdc
delete_on_termination: 0
         virtual_name: NULL
          snapshot_id: NULL
            volume_id: 40
          volume_size: NULL
            no_device: NULL
      connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}

select * from sm_volume where id = 40\G

select * from instances where id = 70\G
*************************** 1. row ***************************
              created_at: 2012-09-10 02:32:36
              updated_at: 2012-09-12 10:43:48
              deleted_at: NULL
                 deleted: 0
                      id: 70
             internal_id: NULL
                 user_id: 397dd3be88b6492caa88521502b07617
              project_id: c6159a4f3dd34a2b83527499a40dbd2b
               image_ref: 6c239063-9d2a-41ce-9612-bfe3564cc203
               kernel_id: 
              ramdisk_id: 
             server_name: NULL
            launch_index: 0
                key_name: NULL
                key_data: NULL
             power_state: 1
                vm_state: active
               memory_mb: 1024
                   vcpus: 1
                hostname: jiangyong-win7
                    host: stack6.sigsit.org
               user_data: 
          reservation_id: r-h1yqckm4
            scheduled_at: 2012-09-10 02:32:37
             launched_at: 2012-09-10 02:32:48
           terminated_at: NULL
            display_name: jiangyong-win7
     display_description: jiangyong-win7
       availability_zone: NULL
                  locked: 0
                 os_type: NULL
             launched_on: stack6.sigsit.org
        instance_type_id: 19
                 vm_mode: NULL
                    uuid: 333f6afa-9009-40f7-a493-20b2382628b1
            architecture: NULL
        root_device_name: /dev/vda
            access_ip_v4: NULL
            access_ip_v6: NULL
            config_drive: 
              task_state: NULL
default_ephemeral_device: NULL
     default_swap_device: NULL
                progress: 0
        auto_disk_config: NULL
      shutdown_terminate: 1
       disable_terminate: 0
                 root_gb: 0
            ephemeral_gb: 0
               cell_name: NULL
    

After detaching the volume again and deleting it:

select * from volumes where id = 40\G
*************************** 1. row ***************************
         created_at: 2012-10-29 07:00:23
         updated_at: 2012-10-29 13:41:56
         deleted_at: 2012-10-29 13:44:36
            deleted: 1
                 id: 40
             ec2_id: NULL
            user_id: 397dd3be88b6492caa88521502b07617
         project_id: c6159a4f3dd34a2b83527499a40dbd2b
               host: store2.sigsit.org
               size: 20
  availability_zone: nova
        instance_id: NULL
         mountpoint: NULL
        attach_time: NULL
             status: deleting
      attach_status: detached
       scheduled_at: 2012-10-29 07:00:23
        launched_at: 2012-10-29 07:00:25
      terminated_at: 2012-10-29 13:41:55
       display_name: test
display_description: 
  provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
      provider_auth: NULL
        snapshot_id: NULL
     volume_type_id: NULL

select * from volume_metadata where volume_id = 40\G

select * from iscsi_targets where volume_id = 40\G

select * from block_device_mapping where volume_id = 40\G
*************************** 1. row ***************************
           created_at: 2012-10-29 11:55:36
           updated_at: NULL
           deleted_at: 2012-10-29 11:58:36
              deleted: 1
                   id: 49
          instance_id: 70
          device_name: /dev/vdc
delete_on_termination: 0
         virtual_name: NULL
          snapshot_id: NULL
            volume_id: 40
          volume_size: NULL
            no_device: NULL
      connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}
*************************** 2. row ***************************
           created_at: 2012-10-29 12:00:32
           updated_at: NULL
           deleted_at: 2012-10-29 13:10:51
              deleted: 1
                   id: 50
          instance_id: 70
          device_name: /dev/vdc
delete_on_termination: 0
         virtual_name: NULL
          snapshot_id: NULL
            volume_id: 40
          volume_size: NULL
            no_device: NULL
      connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}

select * from sm_volume where id = 40\G
    

Conclusions

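Database changes when a volume is created
The volumes table gets a new volume record.
In the iscsi_targets table, an available target record is found and its volume_id is set to the id of the new volume. The target record holds target_num and host; nova then uses these two values to create the volume on that host with that target_num.
Database changes when a volume is attached
In the volumes table, instance_id and mountpoint are set to the instance id and the device name, and status and attach_status are changed to in-use and attached.
The block_device_mapping table gets a new mapping record containing the instance and volume information, in particular the volume's connection information.
Database changes when a volume is detached
In the volumes table, instance_id and mountpoint are set to NULL, and status and attach_status are changed to available and detached.
In the block_device_mapping table, the corresponding mapping record is updated: deleted_at is set to the current time and deleted is set to 1.
Database changes when a volume is deleted
In the volumes table, deleted_at is set to the current time and deleted is set to 1.
In the iscsi_targets table, the record whose volume_id matches the volume is updated and its volume_id is set to NULL.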
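
The four transitions summarized above can be replayed against a small stand-in database. This is a minimal sketch, assuming an in-memory SQLite database with only the columns discussed here. The real nova schema runs on MySQL and has many more columns, and the function names are hypothetical, not nova's API.

```python
import sqlite3

SCHEMA = """
CREATE TABLE volumes (id INTEGER PRIMARY KEY, instance_id INTEGER,
    mountpoint TEXT, status TEXT, attach_status TEXT,
    deleted INTEGER DEFAULT 0, deleted_at TEXT);
CREATE TABLE iscsi_targets (id INTEGER PRIMARY KEY, target_num INTEGER,
    host TEXT, volume_id INTEGER);
CREATE TABLE block_device_mapping (id INTEGER PRIMARY KEY, instance_id INTEGER,
    device_name TEXT, volume_id INTEGER,
    deleted INTEGER DEFAULT 0, deleted_at TEXT);
"""


def create_volume(db, volume_id, host):
    db.execute("INSERT INTO volumes (id, status, attach_status) "
               "VALUES (?, 'available', 'detached')", (volume_id,))
    # Claim the first free target on the chosen host.
    db.execute("UPDATE iscsi_targets SET volume_id = ? WHERE id = ("
               "SELECT id FROM iscsi_targets WHERE volume_id IS NULL "
               "AND host = ? ORDER BY id LIMIT 1)", (volume_id, host))


def attach_volume(db, volume_id, instance_id, device):
    db.execute("UPDATE volumes SET instance_id = ?, mountpoint = ?, "
               "status = 'in-use', attach_status = 'attached' WHERE id = ?",
               (instance_id, device, volume_id))
    # Every attach adds a new mapping row.
    db.execute("INSERT INTO block_device_mapping "
               "(instance_id, device_name, volume_id) VALUES (?, ?, ?)",
               (instance_id, device, volume_id))


def detach_volume(db, volume_id):
    db.execute("UPDATE volumes SET instance_id = NULL, mountpoint = NULL, "
               "status = 'available', attach_status = 'detached' WHERE id = ?",
               (volume_id,))
    # Mapping rows are soft-deleted, not removed.
    db.execute("UPDATE block_device_mapping SET deleted = 1, "
               "deleted_at = datetime('now') "
               "WHERE volume_id = ? AND deleted = 0", (volume_id,))


def delete_volume(db, volume_id):
    db.execute("UPDATE volumes SET deleted = 1, deleted_at = datetime('now') "
               "WHERE id = ?", (volume_id,))
    # Release the target so a later volume can reuse it.
    db.execute("UPDATE iscsi_targets SET volume_id = NULL WHERE volume_id = ?",
               (volume_id,))


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO iscsi_targets (id, target_num, host) "
           "VALUES (205, 5, 'store2.sigsit.org')")

# Replay the sequence from this post: create, attach, detach, attach, detach, delete.
create_volume(db, 40, "store2.sigsit.org")
claimed = db.execute("SELECT volume_id FROM iscsi_targets WHERE id = 205").fetchone()
attach_volume(db, 40, 70, "/dev/vdc")
detach_volume(db, 40)
attach_volume(db, 40, 70, "/dev/vdc")
detach_volume(db, 40)
delete_volume(db, 40)

volume_row = db.execute("SELECT deleted, attach_status FROM volumes "
                        "WHERE id = 40").fetchone()
target_row = db.execute("SELECT volume_id FROM iscsi_targets "
                        "WHERE id = 205").fetchone()
mappings = db.execute("SELECT deleted FROM block_device_mapping "
                      "WHERE volume_id = 40").fetchall()
print(volume_row, target_row, mappings)
```

As in the query output earlier, two attach operations leave two soft-deleted rows in block_device_mapping, and the iscsi_targets row survives the deletion with its volume_id cleared.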

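After I changed the four values above in the database (instance_id, mountpoint, status, and attach_status in the volumes table), the volume could be attached to and detached from other instances again, and the instance that had been stuck could be deleted normally. Every case still has to be analyzed on its own merits. Here I had already checked the volume and knew that it was effectively detached, with no client connected to it, so resetting the volume to the detached state in the database was all that was needed.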
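
A repair along these lines could look like the sketch below. It again uses an in-memory SQLite database as a stand-in for the nova MySQL database, and the `reset_stale_attachment` function is a hypothetical helper. Soft-deleting the block_device_mapping row is an assumption taken from the detach conclusion above; the text itself mentions only the four values in the volumes table.

```python
import sqlite3


def reset_stale_attachment(db, volume_id):
    """Put a volume back into the detached state inside one transaction."""
    with db:  # commits on success, rolls back if an exception is raised
        cur = db.execute(
            "UPDATE volumes SET instance_id = NULL, mountpoint = NULL, "
            "status = 'available', attach_status = 'detached' "
            "WHERE id = ? AND deleted = 0 AND attach_status = 'attached'",
            (volume_id,))
        # Refuse to continue unless exactly one attached volume matched.
        if cur.rowcount != 1:
            raise LookupError("volume %s is not an attached, live volume"
                              % volume_id)
        # Assumption: also soft-delete the live mapping row, as a detach would.
        db.execute(
            "UPDATE block_device_mapping SET deleted = 1, "
            "deleted_at = datetime('now') "
            "WHERE volume_id = ? AND deleted = 0", (volume_id,))


# Build a stand-in database holding the stale "attached" state.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE volumes (id INTEGER PRIMARY KEY, instance_id INTEGER,
    mountpoint TEXT, status TEXT, attach_status TEXT,
    deleted INTEGER DEFAULT 0);
CREATE TABLE block_device_mapping (id INTEGER PRIMARY KEY,
    instance_id INTEGER, device_name TEXT, volume_id INTEGER,
    deleted INTEGER DEFAULT 0, deleted_at TEXT);
INSERT INTO volumes VALUES (40, 70, '/dev/vdc', 'in-use', 'attached', 0);
INSERT INTO block_device_mapping (id, instance_id, device_name, volume_id)
    VALUES (50, 70, '/dev/vdc', 40);
""")

reset_stale_attachment(db, 40)
repaired = db.execute("SELECT instance_id, mountpoint, status, attach_status "
                      "FROM volumes WHERE id = 40").fetchone()
mapping_deleted = db.execute("SELECT deleted FROM block_device_mapping "
                             "WHERE id = 50").fetchone()[0]
print(repaired, mapping_deleted)
```

The guard in the WHERE clause means a second call on the same volume raises LookupError and changes nothing. On a production database it would be prudent to back up the affected rows before running any such update.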

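So when a volume is in an inconsistent state, first log in to the storage node that hosts it and check the volume with the command tgtadm --lld iscsi --mode target --op show: does the volume still exist, and are any clients connected to it? Then fix the corresponding state values in the database.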
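
The check for connected clients can be automated by scanning the tgtadm output for initiator entries under each target. The sketch below parses a sample text. The sample is an illustrative assumption of the typical layout, not captured from the system in this post, and the second target, its initiator name, and its IP address are invented. The `initiators_by_target` function is a hypothetical helper.

```python
import re

# Illustrative sample of `tgtadm --lld iscsi --mode target --op show` output.
SAMPLE_OUTPUT = """\
Target 5: iqn.2010-10.org.openstack:volume-00000028
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
        LUN: 1
            Type: disk
Target 6: iqn.2010-10.org.openstack:volume-00000029
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 2
            Initiator: iqn.1993-08.org.debian:01:example
            Connection: 0
                IP Address: 10.61.2.20
    LUN information:
        LUN: 0
            Type: controller
"""

TARGET_RE = re.compile(r"^Target (\d+): (\S+)")
INITIATOR_RE = re.compile(r"^\s+Initiator: (\S+)")


def initiators_by_target(text):
    """Map each target name to the list of initiators connected to it."""
    result = {}
    current = None
    for line in text.splitlines():
        match = TARGET_RE.match(line)
        if match:
            # A new "Target N: <name>" header starts a new section.
            current = match.group(2)
            result[current] = []
            continue
        match = INITIATOR_RE.match(line)
        if match and current is not None:
            result[current].append(match.group(1))
    return result


connections = initiators_by_target(SAMPLE_OUTPUT)
for name, initiators in connections.items():
    state = "in use" if initiators else "no clients connected"
    print(name, "->", state)
```

A target that is missing from the result no longer exists on the node, and a target with an empty list has no connected client, which is the situation in which resetting the database record is reasonable. On a real storage node the text would come from running the tgtadm command, for example through `subprocess.check_output`.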

North China University of Technology | Cloud Computing Research Center | Jiang Yong
