QEMU KVM Libvirt Handbook (5) – snapshots

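The previous parts covered the internal and external snapshots of QEMU's qcow2 format; both are features of the virtual machine image file format.

Those are file-level snapshots.

Snapshots can also be taken at the file system level: many file systems support them, OCFS2 for example.

They can also be taken at the block level: LVM, for example, supports snapshots.

This part analyzes how the various kinds of snapshot are implemented in OpenStack.

In OpenStack, an instance can be booted in roughly two ways: from an image, or from a bootable volume.

A running instance can also have a volume attached.

For an instance booted from an image with one volume attached, the libvirt XML contains: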

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source file='/var/lib/nova/instances/59ca11ea-0978-4f7d-8385-480649e63a1d/disk'/>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source dev='/dev/disk/by-path/ip-16.158.166.197:3260-iscsi-iqn.2010-10.org.openstack:volume-f6ba87f7-d0b6-4fdb-ac82-346371e78c48-lun-1'/>
  <target dev='vdb' bus='virtio'/>
  <serial>f6ba87f7-d0b6-4fdb-ac82-346371e78c48</serial>
  <alias name='virtio-disk1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>

For an instance booted from a bootable volume, the libvirt XML contains:

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source dev='/dev/disk/by-path/ip-16.158.166.197:3260-iscsi-iqn.2010-10.org.openstack:volume-640a10f7-3965-4a47-9641-002a94526444-lun-1'/>
  <target dev='vda' bus='virtio'/>
  <serial>640a10f7-3965-4a47-9641-002a94526444</serial>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
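To tell file-backed disks from volumes at a glance, fragments like the ones above can be parsed with the Python standard library. This is only a minimal sketch: the wrapping <devices> element and the helper name list_disks are assumptions made for illustration, and the XML is trimmed to the elements that matter here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the image-booted fragment, wrapped in a <devices> element
# (an assumption) so that it parses as a single XML document.
DOMAIN_DISKS_XML = """
<devices>
  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2' cache='none'/>
    <source file='/var/lib/nova/instances/59ca11ea-0978-4f7d-8385-480649e63a1d/disk'/>
    <target dev='vda' bus='virtio'/>
  </disk>
  <disk type='block' device='disk'>
    <driver name='qemu' type='raw' cache='none'/>
    <source dev='/dev/disk/by-path/ip-16.158.166.197:3260-iscsi-iqn.2010-10.org.openstack:volume-f6ba87f7-d0b6-4fdb-ac82-346371e78c48-lun-1'/>
    <target dev='vdb' bus='virtio'/>
    <serial>f6ba87f7-d0b6-4fdb-ac82-346371e78c48</serial>
  </disk>
</devices>
"""


def list_disks(xml_text):
    """Return (target_dev, disk_type, driver_format, source) per <disk>."""
    disks = []
    for disk in ET.fromstring(xml_text).iter('disk'):
        source = disk.find('source')
        # File-backed disks use source/@file, block devices use source/@dev.
        location = source.get('file') or source.get('dev')
        disks.append((disk.find('target').get('dev'),
                      disk.get('type'),
                      disk.find('driver').get('type'),
                      location))
    return disks


if __name__ == '__main__':
    for dev, kind, fmt, location in list_disks(DOMAIN_DISKS_XML):
        print(dev, kind, fmt, location)
```

The type attribute is what matters for the rest of the discussion: a 'file' disk is the qcow2 ephemeral disk managed by nova, while a 'block' disk is an iSCSI-attached cinder volume.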

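Snapshots fall into the following kinds:

  • Snapshot of an instance
    • Snapshot of an instance booted from an image
      • The ephemeral disk of the running instance is turned into a snapshot and uploaded to glance
      • Depending on the state of the virtual machine and the libvirt version, this is either a live snapshot or a cold snapshot
      • Attached volumes are not snapshotted together with the instance
    • Snapshot of an instance booted from a bootable volume
      • When the running instance is snapshotted and its disk turns out to be a volume, cinder is called to take an LVM snapshot
      • The snapshot's metadata shows up in glance, but the data is not uploaded to glance; it stays in LVM
      • The snapshot also shows up in the cinder database
  • Snapshot of a volume:
    • This ends up calling cinder, which snapshots the backend LVM
    • The snapshot shows up in the cinder database

So there are essentially only two kinds of snapshot: a snapshot of the libvirt disk and a snapshot in LVM.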
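The classification above can be condensed into a small decision function. This is a hedged sketch, not nova code: the function name, its three boolean parameters and the returned labels are all hypothetical, and nova derives the real inputs from the instance's block devices, its power state and the libvirt version.

```python
def choose_snapshot_path(volume_backed, vm_running, live_snapshot_supported):
    """Return a label for the snapshot path that applies.

    All three parameters are hypothetical booleans used for illustration.
    """
    if volume_backed:
        # Root disk is a cinder volume: the data stays in LVM and only
        # metadata is registered in glance.
        return 'lvm-snapshot'
    if vm_running and live_snapshot_supported:
        # Ephemeral disk, guest keeps running while the disk is copied.
        return 'libvirt-live-snapshot'
    # Ephemeral disk, but the live path is not available.
    return 'libvirt-cold-snapshot'


if __name__ == '__main__':
    print(choose_snapshot_path(True, True, True))
    print(choose_snapshot_path(False, True, True))
    print(choose_snapshot_path(False, True, False))
```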

The command that triggers a snapshot is:

nova --debug image-create d9793e05-111c-43bb-93ed-672e94ad096e myInstanceWithVolume-snapshot3

The REST call it issues is:

curl -i 'http://16.158.166.197:8774/v2/c24c59846a7f44538d958e7548cc74a3/servers/d9793e05-111c-43bb-93ed-672e94ad096e/action' -X POST -H "X-Auth-Project-Id: openstack" -H "User-Agent: python-novaclient" -H "Content-Type: application/json" -H "Accept: application/json" -H "X-Auth-Token: +Onwmvjs6T0CELsN48ON4PUNMUhF-" -d '{"createImage": {"name": "myInstanceWithVolume-snapshot3", "metadata": {}}}'
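The same request can be assembled in Python. The sketch below only builds the request object and does not send it; the helper name build_create_image_request is hypothetical, the token is a placeholder, and only a subset of the headers from the curl call is reproduced.

```python
import json
import urllib.request


def build_create_image_request(endpoint, tenant_id, server_id, token, name):
    """Build (but do not send) the createImage server action request."""
    url = '%s/v2/%s/servers/%s/action' % (endpoint, tenant_id, server_id)
    body = json.dumps({'createImage': {'name': name, 'metadata': {}}})
    return urllib.request.Request(
        url,
        data=body.encode('utf-8'),
        method='POST',
        headers={'Content-Type': 'application/json',
                 'Accept': 'application/json',
                 'X-Auth-Token': token})


if __name__ == '__main__':
    req = build_create_image_request(
        'http://16.158.166.197:8774',
        'c24c59846a7f44538d958e7548cc74a3',
        'd9793e05-111c-43bb-93ed-672e94ad096e',
        'TOKEN-PLACEHOLDER',
        'myInstanceWithVolume-snapshot3')
    print(req.get_method(), req.full_url)
    print(req.data.decode('utf-8'))
    # Sending it would be: urllib.request.urlopen(req)
```

The "createImage" key in the body is what routes the request to the matching action handler on the server side.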

The request is handled by the following method in /usr/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py:

@wsgi.response(202)
@wsgi.serializers(xml=FullServerTemplate)
@wsgi.deserializers(xml=ActionDeserializer)
@wsgi.action('createImage')
@common.check_snapshots_enabled
def _action_create_image(self, req, id, body):

Inside this function:

# For a volume-backed instance, take a volume snapshot
image = self.compute_api.snapshot_volume_backed(
    context,
    instance,
    image_meta,
    image_name,
    extra_properties=props)
# For an ordinary instance, take a regular snapshot
image = self.compute_api.snapshot(context,
                                  instance,
                                  image_name,
                                  extra_properties=props)

Let us look at the volume snapshot first.

/usr/lib/python2.7/dist-packages/nova/compute/api.py contains the function:

@check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED])
def snapshot_volume_backed(self, context, instance, image_meta, name,
                           extra_properties=None):

It first does the following:

# Call the volume API to take the snapshot
snapshot = self.volume_api.create_snapshot_force(
                    context, volume['id'], name, volume['display_description'])

Then:

# Register the snapshot's metadata in glance
return self.image_service.create(context, image_meta, data='')

/usr/lib/python2.7/dist-packages/cinder/volume/api.py contains the function:

def create_snapshot_force(self, context,
                          volume, name,
                          description, metadata=None):
    return self._create_snapshot(context, volume, name, description,
                                 True, metadata)

In the _create_snapshot function:

# Create the snapshot record in the cinder database
snapshot = self.db.snapshot_create(context, options)

# Create the volume snapshot
self.volume_rpcapi.create_snapshot(context, volume, snapshot)

/usr/lib/python2.7/dist-packages/cinder/volume/manager.py contains the function:

def create_snapshot(self, context, volume_id, snapshot_id):

# It calls the driver to create the snapshot

model_update = self.driver.create_snapshot(snapshot_ref)

The default volume_driver is cinder.volume.drivers.lvm.LVMISCSIDriver, which inherits from LVMVolumeDriver.

LVMVolumeDriver has the function:

def create_snapshot(self, snapshot):
    """Creates a snapshot."""
    self.vg.create_lv_snapshot(self._escape_snapshot(snapshot['name']),
                               snapshot['volume_name'],
                               self.configuration.lvm_type)

Here vg is:

self.vg = lvm.LVM(self.configuration.volume_group,
                  root_helper,
                  lvm_type=self.configuration.lvm_type,
                  executor=self._execute)

/usr/lib/python2.7/dist-packages/cinder/brick/local_dev/lvm.py contains the function:

def create_lv_snapshot(self, name, source_lv_name, lv_type='default'):

It mainly runs the following command:

cmd = ['lvcreate', '--name', name,
       '--snapshot', '%s/%s' % (self.vg_name, source_lv_name)]

A comparable manual invocation of lvcreate looks like this:

lvcreate --size 100M --snapshot --name snap /dev/vg00/lvol1
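The way such an argument list is put together can be sketched as a standalone function. This is an illustration rather than the cinder implementation: the function name build_lv_snapshot_cmd and the optional size parameter are assumptions, the latter modelled on the --size option of the manual lvcreate invocation.

```python
def build_lv_snapshot_cmd(vg_name, name, source_lv_name, size=None):
    """Build an lvcreate argument list for a snapshot of source_lv_name."""
    cmd = ['lvcreate', '--name', name,
           '--snapshot', '%s/%s' % (vg_name, source_lv_name)]
    if size is not None:
        # A classic (non-thin) LVM snapshot needs its own copy-on-write
        # space, which is what --size reserves.
        cmd.extend(['--size', size])
    return cmd


if __name__ == '__main__':
    print(' '.join(build_lv_snapshot_cmd('vg00', 'snap', 'lvol1', '100M')))
```

Building the command as a list instead of a single string avoids shell quoting problems when volume names contain unusual characters.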

Now let us look at the instance snapshot.

/usr/lib/python2.7/dist-packages/nova/compute/api.py contains the function:

@wrap_check_policy
@check_instance_cell
@check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED,
                                vm_states.PAUSED, vm_states.SUSPENDED])
def snapshot(self, context, instance, name, extra_properties=None):

        # Create an image record in glance
        image_meta = self._create_image(context, instance, name,
                                        'snapshot',
                                        extra_properties=extra_properties)

        # Do the actual snapshot
        self.compute_rpcapi.snapshot_instance(context, instance,
                                              image_meta['id'])

/usr/lib/python2.7/dist-packages/nova/compute/manager.py contains the function:

@wrap_exception()
@reverts_task_state
@wrap_instance_fault
@delete_image_on_error
def snapshot_instance(self, context, image_id, instance):

It calls _snapshot_instance, which ultimately calls the driver:

self.driver.snapshot(context, instance, image_id,
                     update_task_state)

The nova compute driver here is libvirt.

/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py contains the function:

def snapshot(self, context, instance, image_href, update_task_state):

This is the core function of the snapshot.

1) Get the libvirt domain

virt_dom = self._lookup_by_name(instance['name'])

(Pdb) p instance['name']
'instance-0000000d'
(Pdb) p virt_dom
<libvirt.virDomain object at 0x7f22a87227d0>

2) Get the image service

(image_service, image_id) = glance.get_remote_image_service(context, instance['image_ref'])

(Pdb) p image_service
<nova.image.glance.GlanceImageService object at 0x7f22a8722bd0>
(Pdb) p image_id
u'd96b0e41-8264-41de-8dbb-6b31ce9bfbfc'

3) Get the image's metadata

base = compute_utils.get_image_metadata(context, image_service, image_id, instance)

(Pdb) p base
{u'min_disk': 20, u'container_format': u'bare', u'min_ram': 0, u'disk_format': u'qcow2', 'properties': {u'instance_type_memory_mb': u'2048', u'instance_type_swap': u'0', u'instance_type_root_gb': u'20', u'instance_type_name': u'm1.small', u'instance_type_id': u'5', u'instance_type_ephemeral_gb': u'0', u'instance_type_rxtx_factor': u'1.0', u'network_allocated': u'True', u'instance_type_flavorid': u'2', u'instance_type_vcpus': u'1', u'base_image_ref': u'd96b0e41-8264-41de-8dbb-6b31ce9bfbfc'}}

4) Get the snapshot image record

snapshot = snapshot_image_service.show(context, snapshot_image_id)

(Pdb) p snapshot
{'status': u'queued', 'name': u'myinstancewithvolume-snapshot6', 'deleted': False, 'container_format': u'bare', 'created_at': datetime.datetime(2014, 7, 3, 20, 48, 46, tzinfo=<iso8601.iso8601.Utc object at 0x7f22a873d8d0>), 'disk_format': u'qcow2', 'updated_at': datetime.datetime(2014, 7, 3, 20, 48, 46, tzinfo=<iso8601.iso8601.Utc object at 0x7f22a873d8d0>), 'id': u'2b2da7ba-9cba-4cc9-87da-a47666e62f57', 'owner': u'c24c59846a7f44538d958e7548cc74a3', 'min_ram': 0, 'checksum': None, 'min_disk': 20, 'is_public': False, 'deleted_at': None, 'properties': {u'instance_uuid': u'd9793e05-111c-43bb-93ed-672e94ad096e', u'instance_type_memory_mb': u'2048', u'user_id': u'5df0e89888364c0b80d47a7a426a9a67', u'image_type': u'snapshot', u'instance_type_id': u'5', u'instance_type_name': u'm1.small', u'instance_type_ephemeral_gb': u'0', u'instance_type_rxtx_factor': u'1.0', u'instance_type_root_gb': u'20', u'network_allocated': u'True', u'instance_type_flavorid': u'2', u'instance_type_vcpus': u'1', u'instance_type_swap': u'0', u'base_image_ref': u'd96b0e41-8264-41de-8dbb-6b31ce9bfbfc'}, 'size': 0}

5) Get the path and format of the ephemeral disk

disk_path = libvirt_utils.find_disk(virt_dom)
source_format = libvirt_utils.get_disk_type(disk_path)

(Pdb) p disk_path
'/var/lib/nova/instances/d9793e05-111c-43bb-93ed-672e94ad096e/disk'
(Pdb) p source_format
'qcow2'
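One way to find out a disk's format is to read the output of qemu-img info. The sketch below parses that human-readable output into a dictionary; it assumes the plain "key: value" layout that qemu-img info prints, the helper name parse_qemu_img_info is hypothetical, and it is not meant to describe how get_disk_type is implemented.

```python
def parse_qemu_img_info(text):
    """Parse the top-level 'key: value' lines of qemu-img info output."""
    info = {}
    for line in text.splitlines():
        # Indented lines belong to "Format specific information"; skip them.
        if not line or line[0].isspace() or ':' not in line:
            continue
        key, _, value = line.partition(':')
        info[key.strip()] = value.strip()
    return info


# Sample text in the layout qemu-img info prints for this instance's disk.
SAMPLE = """image: d9793e05-111c-43bb-93ed-672e94ad096e/disk
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 379M
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678
Format specific information:
    compat: 1.1
    lazy refcounts: false
"""

if __name__ == '__main__':
    info = parse_qemu_img_info(SAMPLE)
    print(info['file format'], info.get('backing file'))
```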

6) Create the snapshot's metadata

metadata = self._create_snapshot_metadata(base, instance, image_format, snapshot['name'])

(Pdb) p metadata
{'status': 'active', 'name': u'myinstancewithvolume-snapshot6', 'container_format': u'bare', 'disk_format': 'qcow2', 'is_public': False, 'properties': {'kernel_id': u'', 'image_location': 'snapshot', 'image_state': 'available', 'ramdisk_id': u'', 'owner_id': u'c24c59846a7f44538d958e7548cc74a3'}}

7) Generate a file name for the snapshot

snapshot_name = uuid.uuid4().hex

(Pdb) p snapshot_name
'7d9d745f5bf5482eb43886f6e69ed6e5'

8) Get the snapshot directory

snapshot_directory = CONF.libvirt.snapshots_directory

(Pdb) p snapshot_directory
'/var/lib/nova/instances/snapshots'

9) Call the live snapshot

(Pdb) p virt_dom
<libvirt.virDomain object at 0x7f22a87227d0>
(Pdb) p disk_path
'/var/lib/nova/instances/d9793e05-111c-43bb-93ed-672e94ad096e/disk'
(Pdb) p out_path
'/var/lib/nova/instances/snapshots/tmps2A_Uy/7d9d745f5bf5482eb43886f6e69ed6e5'
(Pdb) p image_format
'qcow2'

self._live_snapshot(virt_dom, disk_path, out_path, image_format)
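The shape of out_path (snapshot directory, then a temporary directory, then the 32-character hex name) can be reproduced with the standard library. This is a sketch under assumptions: the helper name make_snapshot_out_path is hypothetical, and tempfile.mkdtemp is used here as a stand-in for whatever temporary-directory helper nova uses.

```python
import os
import tempfile
import uuid


def make_snapshot_out_path(snapshot_directory):
    """Return (tmp_dir, out_path) for a new snapshot file."""
    snapshot_name = uuid.uuid4().hex
    # mkdtemp creates a uniquely named directory inside snapshot_directory,
    # similar to the tmps2A_Uy component seen in the pdb output.
    tmp_dir = tempfile.mkdtemp(dir=snapshot_directory)
    return tmp_dir, os.path.join(tmp_dir, snapshot_name)


if __name__ == '__main__':
    # A throwaway directory stands in for /var/lib/nova/instances/snapshots.
    with tempfile.TemporaryDirectory() as snapshots_directory:
        tmp_dir, out_path = make_snapshot_out_path(snapshots_directory)
        print(out_path)
```

Using a fresh temporary directory per snapshot keeps the intermediate .delta file and the final image together, so both can be removed in one step once the upload is done.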

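10) _live_snapshot: abort or cancel the current block operation

domain.blockJobAbort(disk_path, 0)

This is similar to running the command

virsh blockjob <domain> <path> [--abort] [--async] [--pivot] [--info] [<bandwidth>]

It calls libvirt's virDomainBlockJobAbort:

http://libvirt.org/html/libvirt-libvirt.html#virDomainBlockJobAbort

The result depends on the type of the current job and on the flags.

If the job is of type VIR_DOMAIN_BLOCK_JOB_TYPE_PULL, cancelling it can take a long time, so the flag is usually set to VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC to cancel asynchronously; an event is emitted once the job has really been cancelled.

If the job is VIR_DOMAIN_BLOCK_JOB_TYPE_COPY, the operation stops immediately and the disk returns to its original state.

If the job is VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT, it also stops immediately and the original state is kept.

If the job is one of these two types and the flag is VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT, the abort fails with a message that a copy or commit is still in progress when the job has not yet reached the ready state; once the job is ready, the disk is switched over to the new image instead.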
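The cases above can be written down as a small lookup function. This is a plain-Python model for reading purposes, not a call into libvirt: the job type labels, the ready parameter and the returned descriptions are hypothetical, the two flag values are assumed to mirror libvirt's ASYNC and PIVOT abort flags, and the pivot-when-ready branch reflects my reading of the libvirt documentation.

```python
# Local stand-ins for the libvirt names used in the text. Real code should
# take the constants from the libvirt module instead.
JOB_PULL, JOB_COPY, JOB_ACTIVE_COMMIT = 'pull', 'copy', 'active-commit'
ABORT_ASYNC = 1
ABORT_PIVOT = 2


def describe_abort(job_type, flags=0, ready=False):
    """Describe what aborting a block job of job_type with flags does."""
    if job_type == JOB_PULL:
        if flags & ABORT_ASYNC:
            return 'cancel requested; completion is signalled by an event'
        return 'blocks until the pull job is cancelled'
    if job_type in (JOB_COPY, JOB_ACTIVE_COMMIT):
        if flags & ABORT_PIVOT:
            if not ready:
                return 'fails: copy or commit is still in progress'
            return 'pivots the disk to the new image'
        return 'stops at once; disk keeps its original state'
    raise ValueError('unknown job type: %r' % (job_type,))


if __name__ == '__main__':
    # _live_snapshot passes flags=0, so any leftover job is simply dropped.
    print(describe_abort(JOB_COPY, 0))
```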

11) _live_snapshot: find the base disk of the original disk and build the path of the new disk

src_disk_size = libvirt_utils.get_disk_size(disk_path)
src_back_path = libvirt_utils.get_disk_backing_file(disk_path, basename=False)

(Pdb) p src_disk_size
21474836480
(Pdb) p src_back_path
'/var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678'

disk_delta = out_path + '.delta'

(Pdb) p disk_delta
'/var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta'

12) _live_snapshot: create a COW disk

libvirt_utils.create_cow_image(src_back_path, disk_delta, src_disk_size)

This actually runs the following commands:

qemu-img info /var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678

qemu-img create -f qcow2 -o backing_file=/var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678,size=21474836480 /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta
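The qemu-img create command line above can be assembled as follows. This is a minimal sketch, not the nova helper itself: the function name build_cow_create_cmd is hypothetical, and it only covers the backing_file and size options that appear in the command.

```python
def build_cow_create_cmd(backing_file, path, size=None):
    """Build the qemu-img command that creates a qcow2 overlay on backing_file."""
    options = ['backing_file=%s' % backing_file]
    if size is not None:
        # The overlay is given the same virtual size as the source disk.
        options.append('size=%d' % size)
    return ['qemu-img', 'create', '-f', 'qcow2',
            '-o', ','.join(options), path]


if __name__ == '__main__':
    cmd = build_cow_create_cmd(
        '/var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678',
        '/var/lib/nova/instances/snapshots/tmpzfjdJS/'
        '7f8d11be9ff647f6b7a0a643fad1f030.delta',
        21474836480)
    print(' '.join(cmd))
```

The resulting .delta file starts out empty and points at the same base image as the instance's disk, which is what allows the following copy step to be shallow.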

13) _live_snapshot: copy the original disk to the new disk

domain.blockRebase(disk_path, disk_delta, 0,
libvirt.VIR_DOMAIN_BLOCK_REBASE_COPY |
libvirt.VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT |
libvirt.VIR_DOMAIN_BLOCK_REBASE_SHALLOW)

(Pdb) p disk_path
'/var/lib/nova/instances/d9793e05-111c-43bb-93ed-672e94ad096e/disk'
(Pdb) p disk_delta
'/var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta'
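The three flags passed to blockRebase are combined with a bitwise OR, and each one matches something done in an earlier step. The sketch below decodes such a combination; the constants are local stand-ins whose values are assumed to mirror libvirt's VIR_DOMAIN_BLOCK_REBASE_* flags, and the helper name and descriptions are my own.

```python
# Stand-ins for the libvirt flags used in the blockRebase call. Real code
# should use the attributes of the libvirt module instead.
REBASE_SHALLOW = 1
REBASE_REUSE_EXT = 2
REBASE_COPY = 8

FLAG_MEANING = [
    (REBASE_COPY, 'copy: mirror the disk into a new destination file'),
    (REBASE_REUSE_EXT, 'reuse_ext: the destination file already exists'),
    (REBASE_SHALLOW, 'shallow: copy only the top image, keep the backing file'),
]


def explain_rebase_flags(flags):
    """Return the descriptions of all flags set in the bitmask."""
    return [text for bit, text in FLAG_MEANING if flags & bit]


if __name__ == '__main__':
    flags = REBASE_COPY | REBASE_REUSE_EXT | REBASE_SHALLOW
    for line in explain_rebase_flags(flags):
        print(line)
```

REUSE_EXT fits because the .delta file was created beforehand with qemu-img create, and SHALLOW fits because that file already shares the instance disk's backing file, so only the top layer has to be copied.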

After this step, we can see that the new disk now contains data:

root:/var/lib/nova/instances# qemu-img info d9793e05-111c-43bb-93ed-672e94ad096e/disk
image: d9793e05-111c-43bb-93ed-672e94ad096e/disk
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 379M
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678
Format specific information:
    compat: 1.1
    lazy refcounts: false

root:/var/lib/nova/instances# qemu-img info /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta
image: /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 18G
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/ed39541b2c77cd7b069558570fa1dff4fda4f678
Format specific information:
    compat: 1.1
    lazy refcounts: false

The effect is comparable to running a blockpull.

For example:

[root@moon ~]# virsh snapshot-list daisy --tree

snap1-daisy
  |
  +- snap2-daisy
      |
      +- snap3-daisy

snap3 is based on snap2, and snap2 is based on snap1.

We want snap3 to be based directly on snap1:
[root@moon ~]# virsh blockpull --domain daisy  --path /export/vmimgs/snap3-daisy.qcow2 \
 --base /export/vmimgs/snap1-daisy.qcow2 --wait --verbose
Block Pull: [100 %]
Pull complete

After this, snap3 is based directly on snap1:

[root@moon ~]# qemu-img info /export/vmimgs/snap3-daisy.qcow2 
image: /export/vmimgs/snap3-daisy.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 145M
cluster_size: 65536
backing file: /export/vmimgs/snap1-daisy.qcow2
[root@moon ~]# 
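What the pull does to the backing chain can be modelled with a dictionary that maps each image to its backing file. This is a toy model for illustration only: the function names are hypothetical, and it tracks just the chain structure, not the data that libvirt copies into the top image.

```python
def backing_chain(backing, image):
    """Return the chain [image, its backing file, ...] down to the base."""
    chain = [image]
    while backing.get(image) is not None:
        image = backing[image]
        chain.append(image)
    return chain


def block_pull(backing, image, base=None):
    """Model a pull: image absorbs every layer between itself and base."""
    if base is not None and base not in backing_chain(backing, image)[1:]:
        raise ValueError('%s is not in the backing chain of %s' % (base, image))
    updated = dict(backing)
    # After the pull, image points straight at base (or at nothing).
    updated[image] = base
    return updated


if __name__ == '__main__':
    before = {'snap3': 'snap2', 'snap2': 'snap1', 'snap1': None}
    after = block_pull(before, 'snap3', base='snap1')
    print(backing_chain(before, 'snap3'))
    print(backing_chain(after, 'snap3'))
```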

14) _live_snapshot: convert the new disk into a compact standalone image

libvirt_utils.extract_snapshot(disk_delta, 'qcow2', out_path, image_format)

This runs the following command:

qemu-img convert -f qcow2 -O qcow2 /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030.delta /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030

root:/var/lib/nova/instances# qemu-img info /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030
image: /var/lib/nova/instances/snapshots/tmpzfjdJS/7f8d11be9ff647f6b7a0a643fad1f030
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 862M
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
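The convert command can be built the same way as the earlier ones. The sketch below is an assumption-laden illustration: the function name build_extract_cmd is hypothetical, and it reproduces only the options visible in the command line above.

```python
def build_extract_cmd(disk_path, source_fmt, out_path, dest_fmt):
    """Build the qemu-img convert command that flattens disk_path."""
    # convert reads through the whole backing chain, so the output file
    # no longer depends on a backing file.
    return ['qemu-img', 'convert', '-f', source_fmt, '-O', dest_fmt,
            disk_path, out_path]


if __name__ == '__main__':
    base = ('/var/lib/nova/instances/snapshots/tmpzfjdJS/'
            '7f8d11be9ff647f6b7a0a643fad1f030')
    print(' '.join(build_extract_cmd(base + '.delta', 'qcow2', base, 'qcow2')))
```

This matches the qemu-img info output: the converted image has no "backing file" line and is much smaller than the 18G .delta file, which makes it suitable for uploading on its own.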

15) Upload the image to glance

In the snapshot function:

image_service.update(context, image_href, metadata, image_file)
