[Paper Notes] Post-copy based live virtual machine migration using adaptive pre-paging and dynamic self-ballooning (VEE, 2009)

Timespan: 1.28 – 1.29
Michael R. Hines and Kartik Gopalan. 2009. Post-copy based live virtual machine migration using adaptive pre-paging and dynamic self-ballooning. In Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments (VEE '09). ACM, New York, NY, USA, 51-60. (gs:94)


    The author, Michael R. Hines, was a PhD student at S.U.N.Y. Binghamton when he wrote this paper. After graduating he joined the IBM Watson Research Center, where his research interest is “creating and analyzing experimental, networked systems”. He contributed to the “IBM Cloud Rapid Experimentation and Analysis Tool” project, an open-source project described as “a framework that automates IaaS cloud benchmarking through the running of controlled experiments”. The IaaS platforms it currently supports include Amazon EC2, OpenStack, IBM SCP, and IBM SCE.

    Pre-copy is the default method for live VM migration in Xen. This paper proposes a post-copy method and compares it with pre-copy on several metrics (pages transferred, total migration time, and network overhead).
    Post-copy migration: “defers the transfer of a VM’s memory contents until after its processor state has been sent to the target host.”

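Excerpts from the paper:
1. Post-copy ensures that each memory page is transferred at most once; however, the post-copy method in this paper is not fault-tolerant if the destination node fails. (S1)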
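
The "at most once" property can be illustrated with a toy page-count model. This is only a sketch under simplifying assumptions, not the paper's implementation: the function names, the fixed number of pages dirtied per pre-copy round, and the access trace are all hypothetical.

```python
def precopy_pages_sent(num_pages, dirtied_per_round, extra_rounds):
    """Toy pre-copy model: the VM keeps running at the source, so pages
    dirtied during one round must be sent again in the next."""
    sent = num_pages                      # first round copies all memory
    for _ in range(extra_rounds):         # iterative rounds resend dirty pages
        sent += dirtied_per_round
    sent += dirtied_per_round             # final stop-and-copy round
    return sent


def postcopy_pages_sent(num_pages, access_trace):
    """Toy post-copy model: the VM already runs at the target, so a page
    is fetched on first access and never needs to be resent."""
    received = set()
    sent = 0
    for page in access_trace:             # faults at the target pull pages
        if page not in received:
            received.add(page)
            sent += 1
    for page in range(num_pages):         # remaining pages arrive in the background
        if page not in received:
            received.add(page)
            sent += 1
    return sent
```

In this model post-copy always sends exactly `num_pages` pages regardless of the access trace, while the pre-copy total grows with the number of rounds and the dirtying rate.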

2. Adaptive pre-paging is used to reduce “network page faults (major faults)”. (S1)
* On minor/major faults, see http://en.wikipedia.org/wiki/Page_fault

  • minor faults: If the page is loaded in memory at the time the fault is generated, but is not marked in the memory management unit as being loaded in memory, then it is called a minor or soft page fault.
  • major faults: If the page is not loaded in memory at the time the fault is generated, then it is called a major or hard page fault.


3. Dynamic self-ballooning (DSB) is used to handle free pages and so reduce the total migration time. (S1)
In this paper DSB is triggered every 5 seconds and “responds directly to OS memory allocation requests”.
* ballooning: allows a guest OS to reduce its memory footprint by releasing its free memory pages back to the hypervisor.
* memory footprint: the amount of main memory that a program uses or references while running.
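
The effect of ballooning on the amount of memory left to migrate can be sketched as follows. This is a minimal model, not the paper's DSB code: the `Guest` class, the `reserve` parameter, and the page counts are hypothetical, and the periodic 5-second trigger is reduced to an explicit `balloon_tick()` call.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    total_pages: int
    used_pages: int
    balloon_pages: int = 0   # pages currently handed back to the hypervisor

    @property
    def free_pages(self):
        return self.total_pages - self.used_pages - self.balloon_pages

    def balloon_tick(self, reserve=0):
        """Inflate the balloon: release free pages beyond `reserve`."""
        reclaimed = max(0, self.free_pages - reserve)
        self.balloon_pages += reclaimed
        return reclaimed

    def allocate(self, n):
        """Guest allocation; deflate the balloon when free memory is short."""
        shortfall = n - self.free_pages
        if shortfall > 0:
            self.balloon_pages -= min(shortfall, self.balloon_pages)
        if n > self.free_pages:
            raise MemoryError("guest is out of memory")
        self.used_pages += n

    def pages_to_migrate(self):
        """Ballooned pages belong to the hypervisor and need not be sent."""
        return self.total_pages - self.balloon_pages
```

For example, a guest with 1000 pages of which 300 are in use has 1000 pages to migrate before ballooning and 400 after a tick with `reserve=100`; a later allocation of 150 pages deflates the balloon by 50 pages.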

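4. (S2) surveys related work, grouped into the following areas:

  • Process Migration: post-copy techniques have been studied extensively in process migration
  • Pre-Paging: also called “adaptive prefetching” or “adaptive remote paging”
  • Live VM Migration: can be divided into hypervisor-based approaches, OS-level approaches, and wide-area migration
  • Non-Live VM Migration: migration that is not live
  • DSB: ballooning is widely used for VM memory resizing

The authors note that the work closest to this paper is SnowFlock.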

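5. (S3) presents the design of the paper's post-copy method, which combines four techniques:

  • demand-paging: ensures each page is transferred at most once
  • active push: after a major fault occurs, pages near the faulting page are transferred as well (exploiting spatial locality) (S3.2)
  • pre-paging: uses the VM's page access pattern to predict major faults so that the relevant pages can be transferred early
  • DSB: reduces the number of free pages transferred (S.3)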
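
How sending the neighbours of a faulting page reduces major faults can be sketched with a small simulation. This is an illustration only: the fixed `radius`, the function name, and the access trace are hypothetical, and the paper's adaptive strategy for choosing which pages to send is not reproduced here.

```python
def postcopy_fetch(num_pages, access_trace, radius):
    """Serve a trace of page accesses at the target VM.

    A page that has not arrived yet causes a major (network) fault; the
    source answers with the faulting page plus up to `radius` neighbours
    on each side. Returns (major_faults, pages_sent).
    """
    received = set()
    major_faults = 0
    pages_sent = 0
    for page in access_trace:
        if page in received:
            continue                      # already resident, no network fault
        major_faults += 1
        lo = max(0, page - radius)
        hi = min(num_pages - 1, page + radius)
        for p in range(lo, hi + 1):       # faulting page and its neighbours
            if p not in received:         # each page is sent at most once
                received.add(p)
                pages_sent += 1
    return major_faults, pages_sent
```

With the mostly sequential trace `[10, 11, 12, 13, 50]`, a radius of 0 (pure demand paging) takes five major faults, while a radius of 2 takes three, at the cost of sending some pages that the trace never touches.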

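6. (S4) discusses several implementation issues:
(1) How to “trap page faults at the target VM”: three methods are described (shadow paging, page tracking, pseudo-paging); the paper uses the third, which is the quickest to implement (S4.1)
(2) The implementation of DSB (S4.2)

The experimental platform is based on Xen 3.2.1 and paravirtualized Linux 2.6.18.8.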

7. (S6) The conclusion lists areas for improvement:

  1. investigate shadow paging based page fault detection
  2. handle destination node failure during post-copy migration
  3. implement a hybrid pre/post copy approach
