A summary of virtual machine CPU pinning in OpenStack

1. vcpu_pin_set

vcpu_pin_set is the earliest feature added to OpenStack that touches on the concept of vCPU pinning. It was mainly introduced to solve the following problem:

Currently instances can use all of the pCPUs of a compute node, and the host may become slow when the vCPUs of the instances are busy, so we need to pin vCPUs to specific pCPUs of the host instead of all pCPUs. The vCPU element in the libvirt XML will then look like this:
<vcpu cpuset="4-12,^8,15">1</vcpu>

In other words, the main purpose is to restrict OpenStack instances to a specified set of physical CPU cores, leaving some physical cores for the host operating system so that host performance is preserved.

vcpu_pin_set is used as follows: add a configuration option to the Nova configuration file.

Edit /etc/nova/nova.conf:

[DEFAULT]
vcpu_pin_set = 4-12,^8,15     

# This ensures that all instances run only on host CPUs 4,5,6,7,9,10,11,12 and 15
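One operational note: the setting only takes effect after the compute service has been restarted. On an RDO/RHEL-style deployment that would be the following command (the service name is an assumption and may differ in other packagings):

    # systemctl restart openstack-nova-compute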

2. hw:cpu_policy

The Kilo release of OpenStack added a new CPU pinning feature:

The flavor extra specs will be enhanced to support two new parameters:

  • hw:cpu_policy=shared|dedicated
  • hw:cpu_threads_policy=avoid|separate|isolate|prefer

If the policy is set to 'shared' no change will be made compared to the current default guest CPU placement policy. The guest vCPUs will be allowed to freely float across host pCPUs, albeit potentially constrained by NUMA policy. If the policy is set to 'dedicated' then the guest vCPUs will be strictly pinned to a set of host pCPUs. In the absence of an explicit vCPU topology request, the virt drivers typically expose all vCPUs as sockets with 1 core and 1 thread. When strict CPU pinning is in effect the guest CPU topology will be set up to match the topology of the CPUs to which it is pinned, i.e. if a 2 vCPU guest is pinned to a single host core with 2 threads, then the guest will get a topology of 1 socket, 1 core, 2 threads.
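To illustrate the last point, for the 2 vCPU guest pinned to one host core with two threads, the generated libvirt definition would carry a CPU topology element along these lines (a sketch, not output copied from a real deployment):

    <cpu>
      <topology sockets='1' cores='1' threads='2'/>
    </cpu>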

The threads policy will control how the scheduler / virt driver places guests with respect to CPU threads. It only applies if hw:cpu_policy is 'dedicated'. (A usage sketch follows the list of values below.)

·      avoid: the scheduler will not place the guest on a host which has hyperthreads.

·      separate: if the host has threads, each vCPU will be placed on a different core, i.e. no two vCPUs will be placed on thread siblings.

·      isolate: if the host has threads, each vCPU will be placed on a different core and no vCPUs from other guests will be able to be placed on the same core, i.e. one thread sibling is guaranteed to always be unused.

·      prefer: if the host has threads, vCPUs will be placed on the same core, so they are thread siblings.
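As a hedged sketch of how a threads policy would be requested on a flavor (the flavor name is illustrative; also note that the key as actually implemented in later releases is hw:cpu_thread_policy with a reduced set of values, so check the release you are running):

    $ nova flavor-key pinned.medium set hw:cpu_threads_policy=isolate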

The image metadata properties will also allow specification of the threads policy

  • hw_cpu_threads_policy=avoid|separate|isolate|prefer

This will only be honoured if the flavor does not already have a threads policy set. This ensures the cloud administrator can have absolute control over threads policy if desired.
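By way of illustration, the property could be set on an image so that guests booted from it inherit the policy whenever the flavor leaves it unset (a sketch using the glance v1-style CLI; 'fedora' is the image name used in the boot example later in this post):

    $ glance image-update --property hw_cpu_threads_policy=isolate fedora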

The scheduler will have to be enhanced so that it considers the usage of CPUs by existing guests. Use of a dedicated CPU policy will have to be accompanied by the setup of aggregates to split the hosts into two groups, one allowing overcommit of shared pCPUs and the other only allowing dedicated CPU guests, i.e. we do not want a situation with dedicated CPU and shared CPU guests on the same host. It is likely that the administrator will already need to set up host aggregates for the purpose of using huge pages for guest RAM. The same grouping will be usable for both dedicated RAM (via huge pages) and dedicated CPUs (via pinning).

The compute host already has a notion of CPU sockets which are reserved for execution of base operating system services (vcpu_pin_set). This facility will be preserved unchanged, i.e. dedicated CPU guests will only be placed on CPUs which are not marked as reserved for the base OS.

Note that exact vCPU to pCPU pinning is not exposed to the user, as this would require them to have direct knowledge of the host pCPU layout. Instead they request that the instance receive "dedicated" CPU resourcing, and Nova handles the allocation of pCPUs.

 

In other words, this so-called pinning does not mean the user explicitly binds a vCPU to a particular physical CPU; OpenStack does not expose the physical CPU layout of the host to the user. The user merely requests the dedicated policy (and, optionally, a threads policy), and Nova decides through its scheduling which pCPU each vCPU is actually bound to. The usual way of using this is to create two host aggregates, one named cpu_pinning and one named normal, put different compute hosts into each, and create instances that need pinning only on hosts in the cpu_pinning aggregate. Instances with and without pinning requirements are never placed on the same host.

 

Example usage:

 

* Create a host aggregate and set metadata on it to indicate it is to be used for pinning. 'pinned' is used for the example, but any key can be used; the same key must be used in later steps though:

    $ nova aggregate-create cpu_pinning

    $ nova aggregate-set-metadata cpu_pinning pinned=true

 

For aggregates/flavors that won't be dedicated, set pinned=false:

$ nova aggregate-create normal

$ nova aggregate-set-metadata normal pinned=false

 

Before creating the new flavor for performance-intensive instances, update all existing flavors so that their extra specifications match them to the compute hosts in the normal aggregate, i.e. keep all existing flavors away from the cpu_pinning aggregate:

$ for FLAVOR in `nova flavor-list | cut -f 2 -d ' ' | grep -o "[0-9]*"`; \
     do nova flavor-key ${FLAVOR} set \
             "aggregate_instance_extra_specs:pinned"="false"; \
     done

 

Create a new flavor for performance-intensive instances. The difference in behaviour between it and the existing flavors will be the result of the metadata we add to the new flavor shortly.

$ nova flavor-create pinned.medium 6 2048 20 2

 

Set the hw:cpu_policy flavor extra specification to dedicated. This denotes that all instances created using this flavor will require dedicated compute resources and be pinned accordingly.

$ nova flavor-key 6 set hw:cpu_policy=dedicated

 

Set the aggregate_instance_extra_specs:pinned flavor extra specification to true. This denotes that all instances created using this flavor will be sent to hosts in host aggregates with pinned=true in their aggregate metadata:

$ nova flavor-key 6 set aggregate_instance_extra_specs:pinned=true

 

Finally, we must add some hosts to our performance host aggregate. Hosts that are not intended to be targets for pinned instances should be added to the normal host aggregate (see nova host-list to get the host name(s)):

    $ nova aggregate-add-host cpu_pinning compute1.nova

    $ nova aggregate-add-host normal compute2.nova

Scheduler Configuration

On each node where the OpenStack Compute Scheduler (openstack-nova-scheduler) runs, edit /etc/nova/nova.conf. Add the AggregateInstanceExtraSpecsFilter and NUMATopologyFilter values to the list of scheduler_default_filters. These filters are used to segregate the compute nodes that can be used for CPU pinning from those that cannot, and to apply NUMA-aware scheduling rules when launching instances:

scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,CoreFilter,NUMATopologyFilter,AggregateInstanceExtraSpecsFilter

 

Once the change has been applied, restart the openstack-nova-scheduler service:

    # systemctl restart openstack-nova-scheduler

 

After the above, as a normal (non-admin) user, try to boot an instance with the newly created flavor:

    $ nova boot --image fedora --flavor 6 test_pinning

 

Confirm the instance has successfully booted and that each of its vCPUs is pinned to a single host CPU by observing the <cputune> element of the generated domain XML:

       # virsh list

     Id    Name                           State

   ----------------------------------------------------

     2    instance-00000001             running

   # virsh dumpxml instance-00000001

    ...

    <vcpu placement='static' cpuset='2-3'>2</vcpu>

      <cputune>

        <vcpupin vcpu='0' cpuset='2'/>

        <vcpupin vcpu='1' cpuset='3'/>

      </cputune>

The resultant output will be quite long, but there are some key elements related to NUMA layout and vCPU pinning to focus on:

  • As you might expect, the vCPU placement for the 2 vCPUs remains static, though a cpuset range is no longer specified alongside it – instead the more specific placement definitions defined later on are used:

<vcpu placement='static'>2</vcpu>

  • The vcpupin and emulatorpin elements have been added. These pin the virtual machine instance's vCPU cores and the associated emulator threads, respectively, to physical host CPU cores. In the current implementation the emulator threads are pinned to the union of all physical CPU cores associated with the guest (physical CPU cores 2-3).

<cputune>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <emulatorpin cpuset='2-3'/>
</cputune>
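The same information can also be queried directly with virsh, without dumping the full XML (the domain name is the one from the virsh list output above):

    # virsh vcpupin instance-00000001
    # virsh emulatorpin instance-00000001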


3. Combining 1 and 2

This lets you both reserve cores for the host and have instances run on pinned CPUs. The procedure is simply to combine section 1 and section 2.
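A minimal sketch of the combination, reusing the values from sections 1 and 2 (the CPU range and flavor name are illustrative):

On the pinned compute hosts, reserve cores for the host OS in /etc/nova/nova.conf:

    [DEFAULT]
    vcpu_pin_set = 4-12,^8,15

Then require dedicated CPUs on the flavor targeted at those hosts:

    $ nova flavor-key pinned.medium set hw:cpu_policy=dedicated
    $ nova flavor-key pinned.medium set aggregate_instance_extra_specs:pinned=true

Guests built from this flavor are then pinned only to CPUs 4-7, 9-12 and 15, leaving the remaining cores for the host OS.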

4. What the above cannot do

None of the above makes it possible to: 1. pin the CPUs of one instance on a host while leaving another instance on the same host unpinned; 2. explicitly bind a specific vCPU of an instance to a specific pCPU of the host. OpenStack's pinning is implicit; you cannot specify the exact pCPU.
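For contrast, explicit vCPU-to-pCPU placement is only exposed at the hypervisor layer, e.g. directly via virsh on the compute node (shown purely as an illustration; changes made this way bypass Nova and may be lost on hard reboot or migration):

    # virsh vcpupin instance-00000001 0 2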



References:
   http://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/
   http://comments.gmane.org/gmane.comp.cloud.openstack.devel/43096
   http://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-cpu-pinning.html
