The nova-compute Periodic Task Mechanism

        This article walks through the code flow of the periodic task mechanisms in the nova-compute component: the resource tracker and the report state task.

        The earlier article on nova-scheduler startup focused on how its RPC server is created. Similarly, when the nova-compute service starts, it creates two green threads for its periodic tasks: the resource tracker, which periodically reports the host's resource information to nova-conductor, and report state, which periodically reports the nova-compute service status; nova-conductor then writes both to the DB. Since the nova-scheduler analysis showed that the interesting code lives in the start method, we again concentrate on nova-compute's start method.

#/nova/cmd/compute.py
def block_db_access():
    class NoDB(object):
        def __getattr__(self, attr):
            return self

        def __call__(self, *args, **kwargs):
            stacktrace = "".join(traceback.format_stack())
            LOG = logging.getLogger('nova.compute')
            LOG.error(_LE('No db access allowed in nova-compute: %s'),
                      stacktrace)
            raise exception.DBNotAllowed('nova-compute')

    nova.db.api.IMPL = NoDB()


def main():
    config.parse_args(sys.argv)
    logging.setup(CONF, 'nova')
    utils.monkey_patch()
    objects.register_all()

    gmr.TextGuruMeditation.setup_autorun(version)

    if not CONF.conductor.use_local:
        block_db_access()
        objects_base.NovaObject.indirection_api = \
            conductor_rpcapi.ConductorAPI()

    server = service.Service.create(binary='nova-compute',
                                    topic=CONF.compute_topic,
                                    db_allowed=CONF.conductor.use_local)
    service.serve(server)
    service.wait()

        The code above is the entry point of the nova-compute service. config.parse_args(sys.argv) creates the transport that the RPC server will need later. CONF.conductor.use_local is False here because the default is used (nova-compute must not access the database directly and has to go through nova-conductor), so the if branch is taken. We return to the code inside the if statement later; its role here is to route all database access through nova-conductor.
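
        As an aside, the NoDB stub installed by block_db_access works because __getattr__ returns the stub itself, so an attribute chain of any depth resolves to the same object and only an actual call raises. A standalone sketch of the idea (plain Python, not nova code; the attribute name below is just an example):

class NoDB(object):
    def __getattr__(self, attr):
        return self

    def __call__(self, *args, **kwargs):
        raise RuntimeError('No db access allowed in nova-compute')

IMPL = NoDB()
probe = IMPL.api.service_get_by_compute_host   # fine: still just the stub
try:
    probe('ctxt', 'host1')                     # the call is what raises
except RuntimeError as e:
    print(e)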

        When service.serve(server) launches the service, the start method we want to analyze is invoked, and service.wait() then blocks until the service exits (for details, see the earlier article on building the OpenStack RPC server).

#/nova/service.py:Service
class Service(service.Service):
    """Service object for binaries running on hosts.

    A service takes a manager and enables rpc by listening to queues based
    on topic. It also periodically runs tasks on the manager and reports
    its state to the database services table.
    """

    def __init__(self, host, binary, topic, manager, report_interval=None,
                 periodic_enable=None, periodic_fuzzy_delay=None,
                 periodic_interval_max=None, db_allowed=True,
                 *args, **kwargs):
        super(Service, self).__init__()
        self.host = host
        self.binary = binary
        self.topic = topic
        self.manager_class_name = manager
        # NOTE(russellb) We want to make sure to create the servicegroup API
        # instance early, before creating other things such as the manager,
        # that will also create a servicegroup API instance.  Internally, the
        # servicegroup only allocates a single instance of the driver API and
        # we want to make sure that our value of db_allowed is there when it
        # gets created.  For that to happen, this has to be the first instance
        # of the servicegroup API.
        self.servicegroup_api = servicegroup.API(db_allowed=db_allowed)
        manager_class = importutils.import_class(self.manager_class_name)
        self.manager = manager_class(host=self.host, *args, **kwargs)
        self.rpcserver = None
        self.report_interval = report_interval
        self.periodic_enable = periodic_enable
        self.periodic_fuzzy_delay = periodic_fuzzy_delay
        self.periodic_interval_max = periodic_interval_max
        self.saved_args, self.saved_kwargs = args, kwargs
        self.backdoor_port = None
        self.conductor_api = conductor.API(use_local=db_allowed)
        self.conductor_api.wait_until_ready(context.get_admin_context())

    def start(self):
        verstr = version.version_string_with_package()
        LOG.info(_LI('Starting %(topic)s node (version %(version)s)'),
                  {'topic': self.topic, 'version': verstr})
        self.basic_config_check()
        self.manager.init_host()
        self.model_disconnected = False
        ctxt = context.get_admin_context()
        try:
            self.service_ref = (
                self.conductor_api.service_get_by_host_and_binary(
                    ctxt, self.host, self.binary))
            self.service_id = self.service_ref['id']
        except exception.NotFound:
            try:
                self.service_ref = self._create_service_ref(ctxt)
            except (exception.ServiceTopicExists,
                    exception.ServiceBinaryExists):
                # NOTE(danms): If we race to create a record with a sibling
                # worker, don't fail here.
                self.service_ref = (
                    self.conductor_api.service_get_by_host_and_binary(
                        ctxt, self.host, self.binary))

        self.manager.pre_start_hook()

        if self.backdoor_port is not None:
            self.manager.backdoor_port = self.backdoor_port

        LOG.debug("Creating RPC server for service %s", self.topic)

        target = messaging.Target(topic=self.topic, server=self.host)

        endpoints = [
            self.manager,
            baserpc.BaseRPCAPI(self.manager.service_name, self.backdoor_port)
        ]
        endpoints.extend(self.manager.additional_endpoints)

        serializer = objects_base.NovaObjectSerializer()

        self.rpcserver = rpc.get_server(target, endpoints, serializer)
        self.rpcserver.start()

        self.manager.post_start_hook()

        LOG.debug("Join ServiceGroup membership for this service %s",
                  self.topic)
        # Add service to the ServiceGroup membership group.
        self.servicegroup_api.join(self.host, self.topic, self)

        if self.periodic_enable:
            if self.periodic_fuzzy_delay:
                initial_delay = random.randint(0, self.periodic_fuzzy_delay)
            else:
                initial_delay = None

            self.tg.add_dynamic_timer(self.periodic_tasks,
                                     initial_delay=initial_delay,
                                     periodic_interval_max=
                                        self.periodic_interval_max)

        The servicegroup_api attribute here drives the report state periodic task, which reports the nova-compute service status to the database; self.tg.add_dynamic_timer drives the resource tracker periodic task, which reports the underlying host's resource information (disk, cpu, memory, and so on) to the database.

        1. self.servicegroup_api = servicegroup.API(db_allowed=db_allowed)

_driver_name_class_mapping = {
    'db': 'nova.servicegroup.drivers.db.DbDriver',
    'zk': 'nova.servicegroup.drivers.zk.ZooKeeperDriver',
    'mc': 'nova.servicegroup.drivers.mc.MemcachedDriver'
}
_default_driver = 'db'
servicegroup_driver_opt = cfg.StrOpt('servicegroup_driver',
                                     default=_default_driver,
                                     help='The driver for servicegroup '
                                          'service (valid options are: '
                                          'db, zk, mc)')

CONF = cfg.CONF
CONF.register_opt(servicegroup_driver_opt)

# NOTE(geekinutah): By default drivers wait 5 seconds before reporting
INITIAL_REPORTING_DELAY = 5

#/nova/servicegroup/api.py:API
class API(object):

    def __init__(self, *args, **kwargs):
        '''Create an instance of the servicegroup API.

        args and kwargs are passed down to the servicegroup driver when it gets
        created.
        '''
        # Make sure report interval is less than service down time
        report_interval = CONF.report_interval
        if CONF.service_down_time <= report_interval:
            new_service_down_time = int(report_interval * 2.5)
            LOG.warning(_LW("Report interval must be less than service down "
                            "time. Current config: <service_down_time: "
                            "%(service_down_time)s, report_interval: "
                            "%(report_interval)s>. Setting service_down_time "
                            "to: %(new_service_down_time)s"),
                        {'service_down_time': CONF.service_down_time,
                         'report_interval': report_interval,
                         'new_service_down_time': new_service_down_time})
            CONF.set_override('service_down_time', new_service_down_time)
        LOG.debug('ServiceGroup driver defined as an instance of %s',
                  str(CONF.servicegroup_driver))
        driver_name = CONF.servicegroup_driver
        try:
            driver_class = _driver_name_class_mapping[driver_name]
        except KeyError:
            raise TypeError(_("unknown ServiceGroup driver name: %s")
                            % driver_name)
        self._driver = importutils.import_object(driver_class,
                                                 *args, **kwargs)

        The API initializer above does two things. First, it checks that the interval at which nova-compute's state is reported is smaller than the service down time; if the report interval exceeded the down time, the reports could not reflect a down nova-compute accurately, so service_down_time is bumped up (for example, with report_interval=10 and service_down_time=10, service_down_time is overridden to int(10 * 2.5) = 25). Second, it creates the servicegroup driver object named in the configuration; here we use the default, the db driver. So what does creating the db driver involve?

#/nova/servicegroup/drivers/db.py:DbDriver
class DbDriver(base.Driver):

    def __init__(self, *args, **kwargs):
        """Creates an instance of the DB-based servicegroup driver.

        Valid kwargs are:

        db_allowed - Boolean. False if direct db access is not allowed and
                     alternative data access (conductor) should be used
                     instead.
        """
        self.db_allowed = kwargs.get('db_allowed', True)
        self.conductor_api = conductor.API(use_local=self.db_allowed)
        self.service_down_time = CONF.service_down_time

        Here, db_allowed in kwargs is False for nova-compute, so self.db_allowed = False, and a conductor API object is created from that value, as follows.

#/nova/conductor/__init__.py:API
def API(*args, **kwargs):
    use_local = kwargs.pop('use_local', False)
    if oslo_config.cfg.CONF.conductor.use_local or use_local:
        api = conductor_api.LocalAPI
    else:
        api = conductor_api.API
    return api(*args, **kwargs)

        Because use_local = False, api = conductor_api.API is selected. That class reaches the conductor manager over RPC, so nova-compute's report state reaches nova-conductor via RPC.

2. self.manager.pre_start_hook()

# nova/compute/manager.py:ComputeManager
    def pre_start_hook(self):
        """After the service is initialized, but before we fully bring
        the service up by listening on RPC queues, make sure to update
        our available resources (and indirectly our available nodes).
        """
        self.update_available_resource(nova.context.get_admin_context())

    @periodic_task.periodic_task
    def update_available_resource(self, context):
        """See driver.get_available_resource()

        Periodic process that keeps that the compute host's understanding of
        resource availability and usage in sync with the underlying hypervisor.

        :param context: security context
        """
        new_resource_tracker_dict = {}
        nodenames = set(self.driver.get_available_nodes())
        for nodename in nodenames:
            rt = self._get_resource_tracker(nodename)
            rt.update_available_resource(context)
            new_resource_tracker_dict[nodename] = rt

        # Delete orphan compute node not reported by driver but still in db
        compute_nodes_in_db = self._get_compute_nodes_in_db(context,
                                                            use_slave=True)

        for cn in compute_nodes_in_db:
            if cn.hypervisor_hostname not in nodenames:
                LOG.info(_LI("Deleting orphan compute node %s") % cn.id)
                cn.destroy()

        self._resource_tracker_dict = new_resource_tracker_dict

        pre_start_hook updates the host's resource information in the database before nova-compute's RPC server is created. The @periodic_task.periodic_task decorator on update_available_resource marks it for periodic execution; we analyze that machinery later, and focus first on how the method reports host resources to the database.

        nodenames is the set of node names obtained through the libvirt interface; a ResourceTracker object is then created for each, as follows.

# nova/compute/manager.py:ComputeManager
    def _get_resource_tracker(self, nodename):
        rt = self._resource_tracker_dict.get(nodename)
        if not rt:
            if not self.driver.node_is_available(nodename):
                raise exception.NovaException(
                        _("%s is not a valid node managed by this "
                          "compute host.") % nodename)

            rt = resource_tracker.ResourceTracker(self.host,
                                                  self.driver,
                                                  nodename)
            self._resource_tracker_dict[nodename] = rt
        return rt

        As shown above, _get_resource_tracker first checks whether self._resource_tracker_dict already holds a ResourceTracker for nodename; if so, it is reused, otherwise a new ResourceTracker is created and added to the dict.

        The ResourceTracker's update_available_resource method is then called to refresh the database, as follows.

# nova/compute/resource_tracker.py:ResourceTracker
    def update_available_resource(self, context):
        """Override in-memory calculations of compute node resource usage based
        on data audited from the hypervisor layer.

        Add in resource claims in progress to account for operations that have
        declared a need for resources, but not necessarily retrieved them from
        the hypervisor layer yet.
        """
        LOG.info(_LI("Auditing locally available compute resources for "
                     "node %(node)s"),
                 {'node': self.nodename})
        resources = self.driver.get_available_resource(self.nodename)

        if not resources:
            # The virt driver does not support this function
            LOG.info(_LI("Virt driver does not support "
                 "'get_available_resource'. Compute tracking is disabled."))
            self.compute_node = None
            return
        resources['host_ip'] = CONF.my_ip

        # We want the 'cpu_info' to be None from the POV of the
        # virt driver, but the DB requires it to be non-null so
        # just force it to empty string
        if ("cpu_info" not in resources or
            resources["cpu_info"] is None):
            resources["cpu_info"] = ''

        # TODO(berrange): remove this once all virt drivers are updated
        # to report topology
        if "numa_topology" not in resources:
            resources["numa_topology"] = None

        self._verify_resources(resources)

        self._report_hypervisor_resource_view(resources)

        self._update_available_resource(context, resources)

        resources = self.driver.get_available_resource(self.nodename) queries the host's resource information (cpu, memory, disk, and so on) from the hypervisor layer through the libvirt interface. self._verify_resources(resources) then checks that the queried resources are complete and raises if they are not, and self._report_hypervisor_resource_view(resources) logs the gathered view at debug level. A sketch of the resources dict follows; after that we focus on self._update_available_resource(context, resources).
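
        For orientation, here is a hedged, abbreviated sketch of the kind of dict the libvirt driver returns; the keys are representative, the values invented, and the exact set varies by driver and release:

# Illustrative only: a representative resources dict as assembled by
# get_available_resource plus the host_ip added above.
resources = {
    'vcpus': 8,
    'vcpus_used': 2,
    'memory_mb': 16384,
    'memory_mb_used': 4096,
    'local_gb': 200,
    'local_gb_used': 40,
    'hypervisor_type': 'QEMU',
    'hypervisor_version': 2001000,
    'hypervisor_hostname': 'compute1',
    'cpu_info': '{"vendor": "Intel", "topology": {"cores": 4}}',
    'numa_topology': None,
    'host_ip': '192.0.2.10',
}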

# nova/compute/resource_tracker.py:ResourceTracker
    @utils.synchronized(COMPUTE_RESOURCE_SEMAPHORE)
    def _update_available_resource(self, context, resources):

        # initialise the compute node object, creating it
        # if it does not already exist.
        self._init_compute_node(context, resources)

        # if we could not init the compute node the tracker will be
        # disabled and we should quit now
        if self.disabled:
            return

        if 'pci_passthrough_devices' in resources:
            devs = []
            for dev in jsonutils.loads(resources.pop(
                'pci_passthrough_devices')):
                if dev['dev_type'] == 'type-PF':
                    continue

                if self.pci_filter.device_assignable(dev):
                    devs.append(dev)

            if not self.pci_tracker:
                n_id = self.compute_node['id'] if self.compute_node else None
                self.pci_tracker = pci_manager.PciDevTracker(context,
                                                             node_id=n_id)
            self.pci_tracker.set_hvdevs(devs)

        # Grab all instances assigned to this node:
        instances = objects.InstanceList.get_by_host_and_node(
            context, self.host, self.nodename,
            expected_attrs=['system_metadata',
                            'numa_topology'])

        # Now calculate usage based on instance utilization:
        self._update_usage_from_instances(context, resources, instances)

        # Grab all in-progress migrations:
        migrations = objects.MigrationList.get_in_progress_by_host_and_node(
                context, self.host, self.nodename)

        self._update_usage_from_migrations(context, resources, migrations)

        # Detect and account for orphaned instances that may exist on the
        # hypervisor, but are not in the DB:
        orphans = self._find_orphaned_instances()
        self._update_usage_from_orphans(context, resources, orphans)

        # NOTE(yjiang5): Because pci device tracker status is not cleared in
        # this periodic task, and also because the resource tracker is not
        # notified when instances are deleted, we need remove all usages
        # from deleted instances.
        if self.pci_tracker:
            self.pci_tracker.clean_usage(instances, migrations, orphans)
            resources['pci_device_pools'] = self.pci_tracker.stats
        else:
            resources['pci_device_pools'] = []

        self._report_final_resource_view(resources)

        metrics = self._get_host_metrics(context, self.nodename)
        resources['metrics'] = jsonutils.dumps(metrics)

        # TODO(sbauza): Juno compute nodes are missing the host field and
        # the Juno ResourceTracker does not set this field, even if
        # the ComputeNode object can show it.
        # Unfortunately, as we're not yet using ComputeNode.save(), we need
        # to add this field in the resources dict until the RT is using
        # the ComputeNode.save() method for populating the table.
        # tl;dr: To be removed once RT is using ComputeNode.save()
        resources['host'] = self.host

        self._update(context, resources)
        LOG.info(_LI('Compute_service record updated for %(host)s:%(node)s'),
                     {'host': self.host, 'node': self.nodename})

        First, _update_available_resource acquires the COMPUTE_RESOURCE_SEMAPHORE lock and then performs the steps below. Note that in the Juno release, analyzed together with the underlying RabbitMQ layer, there is a serious bug here. Suppose the resource tracker thread holds the lock when RabbitMQ suddenly dies (the RabbitMQ process is killed), so its message never reaches nova-conductor. When RabbitMQ recovers the message is resent, but nova-conductor reconnects to RabbitMQ more slowly than the resource tracker thread, so the send fails again; the resource tracker thread nevertheless believes it succeeded and waits for the reply matching its msg_id with timeout=60s. (The timeout means that if no message at all arrives within 60s, MessagingTimeout is raised to the caller and the lock is released; but if a message arrives within 60s whose msg_id does not match, the thread simply stashes it in a queue, resets the 60s timeout, and keeps waiting, without releasing the lock.) By this point the connections from the controller node and the compute node to RabbitMQ have been re-established, and the report state thread sends a message to nova-conductor every 10s, so the resource tracker thread keeps receiving messages and never times out: a dead loop. Meanwhile, a VM-creation thread executing instance_claim needs the same lock, which the resource tracker thread, stuck in its RabbitMQ dead loop, never releases, so the VM-creation thread blocks. This bug was fixed in Kilo. Now back to the code.
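
        The failure mode reduces to simple lock contention. A minimal standalone sketch (plain threading, not nova or eventlet code) of why a stuck periodic task blocks instance claims:

import threading
import time

COMPUTE_RESOURCE_SEMAPHORE = threading.Lock()

def update_available_resource_stuck():
    with COMPUTE_RESOURCE_SEMAPHORE:
        # Stands in for the endless reply wait described above: every
        # unrelated message resets the 60s timeout, so the lock is never
        # released.
        threading.Event().wait()

def instance_claim():
    # The VM-creation path contends for the same per-host lock.
    acquired = COMPUTE_RESOURCE_SEMAPHORE.acquire(timeout=2)
    print('instance_claim acquired the lock:', acquired)   # prints False

threading.Thread(target=update_available_resource_stuck, daemon=True).start()
time.sleep(0.1)
instance_claim()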

        Next, self.compute_node must be initialized; if it ends up with no value, the resource tracker is disabled. The PCI-related resource information is then updated. After that, instance information is queried so that per-instance usage can be computed: instances and in-progress migrations come from the database, while orphaned instances, which exist on the hypervisor but not in the database, have to be discovered through the hypervisor layer. The final resource view is logged at info level, and finally self._update(context, resources) pushes the result toward the database. How? Read on.

# nova/compute/resource_tracker.py:ResourceTracker
    def _update(self, context, values):
        """Update partial stats locally and populate them to Scheduler."""
        self._write_ext_resources(values)
        # NOTE(pmurray): the stats field is stored as a json string. The
        # json conversion will be done automatically by the ComputeNode object
        # so this can be removed when using ComputeNode.
        values['stats'] = jsonutils.dumps(values['stats'])

        if not self._resource_change(values):
            return
        if "service" in self.compute_node:
            del self.compute_node['service']
        # NOTE(sbauza): Now the DB update is asynchronous, we need to locally
        #               update the values
        self.compute_node.update(values)
        # Persist the stats to the Scheduler
        self._update_resource_stats(context, values)
        if self.pci_tracker:
            self.pci_tracker.save(context)

    def _update_resource_stats(self, context, values):
        stats = values.copy()
        stats['id'] = self.compute_node['id']
        self.scheduler_client.update_resource_stats(
            context, (self.host, self.nodename), stats)

# nova/scheduler/client/__init__.py:SchedulerClient
    def update_resource_stats(self, context, name, stats):
        self.reportclient.update_resource_stats(context, name, stats)

# nova/scheduler/client/report.py:SchedulerReportClient
    def update_resource_stats(self, context, name, stats):
        """Creates or updates stats for the desired service.

        :param context: local context
        :param name: name of resource to update
        :type name: immutable (str or tuple)
        :param stats: updated stats to send to scheduler
        :type stats: dict
        """

        if 'id' in stats:
            compute_node_id = stats['id']
            updates = stats.copy()
            del updates['id']
        else:
            raise exception.ComputeHostNotCreated(name=str(name))

        if 'stats' in updates:
            # NOTE(danms): This is currently pre-serialized for us,
            # which we don't want if we're using the object. So,
            # fix it here, and follow up with removing this when the
            # RT is converted to proper objects.
            updates['stats'] = jsonutils.loads(updates['stats'])
        compute_node = objects.ComputeNode(context=context,
                                           id=compute_node_id)
        compute_node.obj_reset_changes()
        for k, v in updates.items():
            if k == 'pci_device_pools':
                # NOTE(danms): Since the updates are actually the result of
                # a obj_to_primitive() on some real objects, we need to convert
                # back to a real object (not from_dict() or _from_db_object(),
                # which expect a db-formatted object) but just an attr-based
                # reconstruction. When we start getting a ComputeNode from
                # scheduler this "bandage" can go away.
                if v:
                    devpools = [objects.PciDevicePool.from_dict(x) for x in v]
                else:
                    devpools = []
                compute_node.pci_device_pools = objects.PciDevicePoolList(
                    objects=devpools)
            else:
                setattr(compute_node, k, v)
        compute_node.save()

        LOG.info(_LI('Compute_service record updated for '
                 '%s') % str(name))

        Ultimately a ComputeNode object is created, its attributes are updated, and the information is handed to nova-conductor, which writes it to the DB. Note that it is compute_node.save() that performs the update. How does that work? Let us first outline the logic, then walk through the code.


        The Object Model, proposed by Dan Smith of Red Hat, was introduced in Icehouse and essentially completed in Juno. It marks a watershed in how nova accesses the database. Previously, the operations on each table lived together in one file (flavor.py, say) whose functions were called directly to modify the database. With the Object Model, a Flavor class corresponds to the flavor table, all operations on the table are encapsulated in the Flavor object, and database access goes through the object's methods.

        With the Object Model in place, nova-compute accesses the database as follows. When nova-compute and nova-conductor are not deployed on the same node, nova-compute updates the database by invoking, through the Object Model, the RPC interface exposed by nova.conductor.rpcapi.ConductorAPI; upon receiving the RPC request, nova-conductor completes the database update through its own local Object Model. (Before the Object Model, nova-compute issued such database calls itself.)

        When nova-compute and nova-conductor are deployed on the same node, nova-compute operates on the database directly through the Object Model wrappers, without involving nova-conductor.
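
        The contrast can be sketched in a few lines; the calls below are approximations of the pattern for illustration, not exact nova APIs:

# Before the Object Model: callers hit the DB API layer directly.
#   db.compute_node_update(context, compute_id, {'vcpus_used': 4})
#
# With the Object Model: mutate the object, then let save() decide whether
# the write happens locally or is proxied to nova-conductor over RPC.
#   node = objects.ComputeNode.get_by_id(context, compute_id)
#   node.vcpus_used = 4
#   node.save()   # remote via conductor when NovaObject.indirection_api is set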

        Below we use compute_node.save() to trace the Object Model call flow; note that in our deployment nova-compute and nova-conductor are not on the same node. The Object Model code lives under nova/objects.

# compute node: nova/objects/compute_node.py:ComputeNode
    @base.remotable
    def save(self, prune_stats=False):
        # NOTE(belliott) ignore prune_stats param, no longer relevant

        updates = self.obj_get_changes()
        updates.pop('id', None)
        self._convert_stats_to_db_format(updates)
        self._convert_host_ip_to_db_format(updates)
        self._convert_supported_instances_to_db_format(updates)
        self._convert_pci_stats_to_db_format(updates)

        db_compute = db.compute_node_update(self._context, self.id, updates)
        self._from_db_object(self._context, self, db_compute)

        Note that save carries a @base.remotable decorator, which is where the Object Model does its work. From here on, each code location is prefixed with compute node or controller node to make clear where it executes (code shown earlier without a prefix runs on the compute node as part of nova-compute). So what does the decorator do?

# compute node: nova/objects/base.py
# See comment above for remotable_classmethod()
#
# Note that this will use either the provided context, or the one
# stashed in the object. If neither are present, the object is
# "orphaned" and remotable methods cannot be called.
def remotable(fn):
    """Decorator for remotable object methods."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        if args and isinstance(args[0], context.RequestContext):
            raise exception.ObjectActionError(
                action=fn.__name__,
                reason='Calling remotables with context is deprecated')
        if self._context is None:
            raise exception.OrphanedObjectError(method=fn.__name__,
                                                objtype=self.obj_name())
        if NovaObject.indirection_api:
            updates, result = NovaObject.indirection_api.object_action(
                self._context, self, fn.__name__, args, kwargs)
            for key, value in updates.iteritems():
                if key in self.fields:
                    field = self.fields[key]
                    # NOTE(ndipanov): Since NovaObjectSerializer will have
                    # deserialized any object fields into objects already,
                    # we do not try to deserialize them again here.
                    if isinstance(value, NovaObject):
                        setattr(self, key, value)
                    else:
                        setattr(self, key,
                                field.from_primitive(self, key, value))
            self.obj_reset_changes()
            self._changed_fields = set(updates.get('obj_what_changed', []))
            return result
        else:
            return fn(self, *args, **kwargs)

    wrapper.remotable = True
    wrapper.original_fn = fn
    return wrapper
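
        The dispatch decision inside the wrapper can be emulated standalone (not nova code): one process-wide switch decides whether save() runs locally or is forwarded to a conductor-like proxy.

class FakeIndirectionAPI(object):
    def object_action(self, objinst, objmethod, args, kwargs):
        print('RPC -> nova-conductor: %s()' % objmethod)
        return getattr(objinst, '_local_' + objmethod)(*args, **kwargs)

class FakeObject(object):
    indirection_api = None   # nova-compute's main() sets this; conductor never does

    def save(self):
        if FakeObject.indirection_api:
            return FakeObject.indirection_api.object_action(self, 'save', (), {})
        return self._local_save()

    def _local_save(self):
        print('writing to the DB locally')

obj = FakeObject()
obj.save()                                    # conductor side: local DB write
FakeObject.indirection_api = FakeIndirectionAPI()
obj.save()                                    # compute side: proxied dispatch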

        As the remotable wrapper shows, if NovaObject.indirection_api was set when nova-compute started, save is effectively executed on the controller node. Recall main in nova/cmd/compute.py: because nova-compute and nova-conductor are not deployed on the same node, CONF.conductor.use_local is False, so NovaObject.indirection_api (a class-level attribute) is set to conductor_rpcapi.ConductorAPI(), and that object's object_action method is what gets invoked, as follows.

# compute node: nova/conductor/rpcapi.py:ConductorAPI
    def object_action(self, context, objinst, objmethod, args, kwargs):
        cctxt = self.client.prepare()
        return cctxt.call(context, 'object_action', objinst=objinst,
                          objmethod=objmethod, args=args, kwargs=kwargs)

        As shown above, this method makes a RabbitMQ RPC call that executes object_action on the remote ConductorManager. We will not descend into the RabbitMQ-level code here; see the earlier article on building the RPC server.

# controller node: nova/conductor/manager.py:ConductorManager
    def object_action(self, context, objinst, objmethod, args, kwargs):
        """Perform an action on an object."""
        oldobj = objinst.obj_clone()
        result = self._object_dispatch(objinst, objmethod, args, kwargs)
        updates = dict()
        # NOTE(danms): Diff the object with the one passed to us and
        # generate a list of changes to forward back
        for name, field in objinst.fields.items():
            if not objinst.obj_attr_is_set(name):
                # Avoid demand-loading anything
                continue
            if (not oldobj.obj_attr_is_set(name) or
                    getattr(oldobj, name) != getattr(objinst, name)):
                updates[name] = field.to_primitive(objinst, name,
                                                   getattr(objinst, name))
        # This is safe since a field named this would conflict with the
        # method anyway
        updates['obj_what_changed'] = objinst.obj_what_changed()
        return updates, result

        So the call arrives, via the RabbitMQ layer, at the object_action method of the ConductorManager object on the controller node. How is save itself executed remotely? Through the _object_dispatch call inside object_action, as follows.

# controller node: nova/conductor/manager.py:ConductorManager
    def _object_dispatch(self, target, method, args, kwargs):
        """Dispatch a call to an object method.

        This ensures that object methods get called and any exception
        that is raised gets wrapped in an ExpectedException for forwarding
        back to the caller (without spamming the conductor logs).
        """
        try:
            # NOTE(danms): Keep the getattr inside the try block since
            # a missing method is really a client problem
            return getattr(target, method)(*args, **kwargs)
        except Exception:
            raise messaging.ExpectedException()

        Which finally executes the code below.

# controller node: nova/objects/compute_node.py:ComputeNode
    @base.remotable
    def save(self, prune_stats=False):
        # NOTE(belliott) ignore prune_stats param, no longer relevant

        updates = self.obj_get_changes()
        updates.pop('id', None)
        self._convert_stats_to_db_format(updates)
        self._convert_host_ip_to_db_format(updates)
        self._convert_supported_instances_to_db_format(updates)
        self._convert_pci_stats_to_db_format(updates)

        db_compute = db.compute_node_update(self._context, self.id, updates)
        self._from_db_object(self._context, self, db_compute)

        But wait: don't the controller node and the compute node share the same code base? Wouldn't save on the controller node trigger yet another remote RPC call? No. The controller node does not run nova-compute, so NovaObject.indirection_api was never set there, and the decorator dispatches locally: the body of save runs on the controller node and finally writes nova-compute's resource information to the DB.

        Next we focus on how the resource tracker and report state periodic tasks push their information to the DB.

        3. self.servicegroup_api.join(self.host, self.topic, self)

        Section 1 already established that self.servicegroup_api is the API object from nova/servicegroup/api.py (its driver mapping and __init__ were shown there), so its join method is called:

# nova/servicegroup/api.py:API
    def join(self, member, group, service=None):
        """Add a new member to a service group.

        :param member: the joined member ID/name
        :param group: the group ID/name, of the joined member
        :param service: a `nova.service.Service` object
        """
        return self._driver.join(member, group, service)

        As shown earlier, if servicegroup_driver is not set in the configuration file the default is 'db', so self._driver is the DbDriver object from nova/servicegroup/drivers/db.py, and its join method is invoked.

# nova/servicegroup/drivers/db.py:DbDriver
    def join(self, member, group, service=None):
        """Add a new member to a service group.

        :param member: the joined member ID/name
        :param group: the group ID/name, of the joined member
        :param service: a `nova.service.Service` object
        """
        LOG.debug('DB_Driver: join new ServiceGroup member %(member)s to '
                  'the %(group)s group, service = %(service)s',
                  {'member': member, 'group': group,
                   'service': service})
        if service is None:
            raise RuntimeError(_('service is a mandatory argument for DB based'
                                 ' ServiceGroup driver'))
        report_interval = service.report_interval
        if report_interval:
            service.tg.add_timer(report_interval, self._report_state,
                                 api.INITIAL_REPORTING_DELAY, service)

        report_interval is the interval at which the report state thread periodically reports the nova-service status to the database; it defaults to 10s. service.tg.add_timer is then called to start the periodic reporting. Let us continue.

# nova/openstack/common/threadgroup.py:ThreadGroup
    def add_timer(self, interval, callback, initial_delay=None,
                  *args, **kwargs):
        pulse = loopingcall.FixedIntervalLoopingCall(callback, *args, **kwargs)
        pulse.start(interval=interval,
                    initial_delay=initial_delay)
        self.timers.append(pulse)

# nova/openstack/common/loopingcall.py:LoopingCallBase
class LoopingCallBase(object):
    def __init__(self, f=None, *args, **kw):
        self.args = args
        self.kw = kw
        self.f = f
        self._running = False
        self.done = None

    def stop(self):
        self._running = False

    def wait(self):
        return self.done.wait()

# nova/openstack/common/loopingcall.py:FixedIntervalLoopingCall
class FixedIntervalLoopingCall(LoopingCallBase):
    """A fixed interval looping call."""

    def start(self, interval, initial_delay=None):
        self._running = True
        done = event.Event()

        def _inner():
            if initial_delay:
                greenthread.sleep(initial_delay)

            try:
                while self._running:
                    start = _ts()
                    self.f(*self.args, **self.kw)
                    end = _ts()
                    if not self._running:
                        break
                    delay = end - start - interval
                    if delay > 0:
                        LOG.warn(_LW('task %(func_name)r run outlasted '
                                     'interval by %(delay).2f sec'),
                                 {'func_name': self.f, 'delay': delay})
                    greenthread.sleep(-delay if delay < 0 else 0)
            except LoopingCallDone as e:
                self.stop()
                done.send(e.retvalue)
            except Exception:
                LOG.exception(_LE('in fixed duration looping call'))
                done.send_exception(*sys.exc_info())
                return
            else:
                done.send(True)

        self.done = done

        greenthread.spawn_n(_inner)
        return self.done
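
        Before the prose walkthrough, a standalone sketch of the sleep arithmetic inside _inner above (durations are hypothetical): delay = (end - start) - interval, where a positive delay means the callback overran its slot and a negative one means the thread sleeps away the remainder.

interval = 10.0                       # report_interval default
for elapsed in (2.5, 12.0):           # two hypothetical callback durations
    delay = elapsed - interval
    if delay > 0:
        print('run outlasted interval by %.2f sec; sleeping 0 sec' % delay)
    else:
        print('sleeping %.2f sec to keep the 10s cadence' % -delay)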

        As shown above, add_timer calls start on a FixedIntervalLoopingCall, which spawns a green thread to periodically execute _report_state of the DbDriver in nova/servicegroup/drivers/db.py. Before the loop begins, the thread first sleeps for initial_delay (5s by default here). Within the loop, if a report takes longer than the 10s interval a warning is logged; otherwise the thread sleeps away the remainder of the interval so that the nova-service status is reported every 10s. So what does _report_state actually do?

# nova/servicegroup/drivers/db.py:DbDriver
    def _report_state(self, service):
        """Update the state of this service in the datastore."""
        ctxt = context.get_admin_context()
        state_catalog = {}
        try:
            report_count = service.service_ref['report_count'] + 1
            state_catalog['report_count'] = report_count

            service.service_ref = self.conductor_api.service_update(ctxt,
                    service.service_ref, state_catalog)

            # TODO(termie): make this pattern be more elegant.
            if getattr(service, 'model_disconnected', False):
                service.model_disconnected = False
                LOG.error(_LE('Recovered model server connection!'))

        # TODO(vish): this should probably only catch connection errors
        except Exception:
            if not getattr(service, 'model_disconnected', False):
                service.model_disconnected = True
                LOG.exception(_LE('model server went away'))

        As the code shows, _report_state simply calls into nova-conductor's rpcapi.py, which remotely (via RabbitMQ) executes the corresponding method in nova-conductor's manager.py to update the nova-service status in the database.

        This completes the report state thread's flow: every 10s it reports the nova-service status to nova-conductor, which forwards the status to the DB, where it is updated.

        4. self.tg.add_dynamic_timer(self.periodic_tasks, initial_delay=initial_delay, periodic_interval_max=self.periodic_interval_max)

        Next we analyze how the resource tracker thread pushes its information to the DB.

# nova/openstack/common/threadgroup.py:ThreadGroup
    def add_dynamic_timer(self, callback, initial_delay=None,
                          periodic_interval_max=None, *args, **kwargs):
        timer = loopingcall.DynamicLoopingCall(callback, *args, **kwargs)
        timer.start(initial_delay=initial_delay,
                    periodic_interval_max=periodic_interval_max)
        self.timers.append(timer)

# nova/openstack/common/loopingcall.py:DynamicLoopingCall
class DynamicLoopingCall(LoopingCallBase):
    """A looping call which sleeps until the next known event.

    The function called should return how long to sleep for before being
    called again.
    """

    def start(self, initial_delay=None, periodic_interval_max=None):
        self._running = True
        done = event.Event()

        def _inner():
            if initial_delay:
                greenthread.sleep(initial_delay)

            try:
                while self._running:
                    idle = self.f(*self.args, **self.kw)
                    if not self._running:
                        break

                    if periodic_interval_max is not None:
                        idle = min(idle, periodic_interval_max)
                    LOG.debug('Dynamic looping call %(func_name)r sleeping '
                              'for %(idle).02f seconds',
                              {'func_name': self.f, 'idle': idle})
                    greenthread.sleep(idle)
            except LoopingCallDone as e:
                self.stop()
                done.send(e.retvalue)
            except Exception:
                LOG.exception(_LE('in dynamic looping call'))
                done.send_exception(*sys.exc_info())
                return
            else:
                done.send(True)

        self.done = done

        greenthread.spawn(_inner)
        return self.done

        Here the resource tracker first constructs a DynamicLoopingCall object, then calls its start method to spawn a green thread, the resource tracker thread, which runs the callback passed to add_dynamic_timer: the periodic_tasks method in nova/service.py, shown below.

# nova/service.py:Service
    def periodic_tasks(self, raise_on_error=False):
        """Tasks to be run at a periodic interval."""
        ctxt = context.get_admin_context()
        return self.manager.periodic_tasks(ctxt, raise_on_error=raise_on_error)

# nova/manager.py:Manager
    def periodic_tasks(self, context, raise_on_error=False):
        """Tasks to be run at a periodic interval."""
        return self.run_periodic_tasks(context, raise_on_error=raise_on_error)

# nova/openstack/common/periodic_task.py:PeriodicTasks
    def run_periodic_tasks(self, context, raise_on_error=False):
        """Tasks to be run at a periodic interval."""
        idle_for = DEFAULT_INTERVAL
        for task_name, task in self._periodic_tasks:
            full_task_name = '.'.join([self.__class__.__name__, task_name])

            spacing = self._periodic_spacing[task_name]
            last_run = self._periodic_last_run[task_name]

            # Check if due, if not skip
            idle_for = min(idle_for, spacing)
            if last_run is not None:
                delta = last_run + spacing - time.time()
                if delta > 0:
                    idle_for = min(idle_for, delta)
                    continue

            LOG.debug("Running periodic task %(full_task_name)s",
                      {"full_task_name": full_task_name})
            self._periodic_last_run[task_name] = _nearest_boundary(
                last_run, spacing)

            try:
                task(self, context)
            except Exception as e:
                if raise_on_error:
                    raise
                LOG.exception(_LE("Error during %(full_task_name)s: %(e)s"),
                              {"full_task_name": full_task_name, "e": e})
            time.sleep(0)

        return idle_for
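
        A worked, standalone sketch of the scheduling arithmetic in run_periodic_tasks above (task names, spacings, and timestamps are hypothetical): idle_for starts at DEFAULT_INTERVAL and shrinks toward the soonest pending task, while overdue tasks run immediately.

import time

DEFAULT_INTERVAL = 60.0
now = time.time()
# task name -> (spacing in seconds, last run timestamp)
tasks = {
    'update_available_resource': (60.0, now - 61.0),   # overdue: runs now
    '_heal_instance_info_cache': (10.0, now - 2.0),    # due again in ~8s
}

idle_for = DEFAULT_INTERVAL
for name, (spacing, last_run) in tasks.items():
    idle_for = min(idle_for, spacing)
    delta = last_run + spacing - now
    if delta > 0:
        idle_for = min(idle_for, delta)    # not due yet; wake when it is
        continue
    print('running periodic task', name)
print('resource tracker thread sleeps %.1f sec' % idle_for)   # ~8.0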

        Note that self.manager in nova/service.py:Service.periodic_tasks is the nova/compute/manager.py:ComputeManager object. ComputeManager inherits from nova/manager.py:Manager and does not override periodic_tasks, so the Manager implementation runs, which in turn calls run_periodic_tasks. So what is self._periodic_tasks, and where does it come from? Let us dig in.

        First, consider the inheritance chain below.

# nova/compute/manager.py:ComputeManager
class ComputeManager(manager.Manager):

# nova/manager.py:Manager
class Manager(base.Base, periodic_task.PeriodicTasks):

# nova/openstack/common/periodic_task.py:PeriodicTasks
@six.add_metaclass(_PeriodicTasksMeta)
class PeriodicTasks(object):

# nova/openstack/common/periodic_task.py:_PeriodicTasksMeta
class _PeriodicTasksMeta(type):

        ComputeManager ultimately inherits from PeriodicTasks, the top-level base class, which declares a metaclass. That metaclass collects, in every subclass (Manager and ComputeManager here), the methods that carry a _periodic_task attribute so that the resource tracker thread can execute them periodically. So when does the _periodic_task attribute get attached to those methods? We illustrate with update_available_resource, one of the periodically executed methods.

# nova/compute/manager.py:ComputeManager
    @periodic_task.periodic_task
    def update_available_resource(self, context):
        ...  # full body shown in Section 2 above

        As the snippet shows, the method carries the @periodic_task.periodic_task decorator, and it is this decorator that attaches the _periodic_task attribute. Let us see what the decorator does.

# nova/openstack/common/periodic_task.py
def periodic_task(*args, **kwargs):
    """Decorator to indicate that a method is a periodic task.

    This decorator can be used in two ways:

        1. Without arguments '@periodic_task', this will be run on the default
           interval of 60 seconds.

        2. With arguments:
           @periodic_task(spacing=N [, run_immediately=[True|False]]
           [, name=[None|"string"])
           this will be run on approximately every N seconds. If this number is
           negative the periodic task will be disabled. If the run_immediately
           argument is provided and has a value of 'True', the first run of the
           task will be shortly after task scheduler starts.  If
           run_immediately is omitted or set to 'False', the first time the
           task runs will be approximately N seconds after the task scheduler
           starts. If name is not provided, __name__ of function is used.
    """
    def decorator(f):
        # Test for old style invocation
        if 'ticks_between_runs' in kwargs:
            raise InvalidPeriodicTaskArg(arg='ticks_between_runs')

        # Control if run at all
        f._periodic_task = True
        f._periodic_external_ok = kwargs.pop('external_process_ok', False)
        if f._periodic_external_ok and not CONF.run_external_periodic_tasks:
            f._periodic_enabled = False
        else:
            f._periodic_enabled = kwargs.pop('enabled', True)
        f._periodic_name = kwargs.pop('name', f.__name__)

        # Control frequency
        f._periodic_spacing = kwargs.pop('spacing', 0)
        f._periodic_immediate = kwargs.pop('run_immediately', False)
        if f._periodic_immediate:
            f._periodic_last_run = None
        else:
            f._periodic_last_run = time.time()
        return f

    # NOTE(sirp): The `if` is necessary to allow the decorator to be used with
    # and without parenthesis.
    #
    # In the 'with-parenthesis' case (with kwargs present), this function needs
    # to return a decorator function since the interpreter will invoke it like:
    #
    #   periodic_task(*args, **kwargs)(f)
    #
    # In the 'without-parenthesis' case, the original function will be passed
    # in as the first argument, like:
    #
    #   periodic_task(f)
    if kwargs:
        return decorator
    else:
        return decorator(args[0])
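
        The decorator's two invocation styles can be demonstrated standalone with a trimmed copy of the logic above (not the real module; attribute handling condensed):

import time

def periodic_task(*args, **kwargs):
    def decorator(f):
        f._periodic_task = True
        f._periodic_name = kwargs.pop('name', f.__name__)
        f._periodic_spacing = kwargs.pop('spacing', 0)
        f._periodic_immediate = kwargs.pop('run_immediately', False)
        f._periodic_last_run = None if f._periodic_immediate else time.time()
        return f
    # bare @periodic_task passes the function itself as args[0]
    return decorator if kwargs else decorator(args[0])

@periodic_task
def task_a(context):                   # bare form: spacing 0, default interval
    pass

@periodic_task(spacing=10, run_immediately=True)
def task_b(context):                   # parenthesized form
    pass

print(task_a._periodic_spacing, task_a._periodic_last_run is None)   # 0 False
print(task_b._periodic_spacing, task_b._periodic_last_run is None)   # 10 True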

        As the decorator code shows, f._periodic_task = True attaches the _periodic_task attribute to update_available_resource, along with other attributes such as _periodic_spacing and _periodic_last_run that matter during periodic execution. Attaching the attribute alone runs nothing, though: the periodic execution happens in run_periodic_tasks, and that method reads these attributes out of lists and dicts on the class or object, which we have not yet seen being populated. They are initialized by the metaclass, so by the time a ComputeManager object is produced, the attributes of all periodically executed functions have already been collected into the corresponding lists and dicts, as follows.

# nova/openstack/common/periodic_task.py:_PeriodicTasksMeta
class _PeriodicTasksMeta(type):
    def _add_periodic_task(cls, task):
        """Add a periodic task to the list of periodic tasks.

        The task should already be decorated by @periodic_task.

        :return: whether task was actually enabled
        """
        name = task._periodic_name

        if task._periodic_spacing < 0:
            LOG.info(_LI('Skipping periodic task %(task)s because '
                         'its interval is negative'),
                     {'task': name})
            return False
        if not task._periodic_enabled:
            LOG.info(_LI('Skipping periodic task %(task)s because '
                         'it is disabled'),
                     {'task': name})
            return False

        # A periodic spacing of zero indicates that this task should
        # be run on the default interval to avoid running too
        # frequently.
        if task._periodic_spacing == 0:
            task._periodic_spacing = DEFAULT_INTERVAL

        cls._periodic_tasks.append((name, task))
        cls._periodic_spacing[name] = task._periodic_spacing
        return True

    def __init__(cls, names, bases, dict_):
        """Metaclass that allows us to collect decorated periodic tasks."""
        super(_PeriodicTasksMeta, cls).__init__(names, bases, dict_)

        # NOTE(sirp): if the attribute is not present then we must be the base
        # class, so, go ahead an initialize it. If the attribute is present,
        # then we're a subclass so make a copy of it so we don't step on our
        # parent's toes.
        try:
            cls._periodic_tasks = cls._periodic_tasks[:]
        except AttributeError:
            cls._periodic_tasks = []

        try:
            cls._periodic_spacing = cls._periodic_spacing.copy()
        except AttributeError:
            cls._periodic_spacing = {}

        for value in cls.__dict__.values():
            if getattr(value, '_periodic_task', False):
                cls._add_periodic_task(value)

        As the for loop at the end of __init__ shows, the metaclass calls _add_periodic_task for every function in a subclass that carries the _periodic_task attribute. _add_periodic_task stores each such function in the subclass's _periodic_tasks list (cls._periodic_tasks.append((name, task))) and records its spacing in the _periodic_spacing dict (cls._periodic_spacing[name] = task._periodic_spacing). run_periodic_tasks can therefore retrieve all of a subclass's periodic functions and invoke them on schedule; the default interval is 60s, so in the end every function carrying the _periodic_task attribute is called roughly every 60s.
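
        The collection mechanism condenses into a short standalone sketch (Python 3 syntax, not nova code):

def periodic(f):
    f._periodic_task = True             # what @periodic_task.periodic_task does
    return f

class CollectorMeta(type):
    def __init__(cls, name, bases, dict_):
        super(CollectorMeta, cls).__init__(name, bases, dict_)
        # Copy the parent's registry so subclasses extend it without
        # mutating the parent's list.
        cls._periodic_tasks = list(getattr(cls, '_periodic_tasks', []))
        for value in dict_.values():
            if getattr(value, '_periodic_task', False):
                cls._periodic_tasks.append(value.__name__)

class PeriodicTasks(metaclass=CollectorMeta):
    pass

class Manager(PeriodicTasks):
    @periodic
    def update_available_resource(self, context):
        pass

print(Manager._periodic_tasks)          # ['update_available_resource']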

        To summarize this article:

        1. When the nova-compute service starts, it creates a report state thread that periodically (roughly every 10s by default) reports the nova-compute status to nova-conductor, which passes the status on to the DB so that nova-compute's status is updated.

        2. When the nova-compute service starts, it also creates a resource tracker thread that periodically (roughly every 60s by default) executes the functions carrying the _periodic_task attribute.
