Let us begin with the server-side startup process, taking the compute node as our example.
When nova is installed, the pbr module is invoked to generate the console_scripts that launch the services, based on the information in setup.cfg. For details see the pbr documentation: https://docs.openstack.org/developer/pbr/
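For reference, the relevant entries in nova's setup.cfg look roughly like this (an illustrative excerpt; the exact contents vary by release):

[entry_points]
console_scripts =
    nova-compute = nova.cmd.compute:main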
Once the nova package is installed and the nova-compute startup script is run, the main() function in nova.cmd.compute is executed:
def main():
    config.parse_args(sys.argv)
    logging.setup(CONF, 'nova')
    priv_context.init(root_helper=shlex.split(utils.get_root_helper()))
    utils.monkey_patch()
    objects.register_all()
    # Ensure os-vif objects are registered and plugins loaded
    os_vif.initialize()

    gmr.TextGuruMeditation.setup_autorun(version)

    cmd_common.block_db_access('nova-compute')
    objects_base.NovaObject.indirection_api = conductor_rpcapi.ConductorAPI()

    server = service.Service.create(binary='nova-compute',
                                    topic=CONF.compute_topic)
    service.serve(server)
    service.wait()
First, note that config.parse_args(sys.argv) includes a call to rpc.init(CONF), which leads to the nova.rpc.init() method. That method defines a global variable TRANSPORT whose value comes from two further calls: TRANSPORT = create_transport(get_transport_url()).
The code is clear enough that we will not quote it here; in brief: get_transport_url() returns an instance of the oslo_messaging.transport.TransportURL class whose transport attribute is the string 'rabbit'. create_transport() takes that instance as its initialization parameter and ultimately produces an oslo_messaging.transport.Transport instance whose driver argument is an oslo_messaging._drivers.impl_rabbit.RabbitDriver instance, itself initialized with:
conf=nova.rpc.CONF, url=the TransportURL instance obtained above, default_exchange='openstack', allowed_remote_exmods=[nova.exception.__name__,]
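As a minimal sketch of this initialization, written against the public oslo.messaging API rather than nova's internal helpers:

import oslo_messaging as messaging
from oslo_config import cfg

CONF = cfg.CONF
TRANSPORT = None

def init(conf):
    global TRANSPORT
    # TransportURL.parse() reads the configured transport ('rabbit' here);
    # get_transport() then loads the matching driver class (RabbitDriver)
    # through stevedore entry points
    url = messaging.TransportURL.parse(conf)
    TRANSPORT = messaging.get_transport(conf, url)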
Returning to nova.cmd.compute.main, we next look at
server = service.Service.create(binary='nova-compute', topic=CONF.compute_topic)
This factory method constructs a nova.service.Service instance and initializes its parameters. The service.serve(server) call below it loads the server obtained above in a new thread and invokes its start method.
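service.serve() and service.wait() are thin wrappers; here is a minimal sketch, assuming they delegate to the oslo.service launcher (simplified: launcher selection and the workers parameter are glossed over):

from oslo_config import cfg
from oslo_service import service as os_service

CONF = cfg.CONF
_launcher = None

def serve(server, workers=None):
    global _launcher
    # launch() starts the service (calling its start() method) and returns
    # a launcher object that can be waited on
    _launcher = os_service.launch(CONF, server, workers=workers)

def wait():
    _launcher.wait()

Now let us look at the RPC-related part of the start method (nova.service.Service.start()):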
target = messaging.Target(topic=self.topic, server=self.host)

endpoints = [
    self.manager,
    baserpc.BaseRPCAPI(self.manager.service_name, self.backdoor_port)
]
endpoints.extend(self.manager.additional_endpoints)

serializer = objects_base.NovaObjectSerializer()

self.rpcserver = rpc.get_server(target, endpoints, serializer)
self.rpcserver.start()
Before analyzing the logic, let us pin down the values of a few variables:
self.topic = 'compute'
self.host = the physical host address specified in the configuration file
self.manager = an instance of the nova.compute.manager.ComputeManager class
self.manager.service_name = 'compute'
self.manager.additional_endpoints = []
target = messaging.Target(topic=self.topic, server=self.host)
returns an instance of the oslo_messaging.target.Target class; its exact role will be analyzed when it is used.
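Target simply bundles the addressing information (exchange, topic, server, and so on). For example:

import oslo_messaging as messaging

# topic-only target: any server subscribed to the topic may handle a call
anycast = messaging.Target(topic='compute')
# topic + server target: addressed to one specific host
unicast = messaging.Target(topic='compute', server='host1')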
Of endpoints = [self.manager, baserpc.BaseRPCAPI(self.manager.service_name, self.backdoor_port)], we analyze only the case where self.manager acts as the endpoint.
serializer = objects_base.NovaObjectSerializer()
returns a nova.objects.base.NovaObjectSerializer instance, whose role we will likewise analyze at its point of use.
Next, let us look at self.rpcserver = rpc.get_server(target, endpoints, serializer):
def get_server(target, endpoints, serializer=None):
    assert TRANSPORT is not None
    if profiler:
        serializer = ProfilerRequestContextSerializer(serializer)
    else:
        serializer = RequestContextSerializer(serializer)
    return messaging.get_rpc_server(TRANSPORT,
                                    target,
                                    endpoints,
                                    executor='eventlet',
                                    serializer=serializer)
Because the earlier rpc.init() call already assigned the TRANSPORT global, the assert passes.
Here we consider the case where the profiler is not enabled, so the earlier serializer is wrapped in a RequestContextSerializer; the details will be analyzed at its point of use. The function then calls oslo_messaging.get_rpc_server(), passing in the parameters defined earlier:
def get_rpc_server(transport, target, endpoints,
                   executor='blocking', serializer=None, access_policy=None):
    """Construct an RPC server.

    :param transport: the messaging transport
    :type transport: Transport
    :param target: the exchange, topic and server to listen on
    :type target: Target
    :param endpoints: a list of endpoint objects
    :type endpoints: list
    :param executor: name of a message executor - for example
                     'eventlet', 'blocking'
    :type executor: str
    :param serializer: an optional entity serializer
    :type serializer: Serializer
    :param access_policy: an optional access policy.
                          Defaults to LegacyRPCAccessPolicy
    :type access_policy: RPCAccessPolicyBase
    """
    dispatcher = rpc_dispatcher.RPCDispatcher(endpoints, serializer,
                                              access_policy)
    return RPCServer(transport, target, dispatcher, executor)
We will come back to the dispatcher later; for now, note that this function ultimately returns an oslo_messaging.rpc.server.RPCServer instance.
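To make these pieces concrete, an equivalent standalone oslo.messaging RPC server looks roughly like this (the endpoint class, topic, and server name are illustrative):

import oslo_messaging as messaging
from oslo_config import cfg

class TestEndpoint(object):
    # every public method of an endpoint becomes remotely callable
    def ping(self, ctxt, arg):
        return arg

transport = messaging.get_transport(cfg.CONF)
target = messaging.Target(topic='compute', server='host1')
server = messaging.get_rpc_server(transport, target, [TestEndpoint()],
                                  executor='eventlet')
server.start()
server.wait()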
Returning to nova.service.Service.start(), we see that once self.rpcserver is defined, its start method is invoked immediately, i.e. the start method of the RPCServer instance. This method is defined in RPCServer's parent class, oslo_messaging.server.MessageHandlingServer:
@ordered(reset_after='stop')
def start(self, override_pool_size=None):
    """Start handling incoming messages.

    This method causes the server to begin polling the transport for
    incoming messages and passing them to the dispatcher. Message
    processing will continue until the stop() method is called.

    The executor controls how the server integrates with the applications
    I/O handling strategy - it may choose to poll for messages in a new
    process, thread or co-operatively scheduled coroutine or simply by
    registering a callback with an event loop. Similarly, the executor may
    choose to dispatch messages in a new thread, coroutine or simply the
    current thread.
    """
    # Warn that restarting will be deprecated
    if self._started:
        LOG.warning(_LW('Restarting a MessageHandlingServer is inherently '
                        'racy. It is deprecated, and will become a noop '
                        'in a future release of oslo.messaging. If you '
                        'need to restart MessageHandlingServer you should '
                        'instantiate a new object.'))
    self._started = True

    executor_opts = {}

    if self.executor_type == "threading":
        executor_opts["max_workers"] = (
            override_pool_size or self.conf.executor_thread_pool_size
        )
    elif self.executor_type == "eventlet":
        eventletutils.warn_eventlet_not_patched(
            expected_patched_modules=['thread'],
            what="the 'oslo.messaging eventlet executor'")
        executor_opts["max_workers"] = (
            override_pool_size or self.conf.executor_thread_pool_size
        )

    self._work_executor = self._executor_cls(**executor_opts)

    try:
        self.listener = self._create_listener()
    except driver_base.TransportDriverError as ex:
        raise ServerListenError(self.target, ex)

    self.listener.start(self._on_incoming)
Again, first a few instance attributes and variables:
self.transport = transport
self._target = target
self.executor_type = executor = 'eventlet'
self._executor_cls = mgr.driver = stevedore.driver.DriverManager(...).driver = the futurist.GreenThreadPoolExecutor class (loaded via entry points, as described earlier)
executor_opts['max_workers'] = 64 (the default)
self._work_executor = self._executor_cls(**executor_opts) = futurist.GreenThreadPoolExecutor(**executor_opts)
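The executor class is resolved through a stevedore entry point; a minimal sketch of that lookup (the 'oslo.messaging.executors' namespace is where oslo.messaging registers these classes in its setup.cfg):

from stevedore import driver

mgr = driver.DriverManager('oslo.messaging.executors', 'eventlet')
executor_cls = mgr.driver   # futurist.GreenThreadPoolExecutor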
Next, _create_listener(), which is overridden in the subclass RPCServer:
def _create_listener(self):
    return self.transport._listen(self._target, 1, None)
It is easy to see that this _listen method lives on the TRANSPORT global defined earlier in rpc.init(), i.e. the _listen() of the oslo_messaging.transport.Transport instance:
def _listen(self, target, batch_size, batch_timeout):
    if not (target.topic and target.server):
        raise exceptions.InvalidTarget('A server\'s target must have '
                                       'topic and server names specified',
                                       target)
    return self._driver.listen(target, batch_size,
                               batch_timeout)
This in turn returns self._driver.listen, which, given the driver established above, is the listen method defined on the AMQPDriverBase class in oslo_messaging._drivers.amqpdriver (the base class of RabbitDriver):
def listen(self, target, batch_size, batch_timeout):
    conn = self._get_connection(rpc_common.PURPOSE_LISTEN)

    listener = AMQPListener(self, conn)

    conn.declare_topic_consumer(exchange_name=self._get_exchange(target),
                                topic=target.topic,
                                callback=listener)
    conn.declare_topic_consumer(exchange_name=self._get_exchange(target),
                                topic='%s.%s' % (target.topic,
                                                 target.server),
                                callback=listener)
    conn.declare_fanout_consumer(target.topic, listener)

    return base.PollStyleListenerAdapter(listener, batch_size,
                                         batch_timeout)
At last we arrive at the logic that actually sets up the message queues.
First, recall that when the RabbitDriver instance was initialized:
connection_pool = pool.ConnectionPool(
    conf, max_size, min_size, ttl,
    url, Connection)

super(RabbitDriver, self).__init__(
    conf, url,
    connection_pool,
    default_exchange,
    allowed_remote_exmods
)
it entered the initialization of the parent class AMQPDriverBase, and the _get_connection method is defined at that level:
def __init__(self, conf, url, connection_pool,
             default_exchange=None, allowed_remote_exmods=None):
    super(AMQPDriverBase, self).__init__(conf, url, default_exchange,
                                         allowed_remote_exmods)

    self._default_exchange = default_exchange

    self._connection_pool = connection_pool

    self._reply_q_lock = threading.Lock()
    self._reply_q = None
    self._reply_q_conn = None
    self._waiter = None

def _get_connection(self, purpose=rpc_common.PURPOSE_SEND):
    return rpc_common.ConnectionContext(self._connection_pool,
                                        purpose=purpose)
Unpacking conn = self._get_connection(rpc_common.PURPOSE_LISTEN), where purpose = rpc_common.PURPOSE_LISTEN = 'listen': it returns an instance of the oslo_messaging._drivers.common.ConnectionContext class, constructed with connection_pool = self._connection_pool and purpose='listen'.
Now look at ConnectionContext's __init__():
def __init__(self, connection_pool, purpose):
    """Create a new connection, or get one from the pool."""
    self.connection = None
    self.connection_pool = connection_pool
    pooled = purpose == PURPOSE_SEND
    if pooled:
        self.connection = connection_pool.get()
    else:
        # a non-pooled connection is requested, so create a new connection
        self.connection = connection_pool.create(purpose)
    self.pooled = pooled
    self.connection.pooled = pooled
Here we focus on self.connection = connection_pool.create(purpose), which calls the create method of oslo_messaging._drivers.pool.ConnectionPool:
def create(self, purpose=common.PURPOSE_SEND):
    LOG.debug('Pool creating new connection')
    return self.connection_cls(self.conf, self.url, purpose)
From this logic we can see that what is returned here is an instance of the oslo_messaging._drivers.impl_rabbit.Connection class.
In summary:
conn = an instance of oslo_messaging._drivers.common.ConnectionContext, whose attribute
self.connection = an instance of the oslo_messaging._drivers.impl_rabbit.Connection class.
Moreover, because ConnectionContext overrides attribute lookup:
def __getattr__(self, key):
    """Proxy all other calls to the Connection instance."""
    if self.connection:
        return getattr(self.connection, key)
    else:
        raise InvalidRPCConnectionReuse()
attribute access on the ConnectionContext is forwarded directly to the Connection instance.
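This proxying pattern can be shown with a minimal standalone example:

class Proxy(object):
    """Forward unknown attribute lookups to a wrapped object."""

    def __init__(self, wrapped):
        self.connection = wrapped

    def __getattr__(self, key):
        # __getattr__ is only invoked for attributes *not* found through
        # normal lookup, so self.connection itself resolves normally
        return getattr(self.connection, key)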
Now let us look at the Connection class's initializer:
def __init__(self, conf, url, purpose):
    # NOTE(viktors): Parse config options
    driver_conf = conf.oslo_messaging_rabbit

    self.max_retries = driver_conf.rabbit_max_retries
    self.interval_start = driver_conf.rabbit_retry_interval
    self.interval_stepping = driver_conf.rabbit_retry_backoff
    self.interval_max = driver_conf.rabbit_interval_max

    self.login_method = driver_conf.rabbit_login_method
    self.fake_rabbit = driver_conf.fake_rabbit
    self.virtual_host = driver_conf.rabbit_virtual_host
    self.rabbit_hosts = driver_conf.rabbit_hosts
    self.rabbit_port = driver_conf.rabbit_port
    self.rabbit_userid = driver_conf.rabbit_userid
    self.rabbit_password = driver_conf.rabbit_password
    self.rabbit_ha_queues = driver_conf.rabbit_ha_queues
    self.rabbit_transient_queues_ttl = \
        driver_conf.rabbit_transient_queues_ttl
    self.rabbit_qos_prefetch_count = driver_conf.rabbit_qos_prefetch_count
    self.heartbeat_timeout_threshold = \
        driver_conf.heartbeat_timeout_threshold
    self.heartbeat_rate = driver_conf.heartbeat_rate
    self.kombu_reconnect_delay = driver_conf.kombu_reconnect_delay
    self.amqp_durable_queues = driver_conf.amqp_durable_queues
    self.amqp_auto_delete = driver_conf.amqp_auto_delete
    self.rabbit_use_ssl = driver_conf.rabbit_use_ssl
    self.kombu_missing_consumer_retry_timeout = \
        driver_conf.kombu_missing_consumer_retry_timeout
    self.kombu_failover_strategy = driver_conf.kombu_failover_strategy
    self.kombu_compression = driver_conf.kombu_compression

    if self.rabbit_use_ssl:
        self.kombu_ssl_version = driver_conf.kombu_ssl_version
        self.kombu_ssl_keyfile = driver_conf.kombu_ssl_keyfile
        self.kombu_ssl_certfile = driver_conf.kombu_ssl_certfile
        self.kombu_ssl_ca_certs = driver_conf.kombu_ssl_ca_certs

    # Try forever?
    if self.max_retries <= 0:
        self.max_retries = None

    if url.virtual_host is not None:
        virtual_host = url.virtual_host
    else:
        virtual_host = self.virtual_host

    self._url = ''
    if self.fake_rabbit:
        LOG.warning(_LW("Deprecated: fake_rabbit option is deprecated, "
                        "set rpc_backend to kombu+memory or use the fake "
                        "driver instead."))
        self._url = 'memory://%s/' % virtual_host
    elif url.hosts:
        if url.transport.startswith('kombu+'):
            LOG.warning(_LW('Selecting the kombu transport through the '
                            'transport url (%s) is a experimental feature '
                            'and this is not yet supported.'),
                        url.transport)
        if len(url.hosts) > 1:
            random.shuffle(url.hosts)
        for host in url.hosts:
            transport = url.transport.replace('kombu+', '')
            transport = transport.replace('rabbit', 'amqp')
            self._url += '%s%s://%s:%s@%s:%s/%s' % (
                ";" if self._url else '',
                transport,
                parse.quote(host.username or ''),
                parse.quote(host.password or ''),
                self._parse_url_hostname(host.hostname) or '',
                str(host.port or 5672),
                virtual_host)
    elif url.transport.startswith('kombu+'):
        # NOTE(sileht): url have a + but no hosts
        # (like kombu+memory:///), pass it to kombu as-is
        transport = url.transport.replace('kombu+', '')
        self._url = "%s://%s" % (transport, virtual_host)
    else:
        if len(self.rabbit_hosts) > 1:
            random.shuffle(self.rabbit_hosts)
        for adr in self.rabbit_hosts:
            hostname, port = netutils.parse_host_port(
                adr, default_port=self.rabbit_port)
            self._url += '%samqp://%s:%s@%s:%s/%s' % (
                ";" if self._url else '',
                parse.quote(self.rabbit_userid, ''),
                parse.quote(self.rabbit_password, ''),
                self._parse_url_hostname(hostname), port,
                virtual_host)

    self._initial_pid = os.getpid()

    self._consumers = {}
    self._producer = None
    self._new_tags = set()
    self._active_tags = {}
    self._tags = itertools.count(1)

    # Set of exchanges and queues declared on the channel to avoid
    # unnecessary redeclaration. This set is resetted each time
    # the connection is resetted in Connection._set_current_channel
    self._declared_exchanges = set()
    self._declared_queues = set()

    self._consume_loop_stopped = False
    self.channel = None
    self.purpose = purpose

    # NOTE(sileht): if purpose is PURPOSE_LISTEN
    # we don't need the lock because we don't
    # have a heartbeat thread
    if purpose == rpc_common.PURPOSE_SEND:
        self._connection_lock = ConnectionLock()
    else:
        self._connection_lock = DummyConnectionLock()

    self.connection_id = str(uuid.uuid4())

    self.name = "%s:%d:%s" % (os.path.basename(sys.argv[0]),
                              os.getpid(),
                              self.connection_id)
    self.connection = kombu.connection.Connection(
        self._url, ssl=self._fetch_ssl_params(),
        login_method=self.login_method,
        heartbeat=self.heartbeat_timeout_threshold,
        failover_strategy=self.kombu_failover_strategy,
        transport_options={
            'confirm_publish': True,
            'client_properties': {
                'capabilities': {
                    'authentication_failure_close': True,
                    'connection.blocked': True,
                    'consumer_cancel_notify': True
                },
                'connection_name': self.name},
            'on_blocked': self._on_connection_blocked,
            'on_unblocked': self._on_connection_unblocked,
        },
    )

    LOG.debug('[%(connection_id)s] Connecting to AMQP server on'
              ' %(hostname)s:%(port)s',
              self._get_connection_info())

    # NOTE(sileht): kombu recommend to run heartbeat_check every
    # seconds, but we use a lock around the kombu connection
    # so, to not lock to much this lock to most of the time do nothing
    # expected waiting the events drain, we start heartbeat_check and
    # retrieve the server heartbeat packet only two times more than
    # the minimum required for the heartbeat works
    # (heatbeat_timeout/heartbeat_rate/2.0, default kombu
    # heartbeat_rate is 2)
    self._heartbeat_wait_timeout = (
        float(self.heartbeat_timeout_threshold) /
        float(self.heartbeat_rate) / 2.0)
    self._heartbeat_support_log_emitted = False

    # NOTE(sileht): just ensure the connection is setuped at startup
    with self._connection_lock:
        self.ensure_connection()

    # NOTE(sileht): if purpose is PURPOSE_LISTEN
    # the consume code does the heartbeat stuff
    # we don't need a thread
    self._heartbeat_thread = None
    if purpose == rpc_common.PURPOSE_SEND:
        self._heartbeat_start()

    LOG.debug('[%(connection_id)s] Connected to AMQP server on '
              '%(hostname)s:%(port)s via [%(transport)s] client with'
              ' port %(client_port)s.',
              self._get_connection_info())

    # NOTE(sileht): value chosen according the best practice from kombu
    # http://kombu.readthedocs.org/en/latest/reference/kombu.common.html#kombu.common.eventloop
    # For heatbeat, we can set a bigger timeout, and check we receive the
    # heartbeat packets regulary
    if self._heartbeat_supported_and_enabled():
        self._poll_timeout = self._heartbeat_wait_timeout
    else:
        self._poll_timeout = 1

    if self._url.startswith('memory://'):
        # Kludge to speed up tests.
        self.connection.transport.polling_interval = 0.0
        # Fixup logging
        self.connection.hostname = "memory_driver"
        self.connection.port = 1234
        self._poll_timeout = 0.05
# FIXME(markmc): use oslo sslutils when it is available as a library
_SSL_PROTOCOLS = {
    "tlsv1": ssl.PROTOCOL_TLSv1,
    "sslv23": ssl.PROTOCOL_SSLv23
}

_OPTIONAL_PROTOCOLS = {
    'sslv2': 'PROTOCOL_SSLv2',
    'sslv3': 'PROTOCOL_SSLv3',
    'tlsv1_1': 'PROTOCOL_TLSv1_1',
    'tlsv1_2': 'PROTOCOL_TLSv1_2',
}
for protocol in _OPTIONAL_PROTOCOLS:
    try:
        _SSL_PROTOCOLS[protocol] = getattr(ssl,
                                           _OPTIONAL_PROTOCOLS[protocol])
    except AttributeError:
        pass
Working through the logic with every option at its default value, self._url takes the form
'amqp://(rabbit_userid):(rabbit_password)@(rabbit_hostname):(rabbit_port)/(virtual_host)'
If multiple rabbit_hosts are configured, the individual URLs are joined by semicolons, e.g. (hypothetical hosts and credentials) amqp://nova:secret@rabbit1:5672/nova;amqp://nova:secret@rabbit2:5672/nova.
The self.connection attribute is assigned an instance of the kombu.connection.Connection class, with self.transport being the kombu.transport.pyamqp:Transport class.
The self.ensure_connection method calls self.ensure, passing in self.connection.connection. Since the code involved is long, let us break it down step by step: method = self.connection.connection
↓ kombu.connection.Connection.connection (a property)

@property
def connection(self):
    """The underlying connection object.

    .. warning::
        This instance is transport specific, so do not
        depend on the interface of this object.
    """
    if not self._closed:
        if not self.connected:
            self.declared_entities.clear()
            self._default_channel = None
            self._connection = self._establish_connection()
            self._closed = False
        return self._connection
↓ kombu.connection.Connection._establish_connection()

def _establish_connection(self):
    self._debug('establishing connection...')
    conn = self.transport.establish_connection()
    self._debug('connection established: %r', conn)
    return conn
↓ kombu.transport.pyamqp.Transport.establish_connection()

def establish_connection(self):
    """Establish connection to the AMQP broker."""
    conninfo = self.client
    for name, default_value in items(self.default_connection_params):
        if not getattr(conninfo, name, None):
            setattr(conninfo, name, default_value)
    if conninfo.ssl:
        raise NotImplementedError(NO_SSL_ERROR)
    opts = dict({
        'host': conninfo.host,
        'userid': conninfo.userid,
        'password': conninfo.password,
        'virtual_host': conninfo.virtual_host,
        'login_method': conninfo.login_method,
        'insist': conninfo.insist,
        'ssl': conninfo.ssl,
        'connect_timeout': conninfo.connect_timeout,
    }, **conninfo.transport_options or {})
    conn = self.Connection(**opts)
    conn.client = self.client
    self.client.drain_events = conn.drain_events
    return conn
Related to channel (re)establishment, here is oslo_messaging._drivers.impl_rabbit.Connection._set_current_channel:

def _set_current_channel(self, new_channel):
    """Change the channel to use.

    NOTE(sileht): Must be called within the connection lock
    """
    if new_channel == self.channel:
        return

    if self.channel is not None:
        self._declared_queues.clear()
        self._declared_exchanges.clear()
        self.connection.maybe_close_channel(self.channel)

    self.channel = new_channel

    if new_channel is not None:
        if self.purpose == rpc_common.PURPOSE_LISTEN:
            self._set_qos(new_channel)
        self._producer = kombu.messaging.Producer(new_channel)

        for consumer in self._consumers:
            consumer.declare(self)
From the code listed above it is not hard to see that this method creates and returns the kombu-level connection between the compute node and the rabbit node. The creation is wrapped with kombu.connection's autoretry mechanism, and the channel is established through the underlying AMQP client library (librabbitmq or py-amqp, whichever kombu's transport selects). If a connection- or channel-related error occurs, the channel and self._producer are automatically rebuilt and the connection is retried.
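The retry pattern is visible in kombu's public API as well; a minimal sketch using Connection.ensure() (the broker URL, message, and retry options here are illustrative):

import kombu

conn = kombu.Connection('amqp://guest:guest@localhost:5672//')
producer = conn.Producer()

def errback(exc, interval):
    print('recoverable error: %r, retrying in %ss' % (exc, interval))

# ensure() wraps the operation so that recoverable connection/channel
# errors revive `producer` on a fresh connection and retry the call
safe_publish = conn.ensure(producer, producer.publish,
                           errback=errback, max_retries=3)
safe_publish({'msg': 'hello'}, routing_key='compute')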
Back at the top level, listener = AMQPListener(self, conn) will be unpacked when it is used; we continue with
conn.declare_topic_consumer(exchange_name=self._get_exchange(target),
                            topic=target.topic,
                            callback=listener)
↓
def declare_topic_consumer(self, exchange_name, topic, callback=None,
                           queue_name=None):
    """Create a 'topic' consumer."""
    consumer = Consumer(exchange_name=exchange_name,
                        queue_name=queue_name or topic,
                        routing_key=topic,
                        type='topic',
                        durable=self.amqp_durable_queues,
                        exchange_auto_delete=self.amqp_auto_delete,
                        queue_auto_delete=self.amqp_auto_delete,
                        callback=callback,
                        rabbit_ha_queues=self.rabbit_ha_queues)

    self.declare_consumer(consumer)
↓

def declare_consumer(self, consumer):
    """Create a Consumer using the class that was passed in and
    add it to our list of consumers
    """

    def _connect_error(exc):
        log_info = {'topic': consumer.routing_key, 'err_str': exc}
        LOG.error(_LE("Failed to declare consumer for topic '%(topic)s': "
                      "%(err_str)s"), log_info)

    def _declare_consumer():
        consumer.declare(self)
        tag = self._active_tags.get(consumer.queue_name)
        if tag is None:
            tag = next(self._tags)
            self._active_tags[consumer.queue_name] = tag
            self._new_tags.add(tag)
        self._consumers[consumer] = tag
        return consumer

    with self._connection_lock:
        return self.ensure(_declare_consumer,
                           error_callback=_connect_error)
We can see that a Consumer instance is created and its declare method is invoked under the autoretry protection of ensure. Each newly defined consumer is paired with a new tag; the tag is added to self._new_tags, and the consumer-to-tag mapping is stored in self._consumers. Next, declare:
def declare(self, conn):
    """Re-declare the queue after a rabbit (re)connect."""
    self.queue = kombu.entity.Queue(
        name=self.queue_name,
        channel=conn.channel,
        exchange=self.exchange,
        durable=self.durable,
        auto_delete=self.queue_auto_delete,
        routing_key=self.routing_key,
        queue_arguments=self.queue_arguments)

    try:
        LOG.debug('[%s] Queue.declare: %s',
                  conn.connection_id, self.queue_name)
        self.queue.declare()
    except conn.connection.channel_errors as exc:
        # NOTE(jrosenboom): This exception may be triggered by a race
        # condition. Simply retrying will solve the error most of the time
        # and should work well enough as a workaround until the race
        # condition itself can be fixed.
        # See https://bugs.launchpad.net/neutron/+bug/1318721 for details.
        if exc.code == 404:
            self.queue.declare()
        else:
            raise
    self._declared_on = conn.channel
A kombu.entity.Queue instance is created, recording the name, channel, exchange, routing key, and remaining parameters of the queue about to be defined. The declare() call then defines the exchange and the queue on the bound channel and binds them together.
Back at the top level, we can see that two topic consumers are defined, listening on the topics 'compute' and 'compute.%s' % host respectively, and one fanout-type consumer is defined as well.
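In kombu terms, the AMQP entities declared above correspond roughly to the following sketch (the host name 'host1' and the broker URL are illustrative, and the real fanout exchange/queue naming carries a per-server UUID suffix):

import uuid

import kombu

conn = kombu.Connection('amqp://guest:guest@localhost:5672//')
channel = conn.channel()

topic_exchange = kombu.Exchange('openstack', type='topic', durable=False)
# topic queue 'compute', bound with routing key 'compute'
kombu.Queue('compute', exchange=topic_exchange, routing_key='compute',
            channel=channel).declare()
# topic queue 'compute.host1', bound with routing key 'compute.host1'
kombu.Queue('compute.host1', exchange=topic_exchange,
            routing_key='compute.host1', channel=channel).declare()
# fanout consumers get their own exchange plus a transient per-server queue
fanout_exchange = kombu.Exchange('compute_fanout', type='fanout',
                                 durable=False)
kombu.Queue('compute_fanout_%s' % uuid.uuid4().hex,
            exchange=fanout_exchange, channel=channel).declare()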
Finally, the return value of the listen method is

return base.PollStyleListenerAdapter(listener, batch_size,
                                     batch_timeout)
class PollStyleListenerAdapter(Listener):
    """A Listener that uses a PollStyleListener for message transfer. A
    dedicated thread is created to do message polling.
    """

    def __init__(self, poll_style_listener, batch_size, batch_timeout):
        super(PollStyleListenerAdapter, self).__init__(
            batch_size, batch_timeout, poll_style_listener.prefetch_size
        )
        self._poll_style_listener = poll_style_listener
        self._listen_thread = threading.Thread(target=self._runner)
        self._listen_thread.daemon = True
        self._started = False

    def start(self, on_incoming_callback):
        super(PollStyleListenerAdapter, self).start(on_incoming_callback)
        self._started = True
        self._listen_thread.start()

    @excutils.forever_retry_uncaught_exceptions
    def _runner(self):
        while self._started:
            incoming = self._poll_style_listener.poll(
                batch_size=self.batch_size, batch_timeout=self.batch_timeout)

            if incoming:
                self.on_incoming_callback(incoming)

        # listener is stopped but we need to process all already consumed
        # messages
        while True:
            incoming = self._poll_style_listener.poll(
                batch_size=self.batch_size, batch_timeout=self.batch_timeout)

            if not incoming:
                return

            self.on_incoming_callback(incoming)
The final step of oslo_messaging.server.MessageHandlingServer.start, self.listener.start(self._on_incoming), invokes the start method above, which spawns _runner in a dedicated thread; _runner in turn calls the poll method of the listener created earlier via listener = AMQPListener(self, conn):
@base.batch_poll_helper
def poll(self, timeout=None):
    while not self._stopped.is_set():
        if self.incoming:
            return self.incoming.pop(0)
        try:
            self.conn.consume(timeout=timeout)
        except rpc_common.Timeout:
            return None
def batch_poll_helper(func):
    """Decorator to poll messages in batch

    This decorator is used to add message batching support to a
    :py:meth:`PollStyleListener.poll` implementation that only polls for a
    single message per call.
    """

    def wrapper(in_self, timeout=None, batch_size=1, batch_timeout=None):
        incomings = []
        driver_prefetch = in_self.prefetch_size
        if driver_prefetch > 0:
            batch_size = min(batch_size, driver_prefetch)

        with timeutils.StopWatch(timeout) as timeout_watch:
            # poll first message
            msg = func(in_self, timeout=timeout_watch.leftover(True))
            if msg is not None:
                incomings.append(msg)
            if batch_size == 1 or msg is None:
                return incomings

            # update batch_timeout according to timeout for whole operation
            timeout_left = timeout_watch.leftover(True)
            if timeout_left is not None and (
                    batch_timeout is None or timeout_left < batch_timeout):
                batch_timeout = timeout_left

        with timeutils.StopWatch(batch_timeout) as batch_timeout_watch:
            # poll remained batch messages
            while len(incomings) < batch_size and msg is not None:
                msg = func(in_self, timeout=batch_timeout_watch.leftover(True))
                if msg is not None:
                    incomings.append(msg)

        return incomings
    return wrapper
When no message has been obtained yet, poll calls self.conn.consume, i.e. oslo_messaging._drivers.impl_rabbit.Connection.consume:
def consume(self, timeout=None):
    """Consume from all queues/consumers."""

    timer = rpc_common.DecayingTimer(duration=timeout)
    timer.start()

    def _raise_timeout(exc):
        LOG.debug('Timed out waiting for RPC response: %s', exc)
        raise rpc_common.Timeout()

    def _recoverable_error_callback(exc):
        if not isinstance(exc, rpc_common.Timeout):
            self._new_tags = set(self._consumers.values())
        timer.check_return(_raise_timeout, exc)

    def _error_callback(exc):
        _recoverable_error_callback(exc)
        LOG.error(_LE('Failed to consume message from queue: %s'),
                  exc)

    def _consume():
        # NOTE(sileht): in case the acknowledgment or requeue of a
        # message fail, the kombu transport can be disconnected
        # In this case, we must redeclare our consumers, so raise
        # a recoverable error to trigger the reconnection code.
        if not self.connection.connected:
            raise self.connection.recoverable_connection_errors[0]

        while self._new_tags:
            for consumer, tag in self._consumers.items():
                if tag in self._new_tags:
                    consumer.consume(self, tag=tag)
                    self._new_tags.remove(tag)

        poll_timeout = (self._poll_timeout if timeout is None
                        else min(timeout, self._poll_timeout))
        while True:
            if self._consume_loop_stopped:
                return

            if self._heartbeat_supported_and_enabled():
                self._heartbeat_check()

            try:
                self.connection.drain_events(timeout=poll_timeout)
                return
            except socket.timeout as exc:
                poll_timeout = timer.check_return(
                    _raise_timeout, exc, maximum=self._poll_timeout)

    with self._connection_lock:
        self.ensure(_consume,
                    recoverable_error_callback=_recoverable_error_callback,
                    error_callback=_error_callback)
The logic is straightforward: under the retry protection of ensure, _consume first (re)subscribes any consumers whose tags are still in self._new_tags, then loops on connection.drain_events, performing heartbeat checks along the way, until a message arrives or the timeout expires.
Once an incoming message is obtained, rpc.server.RPCServer._process_incoming parses and handles it:
def _process_incoming(self, incoming):
    message = incoming[0]
    try:
        message.acknowledge()
    except Exception:
        LOG.exception(_LE("Can not acknowledge message. Skip processing"))
        return

    failure = None
    try:
        res = self.dispatcher.dispatch(message)
    except rpc_dispatcher.ExpectedException as e:
        failure = e.exc_info
        LOG.debug(u'Expected exception during message handling (%s)', e)
    except Exception:
        # current sys.exc_info() content can be overridden
        # by another exception raised by a log handler during
        # LOG.exception(). So keep a copy and delete it later.
        failure = sys.exc_info()
        LOG.exception(_LE('Exception during message handling'))

    try:
        if failure is None:
            message.reply(res)
        else:
            message.reply(failure=failure)
    except Exception:
        LOG.exception(_LE("Can not send reply for message"))
    finally:
        # NOTE(dhellmann): Remove circular object reference
        # between the current stack frame and the traceback in
        # exc_info.
        del failure
First, the incoming message is acknowledged; then res = self.dispatcher.dispatch(message) is executed:
def dispatch(self, incoming):
    """Dispatch an RPC message to the appropriate endpoint method.

    :param incoming: incoming message
    :type incoming: IncomingMessage
    :raises: NoSuchMethod, UnsupportedVersion
    """
    message = incoming.message
    ctxt = incoming.ctxt
    method = message.get('method')
    args = message.get('args', {})
    namespace = message.get('namespace')
    version = message.get('version', '1.0')

    found_compatible = False
    for endpoint in self.endpoints:
        target = getattr(endpoint, 'target', None)
        if not target:
            target = self._default_target

        if not (self._is_namespace(target, namespace) and
                self._is_compatible(target, version)):
            continue

        if hasattr(endpoint, method):
            if self.access_policy.is_allowed(endpoint, method):
                return self._do_dispatch(endpoint, method, ctxt, args)

        found_compatible = True

    if found_compatible:
        raise NoSuchMethod(method)
    else:
        raise UnsupportedVersion(version, method=method)

def _do_dispatch(self, endpoint, method, ctxt, args):
    ctxt = self.serializer.deserialize_context(ctxt)
    new_args = dict()
    for argname, arg in args.items():
        new_args[argname] = self.serializer.deserialize_entity(ctxt, arg)
    func = getattr(endpoint, method)
    result = func(ctxt, **new_args)
    return self.serializer.serialize_entity(ctxt, result)
The logic is simple: extract the relevant parameters, parse and convert their formats, look up the corresponding method on the endpoint (self.manager), and run it to obtain the result.
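A toy illustration of that mapping (hypothetical endpoint and message; a real message carries more envelope fields, and the arguments pass through the serializer first):

class FakeManager(object):
    def ping(self, ctxt, data):
        return data

endpoint = FakeManager()
message = {'method': 'ping', 'args': {'data': 'pong'}, 'version': '1.0'}

# dispatch() effectively does this once it finds a compatible endpoint
func = getattr(endpoint, message['method'])
result = func({}, **message['args'])   # -> 'pong'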
After execution completes and the result is obtained, _process_incoming returns the result to the publisher via message.reply.
This concludes our brief walkthrough of the service flow on the nova compute node (server) side.