Locating the script
>ambari-agent
>Usage: /usr/sbin/ambari-agent {start|stop|restart|status|reset <server_hostname>}
/usr/sbin/ambari-agent
/usr/sbin/ambari-agent calls the script /var/lib/ambari-agent/bin/ambari-agent.
The script /var/lib/ambari-agent/bin/ambari-agent is as follows
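The directory /usr/lib/ambari-agent/lib/ambari_agent/ is as follows; it contains the AmbariAgent.py script
The key content is as follows
The logic here corresponds to this part of the overall concept diagram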
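In short, once ambari-agent starts it calls the AmbariAgent.py script, which keeps a status value.
When that value is 77, the script starts or restarts a second script, main.py; otherwise it exits.
After main.py has run, it returns a new status value, which is checked again in the same way.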
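The restart-on-77 behaviour described above can be sketched as a small supervisor loop. This is a minimal illustration, not the actual AmbariAgent.py code: the function name `supervise`, the constant name, and the `max_restarts` cap are assumptions added so that the example terminates.

```python
import subprocess
import sys

# Exit status that means "please start me again" (value taken from the text above;
# the constant name is an assumption).
AGENT_AUTO_RESTART_EXIT_CODE = 77


def supervise(child_cmd, max_restarts=5):
    """Run child_cmd and re-launch it for as long as it exits with status 77.

    Returns the list of exit codes observed, in order. max_restarts is a
    hypothetical safety cap that is not part of the described behaviour.
    """
    codes = []
    status = AGENT_AUTO_RESTART_EXIT_CODE
    while status == AGENT_AUTO_RESTART_EXIT_CODE and len(codes) < max_restarts:
        # subprocess.call blocks until the child exits and returns its exit code.
        status = subprocess.call(child_cmd)
        codes.append(status)
    return codes
```

For example, `supervise([sys.executable, "main.py"])` would keep re-launching a (hypothetical) main.py until it exits with a status other than 77 or the cap is reached.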
Once main.py has been started, ps -ef | grep ambari-agent
shows these 2 processes
if __name__ == "__main__":
    is_logger_setup = False
    try:
        initializer_module = InitializerModule()
        heartbeat_stop_callback = bind_signal_handlers(agentPid, initializer_module.stop_event)
        main(initializer_module, heartbeat_stop_callback)
    except SystemExit:
        raise
    except BaseException:
        if is_logger_setup:
            logger.exception("Exiting with exception:")
        raise
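The resulting initializer_module and heartbeat_stop_callback are then passed to main().
Inside main(), after the following flow from the overall concept diagram, the run_threads(initializer_module) method is called.
This method starts the InitializerModule's various reporters, handlers, and executors, and finally starts the action_queue.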
def run_threads(initializer_module):
    initializer_module.alert_scheduler_handler.start()
    initializer_module.heartbeat_thread.start()
    initializer_module.component_status_executor.start()
    initializer_module.command_status_reporter.start()
    initializer_module.host_status_reporter.start()
    initializer_module.alert_status_reporter.start()
    initializer_module.action_queue.start()

    while not initializer_module.stop_event.is_set():
        time.sleep(0.1)

    initializer_module.action_queue.interrupt()

    initializer_module.command_status_reporter.join()
    initializer_module.component_status_executor.join()
    initializer_module.host_status_reporter.join()
    initializer_module.alert_status_reporter.join()
    initializer_module.heartbeat_thread.join()
    initializer_module.action_queue.join()
The initialization step sets up the caches, executors, and reporters, and the run_threads method in main.py then starts each of the worker threads.
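The start / wait-for-stop-event / join shape of run_threads can be tried out in isolation. The sketch below is a toy model under stated assumptions: `Worker`, `run_workers`, and `demo` are hypothetical names, and the workers only count ticks instead of reporting anything to a server.

```python
import threading
import time


class Worker(threading.Thread):
    """Hypothetical stand-in for a reporter or executor thread."""

    def __init__(self, name, stop_event):
        super().__init__(name=name, daemon=True)
        self.stop_event = stop_event
        self.ticks = 0

    def run(self):
        # Event.wait returns True as soon as the event is set, ending the loop.
        while not self.stop_event.wait(0.01):
            self.ticks += 1


def run_workers(workers, stop_event):
    """Start every worker, idle until the stop event is set, then join them all."""
    for worker in workers:
        worker.start()
    while not stop_event.is_set():
        time.sleep(0.01)
    for worker in workers:
        worker.join()


def demo():
    stop_event = threading.Event()
    workers = [Worker("heartbeat", stop_event), Worker("reporter", stop_event)]
    # Simulate a shutdown signal arriving after 0.1 seconds.
    threading.Timer(0.1, stop_event.set).start()
    run_workers(workers, stop_event)
    return workers
```

Because every thread watches the same Event, a single `stop_event.set()` (in the agent, triggered through the signal handlers bound in main.py) is enough to wind all of them down before the joins return.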
Key class
Note that Ambari 2.5 does not use the InitializerModule class to manage the properties and threads; it uses the Controller class instead.
def __init__(self):
    self.stop_event = threading.Event()
    self.config = AmbariConfig.get_resolved_config()
    self.is_registered = None
    self.metadata_cache = None
    self.topology_cache = None
    self.host_level_params_cache = None
    self.configurations_cache = None
    self.alert_definitions_cache = None
    self.configuration_builder = None
    self.stale_alerts_monitor = None
    self.server_responses_listener = None
    self.file_cache = None
    self.customServiceOrchestrator = None
    self.hooks_orchestrator = None
    self.recovery_manager = None
    self.commandStatuses = None
    self.action_queue = None
    self.alert_scheduler_handler = None
    self.init()
The init method is mainly used to initialize the properties
def init(self):
    """
    Initialize properties
    """
    self.is_registered = False
    self.metadata_cache = ClusterMetadataCache(self.config.cluster_cache_dir)
    self.topology_cache = ClusterTopologyCache(self.config.cluster_cache_dir, self.config)
    self.host_level_params_cache = ClusterHostLevelParamsCache(self.config.cluster_cache_dir)
    self.configurations_cache = ClusterConfigurationCache(self.config.cluster_cache_dir)
    self.alert_definitions_cache = ClusterAlertDefinitionsCache(self.config.cluster_cache_dir)
    self.configuration_builder = ConfigurationBuilder(self)
    self.stale_alerts_monitor = StaleAlertsMonitor(self)
    self.server_responses_listener = ServerResponsesListener(self)
    self.file_cache = FileCache(self.config)
    self.customServiceOrchestrator = CustomServiceOrchestrator(self)
    self.hooks_orchestrator = HooksOrchestrator(self)
    self.recovery_manager = RecoveryManager(self)
    self.commandStatuses = CommandStatusDict(self)
    self.init_threads()
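These are all the "initialize properties" items in the overall concept diagram.
The init method above calls the init_threads method,
which is mainly used to initialize the individual thread objects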
def init_threads(self):
    """
    Initialize thread objects
    """
    self.component_status_executor = ComponentStatusExecutor(self)
    self.action_queue = ActionQueue(self)
    self.alert_scheduler_handler = AlertSchedulerHandler(self)
    self.command_status_reporter = CommandStatusReporter(self)
    self.host_status_reporter = HostStatusReporter(self)
    self.alert_status_reporter = AlertStatusReporter(self)
    self.heartbeat_thread = HeartbeatThread.HeartbeatThread(self)
This corresponds to this module in the overall concept diagram
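One way to read the __init__ / init / init_threads split is as two-phase construction: shared properties are created first, and the thread objects are created last because each of them receives the module itself and reads what it needs from it. The sketch below illustrates only that ordering; `DemoModule`, `DemoReporter`, and the `cache` attribute are hypothetical and much smaller than the real classes.

```python
import threading


class DemoReporter(threading.Thread):
    """Hypothetical thread object that pulls shared state from the module."""

    def __init__(self, module):
        super().__init__(daemon=True)
        # These attributes must already exist on the module at this point.
        self.stop_event = module.stop_event
        self.cache = module.cache

    def run(self):
        self.stop_event.wait()


class DemoModule:
    def __init__(self):
        self.stop_event = threading.Event()
        self.cache = None      # placeholder, filled in by init()
        self.reporter = None   # placeholder, filled in by init_threads()
        self.init()

    def init(self):
        """Phase 1: create shared properties."""
        self.cache = {}
        self.init_threads()

    def init_threads(self):
        """Phase 2: create (but do not start) the thread objects."""
        self.reporter = DemoReporter(self)
```

Note that nothing is started here: as in the walkthrough, construction and `start()` are kept separate, so the threads only begin running when a caller such as run_threads decides to start them.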
As its last step, init_threads() creates the heartbeat thread with HeartbeatThread.HeartbeatThread(self)
self.heartbeat_thread = HeartbeatThread.HeartbeatThread(self)
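Now let's look at the HeartbeatThread heartbeat thread
These listeners are all subclasses of the listener class defined in /usr/lib/ambari-agent/lib/ambari_stomp/listener.py
Take the CommandsEventListener above as an example
ambari_stomp.ConnectionListener is as follows
In its docstring we can see: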
This class should be used as a base class for objects registered
using Connection.set_listener().
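The base-class-plus-set_listener arrangement that the docstring describes can be illustrated with a toy model. Everything below is an assumption made for illustration and is not the ambari_stomp API: the classes are simplified stand-ins, `dispatch` is a made-up helper, and the "/commands" destination is invented.

```python
class ConnectionListener:
    """Toy base class: every callback is a no-op unless a subclass overrides it."""

    def on_connected(self, headers, body):
        pass

    def on_message(self, headers, body):
        pass


class DemoCommandsListener(ConnectionListener):
    """Hypothetical listener that only cares about one destination."""

    def __init__(self):
        self.received = []

    def on_message(self, headers, body):
        if headers.get("destination") == "/commands":
            self.received.append(body)


class Connection:
    """Toy connection that fans each incoming message out to its listeners."""

    def __init__(self):
        self._listeners = {}

    def set_listener(self, name, listener):
        self._listeners[name] = listener

    def dispatch(self, headers, body):
        for listener in self._listeners.values():
            listener.on_message(headers, body)


conn = Connection()
commands = DemoCommandsListener()
conn.set_listener("commands", commands)
conn.set_listener("noop", ConnectionListener())  # base class silently ignores everything
conn.dispatch({"destination": "/commands"}, "restart")
conn.dispatch({"destination": "/other"}, "ignored")
```

The point of the no-op base class is that each concrete listener overrides only the callbacks it needs, while the connection can call every callback on every registered listener without checking whether it exists.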
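Back in HeartbeatThread, the handle_heartbeat_reponse method checks whether the serverId in the heartbeat response is consecutive with the previous one. If it is not, the method logs an Error and restarts the agent; if it is, HeartbeatThread.py carries on running.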
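The continuity check can be sketched as follows. This is a minimal sketch and not the code in HeartbeatThread.py: the `serverId` key, the starting value of -1, and the `AgentRestartRequired` exception (standing in for "restart the agent") are all assumptions.

```python
import logging

logger = logging.getLogger(__name__)


class AgentRestartRequired(Exception):
    """Hypothetical signal that the caller should restart the agent."""


class ResponseIdTracker:
    def __init__(self, last_id=-1):
        # Assumption: the first accepted response carries id last_id + 1.
        self.last_id = last_id

    def handle_response(self, response):
        server_id = int(response["serverId"])  # key name is an assumption
        expected = self.last_id + 1
        if server_id != expected:
            logger.error("Out-of-order serverId: expected %s, got %s", expected, server_id)
            raise AgentRestartRequired(server_id)
        # Consecutive id: remember it and let the heartbeat loop continue.
        self.last_id = server_id
        return server_id
```

With a fresh tracker, responses carrying ids 0, 1, 2 are accepted in turn, while a jump to 5 logs an error and raises, leaving `last_id` at 2.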
At this point the heartbeat communication is complete