Reference article: https://blog.csdn.net/u011812294/article/details/60878890
bash bin/flume-ng agent --conf-file conf/test-hbase.conf --name agent -Dflume.root.logger=INFO,console &
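This is the command we use to start the agent.
The class that ultimately gets launched is org.apache.flume.node.Application. The relevant part of its main method: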
List<LifecycleAware> components = Lists.newArrayList();
if (reload) { // usually true
EventBus eventBus = new EventBus(agentName + "-event-bus");
PollingPropertiesFileConfigurationProvider configurationProvider =
new PollingPropertiesFileConfigurationProvider(
agentName, configurationFile, eventBus, 30);
components.add(configurationProvider);
application = new Application(components);
eventBus.register(application);
} else {
PropertiesFileConfigurationProvider configurationProvider =
new PropertiesFileConfigurationProvider(agentName, configurationFile);
application = new Application();
application.handleConfigurationEvent(configurationProvider.getConfiguration());
}
}
application.start();
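Note
Without --no-reload-conf, Flume checks the agent's configuration file every 30 seconds and reloads it when it has changed.
If you are worried that automatic reloading may occasionally cause problems, add --no-reload-conf when starting Flume to disable it. This option only applies to the Apache distribution of Flume.
Reference: https://blog.csdn.net/liuxiao723846/article/details/64128382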
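The relationship between the flag and the `reload` variable in main can be sketched as follows. This is only an illustration using the JDK: the class `ReloadFlagSketch` and the method `reloadEnabled` are hypothetical names, and this is not how Flume itself parses its options.

```java
import java.util.Arrays;

// Hypothetical helper, not Flume's actual option parser.
class ReloadFlagSketch {
    // Reloading is on by default and is switched off only by --no-reload-conf.
    static boolean reloadEnabled(String[] args) {
        return !Arrays.asList(args).contains("--no-reload-conf");
    }

    public static void main(String[] args) {
        String[] defaults = {"agent", "--conf-file", "conf/test-hbase.conf", "--name", "agent"};
        String[] noReload = {"agent", "--conf-file", "conf/test-hbase.conf", "--name", "agent",
                             "--no-reload-conf"};
        System.out.println("default:          reload=" + reloadEnabled(defaults));
        System.out.println("--no-reload-conf: reload=" + reloadEnabled(noReload));
    }
}
```

With `reload` true, main takes the first branch and builds the polling provider; with it false, the configuration is read once and never re-checked.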
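Only the main code is discussed here. Before this point, main has already parsed and wrapped the arguments from the command line (--conf-file conf/test-hbase.conf --name agent -Dflume.root.logger=INFO,console).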
The PollingPropertiesFileConfigurationProvider instance is put into components via components.add().
components is then handed to application, so at this point application holds only components, that is, the PollingPropertiesFileConfigurationProvider.
public class PollingPropertiesFileConfigurationProvider
extends PropertiesFileConfigurationProvider
implements LifecycleAware {
Note that it implements the LifecycleAware interface. Application.start() hands every component to the supervisor:
public void start() {
lifecycleLock.lock();
try {
for (LifecycleAware component : components) {
supervisor.supervise(component,
new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
}
} finally {
lifecycleLock.unlock();
}
}
Inside supervisor.supervise():
MonitorRunnable monitorRunnable = new MonitorRunnable();
monitorRunnable.lifecycleAware = lifecycleAware; // note this
monitorRunnable.supervisoree = process;
monitorRunnable.monitorService = monitorService;
supervisedProcesses.put(lifecycleAware, process);
ScheduledFuture future = monitorService.scheduleWithFixedDelay(
monitorRunnable, 0, 3, TimeUnit.SECONDS);
monitorFutures.put(lifecycleAware, future);
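supervise() creates a MonitorRunnable and passes in the component taken from the loop `for (LifecycleAware component : components)` in start(), then schedules it to run every 3 seconds.
At this point lifecycleAware is the PollingPropertiesFileConfigurationProvider.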
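The supervision idea can be sketched with the JDK alone: a scheduled task keeps checking a component and calls start() when it is not yet running. This is a simplified illustration, not Flume's LifecycleSupervisor; `Component`, `CountingComponent`, `supervise` and `runDemo` are hypothetical names, and the 50 ms delay is chosen only to keep the demo short.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SupervisorSketch {
    // Hypothetical stand-in for Flume's LifecycleAware.
    interface Component {
        void start();
        boolean isStarted();
    }

    // Hypothetical component that counts how often start() is called.
    static class CountingComponent implements Component {
        final AtomicInteger starts = new AtomicInteger();
        final CountDownLatch started = new CountDownLatch(1);
        private volatile boolean running;

        public void start() {
            starts.incrementAndGet();
            running = true;
            started.countDown();
        }

        public boolean isStarted() {
            return running;
        }
    }

    // Schedules a monitor task: first check immediately, then again after each fixed delay.
    static ScheduledExecutorService supervise(Component component, long delayMillis) {
        ScheduledExecutorService monitor = Executors.newSingleThreadScheduledExecutor();
        monitor.scheduleWithFixedDelay(() -> {
            if (!component.isStarted()) {
                component.start();
            }
        }, 0, delayMillis, TimeUnit.MILLISECONDS);
        return monitor;
    }

    // Runs the monitor for a short while and reports how many times start() was called.
    static int runDemo() {
        CountingComponent provider = new CountingComponent();
        ScheduledExecutorService monitor = supervise(provider, 50);
        try {
            provider.started.await(2, TimeUnit.SECONDS);
            Thread.sleep(200); // let the monitor perform a few more checks
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            monitor.shutdownNow();
        }
        return provider.starts.get();
    }

    public static void main(String[] args) {
        System.out.println("start() calls: " + runDemo());
    }
}
```

Although the monitor keeps running, start() is called only once, because later checks find the component already started.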
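public static class MonitorRunnable implements Runnable {
  public void run() {
    // ... (state checks omitted)
    switch (supervisoree.status.desiredState) {
      case START:
        try {
          lifecycleAware.start();
        } catch (Throwable e) {
          // ... (error handling omitted)
        }
        break;
      // ... (other cases omitted)
    }
  }
}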
One of the implementations of LifecycleAware is exactly the PollingPropertiesFileConfigurationProvider described above, so this call lands in its start() method:
public void start() {
LOGGER.info("Configuration provider starting");
Preconditions.checkState(file != null,
"The parameter file must not be null");
executorService = Executors.newSingleThreadScheduledExecutor(
new ThreadFactoryBuilder().setNameFormat("conf-file-poller-%d")
.build());
FileWatcherRunnable fileWatcherRunnable =
new FileWatcherRunnable(file, counterGroup);
executorService.scheduleWithFixedDelay(fileWatcherRunnable, 0, interval,
TimeUnit.SECONDS);
lifecycleState = LifecycleState.START;
LOGGER.debug("Configuration provider started");
}
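This starts a single-threaded scheduler that runs fileWatcherRunnable immediately (the initial delay is 0) and then again 30 seconds after each run finishes (interval is 30 here). Its run() method: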
public void run() {
LOGGER.debug("Checking file:{} for changes", file);
counterGroup.incrementAndGet("file.checks");
long lastModified = file.lastModified();
if (lastModified > lastChange) {
LOGGER.info("Reloading configuration file:{}", file);
counterGroup.incrementAndGet("file.loads");
lastChange = lastModified;
try {
eventBus.post(getConfiguration());
} catch (Exception e) {
LOGGER.error("Failed to load configuration data. Exception follows.",
e);
} catch (NoClassDefFoundError e) {
LOGGER.error("Failed to start agent because dependencies were not " +
"found in classpath. Error follows.", e);
} catch (Throwable t) {
// caught because the caller does not handle or log Throwables
LOGGER.error("Unhandled error", t);
}
}
}
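The check in run() boils down to comparing the file's modification time with the last one seen. Below is a minimal sketch of that idea using only the JDK. `FileWatcherSketch`, `onChange` and `demo` are hypothetical names, and the callback merely stands in for eventBus.post(getConfiguration()).

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

class FileWatcherSketch implements Runnable {
    private final File file;
    private final Runnable onChange; // hypothetical callback standing in for eventBus.post(...)
    private long lastChange;         // starts at 0, so the first check always triggers a load

    FileWatcherSketch(File file, Runnable onChange) {
        this.file = file;
        this.onChange = onChange;
    }

    @Override
    public void run() {
        long lastModified = file.lastModified();
        if (lastModified > lastChange) {
            lastChange = lastModified;
            onChange.run();
        }
    }

    // Calls run() three times by hand and returns how many loads were triggered.
    static int demo() {
        try {
            File conf = File.createTempFile("agent", ".conf");
            conf.deleteOnExit();
            AtomicInteger loads = new AtomicInteger();
            FileWatcherSketch watcher = new FileWatcherSketch(conf, loads::incrementAndGet);
            watcher.run(); // first check: loads
            watcher.run(); // file unchanged: no load
            conf.setLastModified(conf.lastModified() + 10_000); // simulate an edit
            watcher.run(); // modification time moved forward: loads again
            return loads.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("loads: " + demo());
    }
}
```

Because lastChange starts at 0, the very first poll always loads the configuration, which is how the agent gets its initial configuration in reload mode.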
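In FileWatcherRunnable.run(), focus only on the call inside the try block: eventBus.post(getConfiguration());
Here an EventBus appears, so a word on the name. Spark also has components named "bus" (for example its listener bus). The metaphor is that of a bus: the publisher puts an object on it, and the bus carries it to every registered subscriber. The bus used here is Guava's EventBus, and what it carries is the configuration object, not Flume's data events.
Where else does this eventBus appear? In Application's main method: eventBus.register(application);
The getConfiguration() method belongs to AbstractConfigurationProvider, an ancestor of PollingPropertiesFileConfigurationProvider (via PropertiesFileConfigurationProvider).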
public MaterializedConfiguration getConfiguration() {
  // ... (creation of conf and loading of channels, sources and sinks omitted)
  // executed for each configured channel:
  conf.addChannel(channelName, channelComponent.channel);
  for (Map.Entry<String, SourceRunner> entry : sourceRunnerMap.entrySet()) {
    conf.addSourceRunner(entry.getKey(), entry.getValue());
  }
  for (Map.Entry<String, SinkRunner> entry : sinkRunnerMap.entrySet()) {
    conf.addSinkRunner(entry.getKey(), entry.getValue());
  }
  return conf;
}
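Only the main code is pasted here. It fills the channel, sourceRunner and sinkRunner fields of conf (a MaterializedConfiguration)
with the source, sink and channel objects built from the configuration file we wrote on the Linux machine.
So far we still have not seen any channel, sink or source being started. What is going on?
Note that Application's main method contains
eventBus.register(application);
and the last call we traced to was
eventBus.post(getConfiguration());
For an explanation of EventBus, this article is recommended: https://www.jianshu.com/p/348ff06f42f6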
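@Subscribe
public void handleConfigurationEvent(MaterializedConfiguration conf) {
  try {
    lifecycleLock.lockInterruptibly();
    stopAllComponents();
    startAllComponents(conf); // this is where the components are actually started
  } catch (InterruptedException e) {
    // ... (handling omitted)
    return;
  } finally {
    // release the lock only if this thread actually holds it
    if (lifecycleLock.isHeldByCurrentThread()) {
      lifecycleLock.unlock();
    }
  }
}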
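Put simply, this is a publish-subscribe pattern.
application is registered with the eventBus as a subscriber.
The object passed to eventBus.post() is a MaterializedConfiguration.
@Subscribe marks the method that receives objects posted on the eventBus, here the one posted by post(conf).
For example, eventBus.post("string") is delivered only to @Subscribe methods whose single parameter can accept a String, such as (String a), because EventBus dispatches events by parameter type.
So after eventBus.post(getConfiguration()), handleConfigurationEvent is invoked.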
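To make the type-based dispatch concrete, here is a small hand-rolled bus written with JDK reflection only. It is a sketch of the idea, not Guava's implementation: `MiniBusSketch`, the `@Handles` annotation (playing the role of @Subscribe), `Receiver` and `demo` are all hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class MiniBusSketch {
    // Hypothetical marker annotation playing the role of Guava's @Subscribe.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Handles {}

    private final List<Object> subscribers = new ArrayList<>();

    void register(Object subscriber) {
        subscribers.add(subscriber);
    }

    // Invokes every annotated one-argument method whose parameter type accepts the event.
    void post(Object event) {
        for (Object subscriber : subscribers) {
            for (Method m : subscriber.getClass().getDeclaredMethods()) {
                if (m.isAnnotationPresent(Handles.class)
                        && m.getParameterCount() == 1
                        && m.getParameterTypes()[0].isInstance(event)) {
                    try {
                        m.setAccessible(true);
                        m.invoke(subscriber, event);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
    }

    // Hypothetical subscriber with two handlers of different parameter types.
    static class Receiver {
        final List<String> seen = new ArrayList<>();

        @Handles
        public void onText(String text) {
            seen.add("text:" + text);
        }

        @Handles
        public void onNumber(Integer number) {
            seen.add("number:" + number);
        }
    }

    // Posts a String and an Integer and returns what the receiver recorded.
    static List<String> demo() {
        MiniBusSketch bus = new MiniBusSketch();
        Receiver receiver = new Receiver();
        bus.register(receiver);
        bus.post("conf"); // reaches onText only
        bus.post(42);     // reaches onNumber only
        return receiver.seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the Flume case the posted object is a MaterializedConfiguration, so the only handler on Application that matches is handleConfigurationEvent(MaterializedConfiguration).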
private void startAllComponents(MaterializedConfiguration materializedConfiguration) {
logger.info("Starting new configuration:{}", materializedConfiguration);
this.materializedConfiguration = materializedConfiguration;
for (Entry<String, Channel> entry :
materializedConfiguration.getChannels().entrySet()) {
try {
logger.info("Starting Channel " + entry.getKey());
supervisor.supervise(entry.getValue(),
new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
} catch (Exception e) {
logger.error("Error while starting {}", entry.getValue(), e);
}
}
/*
* Wait for all channels to start.
*/
for (Channel ch : materializedConfiguration.getChannels().values()) {
while (ch.getLifecycleState() != LifecycleState.START
&& !supervisor.isComponentInErrorState(ch)) {
try {
logger.info("Waiting for channel: " + ch.getName() +
" to start. Sleeping for 500 ms");
Thread.sleep(500);
} catch (InterruptedException e) {
logger.error("Interrupted while waiting for channel to start.", e);
Throwables.propagate(e);
}
}
}
for (Entry<String, SinkRunner> entry : materializedConfiguration.getSinkRunners().entrySet()) {
try {
logger.info("Starting Sink " + entry.getKey());
supervisor.supervise(entry.getValue(),
new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
} catch (Exception e) {
logger.error("Error while starting {}", entry.getValue(), e);
}
}
for (Entry<String, SourceRunner> entry :
materializedConfiguration.getSourceRunners().entrySet()) {
try {
logger.info("Starting Source " + entry.getKey());
supervisor.supervise(entry.getValue(),
new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
} catch (Exception e) {
logger.error("Error while starting {}", entry.getValue(), e);
}
}
this.loadMonitoring();
}
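After this long detour, this is where the channel, sink and source objects are taken out of the assembled conf and started: channels first (waiting until each has started), then sinks, then sources. Each one is handed to the supervisor, which ends up calling its start() method in the same way as for the configuration provider: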
supervisor.supervise(entry.getValue(),
new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
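The start order can be illustrated with a small sketch. A plausible reason for the order is that the channel must be ready before anything reads from or writes to it, and sources come last so that no events arrive before the rest of the pipeline is up. `StartOrderSketch`, `Conf`, `startAll` and `demo` are hypothetical names, and plain Runnable objects stand in for the real components.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StartOrderSketch {
    // Hypothetical holder loosely modelled on MaterializedConfiguration.
    static class Conf {
        final Map<String, Runnable> channels = new LinkedHashMap<>();
        final Map<String, Runnable> sinkRunners = new LinkedHashMap<>();
        final Map<String, Runnable> sourceRunners = new LinkedHashMap<>();
    }

    // Starts channels first, then sinks, then sources.
    static void startAll(Conf conf) {
        conf.channels.values().forEach(Runnable::run);
        conf.sinkRunners.values().forEach(Runnable::run);
        conf.sourceRunners.values().forEach(Runnable::run);
    }

    // Registers components in a different order than they are started and records the start order.
    static List<String> demo() {
        List<String> order = new ArrayList<>();
        Conf conf = new Conf();
        conf.sourceRunners.put("r1", () -> order.add("source r1"));
        conf.sinkRunners.put("k1", () -> order.add("sink k1"));
        conf.channels.put("c1", () -> order.add("channel c1"));
        startAll(conf);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch starts everything synchronously, whereas the real method only registers each component with the supervisor and polls the channels until they report that they have started.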