flume源码分析

flume是一个高可靠性的分布式的大文件收集系统。它提供了transaction来保证数据不会丢失。

flume官网:http://flume.apache.org/

Flume文档:http://flume.apache.org/FlumeUserGuide.htmlhttp://flume.apache.org/FlumeDeveloperGuide.html

 

安装:从官网下载flume,然后解压

启动:nohup bin/flume-ng agent --conf <conf_file_path> --conf-file <conf_file> --name <agent_name> -Dflume.root.logger=DEBUG,console &

 

Flume主要包含三部分:source,channel,sink. source 用于接收数据,channel是一个缓冲通道,sink发送数据到目的端。source可以配置多个channel。channel可以通过channelSelect来选择发往那个channel。可以配置往每个channel发送,也可以配置一个参数,当满足特定值时,发往某个channel。每个channel可以配置多个sink。通过sinkprocess来做load balance,或者failover。

 

flume-ng命令会调用Application的main函数,如果需要reload configure 文件,则注册application到eventBus中,当文件变更时,调用application的handleConfigurationEvent方法

 public static void main(String[] args) {
      Application application;
      if(reload) {
        EventBus eventBus = new EventBus(agentName + "-event-bus");
        PollingPropertiesFileConfigurationProvider configurationProvider =
            new PollingPropertiesFileConfigurationProvider(agentName,
                configurationFile, eventBus, 30);
        components.add(configurationProvider);
        application = new Application(components);
        eventBus.register(application);
      } else {
        PropertiesFileConfigurationProvider configurationProvider =
            new PropertiesFileConfigurationProvider(agentName,
                configurationFile);
        application = new Application();
        application.handleConfigurationEvent(configurationProvider.getConfiguration());
      }
      application.start();
  }

    application中的start方法会调用supervisor.supervise(),这个方法会尝试调用component的start方法,component列表中包含了PollingPropertiesFileConfigurationProvider对象,这个对象的start方法启动了一个线程来监控文件的变更,初始状态文件是变更的,接着就会调用application的handleConfigurationEvent方法

  public synchronized void start() {
    for(LifecycleAware component : components) {
      supervisor.supervise(component,
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
    }
  }

  在 handleConfigurationEvent中先调用 PropertiesFileConfigurationProvider的getConfiguration方法,这个方法通过配置文件创建了source,sink,channel,并调用了各个组件的configure方法,然后  调用了startAllComponents方法,启动了channel,source,sink,并且加载了monitor,用于监控flume的metrics

 private void startAllComponents(MaterializedConfiguration materializedConfiguration) {
    logger.info("Starting new configuration:{}", materializedConfiguration);

    this.materializedConfiguration = materializedConfiguration;

    for (Entry<String, Channel> entry :
      materializedConfiguration.getChannels().entrySet()) {
      try{
        logger.info("Starting Channel " + entry.getKey());
        supervisor.supervise(entry.getValue(),
            new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
      } catch (Exception e){
        logger.error("Error while starting {}", entry.getValue(), e);
      }
    }

    /*
     * Wait for all channels to start.
     */
    for(Channel ch: materializedConfiguration.getChannels().values()){
      while(ch.getLifecycleState() != LifecycleState.START
          && !supervisor.isComponentInErrorState(ch)){
        try {
          logger.info("Waiting for channel: " + ch.getName() +
              " to start. Sleeping for 500 ms");
          Thread.sleep(500);
        } catch (InterruptedException e) {
          logger.error("Interrupted while waiting for channel to start.", e);
          Throwables.propagate(e);
        }
      }
    }

    for (Entry<String, SinkRunner> entry : materializedConfiguration.getSinkRunners()
        .entrySet()) {
      try{
        logger.info("Starting Sink " + entry.getKey());
        supervisor.supervise(entry.getValue(),
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
      } catch (Exception e) {
        logger.error("Error while starting {}", entry.getValue(), e);
      }
    }

    for (Entry<String, SourceRunner> entry : materializedConfiguration
        .getSourceRunners().entrySet()) {
      try{
        logger.info("Starting Source " + entry.getKey());
        supervisor.supervise(entry.getValue(),
          new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
      } catch (Exception e) {
        logger.error("Error while starting {}", entry.getValue(), e);
      }
    }

    this.loadMonitoring();
  }

 

你可能感兴趣的:(Flume)