Sqoop extraction source code walkthrough (1): the main flow and the various plugins

1. Inheritance hierarchy of the tool classes

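The tools that Sqoop currently provides, such as ImportTool, ExportTool and CodeGenTool, all inherit from the base class BaseSqoopTool.
Let's look at the inheritance chain of BaseSqoopTool itself. Note that the two classes named SqoopTool live in different packages: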


public abstract class BaseSqoopTool
    extends com.cloudera.sqoop.tool.SqoopTool {
......
public abstract class SqoopTool
    extends org.apache.sqoop.tool.SqoopTool {
......
public abstract class SqoopTool {
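
To make the three-level chain easier to picture, here is a minimal sketch using stand-in classes. All class and method names below (RootTool, CompatTool, BaseTool, DemoImportTool, hasFlag) are hypothetical and are not Sqoop's real API. Treating the middle class as a thin pass-through layer is my reading of the excerpt, not something the source states.

```java
import java.util.Arrays;

// Plays the role of org.apache.sqoop.tool.SqoopTool (root of the hierarchy).
abstract class RootTool {
    private final String toolName;

    RootTool(String toolName) {
        this.toolName = toolName;
    }

    String getToolName() {
        return toolName;
    }

    abstract int run(String[] args);
}

// Plays the role of com.cloudera.sqoop.tool.SqoopTool: adds nothing in this sketch.
abstract class CompatTool extends RootTool {
    CompatTool(String toolName) {
        super(toolName);
    }
}

// Plays the role of BaseSqoopTool: shared helpers for every concrete tool.
abstract class BaseTool extends CompatTool {
    BaseTool(String toolName) {
        super(toolName);
    }

    protected boolean hasFlag(String[] args, String flag) {
        return Arrays.asList(args).contains(flag);
    }
}

// Hypothetical concrete tool, standing in for something like ImportTool.
class DemoImportTool extends BaseTool {
    DemoImportTool() {
        super("import");
    }

    @Override
    int run(String[] args) {
        // Returns a non-zero exit code only when the made-up --fail flag is present.
        return hasFlag(args, "--fail") ? 1 : 0;
    }
}
```

Because every concrete tool is ultimately a RootTool, the caller can hold any tool through the root type and call run on it without knowing which tool it is.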

2. Sqoop's main function

2.1 Inheritance of the Sqoop class

public class Sqoop extends Configured implements Tool {
......

Tool and Configured are base types from Hadoop; after all, Sqoop relies on Hadoop for the actual implementation.

2.2 The main function

  public static void main(String [] args) {
    if (args.length == 0) {
      System.err.println("Try 'sqoop help' for usage.");
      System.exit(1);
    }

    int ret = runTool(args);
    System.exit(ret);
  }

2.3 Implementation of runTool

Based on the arguments passed to sqoop, runTool picks the name of the tool to use and constructs the corresponding tool class:

......
   SqoopTool tool = SqoopTool.getTool(toolName);
......
   Sqoop sqoop = new Sqoop(tool, pluginConf);

    return runSqoop(sqoop, Arrays.copyOfRange(expandedArgs, 1, expandedArgs.length));

The actual work then happens inside each tool class's own implementation.
For the details, see the separate analysis of ImportTool.
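
The dispatch step in runTool can be sketched roughly as follows. This is a simplified model and not Sqoop's code: the registry map, the handler lambdas and the class name ToolDispatchSketch are all hypothetical. The only detail taken directly from the excerpt is the use of Arrays.copyOfRange to drop the first argument (the tool name) before handing the rest to the tool.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ToolDispatchSketch {
    // Hypothetical registry: tool name -> handler that receives the remaining arguments
    // and returns an exit code.
    private static final Map<String, Function<String[], Integer>> TOOLS = new HashMap<>();

    static {
        TOOLS.put("import", rest -> 0);
        TOOLS.put("export", rest -> 0);
    }

    // Everything after the tool name belongs to the tool itself.
    static String[] toolArgs(String[] expandedArgs) {
        return Arrays.copyOfRange(expandedArgs, 1, expandedArgs.length);
    }

    // Assumes the caller already rejected an empty argument list, as main does.
    static int runTool(String[] args) {
        String toolName = args[0];
        Function<String[], Integer> tool = TOOLS.get(toolName);
        if (tool == null) {
            System.err.println("No such tool: " + toolName);
            return 1;
        }
        return tool.apply(toolArgs(args));
    }
}
```

With this sketch, a call such as runTool(new String[] {"import", "--table", "t"}) looks up the "import" handler and passes it only "--table" and "t".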

2.4 The hidden relationship


  public static int runSqoop(Sqoop sqoop, String [] args) {
    try {
      String [] toolArgs = sqoop.stashChildPrgmArgs(args);
      return ToolRunner.run(sqoop.getConf(), sqoop, toolArgs);
......

This run method is implemented in Hadoop (in ToolRunner), but its implementation below shows that what it ends up calling is still Sqoop's own run method:


  public static int run(Configuration conf, Tool tool, String[] args)
    throws Exception{
    if(conf == null) {
      conf = new Configuration();
    }
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);
    //set the configuration back, so that Tool can configure itself
    tool.setConf(conf);

    //get the args w/o generic hadoop args
    String[] toolArgs = parser.getRemainingArgs();
    return tool.run(toolArgs);
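
The callback relationship above can be modeled without Hadoop on the classpath. In the sketch below, MiniTool, MiniToolRunner and EchoTool are hypothetical stand-ins for Tool, ToolRunner and the Sqoop class. The handling of leading "-D key=value" pairs is a deliberately crude substitute for what GenericOptionsParser does and is an assumption made only for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.hadoop.util.Tool.
interface MiniTool {
    void setConf(Map<String, String> conf);

    int run(String[] args);
}

// Stand-in for ToolRunner: strips generic options, then calls back into the tool.
class MiniToolRunner {
    static int run(Map<String, String> conf, MiniTool tool, String[] args) {
        if (conf == null) {
            conf = new HashMap<>();
        }
        // Crude model of generic-option parsing: consume leading "-D key=value" pairs only.
        int i = 0;
        while (i + 1 < args.length && "-D".equals(args[i]) && args[i + 1].contains("=")) {
            String[] kv = args[i + 1].split("=", 2);
            conf.put(kv[0], kv[1]);
            i += 2;
        }
        // Hand the configuration back so the tool can configure itself.
        tool.setConf(conf);
        // The tool's own run method is what actually executes.
        return tool.run(Arrays.copyOfRange(args, i, args.length));
    }
}

// Stand-in for the Sqoop class: records what the runner gave it.
class EchoTool implements MiniTool {
    Map<String, String> conf;
    String[] seenArgs;

    @Override
    public void setConf(Map<String, String> conf) {
        this.conf = conf;
    }

    @Override
    public int run(String[] args) {
        this.seenArgs = args;
        return 0;
    }
}
```

The point of the sketch is the control flow: the runner owns argument pre-processing and configuration, but the last thing it does is call the run method of the object that was passed in.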

2.5 The real entry point


  public int run(String [] args) {
    if (options.getConf() == null) {
      // Configuration wasn't initialized until after the ToolRunner
      // got us to this point. ToolRunner gave Sqoop itself a Conf
      // though.
      options.setConf(getConf());
    }

    try {
      options = tool.parseArguments(args, null, options, false);
      tool.appendArgs(this.childPrgmArgs);
      tool.validateOptions(options);
    } catch (Exception e) {
      // Couldn't parse arguments.
      // Log the stack trace for this exception
      LOG.debug(e.getMessage(), e);
      // Print exception message.
      System.err.println(e.getMessage());
      return 1; // Exit on exception here.
    }

    return tool.run(options);
  }
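
The shape of this method (parse, validate, and only then run, with any argument problem turned into exit code 1) can be sketched in isolation. Everything in the block below is hypothetical: the Options holder, the single --table option and the class name RunSketch are invented for the example and do not reflect SqoopOptions.

```java
class RunSketch {
    // Hypothetical option holder with a single field for the demo.
    static class Options {
        String table;
    }

    // Turns raw arguments into an Options object, rejecting anything unknown.
    static Options parseArguments(String[] args) {
        Options options = new Options();
        for (int i = 0; i < args.length; i++) {
            if ("--table".equals(args[i])) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("--table requires a value");
                }
                options.table = args[++i];
            } else {
                throw new IllegalArgumentException("Unrecognized argument: " + args[i]);
            }
        }
        return options;
    }

    // Checks constraints that can only be judged once all arguments are parsed.
    static void validateOptions(Options options) {
        if (options.table == null) {
            throw new IllegalArgumentException("--table is required");
        }
    }

    // Stand-in for tool.run(options): the real work would happen here.
    static int runTool(Options options) {
        System.out.println("Would process table " + options.table);
        return 0;
    }

    static int run(String[] args) {
        Options options;
        try {
            options = parseArguments(args);
            validateOptions(options);
        } catch (IllegalArgumentException e) {
            // Print the message and stop before the tool does any work.
            System.err.println(e.getMessage());
            return 1;
        }
        return runTool(options);
    }
}
```

Keeping parsing and validation inside one try block means the tool body never sees a half-built options object.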

2.5.1 Argument parsing


  public SqoopOptions parseArguments(String [] args,
      Configuration conf, SqoopOptions in, boolean useGenericOptions)
      throws ParseException, SqoopOptions.InvalidOptionsException {
......
    // Parse tool-specific arguments.
    ToolOptions toolOptions = new ToolOptions();
    configureOptions(toolOptions);
    CommandLineParser parser = new SqoopParser();
    CommandLine cmdLine = parser.parse(toolOptions.merge(), toolArgs, true);
    applyOptions(cmdLine, out);
    this.extraArguments = cmdLine.getArgs();

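As the code shows, the actual argument parsing happens in the SqoopParser class.

So let's continue with how SqoopParser parses. The method shown here is processArgs, which is invoked during the parse call to collect the values that belong to one option: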


  public void processArgs(Option opt, ListIterator iter)
      throws ParseException {
    // Loop until an option is found.
    while (iter.hasNext()) {
      String str = (String) iter.next();
      if (getOptions().hasOption(str) && str.startsWith("-")) {
        // found an Option, not an argument.
        iter.previous();
        break;
      }

      // Otherwise, this is a value.
      try {
        // Note that we only strip matched quotes here.
        addValForProcessing.invoke(opt, stripMatchedQuotes(str));
      } catch (IllegalAccessException iae) {
        throw new RuntimeException(iae);
      } catch (java.lang.reflect.InvocationTargetException ite) {
        // Any runtime exception thrown within addValForProcessing()
        // will be wrapped in an InvocationTargetException.
        iter.previous();
        break;
      } catch (RuntimeException re) {
        iter.previous();
        break;
      }
    }

    if (opt.getValues() == null && !opt.hasOptionalArg()) {
      throw new MissingArgumentException(opt);
    }
  }
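
The loop above is easier to follow once the reflection and exception handling are taken out. The sketch below models the same idea with plain collections: keep consuming tokens as values until a registered option is seen, push that token back, and strip only quotes that match at both ends. The names ValueCollectorSketch, collectValues and maxValues are hypothetical, the body of stripMatchedQuotes is my own guess at the behavior its name suggests, and the maxValues check stands in for the case where adding a value fails because the option already has all the values it accepts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;
import java.util.Set;

class ValueCollectorSketch {
    // Removes one pair of quotes only when the same quote character opens and closes the string.
    static String stripMatchedQuotes(String in) {
        if (in == null || in.length() < 2) {
            return in;
        }
        char first = in.charAt(0);
        char last = in.charAt(in.length() - 1);
        if (first == last && (first == '"' || first == '\'')) {
            return in.substring(1, in.length() - 1);
        }
        return in;
    }

    // knownOptions holds the dashed spellings of registered options, e.g. "--table".
    static List<String> collectValues(Set<String> knownOptions,
                                      ListIterator<String> iter, int maxValues) {
        List<String> values = new ArrayList<>();
        while (iter.hasNext()) {
            String str = iter.next();
            if (knownOptions.contains(str) && str.startsWith("-")) {
                // Found the next option, not a value: push it back for the caller.
                iter.previous();
                break;
            }
            if (values.size() >= maxValues) {
                // Models the "cannot accept another value" failure path: push back and stop.
                iter.previous();
                break;
            }
            values.add(stripMatchedQuotes(str));
        }
        return values;
    }
}
```

One consequence that follows from the condition in the original loop: a token that starts with "-" but is not a registered option, such as -1, is kept as a value instead of being treated as the start of a new option.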
