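Last time we read main(), run(), and the CliDriver constructor. Picking up from there, it is time to read executeDriver. Everything before executeDriver deals with the command the user typed to launch Hive. From this point on the user is inside Hive, and CliDriver keeps reading the user's HiveQL statements, parsing them, and submitting them to the Driver. The core of executeDriver is a while loop that reads the user's input line by line, calls processLine to assemble a command cmd, and passes it to processCmd for handling. executeDriver and processLine will get their own write-up when time allows; here we look at processCmd.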
CliSessionState ss = (CliSessionState) SessionState.get();
ss.setLastCommand(cmd);
// Flush the print stream, so it doesn't include output from the last command
ss.err.flush();
String cmd_trimmed = cmd.trim();
String[] tokens = tokenizeCmd(cmd_trimmed);
// tokenizeCmd
private String[] tokenizeCmd(String cmd) {
return cmd.split("\\s+");
}
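The first step records the command as the last command of the current CLI session. The user's input is then split at whitespace (spaces, tabs, and so on) by tokenizeCmd, which yields an array of tokens.
tokenizeCmd() calls split("\\s+"), whose argument is a Java regular expression. \s matches any whitespace character, including space, tab, and form feed; in Java it is equivalent to [ \t\n\x0B\f\r], and the trailing + makes a run of whitespace count as a single separator. Why two backslashes? In a Java string literal the backslash is the escape prefix, so a literal backslash has to be written as \\ for the regex engine to receive \s.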
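To make the splitting behaviour concrete, here is a minimal sketch that reuses the same one-line split. The class name TokenizeDemo and the sample input are made up for illustration; only String.split and Arrays.toString from the JDK are used.

```java
import java.util.Arrays;

final class TokenizeDemo {
    // Same body as tokenizeCmd above: split on runs of whitespace.
    static String[] tokenizeCmd(String cmd) {
        return cmd.split("\\s+");
    }

    public static void main(String[] args) {
        // Trim first, as processCmd does, so there is no leading empty token.
        String cmd = "  select\t*  from\n t1 ".trim();
        System.out.println(Arrays.toString(tokenizeCmd(cmd)));
        // prints [select, *, from, t1]
    }
}
```

Two details are worth noting. Without the trim, leading whitespace would produce an empty first token. And splitting an empty string still returns an array of length one (containing the empty string), which is why reading tokens[0] is safe even for blank input.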
if (cmd_trimmed.toLowerCase().equals("quit") || cmd_trimmed.toLowerCase().equals("exit")) {
// if we have come this far - either the previous commands
// are all successful or this is command line. in either case
// this counts as a successful run
ss.close();
System.exit(0);
} else if (tokens[0].equalsIgnoreCase("source")) {
String cmd_1 = getFirstCmd(cmd_trimmed, tokens[0].length());
cmd_1 = new VariableSubstitution().substitute(ss.getConf(), cmd_1);
File sourceFile = new File(cmd_1);
if (! sourceFile.isFile()){
console.printError("File: "+ cmd_1 + " is not a file.");
ret = 1;
} else {
try {
ret = processFile(cmd_1);
} catch (IOException e) {
console.printError("Failed processing file "+ cmd_1 +" "+ e.getLocalizedMessage(),
stringifyException(e));
ret = 1;
}
}
} else if (cmd_trimmed.startsWith("!")) {
String shell_cmd = cmd_trimmed.substring(1);
shell_cmd = new VariableSubstitution().substitute(ss.getConf(), shell_cmd);
// shell_cmd = "/bin/bash -c \'" + shell_cmd + "\'";
try {
ShellCmdExecutor executor = new ShellCmdExecutor(shell_cmd, ss.out, ss.err);
ret = executor.execute();
if (ret != 0) {
console.printError("Command failed with exit code = " + ret);
}
} catch (Exception e) {
console.printError("Exception raised from Shell command " + e.getLocalizedMessage(),
stringifyException(e));
ret = 1;
}
}
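This code distinguishes three cases:
quit or exit: close the session and exit Hive
source: call processFile to read and run the given file
leading !: run the rest of the line through the system (Linux) shell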
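The three cases above can be summarised in a small stand-alone classifier. This is only a sketch of the branching order, not Hive code: the class DispatchSketch, the enum Kind, and the method classify are hypothetical names, and nothing is actually executed.

```java
final class DispatchSketch {
    // Hypothetical labels for the branches of processCmd.
    enum Kind { EXIT, SOURCE, SHELL, OTHER }

    static Kind classify(String cmd) {
        String trimmed = cmd.trim();
        String[] tokens = trimmed.split("\\s+");
        String lower = trimmed.toLowerCase();
        // quit/exit must be the whole command, not just its first word.
        if (lower.equals("quit") || lower.equals("exit")) {
            return Kind.EXIT;
        }
        // source is matched on the first token only, ignoring case.
        if (tokens[0].equalsIgnoreCase("source")) {
            return Kind.SOURCE;
        }
        // Anything starting with ! is handed to the shell.
        if (trimmed.startsWith("!")) {
            return Kind.SHELL;
        }
        // Everything else falls through to the final else branch.
        return Kind.OTHER;
    }
}
```

For example, classify("exit now") yields OTHER rather than EXIT, because the exit check compares the entire trimmed command.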
else { // local mode
try {
CommandProcessor proc = CommandProcessorFactory.get(tokens, (HiveConf) conf);
ret = processLocalCmd(cmd, proc, ss);
} catch (SQLException e) {
console.printError("Failed processing command " + tokens[0] + " " + e.getLocalizedMessage(),
org.apache.hadoop.util.StringUtils.stringifyException(e));
ret = 1;
}
}
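This block is the final branch of the same if-else chain as the code above. It is the entry into local mode, which is where the HiveQL statements we normally type are handled.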
public interface CommandProcessor {
void init();
CommandProcessorResponse run(String command) throws CommandNeedRetryException;
}
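CommandProcessor is an interface. CommandProcessorFactory takes the tokens produced from the user's command, together with the configuration, and returns a concrete implementation of CommandProcessor.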
public static CommandProcessor get(String[] cmd, HiveConf conf)
throws SQLException {
CommandProcessor result = getForHiveCommand(cmd, conf);
if (result != null) {
return result;
}
if (isBlank(cmd[0])) {
return null;
} else {
if (conf == null) {
return new Driver();
}
Driver drv = mapDrivers.get(conf);
if (drv == null) {
drv = new Driver();
mapDrivers.put(conf, drv);
}
drv.init();
return drv;
}
}
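getForHiveCommand first takes the first token, that is, the first word of the user's command, and matches it against the set of non-SQL operations defined in the HiveCommand enum to determine the HiveCommand type. It then picks the CommandProcessor implementation for that type, for example DfsProcessor for the dfs command and SetProcessor for the set command. If the user typed a SQL query such as select, getForHiveCommand returns null, and get itself selects an existing Driver instance for the configuration conf, or creates a new one, and returns it as the CommandProcessor.
The handling of the individual commands can be seen in CommandProcessorFactory: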
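The lookup-then-fallback shape of get can be sketched on its own. The code below is a simplified illustration under stated assumptions: BuiltIn stands in for the HiveCommand enum, Processor for CommandProcessor, and a plain string confKey for the HiveConf object; all of these names are hypothetical, and the real matching logic in Hive handles more cases than a simple enum lookup.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class FactorySketch {
    // Hypothetical stand-in for the HiveCommand enum (subset only).
    enum BuiltIn { SET, RESET, DFS, ADD, LIST, DELETE }

    // Hypothetical stand-in for CommandProcessor.
    interface Processor { String describe(); }

    // One cached "driver" per configuration key, like mapDrivers.
    private static final Map<String, Processor> driversByConf = new HashMap<>();

    // Returns the built-in command named by the first token, or null.
    static BuiltIn find(String[] tokens) {
        if (tokens.length == 0 || tokens[0].isEmpty()) {
            return null;
        }
        try {
            return BuiltIn.valueOf(tokens[0].toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return null; // not a built-in command, e.g. "select"
        }
    }

    static Processor get(String[] tokens, String confKey) {
        final BuiltIn builtIn = find(tokens);
        if (builtIn != null) {
            return () -> builtIn.name().toLowerCase(Locale.ROOT) + " processor";
        }
        if (tokens.length == 0 || tokens[0].trim().isEmpty()) {
            return null; // blank command: nothing to run
        }
        // SQL path: reuse the driver cached for this configuration.
        Processor driver = driversByConf.get(confKey);
        if (driver == null) {
            final String key = confKey;
            driver = () -> "driver for " + key;
            driversByConf.put(confKey, driver);
        }
        return driver;
    }
}
```

The point of the cache is that repeated SQL statements issued under the same configuration share one driver object, while the built-in commands get a fresh lightweight processor each time.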
switch (hiveCommand) {
case SET:
return new SetProcessor();
case RESET:
return new ResetProcessor();
case DFS:
SessionState ss = SessionState.get();
return new DfsProcessor(ss.getConf());
case ADD:
return new AddResourceProcessor();
case LIST:
return new ListResourceProcessor();
case DELETE:
return new DeleteResourceProcessor();
case COMPILE:
return new CompileProcessor();
case RELOAD:
return new ReloadProcessor();
case CRYPTO:
try {
return new CryptoProcessor(SessionState.get().getHdfsEncryptionShim(), conf);
} catch (HiveException e) {
throw new SQLException("Fail to start the command processor due to the exception: ", e);
}
default:
throw new AssertionError("Unknown HiveCommand " + hiveCommand);
}
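Back in processCmd, the processor that was returned is passed to processLocalCmd, whose core is the following loop:
do {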
try {
needRetry = false;
if (proc != null) {
// If the CommandProcessor is a Driver instance
if (proc instanceof Driver) {
Driver qp = (Driver) proc;
// Get the standard output stream used to print the results
PrintStream out = ss.out;
long start = System.currentTimeMillis();
if (ss.getIsVerbose()) {
out.println(cmd);
}
qp.setTryCount(tryCount);
// The Driver runs the user's command; read the response code from the result
ret = qp.run(cmd).getResponseCode();
if (ret != 0) {
qp.close();
return ret;
}
// Measure how long the command took
long end = System.currentTimeMillis();
double timeTaken = (end - start) / 1000.0;
ArrayList<String> res = new ArrayList<String>();
// Print the column names of the query result
printHeader(qp, out);
// Print the query result
int counter = 0;
try {
if (out instanceof FetchConverter) {
((FetchConverter)out).fetchStarted();
}
while (qp.getResults(res)) {
for (String r : res) {
out.println(r);
}
counter += res.size();
res.clear();
if (out.checkError()) {
break;
}
}
} catch (IOException e) {
console.printError("Failed with exception " + e.getClass().getName() + ":"
+ e.getMessage(), "\n"
+ org.apache.hadoop.util.StringUtils.stringifyException(e));
ret = 1;
}
// Close the Driver, releasing the results
int cret = qp.close();
if (ret == 0) {
ret = cret;
}
if (out instanceof FetchConverter) {
((FetchConverter)out).fetchFinished();
}
console.printInfo("Time taken: " + timeTaken + " seconds" +
(counter == 0 ? "" : ", Fetched: " + counter + " row(s)"));
} else {
// proc is not a Driver, so the user ran a non-SQL command: run it directly, with no result-fetching step
String firstToken = tokenizeCmd(cmd.trim())[0];
String cmd_1 = getFirstCmd(cmd.trim(), firstToken.length());
if (ss.getIsVerbose()) {
ss.out.println(firstToken + " " + cmd_1);
}
CommandProcessorResponse res = proc.run(cmd_1);
if (res.getResponseCode() != 0) {
ss.out.println("Query returned non-zero code: " + res.getResponseCode() +
", cause: " + res.getErrorMessage());
}
ret = res.getResponseCode();
}
}
} catch (CommandNeedRetryException e) {
// The command asked for a retry: set needRetry so the next loop iteration runs it again
console.printInfo("Retry query with a different approach...");
tryCount++;
needRetry = true;
}
} while (needRetry);
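Stripped of the printing and fetching, the loop above is a plain do-while retry driven by an exception. The following sketch isolates that control flow. NeedRetryException, Attempt, and runWithRetry are hypothetical names chosen for the example; NeedRetryException plays the role of CommandNeedRetryException.

```java
final class RetrySketch {
    // Hypothetical stand-in for CommandNeedRetryException.
    static class NeedRetryException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    // One attempt at running a command; receives the current try count.
    interface Attempt {
        int run(int tryCount) throws NeedRetryException;
    }

    static int runWithRetry(Attempt attempt) {
        int tryCount = 0;
        int ret = 0;
        boolean needRetry;
        do {
            try {
                needRetry = false;
                ret = attempt.run(tryCount);
            } catch (NeedRetryException e) {
                // Same bookkeeping as the catch block above.
                tryCount++;
                needRetry = true;
            }
        } while (needRetry);
        return ret;
    }
}
```

As in the original loop, there is no upper bound on the number of retries here: the loop ends only when an attempt finishes without throwing, so the code being retried is responsible for eventually giving up.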