Apache Drill: Source Code Walkthrough and Build

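I had previously read about the design of Apache Drill, an open-source implementation modeled on Google's Dremel; Cloudera appears to have built Impala on the same ideas.

I recently noticed on the Apache website that the Drill source is now available for download, so here is a look at it.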

http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-1.0.0-m1-incubating/

Code layout:

The top-level directories include: src, sqlparser, exec, distribution, contrib, common, sample-data.

INSTALL.md lists the following requirements:

Java 7+

protoc 2.5.x compiler (Google's Protocol Buffers compiler, version 2.5.x)

Maven 3.0+
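Before starting the build, it can save time to confirm that the three tools are on the PATH. The following is a minimal sketch, not part of the Drill sources; check_tool is a hypothetical helper name, and it only tests for presence, not for the exact versions.

```shell
#!/bin/sh
# Report whether each build prerequisite named in INSTALL.md is on the PATH.
# check_tool is a hypothetical helper; it checks presence only, not versions.

check_tool() {
    # $1 = command name, $2 = requirement as stated in INSTALL.md
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 ($2)"
    else
        echo "missing: $1 ($2)"
    fi
}

check_tool java   "Java 7+"
check_tool protoc "protoc 2.5.x"
check_tool mvn    "Maven 3.0+"
```

Any line that starts with "missing" points at a dependency to install before running Maven.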

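The source can be downloaded directly and then built and installed with mvn clean install.


Extract drill-1.0.0xxx.tar.gz into the /drill/ directory.

Running ./sqlline directly starts downloading the required packages.

It eventually fails with these errors: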

[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
cat: .classpath: No such file or directory
Exception in thread "main" java.lang.NoClassDefFoundError: sqlline/SqlLine
Caused by: java.lang.ClassNotFoundException: sqlline.SqlLine

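It looks like the documented steps really do have to be followed one at a time.

Run the command:

mvn install [starts downloading the required dependency versions]

The Java version check fails, so upgrade to or install Java 7.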

[root@pg2 jdk1.7]# pwd
/drill/jdk1.7
Edit /etc/profile:

[root@pg2 jdk1.7]# vi /etc/profile
[root@pg2 jdk1.7]# source /etc/profile
[root@pg2 jdk1.7]# 
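The transcript does not show what was added to /etc/profile. A typical addition, assuming the JDK lives in /drill/jdk1.7 as the pwd output suggests, might look like the sketch below; the exact lines are an assumption, not the original edit.

```shell
# Assumed additions to /etc/profile (the original edit is not shown).
# /drill/jdk1.7 is the JDK directory taken from the pwd output.
export JAVA_HOME=/drill/jdk1.7
# Put the new JDK ahead of any older java already on the PATH.
export PATH="$JAVA_HOME/bin:$PATH"
```

After source /etc/profile, running java -version should report the Java 7 JDK if the path is correct.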

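Then run mvn install again.

This time the error is that the protobuf compiler is missing.

According to Drill's INSTALL.md, the dependency is the protoc 2.5.x compiler.


Next, prepare it:

Install protobuf, which can be downloaded from http://code.google.com/p/protobuf/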

http://code.google.com/p/protobuf/downloads/detail?name=protobuf-2.5.0.tar.gz

The file needed is protobuf-2.5.0.tar.gz:

tar zxvf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure
make
make check
make install
Installation is complete.
To verify that it succeeded, run protoc --version:

[root@pg2 protobuf-2.5.0]# protoc --version
libprotoc 2.5.0
[root@pg2 protobuf-2.5.0]# 
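Since INSTALL.md asks specifically for protoc 2.5.x, the version string can also be checked in a script. This is a small sketch; is_protoc_25 is a hypothetical helper name, and it assumes the "libprotoc X.Y.Z" output format shown in the transcript.

```shell
#!/bin/sh
# Succeeds only when the given version string is a 2.5.x release.
is_protoc_25() {
    # $1 = output of `protoc --version`, e.g. "libprotoc 2.5.0"
    case "$1" in
        "libprotoc 2.5."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query protoc when it is actually installed.
if command -v protoc >/dev/null 2>&1; then
    if is_protoc_25 "$(protoc --version)"; then
        echo "protoc version OK"
    else
        echo "protoc found, but not 2.5.x"
    fi
fi
```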

Then continue:

mvn install

It ends with the following error:


[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.15:test (default-test) on project common: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plugin:2.15:test failed: The forked VM terminated without saying properly goodbye. VM crash or System.exit called ?
[ERROR] Command was/bin/sh -c cd /drill/drill/common && /drill/jdk1.7/jre/bin/java -XX:MaxDirectMemorySize=4096M org.apache.maven.surefire.booter.ForkedBooter /drill/drill/common/target/surefire/surefire5854881226327277986tmp /drill/drill/common/target/surefire/surefire_02131510559628541214tmp
[ERROR] -> [Help 1]

Next, try fetching the source with git clone instead:

 git clone https://github.com/apache/incubator-drill.git

[root@pg2 incubator-drill]# pwd
/drill/incubator-drill
[root@pg2 incubator-drill]# ls
common   distribution  header      KEYS     NOTICE   protocol   sample-data  sqlline    src     tools
contrib  exec          INSTALL.md  LICENSE  pom.xml  README.md  sandbox      sqlparser  target
[root@pg2 incubator-drill]# 
Run the build:

    cd incubator-drill
    mvn clean install

The build fails with this error:

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Invalid maximum direct memory size: -XX:MaxDirectMemorySize=4096M
The specified size exceeds the maximum representable size.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

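The relevant part of pom.xml is:

        <plugin>
          <artifactId>maven-surefire-plugin</artifactId>
          <version>2.15</version>
          <configuration>
            <argLine>-XX:MaxDirectMemorySize=4096M</argLine>
          </configuration>
        </plugin>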

The CentOS 6.3 machine has only 1128M of memory allocated to it.
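As a quick sanity check before picking a smaller value, the total memory visible to the OS can be read from /proc/meminfo. This is a minimal sketch that assumes a Linux system; mem_total_mb is a hypothetical helper name.

```shell
#!/bin/sh
# Print total system memory in whole megabytes.
mem_total_mb() {
    # $1 = meminfo-style file; defaults to /proc/meminfo on Linux
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Only read /proc/meminfo where it exists (Linux).
if [ -r /proc/meminfo ]; then
    echo "Total memory: $(mem_total_mb) MB"
fi
```

The file argument exists only so the function can be exercised against a saved copy of meminfo.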

So try changing the setting to -XX:MaxDirectMemorySize=896M.
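The change can be made by hand in an editor. For repeated builds, it could also be scripted with sed, as in this sketch; set_direct_memory is a hypothetical helper name, and it assumes the option appears in the file exactly in the -XX:MaxDirectMemorySize=<number><unit> form.

```shell
#!/bin/sh
# Rewrite the -XX:MaxDirectMemorySize value in a file, keeping a .bak backup.
set_direct_memory() {
    # $1 = path to pom.xml, $2 = new size such as 896M
    sed -i.bak \
        "s/-XX:MaxDirectMemorySize=[0-9]*[A-Za-z]*/-XX:MaxDirectMemorySize=$2/" \
        "$1"
}
```

For example, set_direct_memory pom.xml 896M would lower the limit and leave the original file next to it as pom.xml.bak.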

After rerunning mvn install, the original error no longer appears.

Further into the build, however, new errors show up:

[INFO] exec/Java Execution Engine ........................ FAILURE [10:23.114s]
[INFO] SQL Parser ........................................ SKIPPED
[INFO] contrib/sqlline ................................... SKIPPED
[INFO] Packaging and Distribution Assembly ............... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 13:31.608s
[INFO] Finished at: Sat Nov 30 23:58:45 CST 2013
[INFO] Final Memory: 41M/146M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.15:test (default-test) on project java-exec: There are test failures.
[ERROR] 
[ERROR] Please refer to /drill/drill/exec/java-exec/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn -rf :java-exec
[root@pg2 drill]# 

There are test failures:

Tests in error: 
  TestSimpleFunctions.testSubstring:122 » OutOfMemory Direct buffer memory
  TestSimpleFunctions.testIsNull:68 » OutOfMemory Direct buffer memory
  TestComparisonFunctions.testIntNullable:126->runTest:66 » OutOfMemory Direct b...
  TestComparisonFunctions.testInt:82->runTest:66 » OutOfMemory Direct buffer mem...
  TestComparisonFunctions.testBigIntNullable:136->runTest:66 » OutOfMemory Direc...
  TestComparisonFunctions.testBigInt:93->runTest:66 » OutOfMemory Direct buffer ...
  TestComparisonFunctions.testFloat4:104->runTest:66 » OutOfMemory Direct buffer...
  TestComparisonFunctions.testFloat8:115->runTest:66 » OutOfMemory Direct buffer...
  TestAgg.twoKeyAgg:92->doTest:63 » OutOfMemory Direct buffer memory
  TestAgg.oneKeyAgg:69->doTest:63 » OutOfMemory Direct buffer memory
  TestDistributedFragmentRun.oneBitOneExchangeOneEntryRun:50 »  test timed out a...
  TestDistributedFragmentRun.twoBitOneExchangeTwoEntryRun:106 »  test timed out ...
  TestDistributedFragmentRun.oneBitOneExchangeTwoEntryRun:69 »  test timed out a...
  TestDistributedFragmentRun.oneBitOneExchangeTwoEntryRunLogical:87 »  test time...
  TestMergeJoin.orderedEqualityLeftJoin:124 » NullPointer
  TestMergeJoin.orderedEqualityInnerJoin:180 » OutOfMemory Direct buffer memory
  TestMergeJoin.orderedEqualityMultiBatchJoin:232 » OutOfMemory Direct buffer me...
  TestMergeJoin.simpleEqualityJoin:72 » OutOfMemory Direct buffer memory
  TestEndianess.testLittleEndian:33 » OutOfMemory Direct buffer memory

A representative example:

testLittleEndian(org.apache.drill.exec.memory.TestEndianess)  Time elapsed: 0.735 sec  <<< ERROR!
java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:658)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
    at io.netty.buffer.PoolArenaL$DirectArena.newChunk(PoolArenaL.java:381)
    at io.netty.buffer.PoolArenaL.allocateNormal(PoolArenaL.java:144)
    at io.netty.buffer.PoolArenaL.allocate(PoolArenaL.java:133)
    at io.netty.buffer.PoolArenaL.allocate(PoolArenaL.java:95)
    at io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer(PooledByteBufAllocatorL.java:236)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:132)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:123)
    at org.apache.drill.exec.memory.DirectBufferAllocator.buffer(DirectBufferAllocator.java:35)
    at org.apache.drill.exec.memory.TestEndianess.testLittleEndian(TestEndianess.java:33)
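Most of the failures listed above share the same "Direct buffer memory" cause. To count how many test classes hit it, the per-class text reports that Surefire writes under the surefire-reports directory named in the log can be searched. This is a rough sketch; count_oom_reports is a hypothetical helper name.

```shell
#!/bin/sh
# Count surefire .txt reports that mention a direct-buffer OutOfMemoryError.
count_oom_reports() {
    # $1 = surefire-reports directory
    grep -l "OutOfMemoryError: Direct buffer memory" "$1"/*.txt 2>/dev/null \
        | wc -l | tr -d ' '
}
```

For the build above it could be called as count_oom_reports /drill/drill/exec/java-exec/target/surefire-reports.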

-------------------

On a Cloudera CDH 4.3 virtual machine, I installed jdk1.7.0_45 and protoc-2.5.0.

After several attempts, mvn -e install finally completed the build and installation.


Test run:

./sqlline -u jdbc:drill:schema=parquet-local -n admin -p admin

It first checks for updates; if there are none, the command-line prompt starts.


Test query:

SELECT 
      _MAP['R_REGIONKEY'] AS region_key, 
      _MAP['R_NAME'] AS name, _MAP['R_COMMENT'] AS comment
    FROM "sample-data/region.parquet";


