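I had read about the design of Apache Drill before: it is an open-source project modeled on Google's Dremel, and Cloudera's Impala appears to be inspired by Dremel as well.
I recently noticed on the Apache site that the Drill source is now available for download, so here is a first look at it.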
http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-1.0.0-m1-incubating/
Code layout:
The top-level directories include src, sqlparser, exec, distribution, contrib, common, and sample-data.
INSTALL.md lists the following requirements:
Java 7+
protoc 2.5.x compiler (Google's Protocol Buffers compiler, version 2.5.x)
Maven 3.0+
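The source can be downloaded directly and then built and installed with mvn clean install.
Unpack drill-1.0.0xxx.tar.gz into /drill/.
Running ./sqlline directly starts downloading the required packages
and eventually fails with: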
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
cat: .classpath: No such file or directory
Exception in thread "main" java.lang.NoClassDefFoundError: sqlline/SqlLine
Caused by: java.lang.ClassNotFoundException: sqlline.SqlLine
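So the documented steps really do have to be followed one at a time.
Run:
mvn install   (starts downloading the dependencies)
The Java version check fails, so upgrade to or install Java 7: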
[root@pg2 jdk1.7]# pwd
/drill/jdk1.7
Edit /etc/profile:
[root@pg2 jdk1.7]# vi /etc/profile
[root@pg2 jdk1.7]# source /etc/profile
[root@pg2 jdk1.7]#
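The session above does not show what was actually added to /etc/profile. A minimal sketch of the kind of lines that are usually needed, assuming the JDK was unpacked to /drill/jdk1.7 as the pwd output shows; the variable names are the conventional ones and are not taken from the original edit:

```shell
# Assumed additions to /etc/profile; the original edit is not shown in the post.
# JDK location taken from the pwd output above.
export JAVA_HOME=/drill/jdk1.7
# Put the JDK 7 binaries ahead of any older java already on the PATH.
export PATH="$JAVA_HOME/bin:$PATH"
```

After source /etc/profile, running java -version is a quick way to confirm that a 1.7.x JVM is the one being picked up.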
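Then run mvn install again.
This time the build fails because the protobuf compiler is missing.
Drill's INSTALL.md lists the protoc 2.5.x compiler as a dependency.
Next, prepare it:
Install protobuf, downloaded from http://code.google.com/p/protobuf/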
http://code.google.com/p/protobuf/downloads/detail?name=protobuf-2.5.0.tar.gz
The file to use is protobuf-2.5.0.tar.gz:
tar zxvf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure
make
make check
make install
Installation is complete.
Verify:
Check that the install succeeded with protoc --version:
[root@pg2 protobuf-2.5.0]# protoc --version
libprotoc 2.5.0
[root@pg2 protobuf-2.5.0]#
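Since INSTALL.md asks for the 2.5.x line specifically, the manual check above can be turned into a small guard before starting a long Maven build. This is only a sketch: is_protoc_25 is a hypothetical helper name, and it assumes protoc --version prints a string of the form "libprotoc X.Y.Z" as in the output above.

```shell
# Hypothetical helper: succeed only if the given version string is a 2.5.x release.
# $1 = output of `protoc --version`, e.g. "libprotoc 2.5.0"
is_protoc_25() {
  case "$1" in
    "libprotoc 2.5."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query protoc if it is actually on the PATH.
if command -v protoc >/dev/null 2>&1; then
  if is_protoc_25 "$(protoc --version)"; then
    echo "protoc 2.5.x found"
  else
    echo "protoc found, but not 2.5.x: $(protoc --version)"
  fi
else
  echo "protoc not found on PATH"
fi
```

If the check reports that protoc is missing right after make install, the install prefix (commonly /usr/local/bin) may not be on the PATH of the current shell.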
Then continue:
mvn install
It ends with the following errors:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.15:test (default-test) on project common: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plugin:2.15:test failed: The forked VM terminated without saying properly goodbye. VM crash or System.exit called ?
[ERROR] Command was/bin/sh -c cd /drill/drill/common && /drill/jdk1.7/jre/bin/java -XX:MaxDirectMemorySize=4096M org.apache.maven.surefire.booter.ForkedBooter /drill/drill/common/target/surefire/surefire5854881226327277986tmp /drill/drill/common/target/surefire/surefire_02131510559628541214tmp
[ERROR] -> [Help 1]
Next, try fetching the source with git clone instead:
git clone https://github.com/apache/incubator-drill.git
[root@pg2 incubator-drill]# pwd
/drill/incubator-drill
[root@pg2 incubator-drill]# ls
common distribution header KEYS NOTICE protocol sample-data sqlline src tools
contrib exec INSTALL.md LICENSE pom.xml README.md sandbox sqlparser target
[root@pg2 incubator-drill]#
Run the build and tests:
cd incubator-drill
mvn clean install
The build fails with:
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Invalid maximum direct memory size: -XX:MaxDirectMemorySize=4096M
The specified size exceeds the maximum representable size.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
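Looking at pom.xml, this is where the option is set.
The CentOS 6.3 VM has only 1128M of memory allocated.
Try changing the setting to -XX:MaxDirectMemorySize=896M.
After re-running mvn install, the original error no longer appears.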
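The edit to pom.xml can also be scripted. The sketch below is an assumption-laden illustration: set_max_direct_memory is a hypothetical helper, and it simply rewrites the flag wherever it appears in the given file, without relying on which XML element holds it. As a side note, the "exceeds the maximum representable size" wording is typically what a 32-bit JVM prints for 4096M, so the JVM's bitness may be a factor in addition to the VM's memory.

```shell
# Hypothetical helper: rewrite the -XX:MaxDirectMemorySize value in a file.
# $1 = file to edit (e.g. a pom.xml), $2 = new size such as 896M
# A backup of the original is kept next to it with a .bak suffix.
set_max_direct_memory() {
  sed -i.bak "s/-XX:MaxDirectMemorySize=[0-9][0-9]*[kKmMgG]/-XX:MaxDirectMemorySize=$2/g" "$1"
}

# Example usage (left commented out; run from the source root):
#   grep -rn "MaxDirectMemorySize" --include=pom.xml .   # find where the flag is set
#   set_max_direct_memory pom.xml 896M
```

The value 896M is simply the one tried above; it is not a recommended setting, and the direct-memory failures later in the build suggest it is too small for the full test suite.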
Later in the build, the following errors appear:
[INFO] exec/Java Execution Engine ........................ FAILURE [10:23.114s]
[INFO] SQL Parser ........................................ SKIPPED
[INFO] contrib/sqlline ................................... SKIPPED
[INFO] Packaging and Distribution Assembly ............... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 13:31.608s
[INFO] Finished at: Sat Nov 30 23:58:45 CST 2013
[INFO] Final Memory: 41M/146M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.15:test (default-test) on project java-exec: There are test failures.
[ERROR]
[ERROR] Please refer to /drill/drill/exec/java-exec/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn
[root@pg2 drill]#
There are test failures:
Tests in error:
TestSimpleFunctions.testSubstring:122 » OutOfMemory Direct buffer memory
TestSimpleFunctions.testIsNull:68 » OutOfMemory Direct buffer memory
TestComparisonFunctions.testIntNullable:126->runTest:66 » OutOfMemory Direct b...
TestComparisonFunctions.testInt:82->runTest:66 » OutOfMemory Direct buffer mem...
TestComparisonFunctions.testBigIntNullable:136->runTest:66 » OutOfMemory Direc...
TestComparisonFunctions.testBigInt:93->runTest:66 » OutOfMemory Direct buffer ...
TestComparisonFunctions.testFloat4:104->runTest:66 » OutOfMemory Direct buffer...
TestComparisonFunctions.testFloat8:115->runTest:66 » OutOfMemory Direct buffer...
TestAgg.twoKeyAgg:92->doTest:63 » OutOfMemory Direct buffer memory
TestAgg.oneKeyAgg:69->doTest:63 » OutOfMemory Direct buffer memory
TestDistributedFragmentRun.oneBitOneExchangeOneEntryRun:50 » test timed out a...
TestDistributedFragmentRun.twoBitOneExchangeTwoEntryRun:106 » test timed out ...
TestDistributedFragmentRun.oneBitOneExchangeTwoEntryRun:69 » test timed out a...
TestDistributedFragmentRun.oneBitOneExchangeTwoEntryRunLogical:87 » test time...
TestMergeJoin.orderedEqualityLeftJoin:124 » NullPointer
TestMergeJoin.orderedEqualityInnerJoin:180 » OutOfMemory Direct buffer memory
TestMergeJoin.orderedEqualityMultiBatchJoin:232 » OutOfMemory Direct buffer me...
TestMergeJoin.simpleEqualityJoin:72 » OutOfMemory Direct buffer memory
TestEndianess.testLittleEndian:33 » OutOfMemory Direct buffer memory
One example:
testLittleEndian(org.apache.drill.exec.memory.TestEndianess) Time elapsed: 0.735 sec <<< ERROR!
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
at io.netty.buffer.PoolArenaL$DirectArena.newChunk(PoolArenaL.java:381)
at io.netty.buffer.PoolArenaL.allocateNormal(PoolArenaL.java:144)
at io.netty.buffer.PoolArenaL.allocate(PoolArenaL.java:133)
at io.netty.buffer.PoolArenaL.allocate(PoolArenaL.java:95)
at io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer(PooledByteBufAllocatorL.java:236)
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:132)
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:123)
at org.apache.drill.exec.memory.DirectBufferAllocator.buffer(DirectBufferAllocator.java:35)
at org.apache.drill.exec.memory.TestEndianess.testLittleEndian(TestEndianess.java:33)
-------------------
On a Cloudera CDH 4.3 virtual machine, install jdk1.7.0_45 and protoc-2.5.0.
Then, after several attempts,
mvn -e install finally completed the build and install.
Test run:
./sqlline -u jdbc:drill:schema=parquet-local -n admin -p admin
It first checks for updates; if there are none, it starts the command-line prompt.
Test query:
SELECT
_MAP['R_REGIONKEY'] AS region_key,
_MAP['R_NAME'] AS name, _MAP['R_COMMENT'] AS comment
FROM "sample-data/region.parquet";