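The hadoop-2.6.0-cdh5.15.1 package downloaded directly from the Cloudera website ships without native compression support. Production environments usually need it, so you have to download the source and compile it yourself. This post records my own build process.
First, run hadoop checknative to see what compression support the officially downloaded hadoop-2.6.0-cdh5.15.1 provides: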
[root@suddev bin]# hadoop checknative
19/08/03 02:34:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Native library checking:
hadoop: false
zlib: false
snappy: false
lz4: false
bzip2: false
openssl: false
19/08/03 02:34:24 INFO util.ExitUtil: Exiting with status 1
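As the output shows, the official build has no native compression support by default. The following steps compile Hadoop from source with compression support enabled.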
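To make the check above repeatable, the output can be filtered down to just the libraries reported as missing. This is a minimal sketch, assuming the output keeps the "name: true/false" layout shown above; the function name missing_native_libs is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `hadoop checknative` output on stdin and prints
# the name of every library reported as "false", one per line.
missing_native_libs() {
    # Relevant lines look like "zlib: false" or "snappy: true /lib64/libsnappy.so.1".
    # Log lines and the "Native library checking:" header do not match the pattern.
    awk '$1 ~ /^[a-z0-9]+:$/ && $2 == "false" { sub(/:$/, "", $1); print $1 }'
}

# Usage (not executed here):
#   hadoop checknative 2>&1 | missing_native_libs
```

An empty result means every library in the list loaded; on the official package above it would list all six names.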
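1. CentOS 7
2. hadoop-2.6.0-cdh5.15.1 source (download link on the official website)
3. Maven (official download page)
4. protobuf (Baidu Netdisk download link, extraction code: bwp1; GitHub download page)
5. jdk-7u80-linux-x64.tar.gz (Baidu Netdisk download link, extraction code: v1xs)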
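Notes:
1. According to earlier posts by others, the JDK used for the build must be 1.7; JDK 1.8 causes the build to fail.
2. After downloading the hadoop-2.6.0-cdh5.15.1 source, you can check its build requirements in the BUILDING.txt file in the source root directory, using the command cat BUILDING.txt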
[root@suddev hadoop-2.6.0-cdh5.15.1]# cat BUILDING.txt
Build instructions for Hadoop
----------------------------------------------------------------------------------
Requirements:
* Unix System
* JDK 1.7+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code), must be 3.0 or newer on Mac
* Zlib devel (if compiling native code)
* openssl devel ( if compiling native hadoop-pipes )
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
Besides the software prepared above, a few more packages are required. They can simply be installed with yum:
yum install -y svn ncurses-devel
yum install -y gcc gcc-c++ make cmake
yum install -y openssl openssl-devel svn ncurses-devel zlib-devel libtool
yum install -y snappy snappy-devel bzip2 bzip2-devel lzo lzo-devel lzop autoconf automake cmake
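After the yum commands finish, it can be worth confirming that the build tools actually landed on the PATH before starting a long build. The sketch below is only an illustration; missing_tools is a hypothetical helper name, and the tool list in the usage comment is an assumption based on the packages installed above.

```shell
#!/bin/sh
# Hypothetical helper: prints every command name from its arguments that
# cannot be found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage (not executed here):
#   missing_tools gcc g++ make cmake autoconf automake libtool
```

No output means all listed tools were found. Development headers such as zlib-devel or snappy-devel are not commands, so they still need to be checked through the package manager.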
tar -zxvf jdk-7u80-linux-x64.tar.gz
tar -zxvf apache-maven-3.6.1-bin.tar.gz
vim apache-maven-3.6.1/conf/settings.xml
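# Add the Alibaba Cloud central repository mirror inside the <mirrors> tag
<mirror>
  <id>aliyunmaven</id>
  <mirrorOf>*,!cloudera</mirrorOf>
  <name>Aliyun public repository</name>
  <url>https://maven.aliyun.com/repository/public</url>
</mirror>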
Extract the protobuf source first:
tar -zxvf protobuf-2.5.0.tar.gz
Then compile and install it (run these commands inside the extracted protobuf-2.5.0 directory):
./configure
make
make install
Be sure to change the paths below to match your own installation:
vim ~/.bashrc
# Append the following environment variables to the end of the file
export JAVA_HOME=/bd/app/jdk1.7.0_80
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export MAVEN_HOME=/bd/app/apache-maven-3.6.1
export PROTOBUF_HOME=/bd/software/protobuf-2.5.0
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PROTOBUF_HOME/bin:$PATH
# Save and quit Vim with :wq, then reload the file so the variables take effect
source ~/.bashrc
JDK check
[suddev@suddev ~]$ java -version
# Output like the following means the installation succeeded
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
Maven check
[suddev@suddev ~]$ mvn -v
# Output like the following means the installation succeeded
Apache Maven 3.6.1 (d66c9c0b3152b2e69ee9bac180bb8fcc8e6af555; 2019-04-05T03:00:29+08:00)
Maven home: /bd/app/apache-maven-3.6.1
Java version: 1.7.0_80, vendor: Oracle Corporation, runtime: /bd/app/jdk1.7.0_80/jre
Default locale: zh_CN, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-957.el7.x86_64", arch: "amd64", family: "unix"
protobuf check
[suddev@suddev ~]$ protoc --version
# Output like the following means the installation succeeded
libprotoc 2.5.0
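Since the build depends on exact versions (JDK 1.7 per the note above, ProtocolBuffer 2.5.0 per BUILDING.txt), the three manual checks can be condensed into two small tests. This is a rough sketch; the function names is_jdk7 and check_protoc_version are hypothetical, and the string formats are assumed to match the outputs shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given `java -version` banner line
# reports a 1.7.x JDK, e.g. 'java version "1.7.0_80"'.
is_jdk7() {
    case "$1" in
        *'version "1.7.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: succeeds if the given `protoc --version` output
# is exactly "libprotoc 2.5.0".
check_protoc_version() {
    version=${1#libprotoc }   # strip the "libprotoc " prefix
    [ "$version" = "2.5.0" ]
}

# Usage (not executed here); note that java prints its version on stderr:
#   is_jdk7 "$(java -version 2>&1 | head -n 1)" || echo "JDK is not 1.7"
#   check_protoc_version "$(protoc --version)" || echo "protoc is not 2.5.0"
```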
tar -zxvf hadoop-2.6.0-cdh5.15.1-src.tar.gz
vim hadoop-2.6.0-cdh5.15.1-src/pom.xml
Replace the existing <repositories> section with the following (backing up the original file first is recommended):
<repositories>
  <repository>
    <id>aliyunmaven</id>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
      <checksumPolicy>fail</checksumPolicy>
    </snapshots>
  </repository>
  <repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </repository>
</repositories>
Compile Hadoop with Maven with native compression support enabled: mvn clean package -Pdist,native -DskipTests -Dtar
cd hadoop-2.6.0-cdh5.15.1-src
mvn clean package -Pdist,native -DskipTests -Dtar
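Depending on machine performance and network conditions, this step can take a long time. If your network is unreliable, you can try building on an Alibaba Cloud Hong Kong instance, which may work better.
After a successful build, the package hadoop-2.6.0-cdh5.15.1.tar.gz appears under hadoop-2.6.0-cdh5.15.1/hadoop-dist/target/. This is the build result.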
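Because the build is long and tends to break on network hiccups, restarting from scratch each time is costly. After a failed multi-module build, Maven normally prints a hint of the form "mvn <goals> -rf :module"; the sketch below pulls that module name out of a saved log so the build can be resumed from there. The function name resume_target and the log file name build.log are assumptions for illustration, and the exact wording of the hint may vary between Maven versions.

```shell
#!/bin/sh
# Hypothetical helper: reads a Maven build log on stdin and prints the
# resume target from the last "-rf :module" hint, e.g. ":hadoop-hdfs".
# Prints nothing if the log contains no such hint.
resume_target() {
    sed -n 's/.*-rf \(:[^ ]*\).*/\1/p' | tail -n 1
}

# Usage (not executed here):
#   mvn clean package -Pdist,native -DskipTests -Dtar 2>&1 | tee build.log
#   target=$(resume_target < build.log)
#   # Resume without "clean" so modules that already built are kept:
#   [ -n "$target" ] && mvn package -Pdist,native -DskipTests -Dtar -rf "$target"
```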
tar -zxvf hadoop-2.6.0-cdh5.15.1.tar.gz
cd hadoop-2.6.0-cdh5.15.1
./bin/hadoop checknative
# Output like the following means the build succeeded
19/08/05 19:27:00 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
19/08/05 19:27:00 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /bd/app/hadoop-2.6.0-cdh5.15.1/lib/native/libhadoop.so.1.0.0
zlib: true /lib64/libz.so.1
snappy: true /lib64/libsnappy.so.1
lz4: true revision:10301
bzip2: true /lib64/libbz2.so.1
openssl: true /lib64/libcrypto.so
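Posts online attribute this error to a JDK 1.7 bug. I spent a long time on it without success, then reasoned that since it was an SSL error, HTTPS had to be involved. Changing the Cloudera repository in pom.xml from https to http, i.e.
http://repository.cloudera.com/artifactory/cloudera-repos/
solved the problem.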
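This one is most likely a network problem. I ran into it many times, which was a real headache, but the fix is simple: go to the target file's directory in the local Maven repository, fetch the file from repository.cloudera.com with wget, and then rerun the build command.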
wget https://repository.cloudera.com/artifactory/cloudera-repos/com/cloudera/cdh/cdh-root/5.15.1/cdh-root-5.15.1.pom
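The manual download only helps if the file ends up in the directory Maven expects. A small sketch of the path mapping follows, assuming the Cloudera repository root used above and Maven's default local repository at ~/.m2/repository (adjust if settings.xml sets a different localRepository); local_repo_path is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: maps an artifact URL under the Cloudera repository
# root to the matching path inside a local Maven repository.
#   $1: artifact URL
#   $2: local repository directory
local_repo_path() {
    repo_root="https://repository.cloudera.com/artifactory/cloudera-repos/"
    rel=${1#"$repo_root"}   # strip the repository root, keep group/artifact/version/file
    echo "$2/$rel"
}

# Usage (not executed here):
#   url=https://repository.cloudera.com/artifactory/cloudera-repos/com/cloudera/cdh/cdh-root/5.15.1/cdh-root-5.15.1.pom
#   dest=$(local_repo_path "$url" "$HOME/.m2/repository")
#   mkdir -p "$(dirname "$dest")" && wget -O "$dest" "$url"
```

If Maven still refuses to use the file, leftover *.lastUpdated marker files in the same directory may need to be deleted before rerunning the build.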
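Building hadoop-2.6.0-cdh5.7.0 from source with compression support
https://blog.csdn.net/liweihope/article/details/89605340#hadoop_174
Hadoop: building hadoop-2.6.0-cdh5.7.0 from source with compression support, plus pseudo-distributed deployment
https://blog.csdn.net/qq_32641659/article/details/89074365#commentBox
Workaround for Maven repositories that do not carry CDH artifacts
https://blog.csdn.net/eieiei438/article/details/81742833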