Steps to Compile Hadoop on Linux

Versions:

OS: CentOS 6.5
Hadoop: 2.8.1
JDK: 1.8

1. Downloading the Hadoop Source Code

You can download the source with wget plus a mirror URL from the official download page, for example:

wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.8.3/hadoop-2.8.3-src.tar.gz

https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/ no longer carries 2.8.1, which is why the example URL above fetches 2.8.3; mirrors generally keep only recent releases, while older ones remain available from the Apache archive (archive.apache.org).
You can also download in a browser and put the archive into the sourcecode directory, or, if you work in a virtual machine, download on Windows first and then upload it to Linux.
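Before extracting, it is worth confirming the download is intact; Apache publishes checksum files next to each release tarball. The sketch below shows the comparison pattern with a stand-in file and a hard-coded checksum (both hypothetical, not the real Hadoop values):

```shell
# Sketch: verify a downloaded archive against a known checksum before unpacking.
# The file and the expected value here are stand-ins for demonstration only.
tarball=/tmp/demo-src.tar.gz
printf 'hello' > "$tarball"                       # stand-in for the real download
expected=2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
actual=$(sha256sum "$tarball" | awk '{print $1}')
if [ "$actual" = "$expected" ]; then
  echo "checksum OK: safe to extract"
else
  echo "checksum MISMATCH: re-download the archive"
fi
```

A mismatch usually means a truncated or corrupted download, which would otherwise surface much later as a confusing build failure.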

Creating the directories

It is recommended to create two directories under /opt/: software (for the build tools) and sourcecode (for the Hadoop source).

[root@hadoop001 ~]# mkdir -p /opt/sourcecode /opt/software
[root@hadoop001 sourcecode]# pwd
/opt/sourcecode
[root@hadoop001 sourcecode]# ll
-rw-r--r--.  1 root root 34523353 Mar  6 21:51 hadoop-2.8.1-src.tar.gz
[root@hadoop001 software]# pwd
/opt/software
[root@hadoop001 software]# ll
-rw-r--r--.  1 root root 8617253 Mar  6 22:08 apache-maven-3.3.9-bin.zip
-rw-r--r--.  1 root root 7546219 Mar  6 22:08 findbugs-1.3.9.zip
-rw-r--r--.  1 root root 2401901 Mar  6 22:08 protobuf-2.5.0.tar.gz

Extract:

[root@hadoop001 sourcecode]# tar -xzvf hadoop-2.8.1-src.tar.gz
[root@hadoop001 sourcecode]# ll
total 33760
drwxr-xr-x. 17 root root     4096 Jun  2 14:13 hadoop-2.8.1-src
-rw-r--r--.  1 root root 34523353 Aug 20 12:14 hadoop-2.8.1-src.tar.gz
[root@hadoop001 sourcecode]# cd hadoop-2.8.1-src
[root@hadoop001 hadoop-2.8.1-src]# ll
total 224
-rw-rw-r--.  1 root root 15623 May 24 07:14 BUILDING.txt
drwxr-xr-x.  4 root root  4096 Aug 20 12:44 dev-support
drwxr-xr-x.  3 root root  4096 Aug 20 12:44 hadoop-assemblies
drwxr-xr-x.  3 root root  4096 Aug 20 12:44 hadoop-build-tools
drwxrwxr-x.  2 root root  4096 Aug 20 12:44 hadoop-client
drwxr-xr-x. 10 root root  4096 Aug 20 12:44 hadoop-common-project
drwxr-xr-x.  2 root root  4096 Aug 20 12:44 hadoop-dist
drwxr-xr-x.  8 root root  4096 Aug 20 12:44 hadoop-hdfs-project
drwxr-xr-x.  9 root root  4096 Aug 20 12:44 hadoop-mapreduce-project
drwxr-xr-x.  3 root root  4096 Aug 20 12:44 hadoop-maven-plugins
drwxr-xr-x.  2 root root  4096 Aug 20 12:44 hadoop-minicluster
drwxr-xr-x.  3 root root  4096 Aug 20 12:44 hadoop-project
drwxr-xr-x.  2 root root  4096 Aug 20 12:44 hadoop-project-dist
drwxr-xr-x. 18 root root  4096 Aug 20 12:44 hadoop-tools
drwxr-xr-x.  3 root root  4096 Aug 20 12:44 hadoop-yarn-project
-rw-rw-r--.  1 root root 99253 May 24 07:14 LICENSE.txt
-rw-rw-r--.  1 root root 15915 May 24 07:14 NOTICE.txt
drwxrwxr-x.  2 root root  4096 Jun  2 14:24 patchprocess
-rw-rw-r--.  1 root root 20477 May 29 06:36 pom.xml
-rw-r--r--.  1 root root  1366 May 20 13:30 README.txt
-rwxrwxr-x.  1 root root  1841 May 24 07:14 start-build-env.sh

Check the official build requirements (excerpt):

[root@hadoop001 hadoop-2.8.1-src]# cat BUILDING.txt
Build instructions for Hadoop

----------------------------------------------------------------------------------
Requirements:

* Unix System
* JDK 1.7+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code), must be 3.0 or newer on Mac
* Zlib devel (if compiling native code)
* openssl devel (if compiling native hadoop-pipes and to get the best HDFS encryption performance)
* Linux FUSE (Filesystem in Userspace) version 2.6 or above (if compiling fuse_dfs)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)

In short, the requirements are: a Unix system, JDK 1.7 or later, Maven 3.0 or later, Findbugs 1.3.9 (only if you run findbugs), ProtocolBuffer 2.5.0, and CMake 2.6 or newer (only if compiling native code). The first build needs an Internet connection to fetch the Maven and Hadoop dependencies.
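A quick pre-flight sketch for the list above: report which of the required tools are already on the PATH, instead of discovering a missing one halfway through a multi-hour build. The tool list mirrors the requirements but is only a best-effort check:

```shell
# Sketch: report which build prerequisites are already installed.
# Missing tools are counted and reported rather than aborting the script.
missing=0
for tool in java mvn protoc cmake findbugs gcc; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done
echo "$missing tool(s) still to install"
```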

You can also just start the build and deal with errors as they come up; troubleshooting build failures is a useful skill in itself.

2. Installing Java

It is recommended to put the JDK under /usr/java and to use version 1.8, since some big-data components may require it.

[root@hadoop001 ~]# mkdir -p /usr/java
[root@hadoop001 ~]# mv jdk-8u45-linux-x64.gz /usr/java
[root@hadoop001 ~]# cd /usr/java
[root@hadoop001 java]# tar -xzvf jdk-8u45-linux-x64.gz

Fixing the owner and group

The extracted files keep the owner and group recorded when the archive was created. To avoid problems later, change them to root:root.

[root@hadoop001 java]# ll
total 169388
drwxr-xr-x. 8 uucp  143      4096 Apr 11  2015 jdk1.8.0_45
-rw-r--r--. 1 root root 173271626 Mar 16 15:25 jdk-8u45-linux-x64.gz
[root@hadoop001 java]# chown -R root:root jdk1.8.0_45
[root@hadoop001 java]# ll
total 169388
drwxr-xr-x. 8 root root      4096 Apr 11  2015 jdk1.8.0_45
-rw-r--r--. 1 root root 173271626 Mar 16 15:25 jdk-8u45-linux-x64.gz

Configuring environment variables

[root@hadoop001 java]# vi /etc/profile

At the end of the file, add these two lines:

export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$JAVA_HOME/bin:$PATH

Then run source to make the variables take effect:

[root@hadoop001 java]# source /etc/profile
[root@hadoop001 java]# which java
/usr/java/jdk1.8.0_45/bin/java
[root@hadoop001 java]# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
[root@hadoop001 java]# 
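Editing /etc/profile by hand several times (as the following sections also do) risks accumulating duplicate export lines. The sketch below appends a line only if it is not already present; a throwaway file stands in for /etc/profile so the demo is safe to run anywhere:

```shell
# Sketch: idempotent append -- add the export line only if it is missing.
# /tmp/profile.demo stands in for /etc/profile in this demonstration.
profile=/tmp/profile.demo
: > "$profile"
line='export JAVA_HOME=/usr/java/jdk1.8.0_45'
grep -qxF "$line" "$profile" || echo "$line" >> "$profile"
grep -qxF "$line" "$profile" || echo "$line" >> "$profile"   # second run adds nothing
echo "occurrences: $(grep -cxF "$line" "$profile")"          # prints "occurrences: 1"
```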

3. Installing Maven

You can install lrzsz (yum -y install lrzsz) to transfer files between your local machine and Linux:
rz: upload from the local machine to Linux
sz: download from Linux to the local machine


[root@hadoop001 ~]# cd /opt/software/
[root@hadoop001 software]# rz
rz waiting to receive.
Starting zmodem transfer.  Press Ctrl+C to cancel.
Transferring apache-maven-3.3.9-bin.zip...
  100%    8415 KB    8415 KB/sec    00:00:01       0 Errors  

[root@hadoop001 software]# ll
total 8432
-rw-r--r--. 1 root root 8617253 Aug 20 12:35 apache-maven-3.3.9-bin.zip
[root@hadoop001 software]# unzip apache-maven-3.3.9-bin.zip

Configure the variables:

[root@hadoop001 java]# vi /etc/profile
export MAVEN_HOME=/opt/software/apache-maven-3.3.9
export MAVEN_OPTS="-Xms256m -Xmx512m"
# prepend $MAVEN_HOME/bin: to the existing PATH line
export PATH=$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH

Verify the installation:

[root@hadoop001 ~]# mvn -version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-11T00:41:47+08:00)
Maven home: /opt/software/apache-maven-3.3.9
Java version: 1.8.0_45, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_45/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.32-431.el6.x86_64", arch: "amd64", family: "unix"

4. Installing protobuf

[root@hadoop001 software]# tar -xzvf protobuf-2.5.0.tar.gz
[root@hadoop001 software]# ll
total 10792
drwxr-xr-x.  6 root   root    4096 Nov 10  2015 apache-maven-3.3.9
-rw-r--r--.  1 root   root 8617253 Aug 20 12:35 apache-maven-3.3.9-bin.zip
drwxr-xr-x. 10 109965 5000    4096 Feb 27  2013 protobuf-2.5.0
-rw-r--r--.  1 root   root 2401901 Aug 20 13:03 protobuf-2.5.0.tar.gz
[root@hadoop001 software]# cd protobuf-2.5.0
[root@hadoop001 protobuf-2.5.0]# yum install -y gcc gcc-c++ make cmake
# -y answers yes to every prompt; this installs gcc, gcc-c++, make, and cmake

[root@hadoop001 protobuf-2.5.0]# ./configure --prefix=/usr/local/protobuf
# --prefix sets the install directory. Fixing the owner/group of protobuf-2.5.0 above
# is not strictly required since it is not used afterwards, but changing ownership
# whenever you notice it saves trouble later.

[root@hadoop001 protobuf-2.5.0]# make && make install
# && runs "make install" only if "make" succeeds

[root@hadoop001 java]# vi /etc/profile
export PROTOC_HOME=/usr/local/protobuf
export PATH=$PROTOC_HOME/bin:$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH
[root@hadoop001 protobuf-2.5.0]# source /etc/profile

[root@hadoop001 protobuf-2.5.0]# protoc --version
libprotoc 2.5.0
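All the profile snippets in this guide prepend to PATH rather than append. PATH is searched left to right, so a prepended entry shadows any same-named binary already on the system, which is exactly what you want for protoc. A small stand-alone demo (the "whichone" script name and directories are made up for illustration):

```shell
# Sketch: PATH precedence -- the leftmost matching entry wins.
rm -rf /tmp/pathdemo
mkdir -p /tmp/pathdemo/a /tmp/pathdemo/b
printf '#!/bin/sh\necho first\n'  > /tmp/pathdemo/a/whichone
printf '#!/bin/sh\necho second\n' > /tmp/pathdemo/b/whichone
chmod +x /tmp/pathdemo/a/whichone /tmp/pathdemo/b/whichone
PATH="/tmp/pathdemo/a:/tmp/pathdemo/b:$PATH" whichone    # prints "first"
```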

5. Installing Findbugs

[root@hadoop001 software]# unzip findbugs-1.3.9.zip

[root@hadoop001 software]# vi /etc/profile
export FINDBUGS_HOME=/opt/software/findbugs-1.3.9
export PATH=$FINDBUGS_HOME/bin:$PROTOC_HOME/bin:$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH

[root@hadoop001 software]# source /etc/profile
[root@hadoop001 software]# findbugs -version
1.3.9

6. Other Dependencies

Install the remaining packages listed in the official requirements:

yum install -y openssl openssl-devel svn ncurses-devel zlib-devel libtool
yum install -y snappy snappy-devel bzip2 bzip2-devel lzo lzo-devel lzop autoconf automake

7. Compiling

BUILDING.txt documents the whole process in detail; compiling is the last step. Its section on building distributions reads:

Building distributions:

Create binary distribution without native code and without documentation:

  $ mvn package -Pdist -DskipTests -Dtar

Create binary distribution with native code and with documentation:

  $ mvn package -Pdist,native,docs -DskipTests -Dtar

Create source distribution:

  $ mvn package -Psrc -DskipTests

Create source and binary distributions with native code and documentation:

  $ mvn package -Pdist,native,docs,src -DskipTests -Dtar

Create a local staging version of the website (in /tmp/hadoop-site)

  $ mvn clean site -Preleasedocs; mvn site:stage -DstagingDirectory=/tmp/hadoop-site

Pick whichever variant fits your needs.
First build: mvn package -Pdist,native -DskipTests -Dtar
Later builds (this one also works for a first build): mvn clean package -Pdist,native -DskipTests -Dtar
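Since dependency downloads sometimes hang on the first build (the usual fix is pressing Ctrl+C and rerunning the command), the rerun can be automated. The wrapper below is a hypothetical helper, demonstrated with a stand-in "build" command that fails twice and then succeeds; in real use you would pass the mvn invocation instead:

```shell
# Sketch: rerun a flaky build command up to N times (hypothetical helper).
retry_build() {
  attempts=$1; shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      echo "build succeeded on attempt $i"
      return 0
    fi
    echo "attempt $i failed; retrying"
    i=$((i + 1))
  done
  echo "build failed after $attempts attempts"
  return 1
}

# Demo: a stand-in build that fails twice, then succeeds on the third call.
rm -f /tmp/retry.count
flaky_build() {
  n=$(cat /tmp/retry.count 2>/dev/null || echo 0)
  n=$((n + 1)); echo "$n" > /tmp/retry.count
  [ "$n" -ge 3 ]
}
retry_build 5 flaky_build
# real use: retry_build 3 mvn clean package -Pdist,native -DskipTests -Dtar
```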

[root@hadoop001 sourcecode]# cd hadoop-2.8.1-src
[root@hadoop001 hadoop-2.8.1-src]# mvn clean package -Pdist,native -DskipTests -Dtar

A successful build ends with:

[INFO] --- maven-site-plugin:3.5:attach-descriptor (attach-descriptor) @ hadoop-dist ---
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  7.770 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  4.990 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  4.920 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [ 12.936 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.809 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  5.013 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 17.617 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [ 23.551 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 26.352 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 10.305 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [05:58 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 19.105 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 27.479 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.174 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [01:32 min]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [06:42 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [ 20.370 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 40.720 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [ 21.298 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 11.818 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.148 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [  0.192 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 41.517 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [01:19 min]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [  0.192 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 19.421 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 42.398 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [  8.925 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 16.120 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 57.415 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [  3.869 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 14.325 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [ 11.814 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [ 10.027 s]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [  0.276 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [  8.333 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [  5.473 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [  0.160 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [ 13.204 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [  8.106 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [  0.514 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [01:09 min]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 40.479 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [ 10.304 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 27.335 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 19.910 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 16.657 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [  4.591 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 12.346 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [  5.966 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [  7.940 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 15.245 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  5.380 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [  5.812 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 11.785 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [  9.890 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  5.784 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [  3.254 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  5.495 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 10.630 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 11.234 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [ 14.060 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 10.535 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [ 13.519 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  2.164 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 10.405 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 11.514 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [  9.201 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.129 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [01:07 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 31:41 min
[INFO] Finished at: 2017-12-10T11:55:28+08:00
[INFO] Final Memory: 166M/494M
[INFO] ------------------------------------------------------------------------

There is a lot to download, and the downloads alone take a long time; the whole build takes roughly 2-3 hours. If problems appear along the way:
1. A package download can sometimes stall because the connection to the site hangs. Press Ctrl+C and rerun the build command.
2. If the build reports a missing file, clean first (mvn clean) and then recompile, or simply use the variant that already includes clean.
A shortcut: if another machine has already completed the build, copy its /root/.m2 directory (which is just the local Maven repository) into /root/ on the new machine.

[root@hadoop001 ~]# cd .m2
[root@hadoop001 .m2]# pwd
/root/.m2
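The /root/.m2 trick can be sketched as below. Two local directories stand in for the two hosts, and the hadoop002 host name in the comment is hypothetical; a real transfer would use scp or rsync between machines:

```shell
# Sketch: seed a fresh machine's Maven repository from an already-built one.
rm -rf /tmp/m2-src /tmp/m2-dst
mkdir -p /tmp/m2-src/repository/org/example/demo/1.0
touch /tmp/m2-src/repository/org/example/demo/1.0/demo-1.0.jar
cp -a /tmp/m2-src /tmp/m2-dst    # real hosts: rsync -a /root/.m2/ root@hadoop002:/root/.m2/
ls /tmp/m2-dst/repository/org/example/demo/1.0
```

With the repository pre-populated, Maven resolves most dependencies locally and the first build is dramatically faster.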

8. The Compiled Tarball

After a successful build, the tarball hadoop-2.8.1.tar.gz is produced at:
/opt/sourcecode/hadoop-2.8.1-src/hadoop-dist/target/hadoop-2.8.1.tar.gz
