OS: CentOS 6.5
Hadoop: 2.8.1
JDK: 1.8
You can download the source with wget plus one of the official mirror URLs, for example:
wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.8.3/hadoop-2.8.3-src.tar.gz
Note that https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/ no longer carries 2.8.1.
You can also download it in a browser and place it in the sourcecode directory; if you are working in a virtual machine, you can download it on Windows first and upload it to Linux.
It is recommended to create two directories under /opt/: software (for the build tools) and sourcecode (for the Hadoop source).
[root@hadoop001 ~]# mkdir -p /opt/sourcecode /opt/software
[root@hadoop001 sourcecode]# pwd
/opt/sourcecode
[root@hadoop001 sourcecode]# ll
-rw-r--r--. 1 root root 34523353 Mar 6 21:51 hadoop-2.8.1-src.tar.gz
[root@hadoop001 software]# pwd
/opt/software
[root@hadoop001 software]# ll
-rw-r--r--. 1 root root 8617253 Mar 6 22:08 apache-maven-3.3.9-bin.zip
-rw-r--r--. 1 root root 7546219 Mar 6 22:08 findbugs-1.3.9.zip
-rw-r--r--. 1 root root 2401901 Mar 6 22:08 protobuf-2.5.0.tar.gz
[root@hadoop001 sourcecode]# tar -xzvf hadoop-2.8.1-src.tar.gz
[root@hadoop001 sourcecode]# ll
total 33760
drwxr-xr-x. 17 root root 4096 Jun 2 14:13 hadoop-2.8.1-src
-rw-r--r--. 1 root root 34523353 Aug 20 12:14 hadoop-2.8.1-src.tar.gz
[root@hadoop001 sourcecode]# cd hadoop-2.8.1-src
[root@hadoop001 hadoop-2.8.1-src]# ll
total 224
-rw-rw-r--. 1 root root 15623 May 24 07:14 BUILDING.txt
drwxr-xr-x. 4 root root 4096 Aug 20 12:44 dev-support
drwxr-xr-x. 3 root root 4096 Aug 20 12:44 hadoop-assemblies
drwxr-xr-x. 3 root root 4096 Aug 20 12:44 hadoop-build-tools
drwxrwxr-x. 2 root root 4096 Aug 20 12:44 hadoop-client
drwxr-xr-x. 10 root root 4096 Aug 20 12:44 hadoop-common-project
drwxr-xr-x. 2 root root 4096 Aug 20 12:44 hadoop-dist
drwxr-xr-x. 8 root root 4096 Aug 20 12:44 hadoop-hdfs-project
drwxr-xr-x. 9 root root 4096 Aug 20 12:44 hadoop-mapreduce-project
drwxr-xr-x. 3 root root 4096 Aug 20 12:44 hadoop-maven-plugins
drwxr-xr-x. 2 root root 4096 Aug 20 12:44 hadoop-minicluster
drwxr-xr-x. 3 root root 4096 Aug 20 12:44 hadoop-project
drwxr-xr-x. 2 root root 4096 Aug 20 12:44 hadoop-project-dist
drwxr-xr-x. 18 root root 4096 Aug 20 12:44 hadoop-tools
drwxr-xr-x. 3 root root 4096 Aug 20 12:44 hadoop-yarn-project
-rw-rw-r--. 1 root root 99253 May 24 07:14 LICENSE.txt
-rw-rw-r--. 1 root root 15915 May 24 07:14 NOTICE.txt
drwxrwxr-x. 2 root root 4096 Jun 2 14:24 patchprocess
-rw-rw-r--. 1 root root 20477 May 29 06:36 pom.xml
-rw-r--r--. 1 root root 1366 May 20 13:30 README.txt
-rwxrwxr-x. 1 root root 1841 May 24 07:14 start-build-env.sh
[root@hadoop001 hadoop-2.8.1-src]# cat BUILDING.txt
Build instructions for Hadoop
----------------------------------------------------------------------------------
Requirements:
* Unix System
* JDK 1.7+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code), must be 3.0 or newer on Mac
* Zlib devel (if compiling native code)
* openssl devel (if compiling native hadoop-pipes and to get the best HDFS encryption performance)
* Linux FUSE (Filesystem in Userspace) version 2.6 or above (if compiling fuse_dfs)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
In short, the requirements are: a Unix system, JDK 1.7 or later, Maven 3.0 or later, Findbugs 1.3.9 (only if you run findbugs), ProtocolBuffer 2.5.0, and CMake 2.6 or newer (if compiling native code). The first build needs an Internet connection (to fetch the Maven and Hadoop dependencies).
You can also just start the build and deal with errors as they come up; troubleshooting is a skill worth practicing.
It is recommended to put the JDK under /usr/java and to use version 1.8, since some big-data components may require it.
[root@hadoop001 ~]# mkdir -p /usr/java
[root@hadoop001 ~]# mv jdk-8u45-linux-x64.gz /usr/java
[root@hadoop001 ~]# cd /usr/java
[root@hadoop001 java]# tar -xzvf jdk-8u45-linux-x64.gz
After extraction, the owner and group of the files are whatever they were when the archive was created. To avoid problems later, change them to root:
[root@hadoop001 java]# ll
total 169388
drwxr-xr-x. 8 uucp 143 4096 Apr 11 2015 jdk1.8.0_45
-rw-r--r--. 1 root root 173271626 Mar 16 15:25 jdk-8u45-linux-x64.gz
[root@hadoop001 java]# chown -R root:root jdk1.8.0_45
[root@hadoop001 java]# ll
total 169388
drwxr-xr-x. 8 root root 4096 Apr 11 2015 jdk1.8.0_45
-rw-r--r--. 1 root root 173271626 Mar 16 15:25 jdk-8u45-linux-x64.gz
[root@hadoop001 java]# vi /etc/profile
Enter edit mode at the end of the file and add two lines:
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$JAVA_HOME/bin:$PATH
Then run source to make the variables take effect:
[root@hadoop001 java]# source /etc/profile
[root@hadoop001 java]# which java
/usr/java/jdk1.8.0_45/bin/java
[root@hadoop001 java]# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
[root@hadoop001 java]#
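Prepending $JAVA_HOME/bin to PATH (rather than appending it) matters when another java is already installed, e.g. an OpenJDK shipped with CentOS: the shell searches PATH from left to right and takes the first match. A minimal sketch with stub scripts (the two directories and the echoed strings are placeholders, not a real JDK):

```shell
#!/bin/sh
# Simulate two 'java' installations with stub scripts.
old=$(mktemp -d); new=$(mktemp -d)
printf '#!/bin/sh\necho old-java\n' > "$old/java"
printf '#!/bin/sh\necho new-java\n' > "$new/java"
chmod +x "$old/java" "$new/java"

PATH="$old:$PATH"   # the pre-existing java
PATH="$new:$PATH"   # prepend, as in /etc/profile above

java                # resolves to the prepended one: prints "new-java"
```

This is exactly why `which java` above points to /usr/java/jdk1.8.0_45/bin/java even if another JDK exists on the system.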
Installing lrzsz (yum -y install lrzsz) gives you two handy commands:
rz: upload a file from the local machine to Linux
sz: download a file from Linux to the local machine
[root@hadoop001 ~]# cd /opt/software/
[root@hadoop001 software]# rz
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring apache-maven-3.3.9-bin.zip...
100% 8415 KB 8415 KB/sec 00:00:01 0 Errors
[root@hadoop001 software]# ll
total 8432
-rw-r--r--. 1 root root 8617253 Aug 20 12:35 apache-maven-3.3.9-bin.zip
[root@hadoop001 software]# unzip apache-maven-3.3.9-bin.zip
Configure the environment variables:
[root@hadoop001 java]# vi /etc/profile
export MAVEN_HOME=/opt/software/apache-maven-3.3.9
export MAVEN_OPTS="-Xms256m -Xmx512m"
// prepend $MAVEN_HOME/bin to the existing PATH:
export PATH=$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH
Verify the installation:
[root@hadoop001 ~]# mvn -version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-11T00:41:47+08:00)
Maven home: /opt/software/apache-maven-3.3.9
Java version: 1.8.0_45, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_45/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.32-431.el6.x86_64", arch: "amd64", family: "unix"
[root@hadoop001 software]# tar -xzvf protobuf-2.5.0.tar.gz
[root@hadoop001 software]# ll
total 10792
drwxr-xr-x. 6 root root 4096 Nov 10 2015 apache-maven-3.3.9
-rw-r--r--. 1 root root 8617253 Aug 20 12:35 apache-maven-3.3.9-bin.zip
drwxr-xr-x. 10 109965 5000 4096 Feb 27 2013 protobuf-2.5.0
-rw-r--r--. 1 root root 2401901 Aug 20 13:03 protobuf-2.5.0.tar.gz
[root@hadoop001 software]# cd protobuf-2.5.0
[root@hadoop001 protobuf-2.5.0]# yum install -y gcc gcc-c++ make cmake
// -y answers yes to every prompt; this installs all four packages (gcc, gcc-c++, make, cmake)
[root@hadoop001 protobuf-2.5.0]# ./configure --prefix=/usr/local/protobuf
// --prefix specifies the install directory. The owner and group of protobuf-2.5.0 above do not strictly need changing, since the directory is only used for the build, but fixing ownership whenever you notice it saves trouble later.
[root@hadoop001 protobuf-2.5.0]# make && make install
// && runs the second command only if the first one succeeds
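The difference between `&&` and `;` is easy to see with a throwaway example: `&&` skips the second command when the first fails, which is exactly why `make && make install` never installs the result of a failed build:

```shell
#!/bin/sh
# '&&' only runs the right-hand command if the left-hand one succeeded.
false && echo "skipped"   # false fails -> echo does not run
true  && echo "ran"       # true succeeds -> echo runs, prints "ran"
false ;  echo "always"    # ';' ignores the exit status, prints "always"
```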
[root@hadoop001 java]# vi /etc/profile
export PROTOC_HOME=/usr/local/protobuf
export PATH=$PROTOC_HOME/bin:$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH
[root@hadoop001 protobuf-2.5.0]# source /etc/profile
[root@hadoop001 protobuf-2.5.0]# protoc --version
libprotoc 2.5.0
[root@hadoop001 software]# unzip findbugs-1.3.9.zip
[root@hadoop001 software]# vi /etc/profile
export FINDBUGS_HOME=/opt/software/findbugs-1.3.9
export PATH=$FINDBUGS_HOME/bin:$PROTOC_HOME/bin:$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH
[root@hadoop001 software]# source /etc/profile
[root@hadoop001 software]# findbugs -version
1.3.9
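With all four tools in place, the tail of /etc/profile should end up looking roughly like this, with a single PATH line covering everything (paths match the install locations used above; adjust if yours differ):

```shell
export JAVA_HOME=/usr/java/jdk1.8.0_45
export MAVEN_HOME=/opt/software/apache-maven-3.3.9
export MAVEN_OPTS="-Xms256m -Xmx512m"
export PROTOC_HOME=/usr/local/protobuf
export FINDBUGS_HOME=/opt/software/findbugs-1.3.9
export PATH=$FINDBUGS_HOME/bin:$PROTOC_HOME/bin:$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH
```

Keeping one combined PATH line avoids accidentally dropping an earlier tool when a later edit redefines PATH.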
Install the remaining packages listed in the official build requirements:
yum install -y openssl openssl-devel svn ncurses-devel zlib-devel libtool
yum install -y snappy snappy-devel bzip2 bzip2-devel lzo lzo-devel lzop autoconf automake
BUILDING.txt describes the whole process in detail. The final step is the build itself; the section on building distributions reads:
Building distributions:
Create binary distribution without native code and without documentation:
$ mvn package -Pdist -DskipTests -Dtar
Create binary distribution with native code and with documentation:
$ mvn package -Pdist,native,docs -DskipTests -Dtar
Create source distribution:
$ mvn package -Psrc -DskipTests
Create source and binary distributions with native code and documentation:
$ mvn package -Pdist,native,docs,src -DskipTests -Dtar
Create a local staging version of the website (in /tmp/hadoop-site)
$ mvn clean site -Preleasedocs; mvn site:stage -DstagingDirectory=/tmp/hadoop-site
Choose whichever variant fits your needs.
First build: mvn package -Pdist,native -DskipTests -Dtar
Later builds (this also works for a first build): mvn clean package -Pdist,native -DskipTests -Dtar
[root@hadoop001 sourcecode]# cd hadoop-2.8.1-src
[root@hadoop001 hadoop-2.8.1-src]# mvn clean package -Pdist,native -DskipTests -Dtar
A successful build ends with:
[INFO] --- maven-site-plugin:3.5:attach-descriptor (attach-descriptor) @ hadoop-dist ---
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [ 7.770 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [ 4.990 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [ 4.920 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [ 12.936 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [ 0.809 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [ 5.013 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 17.617 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [ 23.551 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 26.352 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 10.305 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [05:58 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 19.105 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 27.479 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [ 0.174 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [01:32 min]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [06:42 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [ 20.370 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 40.720 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [ 21.298 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 11.818 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [ 0.148 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [ 0.192 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 41.517 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [01:19 min]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [ 0.192 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 19.421 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 42.398 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [ 8.925 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 16.120 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 57.415 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [ 3.869 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 14.325 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [ 11.814 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [ 10.027 s]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [ 0.276 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [ 8.333 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [ 5.473 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [ 0.160 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [ 13.204 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [ 8.106 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [ 0.514 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [01:09 min]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 40.479 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [ 10.304 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 27.335 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 19.910 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 16.657 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [ 4.591 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 12.346 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [ 5.966 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 7.940 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 15.245 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 5.380 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [ 5.812 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 11.785 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 9.890 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 5.784 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [ 3.254 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 5.495 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 10.630 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 11.234 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [ 14.060 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 10.535 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [ 13.519 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 2.164 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 10.405 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 11.514 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [ 9.201 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [ 0.129 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [01:07 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 31:41 min
[INFO] Finished at: 2017-12-10T11:55:28+08:00
[INFO] Final Memory: 166M/494M
[INFO] ------------------------------------------------------------------------
There is a lot to download, so the download phase alone takes a long time.
The whole build takes roughly 2-3 hours. If problems appear along the way:
1. A package download sometimes stalls because the connection to the site hangs; press Ctrl+C and rerun the build command.
2. If a file is reported missing, clean the Maven build first (mvn clean) and then rebuild (or just use the variant that includes clean).
A shortcut: if another machine has already built successfully, copy its /root/.m2 directory (which is simply the local Maven repository) to /root/ on the new machine.
[root@hadoop001 ~]# cd .m2
[root@hadoop001 .m2]# pwd
/root/.m2
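In practice the copy is usually done with scp, e.g. `scp -r /root/.m2 root@hadoop002:/root/` (the hostname hadoop002 is hypothetical). The effect can be sketched locally with plain cp; the point is simply that the whole repository tree, cached artifacts included, moves as one unit:

```shell
#!/bin/sh
# Simulate copying a local Maven repository between two "machines"
# (two temp directories standing in for /root on each host).
src=$(mktemp -d); dst=$(mktemp -d)

# A fake artifact in the source repository (placeholder content, not a real jar).
mkdir -p "$src/.m2/repository/junit/junit/4.12"
echo fake-jar > "$src/.m2/repository/junit/junit/4.12/junit-4.12.jar"

cp -a "$src/.m2" "$dst/"   # same idea as: scp -r /root/.m2 root@host:/root/

ls "$dst/.m2/repository/junit/junit/4.12"   # the cached artifact arrived
```

On the next build, Maven finds these artifacts in ~/.m2 and skips re-downloading them, which is where most of the 2-3 hours goes.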
After a successful build, a tarball named hadoop-2.8.1.tar.gz is produced.
Path: /opt/sourcecode/hadoop-2.8.1-src/hadoop-dist/target/hadoop-2.8.1.tar.gz