CC00064.hadoop——|Hadoop&MapReduce.V35|——|Hadoop.v35|| Setting Up a Hadoop Secondary Development Environment | Worked Example |

1. Setting up the Hadoop secondary development environment
### --- System environment

~~~     Host: linux122: CentOS-7_x86_64
protobuf: protoc-2.5.0
maven: maven-3.6.3
hadoop: hadoop-2.9.2
java: jdk1.8.0_231
cmake: cmake-2.8.12.2
OpenSSL: OpenSSL 1.0.2k-fips
findbugs: findbugs-1.3.9
### --- Preparation

~~~     # Install the libraries required for the build
~~~     # Run on host linux122
[root@linux122 ~]# yum install -y lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel gcc gcc-c++
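Before starting the (long) Hadoop build, a quick pre-flight check for the tools in the environment table above can save a failed compile later. A minimal sketch; the helper name is mine, and the tool list should be adjusted to your setup:

```shell
# Report any build tool that is not yet on PATH; returns non-zero
# if something is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return $missing
}

# Run after finishing the installs below, before `mvn package`:
# check_tools mvn protoc cmake gcc g++ openssl findbugs
```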
2. Installing Maven
### --- Upload the Maven binary package and unpack it

~~~     # Upload the Maven package and extract it
[root@linux122 software]# tar -zxvf apache-maven-3.6.3-bin.tar.gz -C /usr/local/
### --- Configure the system environment variables

~~~     # Add Maven to the system environment variables
[root@linux122 software]# vim /etc/profile
# MAVEN_HOME
export MAVEN_HOME=/usr/local/apache-maven-3.6.3
export PATH=$PATH:$MAVEN_HOME/bin
~~~     # Reload the profile
[root@linux122 software]# source /etc/profile
### --- Verify that Maven installed successfully

[root@linux122 software]# mvn -version
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/apache-maven-3.6.3
Java version: 1.8.0_231, vendor: Oracle Corporation, runtime: /opt/yanqi/servers/jdk1.8.0_231/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-957.el7.x86_64", arch: "amd64", family: "unix"
3. Installing protobuf (the serialization framework Hadoop uses)
### --- Install the build dependencies

[root@linux122 ~]# yum groupinstall "Development tools" -y
### --- Download the package

~~~     # Download
[root@linux122 software]# wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
~~~     # (or upload the protobuf package manually)
~~~     # Extract it
[root@linux122 software]# tar -zxvf protobuf-2.5.0.tar.gz
### --- Build and install

~~~     # Enter the extracted directory and set the install prefix (--prefix=/usr/local/protobuf-2.5.0)
[root@linux122 software]# cd protobuf-2.5.0
[root@linux122 protobuf-2.5.0]# ./configure --prefix=/usr/local/protobuf-2.5.0
~~~     # Compile
[root@linux122 protobuf-2.5.0]# make
~~~     # Run the build's self-tests
[root@linux122 protobuf-2.5.0]# make check
~~~     # Install
[root@linux122 protobuf-2.5.0]# make install
### --- Configure the environment variables

~~~     # Add protobuf to the environment variables
[root@linux122 ~]# vim /etc/profile
## PROTOBUF_HOME
export PROTOBUF_HOME=/usr/local/protobuf-2.5.0
export PATH=$PATH:$PROTOBUF_HOME/bin
~~~     # Reload the profile
[root@linux122 ~]# source /etc/profile
~~~     # Verify the installation
[root@linux122 ~]# protoc --version
libprotoc 2.5.0
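Hadoop 2.9.2's native build expects protoc 2.5.0 specifically; a newer protoc on PATH makes the build fail partway through. A small hedged check of the version string (the helper name is mine, not part of any tool):

```shell
# Succeeds only when the `protoc --version` output passed as $1
# reports exactly the 2.5.0 release Hadoop 2.9.2 expects.
protoc_is_250() {
    [ "$(printf '%s' "$1" | awk '{print $2}')" = "2.5.0" ]
}

# Usage against the freshly installed binary:
# protoc_is_250 "$(protoc --version)" && echo "protoc OK"
```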
4. Installing FindBugs
### --- Download the package

~~~     # Download (or upload the package manually)
[root@linux122 software]# wget https://jaist.dl.sourceforge.net/project/findbugs/findbugs/1.3.9/findbugs-1.3.9.tar.gz
### --- Install the package

~~~     # Extract it
[root@linux122 software]# tar -zxvf findbugs-1.3.9.tar.gz -C /usr/local/
### --- Configure the environment variables

~~~     # Add FindBugs to the system environment variables
[root@linux122 ~]# vim /etc/profile
## FINDBUGS_HOME
export FINDBUGS_HOME=/usr/local/findbugs-1.3.9
export PATH=$PATH:$FINDBUGS_HOME/bin
~~~     # Reload the profile
[root@linux122 ~]# source /etc/profile
### --- Verify the deployment

[root@linux122 ~]# findbugs -version
1.3.9
5. Adding the Aliyun mirror
### --- Open Maven's settings.xml and add the mirror entries inside the <mirrors> element

[root@linux122 ~]# vim /usr/local/apache-maven-3.6.3/conf/settings.xml

    <mirror>
      <id>nexus</id>
      <mirrorOf>*</mirrorOf>
      <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    </mirror>
    <mirror>
      <id>nexus-public-snapshots</id>
      <mirrorOf>public-snapshots</mirrorOf>
      <url>http://maven.aliyun.com/nexus/content/repositories/snapshots/</url>
    </mirror>
6. Uploading the source files
### --- Copy your own classes into the source package so they are built into the distribution

~~~     # From the wordcount project:
//  MergeInputFormat
//  MergeRecordReader

~~~     # Location in the source package
//  hadoop-2.9.2-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input

~~~     # Place your own code into the source package so it gets built in
### --- Go to the target path for the code files

~~~     # Enter the directory inside the source tree
[root@linux122 ~]# cd /root/hadoop-2.9.2-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/
~~~     # Upload MergeInputFormat.java and MergeRecordReader.java into this directory
[root@linux122 input]# pwd
/root/hadoop-2.9.2-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input
[root@linux122 input]# ll
-rw-r--r-- 1 root root     1867 Aug 19 23:01 MergeInputFormat.java
-rw-r--r-- 1 root root     2843 Aug 19 23:01 MergeRecordReader.java
### --- Build

~~~     # Enter the Hadoop source root
[root@linux122 input]# cd /root/hadoop-2.9.2-src

~~~     # Run the build
[root@linux122 hadoop-2.9.2-src]# mvn package -Pdist,native -DskipTests -Dtar

~~~     # Location of the artifacts produced by the build
[root@linux122 hadoop-2.9.2-src]# ls hadoop-dist/
pom.xml

~~~     # Jars produced by the build
[root@linux122 hadoop-dist]# pwd
/root/hadoop-2.9.2-src/hadoop-dist
[root@linux122 hadoop-dist]# ls target/
antrun                    dist-tar-stitching.sh  hadoop-dist-2.9.2.jar          hadoop-dist-2.9.2-test-sources.jar  maven-shared-archive-resources
classes                   hadoop-2.9.2           hadoop-dist-2.9.2-javadoc.jar  javadoc-bundle-options              test-classes
dist-layout-stitching.sh  hadoop-2.9.2.tar.gz    hadoop-dist-2.9.2-sources.jar  maven-archiver                      test-dir

~~~     # Location of the jar containing your own code
~~~     (this jar includes the compiled MergeInputFormat.java and MergeRecordReader.java)

[root@linux122 ~]# cd hadoop-2.9.2-src/hadoop-dist/target/hadoop-2.9.2/share/hadoop/mapreduce/
[root@linux122 mapreduce]# ls hadoop-mapreduce-client-core-2.9.2.jar
hadoop-mapreduce-client-core-2.9.2.jar
### --- Build succeeded

[INFO] Reactor Summary for Apache Hadoop Main 2.9.2:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  5.214 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  4.830 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.969 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  4.714 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.466 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  3.761 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  8.122 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  7.711 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 10.418 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  7.660 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [01:54 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 12.256 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 15.964 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.130 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [ 29.469 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [01:21 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [  5.105 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 25.855 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [  9.607 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [  6.868 s]
[INFO] Apache Hadoop HDFS-RBF ............................. SUCCESS [ 38.402 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.069 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [  0.071 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 19.898 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [ 48.027 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [  8.516 s]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [  0.087 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 19.856 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 21.764 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [  4.517 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 10.905 s]
[INFO] Apache Hadoop YARN Timeline Service ................ SUCCESS [  7.621 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 34.623 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [  1.906 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 15.152 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [  5.897 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [  4.661 s]
[INFO] Apache Hadoop YARN Router .......................... SUCCESS [  9.551 s]
[INFO] Apache Hadoop YARN TimelineService HBase Backend ... SUCCESS [ 12.263 s]
[INFO] Apache Hadoop YARN Timeline Service HBase tests .... SUCCESS [  3.132 s]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [  0.090 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [  4.193 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [  3.905 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [  0.103 s]
[INFO] Apache Hadoop YARN UI .............................. SUCCESS [  0.061 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [  7.952 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [  0.384 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 38.992 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 27.521 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [  6.161 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 21.886 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 11.528 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 15.226 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [  2.744 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 10.097 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [  3.668 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 11.195 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 10.439 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  3.504 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [  3.719 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [  8.608 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [  7.772 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  4.552 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [  3.369 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  5.278 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  9.118 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  7.151 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [ 22.322 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 12.390 s]
[INFO] Apache Hadoop Aliyun OSS support ................... SUCCESS [  8.251 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [  7.108 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  1.612 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [  8.219 s]
[INFO] Apache Hadoop Resource Estimator Service ........... SUCCESS [  7.365 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [  7.077 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 19.244 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  1.384 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [01:33 min]
[INFO] Apache Hadoop Cloud Storage ........................ SUCCESS [  5.750 s]
[INFO] Apache Hadoop Cloud Storage Project ................ SUCCESS [  0.080 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  17:32 min
[INFO] Finished at: 2021-08-19T23:55:42+08:00
[INFO] ------------------------------------------------------------------------
7. How to call your own compiled jar
### --- How to call your own compiled jar

~~~     Use a decompiler to check whether the hand-written MergeInputFormat.java
~~~     and MergeRecordReader.java were packaged into the jar.
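If no decompiler is at hand, the same check can be done from the shell by listing the jar's entries. A sketch assuming `unzip` is installed and the jar path from the build above; the helper name is mine:

```shell
# Succeeds when the jar ($1) contains a class with the simple name $2.
jar_contains_class() {
    unzip -l "$1" | grep -q "/$2\.class"
}

# Usage against the rebuilt jar:
# JAR=hadoop-2.9.2-src/hadoop-dist/target/hadoop-2.9.2/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.9.2.jar
# jar_contains_class "$JAR" MergeInputFormat  && echo "MergeInputFormat packaged"
# jar_contains_class "$JAR" MergeRecordReader && echo "MergeRecordReader packaged"
```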
8. Using the jar
### --- Using the jar

~~~     Run the program written in the wordcount project.
### --- Create a step4 project and copy the files from the existing step2 project into it,
~~~     namely com.yanqi.mr.comment.step2.MergeInputFormat and com.yanqi.mr.comment.step2.MergeRecordReader

### --- Rename them to:
~~~     MyMergeRecordReader
~~~     MyMergeInputFormat
### --- MergeDriver reports an error
### --- Import the repackaged jar,
~~~     which contains the two files MergeInputFormat.java and MergeRecordReader.java

### --- After that, the program runs normally
~~~     # Back up the original hadoop-mapreduce-client-core-2.9.2.jar as .bak
~~~     C:\Users\Administrator\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-core\2.9.2
~~~     hadoop-mapreduce-client-core-2.9.2.jar.bak
### --- Replace it with the new jar

~~~     hadoop-mapreduce-client-core-2.9.2.jar
~~~     This jar contains the two files MergeInputFormat.java and MergeRecordReader.java
Appendix 1: Troubleshooting
### --- Symptom:

[INFO] Apache Hadoop Amazon Web Services support .......... FAILED [ 7.011 s]
### --- Analysis:

~~~     A dependency is missing: the download of DynamoDBLocal:jar failed.
### --- Fix:
~~~     Building hadoop-aws:jar requires the dependency DynamoDBLocal:jar;
~~~     download the jar manually and upload it into the local Maven repository.
[root@linux122 ~]# ll  /root/.m2/repository/com/amazonaws/DynamoDBLocal/1.11.86
total 3628
-rw-r--r-- 1 root root 3713946 Nov 27  2020 DynamoDBLocal-1.11.86.jar
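Rather than copying the jar into `~/.m2` by hand, Maven can register it properly with the `install:install-file` goal. A hedged sketch (coordinates taken from the repository path above; the `m2_path` helper is mine, for checking where the jar should end up):

```shell
# Install the manually downloaded jar into the local Maven repository
# (run from the directory containing the jar):
#
#   mvn install:install-file \
#       -Dfile=DynamoDBLocal-1.11.86.jar \
#       -DgroupId=com.amazonaws \
#       -DartifactId=DynamoDBLocal \
#       -Dversion=1.11.86 \
#       -Dpackaging=jar

# Helper: the local-repo path where Maven places groupId:artifactId:version.
m2_path() {
    printf '%s/.m2/repository/%s/%s/%s/%s-%s.jar\n' \
        "$HOME" "$(printf '%s' "$1" | tr . /)" "$2" "$3" "$2" "$3"
}

# m2_path com.amazonaws DynamoDBLocal 1.11.86
# matches the path listed by the `ll` command above
```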
