Setting Up a Hadoop Environment on Linux, with a Test Case

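Configuring the Hadoop Environment

Download and Install

Download the Oracle JDK from the JDK download page.
This yields: jdk-14.0.2_linux-x64_bin.tar.gz

Download Hadoop from the Hadoop download page.
This yields: hadoop-3.3.0.tar.gz

Move both archives to /usr/local and extract each one there: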

$ sudo tar xzf hadoop-3.3.0.tar.gz
$ sudo mv hadoop-3.3.0 hadoop
$ sudo tar xzf jdk-14.0.2_linux-x64_bin.tar.gz

Configure Environment Variables

$ vim ~/.bashrc

Append the following to the end of the file:

#set oracle jdk && hadoop environment 
export JAVA_HOME=/usr/local/jdk-14.0.2
export HADOOP_HOME=/usr/local/hadoop
# JDK 9 and later have no separate jre/ directory, so JRE_HOME is not set
export CLASSPATH=.:${JAVA_HOME}/lib
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH

Run source ~/.bashrc to apply the configuration immediately.

Check that the configuration is correct:

$ java -version
java version "14.0.2" 2020-07-14
Java(TM) SE Runtime Environment (build 14.0.2+12-46)
Java HotSpot(TM) 64-Bit Server VM (build 14.0.2+12-46, mixed mode, sharing)
$ hadoop version
Hadoop 3.3.0

Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r aa96f1871bfd858f9bac59cf2a81ec470da649af
Compiled by brahma on 2020-07-06T18:44Z
Compiled with protoc 3.7.1
From source with checksum 5dc29b802d6ccd77b262ef9d04d19c4
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-3.3.0.jar
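
Beyond the version checks above, it can help to confirm that the exported variables actually point at existing directories. The following is a minimal sketch; the helper name check_dir_var is hypothetical and not part of Hadoop or the JDK.

```shell
# Hypothetical helper: report whether a variable's value names an existing directory.
# Usage: check_dir_var JAVA_HOME "$JAVA_HOME"
check_dir_var() {
    name="$1"
    value="$2"
    if [ -d "$value" ]; then
        echo "$name OK: $value"
    else
        # Send the failure message to stderr and signal failure to the caller
        echo "$name does not point to a directory: $value" >&2
        return 1
    fi
}
```

With the configuration above, calling it for JAVA_HOME and HADOOP_HOME should report OK for /usr/local/jdk-14.0.2 and /usr/local/hadoop respectively.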

Simple Test

Use the example MapReduce jar that ships with Hadoop to count how many times each word occurs in a set of text files.

$ mkdir input
$ cp $HADOOP_HOME/*.txt input
$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.0.jar wordcount input output

Two files appear in the output folder:

part-r-00000  _SUCCESS

The word counts are stored in part-r-00000:

"AS     3
"Contribution"  1
"Contributor"   1
"Derivative     1
"Legal  1
"License"       1
"License");     1
"Licensor"      1
"NOTICE"        1
"Not    1
"Object"        1
"Software"),    1
"Source"        1
"Work"  1
"You"   1
"Your") 1
…
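
For small inputs, counts like these can be cross-checked locally without Hadoop. The sketch below assumes tokens are split on whitespace only, which is how the listing above appears to behave (punctuation stays attached to words). The function name local_wordcount is hypothetical.

```shell
# Hypothetical helper: count whitespace-separated tokens in the given files and
# print "token<TAB>count" lines sorted by token in byte order.
local_wordcount() {
    cat "$@" |
        tr -s '[:space:]' '\n' |   # put one token on each line
        grep -v '^$' |             # drop empty lines
        LC_ALL=C sort |            # byte-order sort so equal tokens are adjacent
        uniq -c |                  # prefix each distinct token with its count
        awk '{ print $2 "\t" $1 }' # reorder to token, tab, count
}
```

Running local_wordcount input/*.txt and comparing against part-r-00000 is a quick sanity check; any differences would suggest the whitespace-splitting assumption does not hold for that input.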

_SUCCESS is an empty marker file that indicates the job completed successfully.
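
Since the marker only appears when the job succeeds, a script can test for it before reading the results. This is a minimal sketch for a job output directory on the local filesystem, as in the run above; the function name show_results is hypothetical.

```shell
# Hypothetical helper: print the reducer output only if the _SUCCESS marker exists.
show_results() {
    out_dir="$1"
    if [ -f "$out_dir/_SUCCESS" ]; then
        # Reducer output files are named part-r-NNNNN
        cat "$out_dir"/part-r-*
    else
        echo "no _SUCCESS marker in $out_dir" >&2
        return 1
    fi
}
```

For the run above, show_results output would print the contents of part-r-00000.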
