Installing Hadoop in Local (Standalone) Mode, with a Simple Example

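1: Set up the Java environment

This step is not covered in detail here; plenty of tutorials are available online. Hadoop needs a JDK, so make sure one is installed and JAVA_HOME is set before continuing.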

2: Install Hadoop

Hadoop download page: https://hadoop.apache.org/releases.html

Extract the Hadoop archive:

# tar -zxvf hadoop-2.9.2.tar.gz

Configure the Hadoop environment variables:

# vi /etc/profile

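Append the following two lines to the end of the file (here the archive was extracted under /root/develop):

export HADOOP_HOME=/root/develop/hadoop-2.9.2

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Apply the configuration: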

# source /etc/profile
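
As a quick sanity check on the PATH change, the sketch below tests whether a directory appears as an entry in a PATH-style string. The function name path_contains and the fallback install path are assumptions for illustration, not part of Hadoop.

```shell
# path_contains DIR PATHSTRING
# Succeeds if DIR is one of the colon-separated entries of PATHSTRING.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the two directories added in /etc/profile.
# The fallback path is the install location used in this post.
hadoop_home="${HADOOP_HOME:-/root/develop/hadoop-2.9.2}"
for dir in "$hadoop_home/bin" "$hadoop_home/sbin"; do
    if path_contains "$dir" "$PATH"; then
        echo "on PATH: $dir"
    else
        echo "missing from PATH: $dir"
    fi
done
```

Wrapping both sides in colons means only whole entries match, so /opt/hadoop does not count as present just because /opt/hadoop/bin is.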

Run hadoop version to check that the configuration works:

# hadoop version
Hadoop 2.9.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 826afbeae31ca687bc2f8471dc841b66ed2c6704
Compiled by ajisaka on 2018-11-13T12:42Z
Compiled with protoc 2.5.0
From source with checksum 3a9939967262218aa556c684d107985
This command was run using /root/develop/hadoop-2.9.2/share/hadoop/common/hadoop-common-2.9.2.jar

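The following simple example demonstrates running a job. The source code can be downloaded from:

https://download.csdn.net/download/vincent_yuan89/10883834

Note: this is the companion source code for the book "Hadoop: The Definitive Guide, 4th Edition" (《Hadoop权威指南.大数据的存储与分析.第4版》).

Package the source project into a jar, for example hadoop-examples.jar.

Set the HADOOP_CLASSPATH environment variable so that the hadoop command can find the application classes:

# export HADOOP_CLASSPATH=hadoop-examples.jar

Run the hadoop command: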

# hadoop MaxTemperature input/sample.txt output

The arguments are, in order: the application class to run, the input data, and the output directory.
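
One thing to watch when re-running the job: Hadoop refuses to start a job whose output directory already exists, so a second run with the same arguments fails. Below is a minimal sketch of a helper that clears the previous local output first. The name clean_output is hypothetical, and the approach only applies in local mode, where the output is an ordinary directory on the local file system.

```shell
# clean_output DIR
# Delete a previous local output directory, if there is one.
# Refuses an empty argument so that a typo cannot expand to "rm -r".
clean_output() {
    if [ -z "$1" ]; then
        echo "clean_output: missing directory name" >&2
        return 1
    fi
    if [ -d "$1" ]; then
        rm -r -- "$1"
    fi
}

# Typical use before re-running the example:
#   clean_output output && hadoop MaxTemperature input/sample.txt output
```

Calling it when the directory is absent is a no-op, so it is safe to put in front of every run.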

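When the job finishes it prints its counters, as shown below. The FILE: entries indicate that the job read and wrote the local file system rather than HDFS, which is what local mode does.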

 File System Counters
                FILE: Number of bytes read=25034
                FILE: Number of bytes written=947710
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=5
                Map output records=5
                Map output bytes=45
                Map output materialized bytes=61
                Input split bytes=94
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=61
                Reduce input records=5
                Reduce output records=2
                Spilled Records=10
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=39
                Total committed heap usage (bytes)=243048448
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters 
                Bytes Read=529
        File Output Format Counters 
                Bytes Written=29

The results are written to the output directory:

# cat output/part-r-00000
1949 111
1950 22
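
To make the result easier to read: the job keeps, for each year, the highest temperature found in the input. The sketch below reproduces that grouping with awk on simplified "year temperature" pairs. It is not the book's MapReduce code, it does not parse the raw weather record format, and the five sample pairs are illustrative values chosen to match the counters above (5 input records, 2 output records).

```shell
# max_by_year: read "year temperature" pairs on stdin and
# print the maximum temperature seen for each year, sorted by year.
max_by_year() {
    awk '
        # First value for a year, or a new maximum: remember it.
        !($1 in max) || $2 + 0 > max[$1] { max[$1] = $2 + 0 }
        END { for (y in max) print y, max[y] }
    ' | sort
}

printf '1950 0\n1950 22\n1950 -11\n1949 111\n1949 78\n' | max_by_year
# prints:
# 1949 111
# 1950 22
```

The trailing sort is there because awk does not guarantee any order when iterating over an array.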

 
