Download the same Hadoop version as the one installed on your Linux machine; mine is hadoop-2.7.2.
Download: http://archive.apache.org/dist/hadoop/common/hadoop-2.7.2/
Extract the archive.
Plugin download: http://download.csdn.net/detail/tondayong1981/9432425
Copy hadoop-eclipse-plugin-2.7.2.jar (the exact version depends on your Hadoop version) into the plugins folder of your Eclipse installation directory. If you see the following view after restarting Eclipse, the Hadoop plugin has been installed successfully:
Then configure the Hadoop installation path in Eclipse, pointing it at the extracted Hadoop directory, as shown in the figure.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>axc-hadoop</groupId>
  <artifactId>axc</artifactId>
  <packaging>war</packaging>
  <version>0.0.1-SNAPSHOT</version>
  <name>axc Maven Webapp</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.7.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>2.7.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.7.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-1.2-api</artifactId>
      <version>2.7</version>
    </dependency>
    <dependency>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
      <version>1.2</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>jdk.tools</groupId>
      <artifactId>jdk.tools</artifactId>
      <version>1.7</version>
      <scope>system</scope>
      <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
    </dependency>
  </dependencies>
  <build>
    <finalName>axc</finalName>
  </build>
</project>
Copy the three core configuration files from etc/hadoop of the Hadoop installation on Linux into the project: core-site.xml, hdfs-site.xml, and mapred-site.xml.
My local configuration is as follows:
(1) core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/liaohui/tools/hadoop/hadoop-2.7.2/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://liaomaster:9000</value>
  </property>
</configuration>
(2) hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/liaohui/tools/hadoop/hadoop-2.7.2/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/liaohui/tools/hadoop/hadoop-2.7.2/tmp/dfs/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
(3) mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>liaomaster:9001</value>
  </property>
</configuration>
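All three files share the same <configuration>/<property>/<name>/<value> layout. As a quick sanity check that a file is well-formed, the name/value pairs can be read back with plain JDK XML parsing. This is an illustration only, not how Hadoop itself loads these files (it uses its Configuration class); the ConfigPeek class name and the inlined XML string are made up for the demo:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ConfigPeek {
    public static void main(String[] args) throws Exception {
        // A minimal stand-in for core-site.xml, inlined for the demo
        String xml = "<configuration>"
                + "<property><name>fs.defaultFS</name>"
                + "<value>hdfs://liaomaster:9000</value></property>"
                + "</configuration>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Walk every <property> element and print its <name>/<value> pair
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent();
            String value = p.getElementsByTagName("value").item(0).getTextContent();
            System.out.println(name + " = " + value);
        }
        // prints: fs.defaultFS = hdfs://liaomaster:9000
    }
}
```

If a closing tag is missing, the parse fails immediately, which is easier to debug than a job that silently falls back to defaults.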
package com.liaohui;
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
public class WordCount {

	// (The source was truncated after the mapper declaration; the body below
	// is reconstructed as the standard old-mapred-API word count.)
	public static class WordCountMapper extends MapReduceBase
			implements Mapper<Object, Text, Text, IntWritable> {
		private final static IntWritable one = new IntWritable(1);
		private final Text word = new Text();

		public void map(Object key, Text value,
				OutputCollector<Text, IntWritable> output, Reporter reporter)
				throws IOException {
			// Split each input line into tokens and emit (word, 1)
			StringTokenizer itr = new StringTokenizer(value.toString());
			while (itr.hasMoreTokens()) {
				word.set(itr.nextToken());
				output.collect(word, one);
			}
		}
	}

	public static class WordCountReducer extends MapReduceBase
			implements Reducer<Text, IntWritable, Text, IntWritable> {
		public void reduce(Text key, Iterator<IntWritable> values,
				OutputCollector<Text, IntWritable> output, Reporter reporter)
				throws IOException {
			// Sum all counts emitted for this word
			int sum = 0;
			while (values.hasNext()) {
				sum += values.next().get();
			}
			output.collect(key, new IntWritable(sum));
		}
	}

	public static void main(String[] args) throws IOException {
		JobConf conf = new JobConf(WordCount.class);
		conf.setJobName("wordcount");
		conf.setOutputKeyClass(Text.class);
		conf.setOutputValueClass(IntWritable.class);
		conf.setMapperClass(WordCountMapper.class);
		conf.setReducerClass(WordCountReducer.class);
		conf.setInputFormat(TextInputFormat.class);
		conf.setOutputFormat(TextOutputFormat.class);
		// args[0] = HDFS input path, args[1] = HDFS output path (must not exist)
		FileInputFormat.setInputPaths(conf, new Path(args[0]));
		FileOutputFormat.setOutputPath(conf, new Path(args[1]));
		JobClient.runJob(conf);
	}
}
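As a sanity check on the logic, the same counting the job performs can be sketched in plain Java without a cluster. WordCountLocal is a hypothetical helper for illustration, not part of the Hadoop job:

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountLocal {
    // Tokenize on whitespace and tally, exactly what the map + reduce phases compute
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted like the job's output
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("hello hadoop hello eclipse"));
        // prints: {eclipse=1, hadoop=1, hello=2}
    }
}
```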
(1) Set two environment variables on your local Windows machine:
HADOOP_HOME = E:\java\openSource\hadoop\hadoop-2.7.7
and add %HADOOP_HOME%\bin to PATH.
(2) Problem: Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
Analysis: hadoop.dll is missing from C:\Windows\System32; copying it there fixes the error.
Fix: copy hadoop.dll from the bin folder of hadoop-common-2.2.0-bin-master into C:\Windows\System32 and restart the machine. It may not be that simple, though; the error can still appear.
Looking further, the failing frame is at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557).
This Windows-only method checks whether the current process has the requested access rights on a given path, so as a workaround we can make the check always succeed by patching the source to return true. Download the matching Hadoop source archive (hadoop-2.6.0-src.tar.gz), extract it, copy NativeIO.java from hadoop-2.6.0-src\hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\io\nativeio into the corresponding package of the Eclipse project, and change line 557 to return true, as shown in the figure.
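The effect of that one-line patch can be sketched as follows. This is a simplified stand-in for illustration, not the real NativeIO class; the native permission check is simply short-circuited, which is acceptable only for local development:

```java
public class NativeIOPatchSketch {
    // In the real NativeIO.java, Windows.access(path, desiredAccess) delegates
    // to the native access0 call that throws UnsatisfiedLinkError here.
    // The workaround replaces the check with an unconditional "allowed".
    public static boolean access(String path, int desiredAccess) {
        return true; // dev-only workaround: skip the native Windows permission check
    }

    public static void main(String[] args) {
        System.out.println(access("C:\\tmp\\input", 0));
        // prints: true
    }
}
```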
(3) Problem: org.apache.hadoop.security.AccessControlException: Permission denied: user=zhengcy, access=WRITE, inode="/user/root/output":root:supergroup:drwxr-xr-x
Analysis: our client user has no write permission on the output directory.
Fix: in the same etc/hadoop/hdfs-site.xml where the HDFS storage directories are configured, add:
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
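Note that disabling dfs.permissions turns off HDFS permission checking cluster-wide, which is only acceptable on a development setup. A gentler alternative (assuming a 2.x client, where Hadoop's UserGroupInformation also honors the HADOOP_USER_NAME variable) is to make the Windows client identify itself as the HDFS superuser; the snippet below only demonstrates setting the property, it does not touch a cluster:

```java
public class HdfsUserWorkaround {
    public static void main(String[] args) {
        // Make subsequent Hadoop client calls run as user "root" instead of
        // the local Windows account (assumed UserGroupInformation behavior).
        System.setProperty("HADOOP_USER_NAME", "root");
        System.out.println(System.getProperty("HADOOP_USER_NAME"));
        // prints: root
    }
}
```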
(1) Before running the program, configure the Hadoop environment in Eclipse.
First start Hadoop on the Linux side, as shown in the figure.
Then configure the Hadoop location in Eclipse: right-click in the locations view and create a new Hadoop location.
(2) The word-count result is produced successfully.