My First Hadoop Program (Hadoop 2.4.0 Cluster + Eclipse)

I. Configuring the Hadoop Environment for Eclipse

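1. On Windows, right-click My Computer -> Properties -> Advanced system settings -> Environment Variables, and set the following variables:

      JAVA_HOME=D:\ProgramFiles\Java\jdk1.7.0_67

      HADOOP_HOME=D:\TEDP_Software\hadoop-2.4.0

      PATH=.;%JAVA_HOME%\bin;%HADOOP_HOME%\bin;

2. Install the hadoop-eclipse-kepler-plugin-2.2.0.jar plugin in Eclipse and configure the Hadoop Server.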

II. The WordCount Program

1. Prepare the test files
[hadoop@master hadoop]# mkdir file 

[hadoop@master hadoop]# cd file

[hadoop@master file]# ls
[hadoop@master file]# echo "Hello world" > file1.txt
[hadoop@master file]# echo "Hello hadoop" > file2.txt

2. Create the input folder
Create a folder in HDFS: hadoop fs -mkdir /user
Set permissions: hadoop fs -chmod -R 777 /user
Create the input folder: hadoop fs -mkdir /user/input
List the folders: hadoop fs -ls /
Upload the files to HDFS: hadoop fs -put ~/file/file*.txt /user/input
Error 1:
java.net.NoRouteToHostException: No route to host
(or, in Hive: could only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s) running and 2 node(s) are excluded in this operation.)
Cause: the firewall has not been turned off. Switch to root on every host and run: service iptables stop
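
To tell a firewall problem apart from a Hadoop problem, it can help to test the TCP connection from the client machine directly. The following is only a minimal sketch, not part of the original setup: the class name PortCheck and the method canConnect are made up for illustration, and the default address is the NameNode address 192.168.1.200:9000 used later in this post. Uploads also talk to the DataNodes, whose port depends on your configuration, so pass the host and port you want to test as arguments.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // NoRouteToHostException, ConnectException and timeouts are all IOExceptions.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: the NameNode address used in this post.
        String host = args.length > 0 ? args[0] : "192.168.1.200";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9000;
        System.out.println(host + ":" + port + " reachable: "
                + canConnect(host, port, 3000));
    }
}
```

If this prints "reachable: false" while the Hadoop daemons are running on the target host, the firewall is a likely suspect.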
 
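3. Create a new MR (Map/Reduce) project and copy the attached WordCount.java into it
Right-click the WordCount class -> Run As -> Run Configurations, and enter the following program arguments (input path first, then output path):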
hdfs://192.168.1.200:9000/user/input hdfs://192.168.1.200:9000/user/output
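
The two arguments are ordinary URIs of the form hdfs://host:port/path. As a rough illustration of how they break down, here is a small sketch that uses only java.net.URI. The class HdfsArgs and the method parseHdfsUri are hypothetical names, and the check is stricter than Hadoop itself, which also accepts other path forms; it is only meant to catch typos before a job is submitted.

```java
import java.net.URI;

public class HdfsArgs {

    // Parses one argument and insists on the hdfs://host:port/path form.
    static URI parseHdfsUri(String arg) {
        URI uri = URI.create(arg);
        if (!"hdfs".equals(uri.getScheme())
                || uri.getHost() == null
                || uri.getPort() == -1) {
            throw new IllegalArgumentException(
                    "Expected hdfs://host:port/path but got: " + arg);
        }
        return uri;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: HdfsArgs <input path> <output path>");
            return;
        }
        for (String arg : args) {
            URI uri = parseHdfsUri(arg);
            System.out.println("host=" + uri.getHost()
                    + " port=" + uri.getPort()
                    + " path=" + uri.getPath());
        }
    }
}
```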
 
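4. Run on Hadoop
(1) Exception 1: Exception in thread "main" java.lang.NullPointerException
Solution: according to posts found via Baidu, this is a bug in Hadoop on Windows; it does not occur on Linux.
Download hadoop-common-2.2.0-bin-master.zip, extract it, and

copy the files from its bin folder into .\hadoop-2.4.0\bin, overwriting the existing files.

Then copy hadoop.dll from that bin folder into C:\Windows\System32 and restart the computer.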

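(2) Exception 2: 14/12/02 21:01:01 ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

Solution: set the local environment variable HADOOP_HOME=D:\Soft\Linux\hadoop-2.4.0 (use the path of your own Hadoop installation). A restart is required for it to take effect.

If you do not want to restart, add this line to the code instead: System.setProperty("hadoop.home.dir", "D:\\Soft\\Linux\\hadoop-2.4.0");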
(3) Exception 3: Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://192.168.1.200:9000/user/output already exists

Solution: the output folder already exists. Either use a different output folder or delete the existing one.

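(4) Exception 4: nothing happens after clicking Run (this occurred later, when I created a second Hadoop program)

Solution: in Run Configurations -> Main, the Main class turned out to be jline.ANSIBuffer. Change it to WordCount and click "Run".

Note: if you launch through the "Run As" -> "Run On Hadoop" menu, enter or select WordCount in the Select Type dialog that pops up.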
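
The "null" in the path null\bin\winutils.exe from exception 2 suggests that no Hadoop home directory was found at all. Before launching the job, the same lookup can be reproduced with a few lines of plain Java. This is a sketch under assumptions: the class HadoopHomeCheck and its two helper methods are invented for illustration, and the lookup order (the hadoop.home.dir system property first, then the HADOOP_HOME environment variable) reflects my understanding of Hadoop 2.x rather than anything stated in this post.

```java
import java.io.File;

public class HadoopHomeCheck {

    // Picks the Hadoop home: the system property wins, then the environment variable.
    static String resolveHadoopHome(String systemProperty, String envVariable) {
        if (systemProperty != null && !systemProperty.isEmpty()) {
            return systemProperty;
        }
        if (envVariable != null && !envVariable.isEmpty()) {
            return envVariable;
        }
        return null;
    }

    // Builds the expected location of winutils.exe below the Hadoop home.
    static File winutilsPath(String hadoopHome) {
        return new File(new File(hadoopHome, "bin"), "winutils.exe");
    }

    public static void main(String[] args) {
        String home = resolveHadoopHome(
                System.getProperty("hadoop.home.dir"),
                System.getenv("HADOOP_HOME"));
        if (home == null) {
            System.out.println("Neither hadoop.home.dir nor HADOOP_HOME is set.");
            return;
        }
        File winutils = winutilsPath(home);
        System.out.println(winutils + (winutils.isFile() ? " found" : " is missing"));
    }
}
```

Running it from the same Eclipse launch configuration as the job shows which environment the job itself will see.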

5. OK, the job's output:

Hello 2

hadoop 1

world 1
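
To see where these numbers come from without a cluster, the same counting logic can be run locally in plain Java. This is only a sketch and not a Hadoop program: the class LocalWordCount is a made-up name, the two input lines are the contents of file1.txt and file2.txt created in step 1, a TreeMap stands in for the sorted key order of the job output, and Map.merge requires Java 8 or newer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {

    // Tokenizes every line on whitespace (like the mapper) and sums per word (like the reducer).
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                counts.merge(tokenizer.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same contents as file1.txt and file2.txt from step 1.
        List<String> lines = Arrays.asList("Hello world", "Hello hadoop");
        for (Map.Entry<String, Integer> entry : countWords(lines).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

"Hello" sorts before "hadoop" because uppercase letters come before lowercase ones in the default string ordering, which matches the order of the result above.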

6. Attachment: WordCount.java
 
import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

// WordCount written against the classic org.apache.hadoop.mapred API.
public class WordCount {

    // Mapper: splits each input line into tokens and emits (word, 1) per token.
    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer: sums all counts received for one word.
    public static class Reduce extends MapReduceBase implements
            Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    // args[0] is the input path, args[1] is the output path (which must not exist yet).
    public static void main(String[] args) throws Exception {

        // Uncomment on Windows if HADOOP_HOME is not set (see exception 2 above).
        // System.setProperty("hadoop.home.dir", "D:\\Soft\\Linux\\hadoop-2.4.0");

        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        // Types of the final (word, count) output pairs.
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // Reduce doubles as the combiner because summing is associative and commutative.
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        // Plain text in, plain text out.
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Submits the job and blocks until it finishes.
        JobClient.runJob(conf);
    }
}

Reference: http://www.cnblogs.com/xia520pi/archive/2012/05/16/2504205.html

(End)
 
