First, we need to build the Eclipse plugin.
Download the source code:
git clone https://github.com/winghc/hadoop2x-eclipse-plugin.git

Note that ant must be installed before building. If it is not installed, download it from the Apache website:
http://ant.apache.org/bindownload.cgi

Once the plugin source has been downloaded, the current directory will contain a folder named hadoop2x-eclipse-plugin. Change into hadoop2x-eclipse-plugin/src/contrib/eclipse-plugin and run the following command:
ant jar -Dversion=2.6.0 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/usr/local/hadoop

After a while, the following message indicates that the build is complete:
[jar] Building jar: /home/hadoop/software/hadoop2x-eclipse-plugin/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.6.0.jar

Then copy the plugin hadoop-eclipse-plugin-2.6.0.jar into the Eclipse plugins directory /usr/local/eclipse/plugins.
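For example, using the build path from the log above:

cp /home/hadoop/software/hadoop2x-eclipse-plugin/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.6.0.jar /usr/local/eclipse/plugins/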
Then start Eclipse.
Configure the Hadoop installation directory:
Window -> Preferences -> Hadoop Map/Reduce -> Hadoop installation directory
Configure the Map/Reduce view:
Window -> Open Perspective -> Other -> Map/Reduce -> click "OK"
Window -> Show View -> Other -> Map/Reduce Locations -> click "OK"
Right-click in the blank area of the Map/Reduce Locations view and select "New Hadoop location…". In the "New Hadoop location…" dialog, configure the following.
Note that the Map/Reduce and DFS settings here must match the values in mapred-site.xml and core-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>master.hadoop:9001</value>
</property>

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master.hadoop:9000</value>
</property>

(Note that mapred.job.tracker takes a plain host:port value, not an http:// URL.)

Next, write a simple program to test the Hadoop cluster.
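Before creating the project, you can optionally verify from plain Java code that the cluster address above is reachable. The following is a minimal sketch, assuming the fs.defaultFS value from core-site.xml above and the Hadoop client jars on the classpath (the class name HdfsCheck is just illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCheck {
    public static void main(String[] args) throws Exception {
        // Assumption: same fs.defaultFS as configured in core-site.xml above.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://master.hadoop:9000");
        FileSystem fs = FileSystem.get(conf);
        // List the HDFS root directory; printed paths mean the connection works.
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}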
File -> New -> Project -> Map/Reduce Project -> Next. Add the word-count program:
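(The program below uses the classic org.apache.hadoop.mapred API; Hadoop 2.x still supports it alongside the newer org.apache.hadoop.mapreduce API.)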
package wordcount;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    // Mapper: emit (word, 1) for every token in each input line.
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer: sum the counts emitted for each word.
    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // args[0] = input path, args[1] = output path (must not exist yet)
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

Then select the project, right-click Run As -> Run Configurations:
In "Main class", select the class you just wrote, then click the "(x)= Arguments" tab and fill in the program arguments:
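For example (both paths are only illustrative and assume the fs.defaultFS configured above; the input directory must already contain text files, and the output directory must not exist yet):

hdfs://master.hadoop:9000/user/hadoop/input hdfs://master.hadoop:9000/user/hadoop/output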
Then click Run -> Run As -> Run on Hadoop, select wordcount-wordcount, and click OK to run the program.
Click DFS Locations on the left to browse the files in the Hadoop cluster; navigate to the output file to see the results.
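With the TextOutputFormat used above, each line of the output file (part-00000 under the output directory) is a word and its count separated by a tab; for example (the counts shown are illustrative):

hello	3
world	2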
With that, the hadoop-eclipse development environment is complete!