Developing Against Hadoop in a Virtual Machine from Windows with IDEA

Set up a Hadoop environment that can be developed against from Windows.
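Step 1: Disable the firewall
Disable the firewall first so that ports such as Hadoop's 50070 (the NameNode web UI) can be reached from outside the virtual machine.
To disable the firewall on CentOS 6.5:
Stop it for the current session: service iptables stop
Keep it disabled across reboots: chkconfig iptables off
Run both commands, then check that the firewall is off:
service iptables status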
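
Once the firewall is off, one way to confirm from the Windows side that a port is really reachable is to attempt a TCP connection. The sketch below is only an illustration: the class name PortCheck, the helper isPortOpen, and the 2-second timeout are assumptions and not part of Hadoop; the default address is the one used in the WordCount code later in this article.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hadoop): checks whether a TCP port accepts connections.
class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isPortOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable: all are treated as "not open".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: the VM address used later in this article and port 50070 from Step 1.
        String host = args.length > 0 ? args[0] : "192.168.137.131";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50070;
        System.out.println(host + ":" + port + " reachable: " + isPortOpen(host, port, 2000));
    }
}
```

If the check prints false while the service is running in the virtual machine, the firewall state or the VM network mode is the first thing to look at.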
Step 2: Set up a pseudo-distributed environment
For the setup itself, see the official Hadoop documentation.

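Note: for IDEA on Windows to reach Hadoop inside the virtual machine, use the IP address rather than the hostname in core-site.xml and the other configuration files; otherwise the Windows side reports a Connection Error.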
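
To make the note concrete, the sketch below parses a core-site.xml-style document and checks whether fs.defaultFS points at an IPv4 literal instead of a hostname. The class name, the helper names, and the sample XML are illustrative assumptions; the address and port are taken from the HDFS URIs in the WordCount code later in this article, and a real configuration may differ.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads <property><name>/<value> pairs from Hadoop-style XML.
class CoreSiteCheck {

    // Collects every <property> element into a name -> value map.
    static Map<String, String> parseProperties(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new HashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    // True if the URI's host looks like a dotted IPv4 literal such as 192.168.137.131.
    static boolean usesIpAddress(String uri) {
        String host = URI.create(uri).getHost();
        return host != null && host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    public static void main(String[] args) throws Exception {
        // Sample content only; in practice the string would be read from core-site.xml.
        String xml = "<configuration>"
                + "<property><name>fs.defaultFS</name>"
                + "<value>hdfs://192.168.137.131:9000</value></property>"
                + "</configuration>";
        String defaultFs = parseProperties(xml).get("fs.defaultFS");
        System.out.println(defaultFs + " uses an IP address: " + usesIpAddress(defaultFs));
    }
}
```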

Run bin/hadoop namenode -format to format the NameNode.
Run sbin/start-dfs.sh to start HDFS.
Run sbin/start-yarn.sh to start YARN.
Step 3: Windows-side configuration
1. Configure the Hadoop environment variables on Windows.

[Image: Paste_Image.png]

2. For Windows to access Hadoop, a few additional packages must be placed in the bin folder of the Hadoop directory.

[Image: Paste_Image.png]

3. On Windows, add an entry to the hosts file (C:\Windows\System32\drivers\etc\hosts) so that the hostname of the Hadoop virtual machine resolves.

[Image: Paste_Image.png]

4. Open the project in IDEA, then copy the Hadoop configuration files into the resources folder.

[Image: Paste_Image.png]
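
The Windows-side steps above can be sanity-checked with a small program before a job is submitted. This is a sketch under stated assumptions: the variable name HADOOP_HOME, the file names winutils.exe and hadoop.dll, and the hostname hadoop-vm are not named in this article (the screenshots carried those details) and are used here only as plausible examples; the class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check for the Windows-side configuration.
class WindowsSetupCheck {

    // Returns a list of human-readable problems; an empty list means every check passed.
    static List<String> findProblems(Map<String, String> env,
                                     List<String> requiredBinFiles,
                                     String hostname) {
        List<String> problems = new ArrayList<>();

        // Item 1: the environment variable (HADOOP_HOME is an assumed name).
        String home = env.get("HADOOP_HOME");
        if (home == null || home.isEmpty()) {
            problems.add("HADOOP_HOME is not set");
        } else {
            // Item 2: the extra files expected under <HADOOP_HOME>\bin.
            Path bin = Paths.get(home, "bin");
            for (String name : requiredBinFiles) {
                if (!Files.isRegularFile(bin.resolve(name))) {
                    problems.add("missing file: " + bin.resolve(name));
                }
            }
        }

        // Item 3: the hosts-file entry, verified by resolving the hostname.
        try {
            InetAddress.getByName(hostname);
        } catch (UnknownHostException e) {
            problems.add("cannot resolve host: " + hostname);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumed example values; replace them with the ones from your own setup.
        List<String> problems = findProblems(
                System.getenv(), Arrays.asList("winutils.exe", "hadoop.dll"), "hadoop-vm");
        if (problems.isEmpty()) {
            System.out.println("Windows-side configuration looks complete.");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Item 4 (the configuration files under resources) is not covered here, since whether they are on the classpath depends on how the IDEA project is built.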

Step 4: Develop a Hadoop YARN job in IDEA
This section uses WordCount as the example.
package ComponentApp;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import java.io.IOException;

/**
 * Created by IBM on 2017/7/16.
 */
public class WordCount2 implements Tool {

    private Configuration conf;

    public void setConf(Configuration configuration) {
        // Keep the configuration handed over by ToolRunner instead of discarding it.
        this.conf = configuration;
    }

    public Configuration getConf() {
        if (conf == null) {
            conf = new JobConf(WordCount2.class);
        }
        return conf;
    }

    public int run(String[] strings) throws Exception {
        Configuration conf = getConf();
        // Backslashes must be escaped inside a Java string literal.
        conf.set("mapreduce.job.jar", "D:\\java\\idea\\ComponentApp\\out\\artifacts\\ComponentApp_jar\\ComponentApp.jar");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "192.168.137.131");
        // Needed when the job is submitted from Windows to a Linux cluster.
        conf.set("mapreduce.app-submission.cross-platform", "true");

        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount2.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        job.setMapperClass(WcMapper.class);
        job.setReducerClass(WcReducer.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(job, "hdfs://192.168.137.131:9000/kason/myid");
        FileOutputFormat.setOutputPath(job, new Path("hdfs://192.168.137.131:9000/kason/out4"));

        // Report success or failure through the exit code instead of always returning 0.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    // Emits (word, 1) for every space-separated token in a line.
    public static class WcMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String mVal = value.toString();
            String[] strs = mVal.split(" ");
            for (String s : strs) {
                if (s.isEmpty()) {
                    continue; // consecutive spaces produce empty tokens; skip them
                }
                System.out.println("data:" + s);
                context.write(new Text(s), new LongWritable(1));
            }
        }
    }

    // Sums the counts emitted for each word.
    public static class WcReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable lVal : values) {
                sum += lVal.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new WordCount2(), args));
    }
}
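
Before submitting to YARN, it can help to check the counting logic on its own. The sketch below approximates the map and reduce steps of the job above in plain Java, with no Hadoop classes involved; the class name LocalWordCount and the sample lines are hypothetical, and it skips empty tokens, which is a choice made here rather than something the cluster job is guaranteed to do.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the map and reduce steps; not a Hadoop job.
class LocalWordCount {

    // Splits each line on single spaces (the "map" step) and sums a count
    // of 1 per occurrence for every word (the "reduce" step).
    static Map<String, Long> count(Iterable<String> lines) {
        Map<String, Long> totals = new TreeMap<>(); // TreeMap keeps the words sorted
        for (String line : lines) {
            for (String word : line.split(" ")) {
                if (word.isEmpty()) {
                    continue; // consecutive spaces yield empty tokens
                }
                totals.merge(word, 1L, Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Sample input only; a real run would read lines from a local copy of the input file.
        Map<String, Long> totals = count(Arrays.asList("hello hadoop", "hello yarn"));
        totals.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

For the two sample lines this yields hadoop 1, hello 2, and yarn 1, which is the same shape of result the job writes to its output directory.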
IDEA run output

[Image: Paste_Image.png]

YARN page

[Image: Paste_Image.png]

HDFS page

[Image: Paste_Image.png]
