Notes on writing a MapReduce program in IntelliJ IDEA, packaging it as a jar, and running it on a server

1. Create a new project, create the package directory, and write a simple word count demo:

package com.hadoop.mapreduce.wordcount;

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyWordCount {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(true);
        Job job = Job.getInstance(conf);
        job.setJarByClass(MyWordCount.class);
        job.setJobName("wordcount-01");

        // Input and output paths are hard-coded for this demo

        Path inputPath = new Path("/user/root/test.txt");
        FileInputFormat.addInputPath(job, inputPath);

        // The job fails if the output directory already exists, so remove it first
        Path outputPath = new Path("/result/output");
        if (outputPath.getFileSystem(conf).exists(outputPath)) {
            outputPath.getFileSystem(conf).delete(outputPath, true);
        }

        FileOutputFormat.setOutputPath(job, outputPath);

        job.setMapperClass(MyMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

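        job.setReducerClass(MyReducer.class);
        // Final (reducer) output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Submit the job, wait for it to finish, and exit non-zero on failure
        System.exit(job.waitForCompletion(true) ? 0 : 1);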

    }

    public static class MyMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emit (word, 1) for every whitespace-separated token in the line
        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }

    }

    public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        // Sum the counts emitted for each word
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }

    }

}
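
Before packaging, it can help to sanity-check the counting logic without a cluster. The following is a minimal sketch in plain Java, not part of the original project: the class name WordCountSketch and its methods are hypothetical, and it only imitates what MyMapper (tokenize on whitespace) and MyReducer (sum per word) do, with no Hadoop classes involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical helper for local checks only; it does not use Hadoop.
public class WordCountSketch {

    // "Map" side: split one line on whitespace, as MyMapper does with StringTokenizer.
    public static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            words.add(itr.nextToken());
        }
        return words;
    }

    // "Reduce" side: add 1 per occurrence of each word, as MyReducer sums its values.
    public static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // keeps words sorted by key
        for (String line : lines) {
            for (String word : tokenize(line)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello  mapreduce");
        // Print word and count separated by a tab, one pair per line.
        count(lines).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

If this gives the counts you expect for a few sample lines, any difference on the cluster is more likely to come from the job configuration or the input file than from the tokenizing logic.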

2. Build the jar in IDEA:

Step 1: Add a new, empty JAR artifact (in Project Structure, under Artifacts).


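Step 2: My project contains other code as well, so I package only the classes I need. Inside the artifact, create folders that match the package path of the class.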


Step 3: Add the compiled class files to that directory.


Step 4: Add META-INF. You can create a new one or reuse one you created earlier.


Step 5: Build the jar.


After these steps the jar is in the default output directory. You can change the output path to whatever you prefer.
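
Because the classes are added to the artifact by hand, it is easy to end up with a jar whose folder layout does not match the package name, or one that is missing the nested classes (javac compiles MyMapper and MyReducer into separate files named MyWordCount$MyMapper.class and MyWordCount$MyReducer.class). The sketch below is one way to check the jar before uploading it. The class JarLayoutCheck is hypothetical, and the default jar name unnamed.jar is taken from the run command in these notes.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper: checks that a jar contains the classes at the expected paths.
public class JarLayoutCheck {

    // Convert a fully qualified class name to the entry path it must have inside the jar.
    public static String entryPathFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True if the jar contains the class at the path that matches its package.
    public static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryPathFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Jar path comes from the first argument; unnamed.jar is assumed otherwise.
        String jarPath = args.length > 0 ? args[0] : "unnamed.jar";
        String[] expected = {
            "com.hadoop.mapreduce.wordcount.MyWordCount",
            "com.hadoop.mapreduce.wordcount.MyWordCount$MyMapper",
            "com.hadoop.mapreduce.wordcount.MyWordCount$MyReducer"
        };
        for (String className : expected) {
            System.out.println(entryPathFor(className) + " present: " + containsClass(jarPath, className));
        }
    }
}
```

If any of the three entries is reported as missing, the job will most likely fail on the server with a ClassNotFoundException, so it is worth going back to steps 2 and 3 before uploading.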

3. Upload the jar to the server and run it:

hadoop jar unnamed.jar com.hadoop.mapreduce.wordcount.MyWordCount
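
The command takes no further arguments because the demo hard-codes its input and output paths. Anything placed after the class name on the hadoop jar command line is passed to main as args, so the paths could be read from there instead. The following is only a sketch of that idea: the JobPaths class is hypothetical and not part of the original code, and it falls back to the same paths the demo uses.

```java
// Hypothetical helper: picks input/output paths from the command line,
// falling back to the paths hard-coded in the demo.
public class JobPaths {

    public static final String DEFAULT_INPUT = "/user/root/test.txt";
    public static final String DEFAULT_OUTPUT = "/result/output";

    public final String input;
    public final String output;

    private JobPaths(String input, String output) {
        this.input = input;
        this.output = output;
    }

    // args[0] is the input path and args[1] the output path, when present.
    public static JobPaths fromArgs(String[] args) {
        String in = args.length > 0 ? args[0] : DEFAULT_INPUT;
        String out = args.length > 1 ? args[1] : DEFAULT_OUTPUT;
        return new JobPaths(in, out);
    }

    public static void main(String[] args) {
        JobPaths paths = fromArgs(args);
        System.out.println("input=" + paths.input + " output=" + paths.output);
    }
}
```

In MyWordCount.main, the two Path objects would then be built from paths.input and paths.output, and the job could be started with the paths appended to the command, for example: hadoop jar unnamed.jar com.hadoop.mapreduce.wordcount.MyWordCount /user/root/test.txt /result/output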
