MapReduce Case Study: Sorting and Secondary Sorting

Table of Contents

  • 1. Data: GroupingComparator.txt
  • 2. Requirement
  • 3. Code
    • 1) The serializable bean that holds the output data
    • 2) Mapper
    • 3) Reducer
    • 4) Driver class

1. Data: GroupingComparator.txt

0000001 Pdt_01 222.8
0000002 Pdt_06 722.4
0000001 Pdt_05 25.8
0000003 Pdt_01 222.8
0000003 Pdt_01 33.8
0000002 Pdt_03 522.8
0000002 Pdt_04 122.4

2. Requirement

Sort records by order id in ascending order; within the same order id, sort by price in descending order.
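As a sanity check, the intended ordering can be sketched in plain Java over the sample records before wiring up Hadoop (the class and method names here are illustrative, not part of the job):

```java
import java.util.ArrayList;
import java.util.List;

public class OrderSortSketch {
    // Each record is {orderId, price}, mirroring the two fields FlowBean will carry.
    static List<double[]> sortedSample() {
        List<double[]> records = new ArrayList<>();
        records.add(new double[]{1, 222.8});
        records.add(new double[]{2, 722.4});
        records.add(new double[]{1, 25.8});
        records.add(new double[]{3, 222.8});
        records.add(new double[]{3, 33.8});
        records.add(new double[]{2, 522.8});
        records.add(new double[]{2, 122.4});
        records.sort((a, b) -> {
            // Primary key: order id, ascending.
            int byId = Double.compare(a[0], b[0]);
            // Secondary key: price, descending.
            return byId != 0 ? byId : Double.compare(b[1], a[1]);
        });
        return records;
    }

    public static void main(String[] args) {
        for (double[] r : sortedSample()) {
            System.out.println((long) r[0] + "\t" + r[1]);
        }
    }
}
```

This is exactly the comparison FlowBean.compareTo() will implement; the shuffle phase then applies it for free.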

3. Code

1) The serializable bean that holds the output data

@Setter
@Getter
public class FlowBean implements WritableComparable<FlowBean> {
    private long id;
    private double price;

    //No-arg constructor, called by the framework during deserialization
    public FlowBean() {
    }

    @Override
    public int compareTo(FlowBean o) {
        //Primary key: order id, ascending
        int byId = Long.compare(this.id, o.id);
        if (byId != 0) {
            return byId;
        }
        //Secondary key: price, descending; equal prices compare as 0,
        //which keeps the compareTo contract intact
        return Double.compare(o.price, this.price);
    }

    @Override
    public void write(DataOutput output) throws IOException {
        output.writeLong(this.id);
        output.writeDouble(this.price);
    }

    @Override
    public void readFields(DataInput input) throws IOException {
        this.id = input.readLong();
        this.price = input.readDouble();
    }

    @Override
    public String toString() {
        return this.id + "\t" + this.price;
    }

}
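write() and readFields() must touch the fields in the same order, or deserialization silently corrupts the bean. A minimal standalone round-trip sketch of that symmetry using java.io streams (the class and method names are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class RoundTripSketch {
    // Mirrors FlowBean.write(): long id first, then double price.
    static byte[] serialize(long id, double price) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeLong(id);
        out.writeDouble(price);
        return buf.toByteArray();
    }

    // Mirrors FlowBean.readFields(): read back in exactly the order written.
    static double[] deserialize(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        return new double[]{in.readLong(), in.readDouble()};
    }

    public static void main(String[] args) throws IOException {
        double[] r = deserialize(serialize(2L, 722.4));
        System.out.println((long) r[0] + "\t" + r[1]);
    }
}
```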

2) Mapper

//Input line: 0000001	Pdt_01	222.8
//MapReduce sorts by key, so the map output types (K2, V2) are FlowBean, NullWritable
public class SortMap extends Mapper<LongWritable, Text, FlowBean, NullWritable> {
    //Reusing a single bean is safe here: context.write() serializes it immediately
    FlowBean bean = new FlowBean();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String s = value.toString();
        String[] split = s.split("\t");
        bean.setId(Long.parseLong(split[0]));
        bean.setPrice(Double.parseDouble(split[2]));
        context.write(bean, NullWritable.get());
    }
}
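The map step is just a tab split plus two numeric parses. A standalone sketch of that extraction (the class and method names are illustrative, not part of the job):

```java
public class LineParseSketch {
    // Extracts {orderId, price} from a tab-separated line like "0000001\tPdt_01\t222.8".
    // Long.parseLong drops the leading zeros, so ids come out as 1, 2, 3 in the output.
    static double[] parse(String line) {
        String[] fields = line.split("\t");
        return new double[]{Long.parseLong(fields[0]), Double.parseDouble(fields[2])};
    }

    public static void main(String[] args) {
        double[] r = parse("0000001\tPdt_01\t222.8");
        System.out.println((long) r[0] + "\t" + r[1]);
    }
}
```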

3) Reducer

public class SortReduce extends Reducer<FlowBean, NullWritable, FlowBean, NullWritable> {

    //Every FlowBean key is distinct (price is part of the comparison), so
    //reduce() is called once per sorted record and simply emits it
    @Override
    protected void reduce(FlowBean bean, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
        context.write(bean, NullWritable.get());
    }
}
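Since every key is unique, this reducer is a pass-through. The data file's name, GroupingComparator.txt, hints at the usual next step: to keep only the highest-priced record per order, a grouping comparator that compares ids alone would make all records for one order arrive in a single reduce() call, whose first value is the maximum (the shuffle has already sorted prices descending). A plain-Java sketch of that effect, with illustrative names and the Hadoop machinery simulated:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopPerOrderSketch {
    // Input must already be sorted by (id asc, price desc), as the shuffle guarantees.
    // Keeping only the first record per id mimics grouping by id and taking the top value.
    static Map<Long, Double> topPricePerOrder(List<double[]> sorted) {
        Map<Long, Double> top = new LinkedHashMap<>();
        for (double[] r : sorted) {
            top.putIfAbsent((long) r[0], r[1]);
        }
        return top;
    }

    // The sample data from section 1, pre-sorted the way the shuffle would deliver it.
    static Map<Long, Double> demo() {
        List<double[]> sorted = Arrays.asList(
                new double[]{1, 222.8}, new double[]{1, 25.8},
                new double[]{2, 722.4}, new double[]{2, 522.8}, new double[]{2, 122.4},
                new double[]{3, 222.8}, new double[]{3, 33.8});
        return topPricePerOrder(sorted);
    }

    public static void main(String[] args) {
        demo().forEach((id, price) -> System.out.println(id + "\t" + price));
    }
}
```

In the real job this grouping would be a WritableComparator registered via job.setGroupingComparatorClass(); the sketch only shows what that grouping buys you.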

4) Driver class

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class MainWritable {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        args = new String[]{"F:\\input\\GroupingComparator.txt", "F:\\output\\GroupingComparator"};
        //Load the configuration
        Configuration conf = new Configuration();
        //Create the job
        Job job = Job.getInstance(conf);
        job.setJarByClass(MainWritable.class);


        //Specify the Mapper class and the map output types: FlowBean, NullWritable
        job.setMapperClass(SortMap.class);
        job.setMapOutputKeyClass(FlowBean.class);
        job.setMapOutputValueClass(NullWritable.class);
        //Specify the Reducer class and the final output types
        job.setReducerClass(SortReduce.class);
        job.setOutputKeyClass(FlowBean.class);
        job.setOutputValueClass(NullWritable.class);

        //Specify the input and output paths
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        //Block until the job finishes; exit non-zero on failure
        System.exit(job.waitForCompletion(true) ? 0 : 1);

    }
}

Run result (reconstructed from the sample data; the ids lose their leading zeros because they are parsed as long):

1	222.8
1	25.8
2	722.4
2	522.8
2	122.4
3	222.8
3	33.8


OK, mission accomplished (I love that phrase!)
