Hadoop MapReduce Study Notes

1 The client (driver) program must explicitly set the map output types

        myJob.setMapOutputKeyClass(LongWritable.class);
        myJob.setMapOutputValueClass(Text.class);

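Otherwise the map task produces no output: when these are unset, Hadoop assumes the map output types are the same as the job's final output types (setOutputKeyClass / setOutputValueClass), and if the mapper actually emits different types the task fails with a type mismatch error.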
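To show where these two calls sit in a full driver, here is a minimal sketch. It assumes the Hadoop 2+ `org.apache.hadoop.mapreduce` API is on the classpath. `MapOutputTypesDriver`, `OffsetMapper`, `LengthReducer`, and the job name are hypothetical names made up for illustration, not part of the original notes. The mapper emits (LongWritable, Text) while the reducer emits (Text, IntWritable), which is the case where the map output classes have to be set explicitly.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver: class names and the per-line logic are illustrative only.
class MapOutputTypesDriver {

    // Emits (byte offset, line): map output types are LongWritable / Text.
    public static class OffsetMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(key, value);
        }
    }

    // Emits (line, byte length): final output types are Text / IntWritable.
    public static class LengthReducer extends Reducer<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void reduce(LongWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            for (Text line : values) {
                context.write(line, new IntWritable(line.getLength()));
            }
        }
    }

    static Job buildJob(Configuration conf, String input, String output) throws IOException {
        Job myJob = Job.getInstance(conf, "map-output-types-demo");
        myJob.setJarByClass(MapOutputTypesDriver.class);
        myJob.setMapperClass(OffsetMapper.class);
        myJob.setReducerClass(LengthReducer.class);

        // Map output types differ from the final output types, so set them explicitly.
        myJob.setMapOutputKeyClass(LongWritable.class);
        myJob.setMapOutputValueClass(Text.class);

        // Final (reducer) output types.
        myJob.setOutputKeyClass(Text.class);
        myJob.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(myJob, new Path(input));
        FileOutputFormat.setOutputPath(myJob, new Path(output));
        return myJob;
    }

    public static void main(String[] args) throws Exception {
        // args[0] = input path, args[1] = output path (must not exist yet)
        Job job = buildJob(new Configuration(), args[0], args[1]);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If the mapper and reducer happened to emit the same key and value types, setting only the final output classes would be enough, since the map output classes fall back to them.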

2 Reading a file from HDFS

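        // HDFS_URL and PATH are defined elsewhere; part-r-00000 is the output file of reducer 0
        String url = HDFS_URL + PATH + "/part-r-00000";
        FileSystem fs = FileSystem.get(URI.create(url), configuration);
        InputStream in = null;
        try {
            in = fs.open(new Path(url));
            // Copy the file to stdout with a 4096-byte buffer; false = do not close the streams here
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            // Null-safe close of the input stream
            IOUtils.closeStream(in);
        }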

3 Garbled Chinese text: when the input file is not UTF-8 encoded, transcode it in the mapper

    protected void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Text assumes UTF-8, so decode the raw bytes as GBK; only the first getLength() bytes are valid
        String line = new String(value.getBytes(), 0, value.getLength(), "GBK");
        ......
    }
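The effect of the charset and of the `0, value.getLength()` bounds can be checked without a cluster. The following is a plain-JDK sketch, not Hadoop code. `GbkDecodeDemo` and `decode` are hypothetical names, and the 16-byte array only simulates the reused backing buffer that `Text.getBytes()` returns, which can be longer than the valid data.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Plain-JDK sketch; GbkDecodeDemo and decode(...) are hypothetical names.
class GbkDecodeDemo {

    static final Charset GBK = Charset.forName("GBK");

    // Decode only the first `length` bytes of `buffer` with the given charset,
    // mirroring new String(value.getBytes(), 0, value.getLength(), "GBK").
    static String decode(byte[] buffer, int length, Charset charset) {
        return new String(buffer, 0, length, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";          // the two characters of the word "Chinese"
        byte[] gbkBytes = original.getBytes(GBK);  // 4 bytes in GBK

        // Simulate a reused backing buffer that is longer than the valid data.
        byte[] buffer = Arrays.copyOf(gbkBytes, 16);
        buffer[4] = 'x';                           // stale byte from an earlier, longer record

        // Correct charset and correct length: the text round-trips.
        System.out.println(original.equals(decode(buffer, gbkBytes.length, GBK)));
        // Wrong charset: GBK bytes read as UTF-8 come out garbled.
        System.out.println(original.equals(decode(buffer, gbkBytes.length, StandardCharsets.UTF_8)));
        // Wrong length: stale bytes beyond the valid data leak into the string.
        System.out.println(original.equals(decode(buffer, buffer.length, GBK)));
    }
}
```

Only the first comparison prints `true`. This is why the mapper passes both the charset and `value.getLength()` instead of calling `value.toString()`, which always decodes as UTF-8.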


