Using the MapReduce cache (addCacheFile)

1. Add the cache file paths in the main() method

job.addCacheFile(new URI(args[2]));
job.addCacheFile(new URI(args[3]));
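The two calls above assume that args[2] and args[3] exist and are valid URIs. A minimal sketch of validating them up front, using only the standard library, might look like the following. The class name CacheArgs, the helper toCacheUris, and the hdfs://nn/... sample paths are hypothetical and only serve the illustration; they do not come from the original post.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

final class CacheArgs {
    // Converts args[from], args[from + 1], ... into URIs and fails early with a readable message.
    static List<URI> toCacheUris(String[] args, int from) {
        if (args.length <= from) {
            throw new IllegalArgumentException(
                "expected at least " + (from + 1) + " arguments, got " + args.length);
        }
        List<URI> uris = new ArrayList<>();
        for (int i = from; i < args.length; i++) {
            try {
                uris.add(new URI(args[i]));
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException(
                    "args[" + i + "] is not a valid URI: " + args[i], e);
            }
        }
        return uris;
    }

    public static void main(String[] args) {
        // Hypothetical command line: input path, output path, then two cache files.
        String[] sample = {"in", "out", "hdfs://nn/cache/a.txt", "hdfs://nn/cache/b.txt"};
        for (URI uri : toCacheUris(sample, 2)) {
            // In the real driver, this is where job.addCacheFile(uri) would be called.
            System.out.println(uri.getPath());
        }
    }
}
```

Failing in the driver this way surfaces a bad argument before the job is submitted, rather than as an exception inside a task.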

2. Process the cache files in the setup() method of the map or reduce class

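// Lookup table built from the first cache file. In a real Mapper/Reducer this
// would normally be a field, so that map()/reduce() can read it after setup().
HashMap<String, String> n_map = new HashMap<String, String>();

// Task-local copies of the files registered with job.addCacheFile().
// Note: getLocalCacheFiles() is marked deprecated in Hadoop 2.x.
Path[] cacheFiles = context.getLocalCacheFiles();
if (cacheFiles == null || cacheFiles.length < 2) {
    throw new IOException("Expected two cache files");
}
Path cacheFile = cacheFiles[0];
Path cacheFile2 = cacheFiles[1]; // second cache file; not read in this snippet

// try-with-resources closes the reader (and the wrapped FileReader) even if parsing fails.
try (BufferedReader reader = new BufferedReader(new FileReader(cacheFile.toUri().getPath()))) {
    String line;
    while ((line = reader.readLine()) != null) {
        // Fields are separated by the Ctrl-A character (\001).
        String[] fields = line.split("\001");
        // Skip rows that do not have at least five fields.
        if (fields.length > 4) {
            String f1 = fields[0]; // key: first field
            String f2 = fields[4]; // value: fifth field
            n_map.put(f1, f2);
        }
    }
}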
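The parsing loop above can be tried without a cluster. The following is a minimal sketch that applies the same rule (split on \001, keep field 0 as key and field 4 as value, skip short rows) to an in-memory string. The class name CacheLineParser, the method parse, and the sample rows are made up for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

final class CacheLineParser {
    // Ctrl-A, the same character that "\001" denotes in the setup() code.
    static final String DELIMITER = "\001";

    // Reads every line from the source and maps field 0 to field 4.
    static Map<String, String> parse(Reader source) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(DELIMITER);
                // Rows with fewer than five fields are ignored.
                if (fields.length > 4) {
                    lookup.put(fields[0], fields[4]);
                }
            }
        }
        return lookup;
    }

    public static void main(String[] args) throws IOException {
        // One complete row and one row that is too short to be used.
        String sample = "k1\001a\001b\001c\001v1\n" + "short\001row\n";
        System.out.println(parse(new StringReader(sample))); // prints {k1=v1}
    }
}
```

Taking a Reader rather than a file path keeps the parsing rule separate from how the cache file is located, so the same method could be fed a FileReader over the local cache path in setup().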
