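Copying a map task's output is implemented by the ReduceTask method long copyOutput(MapOutputLocation loc). It consists of the following steps:
1. Check whether the output has already been copied or the map attempt has been marked obsolete. If so, return -2 (CopyResult.OBSOLETE) to indicate that the data to be copied is obsolete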
// check if we still need to copy the output from this location
if (copiedMapOutputs.contains(loc.getTaskId()) ||
obsoleteMapIds.contains(loc.getTaskAttemptId())) {
return CopyResult.OBSOLETE;
}
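A minimal sketch of this first check, using plain Java sets of strings instead of Hadoop's TaskID and TaskAttemptID types. The class name ObsoleteCheckSketch, the method check, the PROCEED marker and the sample ids are hypothetical; only the value -2 for the obsolete result comes from the text above.

```java
import java.util.HashSet;
import java.util.Set;

final class ObsoleteCheckSketch {
    // Mirrors the -2 "obsolete" result mentioned in the text.
    static final long OBSOLETE = -2;
    // Hypothetical marker meaning "this output still needs to be copied".
    static final long PROCEED = 0;

    // Map task ids whose output has already been copied.
    final Set<String> copiedMapOutputs = new HashSet<>();
    // Map attempt ids that have been marked obsolete.
    final Set<String> obsoleteMapIds = new HashSet<>();

    long check(String taskId, String taskAttemptId) {
        if (copiedMapOutputs.contains(taskId) || obsoleteMapIds.contains(taskAttemptId)) {
            return OBSOLETE;
        }
        return PROCEED;
    }
}
```

Either condition alone is enough to skip the copy, which is why the two sets are tested with a logical OR.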
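2. Construct the path and file name of the map output, and the path of the local temporary file used to store the remote data
// Map output file name: output/map_<taskId>.out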
Path filename =
new Path(String.format(
MapOutputFile.REDUCE_INPUT_FILE_FORMAT_STRING,
TaskTracker.OUTPUT, loc.getTaskId().getId()));
// Copy the map output to a temp file whose name is unique to this attempt
// Name of the local temporary file to copy into
Path tmpMapOutput = new Path(filename+"-"+id);
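The naming scheme can be tried out with plain strings. This sketch assumes that MapOutputFile.REDUCE_INPUT_FILE_FORMAT_STRING is "%s/map_%d.out" and that TaskTracker.OUTPUT is "output", which matches the file name given in the comment above; both values, the class name MapOutputNameSketch and its methods are assumptions made for illustration.

```java
final class MapOutputNameSketch {
    // Assumed value of MapOutputFile.REDUCE_INPUT_FILE_FORMAT_STRING.
    static final String FORMAT = "%s/map_%d.out";
    // Assumed value of TaskTracker.OUTPUT.
    static final String OUTPUT_DIR = "output";

    // Final name of the map output file for one map task.
    static String finalName(int mapTaskId) {
        return String.format(FORMAT, OUTPUT_DIR, mapTaskId);
    }

    // Temporary name: the final name plus "-" and an id suffix.
    static String tempName(int mapTaskId, int suffixId) {
        return finalName(mapTaskId) + "-" + suffixId;
    }

    public static void main(String[] args) {
        System.out.println(finalName(3));   // output/map_3.out
        System.out.println(tempName(3, 0)); // output/map_3.out-0
    }
}
```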
3. Copy the data
This step is carried out mainly by getMapOutput(), which is described in detail further below
// Copy the map output
MapOutput mapOutput = getMapOutput(loc, tmpMapOutput,
reduceId.getTaskID().getId());
4. Perform the following inside a block synchronized on ReduceTask.this
synchronized (ReduceTask.this) {}
1) Check again whether this output has already been copied; if it has, discard the data just fetched
if (copiedMapOutputs.contains(loc.getTaskId())) {
mapOutput.discard();
return CopyResult.OBSOLETE;
}
2) Check whether the map output is empty (0 bytes); if it is, discard the copied output and still record the map output as copied
// Special case: discard empty map-outputs
if (bytes == 0) {
try {
mapOutput.discard();
} catch (IOException ioe) {
LOG.info("Couldn't discard output of " + loc.getTaskId());
}
// Note that we successfully copied the map-output
noteCopiedMapOutput(loc.getTaskId());
return bytes;
}
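3) Handle the copied data, which is either in memory or in a local file
a. If the data was copied into memory, add the in-memory output handle to the list of in-memory map outputs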
// Process map-output
if (mapOutput.inMemory) {
// Save it in the synchronized list of map-outputs
mapOutputsFilesInMemory.add(mapOutput);
}
b. If the data was stored in a local file, rename the temporary file to the final file name
// Rename the temporary file to the final file;
// ensure it is on the same partition
// Rename the temporary file produced by the copy to its final name
tmpMapOutput = mapOutput.file;
// Rename a temporary file such as output/map_<taskId>.out-0
// to a final file such as output/map_<taskId>.out
filename = new Path(tmpMapOutput.getParent(), filename.getName());
if (!localFileSys.rename(tmpMapOutput, filename)) {
localFileSys.delete(tmpMapOutput, true);
bytes = -1;
throw new IOException("Failed to rename map output " +
tmpMapOutput + " to " + filename);
}
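The rename-or-clean-up pattern in step b can be sketched with java.nio.file instead of Hadoop's FileSystem API. This is a swapped-in technique for illustration, not the code ReduceTask uses; the class name RenameSketch and the method promote are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class RenameSketch {
    // Moves the temp file to finalName inside the same directory.
    // On failure the temp file is removed and the error is rethrown.
    static Path promote(Path tmp, String finalName) throws IOException {
        // Resolving against the temp file's own directory keeps both
        // names on the same partition, as the original comment requires.
        Path target = tmp.resolveSibling(finalName);
        try {
            return Files.move(tmp, target);
        } catch (IOException e) {
            Files.deleteIfExists(tmp);
            throw new IOException("Failed to rename map output " + tmp + " to " + target, e);
        }
    }
}
```

Building the target from the temp file's parent directory corresponds to new Path(tmpMapOutput.getParent(), filename.getName()) in the snippet above.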
4) Add the map task just copied to the set of copied tasks, and update the number of map outputs still to be copied
// Note that we successfully copied the map-output
// Add this task id to copiedMapOutputs
// and set the number of map outputs still to be copied to (total - number already copied)
noteCopiedMapOutput(loc.getTaskId());
The body of this method is:
/**
* Save the map taskid whose output we just copied.
* This function assumes that it has been synchronized on ReduceTask.this.
*
* @param taskId map taskid
*/
private void noteCopiedMapOutput(TaskID taskId) {
copiedMapOutputs.add(taskId);
ramManager.setNumCopiedMapOutputs(numMaps - copiedMapOutputs.size());
}
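The re-check and bookkeeping done under the lock in step 4 can be modeled in isolation. The sketch below is a simplification under stated assumptions: map tasks are identified by plain integers, and CopyTrackerSketch, commit and remaining are hypothetical names. It only shows why the second check has to happen inside the synchronized section.

```java
import java.util.HashSet;
import java.util.Set;

final class CopyTrackerSketch {
    // Mirrors the -2 "obsolete" result mentioned in the text.
    static final long OBSOLETE = -2;

    private final int numMaps;
    private final Set<Integer> copiedMapOutputs = new HashSet<>();
    private int remaining;

    CopyTrackerSketch(int numMaps) {
        this.numMaps = numMaps;
        this.remaining = numMaps;
    }

    // Records one finished copy. Returns the byte count, or OBSOLETE
    // when another copier already recorded the same map task.
    synchronized long commit(int mapTaskId, long bytes) {
        if (copiedMapOutputs.contains(mapTaskId)) {
            return OBSOLETE;
        }
        copiedMapOutputs.add(mapTaskId);
        remaining = numMaps - copiedMapOutputs.size();
        return bytes;
    }

    synchronized int remaining() {
        return remaining;
    }
}
```

Because several copier threads can fetch the same map output concurrently, the check made before the copy is not sufficient on its own; only the check made while holding the lock decides which copy is kept.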
getMapOutput is the main method that performs the data copy. The following walks through its source code. Its signature is:
private MapOutput getMapOutput(MapOutputLocation mapOutputLoc,
Path filename, int reduce)
throws IOException, InterruptedException
Implementation steps:
1. Open a connection to the map output location and obtain its input stream
// Connect
URL url = mapOutputLoc.getOutputLocation();
URLConnection connection = url.openConnection();
InputStream input = setupSecureConnection(mapOutputLoc, connection);
2. Check that the map output at this location is the one we want to fetch
// Validate header from map output
TaskAttemptID mapId = null;
try {
mapId =
TaskAttemptID.forName(connection.getHeaderField(FROM_MAP_TASK));
} catch (IllegalArgumentException ia) {
LOG.warn("Invalid map id ", ia);
return null;
}
TaskAttemptID expectedMapId = mapOutputLoc.getTaskAttemptId();
if (!mapId.equals(expectedMapId)) {
LOG.warn("data from wrong map:" + mapId +
" arrived to reduce task " + reduce +
", where as expected map output should be from " + expectedMapId);
return null;
}
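If the ids match, execution continues. If they do not, the location the data was fetched from is wrong, and the method returns null
3. Check that the map output lengths, both compressed and uncompressed, are not negative
// Uncompressed (raw) data length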
long decompressedLength =
Long.parseLong(connection.getHeaderField(RAW_MAP_OUTPUT_LENGTH));
// Compressed data length
long compressedLength =
Long.parseLong(connection.getHeaderField(MAP_OUTPUT_LENGTH));
if (compressedLength < 0 || decompressedLength < 0) {
LOG.warn(getName() + " invalid lengths in map output header: id: " +
mapId + " compressed len: " + compressedLength +
", decompressed len: " + decompressedLength);
return null;
}
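4. Check that the map output partition belongs to this reduce task
// Check whether this output is meant for this reduce task. My understanding is that
// the map-side partition output records the reduce task id; this needs to be confirmed
// against the map-side output code.
// A guess: when the job initializes its tasks, all map task IDs and reduce task IDs
// have already been created.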
int forReduce =
(int)Integer.parseInt(connection.getHeaderField(FOR_REDUCE_TASK));
// reduce is the id of the current reduce task
if (forReduce != reduce) {
LOG.warn("data for the wrong reduce: " + forReduce +
" with compressed len: " + compressedLength +
", decompressed len: " + decompressedLength +
" arrived to reduce task " + reduce);
return null;
}
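Steps 2 to 4 above all validate response headers before any data is read. A compact sketch of the three checks, with the headers held in a plain Map: the keys used here are simply the constant names from the snippets, not the real header strings, and HeaderCheckSketch and isValid are hypothetical. Unlike the original snippets, this sketch also treats a missing or non-numeric header as invalid.

```java
import java.util.Map;

final class HeaderCheckSketch {
    static boolean isValid(Map<String, String> headers, String expectedMapId, int reduce) {
        // Step 2: the data must come from the expected map attempt.
        if (!expectedMapId.equals(headers.get("FROM_MAP_TASK"))) {
            return false;
        }
        long decompressedLength;
        long compressedLength;
        int forReduce;
        try {
            decompressedLength = Long.parseLong(headers.get("RAW_MAP_OUTPUT_LENGTH"));
            compressedLength = Long.parseLong(headers.get("MAP_OUTPUT_LENGTH"));
            forReduce = Integer.parseInt(headers.get("FOR_REDUCE_TASK"));
        } catch (NumberFormatException e) {
            // Missing or malformed numeric header (assumption of this sketch).
            return false;
        }
        // Step 3: neither length may be negative.
        if (compressedLength < 0 || decompressedLength < 0) {
            return false;
        }
        // Step 4: the partition must be addressed to this reduce task.
        return forReduce == reduce;
    }
}
```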
5. Copy the data
This step can be broken down into the following sub-steps:
1) Check whether the data to be copied can be stored in memory
//We will put a file in memory if it meets certain criteria:
//1. The size of the (decompressed) file should be less than 25% of
// the total inmem fs
//2. There is space available in the inmem fs
// Check if this map-output can be saved in-memory
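// Compare the uncompressed size of the output with the largest size allowed in memory:
// if it is smaller, the output can be held in memory; otherwise it cannot.
// The limit is 0.25 times the in-memory shuffle buffer size, which is derived from
// mapred.job.reduce.total.mem.bytes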
boolean shuffleInMemory = ramManager.canFitInMemory(decompressedLength);
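The size criterion described in the comments can be sketched as follows. This is not the actual ramManager code: RamLimitSketch is a hypothetical class, the buffer size is passed in directly rather than read from configuration, and only criterion 1 (the 25% limit) is modeled, not whether space is currently available.

```java
final class RamLimitSketch {
    // Largest single map output allowed in memory: 25% of the buffer.
    private final long maxSingleShuffleLimit;

    RamLimitSketch(long inMemBufferBytes) {
        // Integer division by 4 avoids floating-point rounding.
        this.maxSingleShuffleLimit = inMemBufferBytes / 4;
    }

    // True when an output of this (decompressed) size may be held in memory.
    boolean canFitInMemory(long requestedSize) {
        return requestedSize < Integer.MAX_VALUE
                && requestedSize < maxSingleShuffleLimit;
    }
}
```

With a 1000-byte buffer the limit is 250 bytes, so an output of 249 bytes qualifies for memory and one of 250 bytes goes to disk.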