Hadoop MapReduce ReduceTask Execution (Part 3)

During the copy phase on the reduce side, fetched map outputs are placed either in memory or directly on disk. If merging only began after every file had been copied, the job would clearly run slower, so merging starts once copying has progressed far enough. Two threads are responsible for this work: InMemFSMergeThread and LocalFSMerger, which merge the in-memory and on-disk segments respectively.
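For context, the snippet below is a simplified sketch (not verbatim Hadoop source; the constructor arguments and field names are assumptions) of how ReduceCopier brings these two merge threads up alongside the copier threads at the start of the shuffle:

  // Simplified sketch: ReduceCopier starts one merger for on-disk segments
  // and one for in-memory segments before copying begins (constructor
  // arguments here are assumptions for illustration).
  LocalFSMerger localFSMergerThread =
      new LocalFSMerger((LocalFileSystem) localFileSys); // merges spills already on disk
  InMemFSMergeThread inMemFSMergeThread =
      new InMemFSMergeThread();                          // merges map outputs held in RAM
  localFSMergerThread.start();
  inMemFSMergeThread.start();
  // Copier threads keep fetching map outputs while both mergers run in the background.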
First, let's look at the run() method of the in-memory merge thread InMemFSMergeThread:
  public void run() {
    LOG.info(reduceTask.getTaskID() + " Thread started: " + getName());
    try {
      boolean exit = false;
      do {
        exit = ramManager.waitForDataToMerge(); // check whether a merge should start
        if (!exit) {
          doInMemMerge(); // perform the in-memory merge
        }
        }
      } while (!exit);
    } catch (Exception e) {
      LOG.warn(reduceTask.getTaskID() +
               " Merge of the inmemory files threw an exception: "
               + StringUtils.stringifyException(e));
      ReduceCopier.this.mergeThrowable = e;
    } catch (Throwable t) {
      String msg = getTaskID() + " : Failed to merge in memory" 
                   + StringUtils.stringifyException(t);
      reportFatalError(getTaskID(), t, msg);
    }
  }
Below are the conditions that trigger an in-memory merge. The inline comments already explain them well; the points to note are the memory usage ratio, the number of map outputs that have finished copying, and the number of stalled copier threads. A copier thread stalls when reserving memory for map-side data would exceed the threshold; see the ShuffleRamManager.reserve() function, sketched after the code below.
  public boolean waitForDataToMerge() throws InterruptedException {
    boolean done = false;
    synchronized (dataAvailable) {
             // Start in-memory merge if manager has been closed or...
      while (!closed
             &&
             // In-memory threshold exceeded and at least two segments
             // have been fetched
             (getPercentUsed() < maxInMemCopyPer || numClosed < 2)
             &&
             // More than "mapred.inmem.merge.threshold" map outputs
             // have been fetched into memory
             (maxInMemOutputs <= 0 || numClosed < maxInMemOutputs)
             && 
             // More than MAX... threads are blocked on the RamManager
             // or the blocked threads are the last map outputs to be
             // fetched. If numRequiredMapOutputs is zero, either
             // setNumCopiedMapOutputs has not been called (no map ouputs
             // have been fetched, so there is nothing to merge) or the
             // last map outputs being transferred without
             // contention, so a merge would be premature.
             (numPendingRequests < 
                  numCopiers*MAX_STALLED_SHUFFLE_THREADS_FRACTION && 
              (0 == numRequiredMapOutputs ||
               numPendingRequests < numRequiredMapOutputs))) {
        dataAvailable.wait();
      }
      done = closed;
    }
    return done;
  }  
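Since the stall condition above hinges on ShuffleRamManager.reserve(), here is a simplified sketch of its blocking behavior (not the verbatim method; the field names mirror those used in waitForDataToMerge() and the details are assumptions): a copier that cannot get enough memory waits on dataAvailable and is counted as a pending request while it waits, which is exactly what numPendingRequests measures above.

  // Simplified sketch of ShuffleRamManager.reserve() -- not verbatim source.
  // A copier thread calls this before buffering a map output in memory.
  public void reserve(int requestedSize, InputStream in) throws InterruptedException {
    synchronized (dataAvailable) {
      // While granting the request would exceed the memory budget, the copier
      // blocks; while blocked it is counted as a pending (stalled) request,
      // which waitForDataToMerge() compares against the stall threshold.
      while (size + requestedSize > maxSize) {
        ++numPendingRequests;
        dataAvailable.notify();   // nudge the merge thread so it can free memory
        dataAvailable.wait();
        --numPendingRequests;
      }
      size += requestedSize;      // memory reserved; the copy may proceed
    }
  }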
This in-memory merge can be compared with the map-side merge; the logic is largely the same: determine the output file name, build a Writer, put the segments into the merge queue, and if a combiner is configured run it first, otherwise write the merged records directly to the file.
  private void doInMemMerge() throws IOException{
    if (mapOutputsFilesInMemory.size() == 0) {
      return;
    }

    //name this output file same as the name of the first file that is 
    //there in the current list of inmem files (this is guaranteed to
    //be absent on the disk currently. So we don't overwrite a prev. 
    //created spill). Also we need to create the output file now since
    //it is not guaranteed that this file will be present after merge
    //is called (we delete empty files as soon as we see them
    //in the merge method)

    //figure out the mapId 
    TaskID mapId = mapOutputsFilesInMemory.get(0).mapId;

    List<Segment<K, V>> inMemorySegments = new ArrayList<Segment<K,V>>();
    long mergeOutputSize = createInMemorySegments(inMemorySegments, 0);
    int noInMemorySegments = inMemorySegments.size();

    Path outputPath =
        mapOutputFile.getInputFileForWrite(mapId, mergeOutputSize);

    Writer writer = 
      new Writer(conf, rfs, outputPath,
                 conf.getMapOutputKeyClass(),
                 conf.getMapOutputValueClass(),
                 codec, null);

    RawKeyValueIterator rIter = null;
    try {
      LOG.info("Initiating in-memory merge with " + noInMemorySegments + 
               " segments...");

      rIter = Merger.merge(conf, rfs,
                           (Class<K>)conf.getMapOutputKeyClass(),
                           (Class<V>)conf.getMapOutputValueClass(),
                           inMemorySegments, inMemorySegments.size(),
                           new Path(reduceTask.getTaskID().toString()),
                           conf.getOutputKeyComparator(), reporter,
                           spilledRecordsCounter, null);

      if (combinerRunner == null) {
        Merger.writeFile(rIter, writer, reporter, conf);
      } else {
        combineCollector.setWriter(writer);
        combinerRunner.combine(rIter, combineCollector);
      }
      writer.close();

      LOG.info(reduceTask.getTaskID() + 
          " Merge of the " + noInMemorySegments +
          " files in-memory complete." +
          " Local file is " + outputPath + " of size " + 
          localFileSys.getFileStatus(outputPath).getLen());
    } catch (Exception e) { 
      //make sure that we delete the ondisk file that we created 
      //earlier when we invoked cloneFileAttributes
      localFileSys.delete(outputPath, true);
      throw (IOException)new IOException
              ("Intermediate merge failed").initCause(e);
    }

    // Note the output of the merge
    FileStatus status = localFileSys.getFileStatus(outputPath);
    synchronized (mapOutputFilesOnDisk) {
      addToMapOutputFilesOnDisk(status);
    }
  }
}
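doInMemMerge() above leans on createInMemorySegments() to turn the buffered map outputs into mergeable Segment objects. A simplified sketch is below (not verbatim source; the MapOutput field names and the InMemoryReader constructor are assumptions): it drains mapOutputsFilesInMemory down to at most leaveBytes, wraps each buffered output in an in-memory reader, and returns the total byte size so the caller can size the on-disk output file.

  // Simplified sketch of createInMemorySegments() -- not verbatim source.
  private long createInMemorySegments(List<Segment<K, V>> inMemorySegments,
                                      long leaveBytes) throws IOException {
    long totalSize = 0L;
    synchronized (mapOutputsFilesInMemory) {
      // Total bytes currently buffered in memory.
      long fullSize = 0L;
      for (MapOutput mo : mapOutputsFilesInMemory) {
        fullSize += mo.data.length;
      }
      // Drain buffered outputs until at most leaveBytes remain, wrapping each
      // byte[] in a Segment backed by an in-memory reader (reader ctor assumed).
      while (fullSize > leaveBytes) {
        MapOutput mo = mapOutputsFilesInMemory.remove(0);
        totalSize += mo.data.length;
        fullSize -= mo.data.length;
        Reader<K, V> reader =
            new InMemoryReader<K, V>(ramManager, mo.mapAttemptId,
                                     mo.data, 0, mo.data.length);
        inMemorySegments.add(new Segment<K, V>(reader, true));
      }
    }
    return totalSize;
  }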
Merging of on-disk files works in much the same way; for the details, see org.apache.hadoop.mapred.ReduceTask.ReduceCopier.LocalFSMerger.
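For comparison, here is a simplified sketch of what LocalFSMerger.run() does (not the verbatim method; helper and field names mirror the in-memory merger above and are assumptions): it sleeps until enough spilled files accumulate, merges the io.sort.factor smallest ones into a single new file, and registers the result back into mapOutputFilesOnDisk so that later rounds or the final merge can pick it up.

  // Simplified sketch of LocalFSMerger.run() -- not verbatim source; helper
  // and field names follow the in-memory merger above and are assumptions.
  public void run() {
    try {
      while (!exitLocalFSMerge) {
        synchronized (mapOutputFilesOnDisk) {
          // Sleep until enough spilled files accumulate to be worth merging.
          while (!exitLocalFSMerge &&
                 mapOutputFilesOnDisk.size() < (2 * ioSortFactor - 1)) {
            mapOutputFilesOnDisk.wait();
          }
        }
        if (exitLocalFSMerge) {
          break;
        }

        // Pull the ioSortFactor smallest files off the size-sorted set.
        List<Path> mapFiles = new ArrayList<Path>();
        synchronized (mapOutputFilesOnDisk) {
          for (int i = 0; i < ioSortFactor; ++i) {
            FileStatus filestatus = mapOutputFilesOnDisk.first();
            mapOutputFilesOnDisk.remove(filestatus);
            mapFiles.add(filestatus.getPath());
          }
        }

        // Merge them into one new file with Merger.merge/writeFile, mirroring
        // doInMemMerge(), then register the result for later merge rounds.
        Path outputPath = mapFiles.get(0).suffix(".merged");
        Writer writer = new Writer(conf, rfs, outputPath,
                                   conf.getMapOutputKeyClass(),
                                   conf.getMapOutputValueClass(),
                                   codec, null);
        RawKeyValueIterator iter =
            Merger.merge(conf, rfs,
                         (Class<K>) conf.getMapOutputKeyClass(),
                         (Class<V>) conf.getMapOutputValueClass(),
                         codec, mapFiles.toArray(new Path[mapFiles.size()]),
                         true, ioSortFactor,
                         new Path(reduceTask.getTaskID().toString()),
                         conf.getOutputKeyComparator(), reporter,
                         spilledRecordsCounter, null);
        Merger.writeFile(iter, writer, reporter, conf);
        writer.close();

        synchronized (mapOutputFilesOnDisk) {
          addToMapOutputFilesOnDisk(localFileSys.getFileStatus(outputPath));
        }
      }
    } catch (InterruptedException ie) {
      return; // shutdown requested
    } catch (Exception e) {
      ReduceCopier.this.mergeThrowable = e;
    }
  }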
