1. Client
1) Job.java
Once the user has written a MapReduce program, it submits the job by calling Job.waitForCompletion(true).
    public boolean waitForCompletion(boolean verbose)
            throws IOException, InterruptedException, ClassNotFoundException {
        // Submit the job through submit() if it has not been submitted yet
        if (state == JobState.DEFINE) {
            submit();
        }
        if (verbose) {
            // Poll the job's status in a loop and print it to the client console
            monitorAndPrintJob();
        } else {
            ...
        }
        return isSuccessful();
    }

    public void submit()
            throws IOException, InterruptedException, ClassNotFoundException {
        // Verify that the job is still in the DEFINE state
        ensureState(JobState.DEFINE);
        // Select the new MapReduce API (org.apache.hadoop.mapreduce) unless the old one is configured
        setUseNewAPI();
        // Initialize the Cluster used to query the job's status; it reads the
        // configuration to decide whether the job goes to YARN or to a JobTracker
        connect();
        // Build the JobSubmitter that will submit the job
        final JobSubmitter submitter =
            getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
        status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
            public JobStatus run()
                    throws IOException, InterruptedException, ClassNotFoundException {
                return submitter.submitJobInternal(Job.this, cluster);
            }
        });
        // Update the job state
        state = JobState.RUNNING;
        LOG.info("The url to track the job: " + getTrackingURL());
    }
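The state guard in the excerpt above is easy to miss, so here is a minimal, self-contained sketch of it. It assumes nothing from Hadoop: the class JobStateSketch, its counter, and the reduced two-value enum are hypothetical stand-ins, and the real submission call is replaced by a counter increment.

```java
public class JobStateSketch {
    // Reduced version of the job states used in the excerpt (hypothetical stand-in).
    enum JobState { DEFINE, RUNNING }

    private JobState state = JobState.DEFINE;
    private int submitCount = 0;

    // Fail fast if the job is not in the expected state.
    private void ensureState(JobState expected) {
        if (state != expected) {
            throw new IllegalStateException(
                "Job in state " + state + " instead of " + expected);
        }
    }

    public void submit() {
        ensureState(JobState.DEFINE);
        submitCount++;              // stands in for submitter.submitJobInternal(...)
        state = JobState.RUNNING;
    }

    public boolean waitForCompletion() {
        // Only a job that is still being defined gets submitted.
        if (state == JobState.DEFINE) {
            submit();
        }
        return true;                // stands in for isSuccessful()
    }

    public JobState getState() { return state; }

    public int getSubmitCount() { return submitCount; }

    public static void main(String[] args) {
        JobStateSketch job = new JobStateSketch();
        job.waitForCompletion();
        job.waitForCompletion();    // second call does not resubmit
        System.out.println(job.getState() + " after "
            + job.getSubmitCount() + " submission(s)");
    }
}
```

In this sketch, calling waitForCompletion() twice submits only once, while a direct second call to submit() throws because the state is no longer DEFINE.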
2) JobSubmitter.java
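A job submitted through Job is handed over to JobSubmitter, which performs a number of checks and preparation steps before the actual submission.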
    JobStatus submitJobInternal(Job job, Cluster cluster)
            throws ClassNotFoundException, InterruptedException, IOException {
        // Validate the job's output specification
        checkSpecs(job);
        Path jobStagingArea =
            JobSubmissionFiles.getStagingDir(cluster, job.getConfiguration());
        // Set up the parameters used for submission, such as the submitting host
        Configuration conf = job.getConfiguration();
        InetAddress ip = InetAddress.getLocalHost();
        if (ip != null) {
            submitHostAddress = ip.getHostAddress();
            submitHostName = ip.getHostName();
            conf.set(MRJobConfig.JOB_SUBMITHOST, submitHostName);
            conf.set(MRJobConfig.JOB_SUBMITHOSTADDR, submitHostAddress);
        }
        // Obtain the ID for this job from the ClientProtocol implementation
        JobID jobId = submitClient.getNewJobID();
        job.setJobID(jobId);
        Path submitJobDir = new Path(jobStagingArea, jobId.toString());
        JobStatus status = null;
        // Write the job's jar, its input split information, and its configuration to HDFS
        try {
            conf.set("hadoop.http.filter.initializers",
                "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
            conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
            LOG.debug("Configuring job " + jobId + " with " + submitJobDir
                + " as the submit dir");
            TokenCache.obtainTokensForNamenodes(job.getCredentials(),
                new Path[] { submitJobDir }, conf);
            populateTokenCache(conf, job.getCredentials());
            copyAndConfigureFiles(job, submitJobDir);
            Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);
            // Write the split information
            LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
            int maps = writeSplits(job, submitJobDir);
            conf.setInt(MRJobConfig.NUM_MAPS, maps);
            LOG.info("number of splits:" + maps);
            // Record the queue the job is submitted to
            String queue = conf.get(MRJobConfig.QUEUE_NAME,
                JobConf.DEFAULT_QUEUE_NAME);
            AccessControlList acl = submitClient.getQueueAdmins(queue);
            conf.set(toFullPropertyName(queue,
                QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());
            // Remove the token referral used for user access from the configuration
            TokenCache.cleanUpTokenReferral(conf);
            // Write the configuration
            writeConf(conf, submitJobFile);
            printTokens(jobId, job.getCredentials());
            // Submit the job through ClientProtocol.submitJob
            status = submitClient.submitJob(
                jobId, submitJobDir.toString(), job.getCredentials());
        ...
    }
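The excerpt above derives every per-job path from the staging area plus the job ID. The following sketch shows that layout with plain strings. It is an illustration only: the class StagingLayoutSketch, the example staging path, and the example job ID are hypothetical, and the file names job.xml, job.jar and job.split are an assumption about what JobSubmissionFiles uses, since the excerpt does not show them.

```java
public class StagingLayoutSketch {
    // Equivalent of new Path(jobStagingArea, jobId.toString()) in the excerpt.
    static String submitJobDir(String stagingArea, String jobId) {
        return stagingArea + "/" + jobId;
    }

    // Assumed file name for the job configuration written by writeConf(...).
    static String jobConfPath(String submitJobDir) {
        return submitJobDir + "/job.xml";
    }

    // Assumed file name for the job jar copied by copyAndConfigureFiles(...).
    static String jobJarPath(String submitJobDir) {
        return submitJobDir + "/job.jar";
    }

    // Assumed file name for the split data written by writeSplits(...).
    static String jobSplitPath(String submitJobDir) {
        return submitJobDir + "/job.split";
    }

    public static void main(String[] args) {
        // Both values below are made up for the example.
        String stagingArea = "/tmp/hadoop-yarn/staging/alice/.staging";
        String jobId = "job_1700000000000_0001";

        String dir = submitJobDir(stagingArea, jobId);
        System.out.println(jobConfPath(dir));
        System.out.println(jobJarPath(dir));
        System.out.println(jobSplitPath(dir));
    }
}
```

Because everything lives under one directory named after the job ID, the submit directory string passed to submitJob is enough for the server side to locate the jar, the splits, and the configuration.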
3) YARNRunner.java
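Under YARN, the submitJob work is taken over by YARNRunner, which implements ClientProtocol. It checks the run conditions, initializes the information needed to run MRAppMaster, and asks the ResourceManager for a Container in which to run MRAppMaster.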
    public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
            throws IOException, InterruptedException {
        // If the HistoryServer is enabled, set up the related information
        MRClientProtocol hsProxy = clientCache.getInitializedHSProxy();
        if (hsProxy != null) {
            if (conf.getBoolean(JobClient.HS_DELEGATION_TOKEN_REQUIRED,
                    DEFAULT_HS_DELEGATION_TOKEN_REQUIRED)) {
                Token hsDT = getDelegationTokenFromHS(hsProxy,
                    new Text(conf.get(JobClient.HS_DELEGATION_TOKEN_RENEWER)));
                ts.addToken(hsDT.getService(), hsDT);
            }
        }
        // Write the job's credentials (tokens) to the application tokens file
        Path applicationTokensFile =
            new Path(jobSubmitDir, MRJobConfig.APPLICATION_TOKENS_FILE);
        try {
            ts.writeTokenStorageFile(applicationTokensFile, conf);
        } catch (IOException e) {
            throw new YarnException(e);
        }
        /*
         * Initialize the application context:
         *   ApplicationId
         *   ApplicationName
         *   Queue: the queue the application is submitted to
         *   Priority: the application's priority
         *   User: the user running MRAppMaster
         *   AMContainerSpec: information about the Container that runs the ApplicationMaster
         *     ContainerId
         *     User: the user running MRAppMaster
         *     Resource: the resources the ResourceManager allocates to this MRAppMaster
         *     ContainerToken: the security tokens used in secure mode
         *     LocalResources: the MRAppMaster jar, the job configuration file,
         *                     the job's own jar, the information for each split, etc.
         *     ServiceData:
         *     Environment: the classpath and other environment variables for MRAppMaster
         *     Commands: the command that starts MRAppMaster, e.g.
         *               $JAVA_HOME/bin/java MRAppMaster.class.getName() ...
         *     ApplicationACLs: the access control list for MRAppMaster
         */
        ApplicationSubmissionContext appContext =
            createApplicationSubmissionContext(conf, jobSubmitDir, ts);
        // Submit to the ResourceManager through ResourceMgrDelegate
        ApplicationId applicationId = resMgrDelegate.submitApplication(appContext);
        // Fetch MRAppMaster's run information through ResourceMgrDelegate
        ApplicationReport appMaster =
            resMgrDelegate.getApplicationReport(applicationId);
        ...
        return clientCache.getClient(jobId).getJobStatus(jobId);
    }
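The Commands entry in the comment above is only hinted at ("$JAVA_HOME/bin/java MRAppMaster.class.getName() ..."). As a rough sketch of how such a launch command could be assembled from parts, consider the code below. It is not the code in YARNRunner: the class AmCommandSketch, the -Xmx1024m option, and the <LOG_DIR> redirection placeholders are assumptions made for the example. Only the fully qualified MRAppMaster class name is taken from Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

public class AmCommandSketch {
    // Fully qualified name that MRAppMaster.class.getName() resolves to.
    static final String AM_CLASS = "org.apache.hadoop.mapreduce.v2.app.MRAppMaster";

    // Assemble the command line as a list of arguments, then join with spaces.
    static String buildAmCommand(String jvmOpts) {
        List<String> vargs = new ArrayList<>();
        vargs.add("$JAVA_HOME/bin/java");
        if (jvmOpts != null && !jvmOpts.isEmpty()) {
            vargs.add(jvmOpts);              // assumed JVM options for the AM
        }
        vargs.add(AM_CLASS);
        vargs.add("1><LOG_DIR>/stdout");     // assumed placeholder for the log directory
        vargs.add("2><LOG_DIR>/stderr");
        return String.join(" ", vargs);
    }

    public static void main(String[] args) {
        System.out.println(buildAmCommand("-Xmx1024m"));
    }
}
```

The point of the sketch is that the client never starts the process itself. It only describes the command as a string, and the node that receives the Container expands and runs it.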
4) ResourceMgrDelegate.java
ResourceMgrDelegate handles communication with the ResourceManager and submits the request to launch the ApplicationMaster (MRAppMaster).
    // Connect to the ResourceManager
    public ResourceMgrDelegate(YarnConfiguration conf) {
        this.conf = conf;
        YarnRPC rpc = YarnRPC.create(this.conf);
        InetSocketAddress rmAddress = NetUtils.createSocketAddr(
            this.conf.get(YarnConfiguration.RM_ADDRESS,
                YarnConfiguration.DEFAULT_RM_ADDRESS),
            YarnConfiguration.DEFAULT_RM_PORT,
            YarnConfiguration.RM_ADDRESS);
        this.rmAddress = rmAddress.toString();
        LOG.debug("Connecting to ResourceManager at " + rmAddress);
        applicationsManager = (ClientRMProtocol) rpc.getProxy(
            ClientRMProtocol.class, rmAddress, this.conf);
        LOG.debug("Connected to ResourceManager at " + rmAddress);
    }

    // Obtain a new application
    public JobID getNewJobID() throws IOException, InterruptedException {
        GetNewApplicationRequest request =
            recordFactory.newRecordInstance(GetNewApplicationRequest.class);
        applicationId =
            applicationsManager.getNewApplication(request).getApplicationId();
        // TypeConverter converts between YARN records and the job records of older
        // Hadoop versions, e.g. JobId, ApplicationId, TaskId
        return TypeConverter.fromYarn(applicationId);
    }

    // Submit the application to the ResourceManager, which will allocate a
    // Container to run MRAppMaster
    public ApplicationId submitApplication(
            ApplicationSubmissionContext appContext) throws IOException {
        appContext.setApplicationId(applicationId);
        SubmitApplicationRequest request =
            recordFactory.newRecordInstance(SubmitApplicationRequest.class);
        request.setApplicationSubmissionContext(appContext);
        applicationsManager.submitApplication(request);
        LOG.info("Submitted application " + applicationId
            + " to ResourceManager" + " at " + rmAddress);
        return applicationId;
    }

    // Ask the ResourceManager for the application's run information
    public ApplicationReport getApplicationReport(ApplicationId appId)
            throws YarnRemoteException {
        GetApplicationReportRequest request =
            recordFactory.newRecordInstance(GetApplicationReportRequest.class);
        request.setApplicationId(appId);
        GetApplicationReportResponse response =
            applicationsManager.getApplicationReport(request);
        ApplicationReport applicationReport = response.getApplicationReport();
        return applicationReport;
    }
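getNewJobID() above returns a MapReduce JobID that is derived from the YARN ApplicationId. The sketch below illustrates that relationship on the string forms of the two IDs, assuming the usual application_<clusterTimestamp>_<sequence> and job_<clusterTimestamp>_<sequence> formats with a four-digit sequence number. The class IdConversionSketch and its methods are hypothetical and work on strings only, unlike the real TypeConverter, which works on record objects.

```java
import java.util.Locale;

public class IdConversionSketch {
    private static final String APP_PREFIX = "application_";
    private static final String JOB_PREFIX = "job_";

    // String form of a YARN application ID (assumed format).
    static String applicationIdString(long clusterTimestamp, int id) {
        return String.format(Locale.ROOT, "%s%d_%04d", APP_PREFIX, clusterTimestamp, id);
    }

    // Derive the job ID string: same timestamp and sequence, different prefix.
    static String fromYarn(String applicationId) {
        if (!applicationId.startsWith(APP_PREFIX)) {
            throw new IllegalArgumentException("Not an application ID: " + applicationId);
        }
        return JOB_PREFIX + applicationId.substring(APP_PREFIX.length());
    }

    public static void main(String[] args) {
        // The timestamp and sequence number are made up for the example.
        String appId = applicationIdString(1700000000000L, 1);
        System.out.println(appId + " -> " + fromYarn(appId));
    }
}
```

Since the two IDs share the timestamp and sequence number, a job seen in client logs can be matched to its application on the ResourceManager by swapping the prefix.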
This completes the work on the client side (the machine that submits the job). From here the ResourceManager takes over and allocates a Container to run MRAppMaster.
2. Server side (ResourceManager)
To be continued...