Plan: first read books, watch videos, write demos, and summarize what I already know; then read the source code and practice to deepen the understanding. Time budget: one month. // TODO
Core idea: "The fundamental idea of YARN is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons."
Hadoop 1.0 background (see the book 《Hadoop技术内幕:深入理解MapReduce架构设计与实现原理》, roughly "Hadoop Internals: In-depth Understanding of MapReduce Architecture Design and Implementation"):
Move computation to the data, rather than moving data to the computation.
Jobs ran via a JobTracker and TaskTrackers (the JobTracker drove the execution of each individual job).
Every job was submitted to the one JobTracker (a single point).
Drawbacks: that single JobTracker is both a single point of failure and a scalability bottleneck, which is what YARN set out to fix:
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
https://www.edureka.co/blog/hadoop-yarn-tutorial/
YARN enabled the users to perform operations as per requirement by using a variety of tools like Spark for real-time processing, Hive for SQL, HBase for NoSQL and others.
(That is, YARN can schedule and run Spark, Hive, HBase and other data-processing workloads, not just MapReduce jobs.)
Resource Manager: Runs on a master daemon and manages the resource allocation in the cluster.
(Handles client requests; responsible for managing and scheduling the cluster's resources.)
Node Manager: They run on the slave daemons and are responsible for the execution of a task on every single Data Node.
(Tracks the resource usage of applications on its own node and reports it to the RM; receives and handles commands from the RM and AMs.)
Application Master: Manages the user job lifecycle and resource needs of individual applications. It works along with the Node Manager and monitors the execution of tasks.
(Each application gets its own AM. The AM manages the application: it requests resources from the RM and hands them to its tasks; it talks to NMs to start and stop tasks.)
Container: Package of resources including RAM, CPU, Network, HDD etc on a single node.
(A container bundling CPU, memory and other resources; effectively an abstraction of a task's execution environment.)
The client changes little and does the same work as before; but where it used to contact the JobTracker, it now contacts the ResourceManager (resource management split out on its own).
The RM knows the resource usage of every NodeManager. When a computation needs to run, the RM picks a NodeManager and has it launch an ApplicationMaster (i.e., it starts a "master" for the client's computation program).
The AM (which takes over the per-job duties of the old JobTracker, but without resource management) launches the tasks: it still works from the split list, configuration, etc. that the client uploaded to HDFS, and it must request resources from the RM to learn which nodes its tasks may run on.
The AM wants to run tasks, but the RM makes the allocation decision; the RM coordinates with NMs to provide compute resources as Containers (the abstraction of compute resources). The allocated containers are reported back to the AM, so the AM knows how many containers it has at its disposal.
The AM then assigns MapTasks and ReduceTasks to the containers, and the application is running.
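To make the roles concrete, here is a minimal sketch against the YARN client API (Hadoop 2.7.x; the AM command /bin/date is a stand-in for illustration only, a real application would ship an actual ApplicationMaster):

import java.util.Collections;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class MiniYarnClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = new YarnConfiguration();

    // Client side: talk to the ResourceManager
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();

    // 1. Ask the RM for a new application id
    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
    appContext.setApplicationName("yarn-handshake-demo");

    // 2. Describe the container that will run the ApplicationMaster
    ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
    amContainer.setCommands(Collections.singletonList("/bin/date")); // stand-in AM command
    appContext.setAMContainerSpec(amContainer);

    // 3. Resources the AM container needs (what the RM will allocate on some NM)
    Resource capability = Records.newRecord(Resource.class);
    capability.setMemory(256);
    capability.setVirtualCores(1);
    appContext.setResource(capability);

    // 4. Submit: the RM picks a NodeManager and launches the AM there in a container
    ApplicationId appId = yarnClient.submitApplication(appContext);
    System.out.println("Submitted " + appId);
  }
}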
How does MapReduce run on YARN?
An analogy (mapping the YARN roles onto a company):
Think of an MRAppMaster client request as a request from the outside world, say a customer who needs a shopping WeChat mini-program built.
The Boss (the RM role) takes such requests and decides how to fulfill them; the Boss is the one who owns all the resources.
The Boss sees that a few departments working together can deliver it and goes to the department heads; each department head (the NM role) manages his own department's resource usage and reports it to the Boss.
Under a department head there are project managers (the AM role) in charge of individual tasks; to get a task done they request whatever resources they need from the Boss (the RM), and of course also notify the department head (the NM).
The resources (think of them as people, equipment and so on) required to complete a task play the Container role.
When using YARN, we configure the following in mapred-site.xml (the version used in this walkthrough is Hadoop 2.7.2; there are other settings as well):
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
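Assuming a typical 2.7.x setup (the property names are real; the hostname is a placeholder), yarn-site.xml also needs at least the shuffle auxiliary service and the RM address for MapReduce jobs to actually run:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>rm-host</value> <!-- placeholder hostname -->
</property>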
The whole flow unfolds from one simple line, job.waitForCompletion(true);. The call chain is:
job.waitForCompletion(true)
-> (Job)this.submit();
-> submitter.submitJobInternal(Job.this, Job.this.cluster)
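For context, here is a minimal driver ending in that call (the classic WordCount; class names and paths are illustrative):

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration(); // picks up mapred-site.xml, so framework = yarn
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCountDriver.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Everything discussed below starts here:
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Inside, submit() (decompiled from Hadoop 2.7.2) looks like this: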
public void submit() throws IOException, InterruptedException, ClassNotFoundException {
  this.ensureState(Job.JobState.DEFINE);
  this.setUseNewAPI();
  // connect() builds the Cluster, which selects the ClientProtocol (YARNRunner here)
  this.connect();
  final JobSubmitter submitter =
      this.getJobSubmitter(this.cluster.getFileSystem(), this.cluster.getClient());
  this.status = (JobStatus)this.ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
    public JobStatus run() throws IOException, InterruptedException, ClassNotFoundException {
      return submitter.submitJobInternal(Job.this, Job.this.cluster);
    }
  });
  this.state = Job.JobState.RUNNING;
  LOG.info("The url to track the job: " + this.getTrackingURL());
}
When the Cluster is constructed, it runs this.initialize(jobTrackAddr, conf); during that initialization the ClientProtocol is obtained via YarnClientProtocolProvider's create method:
@Override
public ClientProtocol create(Configuration conf) throws IOException {
  if (MRConfig.YARN_FRAMEWORK_NAME.equals(conf.get(MRConfig.FRAMEWORK_NAME))) {
    return new YARNRunner(conf);
  }
  return null;
}
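How does this provider get found? Cluster keeps a java.util.ServiceLoader over ClientProtocolProvider implementations and takes the first one whose create returns non-null. A simplified sketch of that lookup (not the verbatim Hadoop source):

import java.io.IOException;
import java.util.ServiceLoader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.protocol.ClientProtocol;
import org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider;

// Simplified: Cluster.initialize() iterates the service-loaded providers
// and keeps the first ClientProtocol a provider agrees to create.
static ClientProtocol pickClient(Configuration conf) throws IOException {
  for (ClientProtocolProvider provider
      : ServiceLoader.load(ClientProtocolProvider.class)) {
    ClientProtocol client = provider.create(conf);
    if (client != null) {
      return client; // YARNRunner when mapreduce.framework.name == "yarn"
    }
  }
  throw new IOException("Cannot initialize Cluster: no ClientProtocolProvider matched");
}

With YARNRunner in hand, the client proceeds to submitJobInternal, whose javadoc and body follow: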
/**
 * Internal method for submitting jobs to the system.
 *
 * <p>The job submission process involves:
 * <ol>
 *   <li>Checking the input and output specifications of the job.</li>
 *   <li>Computing the {@link InputSplit}s for the job.</li>
 *   <li>Setting up the requisite accounting information for the
 *       {@link DistributedCache} of the job, if necessary.</li>
 *   <li>Copying the job's jar and configuration to the map-reduce system
 *       directory on the distributed file-system (HDFS).</li>
 *   <li>Submitting the job to the JobTracker and optionally
 *       monitoring its status.</li>
 * </ol>
 *
 * @param job the configuration to submit
 * @param cluster the handle to the Cluster
 * @throws ClassNotFoundException
 * @throws InterruptedException
 * @throws IOException
 */
JobStatus submitJobInternal(Job job, Cluster cluster)
    throws ClassNotFoundException, InterruptedException, IOException {
  //validate the jobs output specs
  checkSpecs(job);

  Configuration conf = job.getConfiguration();
  addMRFrameworkToDistributedCache(conf);

  Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
  //configure the command line options correctly on the submitting dfs
  InetAddress ip = InetAddress.getLocalHost();
  if (ip != null) {
    submitHostAddress = ip.getHostAddress();
    submitHostName = ip.getHostName();
    conf.set(MRJobConfig.JOB_SUBMITHOST, submitHostName);
    conf.set(MRJobConfig.JOB_SUBMITHOSTADDR, submitHostAddress);
  }
  // Generate the JobID, e.g. job_1517538889175_2550 (the matching YARN
  // application id looks like application_1517538889175_2550)
  JobID jobId = submitClient.getNewJobID();
  job.setJobID(jobId);
  Path submitJobDir = new Path(jobStagingArea, jobId.toString());
  JobStatus status = null;
  try {
    conf.set(MRJobConfig.USER_NAME,
        UserGroupInformation.getCurrentUser().getShortUserName());
    conf.set("hadoop.http.filter.initializers",
        "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
    conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
    LOG.debug("Configuring job " + jobId + " with " + submitJobDir
        + " as the submit dir");
    // get delegation token for the dir
    TokenCache.obtainTokensForNamenodes(job.getCredentials(),
        new Path[] { submitJobDir }, conf);

    populateTokenCache(conf, job.getCredentials());

    // generate a secret to authenticate shuffle transfers
    if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
      KeyGenerator keyGen;
      try {
        keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
        keyGen.init(SHUFFLE_KEY_LENGTH);
      } catch (NoSuchAlgorithmException e) {
        throw new IOException("Error generating shuffle secret key", e);
      }
      SecretKey shuffleKey = keyGen.generateKey();
      TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
          job.getCredentials());
    }
    if (CryptoUtils.isEncryptedSpillEnabled(conf)) {
      conf.setInt(MRJobConfig.MR_AM_MAX_ATTEMPTS, 1);
      LOG.warn("Max job attempts set to 1 since encrypted intermediate " +
          "data spill is enabled");
    }

    // Upload the job's files, configuration and jars to the submit dir
    copyAndConfigureFiles(job, submitJobDir);

    // Path of the job.xml configuration file
    Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);

    // Write the input splits and their metadata into the job info;
    // the number of map tasks equals the number of splits
    // Create the splits for the job
    LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
    int maps = writeSplits(job, submitJobDir);
    conf.setInt(MRJobConfig.NUM_MAPS, maps);
    LOG.info("number of splits:" + maps);

    int maxMaps = conf.getInt(MRJobConfig.JOB_MAX_MAP,
        MRJobConfig.DEFAULT_JOB_MAX_MAP);
    if (maxMaps >= 0 && maxMaps < maps) {
      throw new IllegalArgumentException("The number of map tasks " + maps +
          " exceeded limit " + maxMaps);
    }

    // Set the resource queue the job is submitted to:
    // write "queue admins of the queue to which job is being submitted"
    // to job file.
    String queue = conf.get(MRJobConfig.QUEUE_NAME,
        JobConf.DEFAULT_QUEUE_NAME);
    AccessControlList acl = submitClient.getQueueAdmins(queue);
    conf.set(toFullPropertyName(queue,
        QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());

    // removing jobtoken referrals before copying the jobconf to HDFS
    // as the tasks don't need this setting, actually they may break
    // because of it if present as the referral will point to a
    // different job.
    TokenCache.cleanUpTokenReferral(conf);

    if (conf.getBoolean(
        MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
        MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
      // Add HDFS tracking ids
      ArrayList<String> trackingIds = new ArrayList<String>();
      for (Token<? extends TokenIdentifier> t :
          job.getCredentials().getAllTokens()) {
        trackingIds.add(t.decodeIdentifier().getTrackingId());
      }
      conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
          trackingIds.toArray(new String[trackingIds.size()]));
    }

    // Set reservation info if it exists
    ReservationId reservationId = job.getReservationId();
    if (reservationId != null) {
      conf.set(MRJobConfig.RESERVATION_ID, reservationId.toString());
    }

    // Write job file to submit dir
    writeConf(conf, submitJobFile);

    //
    // Now, actually submit the job via submitClient.submitJob
    // (using the submit name)
    //
    printTokens(jobId, job.getCredentials());
    status = submitClient.submitJob(
        jobId, submitJobDir.toString(), job.getCredentials());
    if (status != null) {
      return status;
    } else {
      throw new IOException("Could not launch job");
    }
  } finally {
    if (status == null) {
      LOG.info("Cleaning up the staging area " + submitJobDir);
      if (jtFs != null && submitJobDir != null)
        jtFs.delete(submitJobDir, true);
    }
  }
}
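After this method has run, the staging directory on HDFS contains roughly the following (an illustrative layout assuming a default 2.7.x staging dir; the job id is the example from above):

/tmp/hadoop-yarn/staging/<user>/.staging/job_1517538889175_2550/
    job.jar            <- the job's jar (copyAndConfigureFiles)
    job.xml            <- the merged configuration (writeConf)
    job.split          <- the serialized input splits (writeSplits)
    job.splitmetainfo  <- split metadata: sizes and host locations

The submitClient here is the YARNRunner created earlier; its submitJob hands the job to the ResourceManager: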
@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
    throws IOException, InterruptedException {
  addHistoryToken(ts);

  ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);

  // Submit to ResourceManager
  try {
    ApplicationId applicationId =
        resMgrDelegate.submitApplication(appContext);

    ApplicationReport appMaster = resMgrDelegate
        .getApplicationReport(applicationId);
    String diagnostics =
        (appMaster == null ?
            "application report is null" : appMaster.getDiagnostics());
    if (appMaster == null
        || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
        || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
      throw new IOException("Failed to run job : " +
          diagnostics);
    }
    return clientCache.getClient(jobId).getJobStatus(jobId);
  } catch (YarnException e) {
    throw new IOException(e);
  }
}
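Note that submitJob only checks the application report once, right after submission; a monitoring loop does the same check repeatedly. A hedged sketch, reusing the yarnClient and appId from the earlier client sketch:

import java.util.EnumSet;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;

// Poll the RM until the application reaches a terminal state.
static void waitUntilDone(YarnClient yarnClient, ApplicationId appId)
    throws Exception {
  EnumSet<YarnApplicationState> terminal = EnumSet.of(
      YarnApplicationState.FINISHED,
      YarnApplicationState.FAILED,
      YarnApplicationState.KILLED);
  ApplicationReport report = yarnClient.getApplicationReport(appId);
  while (!terminal.contains(report.getYarnApplicationState())) {
    Thread.sleep(1000L); // poll the RM once per second
    report = yarnClient.getApplicationReport(appId);
  }
  System.out.println("Final state: " + report.getFinalApplicationStatus()
      + ", diagnostics: " + report.getDiagnostics());
}

The appContext built by createApplicationSubmissionContext is an ApplicationSubmissionContext: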
/**
 * {@code ApplicationSubmissionContext} represents all of the
 * information needed by the {@code ResourceManager} to launch
 * the {@code ApplicationMaster} for an application.
 *
 * <p>It includes details such as:
 * <ul>
 *   <li>{@link ApplicationId} of the application.</li>
 *   <li>Application user.</li>
 *   <li>Application name.</li>
 *   <li>{@link Priority} of the application.</li>
 *   <li>{@link ContainerLaunchContext} of the container in which the
 *       ApplicationMaster is executed.</li>
 *   <li>maxAppAttempts. The maximum number of application attempts.
 *       It should be no larger than the global number of max attempts
 *       in the YARN configuration.</li>
 *   <li>attemptFailuresValidityInterval. The default value is -1. When
 *       set to a value {@literal >} 0 (in milliseconds), failures that
 *       happened outside the validity interval do not count toward the
 *       failure count. If the failure count reaches maxAppAttempts, the
 *       application fails.</li>
 *   <li>Optional, application-specific {@link LogAggregationContext}.</li>
 * </ul>
 *
 * @see ContainerLaunchContext
 * @see ApplicationClientProtocol#submitApplication(org.apache.hadoop.yarn.api.protocolrecords.SubmitApplicationRequest)
 */
@Public
@Stable
public abstract class ApplicationSubmissionContext {
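Tying this back to the earlier YarnClient sketch: that same context object carries the fields listed above. A small hedged continuation (appContext as created there; the values are arbitrary examples):

import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.Priority;

// Continuing the earlier client sketch: optional knobs on the same context.
static void tuneContext(ApplicationSubmissionContext appContext) {
  appContext.setPriority(Priority.newInstance(0));           // application priority
  appContext.setMaxAppAttempts(2);                           // retry the AM at most twice
  appContext.setAttemptFailuresValidityInterval(60 * 1000L); // count only AM failures in the last minute
  appContext.setQueue("default");                            // target scheduler queue
}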
RM high availability uses an active/standby model: one active RM plus a standby RM (normally the active RM does all the work while the standby sits idle; on failure the standby takes over as active, relying on ZooKeeper heartbeats and leader election). [If you have studied druid.io, this is similar to the failover of its coordinator and overlord.]
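For reference, RM HA is switched on in yarn-site.xml with properties like these (real property names from the Hadoop 2.x HA docs; the hostnames and ZooKeeper quorum are placeholders):

<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>cluster1</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master1</value> <!-- placeholder -->
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>master2</value> <!-- placeholder -->
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value> <!-- placeholder quorum -->
</property>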
Scheduling shows up in many systems; in practice you still have to customize it around the actual business: multiple business lines, multiple policies. For example:
e.g. OS process-scheduling algorithms (first-come-first-served, round-robin, priority scheduling, and the like)
e.g. druid.io's middleManager task-assignment strategy
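As a concrete YARN-side example: with the CapacityScheduler, per-business-line queues are declared in capacity-scheduler.xml roughly like this (queue names and capacities are made-up placeholders):

<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>etl,adhoc</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.etl.capacity</name>
  <value>70</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.adhoc.capacity</name>
  <value>30</value>
</property>

A job then targets a queue via mapreduce.job.queuename, which is exactly the MRJobConfig.QUEUE_NAME read in submitJobInternal above.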