linux-c0001: ResourceManager
linux-c0002: NameNode
linux-c0003: DataNode / NodeManager
linux-c0004: DataNode / NodeManager
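The hostnames above must resolve identically on every node. A minimal /etc/hosts sketch (the IP addresses are placeholders, not from the original):

```
192.168.1.1   linux-c0001
192.168.1.2   linux-c0002
192.168.1.3   linux-c0003
192.168.1.4   linux-c0004
```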
core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://linux-c0002:9090</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
hdfs-site.xml (on the NameNode):
<property>
  <name>dfs.namenode.http-address</name>
  <value>linux-c0002:50071</value>
</property>
<property>
  <name>dfs.namenode.backup.address</name>
  <value>linux-c0002:50101</value>
</property>
<property>
  <name>dfs.namenode.backup.http-address</name>
  <value>linux-c0002:50106</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/hadoop/YarnRun/name1,/home/hadoop/YarnRun/name2</value>
</property>
<property>
  <name>dfs.blocksize</name>
  <value>268435456</value>
</property>
<property>
  <name>dfs.namenode.handler.count</name>
  <value>100</value>
</property>
hdfs-site.xml (on the DataNodes):
<property>
  <name>dfs.namenode.http-address</name>
  <value>linux-c0002:50071</value>
</property>
<property>
  <name>dfs.datanode.address</name>
  <value>linux-c0003:50011</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>linux-c0003:50076</value>
</property>
<property>
  <name>dfs.datanode.ipc.address</name>
  <value>linux-c0003:50021</value>
</property>
<property>
  <name>dfs.blocksize</name>
  <value>268435456</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hadoop/YarnRun/data1</value>
</property>
slaves file:
linux-c0003
linux-c0004
4: Formatting the NameNode
$HADOOP_PREFIX/bin/hdfs namenode -format
5: Starting the NameNode
$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
6: Checking that the NameNode has started, then starting the DataNodes
Run the following on each DataNode host:
$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
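A quick way to confirm the HDFS daemons came up is to run `jps` on each node and look for the daemon names. The sketch below uses a stand-in string for `jps` output (the PIDs are hypothetical), so the checking logic itself can be seen in isolation:

```shell
# sample_jps stands in for the output of running `jps` on a node;
# on a live node you would use: sample_jps=$(jps)
sample_jps="4521 NameNode
4688 DataNode
4900 Jps"

# Report whether each expected HDFS daemon appears in the jps listing.
for daemon in NameNode DataNode; do
  if echo "$sample_jps" | grep -qw "$daemon"; then
    echo "$daemon is running"
  else
    echo "$daemon is NOT running"
  fi
done
```

On the NameNode host you would expect to see NameNode; on linux-c0003 and linux-c0004 you would expect DataNode.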
yarn-site.xml:
<property>
  <name>yarn.resourcemanager.address</name>
  <value>linux-c0001:8032</value>
  <description>The host is the hostname of the ResourceManager and the port is the port on which clients talk to the ResourceManager.</description>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>linux-c0001:8031</value>
  <description>The host is the hostname of the ResourceManager and the port is the port on which the NodeManagers contact the ResourceManager.</description>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>linux-c0001:8030</value>
  <description>The host is the hostname of the ResourceManager and the port is the port on which applications in the cluster talk to the ResourceManager.</description>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  <description>Set this in case you do not want to use the default scheduler.</description>
</property>
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value></value>
  <description>Comma-separated list of local directories used by the NodeManager.</description>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>10240</value>
  <description>The amount of memory available to the NodeManager, in MB.</description>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/app-logs</value>
  <description>Directory on HDFS to which the application logs are moved.</description>
</property>
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value></value>
  <description>Directories used by the NodeManagers as log directories.</description>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
  <description>Shuffle service that needs to be set for MapReduce applications to run.</description>
</property>
mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.cluster.temp.dir</name>
  <value></value>
</property>
<property>
  <name>mapreduce.cluster.local.dir</name>
  <value></value>
</property>
4: Starting the ResourceManager
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
5: Starting the NodeManagers (run on each NodeManager host)
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
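Once the ResourceManager and NodeManagers are up, `yarn node -list` on the ResourceManager host shows which NodeManagers have registered. The sketch below parses a stand-in for that output (the listing shown is hypothetical, matching the two slave nodes in this cluster):

```shell
# sample_output stands in for: $HADOOP_YARN_HOME/bin/yarn node -list
sample_output="Total Nodes:2
         Node-Id      Node-State  Node-Http-Address  Number-of-Running-Containers
linux-c0003:50011        RUNNING   linux-c0003:8042                             0
linux-c0004:50011        RUNNING   linux-c0004:8042                             0"

# Count the NodeManagers that report the RUNNING state.
running=$(echo "$sample_output" | grep -c RUNNING)
echo "Registered NodeManagers: $running"
```

For this four-node layout the count should be 2; a smaller number usually means a NodeManager failed to start or cannot reach linux-c0001:8031.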
6: Starting the JobHistory server
The JobHistory service records the jobs that have run on the cluster, so that the runtime details of completed jobs can be inspected. JobHistory is usually started on a node that runs jobs, i.e. a NodeManager node. If you do not change the JobHistory configuration, you can simply run the start command on a NodeManager node:
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
Once it is started, you can open the cluster's job page and click the history link at the end of each job entry to see the run details of that job (the original post includes a screenshot of this page here).
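By default the history server listens on port 10020 (IPC) and 19888 (web UI), bound to 0.0.0.0. If you want it pinned to a specific host so other nodes can link to it reliably, these mapred-site.xml properties control the addresses (the values below are the Hadoop 2.x defaults):

```
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>0.0.0.0:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>0.0.0.0:19888</value>
</property>
```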