ip | hostname | installed software | processes
---|---|---|---
10.62.84.37 | master | hadoop, zookeeper | NameNode, ResourceManager, ZKFC
10.62.84.38 | master2 | hadoop, zookeeper, mysql, hive, spark, hue | NameNode, ResourceManager, MapReduce HistoryServer, ZKFC, mysql, Hive Metastore, HiveServer2, Spark Master, Spark HistoryServer, hue
10.62.84.39 | worker1 | hadoop, zookeeper, spark | DataNode, NodeManager, ZooKeeper, JournalNode, Spark Worker
10.62.84.40 | worker2 | hadoop, zookeeper, spark | DataNode, NodeManager, ZooKeeper, JournalNode, Spark Worker
10.62.84.41 | worker3 | hadoop, zookeeper, spark | DataNode, NodeManager, ZooKeeper, JournalNode, Spark Worker
10.62.84.42 | worker4 | hadoop, spark | DataNode, NodeManager, Spark Worker
Download URL: https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/stable/
data
|__ zookeeper
|__ data
| |__ myid
|__ logs
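The directory layout above can be created with a short shell sketch like the following (assuming /data is writable by the user that will run ZooKeeper):
mkdir -p /data/zookeeper/data /data/zookeeper/logs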
tar zxvf apache-zookeeper-3.5.5.tar.gz
mv apache-zookeeper-3.5.5 zookeeper
cd zookeeper/conf
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
After opening the file, modify the configuration as follows:
dataDir=/data/zookeeper/data
# ZooKeeper transaction logs go to the directory configured by dataLogDir in zoo.cfg
dataLogDir=/data/zookeeper/logs
server.1=worker1:2888:3888
server.2=worker2:2888:3888
server.3=worker3:2888:3888
The digit after server. (which must be unique) identifies this node within the ZooKeeper ensemble and must match the node's myid. The value after = describes the node, with fields separated by ":":
the first field is the hostname of the machine the node runs on;
the second and third fields, together with port 2181, have the following roles:
2181 -> the port ZooKeeper opens for client connections
2888 -> the port followers use to communicate with the leader (server-to-server communication)
3888 -> the port used for leader election
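Putting the pieces together, the resulting zoo.cfg might look roughly like this (tickTime, initLimit, syncLimit, and clientPort keep the defaults from zoo_sample.cfg; adjust to your environment):
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/logs
server.1=worker1:2888:3888
server.2=worker2:2888:3888
server.3=worker3:2888:3888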
Create an empty myid file under the dataDir directory (/data/zookeeper/data); on this node (worker3, matching server.3) its content is 3:
touch /data/zookeeper/data/myid
echo 3 > /data/zookeeper/data/myid
Log configuration
Modify zkEnv.sh:
if [ "x${ZOO_LOG_DIR}" = "x" ]
then
# output path for the service's runtime logs
ZOO_LOG_DIR="/data/zookeeper/logs"
fi
if [ "x${ZOO_LOG4J_PROP}" = "x" ]
then
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"
fi
Modify log4j.properties:
zookeeper.root.logger=INFO,ROLLINGFILE
# roll the log files daily by date
log4j.appender.ROLLINGFILE=org.apache.log4j.DailyRollingFileAppender
#log4j.appender.ROLLINGFILE.MaxFileSize=${zookeeper.log.maxfilesize}
#log4j.appender.ROLLINGFILE.MaxBackupIndex=${zookeeper.log.maxbackupindex}
vi /etc/profile
After opening the file, add the following at the very bottom:
export ZOOKEEPER_HOME=/usr/local/hadoop/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH
After the changes, save and exit, then run the following command to make them take effect:
source /etc/profile
Then copy the configured zookeeper directory to the other nodes (worker1, worker2).
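A sketch of the copy step (assuming zookeeper was unpacked under /usr/local/hadoop, as referenced by ZOOKEEPER_HOME above, and that the target directory is writable):
scp -r /usr/local/hadoop/zookeeper worker1:/usr/local/hadoop/
scp -r /usr/local/hadoop/zookeeper worker2:/usr/local/hadoop/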
Note: change the content of /data/zookeeper/data/myid on worker1 and worker2 accordingly.
worker1:
echo 1 > /data/zookeeper/data/myid
worker2:
echo 2 > /data/zookeeper/data/myid
Run the following on worker1, worker2, and worker3:
zkServer.sh --config /usr/local/hadoop/zookeeper/conf start
We install the cluster by first extracting and configuring Hadoop on the master server, and then distributing it to the other servers.
Create the hadoop user and group, and grant it passwordless sudo
Create the hadoop user
As root, create the user first; the adduser command is recommended:
adduser hadoop
passwd hadoop
After entering the password (hadoop), just keep pressing Enter and finally type y to confirm.
Add the hadoop user to the hadoop group
Creating the hadoop user also created a hadoop group; now add the hadoop user to it:
usermod -a -G hadoop hadoop
Grant the hadoop user root privileges so that it can use sudo
chmod u+w /etc/sudoers # make the sudoers file writable
vi /etc/sudoers
Below the line root ALL=(ALL) ALL, add:
hadoop ALL=(root) NOPASSWD:ALL
chmod u-w /etc/sudoers # restore the permissions of the sudoers file
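Note that start-dfs.sh/start-yarn.sh and the sshfence fencing method configured later assume passwordless SSH for the hadoop user between all nodes. A minimal sketch, run as the hadoop user on master (and repeated on master2), with hostnames taken from the table at the top:
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa     # generate a key pair without a passphrase
for host in master master2 worker1 worker2 worker3 worker4; do
  ssh-copy-id hadoop@$host                   # append the public key to each node's authorized_keys
done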
data
|__ hadoop
|__ hdfs
| |__ name
| |__ data
|__ tmp
|__ pids
|__ logs
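This layout must exist on every node; a sketch for creating it (run as root, then hand ownership to the hadoop user):
mkdir -p /data/hadoop/hdfs/name /data/hadoop/hdfs/data /data/hadoop/tmp /data/hadoop/pids /data/hadoop/logs
chown -R hadoop:hadoop /data/hadoop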
Upload the downloaded package to the /usr/local/hadoop directory, extract it, and rename the directory:
cd /usr/local/hadoop
tar zxvf hadoop-3.1.0.tar.gz
mv hadoop-3.1.0 hadoop
[hadoop@master hadoop]$ vi ~/.bashrc
After opening the file, add the following at the very bottom:
# hadoop
export HADOOP_HOME=/usr/local/hadoop/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
After the changes, save and exit, then run the following command to make them take effect:
[hadoop@master hadoop]$ source ~/.bashrc
Configure the Hadoop runtime environment
Set the Hadoop JDK path and define the users that operate the cluster by adding the following to hadoop-env.sh:
export JAVA_HOME=/usr/java/jdk1.8.0_111
export HDFS_NAMENODE_USER=hadoop
export HDFS_DATANODE_USER=hadoop
export HDFS_JOURNALNODE_USER=hadoop
export HDFS_ZKFC_USER=hadoop
export YARN_RESOURCEMANAGER_USER=hadoop
export YARN_NODEMANAGER_USER=hadoop
export HADOOP_PID_DIR=/data/hadoop/pids
export HADOOP_LOG_DIR=/data/hadoop/logs
Configure core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/data/hadoop/tmp</value>
</property>
<property>
<name>hadoop.http.staticuser.user</name>
<value>hadoop</value>
<description>The user name to filter as, on static web filters while rendering content. </description>
</property>
<property>
<name>hadoop.zk.address</name>
<value>worker1:2181,worker2:2181,worker3:2181</value>
<description>Host:Port of the ZooKeeper server to be used.</description>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>worker1:2181,worker2:2181,worker3:2181</value>
<description>A list of ZooKeeper server addresses, separated by commas, that are to be used by the ZKFailoverController in automatic failover.</description>
</property>
</configuration>
Configure hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
<description>Comma-separated list of nameservices</description>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
<description>a comma-separated list of namenodes for nameservice ns1</description>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>master:8020</value>
<description>The RPC address for namenode nn1</description>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>master:9870</value>
<description>The address and the base port where the dfs namenode nn1 web ui will listen on</description>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>master2:8020</value>
<description>The RPC address for namenode nn2</description>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>master2:9870</value>
<description>The address and the base port where the dfs namenode nn2 web ui will listen on</description>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://worker1:8485;worker2:8485;worker3:8485/ns1</value>
<description>A directory on shared storage between the multiple namenodes in an HA cluster. This directory will be written by the active and read by the standby in order to keep the namespaces synchronized</description>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/hadoop/journal</value>
<description>The directory where the journal edit files are stored</description>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
<description>Whether automatic failover is enabled</description>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
<description>The prefix (plus a required nameservice ID) for the class name of the configured Failover proxy provider for the host</description>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
<description>A list of scripts or Java classes which will be used to fence the Active NameNode during a failover.</description>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
Configure mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master2:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master2:19888</value>
</property>
</configuration>
Configure yarn-site.xml
<configuration>
<!-- Configuring the External Shuffle Service -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
<value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
<description>Enable RM high-availability</description>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>rmcluster</value>
<description>Name of the cluster</description>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
<description>The list of RM nodes in the cluster when HA is enabled</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master</value>
<description>The hostname of the rm1</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>master2</value>
<description>The hostname of the rm2</description>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
<description>Enable RM to recover state after starting</description>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
<description>The class to use as the persistent store</description>
</property>
<!-- YARN-Fair Scheduler. Start -->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/usr/local/hadoop/hadoop/etc/hadoop/fair-scheduler.xml</value>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>true</value>
</property>
<property>
<name>yarn.scheduler.fair.user-as-default-queue</name>
<value>false</value>
<description>default is True</description>
</property>
<property>
<name>yarn.scheduler.fair.allow-undeclared-pools</name>
<value>false</value>
<description>default is True</description>
</property>
<!-- YARN-Fair Scheduler. End -->
<!-- YARN nodemanager resource config. Start -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>36864</value>
<description>Physical memory, in MB, to be made available to running containers</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>24</value>
<description>Number of CPU cores that can be allocated for containers.</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>3</value>
</property>
<!-- YARN nodemanager resource config. End -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://master2:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>/usr/local/hadoop/hadoop/etc/hadoop:/usr/local/hadoop/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/hadoop/share/hadoop/common/*:/usr/local/hadoop/hadoop/share/hadoop/hdfs:/usr/local/hadoop/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/hadoop/share/hadoop/yarn:/usr/local/hadoop/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/hadoop/share/hadoop/yarn/*</value>
</property>
<!-- Clients submit applications to the RM at this address -->
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>master:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>master2:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>master2:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>master2:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>master2:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>master2:8033</value>
</property>
</configuration>
Configure workers
[hadoop@master hadoop]$ vi workers
worker1
worker2
worker3
worker4
Configure the Fair Scheduler
Create a new fair-scheduler.xml file under /usr/local/hadoop/hadoop/etc/hadoop with the following content:
<?xml version="1.0"?>
<allocations>
<defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
<queue name="prod">
<weight>40</weight>
</queue>
<queue name="dev">
<weight>60</weight>
</queue>
<queuePlacementPolicy>
<rule name="specified" create="false" />
<rule name="primaryGroup" create="false" />
<rule name="default" queue="dev" />
</queuePlacementPolicy>
</allocations>
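With yarn.scheduler.fair.user-as-default-queue and allow-undeclared-pools both set to false, jobs land in the dev queue unless a queue is named explicitly via mapreduce.job.queuename, which the "specified" placement rule honors. For example (the example jar ships with the Hadoop 3.1.0 distribution; the pi job accepts generic -D options):
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0.jar pi -D mapreduce.job.queuename=prod 10 100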
Distribute the configured Hadoop to the other nodes:
scp -r /usr/local/hadoop/hadoop hadoop@master2:/usr/local/hadoop/
scp -r /usr/local/hadoop/hadoop hadoop@worker1:/usr/local/hadoop/
scp -r /usr/local/hadoop/hadoop hadoop@worker2:/usr/local/hadoop/
scp -r /usr/local/hadoop/hadoop hadoop@worker3:/usr/local/hadoop/
scp -r /usr/local/hadoop/hadoop hadoop@worker4:/usr/local/hadoop/
Start ZooKeeper on worker1, worker2, and worker3:
zkServer.sh --config /usr/local/hadoop/zookeeper/conf start
Check the status: there should be one leader and two followers.
zkServer.sh --config /usr/local/hadoop/zookeeper/conf status
Start the JournalNode on worker1, worker2, and worker3.
Note: the JournalNodes form the qjournal (quorum journal) service that manages the shared edit log files for the NameNodes; in this deployment they are co-located on the three ZooKeeper nodes.
hdfs --daemon start journalnode
Run jps to verify: a JournalNode process should now be running on worker1, worker2, and worker3.
Format the NameNode by running the following command on master:
[hadoop@master data]$ hdfs namenode -format
Start the NameNode on master:
hdfs --daemon start namenode
On master2, synchronize the NameNode metadata from master:
[hadoop@master2 data]$ hdfs namenode -bootstrapStandby
Stop the NameNode on master:
[hadoop@master hadoop]$ hdfs --daemon stop namenode
Format the ZKFC state in ZooKeeper; this only needs to be run on master (command below).
Note: zkfc (ZKFailoverController) is the process that manages active/standby switching between the two NameNodes, and it also relies on ZooKeeper. When the active NameNode becomes unhealthy, the zkfc on that node publishes the state to ZooKeeper; the zkfc on the standby NameNode sees the abnormal state and sends a kill -9 command to the active NameNode over SSH to terminate the process, then transitions its own NameNode to active. This fencing prevents split brain when the old active is merely hung. If the SSH command fails, a custom .sh script can be invoked to forcibly kill the active NameNode process.
In Hadoop 3.x such a pair of NameNodes belongs to one nameservice: the nameservice named ns1 in the configuration contains nn1 (active) and nn2 (standby). Running several independent nameservices side by side is what HDFS calls federation.
[hadoop@master hadoop]$ hdfs zkfc -formatZK
Start HDFS
First stop the JournalNodes (run hdfs --daemon stop journalnode on worker1, worker2, and worker3), then run the following on master:
[hadoop@master hadoop]$ /usr/local/hadoop/hadoop/sbin/start-dfs.sh
Starting namenodes on [master master2]
Starting datanodes
Starting journal nodes [worker1 worker2 worker3]
Starting ZK Failover Controllers on NN hosts [master master2]
We can see that the DFSZKFailoverController has been started on both master and master2.
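To confirm which NameNode is currently active, haadmin can be queried (nn1/nn2 as defined in dfs.ha.namenodes.ns1 above):
hdfs haadmin -getServiceState nn1   # reports "active" or "standby"
hdfs haadmin -getServiceState nn2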
Start YARN
Run the following on master:
[hadoop@master hadoop]$ /usr/local/hadoop/hadoop/sbin/start-yarn.sh
Starting resourcemanagers on [ master master2]
Starting nodemanagers
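Likewise, the ResourceManager HA state can be checked with rmadmin (rm1/rm2 as configured in yarn.resourcemanager.ha.rm-ids):
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2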
Start the MapReduce history server
On master2, start the MapReduce history server:
[hadoop@master2 logs]$ mapred --daemon start historyserver
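As a quick smoke test of the whole stack, the bundled example job can be submitted from any node with the client configuration (the jar name assumes the hadoop-3.1.0 tarball used above):
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0.jar pi 10 100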
HDFS
http://master:9870
http://master2:9870
One of them is active and the other is standby.
YARN
http://master:8088
http://master2:8088
When browsing, the standby ResourceManager redirects to the active one's page.
NameNode HA
Visit:
http://master:9870
http://master2:9870
One of them is active and the other is standby.
Failover verification
Kill the NameNode process on master with kill -9; the NameNode on master2 should then become active.
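A sketch of the check (the PID is found with jps; <NameNode PID> is a placeholder):
jps | grep NameNode                  # note the NameNode PID on master
kill -9 <NameNode PID>
hdfs haadmin -getServiceState nn2    # should now report "active"
hdfs --daemon start namenode         # bring the killed NameNode back as standby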
YARN HA
Failover verification
Kill the ResourceManager process on master with kill -9.
Now http://master2:8088 can be accessed.
Then restart the ResourceManager on master (yarn --daemon start resourcemanager); when visiting http://master:8088 again, it automatically redirects to http://master2:8088.
Problem description
During NameNode HA testing, the NameNode could not fail over between active and standby.
Solution
Checking the zkfc log on master2 shows the following message:
bash: fuser: command not found
The fuser program was not found, so fencing could not be performed. It can be installed with:
yum install psmisc
Note: the psmisc package provides fuser, killall, and pstree. This problem occurred because CentOS 7 was installed with the minimal install option, which does not include psmisc by default.
Problem description
2019-05-29 11:09:37,100 INFO mapreduce.Job: Job job_1559046243109_0004 failed with state FAILED due to: Application application_1559046243109_0004 failed 2 times due to AM Container for appattempt_1559046243109_0004_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2019-05-29 11:08:53.805]Exception from container-launch.
Container id: container_e02_1559046243109_0004_02_000001
Exit code: 1
[2019-05-29 11:08:53.851]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
[2019-05-29 11:08:53.852]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
Solution
YARN cannot find the MRAppMaster main class when launching MapReduce jobs. Run the following on master:
[hadoop@master hadoop]$ hadoop classpath
/usr/local/hadoop/hadoop/etc/hadoop:/usr/local/hadoop/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/hadoop/share/hadoop/common/*:/usr/local/hadoop/hadoop/share/hadoop/hdfs:/usr/local/hadoop/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/hadoop/share/hadoop/yarn:/usr/local/hadoop/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/hadoop/share/hadoop/yarn/*
Add the value output above to the yarn.application.classpath property in $HADOOP_HOME/etc/hadoop/yarn-site.xml:
<property>
<name>yarn.application.classpath</name>
<value>/usr/local/hadoop/hadoop/etc/hadoop:/usr/local/hadoop/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/hadoop/share/hadoop/common/*:/usr/local/hadoop/hadoop/share/hadoop/hdfs:/usr/local/hadoop/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/hadoop/share/hadoop/yarn:/usr/local/hadoop/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/hadoop/share/hadoop/yarn/*</value>
</property>
Then restart YARN and rerun the MapReduce job.
Problem description
18/08/16 17:02:54 INFO mapreduce.Job: Job job_1534406793739_0005 failed with state FAILED due to: Application application_1534406793739_0005 failed 2 times due to AM Container for appattempt_1534406793739_0005_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2018-08-16 17:02:48.561]Exception from container-launch.
Container id: container_e27_1534406793739_0005_02_000001
Exit code: 1
[2018-08-16 17:02:48.562]
[2018-08-16 17:02:48.574]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Solution
The log shows that the container running the ApplicationMaster exited without ever obtaining resources from the RM for the job, which suggests a communication problem between the AM and the RM. One RM is active and the other is standby; requests that reach the active RM succeed, but requests sent to the standby RM's address fail, producing this error.
Add the following configuration to $HADOOP_HOME/etc/hadoop/yarn-site.xml:
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>master:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>master2:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>master2:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>master2:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>master2:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>master2:8033</value>
</property>
Then restart YARN and rerun the MapReduce job.
Problem description
[hadoop@master sbin]$ ./start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [master master2]
Starting datanodes
worker6: WARNING: Your password has expired.
worker6: Password change required but no TTY available.
worker8: WARNING: Your password has expired.
worker5: WARNING: Your password has expired.
worker8: Password change required but no TTY available.
worker5: Password change required but no TTY available.
worker9: WARNING: Your password has expired.
worker9: Password change required but no TTY available.
worker10: WARNING: Your password has expired.
worker10: Password change required but no TTY available.
worker7: WARNING: Your password has expired.
worker7: Password change required but no TTY available.
worker11: WARNING: Your password has expired.
worker14: WARNING: Your password has expired.
worker11: Password change required but no TTY available.
worker14: Password change required but no TTY available.
worker12: WARNING: Your password has expired.
worker12: Password change required but no TTY available.
worker13: WARNING: Your password has expired.
worker13: Password change required but no TTY available.
Starting journal nodes [worker1 worker2 worker3]
Starting ZK Failover Controllers on NN hosts [master master2]
Starting resourcemanagers on [ master master2]
Starting nodemanagers
worker5: WARNING: Your password has expired.
worker5: Password change required but no TTY available.
worker9: WARNING: Your password has expired.
worker6: WARNING: Your password has expired.
worker9: Password change required but no TTY available.
worker6: Password change required but no TTY available.
worker10: WARNING: Your password has expired.
worker10: Password change required but no TTY available.
worker8: WARNING: Your password has expired.
worker8: Password change required but no TTY available.
worker7: WARNING: Your password has expired.
worker7: Password change required but no TTY available.
worker11: WARNING: Your password has expired.
worker11: Password change required but no TTY available.
worker14: WARNING: Your password has expired.
worker14: Password change required but no TTY available.
worker12: WARNING: Your password has expired.
worker12: Password change required but no TTY available.
worker13: WARNING: Your password has expired.
worker13: Password change required but no TTY available.
Solution
The warnings indicate that the Linux user's password has expired. Fix it as follows:
1. Check whether the user's password has expired
[root@worker6 ~]# chage -l hadoop
Last password change : Sep 03, 2019
Password expires : Dec 02, 2019
Password inactive : never
Account expires : never
Minimum number of days between password change : 2
Maximum number of days between password change : 90
Number of days of warning before password expires : 14
2. Extend the user's password expiry interval; once changed, nothing else is needed:
[root@worker6 ~]# chage -M 3600 hadoop
[root@worker6 ~]# chage -l hadoop
Last password change : Sep 03, 2019
Password expires : Jul 12, 2029
Password inactive : never
Account expires : never
Minimum number of days between password change : 2
Maximum number of days between password change : 3600
Number of days of warning before password expires : 14
Solution
Restart the NameNode, DFSZKFailoverController, and ResourceManager processes on the master node:
hdfs --daemon start namenode
hdfs --daemon start zkfc
yarn --daemon start resourcemanager