3. Installing Hadoop
3.1. Unpack the distribution
※ Run on all three servers
tar -xf ~/install/hadoop-2.7.3.tar.gz -C /opt/cloud/packages
ln -s /opt/cloud/packages/hadoop-2.7.3 /opt/cloud/bin/hadoop
ln -s /opt/cloud/packages/hadoop-2.7.3/etc/hadoop /opt/cloud/etc/hadoop
mkdir -p /opt/cloud/hdfs/name
mkdir -p /opt/cloud/hdfs/data
mkdir -p /opt/cloud/hdfs/journal
mkdir -p /opt/cloud/hdfs/tmp/java
mkdir -p /opt/cloud/logs/hadoop/yarn
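A quick optional sanity check confirms the links and directories exist; the paths below simply match the commands above:
ls -ld /opt/cloud/bin/hadoop /opt/cloud/etc/hadoop
ls -d /opt/cloud/hdfs/{name,data,journal,tmp/java} /opt/cloud/logs/hadoop/yarn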
3.2. Set environment variables
Set the JAVA and Hadoop environment variables.
vi ~/.bashrc
Add:
export HADOOP_HOME=/opt/cloud/bin/hadoop
export HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
export HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
export HADOOP_PID_DIR=/opt/cloud/hdfs/tmp
export YARN_PID_DIR=/opt/cloud/hdfs/tmp
export HADOOP_OPTS="-Djava.io.tmpdir=/opt/cloud/hdfs/tmp/java"
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Make it take effect immediately:
source ~/.bashrc
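To verify that the variables are now active in the current shell, for example:
echo $HADOOP_HOME    # should print /opt/cloud/bin/hadoop
hadoop version       # should report Hadoop 2.7.3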
Copy it to the other two servers:
scp ~/.bashrc hadoop2:/home/hadoop
scp ~/.bashrc hadoop3:/home/hadoop
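scp only copies the file, so the variables take effect on hadoop2 and hadoop3 at the next login; a quick remote check (assuming the passwordless ssh used throughout this setup):
ssh hadoop2 'source ~/.bashrc; echo $HADOOP_HOME'
ssh hadoop3 'source ~/.bashrc; echo $HADOOP_HOME'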
3.3. Modify the Hadoop configuration
cd ${HADOOP_HOME}/etc/hadoop
Modify log4j.properties, hadoop-env.sh, yarn-env.sh, slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml, then distribute them to the same directory on hadoop2 and hadoop3.
3.3.1. Modify the log configuration file log4j.properties
hadoop.root.logger=INFO,DRFA
hadoop.log.dir=/opt/cloud/logs/hadoop
3.3.2. Modify hadoop-env.sh
hadoop-env.sh sets some of Hadoop's environment variables, but versions up to and including 2.7.3 have a bug: the correct values are not picked up from the system environment, so they have to be set by hand. At the top of the file is
export JAVA_HOME=${JAVA_HOME}
Comment it out and set it manually to
export JAVA_HOME="/usr/lib/jvm/java"
Search the file for #export HADOOP_LOG_DIR and add below it
export HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Search the file for export HADOOP_PID_DIR=${HADOOP_PID_DIR} and change it to
export HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
Set Java's temporary directory: search for
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true "
and change it to
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/opt/cloud/hdfs/tmp/java"
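The edits above can also be scripted. A rough sed sketch, assuming the file still contains the stock 2.7.3 lines (the commented HADOOP_LOG_DIR default is replaced in place rather than appended after; back the file up first and verify the result):
cp hadoop-env.sh hadoop-env.sh.bak
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME="/usr/lib/jvm/java"|' hadoop-env.sh
sed -i 's|^#export HADOOP_LOG_DIR=.*|export HADOOP_LOG_DIR=/opt/cloud/logs/hadoop|' hadoop-env.sh
sed -i 's|^export HADOOP_PID_DIR=.*|export HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/|' hadoop-env.sh
sed -i 's|-Djava.net.preferIPv4Stack=true |-Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/opt/cloud/hdfs/tmp/java|' hadoop-env.sh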
3.3.3. Modify yarn-env.sh
Search for default log directory and add a line after it:
export YARN_LOG_DIR=/opt/cloud/logs/hadoop/yarn
3.3.4. Modify slaves
# vi slaves
Contents:
Delete: localhost
Add:
hadoop2
hadoop3
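Equivalently, the file can be written in one step from ${HADOOP_HOME}/etc/hadoop:
cat > slaves <<'EOF'
hadoop2
hadoop3
EOF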
3.3.5. Modify core-site.xml
# vi core-site.xml
Contents:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/cloud/hdfs/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>hadoop</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>hadoop1,hadoop2,hadoop3,127.0.0.1,localhost</value>
    </property>
    <property><!-- [1] -->
        <name>ipc.client.rpc-timeout.ms</name>
        <value>4000</value>
    </property>
    <property>
        <name>ipc.client.connect.timeout</name>
        <value>4000</value>
    </property>
    <property>
        <name>ipc.client.connect.max.retries</name>
        <value>100</value>
    </property>
    <property>
        <name>ipc.client.connect.retry.interval</name>
        <value>10000</value>
    </property>
</configuration>
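Hand-edited XML breaks easily; before distributing the files it is worth checking each one for well-formedness, e.g. with xmllint if libxml2 is installed. The same check applies to the other *-site.xml files below:
xmllint --noout core-site.xml && echo OK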
3.3.6. Modify hdfs-site.xml
# vi hdfs-site.xml
Contents:
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop1:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hadoop1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop2:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hadoop2:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/mycluster</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/cloud/hdfs/journal</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/opt/cloud/hdfs/name</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/opt/cloud/hdfs/data</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.support.append</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
        <value>NEVER</value>
    </property>
    <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>8192</value>
    </property>
</configuration>
3.3.7. Modify mapred-site.xml
mv mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
Contents:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>0.0.0.0:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>0.0.0.0:19888</value>
    </property>
    <property><!-- [2] -->
        <name>yarn.app.mapreduce.am.resource.mb</name>
        <value>1024</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.command-opts</name>
        <value>-Xmx800m</value>
    </property>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>512</value>
    </property>
    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx400m</value>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>1024</value>
    </property>
    <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx800m</value>
    </property>
</configuration>
3.3.8. Modify yarn-site.xml (non-HA version)
vi yarn-site.xml
Contents:
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
3.3.9. Modify yarn-site.xml (HA version)
vi yarn-site.xml
Contents:
<configuration>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>clusteryarn</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop2</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property><!-- [3] -->
        <name>yarn.resourcemanager.connect.retry-interval.ms</name>
        <value>5000</value>
    </property>
    <property><!-- [4] -->
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>3072</value>
    </property>
    <property><!-- [5] -->
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>2</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-vcores</name>
        <value>1</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-vcores</name>
        <value>2</value>
    </property>
</configuration>
3.3.10. Copy to the other two servers
Copy the configuration files to the other two servers:
scp /opt/cloud/bin/hadoop/etc/hadoop/* hadoop2:/opt/cloud/bin/hadoop/etc/hadoop/
scp /opt/cloud/bin/hadoop/etc/hadoop/* hadoop3:/opt/cloud/bin/hadoop/etc/hadoop/
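To confirm the three servers really have identical configuration, a checksum comparison helps (cexec is the cluster-exec tool used elsewhere in this guide), e.g.:
cexec 'md5sum /opt/cloud/etc/hadoop/core-site.xml /opt/cloud/etc/hadoop/hdfs-site.xml'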
3.4. Starting HDFS for the first time
- Start the JournalNode cluster:
cexec 'hadoop-daemon.sh start journalnode'
Note that only the first startup needs to be done this way; afterwards, starting HDFS also starts the JournalNodes.
- Format the first NameNode:
ssh hadoop1 'hdfs namenode -format -clusterId mycluster'
Formatting succeeded if the last part of the output contains these two lines:
INFO common.Storage: Storage directory /opt/cloud/hdfs/name has been successfully formatted.
...
INFO util.ExitUtil: Exiting with status 0
- Start the first NameNode:
ssh hadoop1 'hadoop-daemon.sh start namenode'
- Format the second NameNode:
ssh hadoop2 'hdfs namenode -bootstrapStandby'
Formatting succeeded if the last part of the output contains these two lines:
INFO common.Storage: Storage directory /opt/cloud/hdfs/name has been successfully formatted.
...
INFO util.ExitUtil: Exiting with status 0
- Start the second NameNode:
ssh hadoop2 'hadoop-daemon.sh start namenode'
- Format ZK:
ssh hadoop1 'hdfs zkfc -formatZK'
If the output contains the line
INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.
the ZK format succeeded.
- Start the two ZKFCs:
ssh hadoop1 'hadoop-daemon.sh start zkfc'
ssh hadoop2 'hadoop-daemon.sh start zkfc'
- Start all the DataNodes:
ssh hadoop1 'hadoop-daemons.sh start datanode'
Open http://hadoop1:50070 and http://hadoop2:50070 in a browser to check the status.
One NameNode is active and the other is standby; on the active one's page, the three QJM servers show the same Written txid.
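The HA state can also be queried from the command line, using the NameNode IDs defined in hdfs-site.xml:
hdfs haadmin -getServiceState nn1    # prints active or standby
hdfs haadmin -getServiceState nn2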
3.5. Regular startup of HDFS and YARN
Run on hadoop1:
start-dfs.sh
start-yarn.sh
Then start the second ResourceManager on hadoop2:
ssh hadoop2 'yarn-daemon.sh start resourcemanager'
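As with HDFS, the ResourceManager HA state can be checked using the rm-ids from yarn-site.xml:
yarn rmadmin -getServiceState rm1    # prints active or standby
yarn rmadmin -getServiceState rm2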
Check the processes with jps:
[hadoop@hadoop1 ~]$ cexec jps
************************* cloud *************************
--------- hadoop1---------
1223 QuorumPeerMain
3757 DFSZKFailoverController
4787 Jps
3872 ResourceManager
3365 NameNode
3578 JournalNode
--------- hadoop2---------
1220 QuorumPeerMain
24240 NodeManager
24545 Jps
24022 JournalNode
24139 DFSZKFailoverController
23847 NameNode
23923 DataNode
24419 ResourceManager
--------- hadoop3---------
23764 Jps
23578 NodeManager
23471 JournalNode
23372 DataNode
1224 QuorumPeerMain
Opening the following URLs in a browser shows the graphical monitoring pages:
http://hadoop1:50070/ — graphical HDFS monitoring
http://hadoop2:50070/ — graphical HDFS monitoring; one of hadoop1 and hadoop2 is active, the other standby
http://hadoop1:8088 — graphical YARN monitoring
http://hadoop2:8088 — redirects automatically to http://hadoop1:8088
3.6. Running HDFS automatically at boot
CentOS 7 uses systemd as its startup manager, which has advantages such as convenient dependency handling. However, every service starts with a fresh environment: systemd does not inherit any context, so each service unit must set all the environment variables it needs. Each variable is set with Environment=name=value. The good news is that Environment may appear on multiple lines; the bad news is that a value cannot reference an already declared variable, i.e. $name and ${name} are not expanded.
3.6.1. journalnode service
vi hadoop-journalnode.service
[Unit]
Description=hadoop journalnode service
After=network.target

[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start journalnode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop journalnode'

[Install]
WantedBy=multi-user.target
3.6.2. namenode service
vi hadoop-namenode.service
[Unit]
Description=hadoop namenode service
After=network.target

[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start namenode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop namenode'

[Install]
WantedBy=multi-user.target
3.6.3. datanode service
vi hadoop-datanode.service
[Unit]
Description=hadoop datanode service
After=network.target

[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start datanode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop datanode'

[Install]
WantedBy=multi-user.target
3.6.4. zkfc service
vi hadoop-zkfc.service
[Unit]
Description=hadoop zkfc service
After=network.target

[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start zkfc'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop zkfc'

[Install]
WantedBy=multi-user.target
3.6.5. yarn resource manager service
vi yarn-rm.service
[Unit]
Description=yarn resource manager service
After=network.target

[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh start resourcemanager'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh stop resourcemanager'

[Install]
WantedBy=multi-user.target
3.6.6. yarn nodemanager service
vi yarn-nm.service
[Unit]
Description=yarn node manager service
After=network.target

[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh start nodemanager'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh stop nodemanager'

[Install]
WantedBy=multi-user.target
3.6.7. Test and enable automatic startup
Create the unit files for the six services and copy each to the /etc/systemd/system directory on the servers that run that service, as sketched below.
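For example, on one server (copy only the units that server actually runs; systemd must re-read its configuration after new unit files are added):
sudo cp hadoop-*.service yarn-*.service /etc/systemd/system/
sudo systemctl daemon-reload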
hadoop2 (6 services)
systemctl start hadoop-journalnode
systemctl start hadoop-namenode
systemctl start hadoop-datanode
systemctl start hadoop-zkfc
systemctl start yarn-rm
systemctl start yarn-nm
After the tests pass:
systemctl enable hadoop-journalnode
systemctl enable hadoop-namenode
systemctl enable hadoop-datanode
systemctl enable hadoop-zkfc
systemctl enable yarn-rm
systemctl enable yarn-nm
hadoop1 (4 services)
systemctl enable hadoop-journalnode
systemctl enable hadoop-namenode
systemctl enable hadoop-zkfc
systemctl enable yarn-rm
hadoop3 (3 services)
systemctl enable hadoop-journalnode
systemctl enable hadoop-datanode
systemctl enable yarn-nm
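Before rebooting, the registration can be double-checked on all three servers, e.g. for the JournalNode unit that runs everywhere:
cexec 'systemctl is-enabled hadoop-journalnode'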
Reboot the three servers and run cexec jps to check the state of the system.
3.7. Uninstalling
- Stop YARN, then stop DFS:
ssh hadoop1 'stop-yarn.sh'
ssh hadoop2 'yarn-daemon.sh stop resourcemanager'
ssh hadoop1 'stop-dfs.sh'
cexec jps should no longer show any HDFS or YARN processes.
- Stop and remove the system services
hadoop2 (6 services)
systemctl disable hadoop-journalnode
systemctl disable hadoop-namenode
systemctl disable hadoop-datanode
systemctl disable hadoop-zkfc
systemctl disable yarn-rm
systemctl disable yarn-nm
hadoop1 (4 services)
systemctl disable hadoop-journalnode
systemctl disable hadoop-namenode
systemctl disable hadoop-zkfc
systemctl disable yarn-rm
hadoop3 (3 services)
systemctl disable hadoop-journalnode
systemctl disable hadoop-datanode
systemctl disable yarn-nm
- Delete the data directories
rm -rf /opt/cloud/hdfs
rm -rf /opt/cloud/logs/hadoop
- Delete the program directories
rm -rf /opt/cloud/bin/hadoop
rm -rf /opt/cloud/etc/hadoop
rm -rf /opt/cloud/packages/hadoop-2.7.3
- Restore the environment variables
vi ~/.bashrc
Delete the Hadoop-related lines.
[1] An important parameter: it sets the communication timeout between Hadoop services, in particular for the HA mechanism between the NodeManager and the ResourceManager.
[2] Adapted to VMs with 4 GB of RAM; all the values are on the low side.
[3] Related to YARN high availability; a policy parameter for what the NodeManager does after a connection failure.
[4] The VMs have only 4 GB of RAM and 2 cores, so these resource parameters are also small.
[5] Because memory is tight, the virtual-to-physical memory ratio was raised from the default 2.1 to 4.