1、Pre-installation preparation
①、Cluster plan:
Hostname | User | Host IP | Software installed | Processes
centos71 | hzq | 192.168.1.201 | jdk, hadoop | NameNode, DFSZKFailoverController (zkfc)
centos72 | hzq | 192.168.1.202 | jdk, hadoop | NameNode, DFSZKFailoverController (zkfc)
centos73 | hzq | 192.168.1.203 | jdk, hadoop | ResourceManager
centos74 | hzq | 192.168.1.204 | jdk, hadoop | ResourceManager
centos75 | hzq | 192.168.1.205 | jdk, hadoop | DataNode, NodeManager, JournalNode
centos76 | hzq | 192.168.1.206 | jdk, hadoop | DataNode, NodeManager, JournalNode
centos77 | hzq | 192.168.1.207 | jdk, hadoop | DataNode, NodeManager, JournalNode
centos78 | hzq | 192.168.1.208 | jdk, zookeeper | QuorumPeerMain
centos79 | hzq | 192.168.1.209 | jdk, zookeeper | QuorumPeerMain
centos710 | hzq | 192.168.1.210 | jdk, zookeeper | QuorumPeerMain
②、Set up passwordless SSH login between all hosts (see the companion guide on passwordless SSH login); a sketch of the usual steps follows.
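A minimal sketch of the key-based setup, assuming the hzq user and the hostnames above (run on each host, repeating the ssh-copy-id step for every peer):
ssh-keygen -t rsa            # generate a key pair, accepting the defaults
ssh-copy-id hzq@centos72     # install the public key on a peer; repeat for centos73 ... centos710
ssh hzq@centos72             # verify that login no longer prompts for a password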
③、Install jdk1.8.0_131 on every host (installation and configuration are covered in the companion guide on installing the JDK on Linux).
④、Set up the ZooKeeper cluster (see the companion zookeeper-3.4.10 distributed-installation guide).
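For reference, an ensemble matching the plan above (centos78, centos79, centos710 on client port 2181) corresponds to a zoo.cfg roughly like this; dataDir is an illustrative path, and each node's dataDir must contain a myid file holding that node's server number (1, 2, or 3):
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/home/hzq/software/zookeeper/data
server.1=centos78:2888:3888
server.2=centos79:2888:3888
server.3=centos710:2888:3888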
⑤、Edit the "/etc/hosts" file on every host as follows:
192.168.1.201 centos71
192.168.1.202 centos72
192.168.1.203 centos73
192.168.1.204 centos74
192.168.1.205 centos75
192.168.1.206 centos76
192.168.1.207 centos77
192.168.1.208 centos78
192.168.1.209 centos79
192.168.1.210 centos710
⑥、Prepare the Hadoop installation package: hadoop-2.8.0.tar.gz
⑦、Disable the firewall on every host.
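On CentOS 7 the firewall is firewalld, managed through systemd; a sketch (run as root, or prefix with sudo):
systemctl stop firewalld      # stop the firewall now
systemctl disable firewalld   # keep it from starting at boot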
2、Installing Hadoop:
①、Create a "hadoop" directory under "/home/hzq/software/".
②、Create a "data" directory under "hadoop" to hold Hadoop's runtime files.
③、Extract "hadoop-2.8.0.tar.gz" into the hadoop directory:
tar -zxvf ../package/hadoop-2.8.0.tar.gz -C /home/hzq/software/hadoop/
④、Delete the doc directory under "hadoop-2.8.0/share" to speed up the later scp copies:
rm -rf hadoop-2.8.0/share/doc
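The startup commands in section 4 (start-dfs.sh, hadoop-daemon.sh, and so on) assume Hadoop's bin and sbin directories are on the PATH; a minimal sketch for ~/.bashrc, matching the layout above:
export HADOOP_HOME=/home/hzq/software/hadoop/hadoop-2.8.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin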
3、Configuring Hadoop:
①、Edit hadoop-env.sh and set JAVA_HOME:
export JAVA_HOME=/home/hzq/software/jdk1.8.0_131
②、Edit core-site.xml:
<configuration>
  <!-- Default file system: the HDFS nameservice defined in hdfs-site.xml -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hzqnns/</value>
  </property>
  <!-- Base directory for Hadoop's runtime files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hzq/software/hadoop/data</value>
  </property>
  <!-- ZooKeeper ensemble used for automatic NameNode failover -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>centos78:2181,centos79:2181,centos710:2181</value>
  </property>
</configuration>
③、Edit hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- dfs.blocksize is the Hadoop 2.x name for the deprecated dfs.block.size -->
  <property>
    <name>dfs.blocksize</name>
    <value>64m</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>hzqnns</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.hzqnns</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hzqnns.nn1</name>
    <value>centos71:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hzqnns.nn1</name>
    <value>centos71:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hzqnns.nn2</name>
    <value>centos72:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hzqnns.nn2</name>
    <value>centos72:50070</value>
  </property>
  <!-- JournalNode quorum that holds the shared edit log -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://centos75:8485;centos76:8485;centos77:8485/hzqnns</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hzq/software/hadoop/data/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hzqnns</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing: try sshfence first, then fall back to a no-op so failover never blocks -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hzq/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
④、mapred-site.xml
Rename "mapred-site.xml.template":
mv mapred-site.xml.template mapred-site.xml
Then edit mapred-site.xml:
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
⑤、Edit yarn-site.xml:
<configuration>
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>centos73</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>centos74</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- ZooKeeper ensemble used by ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>centos78:2181,centos79:2181,centos710:2181</value>
  </property>
</configuration>
⑥、Configure the DataNode hosts by editing the slaves file:
centos75
centos76
centos77
⑦、Copy the configured hadoop directory to the other six hosts:
scp -r hadoop/ centos72:/home/hzq/software/
scp -r hadoop/ centos73:/home/hzq/software/
scp -r hadoop/ centos74:/home/hzq/software/
scp -r hadoop/ centos75:/home/hzq/software/
scp -r hadoop/ centos76:/home/hzq/software/
scp -r hadoop/ centos77:/home/hzq/software/
4、Starting Hadoop (on first startup, the steps must be executed in this order)
①、Start the ZooKeeper cluster (run on centos78, centos79, and centos710), then check its status:
zkServer.sh start
zkServer.sh status
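If the ensemble is healthy, zkServer.sh status reports "Mode: leader" on exactly one of the three nodes and "Mode: follower" on the other two.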
②、Start the JournalNodes (run on centos75, centos76, and centos77):
hadoop-daemon.sh start journalnode
Note: run jps to verify startup; if it succeeded, centos75, centos76, and centos77 each show an extra JournalNode process.
③、Format HDFS on centos71 (the JournalNodes from step ② must already be running, since formatting also initializes the shared edits directory on them):
hdfs namenode -format
④、Bring the two NameNodes' metadata into sync by copying the data directory on centos71 to centos72:
scp -r data/ centos72:/home/hzq/software/hadoop/data
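An alternative to the scp copy is to bootstrap the standby from the running active NameNode: start the freshly formatted NameNode on centos71 first, then pull the initial metadata from it on centos72:
hadoop-daemon.sh start namenode    # on centos71: bring up the formatted NameNode
hdfs namenode -bootstrapStandby    # on centos72: fetch the initial metadata from nn1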
⑤、Format ZKFC on centos71 (this creates the HA state znode in ZooKeeper):
hdfs zkfc -formatZK
⑥、Start HDFS from centos71 (with this HA configuration, start-dfs.sh brings up both NameNodes, the DataNodes listed in slaves, the JournalNodes, and the ZKFC daemons):
start-dfs.sh
⑦、Start the ResourceManager and NodeManagers from centos73:
start-yarn.sh
⑧、Start the standby ResourceManager on centos74 (start-yarn.sh does not start the second ResourceManager):
yarn-daemon.sh start resourcemanager
5、Verifying the deployment:
①、Run jps on each host and check the processes against the cluster plan in section 1.
②、HDFS web UI: http://centos71:50070 or http://centos72:50070
③、YARN ResourceManager web UI: http://centos73:8088 or http://centos74:8088
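Beyond the process and UI checks, a quick functional smoke test can be run from any Hadoop host (the file and directory names here are illustrative):
hdfs dfs -mkdir -p /test           # create a directory in HDFS
hdfs dfs -put /etc/hosts /test/    # upload a local file
hdfs dfs -ls /test                 # confirm the file is listed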
6、Common commands:
hdfs dfsadmin -report                   # report HDFS capacity and DataNode status
hdfs haadmin -getServiceState nn1       # show whether nn1 is active or standby
hadoop-daemon.sh start namenode         # start a single NameNode on the current host
hadoop-daemon.sh start zkfc             # start a single ZKFC on the current host
yarn-daemon.sh start resourcemanager    # start a single ResourceManager on the current host
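A simple way to exercise automatic failover, assuming nn1 is currently active (<NameNode pid> is a placeholder taken from the jps output):
jps                                  # on centos71: find the NameNode pid
kill -9 <NameNode pid>               # simulate a crash of the active NameNode
hdfs haadmin -getServiceState nn2    # should now report "active"
hadoop-daemon.sh start namenode      # restart the killed NameNode; it rejoins as standby
The YARN counterpart for checking ResourceManager HA state is yarn rmadmin -getServiceState rm1.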
7、Summary
1、This cluster was built purely for learning; no tuning or hardening has been done.
2、Comments and corrections from experienced readers are very welcome.