OS environment: CentOS 7.2
Hadoop: 3.2.1
Cluster IP addresses:
192.168.1.10 hadoop
192.168.1.11 slave1
192.168.1.12 slave2
192.168.1.13 slave3
vi /etc/hostname
# /etc/hostname should contain only the hostname itself; on the master node:
hadoop
Restart the network service (the new hostname takes full effect after a reboot, or can be applied immediately with hostnamectl set-hostname hadoop):
systemctl restart network
Ping test:
ping hadoop
# Check the firewall status
systemctl status firewalld
systemctl is-active firewalld
# Start the firewall
systemctl start firewalld
# Stop the firewall
systemctl stop firewalld
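For a test cluster it is common to simply keep firewalld off on every node so the Hadoop ports are reachable between machines; a minimal sketch, run as root on each node:
# Stop firewalld now and keep it disabled across reboots (lab/test environments only)
systemctl stop firewalld
systemctl disable firewalld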
vim /etc/hosts
# Fill in the cluster addresses according to your own environment
192.168.1.10 hadoop
192.168.1.11 slave1
192.168.1.12 slave2
192.168.1.13 slave3
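Once /etc/hosts is in place, a quick loop can confirm that every node name resolves and responds; a minimal sketch:
# Ping each cluster host once by name; the hostnames are the ones listed in /etc/hosts above
for host in hadoop slave1 slave2 slave3; do
    ping -c 1 "$host" > /dev/null && echo "$host reachable" || echo "$host NOT reachable"
done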
ssh-keygen -t rsa
# Press Enter three times to accept the defaults
# Copy the public key into authorized_keys
cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
ssh localhost
# In practice the public key must be copied to every machine in the cluster (see the sketch below)
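A minimal sketch for distributing the key, assuming ssh-copy-id is available and the root password of each slave is known:
# Push the master's public key to every slave (asks for each root password once)
for host in slave1 slave2 slave3; do
    ssh-copy-id -i /root/.ssh/id_rsa.pub root@"$host"
done
# Verify that passwordless login works
for host in slave1 slave2 slave3; do
    ssh root@"$host" hostname
done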
# Download directly on Linux
wget https://www-us.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
# Or download in a browser and upload manually
https://www-us.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
Installation documentation:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
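The rest of this guide assumes Hadoop is unpacked under /opt/hadoop-3.2.1, so extract the downloaded archive there first; a minimal sketch (adjust the path to wherever the tarball was saved):
# Unpack the Hadoop archive into /opt
tar -xzf hadoop-3.2.1.tar.gz -C /opt/
ls /opt/hadoop-3.2.1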
mkdir -p /opt/hadoop-3.2.1/tmp /opt/hadoop-3.2.1/hdfs /opt/hadoop-3.2.1/hdfs/data /opt/hadoop-3.2.1/hdfs/name
# Configuration files to edit:
/opt/hadoop-3.2.1/etc/hadoop/hadoop-env.sh
/opt/hadoop-3.2.1/etc/hadoop/core-site.xml
/opt/hadoop-3.2.1/etc/hadoop/hdfs-site.xml
/opt/hadoop-3.2.1/etc/hadoop/mapred-site.xml
/opt/hadoop-3.2.1/etc/hadoop/yarn-site.xml
/opt/hadoop-3.2.1/etc/hadoop/workers
vim /opt/hadoop-3.2.1/etc/hadoop/hadoop-env.sh
# Note: set this to your own Java installation directory
export JAVA_HOME=/usr/local/jdk/
# Append the following at the end of the file
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
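Since /etc/profile is copied to the slaves later on, it presumably also carries the Hadoop environment variables. A minimal sketch of what to append to /etc/profile (the JAVA_HOME path is the one assumed above):
# Append to /etc/profile, then apply with: source /etc/profile
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/opt/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin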
vim /opt/hadoop-3.2.1/etc/hadoop/core-site.xml
# Modify as follows
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop:9000</value>
        <description>HDFS URI: filesystem://namenode-host:port</description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-3.2.1/tmp</value>
        <description>Local Hadoop temporary directory on the NameNode</description>
    </property>
</configuration>
Notes:
Use fs.defaultFS for the NameNode address; the older fs.default.name is not recommended.
hadoop.tmp.dir is the base directory that many other Hadoop paths depend on. If the NameNode and DataNode storage locations are not configured in hdfs-site.xml, they default to the following paths:
NameNode: dfs.namenode.name.dir (formerly dfs.name.dir), default ${hadoop.tmp.dir}/dfs/name
DataNode: dfs.datanode.data.dir (formerly dfs.data.dir), default ${hadoop.tmp.dir}/dfs/data
vim /opt/hadoop-3.2.1/etc/hadoop/hdfs-site.xml
# Modify the following configuration
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
        <description>Number of HDFS block replicas</description>
    </property>
</configuration>
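The directories /opt/hadoop-3.2.1/hdfs/name and /opt/hadoop-3.2.1/hdfs/data created earlier are only used if the storage locations are set explicitly; otherwise the defaults noted above apply. A sketch of the extra properties that could be added inside the <configuration> element of hdfs-site.xml (Hadoop 3 property names; the paths are the ones created earlier):
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///opt/hadoop-3.2.1/hdfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///opt/hadoop-3.2.1/hdfs/data</value>
</property>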
vim /opt/hadoop-3.2.1/etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
vim /opt/hadoop-3.2.1/etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
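Before distributing the configuration, it can be worth checking that each edited file is still well-formed XML; a quick sketch using xmllint (from the libxml2 package, which may need to be installed first):
# Report XML syntax errors, print OK when a file parses cleanly
for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    xmllint --noout /opt/hadoop-3.2.1/etc/hadoop/"$f" && echo "$f OK"
done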
vim /opt/hadoop-3.2.1/etc/hadoop/workers
slave1
slave2
slave3
Copy the local hosts and profile configuration files to slave1, slave2, and slave3:
scp /etc/hosts /etc/profile root@slave1:/etc
scp /etc/hosts /etc/profile root@slave2:/etc
scp /etc/hosts /etc/profile root@slave3:/etc
Archive the configured Hadoop directory (run from /opt):
cd /opt
tar cf hadoop-3.2.1.tar hadoop-3.2.1
scp hadoop-3.2.1.tar root@slave1:/opt
scp hadoop-3.2.1.tar root@slave2:/opt
scp hadoop-3.2.1.tar root@slave3:/opt
ssh root@slave1 "tar -xf /opt/hadoop-3.2.1.tar -C /opt/"
ssh root@slave2 "tar -xf /opt/hadoop-3.2.1.tar -C /opt/"
ssh root@slave3 "tar -xf /opt/hadoop-3.2.1.tar -C /opt/"
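The same distribution can be written as one loop, which is convenient if more slaves are added later; a sketch equivalent to the commands above:
# Copy the archive to each slave and unpack it remotely
for host in slave1 slave2 slave3; do
    scp /opt/hadoop-3.2.1.tar root@"$host":/opt
    ssh root@"$host" "tar -xf /opt/hadoop-3.2.1.tar -C /opt/"
done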
/opt/hadoop-3.2.1/bin/hdfs namenode -format cluster_name
If the output contains "has been successfully formatted.", the format succeeded.
/opt/hadoop-3.2.1/sbin/start-all.sh
Run jps on each node; the following processes should appear:
[root]# jps
NodeManager (on the .11/.12/.13 machines, i.e. the slave1, slave2, slave3 nodes)
DataNode (on the .11/.12/.13 machines, i.e. the slave1, slave2, slave3 nodes)
NameNode (on the .10 machine, i.e. the hadoop master node)
SecondaryNameNode (on the .10 machine, i.e. the hadoop master node)
ResourceManager (on the .10 machine, i.e. the hadoop master node)
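The same check can be run from the master in one pass over SSH; a minimal sketch (assumes jps is on the PATH of non-interactive shells on every node, otherwise use the full path under the JDK's bin directory):
# List the Java processes running on every node
for host in hadoop slave1 slave2 slave3; do
    echo "=== $host ==="
    ssh root@"$host" jps
done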
http://192.168.1.10:9870 (NameNode information; served on the master node)
http://192.168.1.10:8088/cluster (All Applications: cluster status and running jobs; served on the master node)
http://192.168.1.(11|12|13):8042/node (NodeManager web UI; on each slave/DataNode node)
http://192.168.1.(11|12|13):9864/datanode.html (DataNode web UI, data storage; on each slave/DataNode node)
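A quick way to confirm the web UIs are up without a browser is to probe them with curl from the master; a sketch (replace the addresses with your own):
# Expect HTTP status 200 from the NameNode and ResourceManager web UIs
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.1.10:9870
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.1.10:8088/cluster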