1. Prepare the environment: three CentOS 7 virtual machines at 192.168.2.150, 192.168.2.151, and 192.168.2.152.
2. Create the hadoop user
useradd -d /home/hadoop -m hadoop
3. Set the hadoop user's password
passwd hadoop
4. Set the hostnames (set the three machines' hostnames to master, slave1, and slave2 respectively)
hostnamectl set-hostname master
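The same command is run on the other two machines with their respective names:
hostnamectl set-hostname slave1    # on 192.168.2.151
hostnamectl set-hostname slave2    # on 192.168.2.152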
5. Configure the hosts file: add the following entries to /etc/hosts on every node
192.168.2.150 master
192.168.2.151 slave1
192.168.2.152 slave2
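One way to append these entries on each node (a sketch, run as root; adjust if /etc/hosts already contains conflicting entries):
cat >> /etc/hosts <<'EOF'
192.168.2.150 master
192.168.2.151 slave1
192.168.2.152 slave2
EOF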
6. Configure passwordless SSH login so that every pair of nodes can reach each other
ssh-keygen -t rsa
ssh-copy-id uname@hostname
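As a concrete sketch of the whole key exchange, the following can be run on each of the three nodes as the hadoop user (the empty passphrase and the hadoop@ login are assumptions based on the setup above):
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa    # generate a key pair without a passphrase
for host in master slave1 slave2; do
    ssh-copy-id hadoop@$host                # push the public key to every node, including this one
done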
7. Install the JDK and configure its environment variables
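A minimal sketch, assuming a JDK 8 tarball has already been downloaded (the archive and extracted directory names below are placeholders) and installing to the /usr/local/jdk path used later in hadoop-env.sh:
tar -zxvf jdk-8u181-linux-x64.tar.gz -C /usr/local/    # hypothetical archive name
mv /usr/local/jdk1.8.0_181 /usr/local/jdk              # normalize the install path
echo 'export JAVA_HOME=/usr/local/jdk' >> /etc/profile
echo 'export PATH=$PATH:$JAVA_HOME/bin' >> /etc/profile
source /etc/profile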
8. Download the Hadoop distribution and extract it into the working directory
cd /home/hadoop/work
tar -zxvf hadoop-2.8.3.tar.gz
mv hadoop-2.8.3 hadoop
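For reference, the 2.8.3 archive used above can be fetched from the Apache archive (URL pattern assumed; verify it before relying on it):
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.8.3/hadoop-2.8.3.tar.gz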
9. Create the hdfs directories under the working directory
cd /home/hadoop/work
mkdir hdfs
cd hdfs
mkdir data name tmp
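Equivalently, as a single command:
mkdir -p /home/hadoop/work/hdfs/{data,name,tmp}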
10. Add the Hadoop environment variables to the system
export HADOOP_HOME=/home/hadoop/work/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
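These export lines need to go into a file that is sourced at login; a common choice is the hadoop user's ~/.bashrc (the exact file is an assumption). After appending them, reload and verify:
source ~/.bashrc
hadoop version    # should print the Hadoop 2.8.3 version banner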
11. Add JAVA_HOME to Hadoop
Append the following line to /home/hadoop/work/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk
The configuration involves five files: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and slaves, one per component. They are located in /home/hadoop/work/hadoop/etc/hadoop/ and are described below:
File | Description
---|---
core-site.xml | Common component
hdfs-site.xml | HDFS component
mapred-site.xml | MapReduce component
yarn-site.xml | YARN component
slaves | slave node list
1. core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/work/hdfs/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
2. hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/work/hdfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/work/hdfs/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
3. mapred-site.xml
The MapReduce configuration file ships as a template, mapred-site.xml.template; copy it first:
cp mapred-site.xml.template mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
4. yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:18040</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:18030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:18088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:18025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:18141</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
5. Edit the slaves file and add the slave node information
Remove the original localhost entry and replace it with the lines below. The purpose of the slaves file is to tie all the nodes together into a single cluster, so that starting Hadoop starts the whole cluster at once.
slave1
slave2
6. Copy the configured hadoop directory to each of the other nodes
cd /home/hadoop/work
scp -r hadoop slave1:/home/hadoop/work/
scp -r hadoop slave2:/home/hadoop/work/
scp -r hdfs slave1:/home/hadoop/work/
scp -r hdfs slave2:/home/hadoop/work/
7. Format the NameNode
hadoop namenode -format
8. Start the Hadoop cluster
start-all.sh
The log output looks like this:
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
18/09/09 12:26:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
master: starting namenode, logging to /home/hadoop/work/hadoop/logs/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /home/hadoop/work/hadoop/logs/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /home/hadoop/work/hadoop/logs/hadoop-hadoop-datanode-slave1.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /home/hadoop/work/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
18/09/09 12:27:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/work/hadoop/logs/yarn-hadoop-resourcemanager-master.out
slave2: starting nodemanager, logging to /home/hadoop/work/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
slave1: starting nodemanager, logging to /home/hadoop/work/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
9. Check the processes on each node with jps
On the master node:
1664 NameNode
4657 Jps
2021 ResourceManager
1865 SecondaryNameNode
1515 QuorumPeerMain
On the slave nodes:
2145 Jps
1156 DataNode
1100 QuorumPeerMain
1244 NodeManager
10. Access the web UI
Open the master web UI to confirm that the other two nodes are running normally.
http://master:50070/
View the cluster nodes: http://master:18088/cluster/nodes
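As a final command-line check (a sketch, run on master as the hadoop user), the HDFS report should list both DataNodes, and a trivial directory operation should succeed:
hdfs dfsadmin -report    # shows live DataNodes and capacity
hdfs dfs -mkdir /test    # create a directory in HDFS
hdfs dfs -ls /           # the new directory should appear in the listing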