My Hadoop version is hadoop-2.6.0 with JDK 1.7. Linux VM tools: VMware and SecureCRT.
The end result we want looks like this:
Master:
[root@CentOS hadoop-2.6.0]# jps
2150 Jps
1837 NodeManager
1747 ResourceManager
1474 DataNode
1587 SecondaryNameNode
slave1:
[root@slave1 ~]# jps
1282 DataNode
1348 NodeManager
1483 Jps
slave2:
1477 Jps
1342 NodeManager
1276 DataNode
Note: the master acts as both a master and a slave.
Let's set up the master environment first.
Setting up the master is essentially the same as the standalone HDFS setup in my article from two posts back. I'll walk through the steps again in detail here:
First, configure the master
1. Set the hostname:
[root@CentOS ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=CentOS
2. Install the JDK and configure the environment variables. The environment variables can be set in several places; I set them in the hidden .bashrc file in root's home directory (ls -a shows hidden files),
but you can also set them in /etc/profile. I won't dwell on this here; the configuration is shown below.
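As a rough sketch of the JDK install itself, assuming the tarball is named jdk-7u65-linux-x64.tar.gz and sits in the current directory (use whatever JDK 7 archive you actually downloaded):
[root@CentOS ~]# mkdir -p /usr/local/jdk
[root@CentOS ~]# tar -zxvf jdk-7u65-linux-x64.tar.gz -C /usr/local/jdk/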
3. Install Hadoop
My Hadoop tarball is hadoop-2.6.0.tar.gz; extract it to /usr/local/hadoop/ as well.
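For example (again assuming the tarball is in the current directory):
[root@CentOS ~]# mkdir -p /usr/local/hadoop
[root@CentOS ~]# tar -zxvf hadoop-2.6.0.tar.gz -C /usr/local/hadoop/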
Hadoop also needs environment variables. Edit .bashrc (vi .bashrc) or /etc/profile (vi /etc/profile) and add the following (this covers both the JDK and Hadoop):
export CLASSPATH=.
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export PATH=$HADOOP_HOME/bin:$PATH
export JAVA_HOME=/usr/local/jdk/jdk1.7.0_65
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
After configuring, run source /etc/profile or source .bashrc to apply the changes (a reboot works too).
Next, verify that the JDK and Hadoop are configured correctly:
[root@CentOS ~]# java -version
java version "1.7.0_65"
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) Client VM (build 24.65-b04, mixed mode)
[root@CentOS ~]# hadoop version
Hadoop 2.6.0
The output above means the installation succeeded.
4. Configure the hostname-to-IP mappings for the master and slaves
[root@CentOS ~]# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.19.128 CentOS    # this will be my default master; it is referenced in the Hadoop config later
192.168.19.130 slave1    # if the cloned slave ends up with a different IP, change it here
192.168.19.131 slave2
Note: slave1 and slave2 don't exist yet; configure them here anyway. Once the master is fully configured, we will clone two slaves from it, which saves having to configure the slaves separately.
5. Edit the Hadoop configuration files. Go to the Hadoop installation directory:
[root@CentOS ~]# cd /usr/local/hadoop/hadoop-2.6.0/
Edit core-site.xml:
[root@CentOS hadoop-2.6.0]# vi etc/hadoop/core-site.xml
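A minimal core-site.xml for this layout might look like the following sketch (the port 9000 and the hadoop.tmp.dir path are assumptions; adjust them to your environment):
<configuration>
  <property>
    <!-- address of the NameNode; CentOS is the master hostname mapped in /etc/hosts -->
    <name>fs.defaultFS</name>
    <value>hdfs://CentOS:9000</value>
  </property>
  <property>
    <!-- base directory for Hadoop's runtime files (assumed path) -->
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-2.6.0/tmp</value>
  </property>
</configuration>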
Edit hdfs-site.xml:
[root@CentOS hadoop-2.6.0]# vi etc/hadoop/hdfs-site.xml
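Again only a sketch, assuming you want three block replicas to match the three DataNodes in this cluster:
<configuration>
  <property>
    <!-- number of block replicas; 3 matches the three DataNodes here -->
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>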
Edit mapred-site.xml:
[root@CentOS hadoop-2.6.0]# vi etc/hadoop/mapred-site.xml
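In Hadoop 2.6.0 this file ships as mapred-site.xml.template, so you may need to copy it first (cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml). A minimal sketch that runs MapReduce on YARN:
<configuration>
  <property>
    <!-- run MapReduce jobs on YARN -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>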
Edit the masters file to declare which node is the master:
[root@CentOS hadoop-2.6.0]# vi etc/hadoop/masters
CentOS
Note: CentOS is set as the master here.
Edit the slaves file to declare which nodes are the slaves:
[root@CentOS hadoop-2.6.0]# vi etc/hadoop/slaves
CentOS
slave1
slave2
Note: this makes three slaves in total; CentOS also acts as a slave of itself.
6. Disable the firewall
[root@CentOS ~]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
[root@CentOS ~]# chkconfig --del iptables    # keep the firewall from starting on boot
7. Clone the virtual machines
I use VMware's clone feature to make two full clones of the master CentOS, slave1 and slave2, then change their hostnames and IP addresses. This is an easy way to keep the Hadoop environment and base configuration identical on all machines.
Note: after cloning, be sure to change the new machines' hostnames (see step 1). Also revisit step 4 and make sure the slaves' IPs are correct and the /etc/hosts entries are consistent across all three machines.
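For example, on the clone that becomes slave1 (eth0 and a static address are assumptions; if the clone uses DHCP, just check the address it actually got):
[root@slave1 ~]# vi /etc/sysconfig/network                        # set HOSTNAME=slave1
[root@slave1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-eth0     # set IPADDR=192.168.19.130 if static
[root@slave1 ~]# service network restart                          # or reboot so the changes take effect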
8. Set up passwordless SSH login:
(1) Generate a public/private key pair on the master
[root@CentOS ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
06:76:81:51:1f:94:7c:02:6b:49:c5:e8:cc:80:df:8b root@CentOS
The key's randomart image is:
+--[ DSA 1024]----+
| ..+=B+. |
| . o..==.. |
| .o*= .o |
| ..+= |
| .S. |
| E.. |
| |
| |
| |
+-----------------+
(2) Copy the public key to the target machines slave1 and slave2
[root@CentOS ~]# scp -r ~/.ssh/id_dsa.pub 192.168.19.130:/root/
[root@CentOS ~]# scp -r ~/.ssh/id_dsa.pub 192.168.19.131:/root/
Note: you will usually be prompted for the target machine's login password; just enter it.
(3) On the target machines, append the uploaded public key to the trusted keys. Do this on both slave1 and slave2:
[root@slave1 ~]# cat ~/id_dsa.pub >> ~/.ssh/authorized_keys
[root@slave2 ~]# cat ~/id_dsa.pub >> ~/.ssh/authorized_keys
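One caveat: since CentOS itself is also listed in slaves, start-dfs.sh will SSH back into the master as well, so it is probably worth adding the master's own public key to its local trusted list too:
[root@CentOS ~]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys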
(4) Verify that you can SSH to slave1 without a password
[root@CentOS hadoop-2.6.0]# ssh slave1
Last login: Sun Aug 13 01:51:37 2017 from slave1
The line above shows a successful login; the exit below logs out.
[root@slave1 ~]# exit
logout
Connection to slave1 closed.
Note: we only need passwordless SSH from the master to the slaves. The trade-off is that the Hadoop services can then only be started or stopped from the master (i.e. the CentOS host).
9. Format the NameNode (this creates the fsimage file; format only before the first start)
[root@CentOS hadoop-2.6.0]# ./bin/hadoop namenode -format
16/07/27 23:25:12 INFO namenode.NameNode: STARTUP_MSG:
17/08/13 02:02:56 INFO namenode.FSImage: Allocated new BlockPoolId: BP-2020023440-192.168.19.128-1502560975898
These are a few of the lines the command prints. If a line near the end of the output says "successfully formatted", the format succeeded.
10. Now start Hadoop
[root@CentOS hadoop-2.6.0]# ./sbin/start-all.sh
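start-all.sh is deprecated in Hadoop 2.x (the stop output in step 11 says the same); it simply runs the HDFS and YARN start scripts in turn, so the equivalent is:
[root@CentOS hadoop-2.6.0]# ./sbin/start-dfs.sh
[root@CentOS hadoop-2.6.0]# ./sbin/start-yarn.sh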
Check that everything started:
master:
[root@CentOS hadoop-2.6.0]# jps
2150 Jps
1837 NodeManager
1747 ResourceManager
1474 DataNode
1587 SecondaryNameNode
slave1:
[root@slave1 ~]# jps
1282 DataNode
1348 NodeManager
1483 Jps
slave2:
1477 Jps
1342 NodeManager
1276 DataNode
11. Stop Hadoop
[root@CentOS hadoop-2.6.0]# ./sbin/stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [CentOS]
CentOS: stopping namenode
slave2: stopping datanode
CentOS: stopping datanode
slave1: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
stopping yarn daemons
stopping resourcemanager
slave2: stopping nodemanager
slave1: stopping nodemanager
CentOS: stopping nodemanager
slave2: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
slave1: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop