I. Environment Setup


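A cluster needs at least three servers, so I reused the MongoDB Master, Slave, Arbiter environment from my previous experiment for the Hadoop cluster. The servers are again the free ones provided by ibmcloud. In this setup the Arbiter also acts as a slave.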

Hostname  IP            Server Type
Master    192.168.0.28  CentOS 6.2
Slave     192.168.0.29  Ubuntu 14.04
Arbiter   192.168.0.30  Ubuntu 14.04

Configure the hosts file on all three machines. The Master's hosts file looks like this:

$ cat /etc/hosts
127.0.0.1   localhost Database-Master  localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.28	Database-Master master
192.168.0.29	Database-Slave slave
192.168.0.30	Database-Arbiter arbiter
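
Before going further it is worth confirming that every short name used later (master, slave, arbiter) really appears in the hosts file on each machine. The following is a minimal sketch; the function name missing_hosts is my own and not part of any tool.

```shell
# missing_hosts FILE NAME...
# Print every NAME that has no entry in the hosts-format FILE.
# Returns non-zero if at least one name is missing.
# (Hypothetical helper, written for this article only.)
missing_hosts() {
    file=$1
    shift
    rc=0
    for name in "$@"; do
        # Strip comments, then look for the name in any column after the address.
        if ! sed 's/#.*//' "$file" | awk -v n="$name" '
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }'; then
            echo "$name"
            rc=1
        fi
    done
    return $rc
}
```

On each node, running missing_hosts /etc/hosts master slave arbiter should print nothing if the file is complete.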

Ansible is installed on the Master machine. The other required packages can be downloaded from:

http://apache.fayea.com/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

http://mirrors.hust.edu.cn/apache/zookeeper/zookeeper-3.4.8/zookeeper-3.4.8.tar.gz

http://apache.opencas.org/hbase/1.2.0/hbase-1.2.0-bin.tar.gz

http://download.oracle.com/otn-pub/java/jdk/8u73-b02/jdk-8u73-linux-x64.tar.gz


I extracted Java into /usr/java/ and then added the environment variables to .zshrc:

export JAVA_HOME=/usr/java/jdk1.8.0_73
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

Then reload the file so the variables take effect: source .zshrc
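
A wrong JAVA_HOME only shows up later as a confusing Hadoop or HBase startup failure, so a quick check here can save time. This is a small sketch under the assumption that the directory is a JDK 8 layout with bin/java and lib/tools.jar; check_java_home is a name I made up.

```shell
# check_java_home DIR
# Succeeds only if DIR looks like a JDK 8 install:
# an executable bin/java plus lib/tools.jar.
# (Hypothetical helper, not part of the JDK.)
check_java_home() {
    if [ ! -x "$1/bin/java" ]; then
        echo "no executable java under $1/bin" >&2
        return 1
    fi
    if [ ! -f "$1/lib/tools.jar" ]; then
        echo "lib/tools.jar not found under $1 (JRE instead of JDK?)" >&2
        return 1
    fi
    return 0
}
```

Typical use after reloading the shell configuration: check_java_home "$JAVA_HOME" && java -version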

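The cluster nodes also need passwordless SSH login to each other. I already set that up for the MongoDB experiment, so I won't repeat it here.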


II. Installing and Configuring Hadoop


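1.    First extract the hadoop-2.6.4.tar.gz downloaded earlier to /home/ibmcloud/hadoop, then edit etc/hadoop/core-site.xml

<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://master:9000</value>
	</property>
</configuration>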

2.    Add the JAVA_HOME variable to hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.8.0_73

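3.    Edit hdfs-site.xml

<configuration>
	<property>
		<name>dfs.name.dir</name>
		<value>/home/ibmcloud/hadoop/name</value>
	</property>
	<property>
		<name>dfs.data.dir</name>
		<value>/home/ibmcloud/hadoop/data</value>
	</property>
	<property>
		<name>dfs.replication</name>
		<value>3</value>
	</property>
</configuration>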

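4.    Rename mapred-site.xml.template to mapred-site.xml and edit it

<configuration>
	<property>
		<name>mapred.job.tracker</name>
		<value>master:9001</value>
	</property>
</configuration>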

5.    Add the master and slave nodes

echo "master" >~/hadoop/etc/hadoop/master
echo -e "slave\narbiter" >~/hadoop/etc/hadoop/slaves

6.    Copy the hadoop folder to slave and arbiter

ansible all -m copy -a "src=hadoop dest=~"

7.    Start the Hadoop cluster

The NameNode must be formatted before the first start. Later starts skip this step.

hadoop/bin/hadoop namenode -format

Then start Hadoop:

hadoop/sbin/start-all.sh

Once startup finishes without errors, run jps to list the current processes. NameNode is the Hadoop master process; SecondaryNameNode and ResourceManager are also Hadoop processes.

$ jps                      
23076 NameNode
20788 ResourceManager
23302 SecondaryNameNode
27559 Jps
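
Reading jps output by eye on three machines gets tedious, so the check can be scripted. The sketch below reads jps output on standard input and reports which expected daemons are absent; missing_daemons is a hypothetical name of mine, and the lists of expected daemons are taken from the jps listings in this article.

```shell
# missing_daemons NAME...
# Read `jps` output on stdin and print each expected daemon NAME that is absent.
# Returns non-zero if anything is missing.
# (Hypothetical helper, written for this article only.)
missing_daemons() {
    jps_output=$(cat)
    rc=0
    for daemon in "$@"; do
        # jps prints "<pid> <main class>"; compare the second column exactly,
        # so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" | awk -v d="$daemon" '
                $2 == d { found = 1 }
                END { exit found ? 0 : 1 }'; then
            echo "$daemon"
            rc=1
        fi
    done
    return $rc
}
```

On the Master this would be used as jps | missing_daemons NameNode SecondaryNameNode ResourceManager, and on the two worker nodes as jps | missing_daemons DataNode. No output means everything expected is running.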

III. Installing the ZooKeeper Cluster


1.    Extract zookeeper-3.4.8.tar.gz and rename the directory to zookeeper. Go into zookeeper/conf, run cp zoo_sample.cfg zoo.cfg, and edit zoo.cfg

$ egrep -v '^$|^#' zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/ibmcloud/zookeeper/data
clientPort=2181
server.1=192.168.0.28:2888:3888
server.2=192.168.0.29:2888:3888
server.3=192.168.0.30:2888:3888

2.    Create and edit the myid file

mkdir -p /home/ibmcloud/zookeeper/data
echo "1" > /home/ibmcloud/zookeeper/data/myid

3.    Sync zookeeper to the other two nodes, then change myid on each of them to the matching number.

ansible all -m copy -a "src=zookeeper dest=~"
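
The number in myid has to match the N of that node's server.N line in zoo.cfg, and editing it by hand on every node is easy to get wrong. As a rough sketch, the id can be looked up from zoo.cfg itself; myid_for is a helper name I invented for illustration.

```shell
# myid_for ADDRESS CFG
# Print the N of the "server.N=ADDRESS:port:port" line in CFG whose host part
# equals ADDRESS. Returns non-zero if ADDRESS is not listed.
# (Hypothetical helper, not shipped with ZooKeeper.)
myid_for() {
    awk -v addr="$1" '
        /^server\.[0-9]+=/ {
            split($0, kv, "=")        # kv[1] = "server.N", kv[2] = "host:2888:3888"
            split(kv[2], hp, ":")     # hp[1] = host name or IP address
            if (hp[1] == addr) {
                sub(/^server\./, "", kv[1])
                print kv[1]
                found = 1
            }
        }
        END { exit found ? 0 : 1 }' "$2"
}
```

On the Slave, for example: id=$(myid_for 192.168.0.29 ~/zookeeper/conf/zoo.cfg) && echo "$id" > ~/zookeeper/data/myid. Capturing the id first keeps an existing myid intact when the address is not found.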

4.    Start zookeeper and check the startup log

2016-03-15 06:43:00,421 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@645] - Established session 0x15378e1050a0005 with negotiated timeout 40000 for client /192.168.0.28:57372
2016-03-15 06:43:01,755 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.28:57379
2016-03-15 06:43:01,757 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.28:57379
2016-03-15 06:43:01,760 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@645] - Established session 0x15378e1050a0006 with negotiated timeout 40000 for client /192.168.0.28:57379
2016-03-15 06:43:02,211 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.28:57383
2016-03-15 06:43:02,215 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.28:57383
2016-03-15 06:43:02,217 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@645] - Established session 0x15378e1050a0007 with negotiated timeout 40000 for client /192.168.0.28:57383
2016-03-15 06:46:57,531 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /192.168.0.28:57379 which had sessionid 0x15378e1050a0006
2016-03-15 06:46:57,544 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /192.168.0.28:57383 which had sessionid 0x15378e1050a0007
2016-03-15 06:46:57,555 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /192.168.0.28:57372 which had sessionid 0x15378e1050a0005
2016-03-15 06:47:10,171 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.30:60866
2016-03-15 06:47:10,184 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.30:60866
2016-03-15 06:47:10,186 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@645] - Established session 0x15378e1050a0008 with negotiated timeout 40000 for client /192.168.0.30:60866
2016-03-15 06:47:10,625 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.28:58169
2016-03-15 06:47:10,626 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.28:58169
2016-03-15 06:47:10,629 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@645] - Established session 0x15378e1050a0009 with negotiated timeout 40000 for client /192.168.0.28:58169
2016-03-15 06:47:11,199 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.30:60867
2016-03-15 06:47:11,200 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.30:60867
2016-03-15 06:47:11,204 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@645] - Established session 0x15378e1050a000a with negotiated timeout 40000 for client /192.168.0.30:60867

    Messages from the Slave:

2016-03-15 06:43:02,667 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.28:58604
2016-03-15 06:43:02,667 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.28:58604
2016-03-15 06:43:02,670 [myid:2] - INFO  [CommitProcessor:2:ZooKeeperServer@645] - Established session 0x25378e0edf00006 with negotiated timeout 40000 for client /192.168.0.28:58604
2016-03-15 06:46:55,407 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.28:59328
2016-03-15 06:46:55,410 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.28:59328
2016-03-15 06:46:55,415 [myid:2] - INFO  [CommitProcessor:2:ZooKeeperServer@645] - Established session 0x25378e0edf00007 with negotiated timeout 40000 for client /192.168.0.28:59328
2016-03-15 06:46:57,242 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /192.168.0.28:59328 which had sessionid 0x25378e0edf00007
2016-03-15 06:46:57,928 [myid:2] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x25378e0edf00006, likely client has closed socket
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:230)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
	at java.lang.Thread.run(Thread.java:745)
2016-03-15 06:46:57,929 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /192.168.0.28:58604 which had sessionid 0x25378e0edf00006
2016-03-15 06:47:08,780 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.28:59377
2016-03-15 06:47:08,786 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.28:59377
2016-03-15 06:47:08,789 [myid:2] - INFO  [CommitProcessor:2:ZooKeeperServer@645] - Established session 0x25378e0edf00008 with negotiated timeout 40000 for client /192.168.0.28:59377
2016-03-15 06:49:57,202 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.0.28:59911
2016-03-15 06:49:57,212 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@900] - Client attempting to establish new session at /192.168.0.28:59911
2016-03-15 06:49:57,215 [myid:2] - INFO  [CommitProcessor:2:ZooKeeperServer@645] - Established session 0x25378e0edf00009 with negotiated timeout 40000 for client /192.168.0.28:59911
2016-03-15 06:52:15,489 [myid:2] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x25378e0edf00009, likely client has closed socket
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:230)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
	at java.lang.Thread.run(Thread.java:745)
2016-03-15 06:52:15,490 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /192.168.0.28:59911 which had sessionid 0x25378e0edf00009
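
The logs show client sessions, but the quickest way to tell whether the ensemble has formed is zkServer.sh status, which reports each node's role on a "Mode:" line. The sketch below pulls that value out; zk_mode is a name of my own, and the sample status text in the comment is what I would expect from ZooKeeper 3.4.x rather than output captured from this cluster.

```shell
# zk_mode
# Read the output of `zkServer.sh status` on stdin and print the reported mode
# (for example "leader" or "follower").
# Returns non-zero when no "Mode:" line is present, which usually means the
# server is not running or has not joined a quorum yet.
# (Hypothetical helper, written for this article only.)
zk_mode() {
    awk '$1 == "Mode:" { print $2; found = 1 } END { exit found ? 0 : 1 }'
}
```

Usage on each node: zookeeper/bin/zkServer.sh status 2>&1 | zk_mode. With three healthy nodes, one should report leader and the other two follower.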

5.    Run jps again; the zookeeper process QuorumPeerMain now shows up

$ jps
23076 NameNode
20788 ResourceManager
30821 Jps
23302 SecondaryNameNode
30538 QuorumPeerMain

IV. Installing and Configuring the HBase Cluster


1.    Extract hbase-1.2.0-bin.tar.gz and rename the directory to hbase, then edit hbase/conf/hbase-env.sh

$ egrep -v '^$|^#' hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_73
export HBASE_CLASSPATH=/home/ibmcloud/hadoop/etc/hadoop
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
export HBASE_MANAGES_ZK=false

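2.    Edit hbase-site.xml

<configuration>
	<property>
		<name>hbase.rootdir</name>
		<value>hdfs://master:9000/hbase</value>
	</property>
	<property>
		<name>hbase.master</name>
		<value>master</value>
	</property>
	<property>
		<name>hbase.cluster.distributed</name>
		<value>true</value>
	</property>
	<property>
		<name>hbase.zookeeper.property.clientPort</name>
		<value>2181</value>
	</property>
	<property>
		<name>hbase.zookeeper.quorum</name>
		<value>master,slave,arbiter</value>
	</property>
	<property>
		<name>zookeeper.session.timeout</name>
		<value>60000000</value>
	</property>
	<property>
		<name>dfs.support.append</name>
		<value>true</value>
	</property>
</configuration>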

3.    Add Slave and Arbiter to regionservers
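
The regionservers file lives in hbase/conf and lists one region server hostname per line. A minimal sketch for writing it follows; write_regionservers is a made-up helper name, and the hostnames are the short names from the hosts file above.

```shell
# write_regionservers FILE HOST...
# Overwrite FILE with one region server hostname per line.
# (Hypothetical helper, not part of HBase.)
write_regionservers() {
    file=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: write_regionservers FILE HOST..." >&2
        return 1
    fi
    printf '%s\n' "$@" > "$file"
}
```

For this cluster: write_regionservers ~/hbase/conf/regionservers slave arbiter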

4.    Distribute hbase to the other two nodes

 ansible all -m copy -a "src=hbase dest=~"


V. Starting the Cluster


1.     Start zookeeper

zookeeper/bin/zkServer.sh start

2.    Start Hadoop

$ hadoop/sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
16/03/15 07:33:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
master: namenode running as process 23076. Stop it first.
arbiter: datanode running as process 2111. Stop it first.
slave: datanode running as process 1×××. Stop it first.
Starting secondary namenodes [0.0.0.0]
0.0.0.0: secondarynamenode running as process 23302. Stop it first.
16/03/15 07:33:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
resourcemanager running as process 20788. Stop it first.
arbiter: starting nodemanager, logging to /home/ibmcloud/hadoop/logs/yarn-ibmcloud-nodemanager-Database-Arbiter.out
slave: starting nodemanager, logging to /home/ibmcloud/hadoop/logs/yarn-ibmcloud-nodemanager-Database-Slave.out

3.    Start hbase

$ hbase/bin/start-hbase.sh
master running as process 10144. Stop it first.
arbiter: regionserver running as process 3515. Stop it first.
slave: starting regionserver, logging to /home/ibmcloud/hbase/bin/../logs/hbase-ibmcloud-regionserver-Database-Slave.out
slave: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
slave: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

Check the cluster processes on each node

Master:

# ibmcloud at Database-Master in ~ [7:33:44]
$ jps
10144 HMaster
23076 NameNode
20788 ResourceManager
20773 Jps
23302 SecondaryNameNode
30538 QuorumPeerMain

Slave:

# ibmcloud at Database-Slave in ~/hbase/bin [6:47:55]
$ jps
1××× DataNode
26794 Jps
16397 QuorumPeerMain
26526 HRegionServer

Arbiter:

# ibmcloud at Database-Arbiter in ~/hbase/bin [6:46:34]
$ jps
2016 QuorumPeerMain
3515 HRegionServer
3628 Jps
2111 DataNode

All processes are running, so enter the hbase shell:

# ibmcloud at Database-Master in ~ [7:34:03]
$ hbase/bin/hbase shell                

2016-03-15 07:35:04,687 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 1.2.0, r25b281972df2f5b15c426c8963cbf77dd853a5ad, Thu Feb 18 23:01:49 CST 2016

hbase(main):001:0> 
hbase(main):002:0* status

ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
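
PleaseHoldException just means the HMaster has not finished initializing, so the usual answer is to wait and ask again. Below is a small retry sketch. The retry function is generic; hbase_ready is a hypothetical helper of mine that assumes a successful status reply contains the words "average load", which is how I remember the HBase shell summary line, so adjust the pattern if your version prints something else.

```shell
# retry ATTEMPTS DELAY COMMAND [ARG...]
# Run COMMAND until it succeeds, sleeping DELAY seconds between tries.
# Gives up after ATTEMPTS tries and returns non-zero.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# hbase_ready (hypothetical): feed `status` to the HBase shell and succeed only
# if the reply looks like a normal summary line.
# Assumption: the summary contains "average load"; PleaseHoldException output does not.
hbase_ready() {
    echo "status" | hbase/bin/hbase shell 2>&1 | grep -q "average load"
}
```

Run from the home directory on the Master, for example: retry 30 10 hbase_ready && echo "HBase master is up". On a machine as small as mine, the number of attempts may need to be generous.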

The master is reported as still initializing. My virtual machine has 1.5 GB of RAM, a single core, and a 10 GB disk, and it runs MongoDB, PHP, and Nginx in addition to the Hadoop cluster, so it is clearly overloaded.


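Checking the connections showed that Hadoop listens on the internal network interface, so I added a port-forwarding rule in order to view Hadoop's status through the public address: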

sudo iptables -t nat -I PREROUTING -d 129.41.153.232 -p tcp --dport  50070 -j DNAT --to 192.168.0.28:50070
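
The other web UIs can be forwarded the same way. The sketch below only prints a rule in the same form as the one above so it can be reviewed before being run with sudo; dnat_rule is a name I made up, and the ports are the defaults as far as I know (8088 for the YARN ResourceManager UI, 16010 for the HBase 1.x Master UI), not values taken from this cluster's configuration.

```shell
# dnat_rule PUBLIC_IP INTERNAL_IP PORT
# Print (not run) an iptables DNAT rule that forwards PORT on PUBLIC_IP
# to the same PORT on INTERNAL_IP.
# (Hypothetical helper, written for this article only.)
dnat_rule() {
    printf 'iptables -t nat -I PREROUTING -d %s -p tcp --dport %s -j DNAT --to %s:%s\n' \
        "$1" "$3" "$2" "$3"
}
```

For example, dnat_rule 129.41.153.232 192.168.0.28 8088 prints the rule for the YARN UI.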

Then open http://129.41.153.232:50070/dfshealth.html#tab-overview in a browser to view the Hadoop status.

[Screenshot: Hadoop status]

[Screenshot: YARN status]

[Screenshot: HBase status]



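One problem I ran into was that hbase would not start and kept reporting permission denied. It turned out that the scripts in the bin directory on Slave and Arbiter lacked execute permission, and that the log files under logs had the wrong permissions.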
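
One possible cause, which I did not verify, is that copying the directory with the Ansible copy module did not carry over the execute bits. Whatever the cause, the fix for the scripts is a chmod on each affected node. A minimal sketch follows; make_scripts_executable is a hypothetical helper name.

```shell
# make_scripts_executable DIR
# Add the owner-execute bit to every regular file directly inside DIR.
# (Hypothetical helper, written for this article only.)
make_scripts_executable() {
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    find "$1" -maxdepth 1 -type f -exec chmod u+x {} +
}
```

On Slave and Arbiter this would be run as make_scripts_executable ~/hbase/bin. The log files can then be checked with ls -l ~/hbase/logs to confirm that they belong to the user that starts HBase.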


References:

http://songlee24.github.io/2015/07/20/hadoop-hbase-zookeeper-distributed-mode/

http://stackoverflow.com/questions/21166542/hbase-does-not-run-after-start-hbase-sh-permission-denied