Work notes of 鲁春利 (Lu Chunli): who says programmers can't have an artistic streak?
Test environment
hadoop-2.6.0
hbase-0.98.1-hadoop2
zookeeper-3.4.6
1. Hadoop cluster setup: omitted.
2. Zookeeper cluster setup: omitted.
3. HBase cluster setup: omitted.
4. /etc/hosts
192.168.128.114 nnode
192.168.128.118 dnode1
192.168.128.119 dnode2
5. regionservers
192.168.128.114
192.168.128.118
192.168.128.119
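6. hbase-site.xml
Single-master configuration:
<property>
  <name>hbase.master</name>
  <value>master:60000</value>
</property>
This is the usual setting; it defines the host and port of the HBase master.
To run multiple masters, supply only the port. Choosing which master is actually active is left to Zookeeper.
Multi-master configuration:
<property>
  <name>hbase.master.port</name>
  <value>60000</value>
</property>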
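Assume the following layout:
A: master, zookeeper, HRegionServer
B: zookeeper, HRegionServer
C: zookeeper, HRegionServer
On A, run start-hbase.sh directly.
On B, run hbase-daemon.sh start master.
Both A and B now run a master process. There is no need to worry about two masters being active at the same time: the master on B waits as a backup, and Zookeeper switches it to active only after the master on A goes down.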
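The takeover described above relies on an ephemeral node in Zookeeper: whichever master creates the node first is active, and the node disappears when that master's session ends. The following is only a toy in-memory sketch of that idea in Python. FakeZooKeeper, try_create_master, session_lost and elect are hypothetical names made up for illustration; they are not part of HBase or Zookeeper.

```python
class FakeZooKeeper:
    """Toy stand-in for the single znode used in master election (hypothetical)."""

    def __init__(self):
        self.master = None  # current owner of the ephemeral master node

    def try_create_master(self, host):
        """Create the node if it is absent; return True when `host` owns it."""
        if self.master is None:
            self.master = host
            return True
        return False

    def session_lost(self, host):
        """An ephemeral node vanishes when its owner's session ends."""
        if self.master == host:
            self.master = None


def elect(zk, candidates):
    """Every live candidate tries to create the node; the first one wins."""
    for host in candidates:
        zk.try_create_master(host)
    return zk.master


zk = FakeZooKeeper()
live = ["A", "B"]          # A was started first, B is the backup
first = elect(zk, live)    # A creates the node and becomes active
zk.session_lost("A")       # the master on A dies, its node disappears
live.remove("A")
second = elect(zk, live)   # the backup on B now succeeds
print(first, second)
```

The point of the sketch is that B's attempt fails harmlessly while A holds the node, which is why starting two master processes does not produce two active masters.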
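# Note: on hosts dnode1 and dnode2, change the bindAddress values to the local hostname.
# The HMaster web UI is reachable at http://nnode:60010; the HRegionServer web UI is on port 60030.
[hadoop@nnode conf]$ cat hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://nnode:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/data/hbase/tmp</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.128.114,192.168.128.115,192.168.128.116</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>nnode</value>
  </property>
  <property>
    <name>hbase.master.info.port</name>
    <value>60010</value>
  </property>
  <property>
    <name>hbase.regionserver.port</name>
    <value>60020</value>
  </property>
  <property>
    <name>hbase.regionserver.info.bindAddress</name>
    <value>nnode</value>
  </property>
  <property>
    <name>hbase.regionserver.info.port</name>
    <value>60030</value>
  </property>
</configuration>
[hadoop@nnode conf]$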
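As a rough illustration of the per-host bindAddress adjustment noted for this file, the Python sketch below renders the same property set for a given hostname. The BASE dictionary copies the values from the listing; render_site is a hypothetical helper written for this note, not an HBase tool, and the output is not pretty-printed.

```python
import xml.etree.ElementTree as ET

# Properties shared by every host (values copied from the listing).
BASE = {
    "hbase.rootdir": "hdfs://nnode:9000/hbase",
    "hbase.cluster.distributed": "true",
    "hbase.tmp.dir": "/data/hbase/tmp",
    "hbase.zookeeper.quorum": "192.168.128.114,192.168.128.115,192.168.128.116",
    "hbase.master.port": "60000",
    "hbase.master.info.port": "60010",
    "hbase.regionserver.port": "60020",
    "hbase.regionserver.info.port": "60030",
}


def render_site(hostname):
    """Return hbase-site.xml text with both bindAddress values set to hostname."""
    props = dict(BASE)
    props["hbase.master.info.bindAddress"] = hostname
    props["hbase.regionserver.info.bindAddress"] = hostname
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


for host in ("nnode", "dnode1", "dnode2"):
    xml_text = render_site(host)
    print(host, len(xml_text), "characters")
```

Generating the file from one shared dictionary keeps the three hosts from drifting apart in everything except the two bindAddress values.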
7. Check the Zookeeper status
[hadoop@nnode ~]$ zkServer.sh status
JMX enabled by default
Using config: /usr/sca_app/zookeeper/bin/../conf/zoo.cfg
CLASSPATH=(omitted)
Mode: follower
[hadoop@nnode ~]$
# The Zookeeper node that is in leader state
[hadoop@dnode4 ~]$ zkServer.sh status
JMX enabled by default
Using config: /usr/sca_app/zookeeper/bin/../conf/zoo.cfg
CLASSPATH=(omitted)
Mode: leader
[hadoop@dnode4 ~]$
[hadoop@dnode5 ~]$ zkServer.sh status
JMX enabled by default
Using config: /usr/sca_app/zookeeper/bin/../conf/zoo.cfg
CLASSPATH=(omitted)
Mode: follower
[hadoop@dnode5 ~]$
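When the status of several Zookeeper nodes has to be compared, the Mode line is the only part that matters. A small Python sketch, assuming the output of zkServer.sh status has already been captured as text for each host (the strings below are abbreviated from the transcript, and zk_mode is a hypothetical helper):

```python
import re


def zk_mode(status_output):
    """Return the value of the 'Mode:' line, or None if it is missing."""
    match = re.search(r"^Mode:\s*(\w+)\s*$", status_output, re.MULTILINE)
    return match.group(1) if match else None


# Captured output per host, abbreviated to the lines that matter here.
outputs = {
    "nnode": "JMX enabled by default\nMode: follower\n",
    "dnode4": "JMX enabled by default\nMode: leader\n",
    "dnode5": "JMX enabled by default\nMode: follower\n",
}

modes = {host: zk_mode(text) for host, text in outputs.items()}
leaders = [host for host, mode in modes.items() if mode == "leader"]
print(modes)
print("leaders:", leaders)
```

A healthy ensemble should report exactly one leader, so a list with zero or several entries is worth investigating before starting HBase.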
8. Startup test
First, start the HBase cluster with start-hbase.sh.
[hadoop@nnode ~]$ jps
22821 QuorumPeerMain
15110 SecondaryNameNode
15290 ResourceManager
14867 NameNode
13069 HMaster
13239 HRegionServer
13346 Jps
[hadoop@nnode ~]$ jps | grep H
13069 HMaster
13239 HRegionServer
[hadoop@nnode ~]$ ssh dnode1
Last login: Wed Sep 2 12:16:04 2015 from nnode
[hadoop@dnode1 ~]$ jps | grep H
10378 HRegionServer
[hadoop@dnode1 ~]$ ssh dnode2
Last login: Wed Sep 2 11:53:03 2015 from dnode1
[hadoop@dnode2 ~]$ jps | grep H
24500 HRegionServer
Then check the cluster status.
[hadoop@dnode2 ~]$ hbase hbck
2015-09-02 13:38:36,083 INFO [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2015-09-02 13:38:36,083 INFO [main] zookeeper.ZooKeeper: Client environment:host.name=dnode2
2015-09-02 13:38:36,083 INFO [main] zookeeper.ZooKeeper: Client environment:java.version=1.7.0_79
2015-09-02 13:38:36,083 INFO [main] zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2015-09-02 13:38:36,083 INFO [main] zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.7.0_79/jre
2015-09-02 13:38:36,083 INFO [main] zookeeper.ZooKeeper: Client environment:java.class.path=(jar file paths)
2015-09-02 13:38:36,085 INFO [main] zookeeper.ZooKeeper: Client environment:java.library.path=/usr/sca_app/hadoop/lib/native
2015-09-02 13:38:36,085 INFO [main] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2015-09-02 13:38:36,085 INFO [main] zookeeper.ZooKeeper: Client environment:java.compiler=
2015-09-02 13:38:36,085 INFO [main] zookeeper.ZooKeeper: Client environment:os.name=Linux
2015-09-02 13:38:36,085 INFO [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
2015-09-02 13:38:36,086 INFO [main] zookeeper.ZooKeeper: Client environment:os.version=2.6.32-431.el6.x86_64
2015-09-02 13:38:36,086 INFO [main] zookeeper.ZooKeeper: Client environment:user.name=hadoop
2015-09-02 13:38:36,086 INFO [main] zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
2015-09-02 13:38:36,086 INFO [main] zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop
2015-09-02 13:38:36,087 INFO [main] zookeeper.ZooKeeper: Initiating client connection, connectString=(omitted), baseZNode=/hbase
// ......
Version: 0.98.1-hadoop2
Number of live region servers: 3
Number of dead region servers: 0
Master: nnode,60000,1441171436398
Number of backup masters: 1
Average load: 1.0
Number of requests: 54
Number of regions: 3
Number of regions in transition: 0
// ......
2015-09-02 13:38:37,773 DEBUG [main] util.HBaseFsck: Loading region dirs from hdfs://nnode:9000/hbase/data/default/m_domain
2015-09-02 13:38:37,775 DEBUG [main] util.HBaseFsck: Loading region dirs from hdfs://nnode:9000/hbase/data/hbase/meta
2015-09-02 13:38:37,776 DEBUG [main] util.HBaseFsck: Loading region dirs from hdfs://nnode:9000/hbase/data/hbase/namespace
2015-09-02 13:38:37,785 DEBUG [hbasefsck-pool1-t3] util.HBaseFsck: Loading region info from hdfs:hdfs://nnode:9000/hbase/data/hbase/meta/1588230740
2015-09-02 13:38:37,786 DEBUG [hbasefsck-pool1-t2] util.HBaseFsck: Loading region info from hdfs:hdfs://nnode:9000/hbase/data/hbase/namespace/e1517e2d740ba108fe6ff5622a4124cf
2015-09-02 13:38:37,786 DEBUG [hbasefsck-pool1-t1] util.HBaseFsck: Loading region info from hdfs:hdfs://nnode:9000/hbase/data/default/m_domain/fb8a9bbcb35df01a69320f68e76b0a8c
2015-09-02 13:38:38,157 DEBUG [hbasefsck-pool1-t4] util.HBaseFsck: HRegionInfo read: {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
2015-09-02 13:38:38,157 DEBUG [hbasefsck-pool1-t5] util.HBaseFsck: HRegionInfo read: {ENCODED => e1517e2d740ba108fe6ff5622a4124cf, NAME => 'hbase:namespace,,1440742486827.e1517e2d740ba108fe6ff5622a4124cf.', STARTKEY => '', ENDKEY => ''}
2015-09-02 13:38:38,157 DEBUG [hbasefsck-pool1-t6] util.HBaseFsck: HRegionInfo read: {ENCODED => fb8a9bbcb35df01a69320f68e76b0a8c, NAME => 'm_domain,,1440742617693.fb8a9bbcb35df01a69320f68e76b0a8c.', STARTKEY => '', ENDKEY => ''}
// ......
Summary:
  m_domain is okay.
    Number of regions: 1
    Deployed on: nnode,60020,1441171437119
  hbase:meta is okay.
    Number of regions: 1
    Deployed on: dnode2,60020,1441171437508
  hbase:namespace is okay.
    Number of regions: 1
    Deployed on: dnode1,60020,1441171437609
0 inconsistencies detected.
Status: OK
2015-09-02 13:38:38,629 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2015-09-02 13:38:38,630 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x14f6f7fe01c002d
2015-09-02 13:38:38,634 INFO [main] zookeeper.ZooKeeper: Session: 0x14f6f7fe01c002d closed
2015-09-02 13:38:38,634 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
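The server identifiers in the hbck output, such as nnode,60000,1441171436398, follow a host,port,startcode pattern. A short Python sketch for splitting them, which can help when checking where each table is deployed. ServerName and parse_server_name are names chosen for this note, and reading the third field as a start code is an assumption based on the pattern seen in the output.

```python
from typing import NamedTuple


class ServerName(NamedTuple):
    host: str
    port: int
    startcode: int  # assumed to be the server start code


def parse_server_name(text):
    """Split 'host,port,startcode' into its three fields."""
    host, port, startcode = text.strip().split(",")
    return ServerName(host, int(port), int(startcode))


master = parse_server_name("nnode,60000,1441171436398")

# 'Deployed on' values copied from the hbck summary.
deployed = {
    "m_domain": "nnode,60020,1441171437119",
    "hbase:meta": "dnode2,60020,1441171437508",
    "hbase:namespace": "dnode1,60020,1441171437609",
}
hosts = {table: parse_server_name(name).host for table, name in deployed.items()}

print("master:", master.host, master.port)
print(hosts)
```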
Start another HMaster on host dnode2.
[hadoop@dnode2 ~]$ hbase-daemon.sh start master
starting master, logging to /usr/sca_app/hbase/logs/hbase-hadoop-master-dnode2.out
[hadoop@dnode2 ~]$ jps
29921 Jps
24500 HRegionServer
25754 NodeManager
29767 HMaster
25465 DataNode
[hadoop@dnode2 ~]$ jps | grep H
24500 HRegionServer
29767 HMaster
[hadoop@dnode2 ~]$
Check which node is the master.
[hadoop@dnode2 ~]$ hbase zkcli
// ......
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 0] ls /
[hbase, zookeeper, hadoop-ha1, ]
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 1] get /hbase/master
?master:60000??7f?0gPBUF nnode??????
cZxid = 0x60000024b
ctime = Wed Sep 02 13:28:42 CST 2015
mZxid = 0x60000024b
mtime = Wed Sep 02 13:28:42 CST 2015
pZxid = 0x60000024b
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x34f6f7fe9de0028
dataLength = 53
numChildren = 0
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 2] quit
[hadoop@nnode ~]$ jps
22821 QuorumPeerMain
15110 SecondaryNameNode
15290 ResourceManager
14867 NameNode
13069 HMaster
13239 HRegionServer
14252 Jps
# Kill the HMaster process on host nnode
[hadoop@nnode ~]$ kill 13069
[hadoop@nnode ~]$ jps
22821 QuorumPeerMain
15110 SecondaryNameNode
14402 Jps
15290 ResourceManager
14867 NameNode
13239 HRegionServer
# Check again which node holds the master
[hadoop@nnode ~]$ hbase zkcli
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
JLine support is enabled
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 1] get /hbase/master
?master:60000*vz?V?PBUF dnode2???????
cZxid = 0x6000002bd
ctime = Wed Sep 02 13:53:02 CST 2015
mZxid = 0x6000002bd
mtime = Wed Sep 02 13:53:02 CST 2015
pZxid = 0x6000002bd
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x34f6f7fe9de0030
dataLength = 53
numChildren = 0
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 2] get /hbase/meta-region-server
?regionserver:60020??*P??PBUF dnode2??????
cZxid = 0x600000266
ctime = Wed Sep 02 13:28:51 CST 2015
mZxid = 0x600000266
mtime = Wed Sep 02 13:28:51 CST 2015
pZxid = 0x600000266
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 59
numChildren = 0
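The stat fields printed by get already show the difference between the two kinds of node: /hbase/master has a non-zero ephemeralOwner, so it is tied to the session of whichever master created it, while /hbase/meta-region-server has ephemeralOwner = 0x0 and therefore survives the failover. A minimal Python sketch for reading those fields from captured zkcli output; parse_stat and is_ephemeral are hypothetical helpers, and the sample text is abbreviated from the transcript.

```python
def parse_stat(text):
    """Collect the 'key = value' lines of zkcli output into a dict."""
    stat = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            stat[key.strip()] = value.strip()
    return stat


def is_ephemeral(stat):
    """A znode is ephemeral when ephemeralOwner holds a session id."""
    return int(stat["ephemeralOwner"], 16) != 0


before = parse_stat("cZxid = 0x60000024b\nephemeralOwner = 0x34f6f7fe9de0028\n")
after = parse_stat("cZxid = 0x6000002bd\nephemeralOwner = 0x34f6f7fe9de0030\n")
meta = parse_stat("cZxid = 0x600000266\nephemeralOwner = 0x0\n")

owner_changed = before["ephemeralOwner"] != after["ephemeralOwner"]
print("master node ephemeral:", is_ephemeral(after))
print("owner session changed:", owner_changed)
print("meta-region-server ephemeral:", is_ephemeral(meta))
```

The changed owner session together with the new cZxid indicates that the node was created afresh by the master on dnode2 rather than updated in place.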
9. Check the log of the Zookeeper node in leader state
# The Zookeeper log shows that Zookeeper considered the client connection from 114 closed, and that a new connection from 119 was then established
2015-09-02 23:02:38,225 [myid:2] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x24f8d2924cf0008, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:745)
2015-09-02 23:02:38,227 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.128.114:38986 which had sessionid 0x24f8d2924cf0008
2015-09-02 23:02:42,086 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x14f8d24f53b000c
2015-09-02 23:02:43,403 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.128.119:43855
2015-09-02 23:02:43,404 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /192.168.128.119:43855
2015-09-02 23:02:43,408 [myid:2] - INFO [CommitProcessor:2:ZooKeeperServer@617] - Established session 0x24f8d2924cf000c with negotiated timeout 40000 for client /192.168.128.119:43855
2015-09-02 23:02:43,544 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f8d24f53b000b type:create cxid:0x35 zxid:0xa000000ce txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2015-09-02 23:02:43,586 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f8d24f53b000d type:create cxid:0x1 zxid:0xa000000d0 txntype:-1 reqpath:n/a Error Path:/hbase/replication/rs Error:KeeperErrorCode = NodeExists for /hbase/replication/rs
2015-09-02 23:02:49,501 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f8d24f53b000b type:create cxid:0x6a zxid:0xa000000d1 txntype:-1 reqpath:n/a Error Path:/hbase/namespace/default Error:KeeperErrorCode = NodeExists for /hbase/namespace/default
2015-09-02 23:02:49,517 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f8d24f53b000b type:create cxid:0x6c zxid:0xa000000d3 txntype:-1 reqpath:n/a Error Path:/hbase/namespace/hbase Error:KeeperErrorCode = NodeExists for /hbase/namespace/hbase
2015-09-02 23:03:16,000 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x14f8d24f53b0006, timeout of 40000ms exceeded
2015-09-02 23:03:16,001 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x24f8d2924cf0008, timeout of 40000ms exceeded
2015-09-02 23:03:16,002 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x14f8d24f53b0008, timeout of 40000ms exceeded
2015-09-02 23:03:16,002 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x14f8d24f53b0006
2015-09-02 23:03:16,003 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x24f8d2924cf0008
2015-09-02 23:03:16,004 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x14f8d24f53b0008
2015-09-02 23:03:32,163 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.128.114:39048
2015-09-02 23:03:32,176 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /192.168.128.114:39048
2015-09-02 23:03:32,180 [myid:2] - INFO [CommitProcessor:2:ZooKeeperServer@617] - Established session 0x24f8d2924cf000d with negotiated timeout 30000 for client /192.168.128.114:39048
2015-09-02 23:04:17,219 [myid:2] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x24f8d2924cf000d, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:745)
2015-09-02 23:04:17,222 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.128.114:39048 which had sessionid 0x24f8d2924cf000d
2015-09-02 23:04:44,000 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x24f8d2924cf000d, timeout of 30000ms exceeded
2015-09-02 23:04:44,001 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x24f8d2924cf000d
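In this log a session is not removed the moment its socket closes; it is expired later by the SessionTracker once the negotiated timeout has run out. To see that delay for one session, the two relevant lines can be paired by session id. The Python sketch below does this for session 0x24f8d2924cf000d, using two lines copied from the log; session_events and the EVENT pattern are hypothetical helpers written for this note.

```python
import re
from datetime import datetime

# Two lines copied from the Zookeeper log for one session.
LOG = """\
2015-09-02 23:04:17,222 [myid:2] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.128.114:39048 which had sessionid 0x24f8d2924cf000d
2015-09-02 23:04:44,000 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x24f8d2924cf000d, timeout of 30000ms exceeded
"""

# Timestamp, event phrase, then the first session id after the phrase.
EVENT = re.compile(
    r"^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d,\d{3}) .*?"
    r"(Closed socket connection|Expiring session).*?(0x[0-9a-f]+)"
)


def session_events(log_text):
    """Map session id -> {event phrase: timestamp} for the matched lines."""
    events = {}
    for line in log_text.splitlines():
        match = EVENT.match(line)
        if match:
            stamp, kind, session = match.groups()
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
            events.setdefault(session, {})[kind] = when
    return events


events = session_events(LOG)
info = events["0x24f8d2924cf000d"]
gap = (info["Expiring session"] - info["Closed socket connection"]).total_seconds()
print(f"expired {gap:.3f}s after the socket closed")
```

For this session the gap is a little under the 30000 ms timeout, which fits the timeout being counted from the client's last contact rather than from the socket close. This reading is an interpretation of the log, not something the log states directly.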
Summary: