HBase-1.0.1 Study Notes (5): How HBase Master Election Works with ZooKeeper

Lu Chunli's work notes. Who says programmers can't have a literary flair?


Test environment

    hadoop-2.6.0

    hbase-0.98.1-hadoop2

    zookeeper-3.4.6


1. Hadoop cluster configuration (omitted);

2. ZooKeeper cluster configuration (omitted);

3. HBase cluster configuration (omitted);

4. /etc/hosts

192.168.128.114 nnode
192.168.128.118 dnode1
192.168.128.119 dnode2

5. regionservers

192.168.128.114
192.168.128.118
192.168.128.119

6. hbase-site.xml

    Configuration with a single master:

    hbase.master

    master:60000

    This is what we usually configure; it fixes the host and port of the HBase master.

    When several masters are needed, only the port is configured; electing the one that actually serves as master is handled by ZooKeeper (see the fragment sketched below).

    Configuration with multiple masters:
    hbase.master.port
    60000
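
    As a rough sketch (the port value is simply the one this cluster already uses), the same fragment would go into hbase-site.xml on every host that may run an HMaster. Because no master hostname is hard-coded, the active master is whichever process creates the ephemeral master znode in ZooKeeper first:

    <!-- identical on every candidate master host -->
    <property>
        <name>hbase.master.port</name>
        <value>60000</value>
    </property>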

    

    Suppose the layout is

    A: master, zookeeper, HRegionServer

    B: zookeeper, HRegionServer

    C: zookeeper, HRegionServer

    Run start-hbase.sh directly on A.

    Run hbase-daemon.sh start master on B.

    A master process is now running on both A and B, but there is no need to worry about having two of them at once: ZooKeeper only promotes B's master to active after A's master goes down. (An alternative that lets start-hbase.sh bring up the standby automatically is sketched below.)
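
    A minimal sketch of that alternative, using the standard conf/backup-masters file (the hostname B is the abstract one from the layout above): list the standby hosts in it, one per line, and the stock start-hbase.sh will also start a backup master on each of them.

[hadoop@A conf]$ cat backup-masters
B
[hadoop@A conf]$ start-hbase.sh
# starts HMaster on A, a backup HMaster on B, and region servers from conf/regionservers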


[hadoop@nnode conf]$ cat hbase-site.xml 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- HBase root directory on the HDFS (Hadoop) cluster -->
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://nnode:9000/hbase</value> 
    </property>
    <!-- run in distributed mode? -->
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value> 
    </property>
    <property>
        <name>hbase.tmp.dir</name>
        <value>/data/hbase/tmp</value>
    </property>
    <!-- zookeeper -->
    <property>
        <name>hbase.zookeeper.quorum</name> 
        <value>192.168.128.114,192.168.128.115,192.168.128.116</value>
    </property>
    
    <!-- web UI configuration -->
    <property>
        <name>hbase.master.port</name>
        <value>60000</value>
    </property>
    <property>
        <name>hbase.master.info.bindAddress</name>
        <value>nnode</value>
    </property>
    <property>
        <name>hbase.master.info.port</name>
        <value>60010</value>
    </property>

    <property>
        <name>hbase.regionserver.port</name>
        <value>60020</value>
    </property>
    <property>
        <name>hbase.regionserver.info.bindAddress</name>
        <value>nnode</value>
    </property>
    <property>
        <name>hbase.regionserver.info.port</name>
        <value>60030</value>
    </property>
</configuration>
# Note: on hosts dnode1 and dnode2 the two bindAddress values must be changed to the local hostname.
# The HMaster web UI is reached at http://nnode:60010 and each HRegionServer web UI on port 60030 (a quick curl check is sketched below).
[hadoop@nnode conf]$
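
    A quick sanity check of the two info ports configured above (a sketch; it assumes curl is available on the node, output not shown):

[hadoop@nnode ~]$ curl -sI http://nnode:60010/ | head -1
[hadoop@nnode ~]$ curl -sI http://nnode:60030/ | head -1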

7. Check the ZooKeeper status

[hadoop@nnode ~]$ zkServer.sh status
JMX enabled by default
Using config: /usr/sca_app/zookeeper/bin/../conf/zoo.cfg
CLASSPATH=(omitted)
Mode: follower
[hadoop@nnode ~]$ 

# The ZooKeeper node whose status is leader
[hadoop@dnode4 ~]$ zkServer.sh status
JMX enabled by default
Using config: /usr/sca_app/zookeeper/bin/../conf/zoo.cfg
CLASSPATH=(omitted)
Mode: leader
[hadoop@dnode4 ~]$ 

[hadoop@dnode5 ~]$ zkServer.sh status
JMX enabled by default
Using config: /usr/sca_app/zookeeper/bin/../conf/zoo.cfg
CLASSPATH=(omitted)
Mode: follower
[hadoop@dnode5 ~]$
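
# The same status check can be run from a single terminal (a sketch; it assumes passwordless
# ssh between the nodes and that zkServer.sh is on the PATH of a non-interactive shell):
[hadoop@nnode ~]$ for h in nnode dnode4 dnode5; do echo -n "$h: "; ssh $h 'zkServer.sh status 2>&1 | grep Mode'; done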

8. Startup test
    First start the HBase cluster with start-hbase.sh.

[hadoop@nnode ~]$ jps
22821 QuorumPeerMain
15110 SecondaryNameNode
15290 ResourceManager
14867 NameNode
13069 HMaster
13239 HRegionServer
13346 Jps
[hadoop@nnode ~]$ jps | grep H
13069 HMaster
13239 HRegionServer
[hadoop@nnode ~]$ ssh dnode1
Last login: Wed Sep  2 12:16:04 2015 from nnode
[hadoop@dnode1 ~]$ jps | grep H
10378 HRegionServer
[hadoop@dnode1 ~]$ ssh dnode2
Last login: Wed Sep  2 11:53:03 2015 from dnode1
[hadoop@dnode2 ~]$ jps | grep H
24500 HRegionServer
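
# Rather than hopping between hosts with ssh as above, the process check can be scripted
# in one place (a sketch; again assumes passwordless ssh and jps on each host's PATH):
[hadoop@nnode ~]$ for h in nnode dnode1 dnode2; do echo "== $h"; ssh $h 'jps | grep -E "HMaster|HRegionServer"'; done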

    Then check the state of the cluster.

[hadoop@dnode2 ~]$ hbase hbck
2015-09-02 13:38:36,083 INFO  [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2015-09-02 13:38:36,083 INFO  [main] zookeeper.ZooKeeper: Client environment:host.name=dnode2
2015-09-02 13:38:36,083 INFO  [main] zookeeper.ZooKeeper: Client environment:java.version=1.7.0_79
2015-09-02 13:38:36,083 INFO  [main] zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2015-09-02 13:38:36,083 INFO  [main] zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.7.0_79/jre
2015-09-02 13:38:36,083 INFO  [main] zookeeper.ZooKeeper: Client environment:java.class.path=(jar file paths omitted)
2015-09-02 13:38:36,085 INFO  [main] zookeeper.ZooKeeper: Client environment:java.library.path=/usr/sca_app/hadoop/lib/native
2015-09-02 13:38:36,085 INFO  [main] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2015-09-02 13:38:36,085 INFO  [main] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2015-09-02 13:38:36,085 INFO  [main] zookeeper.ZooKeeper: Client environment:os.name=Linux
2015-09-02 13:38:36,085 INFO  [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
2015-09-02 13:38:36,086 INFO  [main] zookeeper.ZooKeeper: Client environment:os.version=2.6.32-431.el6.x86_64
2015-09-02 13:38:36,086 INFO  [main] zookeeper.ZooKeeper: Client environment:user.name=hadoop
2015-09-02 13:38:36,086 INFO  [main] zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
2015-09-02 13:38:36,086 INFO  [main] zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop
2015-09-02 13:38:36,087 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=(omitted), baseZNode=/hbase
// ......
Version: 0.98.1-hadoop2
Number of live region servers: 3
Number of dead region servers: 0
Master: nnode,60000,1441171436398
Number of backup masters: 1
Average load: 1.0
Number of requests: 54
Number of regions: 3
Number of regions in transition: 0
// ......
2015-09-02 13:38:37,773 DEBUG [main] util.HBaseFsck: Loading region dirs from hdfs://nnode:9000/hbase/data/default/m_domain
2015-09-02 13:38:37,775 DEBUG [main] util.HBaseFsck: Loading region dirs from hdfs://nnode:9000/hbase/data/hbase/meta
2015-09-02 13:38:37,776 DEBUG [main] util.HBaseFsck: Loading region dirs from hdfs://nnode:9000/hbase/data/hbase/namespace
2015-09-02 13:38:37,785 DEBUG [hbasefsck-pool1-t3] util.HBaseFsck: Loading region info from hdfs:hdfs://nnode:9000/hbase/data/hbase/meta/1588230740
2015-09-02 13:38:37,786 DEBUG [hbasefsck-pool1-t2] util.HBaseFsck: Loading region info from hdfs:hdfs://nnode:9000/hbase/data/hbase/namespace/e1517e2d740ba108fe6ff5622a4124cf
2015-09-02 13:38:37,786 DEBUG [hbasefsck-pool1-t1] util.HBaseFsck: Loading region info from hdfs:hdfs://nnode:9000/hbase/data/default/m_domain/fb8a9bbcb35df01a69320f68e76b0a8c
2015-09-02 13:38:38,157 DEBUG [hbasefsck-pool1-t4] util.HBaseFsck: HRegionInfo read: {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
2015-09-02 13:38:38,157 DEBUG [hbasefsck-pool1-t5] util.HBaseFsck: HRegionInfo read: {ENCODED => e1517e2d740ba108fe6ff5622a4124cf, NAME => 'hbase:namespace,,1440742486827.e1517e2d740ba108fe6ff5622a4124cf.', STARTKEY => '', ENDKEY => ''}
2015-09-02 13:38:38,157 DEBUG [hbasefsck-pool1-t6] util.HBaseFsck: HRegionInfo read: {ENCODED => fb8a9bbcb35df01a69320f68e76b0a8c, NAME => 'm_domain,,1440742617693.fb8a9bbcb35df01a69320f68e76b0a8c.', STARTKEY => '', ENDKEY => ''}
// ......
Summary:
  m_domain is okay.
    Number of regions: 1
    Deployed on:  nnode,60020,1441171437119
  hbase:meta is okay.
    Number of regions: 1
    Deployed on:  dnode2,60020,1441171437508
  hbase:namespace is okay.
    Number of regions: 1
    Deployed on:  dnode1,60020,1441171437609
0 inconsistencies detected.
Status: OK
2015-09-02 13:38:38,629 INFO  [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2015-09-02 13:38:38,630 INFO  [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x14f6f7fe01c002d
2015-09-02 13:38:38,634 INFO  [main] zookeeper.ZooKeeper: Session: 0x14f6f7fe01c002d closed
2015-09-02 13:38:38,634 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
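
    When only the verdict matters, the hbck output can be filtered down to the summary lines shown above (a sketch):

[hadoop@dnode2 ~]$ hbase hbck 2>&1 | grep -E 'inconsistencies detected|Status:'
0 inconsistencies detected.
Status: OK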

    Start another HMaster on host dnode2 (it should come up as a backup master; a znode check is sketched further below).

[hadoop@dnode2 ~]$ hbase-daemon.sh start master
starting master, logging to /usr/sca_app/hbase/logs/hbase-hadoop-master-dnode2.out
[hadoop@dnode2 ~]$ jps
29921 Jps
24500 HRegionServer
25754 NodeManager
29767 HMaster
25465 DataNode
[hadoop@dnode2 ~]$ jps | grep H
24500 HRegionServer
29767 HMaster
[hadoop@dnode2 ~]$
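
    To confirm that the HMaster just started on dnode2 registered as a standby rather than a second active master, the backup-masters znode can be listed. As a sketch (assuming the default zookeeper.znode.parent of /hbase), one child per standby should appear, named host,port,startcode:

[hadoop@dnode2 ~]$ hbase zkcli
[zk: ...(CONNECTED) 0] ls /hbase/backup-masters
[zk: ...(CONNECTED) 1] quit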

    Check which node is the active master. The /hbase/master znode stores a protobuf-serialized master ServerName (note the PBUF marker in the output), so part of the value renders as binary, but the hostname nnode is readable.

[hadoop@dnode2 ~]$ hbase zkcli
// ......
WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 0] ls /
[hbase, zookeeper, hadoop-ha1, ]
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 1] get /hbase/master
?master:60000??7f?0gPBUF

nnode??????
cZxid = 0x60000024b
ctime = Wed Sep 02 13:28:42 CST 2015
mZxid = 0x60000024b
mtime = Wed Sep 02 13:28:42 CST 2015
pZxid = 0x60000024b
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x34f6f7fe9de0028
dataLength = 53
numChildren = 0
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 2] quit

[hadoop@nnode ~]$ jps
22821 QuorumPeerMain
15110 SecondaryNameNode
15290 ResourceManager
14867 NameNode
13069 HMaster
13239 HRegionServer
14252 Jps
# Kill the HMaster process on host nnode
[hadoop@nnode ~]$ kill 13069
[hadoop@nnode ~]$ jps
22821 QuorumPeerMain
15110 SecondaryNameNode
14402 Jps
15290 ResourceManager
14867 NameNode
13239 HRegionServer
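
# While this happens, the takeover can also be watched in the backup master's log on dnode2
# (a sketch; the .log file sits next to the .out file shown when the daemon was started,
# and the exact wording of the ActiveMasterManager message may vary by version):
[hadoop@dnode2 ~]$ tail -f /usr/sca_app/hbase/logs/hbase-hadoop-master-dnode2.log
# look for ActiveMasterManager registering this node as the active master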

# Check again which node now holds the master role
[hadoop@nnode ~]$ hbase zkcli
WATCHER::

WatchedEvent state:SyncConnected type:None path:null
JLine support is enabled
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 1] get /hbase/master
?master:60000*vz?V?PBUF

dnode2???????
cZxid = 0x6000002bd
ctime = Wed Sep 02 13:53:02 CST 2015
mZxid = 0x6000002bd
mtime = Wed Sep 02 13:53:02 CST 2015
pZxid = 0x6000002bd
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x34f6f7fe9de0030
dataLength = 53
numChildren = 0
[zk: 192.168.128.115:2181,192.168.128.114:2181,192.168.128.116:2181(CONNECTED) 2] get /hbase/meta-region-server
?regionserver:60020??*P??PBUF

dnode2??????
cZxid = 0x600000266
ctime = Wed Sep 02 13:28:51 CST 2015
mZxid = 0x600000266
mtime = Wed Sep 02 13:28:51 CST 2015
pZxid = 0x600000266
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 59
numChildren = 0
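
# The killed master can now be restarted on nnode; it will find /hbase/master already
# owned by dnode2 and simply wait as a backup (a sketch):
[hadoop@nnode ~]$ hbase-daemon.sh start master
[hadoop@nnode ~]$ jps | grep HMaster
# nnode's HMaster now parks under /hbase/backup-masters until dnode2's master goes away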

9. Examine the log of the ZooKeeper node in the leader state

# The ZooKeeper log shows that the client connection from 192.168.128.114 was considered closed, after which a new session from 192.168.128.119 was established
2015-09-02 23:02:38,225 [myid:2] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x24f8d2924cf0008, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2015-09-02 23:02:38,227 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.128.114:38986 which had sessionid 0x24f8d2924cf0008
2015-09-02 23:02:42,086 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x14f8d24f53b000c
2015-09-02 23:02:43,403 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.128.119:43855
2015-09-02 23:02:43,404 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /192.168.128.119:43855
2015-09-02 23:02:43,408 [myid:2] - INFO  [CommitProcessor:2:ZooKeeperServer@617] - Established session 0x24f8d2924cf000c with negotiated timeout 40000 for client /192.168.128.119:43855
2015-09-02 23:02:43,544 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f8d24f53b000b type:create cxid:0x35 zxid:0xa000000ce txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2015-09-02 23:02:43,586 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f8d24f53b000d type:create cxid:0x1 zxid:0xa000000d0 txntype:-1 reqpath:n/a Error Path:/hbase/replication/rs Error:KeeperErrorCode = NodeExists for /hbase/replication/rs
2015-09-02 23:02:49,501 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f8d24f53b000b type:create cxid:0x6a zxid:0xa000000d1 txntype:-1 reqpath:n/a Error Path:/hbase/namespace/default Error:KeeperErrorCode = NodeExists for /hbase/namespace/default
2015-09-02 23:02:49,517 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f8d24f53b000b type:create cxid:0x6c zxid:0xa000000d3 txntype:-1 reqpath:n/a Error Path:/hbase/namespace/hbase Error:KeeperErrorCode = NodeExists for /hbase/namespace/hbase
2015-09-02 23:03:16,000 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347] - Expiring session 0x14f8d24f53b0006, timeout of 40000ms exceeded
2015-09-02 23:03:16,001 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347] - Expiring session 0x24f8d2924cf0008, timeout of 40000ms exceeded
2015-09-02 23:03:16,002 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347] - Expiring session 0x14f8d24f53b0008, timeout of 40000ms exceeded
2015-09-02 23:03:16,002 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x14f8d24f53b0006
2015-09-02 23:03:16,003 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x24f8d2924cf0008
2015-09-02 23:03:16,004 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x14f8d24f53b0008
2015-09-02 23:03:32,163 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.128.114:39048
2015-09-02 23:03:32,176 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /192.168.128.114:39048
2015-09-02 23:03:32,180 [myid:2] - INFO  [CommitProcessor:2:ZooKeeperServer@617] - Established session 0x24f8d2924cf000d with negotiated timeout 30000 for client /192.168.128.114:39048
2015-09-02 23:04:17,219 [myid:2] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x24f8d2924cf000d, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2015-09-02 23:04:17,222 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.128.114:39048 which had sessionid 0x24f8d2924cf000d
2015-09-02 23:04:44,000 [myid:2] - INFO  [SessionTracker:ZooKeeperServer@347] - Expiring session 0x24f8d2924cf000d, timeout of 30000ms exceeded
2015-09-02 23:04:44,001 [myid:2] - INFO  [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x24f8d2924cf000d
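
# The session-expiry events above can be pulled straight out of the leader's log
# (a sketch; zookeeper.out is the default log file name, adjust to the local log location):
[hadoop@dnode4 ~]$ grep -E 'Expiring session|Processed session termination' zookeeper.out | tail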


Summary:

    With hbase.master configured, the master host and port are fixed. With only hbase.master.port configured, any node can run an HMaster and ZooKeeper decides which one is active: the active master holds the ephemeral /hbase/master znode, while the others wait as backups. Killing the HMaster on nnode let its ZooKeeper session expire, and the backup master on dnode2 took over, exactly as the /hbase/master znode showed before and after the failover.
