Setting up HBase replication

Official documentation:

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements

http://hbase.apache.org/replication.html

Reference: http://www.infoq.com/cn/articles/lp-hbase-data-disaster-recovery

Zookeeper should be handled by yourself, not by HBase, and should always be available during the deployment.

1. Stop HBase from managing ZooKeeper by editing hbase-env.sh:

export HBASE_MANAGES_ZK=false

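2. Install ZooKeeper: download the ZooKeeper release and unpack it.

Create the directory /hadoopDATA/zookeeper (the path set as dataDir in the configuration file).

On every ZooKeeper machine, create the file /hadoopDATA/zookeeper/myid and write into it the number N of that machine's server.N entry (1, 2, 3, 4, and so on):

server.1=192.168.40.240:2888:3888
server.2=192.168.40.246:2888:3888
server.3=192.168.40.247:2888:3888
server.4=192.168.40.248:2888:3888

For example, myid on 192.168.40.240 contains 1.

myid on 192.168.40.246 contains 2.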
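
The mapping from server.N entries to myid values can also be derived from zoo.cfg instead of being typed by hand. The following is a minimal sketch, not part of the original setup: the helper name myid_for_ip and the zoo.cfg path in the commented example are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the id N of the "server.N=IP:port:port" line in
# zoo.cfg whose address equals the given IP. Prints nothing if no line matches.
myid_for_ip() {
    cfg=$1
    ip=$2
    # Split on '=' and ':' so that $1 is "server.N" and $2 is the address.
    awk -F'[=:]' -v ip="$ip" '
        $1 ~ /^server\.[0-9]+$/ && $2 == ip { sub(/^server\./, "", $1); print $1 }
    ' "$cfg"
}

# Example, commented out because it writes into the data directory
# (the zoo.cfg location is an assumption):
# myid_for_ip /home/hadoop/zookeeper-3.4.3/conf/zoo.cfg 192.168.40.246 > /hadoopDATA/zookeeper/myid
```

Run on each machine with that machine's own address, this writes the same values as the manual steps above.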

Edit zoo.cfg:


dataDir=/hadoopDATA/zookeeper

server.1=192.168.40.240:2888:3888
server.2=192.168.40.246:2888:3888
server.3=192.168.40.247:2888:3888
server.4=192.168.40.248:2888:3888

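3. Copy zoo.cfg into the /hbase/conf directory.

Copy ZooKeeper to every server, create the file /hadoopDATA/zookeeper/myid on each one, and write the value that matches its entry in the configuration file.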

scp -r zookeeper-3.4.3/ hadoop@slave1:/home/hadoop/

scp -r zookeeper-3.4.3/ hadoop@slave2:/home/hadoop/

scp -r zookeeper-3.4.3/ hadoop@slave3:/home/hadoop/

4. Start ZooKeeper and verify that it works.

On each server, start ZooKeeper from the zookeeper directory:

./bin/zkServer.sh start
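
To verify the ensemble after startup, zkServer.sh also accepts a status argument. The sketch below extracts the reported mode from that output. The helper name zk_mode is hypothetical, and it assumes the status output contains a line of the form "Mode: <mode>".

```shell
#!/bin/sh
# Hypothetical helper: read `zkServer.sh status` output on stdin and print the
# reported mode. Assumes the output has a line of the form "Mode: <mode>".
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Example, run from the zookeeper directory on each server:
# ./bin/zkServer.sh status 2>&1 | zk_mode
```

In a healthy ensemble one server should report leader and the others follower. An empty result suggests that the server is not running or has not joined the quorum.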

All machines from both clusters should be able to reach every other machine since replication goes from any region server to any other one on the slave cluster. That also includes the Zookeeper clusters.

The machines in the two clusters can all reach each other (ssh master, ssh slave1, ssh slave2, ssh slave3).
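
The ssh check can be looped over all hosts. This is a rough sketch under stated assumptions: the function name unreachable_hosts and the PROBE variable are hypothetical, and a successful ssh only shows basic reachability, not that the ZooKeeper and region server ports are open.

```shell
#!/bin/sh
# Run a probe command against every host and print the hosts that fail.
# PROBE is a hypothetical override; by default a non-interactive ssh is used.
unreachable_hosts() {
    for host in "$@"; do
        ${PROBE:-ssh -o BatchMode=yes -o ConnectTimeout=5} "$host" true >/dev/null 2>&1 \
            || printf '%s\n' "$host"
    done
}

# Example, run from each machine in both clusters:
# unreachable_hosts master slave1 slave2 slave3
```

An empty result means every listed host answered the probe.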

Both clusters should have the same HBase and Hadoop major revision. For example, having 0.90.1 on the master and 0.90.0 on the slave is correct but not 0.90.1 and 0.89.20100725.

Keep the versions consistent across both clusters.
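
The version rule can be checked mechanically. The sketch below assumes that "major revision" means the first two dot-separated components, which matches the examples quoted above. Both function names are hypothetical.

```shell
#!/bin/sh
# Print the first two dot-separated components of a version, e.g. 0.90.1 -> 0.90.
major_revision() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed only if both versions share the same major revision.
same_major_revision() {
    [ "$(major_revision "$1")" = "$(major_revision "$2")" ]
}

# Example:
# same_major_revision 0.90.1 0.90.0 && echo "compatible"
```

With the versions from the quoted requirement, 0.90.1 and 0.90.0 pass the check, while 0.90.1 and 0.89.20100725 do not.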

Every table that contains families that are scoped for replication should exist on every cluster with the exact same name, same for those replicated families.

The table structure must be the same on both clusters.

For multiple slaves, Master/Master, or cyclic replication version 0.92 or greater is needed.

Test environment

In both clusters, ZooKeeper is not managed by HBase.

Edit hbase-env.sh:

export HBASE_MANAGES_ZK=false

The two clusters should use different zookeeper.znode.parent values. HBase creates /hbase by default, so cluster 2 sets zookeeper.znode.parent to /hbase-2 in its hbase-site.xml.
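
A quick way to confirm that the two values differ is to read them out of each cluster's hbase-site.xml. The following is a minimal sketch: xml_prop and znode_parent are hypothetical helpers, and the parsing assumes each <name> and <value> sits on its own line, as in the files shown in this post. It is not a general XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows <name>PROP</name> in a
# Hadoop-style XML file. Assumes <name> and <value> are on separate lines.
xml_prop() {
    awk -v prop="$2" '
        index($0, "<name>" prop "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit
        }
    ' "$1"
}

# Print the znode parent of a cluster, falling back to the default /hbase
# when the property is not set.
znode_parent() {
    p=$(xml_prop "$1" zookeeper.znode.parent)
    printf '%s\n' "${p:-/hbase}"
}

# Example (both file paths are placeholders):
# [ "$(znode_parent cluster1/hbase-site.xml)" != "$(znode_parent cluster2/hbase-site.xml)" ] && echo "ok"
```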

1. Cluster 1: fully distributed HBase

Hadoop: namenode (master), datanodes (master, slave1, slave2)

HBase:

HMaster (master), regionservers (master, slave1, slave2)

2. Cluster 2: pseudo-distributed HBase

Hadoop configuration files

hdfs-site.xml

<configuration>
<property>
<name>dfs.name.dir</name>
<value>/hadoopDATA/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/hadoopDATA/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

core-site.xml

<configuration>
<!-- global properties -->
<property>
<name>hadoop.tmp.dir</name>
<value>/root/hadoopDATA/tmp</value>
</property>
<!-- file system properties -->
<property>
<name>fs.default.name</name>
<value>hdfs://slave3:9000</value>
</property>
</configuration>

masters

slave3

slaves

slave3

HBase configuration files

hbase-site.xml

<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://slave3:9000/hbase3</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master</name>
<value>slave3:60000</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>slave3</value>
</property>
<property>
<name>hbase.replication</name>
<value>true</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase-2</value>
</property>
</configuration>

regionservers

localhost

Setting up replication

1. Edit ${HBASE_HOME}/conf/hbase-site.xml on both clusters to add the following configuration:

<property>
  <name>hbase.replication</name>
  <value>true</value>
</property>

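Deploy the files, and then restart HBase if it was running.

2. Add the slave cluster as a peer from the HBase shell on cluster 1:

add_peer '7', "slave3:2181:/hbase-2"

The shell logs the error shown below, but it had no effect in practice.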

hbase(main):004:0> add_peer '7' ,"slave3:2181:/hbase-2"
12/03/02 15:04:43 ERROR zookeeper.RecoverableZooKeeper: Node /hbase/replication/peers already exists and this is not a retry
0 row(s) in 0.0650 seconds
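
The second argument of add_peer is the peer's cluster key, made of its ZooKeeper quorum, client port, and znode parent joined by colons. The sketch below builds that string and prints the matching shell command. The function names are hypothetical, and the defaults 2181 and /hbase are the usual ones, assumed here.

```shell
#!/bin/sh
# Build a cluster key "quorum:clientPort:znodeParent". The port and the znode
# parent fall back to the usual defaults when omitted.
cluster_key() {
    printf '%s:%s:%s\n' "$1" "${2:-2181}" "${3:-/hbase}"
}

# Print the HBase shell command that registers a peer with the given id.
add_peer_command() {
    printf "add_peer '%s', \"%s\"\n" "$1" "$(cluster_key "$2" "$3" "$4")"
}

# Example; piping into the shell is one possible way to run it:
# add_peer_command 7 slave3 2181 /hbase-2 | hbase shell
```

For the cluster 2 settings in this post, the printed command matches the add_peer call used above.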

3.Once you have a peer, you need to enable replication on your column families. One way to do it is to alter the table and to set the scope like this:

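      disable 'your_table'
      alter 'your_table', {NAME => 'family_name', REPLICATION_SCOPE => '1'}
      enable 'your_table'

REPLICATION_SCOPE is a scope flag, not the peer ID: 0 keeps the family local and 1 replicates it. The table must already exist on both clusters with the same structure.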
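
When several column families need replication, the disable, alter, and enable sequence can be generated rather than typed. This is a sketch only: the function name is hypothetical, and it uses scope '1' (global), the value used in the HBase documentation.

```shell
#!/bin/sh
# Print the HBase shell commands that set REPLICATION_SCOPE => '1' on each
# given column family of a table. Arguments: table name, then family names.
replication_scope_commands() {
    table=$1
    shift
    printf "disable '%s'\n" "$table"
    for family in "$@"; do
        printf "alter '%s', {NAME => '%s', REPLICATION_SCOPE => '1'}\n" "$table" "$family"
    done
    printf "enable '%s'\n" "$table"
}

# Example; review the output before piping it into the shell:
# replication_scope_commands your_table family_name | hbase shell
```

The table stays disabled while the alter commands run, so this is better done outside peak hours.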
 

4. Start and stop replication:

start_replication
stop_replication
