HBase Cluster Installation -- jared
These deployment notes were written in early 2014 and are now posted on 51cto.
*** Environment
1 master and 3 nodes:
master
node1
node2
node3
The underlying Hadoop cluster is already deployed; for details, see the following link:
http://ganlanqing.blog.51cto.com/6967482/1387210
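1. Software
JDK version: jdk-7u51-linux-x64.rpm
Hadoop version: hadoop-0.20.2.tar.gz
HBase version: hbase-0.90.5.tar.gz
ZooKeeper version: this cluster uses the ZooKeeper bundled with HBase
HBase is installed on top of the existing Hadoop cluster.
2. Environment variables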
export JAVA_HOME=/usr/java/jdk1.7.0_51
export HADOOP_INSTALL=/home/jared/hadoop
export HBASE_INSTALL=/home/jared/hbase
export PATH=$PATH:$HADOOP_INSTALL/bin:$HBASE_INSTALL/bin
3. Install and configure HBase
Downloading and unpacking HBase is omitted here.
The HBase root directory is /home/jared/hbase:
[jared@master ]$ pwd
/home/jared/hbase
On the master node, edit the configuration file hbase-env.sh and add the Java environment variables:
[jared@master conf]$ cat hbase-env.sh
………………
# Set environment variables here.
# The java implementation to use. Java 1.6 required.
export JAVA_HOME=/usr/java/jdk1.7.0_51/
………………
# Extra Java CLASSPATH elements. Optional.
export HBASE_CLASSPATH=/home/jared/hadoop/conf
………………
export HBASE_MANAGES_ZK=true # use the ZooKeeper bundled with HBase
Configure hbase-site.xml:
[jared@master conf]$ cat hbase-site.xml
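Notes:
The first property sets the directory where the HRegionServers store HBase data.
The second property puts HBase in fully distributed mode.
The third property sets the address of the HMaster.
The fourth property lists the servers of the ZooKeeper ensemble; note that there must be an odd number of them.
The last property sets the ZooKeeper data directory. Configure it if you do not want the data wiped when the machine reboots, because the default location is under /tmp.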
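The contents of hbase-site.xml are not reproduced in these notes. As a rough sketch only, the five properties described above could look like the file generated below. Every value is an assumption inferred from other parts of the notes, not the original file: the NameNode address master:9000 and the HMaster port 60000 are taken from the netstat output in section 9, the quorum hosts master,node1,node2 from the start-hbase.sh log, and the /hbase path and the ZooKeeper data directory are purely illustrative. The script writes to a scratch file, so no real conf directory is touched.

```shell
#!/bin/sh
# Sketch only: write an example hbase-site.xml to a scratch file.
# All values below are assumptions, not the author's original settings.
conf_file="${TMPDIR:-/tmp}/hbase-site.example.xml"

cat > "$conf_file" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- 1. where the region servers store HBase data (assumed HDFS URI) -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <!-- 2. run in fully distributed mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- 3. HMaster address (assumed host:port) -->
  <property>
    <name>hbase.master</name>
    <value>master:60000</value>
  </property>
  <!-- 4. ZooKeeper quorum hosts; keep the count odd -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,node1,node2</value>
  </property>
  <!-- 5. ZooKeeper data directory (hypothetical path outside /tmp) -->
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/jared/zookeeper</value>
  </property>
</configuration>
EOF

# Pull the quorum value back out and warn if the host count is even.
quorum=$(sed -n '/hbase.zookeeper.quorum/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}' "$conf_file")
quorum_size=$(printf '%s\n' "$quorum" | tr ',' '\n' | grep -c .)
if [ $((quorum_size % 2)) -eq 0 ]; then
  echo "warning: quorum has $quorum_size hosts; use an odd number" >&2
fi
echo "wrote $conf_file with a quorum of $quorum_size hosts"
```

After reviewing the values against the real cluster, the generated file could be copied to /home/jared/hbase/conf/hbase-site.xml.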
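Add the region server nodes to the regionservers file to configure the HRegionServers (prerequisite: hostname resolution already works across the cluster):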
[jared@master conf]$ cat regionservers
node1
node2
node3
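4. Replace the JAR
HBase ships with its own Hadoop core JAR in lib. Replace it with the JAR from the Hadoop version the cluster actually runs, so that HBase and Hadoop use the same version.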
[jared@master lib]$ pwd
/home/jared/hbase/lib
[jared@master lib]$ mv hadoop-core-0.20-append-r1056497.jar hadoop-core-0.20-append-r1056497.jar.sav
[jared@master lib]$ cp ../../hadoop/hadoop-0.20.2-core.jar .
[jared@master lib]$ ls
activation-1.1.jar commons-net-1.4.1.jar jasper-compiler-5.5.23.jar jetty-util-6.1.26.jar slf4j-api-1.5.8.jar
asm-3.1.jar core-3.1.1.jar jasper-runtime-5.5.23.jar jruby-complete-1.6.0.jar slf4j-log4j12-1.5.8.jar
avro-1.3.3.jar guava-r06.jar jaxb-api-2.1.jar jsp-2.1-6.1.14.jar stax-api-1.0.1.jar
commons-cli-1.2.jar hadoop-0.20.2-core.jar jaxb-impl-2.1.12.jar jsp-api-2.1-6.1.14.jar thrift-0.2.0.jar
commons-codec-1.4.jar hadoop-core-0.20-append-r1056497.jar.sav jersey-core-1.4.jar jsr311-api-1.1.1.jar xmlenc-0.52.jar
commons-el-1.0.jar jackson-core-asl-1.5.5.jar jersey-json-1.4.jar log4j-1.2.16.jar zookeeper-3.3.2.jar
commons-httpclient-3.1.jar jackson-jaxrs-1.5.5.jar jersey-server-1.4.jar protobuf-java-2.3.0.jar
commons-lang-2.5.jar jackson-mapper-asl-1.4.2.jar jettison-1.1.jar ruby
commons-logging-1.1.1.jar jackson-xc-1.5.5.jar
Change the permissions so that the JAR is executable:
[jared@master lib]$ chmod 755 hadoop-0.20.2-core.jar
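5. Copy the HBase installation and configuration to the other 3 nodes so that the hbase directory is identical on all four nodes.
If the Hadoop cluster has not been started yet, start it first. The point is to make sure the file system (HDFS) is up; start-all is not required, starting only the file system is enough.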
[jared@master ~]$ scp -r hbase node1:/home/jared
[jared@master ~]$ scp -r hbase node2:/home/jared
[jared@master ~]$ scp -r hbase node3:/home/jared
[jared@master ~]$
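The three scp commands above can also be written as a loop, which is easier to extend when nodes are added. This is only a sketch: NODES, DEST, DRY_RUN and copy_hbase are illustrative names, and the script defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch: copy the hbase directory to every node listed in NODES.
# NODES, DEST and DRY_RUN are illustrative names, not part of HBase.
NODES="node1 node2 node3"
DEST="/home/jared"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

copy_hbase() {
  for node in $NODES; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "scp -r hbase $node:$DEST"
    else
      scp -r hbase "$node:$DEST" || return 1
    fi
  done
}

copy_hbase
```

Running it with DRY_RUN=0 from /home/jared on master would perform the actual copy, assuming the passwordless ssh already set up for Hadoop.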
Add the HBase environment variables (on all nodes):
[root@master ~]# vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_51
export HADOOP_INSTALL=/home/jared/hadoop
export HBASE_INSTALL=/home/jared/hbase
export PATH=$PATH:$HADOOP_INSTALL/bin:$HBASE_INSTALL/bin
[root@master ~]# source /etc/profile
[jared@master ~]$ source /etc/profile
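6. Create the HBase home directory
[jared@master ~]$ hadoop fs -mkdir hbase
7. Start/stop the HBase cluster
With the configuration above in place, the HBase cluster can be started. Before starting it, check that the Hadoop cluster is already running: Hadoop must be started before HBase, because Hadoop (HDFS) is the host that HBase stores its data on.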
[jared@master ~]$ start-hbase.sh
node1: starting zookeeper, logging to /home/jared/hbase/bin/../logs/hbase-jared-zookeeper-node1.out # the HQuorumPeer process on node1 has started
node2: starting zookeeper, logging to /home/jared/hbase/bin/../logs/hbase-jared-zookeeper-node2.out
master: starting zookeeper, logging to /home/jared/hbase/bin/../logs/hbase-jared-zookeeper-master.out
starting master, logging to /home/jared/hbase/bin/../logs/hbase-jared-master-master.out # the HMaster process has started
node2: regionserver running as process 12153. Stop it first. # region servers
node1: regionserver running as process 41086. Stop it first.
node3: regionserver running as process 7268. Stop it first.
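The ordering rule (Hadoop first, then HBase) can be guarded with a small check on master before start-hbase.sh is called. A minimal sketch, assuming jps is on the PATH; has_daemon and start_hbase_if_hdfs_up are hypothetical helper names, and the demo at the end only parses an abridged sample of jps output without starting anything.

```shell
#!/bin/sh
# Sketch: refuse to start HBase unless a NameNode shows up in jps output.
# has_daemon is a hypothetical helper; it reads jps-style "PID Name" lines on stdin.
has_daemon() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

start_hbase_if_hdfs_up() {
  if jps | has_daemon NameNode; then
    start-hbase.sh
  else
    echo "NameNode is not running; start Hadoop (HDFS) first" >&2
    return 1
  fi
}

# Demo on an abridged jps listing (nothing is started here).
sample="5196 SecondaryNameNode
5046 NameNode
5263 JobTracker"
if printf '%s\n' "$sample" | has_daemon NameNode; then
  echo "NameNode found"
fi
```

Matching on the second field exactly keeps SecondaryNameNode from being mistaken for NameNode.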
Process status on master:
[jared@master ~]$ /usr/java/jdk1.7.0_51/bin/jps
5196 SecondaryNameNode
6858 HMaster # HBase cluster master process
5046 NameNode
6989 Jps
6817 HQuorumPeer # ZooKeeper ensemble process
5263 JobTracker
Process status on node1:
[jared@node1 ~]$ /usr/java/jdk1.7.0_51/bin/jps
43728 DataNode
43818 TaskTracker
43901 HQuorumPeer # ZooKeeper ensemble process
45225 HRegionServer # HBase region server
45318 Jps
[jared@node1 ~]$
Process status on node2:
[jared@node2 ~]$ /usr/java/jdk1.7.0_51/bin/jps
23991 HRegionServer # HBase region server
14792 DataNode
14883 TaskTracker
14966 HQuorumPeer # ZooKeeper ensemble process
24119 Jps
[jared@node2 ~]$
Process status on node3:
[jared@node3 ~]$ /usr/java/jdk1.7.0_51/bin/jps
9603 TaskTracker
9513 DataNode
11245 HRegionServer # HBase region server
11332 Jps
[jared@node3 ~]$
8. Test from the HBase shell
[jared@master ntp]$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
hbase(main):001:0> list
TABLE
0 row(s) in 0.7430 seconds
Check the server status:
hbase(main):002:0> status
3 servers, 0 dead, 0.6667 average load
Check the HBase version:
hbase(main):003:0> version
0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
hbase(main):004:0>
[jared@master ~]$ stop-hbase.sh
stopping hbase........
node2: stopping zookeeper.
node1: stopping zookeeper.
master: stopping zookeeper.
[jared@master ~]$
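9. Problems encountered:
Problem 1: when stopping HBase, the shutdown may be very slow or may hang at "stopping hbase.........................." indefinitely.
Problem 2: while deploying HBase on the cluster, the HRegionServer on two of the nodes was unstable.
Solution to problem 1:
Kill the corresponding HBase process with kill -9, as follows: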
[jared@master ~]$ netstat -lntp
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN -
tcp 0 0 :::55533 :::* LISTEN 5046/java
tcp 0 0 :::50030 :::* LISTEN 5263/java
tcp 0 0 :::3888 :::* LISTEN 6817/java
tcp 0 0 :::45781 :::* LISTEN 5196/java
tcp 0 0 :::50070 :::* LISTEN 5046/java
tcp 0 0 :::22 :::* LISTEN -
tcp 0 0 ::1:25 :::* LISTEN -
tcp 0 0 ::ffff:192.168.255.25:60000 :::* LISTEN 6858/java
tcp 0 0 :::2181 :::* LISTEN 6817/java
tcp 0 0 :::39750 :::* LISTEN 5263/java
tcp 0 0 ::ffff:192.168.255.25:9000 :::* LISTEN 5046/java
tcp 0 0 ::ffff:192.168.255.25:9001 :::* LISTEN 5263/java
tcp 0 0 :::60010 :::* LISTEN 6858/java
tcp 0 0 :::50090 :::* LISTEN 5196/java
[jared@master ~]$ kill -9 6858
Check again:
[jared@master ~]$ netstat -lntp
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN -
tcp 0 0 :::55533 :::* LISTEN 5046/java
tcp 0 0 :::50030 :::* LISTEN 5263/java
tcp 0 0 :::45781 :::* LISTEN 5196/java
tcp 0 0 :::50070 :::* LISTEN 5046/java
tcp 0 0 :::22 :::* LISTEN -
tcp 0 0 ::1:25 :::* LISTEN -
tcp 0 0 :::39750 :::* LISTEN 5263/java
tcp 0 0 ::ffff:192.168.255.25:9000 :::* LISTEN 5046/java
tcp 0 0 ::ffff:192.168.255.25:9001 :::* LISTEN 5263/java
tcp 0 0 :::50090 :::* LISTEN 5196/java
[jared@master ~]$ /usr/java/jdk1.7.0_51/bin/jps
5196 SecondaryNameNode
15134 Jps
5046 NameNode
5263 JobTracker
Restart with start-hbase.sh, then run hbase shell and list; after that everything works normally.
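Instead of reading the PID out of the netstat listing by eye, it can be looked up by name from jps. The sketch below is one possible way to do that and is not the procedure used in these notes: pid_of and stop_hmaster are hypothetical helpers, the 10-second grace period is an assumption, and the function tries a plain kill before falling back to kill -9. The demo only parses sample jps output and kills nothing.

```shell
#!/bin/sh
# Sketch: look up a daemon's PID from jps-style output instead of reading netstat by eye.
# pid_of and stop_hmaster are hypothetical helper names.
pid_of() {
  awk -v name="$1" '$2 == name { print $1 }'
}

stop_hmaster() {
  pid=$(jps | pid_of HMaster)
  [ -n "$pid" ] || { echo "HMaster is not running"; return 0; }
  kill "$pid"                           # polite SIGTERM first
  sleep 10                              # assumed grace period
  if kill -0 "$pid" 2>/dev/null; then   # still alive?
    kill -9 "$pid"                      # last resort, as in the notes
  fi
}

# Demo on jps output like the one recorded on master (nothing is killed here).
sample="5196 SecondaryNameNode
6858 HMaster
5046 NameNode
6817 HQuorumPeer"
printf '%s\n' "$sample" | pid_of HMaster   # prints 6858
```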
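Solution to problem 2:
Cause: HBase is deployed on top of a fully distributed Hadoop cluster, so clock synchronization matters. If the nodes' clocks disagree, the HRegionServer service can become unstable.
Fix: install the NTP service on master and synchronize the nodes against it manually, so that every node's time matches the master's.
Details: the NTP server usually ships with the system, so it normally does not need to be installed. If it is missing: # yum install ntp
NTP server configuration
[root@master ~]# vim /etc/ntp.conf
# add: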
restrict 220.130.158.71
restrict 59.124.196.83
restrict 59.124.196.84
……
restrict 192.168.255.0 mask 255.255.255.0 nomodify <-- allow requests from the LAN
……
server 220.130.158.71 prefer <-- give this host top priority
server 59.124.196.83
server 59.124.196.84
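The following entries make the NTP server synchronize with itself: if none of the servers defined in ntp.conf is reachable, the local clock is used as the time source served to NTP clients.
server 127.127.1.0
fudge 127.127.1.0 stratum 8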
[root@master ~]# /etc/init.d/ntpd restart
[root@master ~]# chkconfig ntpd on
[root@master ~]# netstat -lntpu|grep ntp # check the listening port: 123
udp 0 0 192.168.11.162:123 0.0.0.0:* 21203/ntpd
udp 0 0 192.168.255.25:123 0.0.0.0:* 21203/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 21203/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 21203/ntpd
udp 0 0 fe80::20c:29ff:fe33:edb0:123 :::* 21203/ntpd
udp 0 0 fe80::20c:29ff:fe33:edba:123 :::* 21203/ntpd
udp 0 0 ::1:123 :::* 21203/ntpd
udp 0 0 :::123 :::* 21203/ntpd
# The result is not visible immediately after startup; wait a few minutes.
[root@master ~]# ntpstat
synchronised to NTP server (220.130.158.71) at stratum 3
time correct to within 153 ms
polling server every 128 s # seeing this output means it is working
Note: the NTP server takes a while to start up, so wait until ntpstat reports that it is synchronised.
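Rather than re-running ntpstat by hand, the wait can be scripted. ntpstat exits with status 0 once the clock is synchronised, so a generic retry loop is enough. A minimal sketch; retry_until_ok is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.
# retry_until_ok is a hypothetical helper: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND...
retry_until_ok() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# On the NTP server one might run (up to 30 tries, 10 seconds apart):
#   retry_until_ok 30 10 ntpstat && echo "NTP synchronised"
```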
NTP client configuration
# vim /etc/ntp.conf
restrict 192.168.255.25
server 192.168.255.25
# /etc/init.d/ntpd stop
To correct the time manually, run this on the client:
# ntpdate 192.168.255.25
Use date to check the adjusted time.
# /etc/init.d/ntpd start
# chkconfig ntpd on
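To confirm that the nodes really agree with master after the NTP setup, the clocks can be compared over ssh. This is only a sketch under stated assumptions: abs_diff and check_node are hypothetical helpers, MAX_SKEW of 30 seconds is an assumed threshold, and passwordless ssh from master to the nodes is taken for granted because Hadoop already needs it.

```shell
#!/bin/sh
# Sketch: compare a node's clock with the local (master) clock.
# abs_diff, check_node and MAX_SKEW are illustrative; the 30-second limit is an assumption.
MAX_SKEW=30

abs_diff() {
  d=$(( $1 - $2 ))
  [ "$d" -lt 0 ] && d=$(( -d ))
  echo "$d"
}

check_node() {
  node=$1
  remote=$(ssh "$node" date +%s) || return 2
  skew=$(abs_diff "$(date +%s)" "$remote")
  if [ "$skew" -gt "$MAX_SKEW" ]; then
    echo "$node: clock differs by ${skew}s, run ntpdate on it"
    return 1
  fi
  echo "$node: ok (${skew}s)"
}

# Usage on master:
#   for n in node1 node2 node3; do check_node "$n"; done
abs_diff 1394098805 1394098840   # prints 35
```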
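Summary: the HBase installation is now complete. During the steps, note that the files to be overwritten differ between versions, and pay attention to the version compatibility requirements. If you install on VM virtual machines, then after a reboot the HMaster, HQuorumPeer, and HRegionServer processes may fail to start to varying degrees. In that case, first stop all cluster processes with $HBASE_INSTALL/bin/stop-hbase.sh, then start the cluster again with $HBASE_INSTALL/bin/start-hbase.sh. All processes must be running normally before you operate on the database; otherwise operations are rejected with errors.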
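10. Testing HBase
Below are some basic HBase Shell operations. The commonly used HBase Shell commands are:
Operation                                 Command expression
Create a table                            create 'table name', 'column family 1', 'column family 2', 'column family N'
Add a record                              put 'table name', 'row name', 'column name:', 'value'
View a record                             get 'table name', 'row name'
Count the records in a table              count 'table name'
Delete a record                           delete 'table name', 'row name', 'column name'
Delete a table                            the table must be disabled before it can be dropped: first disable 'table name', then drop 'table name'
View all records                          scan "table name"
View all data of one column in a table    scan "table name", ['column name:']
Update a record                           write the value again with put; it overwrites the old one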
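I. General operations
[root@master ~]# hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
List all tables: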
hbase(main):001:0> list
TABLE
0 row(s) in 0.9160 seconds
II. DDL operations
Create a table:
hbase(main):002:0> create 'member','member_id','address','info'
0 row(s) in 1.4540 seconds
hbase(main):003:0> list
TABLE
member
1 row(s) in 0.0130 seconds
Get the table description:
hbase(main):004:0> describe 'member'
DESCRIPTION ENABLED
{NAME => 'member', FAMILIES => [{NAME => 'address', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', true
COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'fa
lse', BLOCKCACHE => 'true'}, {NAME => 'info', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPR
ESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'member_id', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRE
SSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', B
LOCKCACHE => 'true'}]}
1 row(s) in 0.0360 seconds
hbase(main):005:0>
III. DML operations
Insert data:
put 'member','lijie','info:age','25'
put 'member','lijie','info:birthday','1989-05-04'
put 'member','lijie','info:favorite','music'
put 'member','lijie','info:company','chinacache'
put 'member','lijie','address:contry','china'
put 'member','lijie','address:province','beijing'
put 'member','lijie','address:city','beijing'
put 'member','jared','info:age','24'
put 'member','jared','info:birthday','1990-06-04'
put 'member','jared','info:favorite','movie'
put 'member','jared','info:company','chinacache'
put 'member','jared','address:contry','china'
put 'member','jared','address:province','henan'
put 'member','jared','address:city','shangqiu'
put 'member','jared','address:town','huabao'
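Typing many put statements one at a time is tedious. One option is to save them in a file and pass the file to the HBase shell. The sketch below assumes the shell accepts a script path as its argument, as the HBase documentation describes for later releases; this was not checked against 0.90.5. The file name is hypothetical and only a few of the puts are repeated.

```shell
#!/bin/sh
# Sketch: put the statements in a file and hand the file to the HBase shell.
# The file name is hypothetical; only a few of the puts are repeated here.
script_file="${TMPDIR:-/tmp}/member_puts.txt"

cat > "$script_file" <<'EOF'
put 'member','lijie','info:age','25'
put 'member','lijie','info:birthday','1989-05-04'
put 'member','jared','info:age','24'
put 'member','jared','address:city','shangqiu'
exit
EOF

# Only run it when an hbase command is actually available on this machine.
if command -v hbase >/dev/null 2>&1; then
  hbase shell "$script_file"
else
  echo "hbase not found; commands left in $script_file"
fi
```

The trailing exit line makes the shell quit after the last statement instead of dropping to the interactive prompt.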
Query data:
hbase(main):006:0> get 'member','lijie'
COLUMN CELL
address:city timestamp=1394094913527, value=beijing
address:contry timestamp=1394096815605, value=china
address:province timestamp=1394094903544, value=beijing
info:age timestamp=1394097288602, value=25
info:birthday timestamp=1394097444869, value=1989-05-04
info:company timestamp=1394097096504, value=chinacache
info:favorite timestamp=1394097238819, value=music
7 row(s) in 0.0280 seconds
Get all data in one column family for a given row id:
hbase(main):007:0> get 'member','lijie','info'
COLUMN CELL
info:age timestamp=1394097288602, value=25
info:birthday timestamp=1394097444869, value=1989-05-04
info:company timestamp=1394097096504, value=chinacache
info:favorite timestamp=1394097238819, value=music
4 row(s) in 0.0260 seconds
Get the data of a single column in a column family for a given row id:
hbase(main):008:0> get 'member','lijie','info:age'
COLUMN CELL
info:age timestamp=1394097288602, value=25
1 row(s) in 0.0120 seconds
hbase(main):009:0>
Update data
Change lijie's age to 99:
hbase(main):009:0> put 'member','lijie','info:age','99'
0 row(s) in 0.0090 seconds
hbase(main):010:0> get 'member','lijie','info:age'
COLUMN CELL
info:age timestamp=1394098805418, value=99
1 row(s) in 0.0120 seconds
Use the timestamp to fetch a specific version of the data:
hbase(main):011:0> get 'member','lijie',{COLUMN=>'info:age',TIMESTAMP=>1394098805418}
COLUMN CELL
info:age timestamp=1394098805418, value=99
1 row(s) in 0.0110 seconds
Full table scan:
hbase(main):075:0> scan 'member'
ROW COLUMN+CELL
jared column=address:city, timestamp=1394096657293, value=shangqiu
jared column=address:contry, timestamp=1394096576288, value=china
jared column=address:province, timestamp=1394096631520, value=henan
jared column=address:town, timestamp=1394096718126, value=huabao
jared column=info:age, timestamp=1394097298985, value=24
jared column=info:birthday, timestamp=1394096371515, value=1990-06-04
jared column=info:company, timestamp=1394096543377, value=chinacache
jared column=info:favorite, timestamp=1394096451321, value=movie
lijie column=address:city, timestamp=1394094913527, value=beijing
lijie column=address:contry, timestamp=1394096815605, value=china
lijie column=address:province, timestamp=1394094903544, value=beijing
lijie column=info:age, timestamp=1394098805418, value=99
lijie column=info:birthday, timestamp=1394097444869, value=1989-05-04
lijie column=info:company, timestamp=1394097096504, value=chinacache
lijie column=info:favorite, timestamp=1394097238819, value=music
2 row(s) in 0.0410 seconds
Delete the 'info:age' field of the row with id lijie:
hbase(main):058:0> delete 'member','lijie','info:age'
0 row(s) in 0.0080 seconds
hbase(main):059:0> get 'member','lijie','info:age'
COLUMN CELL
0 row(s) in 0.0110 seconds
Count the rows in the table:
hbase(main):065:0> count 'member'
2 row(s) in 0.0100 seconds
hbase(main):066:0>
Delete an entire row:
hbase(main):066:0> deleteall 'member','lijie'
0 row(s) in 0.0060 seconds
hbase(main):067:0> count 'member'
1 row(s) in 0.0120 seconds
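Delete operations
These are DDL operations.
Deleting a column family uses alter, disable, and enable.
We created 3 column families earlier, but the member_id family turns out to be redundant because it is just the primary key (the row key), so we delete it.
(1) The table must be disabled before a column family can be deleted: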
hbase(main):012:0> disable 'member'
0 row(s) in 2.1060 seconds
hbase(main):013:0> alter 'member',{NAME=>'member_id',METHOD=>'delete'}
0 row(s) in 0.0850 seconds
hbase(main):014:0> describe 'member'
DESCRIPTION ENABLED
{NAME => 'member', FAMILIES => [{NAME => 'address', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', true
VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'fa
lse', BLOCKCACHE => 'true'}, {NAME => 'info', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSI
ONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}]}
1 row(s) in 0.0170 seconds
The column family has been deleted, so we re-enable the table:
hbase(main):015:0> enable 'member'
0 row(s) in 2.0540 seconds
(2) Drop a table
hbase(main):026:0> disable 'temp_table'
0 row(s) in 2.0590 seconds
hbase(main):027:0> drop 'temp_table'
0 row(s) in 1.1070 seconds
(3) Check whether a table exists
hbase(main):029:0> exists 'member'
Table member does exist
0 row(s) in 0.0200 seconds
(4) Check whether a table is enabled
hbase(main):030:0> is_enabled 'member'
true
0 row(s) in 0.0110 seconds
(5) Check whether a table is disabled
hbase(main):031:0> is_disabled 'member'
false
0 row(s) in 0.0120 seconds
false
0 row(s) in 0.0120 seconds
hbase(main):032:0>
Add an 'info:age' field to the row id 'lijie' and increment it as a counter:
hbase(main):070:0> incr 'member','lijie','info:age'
COUNTER VALUE = 4123326933835972609
hbase(main):071:0> get 'member','lijie','info:age'
COLUMN CELL
info:age timestamp=1394106638150, value=99\x00\x00\x00\x18\x00\x01
1 row(s) in 0.0100 seconds
hbase(main):076:0> incr 'member','lijie','info:age'
COUNTER VALUE = 4123326933835972610
hbase(main):077:0> get 'member','lijie','info:age'
COLUMN CELL
info:age timestamp=1394106638150, value=99\x00\x00\x00\x18\x00\x02
1 row(s) in 0.0100 seconds
Get the current counter value:
hbase(main):078:0> get_counter 'member','lijie','info:age'
COUNTER VALUE = 4123326933835972610
hbase(main):079:0>
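The huge counter value is easier to understand once the cell bytes are decoded. HBase counters are stored as 8-byte big-endian longs, while the earlier put stored the two-character string '99'. The displayed cell value starts with those two ASCII characters, which appears to be why the number is so large. The sketch below reads the eight displayed bytes as one integer; decode_counter is a hypothetical helper, not an HBase command.

```shell
#!/bin/sh
# Sketch: read eight hex bytes as one big-endian integer.
# decode_counter is a hypothetical helper, not an HBase command.
decode_counter() {
  hex=""
  for byte in "$@"; do
    hex="${hex}${byte}"
  done
  printf '%d\n' "0x${hex}"
}

# '9' is ASCII 0x39, so the cell value 99\x00\x00\x00\x18\x00\x01 is these bytes:
decode_counter 39 39 00 00 00 18 00 01   # prints 4123326933835972609
# After the second incr the last byte becomes 02:
decode_counter 39 39 00 00 00 18 00 02   # prints 4123326933835972610
```

Both results match the COUNTER VALUE lines in the transcript. For a counter that starts from 1, incr is better applied to a column that has not been written with put.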
Empty the whole table (a DML operation):
hbase(main):080:0> truncate 'member'
Truncating 'member' table (it may take a while):
- Disabling table...
- Dropping table...
- Creating table...
0 row(s) in 4.3430 seconds
As the output shows, HBase implements truncate by first disabling the table, then dropping it, and finally recreating it.