1) Download HBase
Download the release and place it under /data/soft on every server.
Unpack the archive:
root@master:/data/soft# tar zxvf hbase-0.90.0.tar.gz
Create a symlink:
root@master:/data/soft# ln -s hbase-0.90.0 hbase
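Since the unpack-and-link step has to be repeated on every server, it can be wrapped in a small function. This is a minimal sketch, not part of the original procedure: `install_hbase` is a hypothetical helper name, and it assumes the tarball unpacks into a directory named after the archive (as hbase-0.90.0.tar.gz does).

```shell
# Hypothetical helper: unpack an HBase tarball and point a stable "hbase" symlink at it.
# Usage: install_hbase <tarball> <target-dir>
install_hbase() {
    tarball="$1"
    target="$2"
    # Assumption: hbase-0.90.0.tar.gz unpacks into a directory called hbase-0.90.0
    version_dir="$(basename "$tarball" .tar.gz)"
    mkdir -p "$target" || return 1
    tar zxf "$tarball" -C "$target" || return 1
    # -sfn replaces an existing symlink instead of creating a new link inside the old target
    ( cd "$target" && ln -sfn "$version_dir" hbase )
}
```

For example, `install_hbase /data/soft/hbase-0.90.0.tar.gz /data/soft` reproduces the two commands above and can be re-run safely when upgrading to a newer tarball.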
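2) Configure HBase
Prerequisite: Hadoop is already installed. Unless noted otherwise, run these steps on the namenode.
1. Edit conf/hbase-env.sh to point HBase at the JDK, let HBase manage ZooKeeper, and set the log directory: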
export JAVA_HOME=/usr/local/jdk
export HBASE_MANAGES_ZK=true
export HBASE_LOG_DIR=/data/logs/hbase
2. Edit conf/hbase-site.xml:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>master:60000</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>slave-001,slave-002,slave-003</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/data/work/zookeeper</value>
    <description>Property from ZooKeeper's config zoo.cfg. The directory where the snapshot is stored.</description>
  </property>
</configuration>
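hbase.rootdir sets the directory HBase uses on HDFS; the host name is the machine running the HDFS namenode.
hbase.cluster.distributed set to true marks this as a fully distributed HBase cluster.
hbase.master sets the host name and port of the HBase master.
hbase.zookeeper.quorum lists the ZooKeeper hosts; an odd number of hosts is recommended.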
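The odd-number recommendation follows from ZooKeeper needing a majority of hosts to stay up: four hosts tolerate only one failure, the same as three. A minimal sketch of a pre-flight check on the quorum string is shown here; `quorum_size` and `quorum_is_odd` are hypothetical helper names, not part of HBase.

```shell
# Hypothetical helpers for sanity-checking the hbase.zookeeper.quorum value.

# Print the number of non-empty hosts in a comma-separated list.
quorum_size() {
    # One host per line, then count the non-empty lines.
    # "|| true" keeps the function's exit status 0 when the count is zero.
    printf '%s\n' "$1" | tr ',' '\n' | grep -c . || true
}

# Succeed only if the list holds an odd, non-zero number of hosts.
quorum_is_odd() {
    n="$(quorum_size "$1")"
    [ "$n" -gt 0 ] && [ $((n % 2)) -eq 1 ]
}
```

For instance, `quorum_is_odd "slave-001,slave-002,slave-003" || echo "use an odd number of ZooKeeper hosts"` prints nothing for the value used above.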
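3. Edit conf/hdfs-site.xml in the Hadoop directory (the property name really is spelled "xcievers"):
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>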
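4. Copy the Hadoop jar from your Hadoop installation into HBase's lib directory and delete the Hadoop jar that shipped there:
Bundled jar to remove: hadoop-core-0.20-append-r1056497.jar
Replacement jar to copy in: hadoop-0.20.2-core.jar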
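The jar swap can be sketched as a small function so the same step runs identically on each node. `swap_hadoop_jar` is a hypothetical helper, and the glob `hadoop-core-*.jar` is an assumption based on the bundled jar name listed above.

```shell
# Hypothetical helper: replace the Hadoop jar bundled with HBase by the cluster's own jar.
# Usage: swap_hadoop_jar <hbase-home> <path-to-cluster-hadoop-core-jar>
swap_hadoop_jar() {
    hbase_lib="$1/lib"
    cluster_jar="$2"
    # Refuse to delete anything unless the replacement jar actually exists
    [ -f "$cluster_jar" ] || { echo "missing $cluster_jar" >&2; return 1; }
    # Remove the bundled jar so two Hadoop versions never share the classpath.
    # Assumption: the bundled jar matches hadoop-core-*.jar, as in this release.
    rm -f "$hbase_lib"/hadoop-core-*.jar
    cp "$cluster_jar" "$hbase_lib"/
}
```

The replacement jar name (hadoop-0.20.2-core.jar) does not match the glob, so running the function a second time leaves the new jar in place.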
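5. Edit conf/regionservers
List every datanode in this file, one host per line; it plays the same role as the slaves file in Hadoop.
6. Copy the configured HBase directory to all nodes.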
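One way to do the copy is to loop over conf/regionservers. The sketch below assumes rsync over passwordless SSH is available and that every node uses the same install path; `sync_hbase` and the `DRY_RUN` variable are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: push the HBase directory to every host listed in a regionservers file.
# Usage: sync_hbase <hbase-dir> <regionservers-file>
# Set DRY_RUN=1 to print the rsync commands instead of running them.
sync_hbase() {
    src="$1"
    hosts_file="$2"
    # "|| [ -n "$host" ]" also handles a last line without a trailing newline
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines and comment lines
        case "$host" in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "rsync -a $src/ $host:$src/"
        else
            # Redirect stdin so ssh (started by rsync) cannot swallow the host list
            rsync -a "$src/" "$host:$src/" < /dev/null
        fi
    done < "$hosts_file"
}
```

A dry run such as `DRY_RUN=1 sync_hbase /data/soft/hbase-0.90.0 /data/soft/hbase/conf/regionservers` shows what would be copied before anything is touched. Hosts that appear only in hbase.zookeeper.quorum or run only the master would need to be added separately.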
3) Start HBase
$ ./bin/start-hbase.sh
4) HBase's built-in web UI
http://master:60010/
5) Test
1. Open the HBase shell
./bin/hbase shell
2. Create a table and insert three rows
hbase(main):003:0> create 'test', 'cf'
0 row(s) in 1.2200 seconds
hbase(main):003:0> list 'test'
test
1 row(s) in 0.0550 seconds
hbase(main):004:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0560 seconds
hbase(main):005:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0370 seconds
hbase(main):006:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0450 seconds
3. View the inserted data
hbase(main):007:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1288380727188, value=value1
row2 column=cf:b, timestamp=1288380738440, value=value2
row3 column=cf:c, timestamp=1288380747365, value=value3
3 row(s) in 0.0590 seconds
4. Read a single row
hbase(main):008:0> get 'test', 'row1'
COLUMN CELL
cf:a timestamp=1288380727188, value=value1
1 row(s) in 0.0400 seconds
5. Disable and drop the table
hbase(main):012:0> disable 'test'
0 row(s) in 1.0930 seconds
hbase(main):013:0> drop 'test'
0 row(s) in 0.0770 seconds
6. Exit
hbase(main):014:0> exit
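The same session can be replayed without typing by feeding the commands to the shell on standard input. This is a sketch under the assumption that the HBase shell in your release reads commands from stdin; `smoke_test_commands` is a hypothetical helper that only prints the command list used above.

```shell
# Hypothetical helper: print the smoke-test commands from the walkthrough, one per line.
smoke_test_commands() {
    cat <<'EOF'
create 'test', 'cf'
put 'test', 'row1', 'cf:a', 'value1'
put 'test', 'row2', 'cf:b', 'value2'
put 'test', 'row3', 'cf:c', 'value3'
scan 'test'
get 'test', 'row1'
disable 'test'
drop 'test'
exit
EOF
}

# Usage, from the HBase directory on a node of a running cluster:
#   smoke_test_commands | ./bin/hbase shell
```

Because the table is dropped at the end, the check can be repeated after each configuration change.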