Environment:
Windows 7 SP1
VirtualBox 4.1.4 r74291
Ubuntu 11.10
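I. Requirements
Install Java 1.6, Hadoop 0.20.x and ZooKeeper.
This installation uses a single virtual machine (192.168.1.102) that already has Hadoop 0.20.205.0 and ZooKeeper 3.4.3 installed (for the ZooKeeper setup, see the separate post "ZooKeeper Installation Process").
The HBase version installed here is 0.92.0.
After a successful installation and startup, the virtual machine runs the following Java processes: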
NameNode
DataNode
SecondaryNameNode
TaskTracker
JobTracker
HMaster (hbase)
HRegionServer (hbase)
QuorumPeerMain (zookeeper)
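To check this list later without reading the jps output by eye, the expected names can be compared against what jps prints. This is only a minimal sketch and not part of the original setup; the function name missing_daemons is made up for illustration.

```shell
# Print every expected daemon that is absent from jps-style output on stdin.
# Each input line is expected to look like "<pid> <ClassName>".
missing_daemons() {
    jps_output=$(cat)
    for name in NameNode DataNode SecondaryNameNode TaskTracker JobTracker \
                HMaster HRegionServer QuorumPeerMain; do
        # Compare the second field exactly, so that "NameNode" is not
        # satisfied by a "SecondaryNameNode" line.
        if ! printf '%s\n' "$jps_output" |
             awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}
```

Usage: run `jps | missing_daemons` once everything has been started. No output means that all eight processes were found.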
II. Installing HBase
1. Download HBase
wget http://mirror.bit.edu.cn/apache//hbase/stable/hbase-0.92.0-security.tar.gz
Download links for other versions (the stable release is recommended): http://www.apache.org/dyn/closer.cgi/hbase/
2. Extract
tar -xf hbase-0.92.0-security.tar.gz
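Place the extracted hbase-0.92.0-security directory under /home/hadooptest/.
3. Edit the configuration
HBase runs on top of HDFS. Its configuration mainly involves three files in the conf directory: hbase-env.sh, hbase-site.xml and regionservers.
① Edit hbase-env.sh
# Settings that must be changed: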
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HBASE_CLASSPATH=/home/hadooptest/hadoop-0.20.205.0/conf
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export HBASE_MANAGES_ZK=true
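Here JAVA_HOME is the Java installation path, and HBASE_CLASSPATH is Hadoop's configuration directory (conf), which lets HBase pick up the HDFS client settings.
② Edit hbase-site.xml
Change its contents to: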
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://namenode/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.master.port</name>
<value>60000</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!--
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadooptest/zookeeper-3.4.3/zookeeperdir/zookeeper-data</value>
</property>
-->
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zookeeper</value>
</property>
</configuration>
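The value of hbase.rootdir must agree with the fs.default.name setting of HDFS (same scheme, host and port), followed by the root directory for HBase, here /hbase. The hbase.zookeeper.property.dataDir property (commented out in this configuration) changes the directory where ZooKeeper stores its data. The default is a location under /tmp, which the operating system may clear on reboot. I have already set dataDir to /home/hadooptest/zookeeper-3.4.3/zookeeperdir/zookeeper-data in ZooKeeper's conf/zoo.cfg (see the post "ZooKeeper Installation Process"), so the property is not needed here. hbase.zookeeper.quorum lists all ZooKeeper hosts. Its value here is zookeeper, a host name that /etc/hosts maps to 192.168.1.102 (this machine). To specify several ZooKeeper hosts, separate them with commas.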
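Since a mismatch between hbase.rootdir and fs.default.name is easy to overlook, the two values can be compared with a small script. The sketch below assumes that fs.default.name is set in Hadoop's conf/core-site.xml and that <name> and <value> each sit on their own line, as in the file above. The function names get_property and rootdir_matches_fs are hypothetical helpers, not part of Hadoop or HBase.

```shell
# Print the <value> of the named property from a Hadoop-style XML file.
# Assumes <name> and <value> are on separate lines. This is a plain text
# scan, so properties inside XML comments are seen as well.
get_property() {
    awk -v want="$2" '
        /<name>/  { gsub(/.*<name>|<\/name>.*/, ""); name = $0; next }
        /<value>/ && name == want {
            gsub(/.*<value>|<\/value>.*/, ""); print; exit
        }
    ' "$1"
}

# Succeed when hbase.rootdir begins with the fs.default.name URI.
#   $1 = core-site.xml, $2 = hbase-site.xml
rootdir_matches_fs() {
    fs=$(get_property "$1" fs.default.name)
    root=$(get_property "$2" hbase.rootdir)
    [ -n "$fs" ] && [ -n "$root" ] || { echo "property not found" >&2; return 1; }
    case "$root" in
        "${fs%/}"/*) return 0 ;;
        *) echo "mismatch: fs.default.name=$fs hbase.rootdir=$root" >&2
           return 1 ;;
    esac
}
```

With the paths used in this post, the call would be `rootdir_matches_fs /home/hadooptest/hadoop-0.20.205.0/conf/core-site.xml $HBASE_HOME/conf/hbase-site.xml`.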
③ Edit regionservers
The file was empty to begin with. Add the following to it:
regionserver
regionserver is mapped to 192.168.1.102 in /etc/hosts. If there are several region servers, add their host names as well, one per line.
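Every name in regionservers needs a matching /etc/hosts entry (or some other way to resolve it). As a rough sketch, the two files can be cross-checked as follows. The function name unresolved_regionservers is made up here, and skipping lines that start with # is a convenience of this sketch rather than a documented feature of the file format.

```shell
# Print each host from a regionservers file that has no entry in the given
# hosts file. Blank lines and lines starting with '#' are skipped.
#   $1 = regionservers file, $2 = hosts file
unresolved_regionservers() {
    regionservers_file=$1
    hosts_file=$2
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$regionservers_file" |
    while read -r host; do
        awk -v h="$host" '
            { sub(/#.*/, "") }    # drop trailing comments in the hosts file
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit !found }
        ' "$hosts_file" || printf '%s\n' "$host"
    done
}
```

Usage: `unresolved_regionservers $HBASE_HOME/conf/regionservers /etc/hosts`. Any name it prints still needs a hosts entry.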
④ To make the startup scripts convenient to use, set environment variables in /etc/profile
Add the following to /etc/profile:
export HBASE_HOME=/home/hadooptest/hbase-0.92.0-security
PATH=$HBASE_HOME/bin:$PATH
export PATH
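⑤ Configuration is complete. To set up several machines, copy this configured hbase directory to the same path on each of them. /etc/hosts must of course be updated accordingly.
III. Starting and testing HBase
1. Start ZooKeeper and HBase
According to some documentation, running bin/start-hbase.sh directly starts ZooKeeper first, then the master, and finally the region servers.
In my case, possibly because of a configuration problem or a version incompatibility, start-hbase.sh started only the master and the region server, not ZooKeeper. The master and the region server then kept retrying the ZooKeeper connection and exited once it timed out. So ZooKeeper has to be started manually first.
① Start ZooKeeper
Run zkServer.sh start: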
hadooptest@tigerchan-VirtualBox:~$ zkServer.sh start
JMX enabled by default
Using config: /home/hadooptest/zookeeper-3.4.3/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
② Start HBase
Run start-hbase.sh:
hadooptest@tigerchan-VirtualBox:~$ start-hbase.sh
zookeeper: starting zookeeper, logging to /home/hadooptest/hbase-0.92.0-security/bin/../logs/hbase-hadooptest-zookeeper-tigerchan-VirtualBox.out
starting master, logging to /home/hadooptest/hbase-0.92.0-security/logs/hbase-hadooptest-master-tigerchan-VirtualBox.out
regionserver: starting regionserver, logging to /home/hadooptest/hbase-0.92.0-security/bin/../logs/hbase-hadooptest-regionserver-tigerchan-VirtualBox.out
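Because HBase gives up when it cannot reach ZooKeeper, it can help to wait until ZooKeeper actually answers before running start-hbase.sh. The sketch below uses ZooKeeper's four-letter command ruok, which a healthy server answers with imok. The helper names wait_for_imok and zk_ruok are hypothetical. The nc invocation is also an assumption: nc flags differ between implementations, and some need an extra option such as -q or -w to close the connection.

```shell
# Run the given probe command once per second until it prints "imok"
# or the attempts run out.
#   $1 = number of attempts, remaining arguments = probe command
wait_for_imok() {
    attempts=$1; shift
    while [ "$attempts" -gt 0 ]; do
        [ "$("$@" 2>/dev/null)" = "imok" ] && return 0
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Probe a ZooKeeper server with the "ruok" four-letter command.
#   $1 = host, $2 = client port
zk_ruok() {
    echo ruok | nc "$1" "$2"
}
```

Usage with the host name and port from this post: `zkServer.sh start && wait_for_imok 10 zk_ruok zookeeper 2181 && start-hbase.sh`.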
2. Run jps to check the processes
2640 NameNode
2991 DataNode
8997 Jps
8698 HRegionServer
3238 SecondaryNameNode
3580 TaskTracker
3326 JobTracker
8974 QuorumPeerMain
8478 HMaster
3. Test HBase
① Run hbase shell to enter the HBase shell
hadooptest@tigerchan-VirtualBox:~$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.92.0, r1231986, Tue Jan 17 02:30:24 UTC 2012
hbase(main):001:0>
② Test table creation
hbase(main):001:0> create 'test', 'cf'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadooptest/hbase-0.92.0-security/lib/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadooptest/hadoop-0.20.205.0/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times
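An error was reported: the master is not running. (SLF4J is not a concrete logging implementation; it is a facade over various logging systems. The message about multiple SLF4J bindings does not affect the main functionality and can be ignored for now.) If your table was created successfully, congratulations: you can skip ahead to the next test step.
Yet the master process does exist. The log logs/hbase-hadooptest-master-tigerchan-VirtualBox.log shows: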
2012-03-10 14:20:16,896 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to create version file at hdfs://namenode/hbase, retrying: java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.protocol.ClientProtocol.create(java.lang.String, org.apache.hadoop.fs.permission.FsPermission, java.lang.String, boolean, boolean, short, long)
at java.lang.Class.getMethod(Class.java:1605)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
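After reading the blog post "hadoop 异常记录 ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times" (see the references), I concluded that the cause was most likely an RPC protocol mismatch: the Hadoop client jar bundled with HBase calls a ClientProtocol.create signature that the Hadoop 0.20.205.0 NameNode does not provide, hence the NoSuchMethodException above. The fix: delete hadoop-core-1.0.0.jar from hbase-0.92.0-security/lib and copy hadoop-core-0.20.205.0.jar from the hadoop-0.20.205.0 directory into hbase/lib. I also noticed zookeeper-3.4.2.jar in hbase-0.92.0-security/lib, so I deleted it as well, copied zookeeper-3.4.3.jar from the zookeeper-3.4.3 directory into hbase/lib, and then restarted HBase.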
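The jar replacement can also be scripted. The sketch below differs from what I did in one respect: it moves the bundled jar into a sibling lib.bak directory instead of deleting it, so the change can be undone. The function name swap_jar and the lib.bak directory are made up for illustration.

```shell
# Replace a bundled jar in HBase's lib directory with the cluster's own jar.
# The old jar is moved to a sibling lib.bak directory rather than deleted.
#   $1 = HBase lib directory
#   $2 = file name of the bundled jar to retire
#   $3 = full path of the replacement jar
swap_jar() {
    lib_dir=$1; old_jar=$2; new_jar=$3
    [ -f "$new_jar" ] || { echo "missing replacement: $new_jar" >&2; return 1; }
    if [ -f "$lib_dir/$old_jar" ]; then
        mkdir -p "$lib_dir/../lib.bak" &&
            mv "$lib_dir/$old_jar" "$lib_dir/../lib.bak/"
    fi
    cp "$new_jar" "$lib_dir/"
}

# Calls matching the two replacements described above (paths from this post):
#   swap_jar "$HBASE_HOME/lib" hadoop-core-1.0.0.jar \
#       /home/hadooptest/hadoop-0.20.205.0/hadoop-core-0.20.205.0.jar
#   swap_jar "$HBASE_HOME/lib" zookeeper-3.4.2.jar \
#       /home/hadooptest/zookeeper-3.4.3/zookeeper-3.4.3.jar
```

Stop HBase before swapping the jars and start it again afterwards.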
After running hbase shell again and creating the table, no result appeared for a long time. So I checked hbase-hadooptest-master-tigerchan-VirtualBox.log once more and found:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to localhost/127.0.0.1:60020 after attempts=1
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:242)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1278)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1235)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1222)
at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:496)
at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:429)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1453)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1200)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1175)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1170)
at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:1918)
at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:557)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:491)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
The log shows the master waiting to connect to the region server on localhost (127.0.0.1:60020). So I looked at hbase-hadooptest-regionserver-tigerchan-VirtualBox.log and found:
2012-03-10 15:08:41,368 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 0 on 60020: starting
2012-03-10 15:08:41,370 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 1 on 60020: starting
2012-03-10 15:08:41,372 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 2 on 60020: starting
2012-03-10 15:08:41,373 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 3 on 60020: starting
2012-03-10 15:08:41,376 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 4 on 60020: starting
2012-03-10 15:08:41,377 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 5 on 60020: starting
2012-03-10 15:08:41,380 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 6 on 60020: starting
2012-03-10 15:08:41,382 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 7 on 60020: starting
2012-03-10 15:08:41,384 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 8 on 60020: starting
2012-03-10 15:08:41,388 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 9 on 60020: starting
2012-03-10 15:08:41,394 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Serving as localhost,60020,1331362969179, RPC listening on tigerchan-VirtualBox/127.0.1.1:60020, sessionid=0x135fb6a344a0001
2012-03-10 15:08:41,400 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker localhost,60020,1331362969179 starting
The region server, however, was listening on tigerchan-VirtualBox/127.0.1.1:60020.
So I edited /etc/hosts to map tigerchan-VirtualBox to 127.0.0.1. After restarting HBase, the table was created successfully.
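To spot this kind of mismatch before starting HBase, it is enough to look up which address the hosts file gives for the machine's own host name. A minimal sketch follows; hosts_lookup is a made-up helper that reads only the hosts file and does not consult DNS.

```shell
# Print the first address that a hosts file assigns to the given name.
# Prints nothing when the name is not listed.
#   $1 = hosts file, $2 = host name
hosts_lookup() {
    awk -v h="$2" '
        { sub(/#.*/, "") }    # ignore comments
        { for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit } }
    ' "$1"
}
```

Usage: `hosts_lookup /etc/hosts "$(hostname)"`. On my machine this would have printed 127.0.1.1 before the change, while the master was trying 127.0.0.1.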
③ Continue testing other functions such as inserting, scanning and deleting data; see http://hbase.apache.org/book/quickstart.html
hadooptest@tigerchan-VirtualBox:~$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.92.0, r1231986, Tue Jan 17 02:30:24 UTC 2012
hbase(main):001:0> create 'test', 'cf'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadooptest/hbase-0.92.0-security/lib/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadooptest/hadoop-0.20.205.0/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 2.9830 seconds
hbase(main):002:0> list 'test'
TABLE
test
1 row(s) in 0.2440 seconds
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 1.2880 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0400 seconds
hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0330 seconds
hbase(main):006:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1331369650710, value=value1
row2 column=cf:b, timestamp=1331369678008, value=value2
row3 column=cf:c, timestamp=1331369689414, value=value3
3 row(s) in 0.2880 seconds
hbase(main):007:0> get 'test', 'row1'
COLUMN CELL
cf:a timestamp=1331369650710, value=value1
1 row(s) in 0.1670 seconds
hbase(main):008:0> drop 'test'
ERROR: Table test is enabled. Disable it first.'
Here is some help for this command:
Drop the named table. Table must first be disabled. If table has
more than one region, run a major compaction on .META.:
hbase> major_compact ".META."
hbase(main):009:0> disable 'test'
0 row(s) in 2.5160 seconds
hbase(main):010:0> drop 'test'
0 row(s) in 1.8860 seconds
hbase(main):011:0> exit
4. Stop HBase
Run stop-hbase.sh:
hadooptest@tigerchan-VirtualBox:~$ stop-hbase.sh
stopping hbase.............
zookeeper: no zookeeper to stop because kill -0 of pid 8422 failed with status 1
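In theory this should also stop ZooKeeper's QuorumPeerMain process, but only the HMaster and HRegionServer processes were stopped. (stop-hbase.sh reported the ZooKeeper pid as 8422, whereas the actual ZooKeeper pid was 8974. I have not resolved this. A likely cause is that HBASE_MANAGES_ZK=true makes the HBase scripts manage a ZooKeeper instance of their own, separate from the one started with zkServer.sh. For an existing ZooKeeper, the HBase documentation says to set HBASE_MANAGES_ZK=false.)
So zkServer.sh stop is needed to stop the ZooKeeper process.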
hadooptest@tigerchan-VirtualBox:~$ zkServer.sh stop
JMX enabled by default
Using config: /home/hadooptest/zookeeper-3.4.3/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
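The "kill -0 of pid 8422 failed" message above comes from a standard shell idiom: signal 0 delivers nothing and only checks whether the process exists and may be signalled. The sketch below shows the idiom applied to a pid file. The function names pid_alive and check_pid_file are hypothetical, and the location of HBase's pid files is not covered in this post, so the path passed in is left to the reader.

```shell
# Succeed when a process with the given pid exists and can be signalled.
# Signal 0 performs the checks without actually sending a signal.
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# Report whether the pid recorded in a pid file refers to a live process.
#   $1 = path of the pid file
check_pid_file() {
    [ -f "$1" ] || { echo "no pid file: $1"; return 1; }
    pid=$(cat "$1")
    if pid_alive "$pid"; then
        echo "pid $pid is running"
    else
        echo "pid $pid is stale"
        return 1
    fi
}
```

A stale result for the ZooKeeper pid file would match what stop-hbase.sh reported: the recorded process was already gone, while the ZooKeeper started with zkServer.sh kept running under a different pid.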
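References:
http://hbase.apache.org/book/quickstart.html
hadoop 异常记录 (Hadoop exception notes) ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times
Hbase HMaster 无法启动 (HBase HMaster fails to start) (Call to host:port failed on local exception)
注意你的hosts文件--记一次HBase问题定位 (Mind your hosts file: tracking down an HBase problem)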