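This post walks through installing HBase on a four-node cluster. HBase depends on ZooKeeper, and although the HBase package bundles its own ZooKeeper, this setup uses a separately installed one. On first startup I hit the errors "Master is initializing" and "error telling master we are up", both of which were resolved by correcting the hosts file.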
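Installation environment
Uploading the package
Upload the downloaded hbase-1.2.0-cdh5.7.6.tar.gz package to a directory on the CentOS machine, for example /opt.
There are many ways to upload it; here I used the rz command in SecureCRT.
Extract the archive: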
tar -zxf hbase-1.2.0-cdh5.7.6.tar.gz
Rename the directory:
mv hbase-1.2.0-cdh5.7.6 hbase
Configuration
Edit the environment variables:
vi /etc/profile
Add:
export HBASE_HOME=/opt/hbase
export PATH=$HBASE_HOME/bin:$PATH
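Changes to /etc/profile only reach new login shells, so the current session needs "source /etc/profile" before the hbase command resolves. The sketch below is one possible way to confirm that both variables took effect. The function name check_hbase_env is made up for illustration and is not part of HBase.

```shell
# Hypothetical helper (not part of HBase): returns 0 when HBASE_HOME is set
# and $HBASE_HOME/bin appears as an entry on PATH.
# Typical use after editing the profile:
#   source /etc/profile && check_hbase_env
check_hbase_env() {
    if [ -z "${HBASE_HOME:-}" ]; then
        echo "HBASE_HOME is not set" >&2
        return 1
    fi
    # Wrap PATH in colons so the first and last entries match the same pattern.
    case ":$PATH:" in
        *":$HBASE_HOME/bin:"*)
            echo "ok: $HBASE_HOME/bin is on PATH"
            ;;
        *)
            echo "$HBASE_HOME/bin is not on PATH" >&2
            return 1
            ;;
    esac
}
```

If the check fails in a fresh terminal, the export lines were probably not saved or the profile was not re-read.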
Edit the HBase configuration file hbase-env.sh:
[root@slave1 conf]# pwd
/opt/hbase/conf
[root@slave1 conf]# vi hbase-env.sh
Uncomment the following two lines in the file and set them as shown:
export JAVA_HOME=/opt/jdk
export HBASE_MANAGES_ZK=false
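Because ZooKeeper is installed separately instead of using the copy bundled with HBase, HBASE_MANAGES_ZK is set to false.
Edit the HBase configuration file hbase-site.xml in the conf directory.
The configuration element is initially empty; add the following properties: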
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ns1/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>cdh1,cdh2,cdh3</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/zookeeper</value>
</property>
</configuration>
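A typo in hbase-site.xml is easy to miss, so it can help to list the name/value pairs the file actually contains before starting HBase. The following is a minimal sketch that assumes each name and value tag sits on its own line, as in the layout used here; it is not a real XML parser, and list_hbase_props is a name invented for this example.

```shell
# Hypothetical helper: print every property in an hbase-site.xml as name=value.
# Assumes one <name>...</name> or <value>...</value> element per line.
list_hbase_props() {
    awk '
        /<name>/  { gsub(/.*<name>|<\/name>.*/, "");   name = $0 }
        /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); print name "=" $0 }
    ' "$1"
}

# Example (path taken from this walkthrough):
#   list_hbase_props /opt/hbase/conf/hbase-site.xml
```

A malformed closing tag shows up as stray characters in the printed name or value, which makes it quick to spot.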
Edit the regionservers file in the conf directory, listing one region server hostname per line:
[root@slave1 conf]# vi regionservers
cdh3
cdh4
Copying the installation to the other nodes
[hadoop@cdh1 conf]# scp -rq /opt/hbase hadoop@cdh3:/opt
[hadoop@cdh1 conf]# scp -rq /opt/hbase hadoop@cdh4:/opt
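With more nodes, typing one scp per host gets tedious. As a sketch, the copy commands can be generated from the regionservers file instead. The helper below only prints the commands (a dry run), reuses the hadoop user and /opt path from the commands above, and the name print_copy_commands is an assumption for this example.

```shell
# Hypothetical helper: print one scp command per host listed in a
# regionservers-style file (one hostname per line). Nothing is copied;
# review the output, then pipe it to sh if it looks right.
print_copy_commands() {
    while read -r host || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;   # skip blank lines and comment lines
        esac
        echo "scp -rq /opt/hbase hadoop@$host:/opt"
    done < "$1"
}

# Example:
#   print_copy_commands /opt/hbase/conf/regionservers
```

Printing instead of executing keeps the step reviewable, which matters when the target directory already holds an older installation.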
Starting HBase
The startup order is HDFS -> ZooKeeper -> HBase, so make sure Hadoop and ZooKeeper are already running first.
[root@master bin]# /opt/hbase/bin/start-hbase.sh
starting master, logging to /opt/hbase/bin/../logs/hbase-root-master-master.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
slave2: starting regionserver, logging to /opt/hbase/logs/hbase-root-regionserver-slave2.out
slave3: starting regionserver, logging to /opt/hbase/logs/hbase-root-regionserver-slave3.out
slave1: starting regionserver, logging to /opt/hbase/logs/hbase-root-regionserver-slave1.out
slave2: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
slave2: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
slave1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
slave1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
The master node should show the following processes:
[root@master bin]# jps
3089 Jps
2696 QuorumPeerMain
2520 SecondaryNameNode
2858 HMaster
2365 NameNode
A slave node should show the following processes:
[root@slave1 opt]# jps
2258 QuorumPeerMain
2339 HRegionServer
2154 DataNode
2506 Jps
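Instead of reading the jps listings by eye on every node, the check can be scripted. The sketch below reads jps output on standard input and prints any expected daemon that is absent; missing_daemons is a hypothetical name, and the daemon names passed to it are the ones shown in the listings above.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# daemon name (given as arguments) that does not appear in the second column.
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Example on the master node:
#   jps | missing_daemons HMaster NameNode QuorumPeerMain
# Example on a slave node:
#   jps | missing_daemons HRegionServer DataNode QuorumPeerMain
```

Empty output means every listed daemon is running. Note that, as the troubleshooting below shows, running processes alone do not prove the region servers have registered with the master.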
If everything is working, checking the status in the hbase shell shows:
[root@master logs]# /opt/hbase/bin/hbase shell
2018-01-24 17:38:50,902 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2018-01-24 17:38:53,829 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 1.2.0-cdh5.7.6, rUnknown, Tue Feb 21 15:18:14 PST 2017
hbase(main):001:0> status
1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
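The first time I checked, however, I got the error: Master is initializing.
Troubleshooting
Although the processes listed above were all running, the status command in the hbase shell returned an error: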
[root@master bin]# /opt/hbase/bin/hbase shell
2018-01-24 16:40:15,994 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2018-01-24 16:40:18,484 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 1.2.0-cdh5.7.6, rUnknown, Tue Feb 21 15:18:14 PST 2017
hbase(main):001:0> status
ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2316)
at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:783)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55652)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:748)
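In the log file hbase-root-master-master.log, under the logs directory of the HBase installation on the master node, the following message kept repeating. It shows that no region server had checked in with the master ("currently checked in 0"):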
2018-01-24 17:29:32,752 INFO [master:60000.activeMasterManager] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 529904 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
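The number after "currently checked in" is the useful part of this message. As a rough sketch, it can be pulled out of the log to see whether any region server has registered yet; the function name checked_in_count is invented here, and the pattern assumes the exact message wording shown above.

```shell
# Hypothetical helper: print the region server count from the most recent
# "Waiting for region servers count to settle" line in a master log file.
checked_in_count() {
    sed -n 's/.*currently checked in \([0-9][0-9]*\),.*/\1/p' "$1" | tail -n 1
}

# Example (log path from this walkthrough):
#   checked_in_count /opt/hbase/logs/hbase-root-master-master.log
```

A value that stays at 0 while HRegionServer processes are running points at a connectivity or name-resolution problem rather than a failed start.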
In the log file hbase-root-regionserver-slave1.log, under the logs directory of the HBase installation on the slave node, I found the following exception:
2018-01-24 17:21:52,093 WARN [regionserver/localhost/127.0.0.1:60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.SocketException: Invalid argument
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:240)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2300)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:914)
at java.lang.Thread.run(Thread.java:748)
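After searching on Baidu, I found that the /etc/hosts file on the slave node contained the line below, which maps the node's own hostname to 127.0.0.1. This is consistent with the thread name regionserver/localhost/127.0.0.1:60020 in the log above, where the region server identifies itself by the loopback address: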
127.0.0.1 localhost slave2 slave2 localhost4 localhost4.localdomain4
After changing it to the following and restarting HBase, everything worked:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
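To catch this misconfiguration on the other nodes before starting HBase, the hosts file can be scanned for a hostname that appears on a loopback line. This is a minimal sketch under the assumption that the problem is exactly the one seen here; hostname_on_loopback is a hypothetical name, and it checks only the 127.0.0.1 and ::1 entries.

```shell
# Hypothetical helper: return 0 if the given hostname is listed on a loopback
# line (127.0.0.1 or ::1) of a hosts file, 1 otherwise.
#   $1 = hostname to look for
#   $2 = hosts file to scan (defaults to /etc/hosts)
hostname_on_loopback() {
    awk -v h="$1" '
        $1 ~ /^#/ { next }                     # skip comment lines
        $1 == "127.0.0.1" || $1 == "::1" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break           # ignore trailing comments
                if ($i == h) found = 1
            }
        }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}

# Example, run on each node:
#   hostname_on_loopback "$(hostname)" && echo "hostname is mapped to loopback"
```

When the function reports a match, remove the hostname from the loopback line and make sure it is mapped to the node's real IP address elsewhere in the file.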