This post is not of much value yet; I will flesh it out when I have time.
Translated excerpts
Once you have confirmed your HDFS setup, edit conf/hbase-site.xml. This is the file into which you add local customizations and overrides.
<configuration>
  ...
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:8020/hbase</value>
    <description>The directory shared by RegionServers.
    </description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>The replication count for HLog and HFile storage. Should not be greater than HDFS datanode count.
    </description>
  </property>
  ...
</configuration>
Let HBase create the hbase.rootdir directory. If you don't, you'll get a warning saying HBase needs a migration run because the directory is missing files expected by HBase (it'll create them if you let it). In other words, when HBase creates the directory itself it also creates some required files inside it; if the directory was created some other way, those files are missing and HBase warns that a migration is needed.
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
regionservers
Use the HBASE_MANAGES_ZK variable in conf/hbase-env.sh. This variable, which defaults to true, tells HBase whether to start/stop the ZooKeeper ensemble servers as part of HBase start/stop.
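As a minimal sketch, the corresponding line in conf/hbase-env.sh could look like the following. The value shown is simply the default described above; set it to false if you run ZooKeeper yourself.

```shell
# conf/hbase-env.sh
# true  -> HBase starts/stops the ZooKeeper ensemble together with itself (the default)
# false -> ZooKeeper is started and stopped independently of HBase
export HBASE_MANAGES_ZK=true
```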
When HBase manages the ZooKeeper ensemble,
1. you can specify ZooKeeper configuration using its native zoo.cfg file; in that case, the file must be added to the CLASSPATH setting in hbase-env.sh.
2. or, the easier option is to just specify ZooKeeper options directly in conf/hbase-site.xml.
You must at least list the ensemble servers in hbase-site.xml using the hbase.zookeeper.quorum property. This property defaults to a single ensemble member at localhost, which is not suitable for a fully distributed HBase.
<configuration>
  ...
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
    <description>Property from ZooKeeper's config zoo.cfg.
    The port at which the clients will connect.
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>rs1.example.com,rs2.example.com,rs3.example.com,rs4.example.com,rs5.example.com</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum.
    For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
    By default this is set to localhost for local and pseudo-distributed modes
    of operation. For a fully-distributed setup, this should be set to a full
    list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
    this is the list of servers which we will start/stop ZooKeeper on.
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/zookeeper</value>
    <description>Property from ZooKeeper's config zoo.cfg.
    The directory where the snapshot is stored.
    </description>
  </property>
  ...
</configuration>
The following paragraph is background reading only (it is not about configuration):
Also, run an odd number of machines. A quorum requires a strict majority of the members, so an even-sized ensemble tolerates no more failures than the next smaller odd-sized one (four servers, like three, survive only one failure). Give each ZooKeeper server around 1GB of RAM, and if possible, its own dedicated disk (a dedicated disk is the best thing you can do to ensure a performant ZooKeeper ensemble). For very heavily loaded clusters, run ZooKeeper servers on separate machines from RegionServers (DataNodes and TaskTrackers).
When HBase manages ZooKeeper, it will start/stop the ZooKeeper servers as a part of the regular start/stop scripts. If you would like to run ZooKeeper yourself, independent of HBase start/stop, you would do the following (this way of starting ZooKeeper can be thought of as one sub-step of what HBase does when it starts up by itself):
${HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
Note that you can use HBase in this manner to spin up a ZooKeeper cluster, unrelated to HBase. Just make sure to set HBASE_MANAGES_ZK to false if you want it to stay up across HBase restarts so that when HBase shuts down, it doesn't take ZooKeeper down with it.
Add a pointer to your HADOOP_CONF_DIR to the HBASE_CLASSPATH environment variable in hbase-env.sh.
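One way this might look in hbase-env.sh is sketched below. The fallback path /etc/hadoop/conf is only a placeholder assumption for when HADOOP_CONF_DIR is not already set; substitute the directory your Hadoop installation actually uses.

```shell
# conf/hbase-env.sh
# Assumption: /etc/hadoop/conf is a placeholder used only if HADOOP_CONF_DIR is unset.
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

# Append the Hadoop configuration directory to HBASE_CLASSPATH,
# keeping any entries that are already there.
export HBASE_CLASSPATH="${HBASE_CLASSPATH:+${HBASE_CLASSPATH}:}${HADOOP_CONF_DIR}"
```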
Add a copy of hdfs-site.xml (or hadoop-site.xml) or, better, symlinks, under ${HBASE_HOME}/conf, or if only a small set of HDFS client configurations, add them to hbase-site.xml.
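The symlink variant could be sketched as a small shell helper. The function name link_hdfs_site is hypothetical and introduced only for illustration; the two arguments are the Hadoop configuration directory and the HBase home directory of your own installation.

```shell
# Hypothetical helper: link Hadoop's hdfs-site.xml into HBase's conf directory.
# $1 = Hadoop configuration directory, $2 = HBase home directory
link_hdfs_site() {
  # -s creates a symbolic link, -f replaces an existing file or link of the same name
  ln -sf "$1/hdfs-site.xml" "$2/conf/hdfs-site.xml"
}

# Example call (paths depend on your installation):
# link_hdfs_site "$HADOOP_CONF_DIR" "$HBASE_HOME"
```

A symlink rather than a copy means later changes to the Hadoop-side file are picked up by HBase without having to copy it again.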
An example of such an HDFS client configuration is dfs.replication. If, for example, you want to run with a replication factor of 5, HBase will create files with the default of 3 (the default replication factor HBase sees) unless you do the above to make the configuration available to HBase.
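When running HBase across multiple machines, make sure the clocks on all machines are in sync (configure NTP). If the skew is too large, you can raise the tolerated skew in hbase-site.xml (this note originally called the option mast.maxtimeout; the HBase property for this is hbase.master.maxclockskew) so that the system does not refuse connections when the clocks differ too much.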
When using the ZooKeeper bundled with HBase, ZooKeeper sometimes starts too slowly, so HMaster cannot connect to it and fails to start. In that case you can start HMaster again without stopping ZooKeeper.