Since our cluster is based on Hadoop 2.2.0, using a newer version of HBase is unavoidable. The old Hadoop 1.x cluster ran HBase 0.94; the latest release is 0.98, but given its instability, 散仙 chose HBase 0.96. This HBase cluster is built on top of Hadoop 2.2.0, and the specific setup is as follows:
| No. | Machine IP | Role |
| --- | --- | --- |
| 1 | 192.168.46.32 | Master |
| 2 | 192.168.46.11 | Slave1 |
| 3 | 192.168.46.10 | Slave2 |
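The regionservers file later in this post refers to the three machines as h1, h2, and h3, so every node's /etc/hosts needs to map those names to the IPs above. A minimal sketch, assuming h1 is the Master and h2/h3 are the slaves (the table above only lists roles, so the exact name-to-IP pairing is an assumption):

```bash
# /etc/hosts on every node; adjust if your hostnames differ
192.168.46.32 h1
192.168.46.11 h2
192.168.46.10 h3
```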
For this cluster, 散仙 uses HBase's built-in ZooKeeper; for a production environment an external ZooKeeper ensemble is recommended. The configuration steps are as follows:
| No. | Description |
| --- | --- |
| 1 | Set up the Ant, Maven, and JDK environment |
| 2 | Configure passwordless SSH login between all machines (a sketch follows this table) |
| 3 | Set up the underlying Hadoop 2.2.0 cluster; note that it must be compiled for 64-bit |
| 4 | Download HBase 0.96; no compilation needed, just extract it |
| 5 | Go into HBase's conf directory and configure the hbase-env.sh file |
| 6 | Configure the hbase-site.xml file under conf |
| 7 | Configure the regionservers file under conf |
| 8 | Once configured, distribute the installation to every node |
| 9 | Start the Hadoop cluster first and confirm it is working normally |
| 10 | Start the HBase cluster |
| 11 | Open HBase's web UI on port 60010 and check that it looks normal |
| 12 | Enter the HBase shell with bin/hbase shell and run a test |
| 13 | Configure the local hosts mapping on Windows (if you want to view HBase from Windows) |
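For step 2, a minimal sketch of the passwordless SSH setup, assuming a user named search (inferred from the /home/search paths used below, so treat it as an assumption) and the h1/h2/h3 hostnames from above:

```bash
# Run on each machine, so that every node can reach every other node.
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa    # key pair with an empty passphrase

# Push the public key to all three nodes (including the local one).
ssh-copy-id search@h1
ssh-copy-id search@h2
ssh-copy-id search@h3

# Verify: this must log in without asking for a password,
# otherwise start-hbase.sh will stall on each node.
ssh search@h2 hostname
```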
The configuration in hbase-env.sh is shown below; the only parts that need editing are the JDK environment variable and the switch that lets HBase manage its bundled ZooKeeper:
```bash
#
# (Apache Software Foundation license header omitted for brevity)
#

# Set environment variables here.

# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)

# The java implementation to use.  Java 1.6 required.
export JAVA_HOME=/usr/local/jdk

# Extra Java CLASSPATH elements.  Optional.
# export HBASE_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HBASE_HEAPSIZE=1000

# Extra Java runtime options.
# Below are what we set by default.  May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.

# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment one of the below three options to enable java garbage collection logging for the client processes.

# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.

# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"

# File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
#HBASE_REGIONSERVER_MLOCK=true
#HBASE_REGIONSERVER_UID="hbase"

# File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

# Extra ssh options.  Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

# Where log files are stored.  $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs

# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER

# The scheduling priority for daemon processes.  See 'man nice'.
# export HBASE_NICENESS=10

# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids

# Seconds to sleep between slave commands.  Unset by default.  This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1

# Tell HBase whether it should manage it's own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=true

# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the
# RFA appender. Please refer to the log4j.properties file to see more details on this appender.
# In case one needs to do log rolling on a date change, one should set the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
```
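Only two lines differ from the stock file that ships with HBase 0.96: export JAVA_HOME=/usr/local/jdk, which points at the local JDK, and export HBASE_MANAGES_ZK=true, which tells HBase to start and stop its own ZooKeeper quorum. The latter is why an HQuorumPeer process shows up in the jps output further down, rather than a standalone QuorumPeerMain.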
The configuration in hbase-site.xml is as follows:
```xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.168.46.32:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>192.168.46.32:60000</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/home/search/hbase/hbasetmp</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.46.32,192.168.46.11,192.168.46.10</value>
  </property>
</configuration>
```
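One thing worth double-checking here: hbase.rootdir must match the NameNode URI that fs.defaultFS declares in Hadoop's core-site.xml, host and port included, or HBase will not be able to reach HDFS. A sketch of the matching entry on the Hadoop side, assuming the cluster above:

```xml
<!-- core-site.xml: the authority (192.168.46.32:9000) must be
     identical to the one used in hbase.rootdir above. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://192.168.46.32:9000</value>
</property>
```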
The regionservers file contains the following:
```
h1
h2
h3
```
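For steps 8 through 10, a sketch of distributing the configured directory and bringing the cluster up, assuming HBase lives at /home/search/hbase on every node (the path is inferred from hbase.tmp.dir above, so adjust it to your own layout):

```bash
# Step 8: push the configured HBase directory to the slaves.
scp -r /home/search/hbase search@h2:/home/search/
scp -r /home/search/hbase search@h3:/home/search/

# Step 9: start Hadoop first and confirm HDFS and YARN are healthy.
start-dfs.sh
start-yarn.sh

# Step 10: start HBase; because HBASE_MANAGES_ZK=true, this also
# brings up the ZooKeeper quorum on the three nodes.
/home/search/hbase/bin/start-hbase.sh
```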
Once everything is up, the processes on the Master look like this:
```
1580 SecondaryNameNode
1289 NameNode
2662 HMaster
2798 HRegionServer
1850 NodeManager
3414 Jps
2569 HQuorumPeer
1743 ResourceManager
1394 DataNode
```
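Of these, HMaster and HRegionServer belong to HBase, HQuorumPeer is the ZooKeeper instance HBase manages itself, and the rest (NameNode, SecondaryNameNode, DataNode, ResourceManager, NodeManager) come from Hadoop. On the two slaves you would expect to see HRegionServer and HQuorumPeer running alongside DataNode and NodeManager.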
After turning off the firewall, HBase's web UI on port 60010 can be reached from Windows, as shown below:
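For the Windows-side access in step 13, the cluster hostnames have to be mapped locally as well; a sketch of the entries, mirroring the /etc/hosts mapping assumed earlier:

```
# C:\Windows\System32\drivers\etc\hosts
192.168.46.32 h1
192.168.46.11 h2
192.168.46.10 h3
```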
Accessing the HBase shell from a Linux terminal looks like this:
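A quick smoke test to run inside the shell; the table name test and the column family cf are just placeholder names:

```ruby
create 'test', 'cf'                      # create a table with one column family
put 'test', 'row1', 'cf:a', 'value1'     # write one cell
scan 'test'                              # should print row1 back
get 'test', 'row1'                       # point lookup
disable 'test'                           # tables must be disabled before dropping
drop 'test'                              # clean up
```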
At this point, our HBase cluster is fully set up. As a next step we can use HBase shell commands to test creating, reading, updating, and deleting data, and of course we can also interact with HBase through the Java API. In the next post, 散仙 will share some general-purpose Java API code for working with HBase.