Storing data in HBase on the existing Hadoop cluster

HBase stores the data. Since our current Hadoop cluster runs Hadoop 2.2.0, the HBase cluster built here sits on top of Hadoop 2.2.0. The details are as follows:


No.  Machine IP      Role
1    192.168.46.32   Master
2    192.168.46.11   Slave1
3    192.168.46.10   Slave2



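For this cluster I (Sanxian) use the ZooKeeper instance bundled with HBase; for production, an external ZooKeeper cluster is recommended. The configuration steps are as follows: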

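No.  Description
1    Ant, Maven, and JDK environment
2    Configure passwordless SSH login between all machines
3    Set up the underlying Hadoop 2.2.0 cluster (note: it must be compiled for 64-bit)
4    Download HBase 0.96; no compilation needed, just extract it
5    Go to HBase's conf directory and configure hbase-env.sh
6    Configure hbase-site.xml under conf
7    Configure the regionservers file under conf
8    When configuration is done, distribute it to every node
9    Start the Hadoop cluster first and confirm that it is healthy
10   Start the HBase cluster
11   Open the HBase web UI on port 60010 and check that everything looks normal
12   Run bin/hbase shell to enter the HBase shell and test
13   Configure the local hosts mapping on Windows (if you want to view HBase from Windows)
14   One humble software engineer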
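
Steps 2 and 8 can be sketched roughly as follows. This is only an illustration: the slave IPs come from the machine table, while HBASE_DIR and the setup_slaves helper are assumptions of this sketch, not part of the original setup. By default the function only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of steps 2 and 8. Slave IPs come from the machine table;
# HBASE_DIR is an assumed install path, not taken from the article.
SLAVES="192.168.46.11 192.168.46.10"
HBASE_DIR="${HBASE_DIR:-$HOME/hbase}"

# Prints the commands by default; call with RUN="" to actually execute them.
setup_slaves() {
  local run="${RUN-echo}"
  # Step 2: create a key pair once, then install the public key on each slave.
  [ -f "$HOME/.ssh/id_rsa" ] || $run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
  for node in $SLAVES; do
    $run ssh-copy-id "$node"
    # Step 8: copy the configured HBase directory to the same path on the slave.
    $run rsync -a "$HBASE_DIR/" "$node:$HBASE_DIR/"
  done
}
```

Running setup_slaves shows the plan; RUN="" setup_slaves executes it from the master once the commands look right.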





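hbase-env.sh is configured as follows. The main things to set are the JDK environment variable (JAVA_HOME) and enabling HBase's management of its bundled ZooKeeper (HBASE_MANAGES_ZK):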

#
#/**
# * Copyright 2007 The Apache Software Foundation
# *
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements.  See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership.  The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License.  You may obtain a copy of the License at
# *
# *     http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */

# Set environment variables here.

# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)

# The java implementation to use.  Java 1.6 required.
 export JAVA_HOME=/usr/local/jdk

# Extra Java CLASSPATH elements.  Optional.
# export HBASE_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HBASE_HEAPSIZE=1000

# Extra Java runtime options.
# Below are what we set by default.  May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.

# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment one of the below three options to enable java garbage collection logging for the client processes.

# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.


# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"

# File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
#HBASE_REGIONSERVER_MLOCK=true
#HBASE_REGIONSERVER_UID="hbase"

# File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

# Extra ssh options.  Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

# Where log files are stored.  $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs

# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers 
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER

# The scheduling priority for daemon processes.  See 'man nice'.
# export HBASE_NICENESS=10

# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids

# Seconds to sleep between slave commands.  Unset by default.  This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1

# Tell HBase whether it should manage it's own instance of Zookeeper or not.
 export HBASE_MANAGES_ZK=true

# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the 
# RFA appender. Please refer to the log4j.properties file to see more details on this appender.
# In case one needs to do log rolling on a date change, one should set the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as 
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
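
Because most of this file is commented-out defaults, it can help to list only the lines that are actually in effect. A minimal sketch, where active_settings is a hypothetical helper name introduced here:

```shell
# Print only the lines of a config file that are neither blank nor comments.
active_settings() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}
# Example: active_settings "$HBASE_HOME/conf/hbase-env.sh"
```

For the hbase-env.sh shown here, this should leave just the JAVA_HOME, HBASE_OPTS, and HBASE_MANAGES_ZK exports.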


hbase-site.xml is configured as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.168.46.32:9000/hbase</value><!-- Must match the setting in core-site.xml -->
  </property>
  <!-- Enable distributed mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- To run multiple HMasters, configure only the port here rather than host:port -->
  <property>
    <name>hbase.master</name>
    <value>192.168.46.32:60000</value>
  </property>

  <property>
    <name>hbase.tmp.dir</name>
    <value>/home/search/hbase/hbasetmp</value>
  </property>
  <!-- ZooKeeper quorum hosts; with an external ZooKeeper cluster, list its hosts here -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.46.32,192.168.46.11,192.168.46.10</value>
  </property>

</configuration>
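
Since hbase.rootdir has to agree with the HDFS address in core-site.xml, a quick consistency check can catch a mismatch before startup. The sketch below assumes that core-site.xml uses the Hadoop 2.x property fs.defaultFS and that each <name> line is immediately followed by its <value> line, as in the listing here; get_property and check_rootdir are hypothetical helper names.

```shell
# Print the value of a property from a Hadoop-style XML file.
# Assumes <name> and <value> sit on adjacent lines.
get_property() {  # usage: get_property FILE NAME
  grep -F -A1 "<name>$2</name>" "$1" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Succeed only if hbase.rootdir lives under the file system named by fs.defaultFS.
check_rootdir() {  # usage: check_rootdir HBASE_SITE_XML CORE_SITE_XML
  rootdir=$(get_property "$1" hbase.rootdir)
  fs=$(get_property "$2" fs.defaultFS)
  case "$rootdir" in
    "${fs%/}"/*) echo "OK: $rootdir is on $fs" ;;
    *) echo "MISMATCH: hbase.rootdir=$rootdir but fs.defaultFS=$fs"; return 1 ;;
  esac
}
# Example: check_rootdir conf/hbase-site.xml "$HADOOP_HOME/etc/hadoop/core-site.xml"
```

A proper XML parser would be more robust; grep and sed are used here only to keep the check dependency-free.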




The regionservers file is configured as follows:

h1
h2
h3
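
The regionservers file lists host names rather than IPs, so h1, h2, and h3 must resolve on every node (through /etc/hosts or DNS); the article does not show that mapping. A small sketch for spotting names that do not resolve, assuming a Linux system where getent is available; check_hosts is a hypothetical helper name:

```shell
# Report every host listed in a regionservers-style file that does not resolve.
# Assumes getent is available (typical on Linux).
check_hosts() {  # usage: check_hosts REGIONSERVERS_FILE
  while read -r h || [ -n "$h" ]; do
    [ -z "$h" ] && continue          # skip blank lines
    getent hosts "$h" >/dev/null || echo "unresolved: $h"
  done < "$1"
}
# Example: check_hosts "$HBASE_HOME/conf/regionservers"
```

No output means every listed host resolved on the machine where the check was run.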


After startup, the processes running on the Master look like this:

1580 SecondaryNameNode
1289 NameNode
2662 HMaster
2798 HRegionServer
1850 NodeManager
3414 Jps
2569 HQuorumPeer
1743 ResourceManager
1394 DataNode
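
The same check can be scripted. The sketch below reads jps output and reports any daemon from the Master listing that is absent; check_master_daemons is a hypothetical helper name, and the expected list simply mirrors that listing (HQuorumPeer appears only because HBase manages ZooKeeper in this setup).

```shell
# Read `jps` output on stdin and verify the daemons expected on the Master.
check_master_daemons() {
  out=$(cat)
  missing=""
  for d in NameNode SecondaryNameNode DataNode ResourceManager NodeManager \
           HQuorumPeer HMaster HRegionServer; do
    # -w keeps "NameNode" from matching inside "SecondaryNameNode".
    printf '%s\n' "$out" | grep -qw "$d" || missing="$missing $d"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all expected daemons are running"
}
# Example, after start-dfs.sh, start-yarn.sh and bin/start-hbase.sh:
#   jps | check_master_daemons
```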


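After turning off the firewall, open HBase's port 60010 from Windows, as shown below:

[Figure 1: HBase web UI on port 60010]

Accessing the HBase shell from a Linux shell client looks like this:

[Figure 2: HBase shell session]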



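With that, our HBase cluster is up and running. The next step is to use HBase shell commands to test create, read, update, and delete operations. We can also interact with HBase through the Java API; in the next post, I (Sanxian) will share some general-purpose code for working with HBase through the Java API.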
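
As a starting point for that shell test, the sketch below emits a short create/read/update/delete sequence that can be piped into the HBase shell. The table name demo, column family cf, row key row1, and the values are all hypothetical, and crud_commands is just a helper name for this example.

```shell
# Emit a small CRUD session for the HBase shell.
# 'demo', 'cf', 'row1' and the cell values are hypothetical example names.
crud_commands() {
  cat <<'EOF'
create 'demo', 'cf'
put 'demo', 'row1', 'cf:name', 'hbase'
get 'demo', 'row1'
scan 'demo'
put 'demo', 'row1', 'cf:name', 'hbase-0.96'
delete 'demo', 'row1', 'cf:name'
disable 'demo'
drop 'demo'
EOF
}
# Usage, from the HBase install directory: crud_commands | bin/hbase shell
```

The second put overwrites the cell, which is how an update is expressed in HBase; the table has to be disabled before it can be dropped.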
