My recent work involves HBase. Setting up the environment is the first step; it took me two days of fiddling, so I'm recording the process here.
For a distributed deployment, prepare three hosts:
| Hostname | IP        |
| -------- | --------- |
| node1    | 1.1.12.23 |
| node2    | 1.1.12.45 |
| node3    | 1.1.12.48 |
Unpack the JDK into /usr/java/ and set the Java environment variables:
root@node1:/usr/java# tar -zxvf jdk-8u161-linux-x64.tar.gz
root@node1:/usr/java# ls
jdk1.8.0_161
root@node1:/usr/java# vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_161
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
root@node1:/usr/java# source /etc/profile
root@node1:/usr/java# java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
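The same JDK setup is needed on node2 and node3. A minimal sketch of copying the tarball over (assuming root SSH access and that /usr/java already exists on the targets):
root@node1:/usr/java# for host in node2 node3; do scp jdk-8u161-linux-x64.tar.gz root@$host:/usr/java/; done
Then unpack it and edit /etc/profile on each host exactly as above.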
In distributed mode HBase stores its data on HDFS, so Hadoop must be installed first.
1. Create the user and group (on every host)
Add a hadoop user and group, mainly to keep permissions under control.
root@node1:~# groupadd hadoop
root@node1:~# useradd -m -g hadoop hadoop
root@node1:~# passwd hadoop
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
root@node1:~# su - hadoop
hadoop@node1:~$
2. Prepare directories and set permissions (on every host)
hadoop@node1:~$ mkdir app    # installation directory for hadoop, hbase and zookeeper
hadoop@node1:~$ mkdir data   # data directory for hadoop, hbase and zookeeper
hadoop@node1:~$ ls -l
total 20
drwxr-xr-x 5 hadoop hadoop 4096 Apr 11 16:19 app
drwxrwxr-x 5 hadoop hadoop 4096 Apr 11 16:09 data
-rw-r--r-- 1 hadoop hadoop 8980 Apr 20  2016 examples.desktop
If the app and data directories are not owned by hadoop, fix the ownership:
root@node1:~# chown -R hadoop:hadoop /home/hadoop/app/   # likewise for the data directory
3. Configure hosts (on every host)
root@node1:~# vim /etc/hosts
127.0.0.1 localhost
1.1.12.23 node1
1.1.12.45 node2
1.1.12.48 node3
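A quick sanity check that the names resolve (getent reads /etc/hosts directly; each lookup should print the matching IP and name):
root@node1:~# getent hosts node2
1.1.12.45       node2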
4. Passwordless SSH (on every host)
hadoop@node1:~$ ssh-keygen -t rsa
Press Enter through all prompts; this generates a public key (id_rsa.pub) and a private key (id_rsa) in the .ssh directory of the current user's home.
Distribute the public key:
hadoop@node1:~$ ssh-copy-id -i .ssh/id_rsa.pub hadoop@node1
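ssh-copy-id installs the key on one target at a time, and the key has to reach all three hosts from each host in turn. A loop sketch that covers all targets from the current host (you will be prompted for the hadoop password of each one):
hadoop@node1:~$ for host in node1 node2 node3; do ssh-copy-id -i .ssh/id_rsa.pub hadoop@$host; done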
5. Server roles

|                 | node1 | node2 | node3 |
| --------------- | ----- | ----- | ----- |
| NameNode        | yes   |       |       |
| DataNode        | yes   | yes   | yes   |
| ResourceManager |       | yes   |       |
| NodeManager     | yes   | yes   | yes   |
| Zookeeper       | yes   | yes   | yes   |
With the hosts prepared, install Hadoop itself.
1. Unpack the archive
root@node1:/home/hadoop/app# tar -zxvf hadoop-2.7.1.tar.gz
root@node1:/home/hadoop/app# ls
hadoop-2.7.1
root@node1:/home/hadoop/app# chown -R hadoop:hadoop hadoop-2.7.1/
root@node1:/home/hadoop/app# ls -l
total 12
drwxr-xr-x 10 hadoop hadoop 4096 Apr 11 14:11 hadoop-2.7.1
2. Set Hadoop's Java environment in hadoop-env.sh:
export JAVA_HOME=/usr/java/jdk1.8.0_161
3. Configure core-site.xml (the properties below go inside the <configuration> element):
<property>
  <name>fs.defaultFS</name>           <!-- NameNode address -->
  <value>hdfs://node1:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>         <!-- base for temporary files -->
  <value>/home/hadoop/data/tmp</value>
</property>
4. Configure hdfs-site.xml:
<property>
  <name>dfs.namenode.name.dir</name>  <!-- where the NameNode stores its metadata -->
  <value>file:///home/hadoop/data/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>  <!-- where DataNodes store block data -->
  <value>file:///home/hadoop/data/dfs/data</value>
</property>
5. Configure slaves (the list of DataNodes):
node1
node2
node3
6. Configure yarn-site.xml:
<property>
  <name>yarn.resourcemanager.hostname</name>       <!-- ResourceManager host -->
  <value>node2</value>
</property>
<property>
  <name>yarn.log-aggregation-enable</name>         <!-- enable log aggregation -->
  <value>true</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name> <!-- how long aggregated logs are kept in HDFS -->
  <value>106800</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name> <!-- memory per NodeManager; no less than 1024 MB -->
  <value>1024</value>
</property>
MapReduce (mapred-site.xml) is not configured above; if you need to run MapReduce jobs, configure it separately.
Configure the other two hosts the same way, or simply copy the configured directory to them, as sketched below.
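A sketch of the remote copy, relying on the passwordless SSH configured earlier (rsync is assumed to be installed; scp -r works as well):
hadoop@node1:~$ for host in node2 node3; do rsync -a ~/app/hadoop-2.7.1/ hadoop@$host:~/app/hadoop-2.7.1/; done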
With every host configured, format and start the cluster.
1. Format HDFS on the NameNode (node1):
hadoop@node1:~/app/hadoop-2.7.1$ bin/hdfs namenode -format
2. Start the cluster
hadoop@node1:~/app/hadoop-2.7.1$ ./sbin/start-dfs.sh
Starting namenodes on [node1]
node1: starting namenode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-namenode-node1.out
node1: starting datanode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-datanode-node1.out
node3: starting datanode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-datanode-node3.out
node2: starting datanode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-datanode-node2.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-secondarynamenode-node1.out
hadoop@node1:~/app/hadoop-2.7.1$ jps
36579 NameNode
36742 DataNode
37065 Jps
36939 SecondaryNameNode
hadoop@node1:~/app/hadoop-2.7.1$ ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.7.1/logs/yarn-hadoop-resourcemanager-node1.out
node1: starting nodemanager, logging to /home/hadoop/app/hadoop-2.7.1/logs/yarn-hadoop-nodemanager-node1.out
node2: starting nodemanager, logging to /home/hadoop/app/hadoop-2.7.1/logs/yarn-hadoop-nodemanager-node2.out
node3: starting nodemanager, logging to /home/hadoop/app/hadoop-2.7.1/logs/yarn-hadoop-nodemanager-node3.out
hadoop@node1:~/app/hadoop-2.7.1$ jps
36579 NameNode
36742 DataNode
37399 Jps
36939 SecondaryNameNode
37261 NodeManager
start-yarn.sh starts the ResourceManager on whichever host it is invoked from, so it has to be launched manually on the host it is actually configured for. Switch to the ResourceManager host (node2):
hadoop@node2:~/app/hadoop-2.7.1$ ./sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.7.1/logs/yarn-hadoop-resourcemanager-node2.out
hadoop@node2:~/app/hadoop-2.7.1$ jps
23944 ResourceManager
23771 NodeManager
23595 DataNode
24172 Jps
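Back on node1, it's worth confirming that all three DataNodes registered with the NameNode; dfsadmin prints a cluster summary, and the live-datanode count should be 3:
hadoop@node1:~/app/hadoop-2.7.1$ bin/hdfs dfsadmin -report | grep "Live datanodes"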
Next, install ZooKeeper. Do the following on each of the three machines.
1. Unpack the archive
hadoop@node1:~/app$ tar -zxvf zookeeper-3.3.6.tar.gz
hadoop@node1:~/app$ ls
hadoop-2.7.1 zookeeper-3.3.6
2. Configure zoo.cfg
hadoop@node1:~/app/zookeeper-3.3.6/conf$ cp zoo_sample.cfg zoo.cfg
hadoop@node1:~/app/zookeeper-3.3.6/conf$ vim zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/home/hadoop/data/zdata
# the port at which the clients will connect
clientPort=2181
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
3. Create the myid file
hadoop@node1:~/data$ touch zdata/myid
hadoop@node1:~/data$ echo 1 >> zdata/myid
hadoop@node1:~/data$ cat zdata/myid
1
On the other two machines set myid to 2 and 3 respectively, as shown below.
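For completeness, the equivalent commands on the other two nodes (assuming the same ~/data/zdata layout):
hadoop@node2:~/data$ echo 2 > zdata/myid
hadoop@node3:~/data$ echo 3 > zdata/myid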
4. Start ZooKeeper (on each of the three nodes)
hadoop@node1:~/app/zookeeper-3.3.6$ bin/zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.3.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
hadoop@node1:~/app/zookeeper-3.3.6$ jps
38181 QuorumPeerMain
38204 Jps
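After starting ZooKeeper on all three nodes, each instance can be queried for its role; one node should report Mode: leader and the other two Mode: follower:
hadoop@node1:~/app/zookeeper-3.3.6$ bin/zkServer.sh status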
Next, install HBase.
1. Server roles

|              | node1 | node2 | node3 |
| ------------ | ----- | ----- | ----- |
| Master       | yes   |       |       |
| RegionServer | yes   | yes   | yes   |
2. Unpack the archive
hadoop@node1:~/app$ tar -zxvf hbase-1.2.6-bin.tar.gz
hadoop@node1:~/app$ ls
hadoop-2.7.1 hbase-1.2.6 zookeeper-3.3.6
3. Configure the Java environment in hbase-env.sh
hadoop@node1:~/app/hbase-1.2.6$ vim conf/hbase-env.sh
# The java implementation to use. Java 1.7+ required.
export JAVA_HOME=/usr/java/jdk1.8.0_161/
# Tell HBase whether it should manage its own instance of ZooKeeper or not.
# We run our own ZooKeeper ensemble (installed above), so HBase must not manage one.
export HBASE_MANAGES_ZK=false
4. Configure hbase-site.xml:
<property>
  <name>hbase.rootdir</name>                       <!-- HBase root directory; must point at the HDFS NameNode -->
  <value>hdfs://node1:9000/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>           <!-- run in distributed mode -->
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>    <!-- ZooKeeper data path; keep consistent with zoo.cfg -->
  <value>/home/hadoop/data/zdata</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>              <!-- ZooKeeper ensemble -->
  <value>node1,node2,node3</value>
</property>
5. Configure regionservers (node1 also carries a RegionServer in this layout, as the startup output below confirms):
node1
node2
node3
Repeat these steps on the other hosts, or copy the configured directory over as before.
6. Start HBase
(1) Start HDFS first (on the NameNode, node1)
hadoop@node1:~/app/hbase-1.2.6$ start-dfs.sh
Starting namenodes on [node1]
node1: starting namenode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-namenode-node1.out
node1: starting datanode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-datanode-node1.out
node2: starting datanode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-datanode-node2.out
node3: starting datanode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-datanode-node3.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.7.1/logs/hadoop-hadoop-secondarynamenode-node1.out
hadoop@node1:~/app/hbase-1.2.6$ jps
38913 SecondaryNameNode
38551 NameNode
38716 DataNode
39037 Jps
(2) Start HBase (on the Master, node1)
hadoop@node1:~/app/hbase-1.2.6$ ./bin/start-hbase.sh
starting master, logging to /home/hadoop/app/hbase-1.2.6/bin/../logs/hbase-hadoop-master-node1.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node1: starting regionserver, logging to /home/hadoop/app/hbase-1.2.6/bin/../logs/hbase-hadoop-regionserver-node1.out
node2: starting regionserver, logging to /home/hadoop/app/hbase-1.2.6/bin/../logs/hbase-hadoop-regionserver-node2.out
node3: starting regionserver, logging to /home/hadoop/app/hbase-1.2.6/bin/../logs/hbase-hadoop-regionserver-node3.out
node1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node2: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node2: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node3: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node3: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
hadoop@node1:~/app/hbase-1.2.6$ jps
38913 SecondaryNameNode
39346 HRegionServer
39189 HMaster
38181 QuorumPeerMain
38551 NameNode
38716 DataNode
39663 Jps
(3) Web UIs
HBase Web UI: http://node1:16010
HDFS Web UI: http://node1:50070
Finally, access the cluster from a Java client.
1. Configure hosts on the client machine:
1.1.12.23 node1
1.1.12.45 node2
1.1.12.48 node3
2. Put an hbase-site.xml under the project's resource directory src/main/resources, containing at least the quorum:
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>node1,node2,node3</value>
</property>
3. Example
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class App
{
    static Configuration conf;

    public static void main( String[] args )
    {
        // picks up hbase-site.xml from the classpath (src/main/resources)
        conf = HBaseConfiguration.create();
        try {
            Connection conn = ConnectionFactory.createConnection(conf);
            Admin admin = conn.getAdmin();
            // create the table with one column family if it does not exist yet
            if (!admin.tableExists(TableName.valueOf("mytable"))) {
                HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("mytable"));
                desc.addFamily(new HColumnDescriptor("c1"));
                admin.createTable(desc);
            }
            Table table = conn.getTable(TableName.valueOf("mytable"));
            // insert one cell: row1, c1:a, value1
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("c1"), Bytes.toBytes("a"), Bytes.toBytes("value1"));
            table.put(put);
            table.close();
            System.out.println("successfully put data...");
            conn.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
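To confirm the row actually landed, scan the table from the HBase shell on any cluster node; the scan should show row1 with column c1:a and value1:
hadoop@node1:~/app/hbase-1.2.6$ ./bin/hbase shell
hbase(main):001:0> scan 'mytable'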