[Hadoop] ZooKeeper Installation, Deployment, and Basic Operations

ZooKeeper is a coordination service for distributed systems.

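In a Hadoop cluster, a zk ensemble of n servers (n > 1) keeps working as long as a majority of them, n/2+1 (integer division), are alive. Clusters therefore usually run an odd number of zk servers: 3, 5, 7, or 9. Several zk instances can also be set up on the same machine.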
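
The majority rule can be checked with a little shell arithmetic. This is only an illustrative sketch; quorum_size and tolerated_failures are made-up helper names, not part of ZooKeeper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: majority size for an ensemble of n servers.
quorum_size() {
  local n=$1
  echo $(( n / 2 + 1 ))          # integer division
}

# Hypothetical helper: how many servers may fail while a majority is still alive.
tolerated_failures() {
  local n=$1
  echo $(( n - (n / 2 + 1) ))
}

for n in 3 4 5; do
  echo "n=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

An ensemble of 4 tolerates no more failures than an ensemble of 3, which is why odd sizes are the usual choice.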

 

1. Downloading and Installing ZooKeeper

My environment uses the CDH 5.7.0 suite, so I downloaded the matching ZooKeeper version.

Download

[hadoop@hadoop000 software]$ wget http://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.7.0.tar.gz

Extract and install

[hadoop@hadoop000 software]$ tar -zxvf zookeeper-3.4.5-cdh5.7.0.tar.gz -C ~/app/

Configure environment variables

export ZK_HOME=/home/hadoop/app/zookeeper-3.4.5-cdh5.7.0
export PATH=$ZK_HOME/bin:$PATH

Configure zk

[hadoop@hadoop000 zookeeper-3.4.5-cdh5.7.0]$ cp conf/zoo_sample.cfg conf/zoo.cfg
[hadoop@hadoop000 zookeeper-3.4.5-cdh5.7.0]$ vi conf/zoo.cfg
dataDir=/home/hadoop/tmp/zookeeper
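
The same change can be scripted instead of made in vi. A minimal sketch, assuming the paths used in this post; set_data_dir is a hypothetical helper, not a ZooKeeper tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point dataDir in a zoo.cfg at a new directory.
# $1 = path to zoo.cfg, $2 = desired data directory
set_data_dir() {
  local cfg=$1 dir=$2
  mkdir -p "$dir"   # make sure the data directory exists
  # Rewrite only the dataDir line; all other settings are left untouched.
  sed "s|^dataDir=.*|dataDir=$dir|" "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}

# Example call with the paths from this post:
# set_data_dir "$ZK_HOME/conf/zoo.cfg" /home/hadoop/tmp/zookeeper
```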

Start zk

[hadoop@hadoop000 zookeeper-3.4.5-cdh5.7.0]$ cd bin/
# Start
[hadoop@hadoop000 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@hadoop000 bin]$


# Restart
[hadoop@hadoop000 bin]$ ./zkServer.sh restart

Check the status

# Check directly with the zk script
[hadoop@hadoop000 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
Mode: standalone    # zk is running in standalone mode
[hadoop@hadoop000 bin]$

# Alternatively, use jps to check that the zk process QuorumPeerMain is alive
[hadoop@hadoop000 bin]$ jps
5744 DataNode
22305 QuorumPeerMain        -- zk
5545 NameNode
6361 NodeManager
6010 SecondaryNameNode
22347 Jps
6236 ResourceManager
3439 Nailgun
[hadoop@hadoop000 bin]$

Stop zk

[hadoop@hadoop000 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[hadoop@hadoop000 bin]$

Inspect zk's dataDir

[hadoop@hadoop000 zookeeper]$ ls -ltr
total 4
drwxrwxr-x. 2 hadoop hadoop 23 Aug 15 20:23 version-2   # a directory (holds snapshots and transaction logs)
-rw-rw-r--. 1 hadoop hadoop  5 Aug 15 20:28 zookeeper_server.pid   # process id of the zk server
[hadoop@hadoop000 zookeeper]$

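2. The Basic zk Data Model

The basic zk data model is a tree, similar to a file system.

zk nodes (znodes) are either ephemeral or persistent.

An ephemeral node exists only for the lifetime of the session that created it; a persistent node remains until it is explicitly deleted. (Compare temporary and permanent UDFs in Hive.)

Persistent nodes can have child nodes; ephemeral nodes cannot.

Characteristics of znodes:

Every znode has an id that is never repeated, and every id has a parent id (pid).

Whenever the data on a node changes, the node's data version number dataVersion is incremented by 1. (cversion tracks changes to the node's children instead.)

Do not store large amounts of data on a znode; keep it to a few KB, such as a configuration file.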

 

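3. Basic ZooKeeper Operations

Use the zk client to connect to the zk server.

# A zk client can be used to inspect the data that the server keeps under version-2
# Start a zk client; it connects to port 2181 of the zk server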
[hadoop@hadoop000 bin]$ ./zkCli.sh
Connecting to localhost:2181
...

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]

Help

# Help
[zk: localhost:2181(CONNECTED) 0] help    # from here on, the number in the prompt increases by 1 with each command

List paths

# List paths
[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 2] ls /zookeeper
[quota]
[zk: localhost:2181(CONNECTED) 3] ls /zookeeper/quota
[]
[zk: localhost:2181(CONNECTED) 4]

View a node's data

# View the contents
[zk: localhost:2181(CONNECTED) 4] get /zookeeper/quota
                                         # the empty line here means the node has no data; the lines below are the node's attributes
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0
[zk: localhost:2181(CONNECTED) 5]

# ls2 = ls + stat (the child list followed by the node's attributes)
[zk: localhost:2181(CONNECTED) 5] ls2 /zookeeper/quota
[]                                                # the list of children is empty
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0
[zk: localhost:2181(CONNECTED) 6]

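Create nodes (sequential and ephemeral) and view their attributes

# Create a node (znode)
# create -e creates an ephemeral node; create -s creates a sequential node
# Create a sequential node with the prefix /node_a and the data hello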
[zk: localhost:2181(CONNECTED) 13] create -s /node_a hello
Created /node_a0000000003
[zk: localhost:2181(CONNECTED) 14] ls /
[zookeeper, node_a0000000003]
[zk: localhost:2181(CONNECTED) 15] get /node_a0000000003
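hello                                            # data content
cZxid = 0x5                                    # zxid (transaction id) of the creation
ctime = Wed Aug 15 21:05:23 CST 2018            # creation time
mZxid = 0x5
mtime = Wed Aug 15 21:05:23 CST 2018            # last modification time
pZxid = 0x5                                    # zxid of the most recent change to this node's children
cversion = 0                                    # child version (changes to the children)
dataVersion = 0                                # data version
aclVersion = 0
ephemeralOwner = 0x0                            # owner session if the node is ephemeral (0x0 means it is persistent)
dataLength = 5                                    # data length
numChildren = 0                                    # number of children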
[zk: localhost:2181(CONNECTED) 16]

# Create an ephemeral node under the node above and compare the attributes
[zk: localhost:2181(CONNECTED) 16] create -e /node_a0000000003/tmp hello
Created /node_a0000000003/tmp
[zk: localhost:2181(CONNECTED) 17] get /node_a0000000003
hello                                            # data content unchanged
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x5
mtime = Wed Aug 15 21:05:23 CST 2018
pZxid = 0x6                                    # zxid of the most recent child change has changed
cversion = 1                                    # child version has changed
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 1                                # number of children + 1

# View the ephemeral node
[zk: localhost:2181(CONNECTED) 18] get /node_a0000000003/tmp
hello
cZxid = 0x6                                    # same as the pZxid of the parent node above
ctime = Wed Aug 15 21:13:23 CST 2018
mZxid = 0x6
mtime = Wed Aug 15 21:13:23 CST 2018
pZxid = 0x6
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x1653d96c64c0000                # ephemeral node (id of the owning session)
dataLength = 5
numChildren = 0
[zk: localhost:2181(CONNECTED) 19]
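
The ephemeralOwner field is what tells the two node types apart in the output above. As a rough sketch, a script could classify a node from that value; is_ephemeral is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if an ephemeralOwner value denotes an ephemeral node.
# 0x0 means there is no owning session, i.e. the node is persistent.
is_ephemeral() {
  [ "$1" != "0x0" ]
}

for owner in 0x0 0x1653d96c64c0000; do
  if is_ephemeral "$owner"; then
    echo "$owner -> ephemeral"
  else
    echo "$owner -> persistent"
  fi
done
```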

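# Close the current window and open a new one, then look at /node_a0000000003 again: the ephemeral node tmp under it is gone, i.e. an ephemeral node lives only as long as the session that created it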
[zk: localhost:2181(CONNECTED) 0] ls /node_a0000000003
[]
[zk: localhost:2181(CONNECTED) 1]

# Ephemeral nodes cannot have any children
[zk: localhost:2181(CONNECTED) 14] create -e /node_a0000000003/tmp hello
Created /node_a0000000003/tmp
[zk: localhost:2181(CONNECTED) 15] create -e /node_a0000000003/tmp/test hello
Ephemerals cannot have children: /node_a0000000003/tmp/test
[zk: localhost:2181(CONNECTED) 16]


# Create several sequential nodes
[zk: localhost:2181(CONNECTED) 3] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000001
[zk: localhost:2181(CONNECTED) 4] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000002
[zk: localhost:2181(CONNECTED) 5] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000003
[zk: localhost:2181(CONNECTED) 6] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000004
[zk: localhost:2181(CONNECTED) 7] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000005
[zk: localhost:2181(CONNECTED) 8] ls /node_a0000000003
[seq0000000005, seq0000000003, seq0000000004, seq0000000001, seq0000000002]
[zk: localhost:2181(CONNECTED) 9]
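
As the ls output shows, children are not returned in creation order. Because the sequence suffix is a fixed-width, zero-padded counter, a plain lexicographic sort restores that order. A small sketch that works on the listing as text; sort_children is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an `ls` result such as "[b, a]" on stdin
# and print the child names one per line in sorted order.
sort_children() {
  tr -d '[] ' | tr ',' '\n' | sort
}

echo "[seq0000000005, seq0000000003, seq0000000004, seq0000000001, seq0000000002]" \
  | sort_children
```

The first line of the sorted output is the oldest sequential node, which is the property that lock and leader-election recipes rely on.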

# Creating several levels of nodes in one step does not work here, although it can be done through the API provided by Curator
[zk: localhost:2181(CONNECTED) 13] create -s /node_a0000000003/a/b/c abc
Node does not exist: /node_a0000000003/a/b/c
[zk: localhost:2181(CONNECTED) 14]
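
With the plain client, the workaround is to create each missing ancestor first. The sketch below only computes the paths that would have to be created, parent before child; path_prefixes is a hypothetical helper, and passing each line to a create command is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every ancestor of a znode path, then the path itself,
# one per line, e.g. /a/b/c -> /a, /a/b, /a/b/c.
path_prefixes() {
  local path=$1 prefix="" part
  local IFS='/'
  for part in $path; do          # split the path on "/"
    [ -z "$part" ] && continue   # skip the empty field before the leading "/"
    prefix="$prefix/$part"
    echo "$prefix"
  done
}

path_prefixes /node_a0000000003/a/b/c
```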

Modify a node's data (pay attention to the version when using set and delete)

# Modify the node's data; dataVersion + 1
[zk: localhost:2181(CONNECTED) 16] set /node_a0000000003 hello_new
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x13
mtime = Wed Aug 15 21:35:45 CST 2018
pZxid = 0x11
cversion = 8
dataVersion = 1                        # dataVersion + 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 6
[zk: localhost:2181(CONNECTED) 17] get /node_a0000000003
hello_new
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x13
mtime = Wed Aug 15 21:35:45 CST 2018
pZxid = 0x11
cversion = 8
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 6
[zk: localhost:2181(CONNECTED) 18]

# The current dataVersion is 1; update the data against version 1: set /znode data version
[zk: localhost:2181(CONNECTED) 18] set /node_a0000000003 hello_2 1
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x14
mtime = Wed Aug 15 21:38:33 CST 2018
pZxid = 0x11
cversion = 8
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 6
[zk: localhost:2181(CONNECTED) 19] get /node_a0000000003
hello_2                                            # the data was updated successfully
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x14
mtime = Wed Aug 15 21:38:33 CST 2018
pZxid = 0x11
cversion = 8
dataVersion = 2                                    # the data version is now 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 6
[zk: localhost:2181(CONNECTED) 20]

# Updating against version 1 again fails because the current version is no longer 1 (this is optimistic locking)
[zk: localhost:2181(CONNECTED) 20] set /node_a0000000003 hello_3 1
version No is not valid : /node_a0000000003
[zk: localhost:2181(CONNECTED) 21]
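
The version check behaves like a compare-and-set. The following sketch imitates it with two shell variables standing in for a znode; it does not talk to ZooKeeper, and set_if_version is a hypothetical name.

```shell
#!/usr/bin/env bash
# A stand-in for one znode: its data and its dataVersion.
node_data="hello_new"
node_version=1

# Update the data only if the caller's expected version matches the current one.
set_if_version() {
  local new_data=$1 expected=$2
  if [ "$expected" -ne "$node_version" ]; then
    echo "version No is not valid"
    return 1
  fi
  node_data=$new_data
  node_version=$(( node_version + 1 ))
}

# The first writer holds the current version and succeeds.
set_if_version hello_2 1 && echo "ok: data=$node_data version=$node_version"
# A second writer still holding version 1 is rejected and must re-read first.
set_if_version hello_3 1 || echo "rejected: data=$node_data version=$node_version"
```

A rejected writer is expected to read the node again, pick up the new version, and retry.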

Delete nodes

[zk: localhost:2181(CONNECTED) 21] ls /node_a0000000003
[seq0000000005, seq0000000003, seq0000000004, seq0000000001, tmp, seq0000000002]
[zk: localhost:2181(CONNECTED) 22] delete /node_a0000000003/seq000000000
seq0000000005   seq0000000003   seq0000000004   seq0000000001   seq0000000002
[zk: localhost:2181(CONNECTED) 22] delete /node_a0000000003/seq0000000001
[zk: localhost:2181(CONNECTED) 23] ls /node_a0000000003
[seq0000000005, seq0000000003, seq0000000004, tmp, seq0000000002]
[zk: localhost:2181(CONNECTED) 24]

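# Pay attention to the version number when deleting
# Modify the data of /node_a0000000003/seq0000000002 so that its dataVersion becomes 1, then delete it with a version number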
[zk: localhost:2181(CONNECTED) 24] get /node_a0000000003/seq0000000002
zookeeper
cZxid = 0xa
ctime = Wed Aug 15 21:24:19 CST 2018
mZxid = 0xa
mtime = Wed Aug 15 21:24:19 CST 2018
pZxid = 0xa
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 0
[zk: localhost:2181(CONNECTED) 25] set /node_a0000000003/seq000000000
seq0000000005   seq0000000003   seq0000000004   seq0000000002
[zk: localhost:2181(CONNECTED) 25] set /node_a0000000003/seq0000000002 zookeeper_1
cZxid = 0xa
ctime = Wed Aug 15 21:24:19 CST 2018
mZxid = 0x17
mtime = Wed Aug 15 21:50:03 CST 2018
pZxid = 0xa
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 11
numChildren = 0
[zk: localhost:2181(CONNECTED) 26] get /node_a0000000003/seq0000000002
zookeeper_1
cZxid = 0xa
ctime = Wed Aug 15 21:24:19 CST 2018
mZxid = 0x17
mtime = Wed Aug 15 21:50:03 CST 2018
pZxid = 0xa
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 11
numChildren = 0
[zk: localhost:2181(CONNECTED) 27] delete /node_a0000000003/seq0000000002 0
version No is not valid : /node_a0000000003/seq0000000002
[zk: localhost:2181(CONNECTED) 28] delete /node_a0000000003/seq0000000002 1
[zk: localhost:2181(CONNECTED) 29] ls /node_a0000000003
[seq0000000005, seq0000000003, seq0000000004, tmp]        # seq0000000002 has been deleted
[zk: localhost:2181(CONNECTED) 30]

 

4. zk Four-Letter Commands

http://zookeeper.apache.org/ -> Documentation -> Release 3.4.13 -> administrator's Guide -> Zookeeper Commands: The Four Letter Words.

netcat is not installed by default on CentOS 7, so install it first

[root@hadoop01 ~]# yum -y install nmap-ncat
[root@hadoop01 yum.repos.d]# nc -help

The netcat command is now available. mntr - list variables for monitoring the server

[root@hadoop01 yum.repos.d]# echo mntr | nc localhost 2181
zk_version	3.4.5-cdh5.7.0--1, built on 03/23/2016 18:31 GMT
zk_avg_latency	0
zk_max_latency	43
zk_min_latency	0
zk_packets_received	475
zk_packets_sent	474
zk_num_alive_connections	1
zk_outstanding_requests	0
zk_server_state	standalone
zk_znode_count	6
zk_watch_count	0
zk_ephemerals_count	0
zk_approximate_data_size	87
zk_open_file_descriptor_count	28
zk_max_file_descriptor_count	4096
[root@hadoop01 yum.repos.d]# 
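
Since mntr prints tab-separated key/value pairs, a single metric is easy to pick out in a script. A minimal sketch; mntr_value is a hypothetical helper that reads mntr output on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one metric from mntr output on stdin.
# Usage against a live server: echo mntr | nc localhost 2181 | mntr_value zk_znode_count
mntr_value() {
  awk -F '\t' -v key="$1" '$1 == key { print $2 }'
}

# Demo on two lines copied from the output above (no server needed).
printf 'zk_server_state\tstandalone\nzk_znode_count\t6\n' | mntr_value zk_znode_count
```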

stat - show brief server details and the connected zk clients

[root@hadoop000 ~]# echo stat | nc localhost 2181
Zookeeper version: 3.4.5-cdh5.7.0--1, built on 03/23/2016 18:31 GMT
Clients:
 /0:0:0:0:0:0:0:1:55671[1](queued=0,recved=2,sent=2)
 /0:0:0:0:0:0:0:1:55672[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/24
Received: 1930
Sent: 1929
Connections: 2
Outstanding: 0
Zxid: 0x1b
Mode: standalone
Node count: 11
[root@hadoop000 ~]#

ruok - test whether the zk server is running (it replies imok if it is)

[root@hadoop000 ~]# echo ruok | nc localhost 2181
imok[root@hadoop000 ~]#
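
For a scripted health check, the reply can be compared against imok. A sketch, with check_ruok as a hypothetical helper that reads the server's reply on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the reply read from stdin is exactly "imok".
# Usage against a live server: echo ruok | nc localhost 2181 | check_ruok
check_ruok() {
  local reply
  reply=$(cat)
  [ "$reply" = "imok" ]
}

# Demo with canned replies (no server needed).
if printf 'imok' | check_ruok; then echo "server is healthy"; fi
if ! printf '' | check_ruok; then echo "no valid reply"; fi
```

An empty reply is treated as a failure, which also covers the case where nc cannot connect at all.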

dump - list the outstanding sessions and the ephemeral nodes

[root@hadoop000 ~]# echo dump | nc localhost 2181
SessionTracker dump:
Session Sets (3):
0 expire at Thu Aug 16 18:57:04 CST 2018:
0 expire at Thu Aug 16 18:57:14 CST 2018:
1 expire at Thu Aug 16 18:57:24 CST 2018:
	0x1653d96c64c0002
ephemeral nodes dump:
Sessions with Ephemerals (0):
[root@hadoop000 ~]#

See the official zk documentation for the other four-letter commands.
