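ZooKeeper - a coordination service for distributed systems.
In a Hadoop cluster, a zk ensemble of n servers (n > 1) stays available as long as a majority of them, n/2 + 1 (integer division), are up. This is why clusters usually run an odd number of zk servers: 3, 5, 7 or 9. Several zk instances can also be set up on the same host.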
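The majority rule can be checked with a little shell arithmetic. This is only a sketch: the function names `quorum_size` and `tolerated_failures` are made up for illustration and are not part of ZooKeeper.

```shell
#!/usr/bin/env bash
# Majority needed for an ensemble of n servers: floor(n/2) + 1.
quorum_size() {
  local n=$1
  echo $(( n / 2 + 1 ))
}

# Number of servers that may be down while a majority is still up.
tolerated_failures() {
  local n=$1
  echo $(( n - (n / 2 + 1) ))
}

# Print a small table for common ensemble sizes.
for n in 3 4 5 7 9; do
  printf 'servers=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(quorum_size "$n")" "$(tolerated_failures "$n")"
done
```

An ensemble of 4 tolerates one failure, the same as an ensemble of 3, so the extra even-numbered server adds no fault tolerance.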
My environment uses the cdh5.7.0 suite, so I download the matching zookeeper version.
Download
[hadoop@hadoop000 software]$ wget http://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.7.0.tar.gz
Extract and install
[hadoop@hadoop000 software]$ tar -zxvf zookeeper-3.4.5-cdh5.7.0.tar.gz -C ~/app/
Configure environment variables
export ZK_HOME=/home/hadoop/app/zookeeper-3.4.5-cdh5.7.0
export PATH=$ZK_HOME/bin:$PATH
Configure zk
[hadoop@hadoop000 zookeeper-3.4.5-cdh5.7.0]$ cp conf/zoo_sample.cfg conf/zoo.cfg
[hadoop@hadoop000 zookeeper-3.4.5-cdh5.7.0]$ vi conf/zoo.cfg
dataDir=/home/hadoop/tmp/zookeeper
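If you would rather script the edit than open vi, something along these lines might work. It is a sketch under assumptions: `set_data_dir` is a hypothetical helper, and it assumes the directory path contains no `|` or `&` characters, which would confuse sed.

```shell
# Replace the dataDir line in a zoo.cfg-style file, or append one if absent.
# Usage: set_data_dir <cfg-file> <data-dir>
set_data_dir() {
  local cfg=$1 dir=$2 tmp
  tmp=$(mktemp) || return 1
  if grep -q '^dataDir=' "$cfg"; then
    # '|' is used as the sed delimiter because the path contains '/'.
    sed "s|^dataDir=.*|dataDir=$dir|" "$cfg" > "$tmp"
  else
    { cat "$cfg"; echo "dataDir=$dir"; } > "$tmp"
  fi
  # Write back through the original file so its permissions are kept.
  cat "$tmp" > "$cfg"
  rm -f "$tmp"
}

# Example (not executed here):
#   set_data_dir conf/zoo.cfg /home/hadoop/tmp/zookeeper
```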
Start zk
[hadoop@hadoop000 zookeeper-3.4.5-cdh5.7.0]$ cd bin/
# Start
[hadoop@hadoop000 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@hadoop000 bin]$
# Restart
[hadoop@hadoop000 bin]$ ./zkServer.sh restart
Check the startup status
# Check directly with the zk command
[hadoop@hadoop000 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
Mode: standalone # zk is running in standalone mode
[hadoop@hadoop000 bin]$
# You can also use jps to check whether the zk process QuorumPeerMain is alive
[hadoop@hadoop000 bin]$ jps
5744 DataNode
22305 QuorumPeerMain -- zk
5545 NameNode
6361 NodeManager
6010 SecondaryNameNode
22347 Jps
6236 ResourceManager
3439 Nailgun
[hadoop@hadoop000 bin]$
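The jps check can be scripted as well. A small sketch, where `has_zk_process` is a hypothetical helper name; it reads a jps listing on stdin and looks at the second column only.

```shell
# Succeeds if the jps listing on stdin contains a QuorumPeerMain process.
has_zk_process() {
  awk '$2 == "QuorumPeerMain" { found = 1 } END { exit !found }'
}

# Example (not executed here):
#   jps | has_zk_process && echo "zk is running"
```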
Stop zk
[hadoop@hadoop000 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[hadoop@hadoop000 bin]$
Look at zk's dataDir
[hadoop@hadoop000 zookeeper]$ ls -ltr
total 4
drwxrwxr-x. 2 hadoop hadoop 23 Aug 15 20:23 version-2 # a directory (snapshots and transaction logs)
-rw-rw-r--. 1 hadoop hadoop 5 Aug 15 20:28 zookeeper_server.pid # process id of the zk server
[hadoop@hadoop000 zookeeper]$
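The pid file gives another way to check on the server. A minimal sketch, assuming the file holds a single numeric pid as shown above; `zk_pid_alive` is a hypothetical helper name.

```shell
# Succeeds if the pid stored in the given file belongs to a live process.
# kill -0 sends no signal; it only checks that the process exists and
# that the current user is allowed to signal it.
zk_pid_alive() {
  local pidfile=$1 pid
  [ -r "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Example (not executed here):
#   zk_pid_alive /home/hadoop/tmp/zookeeper/zookeeper_server.pid && echo alive
```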
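The basic zk data model is a tree (similar to a file system).
zk nodes (znodes) are either ephemeral or persistent.
An ephemeral node lives only as long as the session that created it; a persistent node stays until it is explicitly deleted. (The same idea as temporary vs. permanent UDFs in Hive.)
A persistent node can have child nodes; an ephemeral node cannot have any children.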
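znode characteristics:
znode: every node is uniquely identified (by its path), and every node except the root has exactly one parent
When the data on a node changes, its data version number dataVersion is incremented by 1; cversion is incremented when the node's children change.
Do not store large data on a znode (keep it to a few KB), e.g. configuration data.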
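A zk client is used to connect to the zk server.
# You can start a zk client to inspect what is stored in the files under version-2
# Start a zk client; the client connects to port 2181 of the zk server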
[hadoop@hadoop000 bin]$ ./zkCli.sh
Connecting to localhost:2181
...
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]
Help
# Show help
[zk: localhost:2181(CONNECTED) 0] help # from here on, the counter in the prompt goes up by 1 with each command
List paths
# List paths
[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 2] ls /zookeeper
[quota]
[zk: localhost:2181(CONNECTED) 3] ls /zookeeper/quota
[]
[zk: localhost:2181(CONNECTED) 4]
View a node's data
# View the contents
[zk: localhost:2181(CONNECTED) 4] get /zookeeper/quota
# The blank line here means the node holds no data; the lines below are the node's attributes
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0
[zk: localhost:2181(CONNECTED) 5]
# ls2 = ls + stat (child list plus node attributes; the data itself is not printed)
[zk: localhost:2181(CONNECTED) 5] ls2 /zookeeper/quota
[] # the child list is empty
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0
[zk: localhost:2181(CONNECTED) 6]
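Create nodes (sequential and ephemeral) and look at their attributes
# Create a node (znode)
# create -e creates an ephemeral node; create -s creates a sequential (sequence) node
# Create a sequential node with the prefix /node_a and the data hello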
[zk: localhost:2181(CONNECTED) 13] create -s /node_a hello
Created /node_a0000000003
[zk: localhost:2181(CONNECTED) 14] ls /
[zookeeper, node_a0000000003]
[zk: localhost:2181(CONNECTED) 15] get /node_a0000000003
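hello # data content
cZxid = 0x5 # zxid of the transaction that created the node
ctime = Wed Aug 15 21:05:23 CST 2018 # creation time
mZxid = 0x5
mtime = Wed Aug 15 21:05:23 CST 2018 # last modification time
pZxid = 0x5 # zxid of the most recent change to the child list
cversion = 0 # child version number
dataVersion = 0 # data version number
aclVersion = 0
ephemeralOwner = 0x0 # whether the node is ephemeral (0x0 means no, it is persistent)
dataLength = 5 # data length
numChildren = 0 # number of children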
[zk: localhost:2181(CONNECTED) 16]
# Create an ephemeral node under the node above and compare how the attributes change
[zk: localhost:2181(CONNECTED) 16] create -e /node_a0000000003/tmp hello
Created /node_a0000000003/tmp
[zk: localhost:2181(CONNECTED) 17] get /node_a0000000003
hello # data content unchanged
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x5
mtime = Wed Aug 15 21:05:23 CST 2018
pZxid = 0x6 # zxid of the most recent child change has been updated
cversion = 1 # child version number changed
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 1 # number of children + 1
# Look at the ephemeral node
[zk: localhost:2181(CONNECTED) 18] get /node_a0000000003/tmp
hello
cZxid = 0x6 # same as the pZxid of the parent node above
ctime = Wed Aug 15 21:13:23 CST 2018
mZxid = 0x6
mtime = Wed Aug 15 21:13:23 CST 2018
pZxid = 0x6
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x1653d96c64c0000 # ephemeral node (id of the owning session)
dataLength = 5
numChildren = 0
[zk: localhost:2181(CONNECTED) 19]
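# Close the current window, open a new one, and look at /node_a0000000003 again: the ephemeral node tmp under it is gone, i.e. an ephemeral node only lives for the session that created it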
[zk: localhost:2181(CONNECTED) 0] ls /node_a0000000003
[]
[zk: localhost:2181(CONNECTED) 1]
# An ephemeral node cannot have any children
[zk: localhost:2181(CONNECTED) 14] create -e /node_a0000000003/tmp hello
Created /node_a0000000003/tmp
[zk: localhost:2181(CONNECTED) 15] create -e /node_a0000000003/tmp/test hello
Ephemerals cannot have children: /node_a0000000003/tmp/test
[zk: localhost:2181(CONNECTED) 16]
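Since the ephemeralOwner field is what distinguishes the two kinds of node, the check can be automated by parsing the attribute lines. A sketch only: `znode_kind` is a hypothetical helper, and it assumes the node's own data does not contain a line that looks like `ephemeralOwner = ...`.

```shell
# Reads the output of zkCli's `get <path>` on stdin and prints
# "ephemeral" or "persistent" based on the ephemeralOwner field.
# Fails (non-zero exit) if no ephemeralOwner line is found.
znode_kind() {
  awk '$1 == "ephemeralOwner" && $2 == "=" {
         print ($3 == "0x0" ? "persistent" : "ephemeral")
         found = 1
       }
       END { exit !found }'
}
```

The attribute lines can be pasted from a zkCli session or captured from a non-interactive client call and piped into the function.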
# Create several sequential nodes
[zk: localhost:2181(CONNECTED) 3] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000001
[zk: localhost:2181(CONNECTED) 4] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000002
[zk: localhost:2181(CONNECTED) 5] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000003
[zk: localhost:2181(CONNECTED) 6] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000004
[zk: localhost:2181(CONNECTED) 7] create -s /node_a0000000003/seq zookeeper
Created /node_a0000000003/seq0000000005
[zk: localhost:2181(CONNECTED) 8] ls /node_a0000000003
[seq0000000005, seq0000000003, seq0000000004, seq0000000001, seq0000000002]
[zk: localhost:2181(CONNECTED) 9]
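Note that ls returns the children in no particular order. Because the sequence suffix is zero-padded to a fixed width, a plain lexical sort restores creation order for children that share a prefix. A small sketch, with `sorted_children` as a hypothetical helper name:

```shell
# Reads one line of `ls` output such as "[seq0000000005, seq0000000003]"
# on stdin and prints the child names in ascending order, one per line.
sorted_children() {
  sed 's/[][ ]//g' |   # drop the brackets and spaces
    tr ',' '\n' |      # one name per line
    sed '/^$/d' |      # an empty list "[]" yields no lines
    LC_ALL=C sort
}

# Example (not executed here):
#   echo '[seq0000000005, seq0000000003]' | sorted_children | head -n 1
```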
# Creating several nested levels in one command does not work. It can be done through the API provided by Curator, though
[zk: localhost:2181(CONNECTED) 13] create -s /node_a0000000003/a/b/c abc
Node does not exist: /node_a0000000003/a/b/c
[zk: localhost:2181(CONNECTED) 14]
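With only the command-line client, one workaround is to create each level in turn. The sketch below merely prints the create commands, parents first, so they can be pasted into zkCli. `mkdirp_commands` is a hypothetical helper, and the placeholder data `none` for intermediate nodes is an arbitrary choice. For a level that already exists, the corresponding create simply reports an error and the remaining commands still apply.

```shell
# Prints one zkCli `create` command per path level, parents first.
# Usage: mkdirp_commands <absolute-path> <leaf-data> [parent-data]
mkdirp_commands() {
  local path=$1 leaf_data=$2 parent_data=${3:-none}
  local current="" part rest=${path#/}
  while [ -n "$rest" ]; do
    part=${rest%%/*}              # next path component
    current="$current/$part"
    if [ "$part" = "$rest" ]; then
      rest=""                     # last component: this is the leaf
      echo "create $current $leaf_data"
    else
      rest=${rest#*/}             # strip the component just handled
      echo "create $current $parent_data"
    fi
  done
}

# Example (not executed here):
#   mkdirp_commands /node_a0000000003/a/b/c abc
```

The commands create plain persistent nodes. Adding -s to them would append a sequence suffix at every level, so the next level's parent path would no longer match.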
Modify the data on a node (mind the version when using set and delete)
# Modify the data on the node; dataVersion + 1
[zk: localhost:2181(CONNECTED) 16] set /node_a0000000003 hello_new
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x13
mtime = Wed Aug 15 21:35:45 CST 2018
pZxid = 0x11
cversion = 8
dataVersion = 1 # dataVersion + 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 6
[zk: localhost:2181(CONNECTED) 17] get /node_a0000000003
hello_new
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x13
mtime = Wed Aug 15 21:35:45 CST 2018
pZxid = 0x11
cversion = 8
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 6
[zk: localhost:2181(CONNECTED) 18]
# The current dataVersion is 1; update the data only if the version is 1: set /znode data version
[zk: localhost:2181(CONNECTED) 18] set /node_a0000000003 hello_2 1
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x14
mtime = Wed Aug 15 21:38:33 CST 2018
pZxid = 0x11
cversion = 8
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 6
[zk: localhost:2181(CONNECTED) 19] get /node_a0000000003
hello_2 # data updated successfully
cZxid = 0x5
ctime = Wed Aug 15 21:05:23 CST 2018
mZxid = 0x14
mtime = Wed Aug 15 21:38:33 CST 2018
pZxid = 0x11
cversion = 8
dataVersion = 2 # the data version number is now 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 6
[zk: localhost:2181(CONNECTED) 20]
# Setting against version 1 again fails, because the current dataVersion is now 2 and no longer matches (this is an optimistic lock)
[zk: localhost:2181(CONNECTED) 20] set /node_a0000000003 hello_3 1
version No is not valid : /node_a0000000003
[zk: localhost:2181(CONNECTED) 21]
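The read-then-conditional-write pattern behind this optimistic lock can be sketched as a helper that takes the dataVersion from a previous get and builds the matching set command. `versioned_set_cmd` is a hypothetical name; the function only prints the command and does not talk to the server.

```shell
# Reads `get <path>` output on stdin, extracts dataVersion, and prints a
# `set <path> <data> <version>` command. The server will reject that
# command if someone else has modified the node since it was read.
versioned_set_cmd() {
  local path=$1 data=$2 version
  # awk inherits the function's stdin, i.e. the piped `get` output.
  version=$(awk '$1 == "dataVersion" && $2 == "=" { print $3; exit }')
  [ -n "$version" ] || return 1
  echo "set $path $data $version"
}
```

If the set fails with the version error shown above, the usual response is to read the node again, reapply the change to the fresh data, and retry with the new version.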
Delete nodes
[zk: localhost:2181(CONNECTED) 21] ls /node_a0000000003
[seq0000000005, seq0000000003, seq0000000004, seq0000000001, tmp, seq0000000002]
[zk: localhost:2181(CONNECTED) 22] delete /node_a0000000003/seq000000000
seq0000000005 seq0000000003 seq0000000004 seq0000000001 seq0000000002
[zk: localhost:2181(CONNECTED) 22] delete /node_a0000000003/seq0000000001
[zk: localhost:2181(CONNECTED) 23] ls /node_a0000000003
[seq0000000005, seq0000000003, seq0000000004, tmp, seq0000000002]
[zk: localhost:2181(CONNECTED) 24]
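# Mind the version number when deleting
# Modify the data of /node_a0000000003/seq0000000002 so that its dataVersion becomes 1, then pay attention to the version number when deleting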
[zk: localhost:2181(CONNECTED) 24] get /node_a0000000003/seq0000000002
zookeeper
cZxid = 0xa
ctime = Wed Aug 15 21:24:19 CST 2018
mZxid = 0xa
mtime = Wed Aug 15 21:24:19 CST 2018
pZxid = 0xa
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 0
[zk: localhost:2181(CONNECTED) 25] set /node_a0000000003/seq000000000
seq0000000005 seq0000000003 seq0000000004 seq0000000002
[zk: localhost:2181(CONNECTED) 25] set /node_a0000000003/seq0000000002 zookeeper_1
cZxid = 0xa
ctime = Wed Aug 15 21:24:19 CST 2018
mZxid = 0x17
mtime = Wed Aug 15 21:50:03 CST 2018
pZxid = 0xa
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 11
numChildren = 0
[zk: localhost:2181(CONNECTED) 26] get /node_a0000000003/seq0000000002
zookeeper_1
cZxid = 0xa
ctime = Wed Aug 15 21:24:19 CST 2018
mZxid = 0x17
mtime = Wed Aug 15 21:50:03 CST 2018
pZxid = 0xa
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 11
numChildren = 0
[zk: localhost:2181(CONNECTED) 27] delete /node_a0000000003/seq0000000002 0
version No is not valid : /node_a0000000003/seq0000000002
[zk: localhost:2181(CONNECTED) 28] delete /node_a0000000003/seq0000000002 1
[zk: localhost:2181(CONNECTED) 29] ls /node_a0000000003
[seq0000000005, seq0000000003, seq0000000004, tmp] # seq0000000002 has been deleted
[zk: localhost:2181(CONNECTED) 30]
Four-letter commands: http://zookeeper.apache.org/ -> Documentation -> Release 3.4.13 -> Administrator's Guide -> ZooKeeper Commands: The Four Letter Words.
CentOS 7 does not install netcat by default, so install it first
[root@hadoop01 ~]# yum -y install nmap-ncat
[root@hadoop01 yum.repos.d]# nc -help
The netcat command is now available
[root@hadoop01 yum.repos.d]# echo mntr | nc localhost 2181
zk_version 3.4.5-cdh5.7.0--1, built on 03/23/2016 18:31 GMT
zk_avg_latency 0
zk_max_latency 43
zk_min_latency 0
zk_packets_received 475
zk_packets_sent 474
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state standalone
zk_znode_count 6
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 87
zk_open_file_descriptor_count 28
zk_max_file_descriptor_count 4096
[root@hadoop01 yum.repos.d]#
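The mntr output is one key and value per line, which makes it easy to pick out a single metric for monitoring. A minimal sketch, with `mntr_value` as a hypothetical helper name:

```shell
# Reads `mntr` output on stdin and prints the value of one metric.
# Fails (non-zero exit) if the metric is not present.
mntr_value() {
  local key=$1
  awk -v key="$key" '
    $1 == key {
      $1 = ""                 # drop the key, keep the rest of the line
      sub(/^[ \t]+/, "")      # trim the separator left behind
      print
      found = 1
      exit
    }
    END { exit !found }'
}

# Example (not executed here):
#   echo mntr | nc localhost 2181 | mntr_value zk_znode_count
```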
stat - show the client connections to zk
[root@hadoop000 ~]# echo stat | nc localhost 2181
Zookeeper version: 3.4.5-cdh5.7.0--1, built on 03/23/2016 18:31 GMT
Clients:
/0:0:0:0:0:0:0:1:55671[1](queued=0,recved=2,sent=2)
/0:0:0:0:0:0:0:1:55672[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/24
Received: 1930
Sent: 1929
Connections: 2
Outstanding: 0
Zxid: 0x1b
Mode: standalone
Node count: 11
[root@hadoop000 ~]#
ruok - test whether the zk server is running
[root@hadoop000 ~]# echo ruok | nc localhost 2181
imok[root@hadoop000 ~]#
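For a health check in a script, the reply can be compared against imok. A sketch under assumptions: `zk_is_ok` is a hypothetical helper, it relies on the same nc command used above, and it sets no timeout, so a hung connection would block.

```shell
# Succeeds if the server at host:port answers "imok" to the ruok command.
# Usage: zk_is_ok [host] [port]   (defaults: localhost 2181)
zk_is_ok() {
  local host=${1:-localhost} port=${2:-2181} reply
  reply=$(echo ruok | nc "$host" "$port" 2>/dev/null)
  [ "$reply" = "imok" ]
}

# Example (not executed here):
#   zk_is_ok localhost 2181 && echo "zk server is healthy"
```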
dump - list the outstanding sessions and the ephemeral nodes
[root@hadoop000 ~]# echo dump | nc localhost 2181
SessionTracker dump:
Session Sets (3):
0 expire at Thu Aug 16 18:57:04 CST 2018:
0 expire at Thu Aug 16 18:57:14 CST 2018:
1 expire at Thu Aug 16 18:57:24 CST 2018:
0x1653d96c64c0002
ephemeral nodes dump:
Sessions with Ephemerals (0):
[root@hadoop000 ~]#
For the other four-letter commands, see the official zk documentation.