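zookeeper.admin.portUnification
zkSnapShotToolkit.sh: similar to MySQL's mysqlbinlog; it dumps a snapshot file to standard output and supports JSON
getEphemerals: returns all ephemeral nodes created by the session
ZooKeeper now has a built-in pluggable metrics system, exposed on port 7000 with /metrics as the access path: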
root@99-129:/usr/local/apache-zookeeper-3.6.0-bin# curl http://192.168.99.129:7000/metrics
# HELP outstanding_changes_removed outstanding_changes_removed
# TYPE outstanding_changes_removed counter
outstanding_changes_removed 0.0
# HELP request_throttle_wait_count request_throttle_wait_count
# TYPE request_throttle_wait_count counter
request_throttle_wait_count 0.0
# HELP diff_count diff_count
# TYPE diff_count counter
diff_count 0.0
# HELP commit_propagation_latency commit_propagation_latency
# TYPE commit_propagation_latency summary
commit_propagation_latency{quantile="0.5",} NaN
commit_propagation_latency{quantile="0.9",} NaN
commit_propagation_latency{quantile="0.99",} NaN
commit_propagation_latency_count 0.0
commit_propagation_latency_sum 0.0
# HELP dead_watchers_cleaner_latency dead_watchers_c
...
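The output above is in the Prometheus text format: comment lines start with #, and every other line is a sample name followed by a value. A minimal sketch of reading it in Python follows. The names SAMPLE and parse_metrics are made up for this illustration, and the parser assumes no timestamp follows the value and no label value contains a space, which holds for the output shown here.

```python
import math

# A few lines copied from the /metrics output above.
SAMPLE = """\
# HELP diff_count diff_count
# TYPE diff_count counter
diff_count 0.0
# HELP commit_propagation_latency commit_propagation_latency
# TYPE commit_propagation_latency summary
commit_propagation_latency{quantile="0.5",} NaN
commit_propagation_latency_count 0.0
commit_propagation_latency_sum 0.0
"""


def parse_metrics(text):
    """Map each sample name (labels included) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: the value is the last space-separated token.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)  # float() also accepts "NaN"
    return samples


metrics = parse_metrics(SAMPLE)
median = metrics['commit_propagation_latency{quantile="0.5",}']
print(metrics["diff_count"], math.isnan(median))
```

Against a live server, the text would come from the /metrics URL instead of the SAMPLE string.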
How to monitor ZooKeeper with Prometheus is not covered here; see the earlier article.
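Apache ZooKeeper supports audit logs from version 3.6.0 onward. Audit logging is disabled by default. To enable it, set audit.enable=true in conf/zoo.cfg. Audit logs are not written on every ZooKeeper server, only on the server the client is connected to, as shown below: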
root@99-129:/usr/local/zookeeper# tail -f logs/zookeeper_audit.log
2020-04-20 05:29:40,099 INFO audit.Log4jAuditLogger: user=root operation=serverStart result=success
2020-04-20 05:30:42,912 INFO audit.Log4jAuditLogger: session=0x100013a1f1a0000 user=192.168.99.130 ip=192.168.99.130 operation=delete znode=/str1000 result=success
2020-04-20 05:30:46,588 INFO audit.Log4jAuditLogger: session=0x100013a1f1a0000 user=192.168.99.130 ip=192.168.99.130 operation=delete znode=/str1002 result=success
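Each audit entry is a timestamp, a level, the logger name, and then key=value pairs, so it is easy to turn into a record for filtering. The sketch below is one way to do that. The names LINE, FIELD and parse_audit_line are made up for this illustration, and it assumes that no value contains a space.

```python
import re

# One line copied from the audit log above.
LINE = ("2020-04-20 05:30:42,912 INFO audit.Log4jAuditLogger: "
        "session=0x100013a1f1a0000 user=192.168.99.130 ip=192.168.99.130 "
        "operation=delete znode=/str1000 result=success")

# Assumption: values never contain whitespace.
FIELD = re.compile(r"(\w+)=(\S+)")


def parse_audit_line(line):
    """Split an audit line into its timestamp and key=value fields."""
    header, _, body = line.partition(": ")
    record = dict(FIELD.findall(body))
    # The first two tokens of the header are the date and the time.
    record["timestamp"] = " ".join(header.split()[:2])
    return record


record = parse_audit_line(LINE)
print(record["operation"], record["znode"], record["result"])
```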
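To change the audit log file, the number of backups, the maximum file size, or to plug in a custom audit logger, edit the corresponding definitions in log4j.properties.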
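By default there are only four authentication providers: IPAuthenticationProvider, SASLAuthenticationProvider, X509AuthenticationProvider, and DigestAuthenticationProvider.
The user is determined by the configured authentication provider: with IPAuthenticationProvider the authenticated IP is taken as the user, with SASLAuthenticationProvider the client principal, with X509AuthenticationProvider the client certificate, and with DigestAuthenticationProvider the authenticated user name.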
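A custom authentication provider can override org.apache.zookeeper.server.auth.AuthenticationProvider.getUserName(String id) to supply the user name. If the provider does not override this method, whatever is stored in org.apache.zookeeper.data.Id.id is taken as the user. Usually only the user name is stored in this field, but that depends on what the custom authentication provider chooses to store there. For audit logging, the value of org.apache.zookeeper.data.Id.id is taken as the user.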
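In a ZooKeeper server, not every operation is performed by a client; some are performed by the server itself. For example, when a client closes its session, the server deletes that session's ephemeral znodes. These deletions are not done directly by the client but by the server, and they are called system operations. When system operations are audit logged, the user associated with the ZooKeeper server is recorded as the user. For example, if the server principal is zookeeper/[email protected], it becomes the system user, and all system operations are logged under that user name.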
2020-04-20 05:29:40,099 INFO audit.Log4jAuditLogger: user=root operation=serverStart result=success
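If no user is associated with the ZooKeeper server, the user who started the server is taken as the system user. For example, if the server was started by root, then root is the system user: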
user=root operation=serverStart result=success
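A single client can attach multiple authentication schemes to one session. In that case every authenticated scheme is taken as a user, and they are shown as a comma-separated list. For example, if a client authenticates with the principal [email protected] and the IP 127.0.0.1, the audit log entry for creating a znode looks like this: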
session=0x10c0bcb0000 user=[email protected],127.0.0.1 ip=127.0.0.1 operation=create znode=/a result=success
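The rules above (a provider may override the user name, the stored id is used otherwise, and several schemes are joined with commas) can be modelled in a few lines. This is only an illustrative Python model, not ZooKeeper's Java API. AuthProvider, ShortNameProvider, audit_user, the scheme names and the principal alice@EXAMPLE.COM are all hypothetical.

```python
class AuthProvider:
    """Hypothetical stand-in for an authentication provider."""

    def get_user_name(self, id_value):
        # Default behaviour described above: the stored id is the user.
        return id_value


class ShortNameProvider(AuthProvider):
    """Hypothetical custom provider that overrides the user name."""

    def get_user_name(self, id_value):
        # Illustrative choice: drop the realm from "name@REALM".
        return id_value.split("@", 1)[0]


def audit_user(auth_ids, providers):
    """Join the user of every authenticated (scheme, id) pair with commas."""
    default = AuthProvider()
    names = []
    for scheme, id_value in auth_ids:
        provider = providers.get(scheme, default)
        names.append(provider.get_user_name(id_value))
    return ",".join(names)


# Hypothetical session authenticated through two schemes.
ids = [("sasl", "alice@EXAMPLE.COM"), ("ip", "127.0.0.1")]
plain_user = audit_user(ids, {})
custom_user = audit_user(ids, {"sasl": ShortNameProvider()})
print(plain_user)
print(custom_user)
```

The first call falls back to the stored ids for both schemes; the second shows how an overriding provider changes only its own part of the list.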
Dump snapshot data to standard output or to a JSON file:
root@99-131:/usr/local/apache-zookeeper-3.6.0-bin# /usr/local/apache-zookeeper-3.6.0-bin/bin/zkSnapShotToolkit.sh -d /tmp/zookeeper/version-2/snapshot.40000b802
/str22589
cZxid = 0x00000400005847
ctime = Mon Apr 20 04:45:17 EDT 2020
mZxid = 0x00000400005847
mtime = Mon Apr 20 04:45:17 EDT 2020
pZxid = 0x00000400005847
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x00000000000000
data = ZGVtbw== # base64-encoded
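Two fields in this dump are worth decoding. The data field is base64, and a zxid is a 64-bit number whose high 32 bits are the leader epoch and whose low 32 bits are a counter within that epoch. A small sketch, using the values from the dump above; the helper names decode_data and split_zxid are made up for this illustration.

```python
import base64


def decode_data(b64_text):
    """Decode the base64 'data' field of a dumped znode to bytes."""
    return base64.b64decode(b64_text)


def split_zxid(zxid_hex):
    """Split a zxid into (epoch, counter): its high and low 32 bits."""
    zxid = int(zxid_hex, 16)  # base 16 accepts the "0x" prefix
    return zxid >> 32, zxid & 0xFFFFFFFF


payload = decode_data("ZGVtbw==")
epoch, counter = split_zxid("0x00000400005847")
print(payload, epoch, counter)
```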
TxnLogToolkit is a command-line tool shipped with ZooKeeper that can recover transaction log entries with a broken CRC:
$ bin/zkTxnLogToolkit.sh log.100000001
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
4/5/18 2:15:58 PM CEST session 0x16295bafcc40000 cxid 0x0 zxid 0x100000001 createSession 30000
CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
4/5/18 2:16:12 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x100000003 createSession 30000
4/5/18 2:17:34 PM CEST session 0x26295bafcc90000 cxid 0x0 zxid 0x200000001 closeSession null
4/5/18 2:17:34 PM CEST session 0x16295bd23720000 cxid 0x0 zxid 0x200000002 createSession 30000
4/5/18 2:18:02 PM CEST session 0x16295bd23720000 cxid 0x2 zxid 0x200000003 create '/andor,#626262,v{s{31,s{'world,'anyone}}},F,1
EOF reached after 6 txns.
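In the dump above, a damaged entry is printed with a CRC ERROR prefix, followed by the same entry on its own line. To check a long log for damage before deciding on a repair, the output can be scanned for that prefix. A minimal sketch follows; DUMP, ZXID and crc_error_zxids are names made up for this illustration, and the line layout is assumed to match the output shown here.

```python
import re

# Lines copied from the zkTxnLogToolkit.sh output above.
DUMP = """\
4/5/18 2:15:58 PM CEST session 0x16295bafcc40000 cxid 0x0 zxid 0x100000001 createSession 30000
CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
"""

ZXID = re.compile(r"\bzxid (0x[0-9a-fA-F]+)")


def crc_error_zxids(dump):
    """Return the zxid of every entry flagged with 'CRC ERROR'."""
    zxids = []
    for line in dump.splitlines():
        if line.startswith("CRC ERROR"):
            match = ZXID.search(line)
            if match:
                zxids.append(match.group(1))
    return zxids


damaged = crc_error_zxids(DUMP)
print(damaged)
```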
Interactive, selective repair:
$ bin/zkTxnLogToolkit.sh -r log.100000001
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
CRC ERROR - 4/5/18 2:16:05 PM CEST session 0x16295bafcc40000 cxid 0x1 zxid 0x100000002 closeSession null
Would you like to fix it (Yes/No/Abort) ? y
EOF reached after 6 txns.
Recovery file log.100000001.fixed has been written with 1 fixed CRC error(s)