Tuning Neo4j Transaction Logs When Running Locally

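Translator's note: This article covers the Neo4j settings that control how much disk space the transaction log files take up, and recommends a log configuration for running Neo4j locally.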

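If you run Neo4j locally, there is one thing to watch out for: the disk space taken up by the transaction logs, especially after creating and deleting a lot of data. Consider the following statements: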

UNWIND range(0, 1000) AS id
CREATE (:Foo {id: id});
MATCH (f:Foo)
DELETE f

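These statements create a batch of nodes and then immediately delete them. After running them a few times, we can check the state of the database with the following command: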

:sysinfo

We see a table like this (image not included):

Most of the space is taken up by the Logical Log, which is Neo4j's transaction log. The log files live in the data/databases/graph.db directory, and their names start with neostore.transaction.db.

$ ls -alh data/databases/graph.db/neostore.transaction.db*
-rw-rw-r-- 1 markhneedham markhneedham 1.3M Dec 22 19:17 data/databases/graph.db/neostore.transaction.db.30
-rw-rw-r-- 1 markhneedham markhneedham 1.3M Dec 22 19:17 data/databases/graph.db/neostore.transaction.db.31
-rw-rw-r-- 1 markhneedham markhneedham 145M Dec 24 21:31 data/databases/graph.db/neostore.transaction.db.32
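
To keep an eye on this over time, it can help to total up the files instead of reading the listing by eye. Below is a minimal sketch that does so. The function name transaction_log_usage is my own, and the directory is the one from the listing above, so adjust it if your installation stores the database elsewhere.

```python
from pathlib import Path


def transaction_log_usage(db_dir):
    """Return (file_count, total_bytes) for neostore.transaction.db.* files in db_dir."""
    files = [p for p in Path(db_dir).glob("neostore.transaction.db.*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    # Path taken from the listing above; this is an assumption about your layout.
    count, total = transaction_log_usage("data/databases/graph.db")
    print(f"{count} transaction log files, {total / 1024 / 1024:.1f} MiB in total")
```

Run it from the Neo4j home directory before and after changing the configuration to compare the totals.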

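The largest file in this listing is over 100MB. So why are these files there? The query below shows the default checkpoint and transaction log settings, which control how many of these files are kept.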

CALL dbms.listConfig("dbms.checkpoint.interval")
YIELD name, description, value
RETURN name, description, value
UNION
CALL dbms.listConfig("dbms.tx_log")
YIELD name, description, value
RETURN name, description, value

Running the query above returns the current values (image not included).

In summary:

Checkpoint every 900 seconds (15 minutes)
Checkpoint every 100,000 transactions
Keep transaction logs for 7 days
Rotate transaction log files at 250MB
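
One way to read the two checkpoint thresholds is "whichever is reached first". The sketch below models only that rule. It is a simplification for illustration, not Neo4j's actual scheduler, and the function and parameter names are made up; the default values are the ones listed above.

```python
def checkpoint_due(seconds_since_last, txs_since_last,
                   interval_seconds=900, interval_tx=100_000):
    """Simplified model: a checkpoint is due once either threshold is reached."""
    return seconds_since_last >= interval_seconds or txs_since_last >= interval_tx


if __name__ == "__main__":
    # A quiet local database: few transactions, so only the time limit matters.
    print(checkpoint_due(seconds_since_last=60, txs_since_last=50))
    print(checkpoint_due(seconds_since_last=900, txs_since_last=50))
```

With the defaults, a lightly used local database rarely reaches the transaction threshold, so under this model checkpoints come roughly every 15 minutes.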

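These defaults are reasonable for a production environment, but when running Neo4j locally there is no need to keep this many logs. Chris Gioran has written an excellent article that explains these settings in detail: https://neo4j.com/developer/kb/checkpointing-and-log-pruning-interactions/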

For running Neo4j locally, I think the following settings may be a better fit:

dbms.checkpoint.interval.time=30s
dbms.checkpoint.interval.tx=1

dbms.tx_log.rotation.retention_policy=false
dbms.tx_log.rotation.size=1M

Now:

Checkpoint every 30 seconds
Checkpoint after every transaction
Keep only the minimum number of log files
Rotate log files at 1MB
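
If you want to confirm that a configuration file really contains these four settings, a small check like the one below may help. It is a sketch under a few assumptions: the helper names are my own, the parser handles only simple key=value lines and # comments, and the usual location of the file (conf/neo4j.conf) is not stated in this article, so verify it for your installation.

```python
RECOMMENDED = {
    "dbms.checkpoint.interval.time": "30s",
    "dbms.checkpoint.interval.tx": "1",
    "dbms.tx_log.rotation.retention_policy": "false",
    "dbms.tx_log.rotation.size": "1M",
}


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()  # a later line overrides an earlier one
    return settings


def settings_to_change(text, wanted=RECOMMENDED):
    """Return {key: (current_value_or_None, wanted_value)} for settings that differ."""
    current = parse_conf(text)
    return {k: (current.get(k), v) for k, v in wanted.items() if current.get(k) != v}


if __name__ == "__main__":
    # Assumed location of the configuration file; adjust as needed.
    with open("conf/neo4j.conf", encoding="utf-8") as f:
        for key, (have, want) in settings_to_change(f.read()).items():
            print(f"{key}: found {have!r}, suggested {want!r}")
```

Settings that are commented out show up as None, which is a reminder that the default value is still in effect.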

If we run the statements from the start of this article again, we see output like the following in logs/debug.log:

2018-12-24 21:43:20.132+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by scheduler for tx count threshold @ txId: 56 checkpoint started...
2018-12-24 21:43:20.589+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by scheduler for tx count threshold @ txId: 56 checkpoint completed in 457ms
2018-12-24 21:43:20.592+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] Pruned log versions 30-31, last checkpoint was made in version 33
2018-12-24 21:43:30.593+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by scheduler for tx count threshold @ txId: 57 checkpoint started...
2018-12-24 21:43:30.716+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by scheduler for tx count threshold @ txId: 57 checkpoint completed in 122ms
2018-12-24 21:43:30.736+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] Pruned log versions 32-32, last checkpoint was made in version 34
2018-12-24 21:43:40.737+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by scheduler for tx count threshold @ txId: 65 checkpoint started...
2018-12-24 21:43:40.982+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by scheduler for tx count threshold @ txId: 65 checkpoint completed in 245ms
2018-12-24 21:43:40.995+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] Pruned log versions 33-40, last checkpoint was made in version 42

We can see that log files are now pruned much more often, and running :sysinfo shows that the Logical Log has shrunk considerably.
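
To get a quick count of how many log files were pruned, the "Pruned log versions" lines can be pulled out of the debug log. The sketch below is one way to do that. The regular expression is based only on the lines shown in this article, so other Neo4j versions may word the message differently, and the names used here are my own.

```python
import re

# Matches lines such as "Pruned log versions 30-31, last checkpoint was made in version 33".
PRUNE_RE = re.compile(
    r"Pruned log versions (\d+)-(\d+), last checkpoint was made in version (\d+)"
)

# The three pruning lines from the output above, shortened to the message part.
SAMPLE = [
    "INFO [o.n.k.i.t.l.p.LogPruningImpl] Pruned log versions 30-31, last checkpoint was made in version 33",
    "INFO [o.n.k.i.t.l.p.LogPruningImpl] Pruned log versions 32-32, last checkpoint was made in version 34",
    "INFO [o.n.k.i.t.l.p.LogPruningImpl] Pruned log versions 33-40, last checkpoint was made in version 42",
]


def pruned_ranges(lines):
    """Yield (first_version, last_version, checkpoint_version) for each pruning line."""
    for line in lines:
        match = PRUNE_RE.search(line)
        if match:
            yield tuple(int(group) for group in match.groups())


def pruned_file_count(lines):
    """Count the log versions removed across all pruning lines (ranges are inclusive)."""
    return sum(last - first + 1 for first, last, _ in pruned_ranges(lines))


if __name__ == "__main__":
    print(pruned_file_count(SAMPLE))  # 2 + 1 + 8 = 11
```

To run it against a real log, pass an open file object to pruned_file_count instead of SAMPLE.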

That's all. I hope you find this useful.

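Translator's note: "checkpoint" is usually translated into Chinese as 检查点, but I find that rendering a bit forced, so I have treated it as a technical term and left it untranslated in this article. My understanding is that a checkpoint is an action: flushing the data changes held in the page cache to the store files on durable storage. As always, comments and discussion are welcome.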
