Neo4j: Database Management

Reading notes on Chapter 5 of Neo4j权威指南 (The Definitive Guide to Neo4j)

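Monitoring

[Screenshot]

Explanation:

Store Size: storage size, obtained by asking the operating system for the size of the Neo4j database store files.

As covered earlier, Neo4j stores nodes, relationships, properties, long strings, and indexes in separate files. As the screenshot shows, the current total is 1.4 GB.
IDAllocation: the number of IDs allocated.
PageCache: the page cache. Why are there no numbers here?
Transactions: transaction statistics.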

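Metric	Meaning
ArrayStore	Array storage size
IndexStore	Index storage size
LabelStore	Label storage size
NodeStore	Node storage size
RelationshipStore	Relationship storage size
PropertyStore	Property storage size
Last Tx ID	ID of the last committed transaction
Current	Number of transactions currently open
Peak	Peak number of concurrent transactions
Opened	Total number of transactions started
Committed	Total number of transactions committed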
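The Store Size metric above amounts to adding up file sizes reported by the operating system. A minimal sketch of that idea follows; the file names and sizes in the demo are made up for illustration and are not real Neo4j store file names.

```python
import os
import tempfile


def directory_size_bytes(path):
    """Sum the sizes of all regular files under path, as reported by the OS."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def demo_store_size():
    """Create a throwaway directory with two fake store files and measure it."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file names and sizes, for illustration only.
        with open(os.path.join(tmp, "nodes.bin"), "wb") as f:
            f.write(b"\0" * 1024)
        with open(os.path.join(tmp, "relationships.bin"), "wb") as f:
            f.write(b"\0" * 2048)
        return directory_size_bytes(tmp)


if __name__ == "__main__":
    print(demo_store_size())
```

Pointing directory_size_bytes at the actual database directory should give a figure comparable to the Store Size total, assuming the directory holds only store files.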

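Query Management

Enable query timeout protection:

dbms.transaction.timeout=10s  # the default, 0, disables the timeout; a timeout passed explicitly through the Java API is not affected

Query procedures (I could not find these in the Community Edition):

dbms.listQueries() lists the queries currently running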

dbms.killQuery(queryId)

dbms.killQueries([queryId, queryId, ....])

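Data Collector

dbms.udc.enabled=false  # turn off usage data collection (default: true)

Security Management

The Community Edition provides no security management features for users, such as roles or permission control.

It provides only: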

dbms.security.listUsers

dbms.security.showCurrentUser

dbms.security.changePassword

dbms.security.createUser

dbms.security.deleteUser

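Operations and Tuning

Memory configuration

./bin/neo4j-admin memrec  # prints recommended memory settings

Operating system memory

OS memory = available memory - (page cache + heap size)

The index and schema directories need enough memory left over for the operating system's file buffer cache. Otherwise the index files cannot be held fully in memory and query performance suffers. The basic calculation is: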

OS memory = 1 GB + (size of graph.db/index) + (size of graph.db/schema)

graph.db: our graph database is currently about 1.5 GB.
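The two formulas above can be sketched as plain arithmetic. The function names and the sample figures (32 GiB of RAM, a 2 GiB page cache, a 16 GiB heap, 200 MiB and 300 MiB directories) are assumptions chosen for illustration, not measurements from this installation.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def os_memory_reserve_bytes(index_bytes, schema_bytes, base_bytes=GIB):
    """OS memory = 1 GB + size of graph.db/index + size of graph.db/schema."""
    return base_bytes + index_bytes + schema_bytes


def remaining_for_os_bytes(available_bytes, page_cache_bytes, heap_bytes):
    """OS memory = available memory - (page cache + heap size)."""
    return available_bytes - (page_cache_bytes + heap_bytes)


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    needed = os_memory_reserve_bytes(200 * MIB, 300 * MIB)
    left = remaining_for_os_bytes(32 * GIB, 2 * GIB, 16 * GIB)
    print("needed for OS:", needed // MIB, "MiB")
    print("left for OS:", left // MIB, "MiB")
    print("enough:", left >= needed)
```

If the amount left over is smaller than the amount needed, the page cache or the heap would have to shrink.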
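Page cache size:

The page cache holds Neo4j data that is stored on disk. Keeping most of the data cached in memory improves query performance.

dbms.memory.pagecache.size: make it large enough to hold all the data, plus 20% headroom.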
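As a rough sketch of the "data size plus 20%" rule, the helper below turns a store size in MiB into a candidate setting line. The function names are hypothetical, and 1536 MiB stands in for the roughly 1.5 GB database mentioned earlier.

```python
def recommended_page_cache_mib(store_mib, headroom_percent=20):
    """Store size plus headroom, rounded up to a whole MiB (integer math only)."""
    return (store_mib * (100 + headroom_percent) + 99) // 100


def page_cache_setting(store_mib):
    """Format the result as a neo4j.conf line using the 'm' size suffix."""
    return "dbms.memory.pagecache.size={}m".format(
        recommended_page_cache_mib(store_mib)
    )


if __name__ == "__main__":
    print(page_cache_setting(1536))
```

The output of neo4j-admin memrec is still the better starting point. This only reproduces the rule of thumb from the notes.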

Heap size

A sufficiently large heap helps sustain concurrent operations. The usual recommendation is between 8 and 16 GB.

dbms.memory.heap.initial_size

dbms.memory.heap.max_size

Set both to the same value (for example 16000) to avoid unnecessary garbage collection.

JVM tuning for the heap:

dbms.jvm.additional: additional JVM parameters
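A small check like the following could confirm that a configuration file really sets both heap values identically. It is only a sketch: the parser handles simple key=value lines and full-line # comments, and the sample text is invented for the demo.

```python
def parse_conf(text):
    """Parse key=value lines into {key: [values]}; repeated keys are kept in order."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        settings.setdefault(key.strip(), []).append(value.strip())
    return settings


def heap_is_fixed(text):
    """True when initial and max heap size are both set and equal."""
    settings = parse_conf(text)
    initial = settings.get("dbms.memory.heap.initial_size")
    maximum = settings.get("dbms.memory.heap.max_size")
    if not initial or not maximum:
        return False
    # If a key is repeated, compare the last occurrence of each.
    return initial[-1] == maximum[-1]


SAMPLE = """
# hypothetical sample configuration
dbms.memory.heap.initial_size=16000
dbms.memory.heap.max_size=16000
dbms.jvm.additional=-XX:+UseG1GC
"""

if __name__ == "__main__":
    print(heap_is_fixed(SAMPLE))
```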

        #**************************************************************
        # JVM Parameters
        #**********************************************************
        
        # G1GC generally strikes a good balance between throughput and tail
        # latency, without too much tuning.
        dbms.jvm.additional=-XX:+UseG1GC

        # Have common exceptions keep producing stack traces, so they can be
        # debugged regardless of how often logs are rotated.
        dbms.jvm.additional=-XX:-OmitStackTraceInFastThrow
        
        # Make sure that `initmemory` is not only allocated, but committed to
        # the process, before starting the database. This reduces memory
        # fragmentation, increasing the effectiveness of transparent huge
        # pages. It also reduces the possibility of seeing performance drop
        # due to heap-growing GC events, where a decrease in available page
        # cache leads to an increase in mean IO response time.
        # Try reducing the heap memory, if this flag degrades performance.
        dbms.jvm.additional=-XX:+AlwaysPreTouch
        
        # Trust that non-static final fields are really final.
        # This allows more optimizations and improves overall performance.
        # NOTE: Disable this if you use embedded mode, or have extensions or dependencies that may use reflection or
        # serialization to change the value of final fields!
        dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions
        dbms.jvm.additional=-XX:+TrustFinalNonStaticFields
        
        # Disable explicit garbage collection, which is occasionally invoked by the JDK itself.
        dbms.jvm.additional=-XX:+DisableExplicitGC
        dbms.jvm.additional=-Xss100M
        
        # Remote JMX monitoring, uncomment and adjust the following lines as needed. Absolute paths to jmx.access and
        # jmx.password files are required.
        # Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
        # the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
        # For more details, see: http://download.oracle.com/javase/8/docs/technotes/guides/management/agent.html
        # On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
        # and have permissions set to 0600.
        # For details on setting these file permissions on Windows see:
        #     http://docs.oracle.com/javase/8/docs/technotes/guides/management/security-windows.html
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.port=3637
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.authenticate=true
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.ssl=false
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.password.file=/absolute/path/to/conf/jmx.password
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.access.file=/absolute/path/to/conf/jmx.access
        
        # Some systems cannot discover host name automatically, and need this line configured:
        #dbms.jvm.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME
        
        # Expand Diffie Hellman (DH) key size from default 1024 to 2048 for DH-RSA cipher suites used in server TLS handshakes.
        # This is to protect the server from any potential passive eavesdropping.
        dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048

        # This mitigates a DDoS vector.
        dbms.jvm.additional=-Djdk.tls.rejectClientInitiatedRenegotiation=true

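(1) Ratio between the old and young generations:

   -XX:NewRatio=N  # old generation / young generation = N, with N usually between 2 and 8. A larger young generation suits workloads that modify or sort large amounts of data, and running many concurrent threads also calls for more young-generation space. A young generation that is too large, however, leaves less room for the old generation and tends to cause frequent full GCs.

(2) Heap size:

   The ultimate goal is to reduce GC time. An oversized heap rarely triggers a collection, but when one does happen it takes a long time.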

(3) Concurrent garbage collection:

   -XX:+UseG1GC
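To make the ratio in item (1) concrete: if old / young = N, the young generation gets 1 / (N + 1) of the heap. The sketch below only does that arithmetic. It is not a statement of how a given collector actually sizes its generations at runtime, and the 16000 MB heap is the example value from earlier in these notes.

```python
def generation_sizes_mb(heap_mb, new_ratio):
    """Split a heap into (young, old) sizes in MB so that old / young is about new_ratio."""
    if new_ratio < 1:
        raise ValueError("new_ratio must be at least 1")
    young = heap_mb // (new_ratio + 1)
    old = heap_mb - young
    return young, old


if __name__ == "__main__":
    for n in (2, 3, 8):
        young, old = generation_sizes_mb(16000, n)
        print("NewRatio={}: young={} MB, old={} MB".format(n, young, old))
```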

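System tuning:

Neo4j performs a large number of read and write operations. Linux defaults to the CFQ (Completely Fair Queuing) I/O scheduler. The deadline scheduler suits the I/O workload of a database better: it prioritizes reads, reducing the time reads wait at the cost of longer waits for writes.

Change the scheduler for drive sda: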

echo 'deadline' > /sys/block/sda/queue/scheduler

cat  /sys/block/sda/queue/scheduler
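The cat command prints every available scheduler on one line, with the active one in square brackets. A small sketch of reading that format follows. The sample strings are illustrative, since the list of schedulers differs between kernels.

```python
def active_scheduler(contents):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    # Illustrative contents of /sys/block/sda/queue/scheduler.
    print(active_scheduler("noop [deadline] cfq\n"))
```

On a real machine the string would come from reading /sys/block/sda/queue/scheduler, where sda is the drive used in the commands above.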

Disk and memory:

Fast SSD > SSD > HDD

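The database needs warming up after startup so that the node and relationship stores (and, with the argument true, property records) are loaded into the page cache: CALL apoc.warmup.run(true);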

Monitor the system with tools such as dstat or vmstat.

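Backup and restore:

The Enterprise Edition ships a backup tool, neo4j-backup, which can back up to a remote or local location and supports both full and incremental backups.

dbms.backup.enabled=true # true is already the default
dbms.backup.address=<hostname/IP>:6362

With the Community Edition you have to write your own backup script. Copying the data files directly also works, but the official recommendation is the safer dump and load tool (the server must be stopped first).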

neo4j-admin dump --database=graph.db --to=${file}  # back up to a compressed archive

neo4j-admin load --from=${file_catalog}/${file} --database=graph.db --force  # restore the data
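Since the Community Edition leaves the backup script to you, one possible starting point is a helper that assembles the two commands above with a timestamped file name. This is a sketch under assumptions: the backup directory, the naming scheme, and the function names are all hypothetical, and it only builds the argument lists without running anything.

```python
import datetime


def dump_command(backup_dir, database="graph.db", now=None):
    """Build the neo4j-admin dump command with a timestamped target file."""
    now = now or datetime.datetime.now()
    # Hypothetical naming scheme: <database>-<YYYYmmdd-HHMMSS>.dump
    file_name = "{}-{}.dump".format(database, now.strftime("%Y%m%d-%H%M%S"))
    target = backup_dir.rstrip("/") + "/" + file_name
    return ["neo4j-admin", "dump", "--database=" + database, "--to=" + target]


def load_command(dump_file, database="graph.db"):
    """Build the matching neo4j-admin load command (overwrites the database)."""
    return [
        "neo4j-admin",
        "load",
        "--from=" + dump_file,
        "--database=" + database,
        "--force",
    ]


if __name__ == "__main__":
    print(" ".join(dump_command("/backups")))
```

The returned list can be handed to subprocess.run once the server has been stopped, for example from a cron job.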