JanusGraph 0.5 Dynamic Graph Management Configuration

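I recently started looking into the JanusGraph graph database, with the goal of building a knowledge platform for programming languages on top of an open-source graph database. I had previously studied JanusGraph's graph-computing code and was fairly disappointed: at the moment JanusGraph only supports the few graph-computing algorithms built into TinkerPop, namely PageRank, shortest path, and connected components (see the TinkerPop graph computing documentation). Computing performance also still needs optimization.
Overall, JanusGraph is fairly hard to get started with. This is partly due to TinkerPop's complex server-side configuration files, and the startup procedure is also unusual: the server is actually launched through the Gremlin startup script, and JanusGraph is embedded into Gremlin Server as a plugin via configuration files. The official site describes JanusGraph as essentially a set of jars with no main method. That is why the process shown by the operating system is actually org.apache.tinkerpop.gremlin.server.GremlinServer.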


Enough preamble. Let's move on to dynamic graph management in JanusGraph.
In JanusGraph, you can create one or more graphs by editing the graphs option in the server.yaml file.
[Image: the graphs option in the server YAML file]
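Here I defined two graphs, graph0 and graph1, each with its own graph configuration file. These settings determine the graph's storage backend and index backend. A suggested file naming pattern is janusgraph-{storage}-{index}.properties.
The question is: how do you access these two graphs?
You need to bind a traversal source (GraphTraversalSource) for each graph to a variable in the Groovy init script, by editing scripts/empty-sample.groovy: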

// Copyright 2019 JanusGraph Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// an init script that returns a Map allows explicit setting of global bindings.
def globals = [:]

// defines a sample LifeCycleHook that prints some output to the Gremlin Server console.
// note that the name of the key in the "global" map is unimportant.
globals << [hook : [
      onStartUp: { ctx ->
          ctx.logger.info("Executed once at startup of Gremlin Server.")
      },
      onShutDown: { ctx ->
          ctx.logger.info("Executed once at shutdown of Gremlin Server.")
      }
] as LifeCycleHook]

// define the TraversalSources to bind queries to: "g" for graph0 and "g1" for graph1.
globals << [g : graph0.traversal(), g1 : graph1.traversal()]

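globals is effectively a map of globally accessible key-value bindings. Because Gremlin executes statements internally with a Groovy script engine, declaring the two variables g and g1 up front ensures that the console can reach them.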
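As a rough mental model of global bindings, the sketch below mimics name resolution against a map in plain Java. It is not Gremlin Server code: the class GlobalBindingsSketch, its bind and resolve methods, and the string placeholders standing in for traversal sources are all hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: models global bindings as a name -> object map.
// The String values stand in for real GraphTraversalSource objects.
class GlobalBindingsSketch {
    private final Map<String, Object> globals = new LinkedHashMap<>();

    // Mirrors the idea of `globals << [name : value]` in the init script.
    void bind(String name, Object value) {
        globals.put(name, value);
    }

    // A statement that mentions `name` can only run if the name was bound.
    Object resolve(String name) {
        if (!globals.containsKey(name)) {
            throw new IllegalStateException("No such binding: " + name);
        }
        return globals.get(name);
    }

    public static void main(String[] args) {
        GlobalBindingsSketch bindings = new GlobalBindingsSketch();
        bindings.bind("g", "traversal source of graph0");
        bindings.bind("g1", "traversal source of graph1");
        System.out.println(bindings.resolve("g1"));
        try {
            bindings.resolve("g2"); // never bound, so the lookup fails
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only that a name must be registered before any statement can refer to it, which is why g and g1 are declared in the init script.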

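Dynamic graph management
With the approach above, graph names and configurations are fixed before the database starts, so graphs cannot be added while it is running, for example when a new graph is needed. ConfiguredGraphFactory makes up for this limitation of static graph configuration. The official documentation describes it as follows: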

The JanusGraph Server can be configured to use the `ConfiguredGraphFactory`. The `ConfiguredGraphFactory` is an access point to your graphs, similar to the `JanusGraphFactory`. These graph factories provide methods for dynamically managing the graphs hosted on the server.

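ConfiguredGraphFactory provides methods for dynamically managing the list of graphs on the server. Judging from the source code, its static methods cover creating, opening, closing, and dropping graphs, as well as creating, updating, and removing configuration templates.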
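To make that list of operations concrete, here is a toy in-memory model of the bookkeeping such a factory might do. It is a sketch under my own assumptions, not JanusGraph's implementation: the class DynamicGraphRegistrySketch is hypothetical, graphs are represented by placeholder strings, and the error behavior is chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: a hypothetical class, not the JanusGraph implementation.
class DynamicGraphRegistrySketch {
    // graph name -> stored configuration
    private final Map<String, Map<String, String>> configurations = new TreeMap<>();
    // graph name -> "open graph" handle (a String placeholder here)
    private final Map<String, String> openGraphs = new TreeMap<>();

    // Stores a copy of the configuration under the graph name.
    void createConfiguration(String graphName, Map<String, String> config) {
        if (configurations.containsKey(graphName)) {
            throw new IllegalStateException("Configuration already exists: " + graphName);
        }
        configurations.put(graphName, new HashMap<>(config));
    }

    // Opening requires a stored configuration; the handle is cached.
    String open(String graphName) {
        if (!configurations.containsKey(graphName)) {
            throw new IllegalStateException("No configuration for: " + graphName);
        }
        return openGraphs.computeIfAbsent(graphName, n -> "graph handle for " + n);
    }

    // Closing forgets the handle but keeps the configuration.
    void close(String graphName) {
        openGraphs.remove(graphName);
    }

    // Dropping removes both the handle and the configuration.
    void drop(String graphName) {
        openGraphs.remove(graphName);
        configurations.remove(graphName);
    }

    Set<String> getGraphNames() {
        return configurations.keySet();
    }

    public static void main(String[] args) {
        DynamicGraphRegistrySketch registry = new DynamicGraphRegistrySketch();
        Map<String, String> config = new HashMap<>();
        config.put("storage.backend", "berkeleyje");
        registry.createConfiguration("graph0", config);
        System.out.println(registry.open("graph0"));
        registry.drop("graph0");
        System.out.println(registry.getGraphNames());
    }
}
```

The design choice worth noticing is that configurations and open graphs are tracked separately, so a graph can be closed and later reopened by name without supplying its settings again.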

Steps to enable it:

 Configuration management
  • Edit the graph properties file conf/janusgraph-configurationmanagement.properties
gremlin.graph=org.janusgraph.core.ConfiguredGraphFactory
graph.graphname=ConfigurationManagementGraph
# Use BerkeleyDB as the storage backend
storage.backend=berkeleyje
storage.directory=./db/berkeley

# Use Lucene as the index backend
index.search.backend=lucene
index.search.directory=./db/searchindex
  • Edit the server startup configuration file conf/gremlin-server/gremlin-server-configurationmanagement.yaml to set graphManager and graphs. The graphManager maintains the graphname->Graph and graphname->TraversalSource mappings.
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphManager: org.janusgraph.graphdb.management.JanusGraphManager
graphs: {
  ConfigurationManagementGraph: conf/janusgraph-configurationmanagement.properties
}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-configurationmanagement.groovy]}}}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  # Older serialization versions for backwards compatibility:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
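  • Edit scripts/empty-sample.groovy and comment out the global variable g, since graph0 and graph1 are no longer declared under graphs at startup. Note that the YAML above loads scripts/empty-configurationmanagement.groovy, so apply the change to whichever script your YAML actually references.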
//globals << [g : graph0.traversal(),g1 : graph1.traversal()]
  • Start the JanusGraph core process (GremlinServer)
# Windows 10
.\bin\gremlin-server.bat .\conf\gremlin-server\gremlin-server-configurationmanagement.yaml
  • Start the JanusGraph client (Gremlin Console)
# Windows 10
.\gremlin.bat
gremlin> :remote connect tinkerpop.server conf/remote.yaml session
gremlin> :remote console
  • Try to access the ConfiguredGraphFactory graph manager:
gremlin> ConfiguredGraphFactory
==>class org.janusgraph.core.ConfiguredGraphFactory

Note: if you get an error like the one below, check that gremlin.graph and graph.graphname are set correctly in the properties file.

No such property: ConfiguredGraphFactory for class: Script7
Type ':help' or ':h' for help.
Display stack trace? [yN]
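The check suggested in the note can be automated. The following is a minimal sketch in plain Java that parses properties text and reports whether the two keys match the values used in this article's properties file. The class name ConfigManagementPropertiesCheck and its findProblems method are hypothetical helpers, not part of JanusGraph.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: verifies the two keys mentioned in the note above.
class ConfigManagementPropertiesCheck {
    static final String EXPECTED_FACTORY = "org.janusgraph.core.ConfiguredGraphFactory";
    static final String EXPECTED_GRAPH_NAME = "ConfigurationManagementGraph";

    // Returns human-readable problems; an empty list means both keys look right.
    static List<String> findProblems(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> problems = new ArrayList<>();
        check(props, "gremlin.graph", EXPECTED_FACTORY, problems);
        check(props, "graph.graphname", EXPECTED_GRAPH_NAME, problems);
        return problems;
    }

    private static void check(Properties props, String key, String expected,
                              List<String> problems) {
        String actual = props.getProperty(key);
        if (actual == null) {
            problems.add(key + " is missing");
        } else if (!expected.equals(actual.trim())) {
            problems.add(key + " is '" + actual.trim() + "', expected '" + expected + "'");
        }
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "gremlin.graph=org.janusgraph.core.ConfiguredGraphFactory",
            "graph.graphname=ConfigurationManagmentGraph", // deliberate typo
            "storage.backend=berkeleyje");
        findProblems(text).forEach(System.out::println);
    }
}
```

To check a real file, read its contents into a string first and pass that to findProblems.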

Using the ConfiguredGraphFactory management API

ConfiguredGraphFactory

Create graph configurations

map=new HashMap();
map.put('storage.backend','berkeleyje');
map.put('storage.directory','../db/berkeley');
map.put('index.search.backend','lucene');
map.put('index.search.directory','../db/lucene');
map.put('graph.graphname','graph0');
config = new MapConfiguration(map);
// Create the configuration for graph0
ConfiguredGraphFactory.createConfiguration(config);

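// Create the configuration for graph1
// (this reuses the same storage and index directories as graph0; consider separate directories if the two graphs should not share them)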
map.put('graph.graphname','graph1');
config = new MapConfiguration(map);

ConfiguredGraphFactory.createConfiguration(config);
// List all graph names
ConfiguredGraphFactory.getGraphNames();
==>graph1
==>graph0
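When several graphs are created from application code, it can be convenient to derive each configuration map from shared settings instead of mutating one map in place. The sketch below shows one way to do that in plain Java. The class PerGraphConfigSketch and the per-graph directory layout are my assumptions, not something the JanusGraph documentation prescribes. Each resulting map would then be wrapped in a MapConfiguration and passed to ConfiguredGraphFactory.createConfiguration, as in the console session above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds one independent settings map per graph.
class PerGraphConfigSketch {
    // Settings shared by every graph, taken from the article's example.
    static Map<String, String> baseSettings() {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("storage.backend", "berkeleyje");
        base.put("index.search.backend", "lucene");
        return base;
    }

    // Returns a fresh map for the given graph.
    // The "../db/<graphName>/..." directory layout is an assumption.
    static Map<String, String> configFor(String graphName) {
        Map<String, String> config = new LinkedHashMap<>(baseSettings());
        config.put("storage.directory", "../db/" + graphName + "/berkeley");
        config.put("index.search.directory", "../db/" + graphName + "/lucene");
        config.put("graph.graphname", graphName);
        return config;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"graph0", "graph1"}) {
            System.out.println(configFor(name));
        }
    }
}
```

Because every call returns a new map, changing the settings for one graph cannot leak into another graph's configuration.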
