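Gremlin is the graph traversal language of the Apache TinkerPop framework. It supports both OLTP and OLAP and is currently the mainstream query language for graph databases, much as SQL is for relational databases.
HugeGraph is an open-source graph database developed in China that fully supports Gremlin. This article describes how to set up a graphical environment for running Gremlin on top of HugeGraph.
HugeGraph has many sub-projects on GitHub; we only need two of them here: hugegraph and hugegraph-studio.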
Go to the hugegraph project and clone the repository.
Open a terminal and run:
$ git clone git@github.com:hugegraph/hugegraph.git
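When the clone finishes, a hugegraph subdirectory appears in the current directory. It contains only source code, so we need to compile and package it to produce a runnable package.
Enter the hugegraph directory and run: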
$ git checkout release-0.7
$ mvn package -DskipTests
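Note: be sure to switch branches first. The hugegraph master branch has already moved to 0.8.0, but studio does not seem to have been upgraded yet, so to avoid pitfalls we stick with the released version.
After a long stream of console output, seeing BUILD SUCCESS at the end means packaging succeeded.
The build produces a directory hugegraph-0.7.4 and an archive hugegraph-0.7.4.tar.gz; this is the runnable package we are about to use.
I am mildly obsessive and dislike keeping source code and binary packages together because they are easy to confuse, so I move hugegraph-0.7.4 up one level and then delete the source directory, which leaves the parent directory tidy again.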
$ mv hugegraph-0.7.4 ../hugegraph-0.7.4
$ cd ..
$ rm -rf hugegraph
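The installation package is now ready. This route does require the jdk, git and maven command-line tools to be installed locally. If you do not have them, that is fine: you can download the official hugegraph release package directly instead.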
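Before deciding between the two routes, it can help to check which tools are actually on your PATH. The following is only a minimal sketch: missing_tools is a hypothetical helper, and it assumes the three tools are installed under their usual binary names java, git and mvn.

```shell
# List which of the given command-line tools are missing from PATH.
# Prints one missing tool name per line; prints nothing if all are present.
# (missing_tools is a hypothetical helper, not part of hugegraph.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: the JDK, git and Maven expose the binaries java, git and mvn.
missing=$(missing_tools java git mvn)
if [ -n "$missing" ]; then
    echo "Missing tools, consider downloading the release package instead:"
    echo "$missing"
else
    echo "All build tools found; building from source should be possible."
fi
```

If anything is reported missing, the release download described next avoids the build altogether.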
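To get the release package, click the releases item in the navigation above the code on GitHub. hugegraph currently has two releases; click hugegraph-0.7.4.tar.gz to start the download.
Once the download finishes, just extract it: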
$ tar -zxvf hugegraph-0.7.4.tar.gz
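After extraction you will see a hugegraph-0.7.4 directory, identical to the one produced by building from source.
Next, how to configure the parameters.
Although this part is called parameter configuration, hugegraph's default configuration already works as-is in most environments. A few important options are still worth explaining.
Enter the hugegraph-0.7.4 directory and edit the URL (host + port) on which HugeGraphServer serves requests: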
$ vim conf/rest-server.properties
# bind url
restserver.url=http://127.0.0.1:8080
# gremlin url to connect
gremlinserver.url=http://127.0.0.1:8182
# graphs list with pair NAME:CONF_PATH
graphs=[hugegraph:conf/hugegraph.properties]
# authentication
#auth.require_authentication=
#auth.admin_token=
#auth.user_tokens=[]
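restserver.url is the address at which HugeGraphServer exposes its RESTful API. With a host of 127.0.0.1 it is reachable only from the local machine; change the host and port parts as needed. Since I plan to start studio locally as well and nothing else is using port 8080, I leave it unchanged.
graphs is a list of key-value pairs mapping the names of the graphs available for connection to their configuration files. hugegraph:conf/hugegraph.properties means that a graph instance named hugegraph is reachable through HugeGraphServer and that its configuration file is conf/hugegraph.properties. We can leave the graph's configuration file alone and just rename the graph if needed. Again, I leave it unchanged.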
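To double-check what the server is configured with without opening an editor, the two options can be read straight from the file. This is a rough sketch, assuming simple key=value lines like the ones shown above; get_prop is a hypothetical helper, not something hugegraph ships.

```shell
# Print the value of a key from a Java-style .properties file.
# Assumes plain "key=value" lines; commented-out lines ("#key=...") are
# ignored because their first field does not equal the key.
# (get_prop is a hypothetical helper, not part of hugegraph.)
get_prop() {
    key=$1
    file=$2
    awk -F '=' -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$file"
}

# Only runs when executed from inside the hugegraph-0.7.4 directory.
conf=conf/rest-server.properties
if [ -f "$conf" ]; then
    echo "REST URL: $(get_prop restserver.url "$conf")"
    echo "Graphs:   $(get_prop graphs "$conf")"
fi
```

The value is everything after the first "=", so the URL and the bracketed graphs list come back unchanged.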
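Before the service is started, hugegraph requires the backend to be initialized manually. Do not be put off by the word "manually": it is just a matter of running one command.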
$ bin/init-store.sh
Initing HugeGraph Store...
2018-09-07 16:02:12 1082 [main] [INFO ] com.baidu.hugegraph.cmd.InitStore [] - Init graph with config file: conf/hugegraph.properties
2018-09-07 16:02:12 1201 [main] [INFO ] com.baidu.hugegraph.HugeGraph [] - Opening backend store 'rocksdb' for graph 'hugegraph'
2018-09-07 16:02:12 1258 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Opening RocksDB with data path: rocksdb-data/schema
2018-09-07 16:02:12 1417 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Failed to open RocksDB 'rocksdb-data/schema' with database 'hugegraph', try to init CF later
2018-09-07 16:02:12 1445 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Opening RocksDB with data path: rocksdb-data/system
2018-09-07 16:02:12 1450 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Failed to open RocksDB 'rocksdb-data/system' with database 'hugegraph', try to init CF later
2018-09-07 16:02:12 1456 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Opening RocksDB with data path: rocksdb-data/graph
2018-09-07 16:02:12 1461 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Failed to open RocksDB 'rocksdb-data/graph' with database 'hugegraph', try to init CF later
2018-09-07 16:02:12 1491 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Store initialized: schema
2018-09-07 16:02:12 1511 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Store initialized: system
2018-09-07 16:02:12 1543 [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Store initialized: graph
2018-09-07 16:02:13 1804 [pool-3-thread-1] [INFO ] com.baidu.hugegraph.backend.Transaction [] - Clear cache on event 'store.init'
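Here we can see that hugegraph initialized the rocksdb backend. Why rocksdb and not something else? Because that is what is configured in conf/hugegraph.properties, the file mentioned in the previous step.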
$ vim conf/hugegraph.properties
# gremlin entrence to create graph
gremlin.graph=com.baidu.hugegraph.HugeFactory
# cache config
#schema.cache_capacity=1048576
#graph.cache_capacity=10485760
#graph.cache_expire=600
# schema illegal name template
#schema.illegal_name_regex=\s+|~.*
#vertex.default_label=vertex
backend=rocksdb
serializer=binary
store=hugegraph
# rocksdb backend config
#rocksdb.data_path=/path/to/disk
#rocksdb.wal_path=/path/to/disk
...
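Here backend=rocksdb is the option that sets the backend to rocksdb.
The other backends are memory, cassandra, scylladb, hbase, mysql and palo. We do not need to worry about them here; the default rocksdb is fine.
Once initialization completes, a rocksdb-data directory appears in the current directory. This is where the backend data is stored, so never delete or move it casually.
Note: backend initialization only needs to be run once, before the first start of the service; do not run it on every start. Even if you do, no harm is done, because hugegraph detects that initialization has already happened and skips it.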
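If you script the startup, one simple way to follow the run-once advice is to look for the data directory first. A minimal sketch, assuming the default rocksdb backend writes to rocksdb-data under the current directory as described above; needs_init is a hypothetical helper, and the directory check is only a heuristic, not how hugegraph itself detects an initialized store.

```shell
# Succeed (exit status 0) when the backend still looks uninitialized,
# judged only by whether the data directory exists.
# (needs_init is a hypothetical helper; rocksdb-data is the default
# location seen above and may differ if rocksdb.data_path is set.)
needs_init() {
    data_dir=${1:-rocksdb-data}
    [ ! -d "$data_dir" ]
}

if needs_init rocksdb-data; then
    echo "rocksdb-data not found: run bin/init-store.sh before the first start"
else
    echo "rocksdb-data already exists: skipping init"
fi
```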
At last it is time to start the service, again with a single command:
$ bin/start-hugegraph.sh
Starting HugeGraphServer...
Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK
Seeing OK above means the server started successfully. We can check the process with jps.
$ jps
...
4101 HugeGraphServer
4233 Jps
...
If you still want reassurance, send an HTTP request and see:
$ curl http://127.0.0.1:8080/graphs
{"graphs":["hugegraph"]}
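The same check can be automated by testing whether the response mentions the expected graph name. This is only a sketch: has_graph is a hypothetical helper, and it does a plain text match rather than parsing the JSON properly.

```shell
# Succeed if the JSON text returned by /graphs contains the given graph
# name as a quoted string. This is a text match, not a real JSON parse,
# so it would also match the key "graphs" itself.
# (has_graph is a hypothetical helper, not part of hugegraph.)
has_graph() {
    name=$1
    json=$2
    printf '%s' "$json" | grep -q "\"${name}\""
}

# Sample response copied from the curl output above.
response='{"graphs":["hugegraph"]}'
if has_graph hugegraph "$response"; then
    echo "graph 'hugegraph' is being served"
fi
```

Against a running server, the second argument could be "$(curl -s http://127.0.0.1:8080/graphs)" instead of the sample string.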
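That completes the deployment of HugeGraphServer. Next we deploy HugeGraphStudio.
The steps are largely the same as for HugeGraphServer, so I will be less verbose this time.
Remember to return to the top-level directory first so the directories do not end up nested.
Clone the repository: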
$ git clone git@github.com:hugegraph/hugegraph-studio.git
Cloning into 'hugegraph-studio'...
mux_client_request_session: read from master failed: Broken pipe
remote: Counting objects: 326, done.
remote: Compressing objects: 100% (189/189), done.
remote: Total 326 (delta 115), reused 324 (delta 113), pack-reused 0
Receiving objects: 100% (326/326), 1.60 MiB | 350.00 KiB/s, done.
Resolving deltas: 100% (115/115), done.
Compile and package:
studio is a project that includes a front end implemented with react.js, so building it yourself requires tools such as npm and webpack.
$ cd hugegraph-studio
$ mvn package -DskipTests
Packaging studio takes a little longer.
...
[INFO] Reactor Summary:
[INFO]
[INFO] hugegraph-studio ................................... SUCCESS [ 0.003 s]
[INFO] studio-api ......................................... SUCCESS [ 4.683 s]
[INFO] studio-dist ........................................ SUCCESS [01:42 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:47 min
[INFO] Finished at: 2018-09-07T16:32:44+08:00
[INFO] Final Memory: 34M/390M
[INFO] ------------------------------------------------------------------------
Move the packaged directory up one level and delete the source directory (purely a personal preference).
$ mv hugegraph-studio-0.7.0 ../
$ cd ..
$ rm -rf hugegraph-studio
At this point my top-level directory contains only the two installation packages:
$ ls
hugegraph-0.7.4 hugegraph-studio-0.7.0
Enter the hugegraph-studio-0.7.0 directory and edit its only configuration file.
$ cd hugegraph-studio-0.7.0
$ vim conf/hugegraph-studio.properties
studio.server.port=8088
studio.server.host=localhost
graph.server.host=localhost
graph.server.port=8080
graph.name=hugegraph
# the directory name released by react
studio.server.ui=ui
# the file location of studio-api.war
studio.server.api.war=war/studio-api.war
# default folder in your home directory, set to a non-empty value to override
data.base_directory=~/.hugegraph-studio
show.limit.data=250
show.limit.edge.total=1000
show.limit.edge.increment=20
# separator ','
gremlin.limit_suffix=[.V(),.E(),.hasLabel(STR),.hasLabel(NUM),.path()]
The parameters that may need changing are graph.server.host=localhost, graph.server.port=8080 and graph.name=hugegraph. They correspond to options in HugeGraphServer's configuration file conf/rest-server.properties: graph.server.host=localhost corresponds to the host in restserver.url=http://127.0.0.1:8080; graph.server.port=8080 corresponds to the port in restserver.url=http://127.0.0.1:8080; graph.name=hugegraph corresponds to the graph name in graphs=[hugegraph:conf/hugegraph.properties]. Since I did not modify HugeGraphServer's conf/rest-server.properties earlier, there is no need to modify HugeGraphStudio's conf/hugegraph-studio.properties here either.
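If you did change restserver.url, the matching studio values can be derived from it rather than typed by hand. A small sketch using shell parameter expansion; url_host and url_port are hypothetical helpers, and they assume the URL has the form scheme://host:port with the port always present.

```shell
# Print the host part of a URL of the form scheme://host:port[/path].
# (url_host and url_port are hypothetical helpers, not part of hugegraph.)
url_host() {
    rest=${1#*://}      # drop the scheme, e.g. "http://"
    rest=${rest%%/*}    # drop any path after host:port
    printf '%s\n' "${rest%%:*}"
}

# Print the port part of a URL of the form scheme://host:port[/path].
# Assumes a port is present; without one this would print the host.
url_port() {
    rest=${1#*://}
    rest=${rest%%/*}
    printf '%s\n' "${rest##*:}"
}

# Value of restserver.url from conf/rest-server.properties above.
url='http://127.0.0.1:8080'
echo "graph.server.host=$(url_host "$url")"
echo "graph.server.port=$(url_port "$url")"
```

The two printed lines are in the form used by conf/hugegraph-studio.properties and can be compared against it.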
$ bin/hugegraph-studio.sh
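studio does not run in the background by default, so a long stream of logs appears on the console. The following lines at the very bottom indicate a successful start: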
信息: Starting ProtocolHandler [http-nio-127.0.0.1-8088]
16:56:24.507 [main] INFO com.baidu.hugegraph.studio.HugeGraphStudio ID: TS: - HugeGraphStudio is now running on: http://localhost:8088
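Then, as the log suggests, enter http://localhost:8088 in the browser to open the studio interface:
The box under Gremlin in the screenshot is where we enter gremlin statements to operate on hugegraph. An example follows.
The following is based on the CSDN blog post 通过Gremlin语言构建关系图并进行图分析 (building a relationship graph with Gremlin and analyzing it).
Enter the following code in the input box to create a "TinkerPop relationship graph":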
// PropertyKey
graph.schema().propertyKey("name").asText().ifNotExist().create()
graph.schema().propertyKey("age").asInt().ifNotExist().create()
graph.schema().propertyKey("addr").asText().ifNotExist().create()
graph.schema().propertyKey("lang").asText().ifNotExist().create()
graph.schema().propertyKey("tag").asText().ifNotExist().create()
graph.schema().propertyKey("weight").asFloat().ifNotExist().create()
// VertexLabel
graph.schema().vertexLabel("person").properties("name", "age", "addr", "weight").useCustomizeStringId().ifNotExist().create()
graph.schema().vertexLabel("software").properties("name", "lang", "tag", "weight").primaryKeys("name").ifNotExist().create()
graph.schema().vertexLabel("language").properties("name", "lang", "weight").primaryKeys("name").ifNotExist().create()
// EdgeLabel
graph.schema().edgeLabel("knows").sourceLabel("person").targetLabel("person").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("created").sourceLabel("person").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("contains").sourceLabel("software").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("define").sourceLabel("software").targetLabel("language").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("implements").sourceLabel("software").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("supports").sourceLabel("software").targetLabel("language").properties("weight").ifNotExist().create()
// TinkerPop
okram = graph.addVertex(T.label, "person", T.id, "okram", "name", "Marko A. Rodriguez", "age", 29, "addr", "Santa Fe, New Mexico", "weight", 1)
spmallette = graph.addVertex(T.label, "person", T.id, "spmallette", "name", "Stephen Mallette", "age", 0, "addr", "", "weight", 1)
tinkerpop = graph.addVertex(T.label, "software", "name", "TinkerPop", "lang", "java", "tag", "Graph computing framework", "weight", 1)
tinkergraph = graph.addVertex(T.label, "software", "name", "TinkerGraph", "lang", "java", "tag", "In-memory property graph", "weight", 1)
gremlin = graph.addVertex(T.label, "language", "name", "Gremlin", "lang", "groovy/python/javascript", "weight", 1)
okram.addEdge("created", tinkerpop, "weight", 1)
spmallette.addEdge("created", tinkerpop, "weight", 1)
okram.addEdge("knows", spmallette, "weight", 1)
tinkerpop.addEdge("define", gremlin, "weight", 1)
tinkerpop.addEdge("contains", tinkergraph, "weight", 1)
tinkergraph.addEdge("supports", gremlin, "weight", 1)
// Titan
dalaro = graph.addVertex(T.label, "person", T.id, "dalaro", "name", "Dan LaRocque ", "age", 0, "addr", "", "weight", 1)
mbroecheler = graph.addVertex(T.label, "person", T.id, "mbroecheler", "name", "Matthias Broecheler", "age", 29, "addr", "San Francisco", "weight", 1)
titan = graph.addVertex(T.label, "software", "name", "Titan", "lang", "java", "tag", "Graph Database", "weight", 1)
dalaro.addEdge("created", titan, "weight", 1)
mbroecheler.addEdge("created", titan, "weight", 1)
okram.addEdge("created", titan, "weight", 1)
dalaro.addEdge("knows", mbroecheler, "weight", 1)
titan.addEdge("implements", tinkerpop, "weight", 1)
titan.addEdge("supports", gremlin, "weight", 1)
// HugeGraph
javeme = graph.addVertex(T.label, "person", T.id, "javeme", "name", "Jermy Li", "age", 29, "addr", "Beijing", "weight", 1)
zhoney = graph.addVertex(T.label, "person", T.id, "zhoney", "name", "Zhoney Zhang", "age", 29, "addr", "Beijing", "weight", 1)
linary = graph.addVertex(T.label, "person", T.id, "linary", "name", "Linary Li", "age", 28, "addr", "Wuhan. Hubei", "weight", 1)
hugegraph = graph.addVertex(T.label, "software", "name", "HugeGraph", "lang", "java", "tag", "Graph Database", "weight", 1)
javeme.addEdge("created", hugegraph, "weight", 1)
zhoney.addEdge("created", hugegraph, "weight", 1)
linary.addEdge("created", hugegraph, "weight", 1)
javeme.addEdge("knows", zhoney, "weight", 1)
javeme.addEdge("knows", linary, "weight", 1)
hugegraph.addEdge("implements", tinkerpop, "weight", 1)
hugegraph.addEdge("supports", gremlin, "weight", 1)
Click the triangle button in the upper-right corner, and the graph is created.
Then enter the following in the input box:
g.V()
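This brings up all the vertices and edges of the graph created above.
The graphical environment for running Gremlin is now set up, and from here you can try all sorts of cool gremlin queries.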