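About this chapter:
It covers the principles of graph database model design, Neo4j's query language Cypher, and points to watch when building a database. This material matters for anyone getting started with graph databases, and it lays a solid foundation for using them later.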
====================================================================================
1、Modeling is an abstracting activity motivated by a particular need or goal. We model in order to bring specific facets of an unruly domain into a space where they can be structured and manipulated.
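This sentence explains what modeling means. Modeling is an act of abstraction: it takes specific facets of an unruly domain and brings them into a space where they can be structured and manipulated. The resulting model then serves as a workable representation of the original domain.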
2、Graph representations are no different in this respect. What perhaps differentiates them from many other data modeling techniques, however, is the close affinity between the logical and physical graph models.
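In this respect, graph representations are no different from other modeling techniques. What sets them apart is the close affinity between the logical and the physical graph model, whereas in many other data modeling techniques the logical model and the physical model diverge considerably.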
3、 In other words, to make relational stores perform well enough for regular application needs, we have to abandon any vestiges of true domain affinity and accept that we have to change the user’s data model to suit the database engine not the user. This technique is given the name denormalization.
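To serve application needs well enough, relational databases give up some design principles and change the data model to suit the database engine rather than the user. This technique is called denormalization. Put simply, redundant columns (holding data copied from other tables) are added to existing tables, which speeds up queries because the tables no longer have to be joined.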
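The trade-off described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the tables customers, orders and orders_denorm and all their rows are made up for this note and do not come from the book.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    -- Denormalized variant: customer_name is copied into every order row.
    CREATE TABLE orders_denorm (id INTEGER PRIMARY KEY, customer_id INTEGER,
                                customer_name TEXT, total REAL);
    INSERT INTO customers VALUES (1, 'Alice');
    INSERT INTO orders VALUES (10, 1, 25.0);
    INSERT INTO orders_denorm VALUES (10, 1, 'Alice', 25.0);
""")

# Normalized read: a join is needed to get the customer name.
normalized = conn.execute(
    "SELECT o.id, c.name, o.total FROM orders o "
    "JOIN customers c ON c.id = o.customer_id"
).fetchall()

# Denormalized read: one table, no join.
denormalized = conn.execute(
    "SELECT id, customer_name, total FROM orders_denorm"
).fetchall()

# The cost: renaming the customer in one place leaves the copy stale.
conn.execute("UPDATE customers SET name = 'Alicia' WHERE id = 1")
stale = conn.execute(
    "SELECT customer_name FROM orders_denorm WHERE id = 10"
).fetchone()[0]
```

Both reads return the same row, but after the rename the redundant column still holds the old name, which is the maintenance burden that comes with the faster read.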
4、The technical mechanism through which we evolve a database is known as migration.
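The technical mechanism used to evolve a database is called migration. The point of this sentence is simply to understand what "migration" means.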
5、The problem with the denormalized model is its resistance to the kind of rapid evolution the business demands of its systems.
The problem with the denormalized model is that it resists the rapid change that business requirements demand.
6、Relational databases—with their rigid schemas and complex modeling characteristics—are not an especially good tool for supporting rapid change.
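Because of their rigid schemas and complex modeling characteristics, relational databases struggle to keep up with rapidly changing requirements.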
7、The Property Graph Model (the definition of the graph model); an example graph is shown in Figure 1 below:
The main abstractions in a property graph are nodes, relationships and properties.
(i) Nodes are placeholders for properties.
(ii) Relationships connect nodes.
Figure 1: example of a property graph
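The three abstractions can be sketched as plain Python data structures. This is a minimal sketch, not Neo4j's storage format: the class names and the sample data (Alice, Bob, KNOWS) are assumptions made for this note, and it also assumes the usual property graph convention that a relationship is named, directed, and may carry properties of its own.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # A node is a placeholder for properties (key/value pairs).
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    # A relationship connects a start node to an end node and has a name.
    start: Node
    end: Node
    name: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
alice = Node({"name": "Alice"})
bob = Node({"name": "Bob"})
knows = Relationship(alice, bob, "KNOWS", {"since": 2010})
```

Reading the relationship from start to end gives the sentence "Alice KNOWS Bob", which is the kind of readability the later modeling guidelines rely on.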
8、In Neo4j the humane and expressive query language Cypher is the primary means of accessing the database.
In the popular graph database Neo4j, Cypher is the primary language for accessing the database.
9、Like most query languages, Cypher is composed of clauses. A reasonably simple query is made up of START, MATCH and RETURN clauses
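Like other query languages, Cypher is made up of clauses. A reasonably simple query consists of three clauses: START, MATCH, and RETURN. START specifies the node or nodes in the graph where the query begins, MATCH describes the pattern to search for, and RETURN specifies which information to return. A typical example is shown in Figure 2 below: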
Figure 2: example Cypher statement
The Cypher statement above answers the question: which plays did Shakespeare write?
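The roles of the three clauses can be mimicked in a few lines of Python over an in-memory list of relationships. This is not Cypher and not the query from Figure 2; the function plays_written_by, the relationship type WROTE_PLAY, and the sample data are assumptions chosen only to show which step corresponds to which clause.

```python
# Relationships as (start, type, end) triples; the data is illustrative.
relationships = [
    ("Shakespeare", "WROTE_PLAY", "The Tempest"),
    ("Shakespeare", "WROTE_PLAY", "Julius Caesar"),
    ("Marlowe", "WROTE_PLAY", "Doctor Faustus"),
]


def plays_written_by(author):
    # START: pick the node where the search begins.
    start = author
    # MATCH: follow every (start)-[:WROTE_PLAY]->(play) pattern.
    matches = [end for (s, rel, end) in relationships
               if s == start and rel == "WROTE_PLAY"]
    # RETURN: decide what to hand back, here the distinct titles, sorted.
    return sorted(set(matches))
```

Calling plays_written_by("Shakespeare") walks only the relationships that leave the starting node, which is the idea behind anchoring a query before matching a pattern.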
10、The other clauses we can use in a Cypher query include:
• WHERE: Provides criteria for filtering portions of a pattern.
• CREATE and CREATE UNIQUE: Creates nodes and relationships.
• DELETE: Removes nodes, relationships and properties.
• SET: Set property values.
• FOREACH: Performs an updating action for each element in a list.
• WITH: Divides a query into multiple, distinct parts.
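Cypher also has a number of other clauses. WHERE supplies filter criteria that narrow the patterns matched by MATCH. CREATE creates nodes and relationships; CREATE UNIQUE does the same but creates only the parts of the pattern that do not already exist, so it does not produce duplicates. DELETE removes nodes, relationships, and properties from the graph. SET sets property values. FOREACH performs an updating action on each element of a list. WITH splits a query into multiple parts and passes the results of one part on to the next, much like a pipe in Linux.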
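The difference between CREATE and CREATE UNIQUE is easy to get wrong, so here is a rough Python analogy. It is only a sketch of the semantics as I understand them (CREATE always adds, CREATE UNIQUE adds only what is missing); the functions create and create_unique are hypothetical helpers, not part of any Neo4j API.

```python
def create(relationships, start, rel_type, end):
    """Always add a new relationship, even if an identical one exists."""
    relationships.append((start, rel_type, end))


def create_unique(relationships, start, rel_type, end):
    """Add the relationship only if it is not already present."""
    triple = (start, rel_type, end)
    if triple not in relationships:
        relationships.append(triple)


# Running CREATE twice yields two identical relationships.
rels = []
create(rels, "alice", "KNOWS", "bob")
create(rels, "alice", "KNOWS", "bob")

# Running CREATE UNIQUE twice leaves a single relationship.
unique_rels = []
create_unique(unique_rels, "alice", "KNOWS", "bob")
create_unique(unique_rels, "alice", "KNOWS", "bob")
```

Note that the second create_unique call does not raise an error; it simply finds the existing relationship and adds nothing.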
11、Example of a CREATE statement:
Figure 3: example CREATE statement
This example shows how the graph model in Figure 1 is built; refer to Figure 1.
12、By ensuring the correctness of the domain model we’re implicitly improving the graph model, since in a graph database what you sketch on the whiteboard is typically what you store in the database.
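In a graph database, the model you sketch on the whiteboard is typically the model that ends up stored in the database, with little difference between the two.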
13、Modeling guidelines
• Use nodes to represent things in the domain; use relationships to represent the structural relationships between those things.
• Every entity—that is, everything with an identity—should be a separate node.
• Complex, multi-part property values—addresses, for example—should usually be pulled out into separate nodes, even though they typically don’t have their own global identity.
• Actions should be modeled in terms of their products, which should be represented as nodes. Prefer (alice)-[:WROTE]->(review)-[:OF]->(film) to (alice)-[:REVIEWED]->(film).
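When designing a graph data model, follow a few rules of thumb. Use nodes to represent the things in the domain and relationships to represent the connections between them. Every entity should be a separate node. Complex, multi-part property values (an address such as "XX Road, Chaoyang District, Beijing") should usually be split out into separate nodes. Actions are best modeled through their products, which become nodes: Alice wrote a review of a film, rather than Alice reviewed a film.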
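The last guideline, modeling an action through its product, can be sketched in Python as follows. The node ids, property values, and the helper reviews_of are assumptions made for illustration; only the two patterns (alice)-[:WROTE]->(review)-[:OF]->(film) and (alice)-[:REVIEWED]->(film) come from the text.

```python
# Verb as relationship: the review itself is not a node, so it cannot
# take part in further relationships.
verb_model = [("alice", "REVIEWED", "film")]

# Product as node: the review is a node with its own properties.
nodes = {
    "alice": {"name": "Alice"},
    "review": {"rating": 4, "text": "Worth seeing"},
    "film": {"title": "Some Film"},
}
product_model = [
    ("alice", "WROTE", "review"),
    ("review", "OF", "film"),
]


def reviews_of(film_id):
    """Return (author name, review properties) for each review of the film."""
    results = []
    review_ids = [s for (s, rel, e) in product_model
                  if rel == "OF" and e == film_id]
    for review_id in review_ids:
        authors = [s for (s, rel, e) in product_model
                   if rel == "WROTE" and e == review_id]
        for author in authors:
            results.append((nodes[author]["name"], nodes[review_id]))
    return results
```

With the review as a node, later requirements such as comments on a review or a review covering several films only need new relationships, not a reshaped model.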
14、Testing the model: two ways to test a graph model:
(i)The first, and simplest, is just to check that the graph reads well.
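The simplest method is to check whether the graph reads sensibly and fluently: pick any node as a starting point and read the graph along its relationships.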
(ii)we also need to consider the queries we’ll run on the graph.
Consider the queries the graph has to support, and use those queries to check whether the model is sound.
15、In the general case, don’t encode entities (nouns) into relationships. Use relationships to convey semantics about how entities are related, and the quality of those relationships, not to directly encode verbs. And remember that domain entities aren’t always clear in the way humans communicate, so think carefully about the nouns that you’re actually dealing with.
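This passage sums up what to watch out for when building a graph model. In general, do not encode entities (nouns) as relationships. Relationships convey how entities are related and the quality of that relationship; they are not there to encode verbs directly. Also remember that domain entities are not always obvious in the way people communicate, so look carefully for the nouns that are actually involved.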
16、model naturally and in high-fidelity and trust in the graph database to do the right thing.
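While building a graph model we may add so many nodes and relationships that the data volume grows large and we start to doubt whether the model is workable. At that point, remember to model naturally and with high fidelity, and trust the graph database to do the right thing.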