Probabilistic Reasoning on Knowledge Graphs

KNOWLEDGE GRAPHS & REASONING

Using a knowledge graph without reasoning is like having an inviting cake and leaving it there just to admire it: aesthetically fascinating, but a waste of yummy ingredients and, in the long run, pointless!

Reasoning enables the ‘knowledge part’ of a knowledge graph which, without it, can be viewed more like a database than a knowledge model.

Here’s the difference in a nutshell.

While a relational database is centered on modeling a domain of interest with relations, storing it in tables, and querying it in SQL, a knowledge graph is more about graph-like data (but not only data, as in a graph database).

With a knowledge graph, you can model a reality of interest for which you have graph-like data (the so-called ground extensional component). You can abstractly describe how such reality works, that is: what are the rules of the underlying business, what are the things humans know, but neither programs nor databases do. Finally, through the reasoning process, you can generate new knowledge in the form of new nodes and edges for your graph, namely, the derived extensional component, a.k.a. reasoning component [1].

For a straightforward example of reasoning on knowledge graphs, you can check it out here on Medium.

So far, so good.

But the reality is uncertain, data are uncertain, the domain knowledge is uncertain, and even our human brain inferential processes are, by design, uncertain. Therefore, reasoning needs to include and handle all kinds of uncertainty.

So, if you want to do cutting-edge reasoning on knowledge graphs, and you want a more reasonable and reliable representation of reality rather than a kind of robot/puppet version of human reasoning, you need to consider uncertainty and incorporate it into your reasoning.

How to cook probabilistic reasoning on KGs

Photo by Calum Lewis on Unsplash

In probabilistic reasoning with KGs, we of course need to perform the ‘probabilistic part’. The bad news is that, if we want to do it wisely, we must at the same time satisfy the standard requirements for reasoning on KGs [2].

We don’t want to be able to, for example, handle null values but not be able to traverse the graph, right? So we need FOUR CRITICAL INGREDIENTS for the basic recipe here:

1. Full recursion

If you want to explore graph structures, you will at least expect to traverse these graphs! And not only traverse them: you want to explore graph data with a multitude of algorithms, where the extent and the form of the paths are not known a priori.

With graph databases, you may be used to pattern-matching languages like the one in Neo4J. Here the world is different: knowledge graphs don’t usually rely on pattern matching, but on a far more powerful tool: recursion.

So, I am sorry to inform you that since you want to reason on a KG, you need recursion.

And it must be full, in the sense that we cannot get away with the many existing forms of partial recursion, as we want complete freedom in the graph algorithms we can write.

Let’s take a basic graph problem like st-connectivity (the decision problem asking whether, in a directed graph, t is reachable from s). It is expressible via left recursion. But there are a lot of fundamental yet non-trivial problems that cannot be stated at all with linear recursion alone. That’s why full recursion is a must.
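
To make the st-connectivity example concrete, here is a minimal Python sketch that simulates a left-recursive rule by iterating it to a fixpoint. The rule syntax in the docstring, the function name `reachable`, and the sample edges are illustrative assumptions, not the notation of any particular reasoner. The sketch also assumes, by convention, that s is reachable from itself.

```python
def reachable(edges, s, t):
    """Decide st-connectivity in a directed graph.

    Simulates these (illustrative) left-recursive rules:
        reach(s).
        reach(Y) :- reach(X), edge(X, Y).
    """
    reach = {s}                      # base case: s reaches itself
    changed = True
    while changed:                   # re-apply the recursive rule until a fixpoint
        changed = False
        for x, y in edges:
            if x in reach and y not in reach:
                reach.add(y)
                changed = True
    return t in reach


# Hypothetical sample graph: a -> b -> c, and d -> a
edges = {("a", "b"), ("b", "c"), ("d", "a")}
print(reachable(edges, "a", "c"))
print(reachable(edges, "c", "a"))
```

Note that the loop never needs to know how long the path from s to t is: the recursion keeps going until nothing new can be derived, which is exactly what a fixed pattern of bounded length cannot do.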

I think you can never know enough about recursion, so if you agree with me, also read this or this, or go back to the first year of computer science and reread a brilliant book like this!

2. Induction (no, it is not recursion again… do read ingredient 2 as well!)

A language or, if you want to be more informal and breezy, a system that aims at reasoning (on knowledge graphs) and wants to be ready for probabilistic reasoning must be able to express non-ground inductive definitions [3,4].

An inductive definition is a definition given in terms of itself, and it must be sound: there must be one or more non-recursive base cases, plus one or more recursive cases that eventually lead back to a base case.

Very well-known examples of inductive definitions are the Fibonacci series and the factorial. Probably the plainest example of an inductive definition is the transitive closure of a binary relation. A system should be able to compute it in non-ground form, that is, from rules over variables and not only from the base-case facts. This is of the essence for reasoning since:

if a system can’t even work out a transitive closure, you can be sure it won’t help you with reasoning!
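
As a rough illustration of an inductive definition with one base case and one recursive case, the sketch below computes the transitive closure of a binary relation as a least fixpoint. The rule syntax in the docstring and the names `transitive_closure`, `edge`, and `path` are assumptions made for the example. The recursive rule mentions `path` twice in its body, so it is also a small instance of non-linear recursion.

```python
def transitive_closure(edge):
    """Least fixpoint of the (illustrative) inductive definition:

        path(X, Y) :- edge(X, Y).               # base case
        path(X, Z) :- path(X, Y), path(Y, Z).   # recursive case
    """
    path = set(edge)                             # start from the base case
    while True:
        # one application of the recursive rule to everything derived so far
        derived = {(x, z) for (x, y1) in path for (y2, z) in path if y1 == y2}
        if derived <= path:                      # nothing new: fixpoint reached
            return path
        path |= derived


# Hypothetical relation: 1 -> 2 -> 3 -> 4
edge = {(1, 2), (2, 3), (3, 4)}
closure = transitive_closure(edge)
print(sorted(closure))
```

The definition is non-ground because the rules range over variables X, Y, Z: the same two rules work for any relation passed in, not just for the three facts used here.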

3. Semantics

The reasoning process must be traced back to a specific semantics, the ingredient that gives meaning to the intensional component and to query answers.

You have a lot of different choices of semantics at your disposal (here we mean language semantics): stable-model semantics, well-founded semantics, least-model semantics, all their variants, and so on. They lead to reasoning processes that are more or less efficient in terms of performance.

A reasonable choice, in this case, is the use of well-founded semantics.

Let’s take the answer to a conjunctive query. With stable-model semantics, the construction of the answer needs to consider the facts appearing in all the possible interpretations that satisfy your intensional component and the query. In well-founded semantics, by contrast, there is a single intended interpretation, and a correct answer to the query contains the facts that are true in it. The saving in terms of performance is self-evident.
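
The difference can be seen on a toy propositional program. The sketch below is a brute-force illustration, not how a real reasoner works: it enumerates stable models by checking every candidate interpretation against the Gelfond-Lifschitz reduct, and computes the well-founded model with an alternating fixpoint. The program in `RULES` and all function names are assumptions made for the example.

```python
from itertools import combinations

# A rule is (head, positive_body, negative_body). Illustrative program:
RULES = [
    ("a", [], []),          # a.
    ("p", ["a"], ["q"]),    # p :- a, not q.
    ("q", ["a"], ["p"]),    # q :- a, not p.
    ("r", ["a"], ["s"]),    # r :- a, not s.
]
ATOMS = {"a", "p", "q", "r", "s"}


def least_model(definite_rules):
    """Least model of negation-free rules, by naive fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and all(b in model for b in pos):
                model.add(head)
                changed = True
    return model


def gamma(rules, interp):
    """Least model of the Gelfond-Lifschitz reduct of rules w.r.t. interp."""
    reduct = [(h, pos) for h, pos, neg in rules
              if not any(n in interp for n in neg)]
    return least_model(reduct)


def stable_models(rules, atoms):
    """Check every subset of atoms: exponential, for illustration only."""
    atoms = sorted(atoms)
    models = []
    for k in range(len(atoms) + 1):
        for cand in combinations(atoms, k):
            if gamma(rules, set(cand)) == set(cand):
                models.append(frozenset(cand))
    return models


def well_founded(rules, atoms):
    """Alternating fixpoint; returns (true atoms, false atoms)."""
    true = set()
    while True:
        possible = gamma(rules, true)       # overestimate of what may hold
        new_true = gamma(rules, possible)   # underestimate of what must hold
        if new_true == true:
            return true, set(atoms) - possible
        true = new_true


models = stable_models(RULES, ATOMS)
wf_true, wf_false = well_founded(RULES, ATOMS)
print(models, wf_true, wf_false)
```

On this program there are two stable models, so an answer under stable-model semantics has to look at both, while the well-founded computation builds one (three-valued) model in which `p` and `q` are simply left undefined.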

This and other theoretical underpinnings [5] are such that you can develop faster reasoners using well-founded semantics, and therefore it is a better choice!

4. Ontological reasoning

We keep talking about reasoning, but that is a bit vague: which kind of reasoning? I would like a powerful kind of reasoning, of course, one that allows me to query my knowledge graph and obtain unexpected, new, and significant insights!

So, at the very least, we would like to have ontological reasoning. In philosophical terms, it can be seen as the ability to reason ‘on the existing’, while in mathematical terms it implies the presence of specific operators such as inclusion, universal and existential quantification, etc.

In computer science, ontological reasoning is often taken to correspond to the ability of a system to support SPARQL (a query language) under the reasoning rules, known as the ‘entailment regime’, of the OWL 2 QL profile (a set of concepts and reasoning rules in a well-known language of the semantic web community).

This ‘profile’ is a kind of toolbox with all the essentials of reasoning: just think that it includes symmetric, reflexive, and irreflexive properties and disjoint properties, or simply lets you define classes and sub-classes.

Ontological reasoning under this profile also offers excellent performance, with a nice trade-off between expressive power and computational complexity; for example, query answering is in the AC0 complexity class when the query is assumed to be fixed.
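
To give a feel for what such a toolbox does, here is a deliberately tiny sketch in plain Python. It is not SPARQL and not an OWL 2 QL implementation: it only applies two entailment rules (sub-class propagation and symmetric properties) to a set of triples until nothing new is derived. The `ex:` names and the function `saturate` are hypothetical; `rdf:type`, `rdfs:subClassOf`, and `owl:SymmetricProperty` are used here just as string labels.

```python
def saturate(triples):
    """Apply two ontology-style rules until no new triple is derived."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p == "rdf:type":
                # class membership propagates up the sub-class hierarchy
                new |= {(s, "rdf:type", sup) for c, q, sup in facts
                        if q == "rdfs:subClassOf" and c == o}
            # a property declared symmetric holds in both directions
            if (p, "rdf:type", "owl:SymmetricProperty") in facts:
                new.add((o, p, s))
        if new <= facts:
            return facts
        facts |= new


# Hypothetical toy knowledge graph
kg = saturate({
    ("ex:Manager", "rdfs:subClassOf", "ex:Employee"),
    ("ex:Employee", "rdfs:subClassOf", "ex:Person"),
    ("ex:alice", "rdf:type", "ex:Manager"),
    ("ex:worksWith", "rdf:type", "owl:SymmetricProperty"),
    ("ex:alice", "ex:worksWith", "ex:bob"),
})

# "Who is a Person?" now returns alice, although no such fact was stored.
people = sorted(s for s, p, o in kg if p == "rdf:type" and o == "ex:Person")
print(people)
```

The derived triples are new edges of the graph that were never asserted, which is the kind of new knowledge the reasoning component is meant to produce.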

Preparation:

Photo by Aarón Blanco Tejedor on Unsplash

The four ingredients we just saw are for the basic recipe: reasoning. But for the advanced recipe, probabilistic reasoning, you will need the joint effort of all the ‘fabulous five’. That means one more crucial ingredient, THE FIFTH INGREDIENT: now, you must decide how you wish to manage uncertainty.

Scientists have conceived a wide variety of approaches to cope with the many situations involving imperfect or unknown information: probability theory, fuzzy logic, subjective logic, evidence theory, etc.

If we keep our feet on the ground and rely on probability theory for uncertainty, what kind of models/systems can we use to perform probabilistic reasoning on KGs?

Well, a lot of people are working on probabilistic reasoning. So many, in fact, that there are at least three main related research areas: probabilistic logic programming, probabilistic programming languages, and statistical relational learning.

It took me a while just to dive into the different branches of science pursuing this goal, and definitely a lot of time to truly understand what the heck the state of the art has to offer for probabilistic reasoning on knowledge graphs. Now I can report to you what I’ve found.

PROBABILISTIC LOGIC PROGRAMMING is a group of very nice languages that allow you to define very compact and elegantly simple logic programs. Moreover, they use Sato semantics, a straightforward and compact way to define semantics.
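
A small sketch may help here. It assumes that ‘Sato semantics’ refers to the distribution semantics, in which each probabilistic fact is an independent random choice and the probability of a query is the total probability of the possible worlds where the query holds. The edge probabilities and the names `PROB_EDGES`, `connected`, and `query_probability` are invented for the example, and the brute-force enumeration of worlds is for illustration only.

```python
from itertools import product

# Hypothetical probabilistic facts: each edge exists independently
# with the given probability.
PROB_EDGES = {("a", "b"): 0.8, ("b", "c"): 0.5, ("a", "c"): 0.1}


def connected(edges, s, t):
    """Plain (non-probabilistic) reachability inside one possible world."""
    reach = {s}
    changed = True
    while changed:
        changed = False
        for x, y in edges:
            if x in reach and y not in reach:
                reach.add(y)
                changed = True
    return t in reach


def query_probability(prob_edges, s, t):
    """Sum the probabilities of all worlds in which t is reachable from s."""
    items = list(prob_edges.items())
    total = 0.0
    for choice in product([True, False], repeat=len(items)):
        weight = 1.0
        world = set()
        for (edge, p), kept in zip(items, choice):
            weight *= p if kept else 1.0 - p   # independent choices multiply
            if kept:
                world.add(edge)
        if connected(world, s, t):
            total += weight
    return total


p_ac = query_probability(PROB_EDGES, "a", "c")
print(round(p_ac, 4))
```

As a sanity check, c is unreachable from a only if the direct edge is absent and the two-step path is broken, so the result should equal 1 - 0.9 * (1 - 0.8 * 0.5) = 0.46. Enumerating all worlds is exponential in the number of probabilistic facts, which is why real systems use smarter inference.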

PROBABILISTIC PROGRAMMING LANGUAGES are elegant because they let you treat all the statistical distributions natively, and this is so attractive and comfortable for statisticians.

STATISTICAL RELATIONAL LEARNING allows us to use Probabilistic Graphical Models like Markov Logic Networks, so all competencies people have in a machine learning environment can be immediately put to use (for example, probabilistic inference in PGM). You can create PGMs that are very close to your reality of interest and model entities and relationships with minimal effort.
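
As a hedged illustration of the Markov Logic Network idea, the sketch below uses a single weighted formula over one individual. In an MLN, the probability of a world is proportional to the exponential of the sum of the weights of the formula groundings it satisfies. The formula Smokes(x) => Cancer(x), its weight 1.5, and all variable names are assumptions chosen for the example.

```python
import math
from itertools import product

WEIGHT = 1.5  # hypothetical weight of the formula Smokes(x) => Cancer(x)


def n_satisfied(smokes, cancer):
    """Number of satisfied groundings (0 or 1, with a single individual)."""
    return 1 if (not smokes) or cancer else 0


# All possible worlds over the two ground atoms (smokes, cancer)
worlds = list(product([False, True], repeat=2))

# Unnormalised score of each world: exp(weight * number of satisfied groundings)
unnormalised = {w: math.exp(WEIGHT * n_satisfied(*w)) for w in worlds}
Z = sum(unnormalised.values())                 # partition function
prob = {w: u / Z for w, u in unnormalised.items()}

# Conditional query: P(cancer | smokes)
p_cancer_given_smokes = prob[(True, True)] / (prob[(True, True)] + prob[(True, False)])
print(round(p_cancer_given_smokes, 4))
```

Unlike a hard logical rule, the world that violates the formula is not impossible here, only less likely: the larger the weight, the closer the conditional probability gets to 1.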

All pleasant and positive points and no drawbacks! So, what should I go for?

All these techniques have focused only on the fifth ingredient of the recipe: HOW TO MANAGE UNCERTAINTY. But they almost completely avoid the other four.

In other words, they provide a specific way to cope with uncertainty, but without all the other fundamental features for reasoning on knowledge graphs.

They are not applicable to probabilistic reasoning on knowledge graphs.

It is as if they studied and developed the best masala (the curry powder) for a curry recipe, forgetting the vegetables or the chicken or even the fire sometimes. How can you cook your curry then?

Photo by Enrico Mantegazza on Unsplash

Having noted this, I have tried to make my own contribution to cooking probabilistic knowledge graphs: making an extra effort, studying what is necessary, and going grocery shopping for all the required ingredients. This is my curry recipe for probabilistic reasoning on knowledge graphs.

I’m eager to listen to yours!

CONCLUSIONS

Without reasoning, knowledge graphs are half-cooked (read more about it here), and handling uncertainty for principled reasoning is paramount. Yet there are five fundamentals for doing this, the ‘fabulous five’: full recursion, induction, semantics, ontological reasoning, and probability theory. Most of the existing approaches neglect most of these fabulous five and focus on only one of these ‘ingredients’. Still, some attempts have been made: here all of the ‘fabulous five’ have been used.

Follow me on Medium for more.

Let’s also keep in touch on LinkedIn!

REFERENCES

[1] L. Bellomarini et al., Knowledge Graphs and Enterprise AI: The Promise of an Enabling Technology (2019), ICDE

[2] L. Bellomarini et al., Reasoning under Uncertainty in Knowledge Graphs (2020), RuleML+RR 2020

[3] D. Fierens et al., Inference and learning in probabilistic logic programs using weighted Boolean formulas (2015), Theory and Practice of Logic Programming

[4] J. Lee and Y. Wang, Weighted rules under the stable model semantics (2016), KR, pp. 145–154, AAAI Press

[5] M. Denecker et al., Logic programming revisited: Logic programs as inductive definitions (2001), ACM Trans. Comput. Log.

Translated from: https://towardsdatascience.com/probabilistic-reasoning-on-knowledge-graphs-d510269f5cf0
