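An introduction to parsing triples with rdflib

Continuing from the previous post: parsing a Protege OWL file with rdflib

1. Parsing the triples

The file being parsed is creature.owl. For background on the file, see heading 2, "Exporting an OWL file from Protege" (between the two figures in that post there is a Baidu Netdisk link to creature.owl that you are welcome to use~ If it helps, a like would be appreciated, hehe)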

import csv

from rdflib import Graph

# creature.owl is serialized as RDF/XML, hence format="xml".
g = Graph()
g.parse("creature.owl", format="xml")

# Write one CSV row per triple: subject, predicate, object.
with open("creature.csv", "w", newline="", encoding="utf-8") as f1:
    writer = csv.writer(f1)
    writer.writerow(["Subject", "Predicate", "Object"])
    for stmt in g:
        print(stmt)
        # Each term (IRI, blank node or literal) is flattened to its string form.
        writer.writerow([str(term) for term in stmt])

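2. Parse results

Contents of creature.owl:
[Figure 1: contents of creature.owl]

The OWL file is small; parsing it yields 61 triples (have a go at analyzing them yourself~)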

Subject,Predicate,Object
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#虎,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆生动物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆生动物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
Nff78fb4e14324cee8a72a2182381f076,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#AllDisjointClasses
Nd855ca9f5c1045f68646e0c530797319,http://www.w3.org/1999/02/22-rdf-syntax-ns#first,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#空
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#微生物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
Nff78fb4e14324cee8a72a2182381f076,http://www.w3.org/2002/07/owl#members,Na100b9cbefec4b76a54dfb79604c00e2
Ne4411de3bfd94de2909efcd9491f373f,http://www.w3.org/1999/02/22-rdf-syntax-ns#first,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#地点
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#两栖动物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#鲸鱼,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#水生动物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#水生动物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#北方,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#NamedIndividual
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#太平洋,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#海
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苹果树,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#NamedIndividual
Nd855ca9f5c1045f68646e0c530797319,http://www.w3.org/1999/02/22-rdf-syntax-ns#rest,Ne4411de3bfd94de2909efcd9491f373f
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苹果树,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#高度,3米
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#海,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#地点
Na100b9cbefec4b76a54dfb79604c00e2,http://www.w3.org/1999/02/22-rdf-syntax-ns#first,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#海
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#被吃,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#ObjectProperty
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆生动物,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#动物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苹果树,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#种子植物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#吃,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#ObjectProperty
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#被吃,http://www.w3.org/2000/01/rdf-schema#range,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#虎
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#两栖动物,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#动物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#鲸鱼,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#居住在,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#太平洋
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#鲸鱼,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#NamedIndividual
Na100b9cbefec4b76a54dfb79604c00e2,http://www.w3.org/1999/02/22-rdf-syntax-ns#rest,Nd855ca9f5c1045f68646e0c530797319
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#吃,http://www.w3.org/2002/07/owl#inverseOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#被吃
Ne4411de3bfd94de2909efcd9491f373f,http://www.w3.org/1999/02/22-rdf-syntax-ns#rest,http://www.w3.org/1999/02/22-rdf-syntax-ns#nil
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#水生动物,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#动物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#鹿,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#太平洋,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#NamedIndividual
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#幼鹿,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#被吃,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#东北虎
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#空,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#地点
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Ontology
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苔类植物,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苔藓植物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#被吃,http://www.w3.org/2000/01/rdf-schema#domain,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#鹿
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#海,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#种子植物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#吃,http://www.w3.org/2000/01/rdf-schema#domain,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#虎
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苹果树,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#居住在,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#北方
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#幼鹿,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#鹿
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#吃,http://www.w3.org/2000/01/rdf-schema#range,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#鹿
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#动物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苔藓植物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#鹿,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆生动物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#空,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#幼鹿,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#NamedIndividual
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#植物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苔类植物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#高度,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#DatatypeProperty
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#居住在,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#ObjectProperty
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#种子植物,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#植物
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#东北虎,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#NamedIndividual
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#地点,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#北方,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#虎,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#东北虎,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#虎
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#苔藓植物,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#植物
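
The rows above are hard to scan because every term is a full IRI. Here is a minimal sketch of one way to shorten them for reading, assuming (as in this file) that the name follows a `#`. `local_name` is a hypothetical helper written for this post, not part of rdflib:

```python
def local_name(term: str) -> str:
    """Return the part after '#' for IRIs; leave blank-node IDs and literals unchanged."""
    if term.startswith("http") and "#" in term:
        return term.rsplit("#", 1)[1]
    return term


BASE = "http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#"

# One row copied from the CSV above.
row = [
    BASE + "虎",
    "http://www.w3.org/2000/01/rdf-schema#subClassOf",
    BASE + "陆生动物",
]
short_row = [local_name(term) for term in row]
print(short_row)  # ['虎', 'subClassOf', '陆生动物']
```

Shortening is for reading only: the namespace is lost, so the full IRIs should stay in any file that is processed further.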

3. My understanding

① When there is only one level, the triples are parsed "perfectly"

In the OWL file, everything about "两栖动物" (amphibians) fits in just the following few lines:

    <owl:Class rdf:about="http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#两栖动物">
        <rdfs:subClassOf rdf:resource="http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#动物"/>
    </owl:Class>

The parsed output for "两栖动物" is just the following 2 rows:
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#两栖动物,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#Class
http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#两栖动物,http://www.w3.org/2000/01/rdf-schema#subClassOf,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#动物

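As shown above, when there is only "one level", the XML maps cleanly onto triples: the element name gives the type triple and the child element gives the subClassOf triple.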

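② What if the XML data is nested several levels deep?
Then we run into the issue mentioned earlier: "a parent XML block cannot be tied directly to the data in its child XML block; the two have to be linked through a unique identifier"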

The OWL file contains the following XML block:
    <rdf:Description>
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#AllDisjointClasses"/>
        <owl:members rdf:parseType="Collection">
            <rdf:Description rdf:about="http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#海"/>
            <rdf:Description rdf:about="http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#空"/>
            <rdf:Description rdf:about="http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#陆"/>
        </owl:members>
    </rdf:Description>
This block can be seen as two levels deep (note: it sits at the same level as the block above). The outer rdf element contains an rdf element and an owl element, and the owl element in turn contains rdf elements, so rdf-owl-rdf gives two levels of nesting.

What we would really like to see is "AllDisjointClasses, members, 海" (海 = sea), but what we actually get is:
① Na100b9cbefec4b76a54dfb79604c00e2,http://www.w3.org/1999/02/22-rdf-syntax-ns#first,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#海
Evidently rdflib has ideas of its own~

The only triple that mentions AllDisjointClasses (not much use on its own?):
② Nff78fb4e14324cee8a72a2182381f076,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2002/07/owl#AllDisjointClasses

Let's take a closer look at the "gibberish" Na100b9cbefec4b76a54dfb79604c00e2 from ①:
③ Nff78fb4e14324cee8a72a2182381f076,http://www.w3.org/2002/07/owl#members,Na100b9cbefec4b76a54dfb79604c00e2
④ Na100b9cbefec4b76a54dfb79604c00e2,http://www.w3.org/1999/02/22-rdf-syntax-ns#first,http://www.semanticweb.org/user/ontologies/2021/0/untitled-ontology-49#海
⑤ Na100b9cbefec4b76a54dfb79604c00e2,http://www.w3.org/1999/02/22-rdf-syntax-ns#rest,Nd855ca9f5c1045f68646e0c530797319

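Clearly, Na100b9cbefec4b76a54dfb79604c00e2 is the bridge that connects "AllDisjointClasses, members, 海". It is a blank node, an anonymous node whose ID rdflib generates at parse time; here it is the first cell of the list that holds the members, and its rest (⑤) points to the next cell.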

In short, we wanted "AllDisjointClasses, members, 海" but got only the following three triples (IRIs shortened):
   Nff78fb4e14324cee8a72a2182381f076,type,AllDisjointClasses
   Nff78fb4e14324cee8a72a2182381f076,members,Na100b9cbefec4b76a54dfb79604c00e2
   Na100b9cbefec4b76a54dfb79604c00e2,first,海

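Actually, rdflib's approach is quite sensible: it lists every relation explicitly~
So the next step is for us to write an algorithm that strips out the "gibberish" (the identifiers) and extracts the information we want~
(My gut feeling is that this is not really advisable, unless you understand OWL very thoroughly or choose to fix things by hand; automation goes wrong easily~)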

That's all for now. If you have questions, see you in the comments~