This article is fairly long and somewhat involved: it describes in detail how to assemble a custom question-answering (QA) system. The core of the framework is a configuration module (essentially a rule engine) plus graph algorithms (optimal subgraph matching / connected-subgraph analysis), and every step can be broken apart and recombined flexibly.
All Cypher scripts in this article have been organized by the author and can be downloaded and run directly once the environment is ready. All materials for this guide can be downloaded here:
- graph-qabot
- graph-qabot-demo
https://github.com/graphfoundation/ongdb/releases/tag/1.0.4
Download the following components and place them in the /plugins folder.
# Download the all version
https://github.com/graphfoundation/ongdb-apoc/releases/tag/3.4.0.10
https://github.com/ongdb-contrib/ongdb-lab-apoc/releases/tag/1.0.0
# conf/ongdb.conf
dbms.security.procedures.unrestricted=apoc.*,olab.*,custom.*
apoc.import.file.enabled=true
# Start ONgDB on Windows
bin\ongdb.bat console
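After startup it is worth a quick check that the APOC and olab plugins were actually loaded. A minimal sketch, assuming the Neo4j 3.x-style dbms.procedures() listing that ONgDB 1.0 is built on:
CALL dbms.procedures() YIELD name
WITH name WHERE name STARTS WITH 'apoc' OR name STARTS WITH 'olab'
RETURN split(name,'.')[0] AS plugin, count(*) AS procedures;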
import tushare as ts
import system_constant as sys_cnt
import os

# Initialize the tushare pro API
pro = ts.pro_api(sys_cnt.tushare_token())


# Fetch basic stock information
def run():
    # Query the list of all stocks currently listed and trading normally
    data = pro.stock_basic(exchange='', list_status='L', fields='ts_code,symbol,name,area,industry,list_date')
    data.to_csv(path_or_buf=os.getcwd().replace('python', 'csv') + '\\stocks.csv',
                encoding='GBK',
                columns=['ts_code', 'symbol', 'name', 'area', 'industry', 'list_date'],
                index=False)


if __name__ == '__main__':
    run()
import tushare as ts
import pandas as pd
import system_constant as sys_cnt
import os
import time

# Initialize the tushare pro API
pro = ts.pro_api(sys_cnt.tushare_token())


# Fetch executive information for every listed stock
def run():
    # Query the list of all stocks currently listed and trading normally
    data = pro.stock_basic(exchange='', list_status='L', fields='ts_code,symbol,name,area,industry,list_date')
    tss = data['ts_code']
    result = []
    ct = 0
    bk_ct = 100000
    for code in tss:
        time.sleep(0.5)
        # Fetch all executive records for a single company
        df = pro.stk_managers(ts_code=code)
        result.append(df)
        ct += 1
        if ct > bk_ct:
            break
    df_merge = pd.concat(result)
    print(df_merge)
    df_merge.to_csv(path_or_buf=os.getcwd().replace('python', 'csv') + '\\managers.csv',
                    encoding='GBK',
                    columns=['ts_code', 'ann_date', 'name', 'gender', 'lev', 'title', 'edu', 'national', 'birthday',
                             'begin_date', 'end_date'],
                    index=False)


if __name__ == '__main__':
    run()
Import the JSON data below into the Graphene graph data modeling tool.
{"data":{"nodes":[{"x":613,"y":284,"color":"#E52B50","label":"股票代码","properties":[{"id":"a2aeffb2","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""},{"id":"afb48f90","key":"value","type":"","defaultValue":"","limitMin":"","limitMax":"","isRequired":false,"isAutoGenerated":false,"isSystem":false,"description":"值"}],"id":"a1b949a0","isSelected":false,"isNode":true},{"x":678,"y":444,"color":"#AB274F","label":"股票","properties":[{"id":"a4a8beb4","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""},{"id":"a9bf77ad","key":"value","type":"","defaultValue":"","limitMin":"","limitMax":"","isRequired":false,"isAutoGenerated":false,"isSystem":false,"description":"值"}],"id":"a5844bbd","isSelected":false,"isNode":true},{"x":413,"y":368,"color":"#3B7A57","label":"股票名称","properties":[{"id":"a08f768e","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""},{"id":"a09d0ab9","key":"value","type":"","defaultValue":"","limitMin":"","limitMax":"","isRequired":false,"isAutoGenerated":false,"isSystem":false,"description":"值"}],"id":"a78c159d","isSelected":false,"isNode":true},{"x":426,"y":475,"color":"#FF7E00","label":"地域","properties":[{"id":"a59d20a1","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""},{"id":"a28d19bb","key":"value","type":"","defaultValue":"","limitMin":"","limitMax":"","isRequired":false,"isAutoGenerated":false,"isSystem":false,"description":"值"}],"id":"a49f8ebc","isSelected":false,"isNode":true},{"x":481,"y":557,"color":"#FF033E","label":"行业","properties":[{"id":"aea1c5b2","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""},{"id":"aaa4b082","key":"value","type":"","defaultValue":"","limitMin":"","limitMax":"","isRequired":false,"isAutoGenerated":false,"isSystem":false,"description":"值"}],"id":"a381f688","isSelected":false,"isNode":true},{"x":714,"y":569,"color":"#AF002A","label":"上市日期","properties":[{"id":"a284f1aa","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""},{"id":"a0b444a4","key":"value","type":"","defaultValue":"","limitMin":"","limitMax":"","isRequired":false,"isAutoGenerated":false,"isSystem":false,"description":"值"}],"id":"ac81ffb4","isSelected":false,"isNode":true},{"x":834,"y":380,"color":"#E32636","label":"高管","properties":[{"id":"a4b02384","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"id":"a2ade2ad","isSelected":false,"isNode":true},{"x":890,"y":266,"color":"#C46210","label":"性别","properties":[{"id":"ad9cab82","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""},{"id":"a2aa0f86","key":"value","type":"","defaultValue":"","limitMin":"","limitMax":"","isRequired":false,"isAutoGenerated":false,"isSystem":false,"description":"值"}],"id":"af9a0f97","isSelected":false,"isNode":true},{"x":977,"y":447,"color":"#E52B50","label":"学历","properties":[{"id":"a090f498","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGene
rated":true,"isSystem":true,"description":""},{"id":"a8a0e1b4","key":"value","type":"","defaultValue":"","limitMin":"","limitMax":"","isRequired":false,"isAutoGenerated":false,"isSystem":false,"description":"值"}],"id":"ad931890","isSelected":false,"isNode":true},{"x":1001,"y":180,"color":"#AB274F","label":"性别_别名","properties":[{"id":"a3980888","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"id":"aa8b5199","isSelected":false,"isNode":true}],"edges":[{"startNodeId":"a5844bbd","endNodeId":"a1b949a0","middlePointOffset":[0,0],"properties":[{"id":"a29766a0","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"股票代码","id":"a5934a81","isSelected":false,"isEdge":true},{"startNodeId":"a5844bbd","endNodeId":"a78c159d","middlePointOffset":[0,0],"properties":[{"id":"a196c08e","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"股票名称","id":"acaecfb9","isSelected":false,"isEdge":true},{"startNodeId":"a5844bbd","endNodeId":"a49f8ebc","middlePointOffset":[0,0],"properties":[{"id":"a88ae1a4","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"地域","id":"aa94de98","isSelected":false,"isEdge":true},{"startNodeId":"a5844bbd","endNodeId":"a381f688","middlePointOffset":[0,0],"properties":[{"id":"a493c4a1","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"所属行业","id":"aead0aa2","isSelected":false,"isEdge":true},{"startNodeId":"a5844bbd","endNodeId":"ac81ffb4","middlePointOffset":[0,0],"properties":[{"id":"aebda5bd","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"上市日期","id":"a19f3281","isSelected":false,"isEdge":true},{"startNodeId":"a2ade2ad","endNodeId":"a5844bbd","middlePointOffset":[0,0],"properties":[{"id":"adaf73aa","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"任职于","id":"a3bcec8c","isSelected":false,"isEdge":true},{"startNodeId":"a2ade2ad","endNodeId":"af9a0f97","middlePointOffset":[0,0],"properties":[{"id":"aca28680","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"性别","id":"ae973eb2","isSelected":false,"isEdge":true},{"startNodeId":"a2ade2ad","endNodeId":"ad931890","middlePointOffset":[0,0],"properties":[{"id":"a7af3fb0","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"学历","id":"aca71396","isSelected":false,"isEdge":true},{"startNodeId":"af9a0f97","endNodeId":"aa8b5199","middlePointOffset":[5.5,-6],"properties":[{"id":"a692c7a2","key":"id","type":"ID","defaultValue":"","limitMin":"","limitMax":"","isRequired":true,"isAutoGenerated":true,"isSystem":true,"description":""}],"label":"别名","id":"a9a6bb89","isSelected":false,"isEdge":true}]}}
CREATE CONSTRAINT ON (n:股票代码) ASSERT (n.value) IS NODE KEY;
CREATE CONSTRAINT ON (n:股票) ASSERT (n.value) IS NODE KEY;
CREATE CONSTRAINT ON (n:股票名称) ASSERT (n.value) IS NODE KEY;
CREATE CONSTRAINT ON (n:地域) ASSERT (n.value) IS NODE KEY;
CREATE CONSTRAINT ON (n:行业) ASSERT (n.value) IS NODE KEY;
CREATE CONSTRAINT ON (n:上市日期) ASSERT (n.value) IS NODE KEY;
CREATE CONSTRAINT ON (n:高管) ASSERT (n.md5) IS NODE KEY;
CREATE INDEX ON :高管(value);
CREATE CONSTRAINT ON (n:性别) ASSERT (n.value) IS NODE KEY;
CREATE CONSTRAINT ON (n:学历) ASSERT (n.value) IS NODE KEY;
CREATE CONSTRAINT ON (n:性别_别名) ASSERT (n.value) IS NODE KEY;
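To confirm the constraints and the index took effect, a quick check (using the standard schema procedures of Neo4j 3.x, on which ONgDB 1.0 is based) is:
CALL db.constraints();
CALL db.indexes();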
Place the generated CSV files in the /import directory and convert them to UTF-8-BOM encoding (Notepad++ can do the conversion).
LOAD CSV WITH HEADERS FROM 'file:/stocks.csv' AS row
WITH row.ts_code AS fval,row.symbol AS tval
WHERE tval IS NOT NULL AND tval<>''
MERGE (f:股票 {value:fval})
MERGE (t:股票代码 {value:tval})
MERGE (f)-[:股票代码]->(t);
LOAD CSV WITH HEADERS FROM 'file:/stocks.csv' AS row
WITH row.ts_code AS fval,row.name AS tval
WHERE tval IS NOT NULL AND tval<>''
MERGE (f:股票 {value:fval})
MERGE (t:股票名称 {value:tval})
MERGE (f)-[:股票名称]->(t);
LOAD CSV WITH HEADERS FROM 'file:/stocks.csv' AS row
WITH row.ts_code AS fval,row.area AS tval
WHERE tval IS NOT NULL AND tval<>''
MERGE (f:股票 {value:fval})
MERGE (t:地域 {value:tval})
MERGE (f)-[:地域]->(t);
LOAD CSV WITH HEADERS FROM 'file:/stocks.csv' AS row
WITH row.ts_code AS fval,row.industry AS tval
WHERE tval IS NOT NULL AND tval<>''
MERGE (f:股票 {value:fval})
MERGE (t:行业 {value:tval})
MERGE (f)-[:所属行业]->(t);
LOAD CSV WITH HEADERS FROM 'file:/stocks.csv' AS row
WITH row.ts_code AS fval,row.list_date AS tval
WHERE tval IS NOT NULL AND tval<>''
MERGE (f:股票 {value:fval})
MERGE (t:上市日期 {value:TOINTEGER(tval)})
MERGE (f)-[:上市日期]->(t);
LOAD CSV WITH HEADERS FROM 'file:/managers.csv' AS row
WITH row,apoc.util.md5([row.name,row.gender,row.birthday]) AS fval,row.ts_code AS tval
WHERE tval IS NOT NULL AND tval<>'' AND fval IS NOT NULL AND fval<>'' AND (row.end_date IS NULL OR row.end_date='')
MERGE (f:高管 {md5:fval}) SET f+={value:row.name}
MERGE (t:股票 {value:tval})
MERGE (f)-[:任职于]->(t);
LOAD CSV WITH HEADERS FROM 'file:/managers.csv' AS row
WITH row,apoc.util.md5([row.name,row.gender,row.birthday]) AS fval,row.gender AS tval
WHERE tval IS NOT NULL AND tval<>'' AND fval IS NOT NULL AND fval<>'' AND (row.end_date IS NULL OR row.end_date='')
MERGE (f:高管 {md5:fval}) SET f+={value:row.name}
MERGE (t:性别 {value:tval})
MERGE (f)-[:性别]->(t);
LOAD CSV WITH HEADERS FROM 'file:/managers.csv' AS row
WITH row,apoc.util.md5([row.name,row.gender,row.birthday]) AS fval,row.edu AS tval
WHERE tval IS NOT NULL AND tval<>'' AND fval IS NOT NULL AND fval<>'' AND (row.end_date IS NULL OR row.end_date='')
MERGE (f:高管 {md5:fval}) SET f+={value:row.name}
MERGE (t:学历 {value:tval})
MERGE (f)-[:学历]->(t);
WITH ['女性','女'] AS list
UNWIND list AS wd
WITH wd
MATCH (n:性别) WHERE n.value='F' WITH n,wd
MERGE (t:性别_别名 {value:wd})
MERGE (n)-[:别名]->(t);
WITH ['男性','男'] AS list
UNWIND list AS wd
WITH wd
MATCH (n:性别) WHERE n.value='M' WITH n,wd
MERGE (t:性别_别名 {value:wd})
MERGE (n)-[:别名]->(t);
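After the import, a simple sanity check on the node counts per label can be run; this is only a verification sketch, not part of the original pipeline:
MATCH (n)
RETURN labels(n)[0] AS label, count(*) AS nodes
ORDER BY nodes DESC;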
The configuration module is installed in ONgDB as custom functions and procedures so that it can be invoked dynamically from Cypher. Alternatively, the configuration can be passed in dynamically as a parameter and handed to the downstream functions that depend on it.
Configure the graph data model JSON; it can be exported from the Graphene tool via Export -> Schema Json and then dropped into the configuration below. (A quick check of the registered function follows the block.)
CALL apoc.custom.asFunction(
'inference.search.qabot',
'RETURN \'{"graph":{"nodes":[{"properties_filter":[],"id":"1","labels":["股票代码"]},{"properties_filter":[],"id":"2","labels":["股票"]},{"properties_filter":[],"id":"3","labels":["股票名称"]},{"properties_filter":[],"id":"4","labels":["地域"]},{"properties_filter":[],"id":"5","labels":["行业"]},{"properties_filter":[],"id":"6","labels":["上市日期"]},{"properties_filter":[],"id":"7","labels":["高管"]},{"properties_filter":[],"id":"8","labels":["性别"]},{"properties_filter":[],"id":"9","labels":["学历"]},{"properties_filter":[],"id":"10","labels":["性别_别名"]}],"relationships":[{"startNode":"2","properties_filter":[],"id":"1","type":"股票代码","endNode":"1"},{"startNode":"2","properties_filter":[],"id":"2","type":"股票名称","endNode":"3"},{"startNode":"2","properties_filter":[],"id":"3","type":"地域","endNode":"4"},{"startNode":"2","properties_filter":[],"id":"4","type":"所属行业","endNode":"5"},{"startNode":"2","properties_filter":[],"id":"5","type":"上市日期","endNode":"6"},{"startNode":"7","properties_filter":[],"id":"6","type":"任职于","endNode":"2"},{"startNode":"7","properties_filter":[],"id":"7","type":"性别","endNode":"8"},{"startNode":"7","properties_filter":[],"id":"8","type":"学历","endNode":"9"},{"startNode":"8","properties_filter":[],"id":"9","type":"别名","endNode":"10"}]}}\' AS graphDataSchema',
'STRING',
NULL,
false,
'搜索时图数据扩展模式(schema)的动态获取'
);
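Once registered, the function lives under the custom namespace and can be read back directly; a minimal check that reuses apoc.convert.fromJsonMap as the later pipeline does:
RETURN apoc.convert.fromJsonMap(custom.inference.search.qabot()).graph.nodes AS schemaNodes;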
When a word produced by segmentation hits several labels (i.e., the word is ambiguous), this configuration lets the system build a priority queue for the search.
CALL apoc.custom.asFunction(
'inference.weight.qabot',
'RETURN \'{"LABEL":{"股票":12,"高管":11}}\' AS weight',
'STRING',
NULL,
false,
'本体权重'
);
Configure which labels a word is allowed to be searched under (exact-equality matching is used by default for performance).
CALL apoc.custom.asFunction(
'inference.match.qabot',
'RETURN \'{"股票名称":"value","地域":"value","上市日期":"value","性别_别名":"value","高管":"value","学历":"value","行业":"value","股票代码":"value"}\' AS nodeHitsRules',
'STRING',
NULL,
false,
'实体匹配规则'
);
Configure the intent-parsing rules for queries. Three parse modes are supported: CONTAINS, EQUALS, and OTHER (word-list intersection); the EQUALS mode can be combined with a classification model.
1. CONTAINS: if the query contains any value from the intent's LIST, the intent label configured for that LIST is returned.
2. EQUALS: matches the question category of the query (this mode can be used together with an intent-classification model; section 7.1.2, the query flow with a classification model, explains its usage in detail). If the category is exactly equal to a value in the LIST, the intent label for that LIST is returned (the LIST usually holds the question categories produced by the classification model).
3. OTHER (word-list intersection): covers the remaining cases. The segmented words of the query are intersected directly with the configured intent LIST; if the intersection is non-empty, the intent label for that LIST is returned. This mode is usually combined with the intent's sort parameter (to handle the case where several intent labels are hit).
A small inspection query after the configuration block below shows how these fields can be read back.
CALL apoc.custom.asFunction(
'inference.intended.qabot',
'RETURN \'[{"label":"上市日期","return_var_alias":"n1","sort":1,"list":["是什么时候","什么时候"],"parse_mode":"CONTAINS","order_by_field":null,"order_by_type":null},{"label":"学历","return_var_alias":"n2","sort":2,"list":["什么学历"],"parse_mode":"CONTAINS","order_by_field":null,"order_by_type":null},{"label":"高管","return_var_alias":"n3","sort":3,"list":["高管有多少位","高管都有哪些","高管有多少个"],"parse_mode":"CONTAINS","order_by_field":null,"order_by_type":null},{"label":"股票","return_var_alias":"n4","sort":4,"list":["哪些上市公司","有多少家上市公司","哪个公司","有多少家","公司有哪些","公司有多少家","股票有哪些","股票有多少支","股票有多少个","股票代码?","股票?"],"parse_mode":"CONTAINS","order_by_field":null,"order_by_type":null},{"label":"行业","return_var_alias":"n5","sort":5,"list":["什么行业","同一个行业嘛"],"parse_mode":"CONTAINS","order_by_field":null,"order_by_type":null},{"label":"股票名称","return_var_alias":"n6","sort":6,"list":["股票名称?"],"parse_mode":"CONTAINS","order_by_field":null,"order_by_type":null},{"label":"性别","return_var_alias":"n7","sort":7,"list":["性别"],"parse_mode":"OTHER","order_by_field":null,"order_by_type":null},{"label":"性别","return_var_alias":"n8","sort":8,"list":["查询性别"],"parse_mode":"EQUALS","order_by_field":null,"order_by_type":null}]\' AS intendedIntent',
'STRING',
NULL,
false,
'预期意图'
);
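As mentioned above, the configured intents can be inspected after registration; a small sketch that lists each intent's label, parse mode, and keyword list in sort order:
WITH apoc.convert.fromJsonList(custom.inference.intended.qabot()) AS intents
UNWIND intents AS intent
RETURN intent.sort AS sort, intent.label AS label, intent.parse_mode AS parse_mode, intent.list AS keywords
ORDER BY sort;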
Configure how additional parsed elements are appended to the entity-recognition result entityRecognitionHit; the example below splices in time and page-number information. The time parser olab.nlp.timeparser(oper.query) and the page-number parser olab.nlp.pagenum.parse(oper.query) produce results that are merged into entityRecognitionHit for the downstream operators to use. (A standalone check of these two parsers follows the configuration block.)
CALL apoc.custom.asFunction(
'inference.parseadd.qabot',
'WITH $time AS time,$page AS page,$entityRecognitionHit AS entityRecognitionHit WITH entityRecognitionHit,REDUCE(l=\'\',e IN time.list | l+\'({var}.value>=\'+TOINTEGER(apoc.date.convertFormat(e.detail.time[0],\'yyyy-MM-dd HH:mm:ss\',\'yyyyMMdd\'))+\' AND {var}.value<=\'+TOINTEGER(apoc.date.convertFormat(e.detail.time[1],\'yyyy-MM-dd HH:mm:ss\',\'yyyyMMdd\'))+\') OR \') AS timeFilter, REDUCE(l=\'\',e IN page.list | l+\'{var}.value\'+e[0]+e[1]+\' AND \') AS pageFilter CALL apoc.case([size(timeFilter)>4,\'RETURN SUBSTRING($timeFilter,0,size($timeFilter)-4) AS timeFilter\'],\'RETURN "" AS timeFilter\',{timeFilter:timeFilter}) YIELD value WITH entityRecognitionHit,value.timeFilter AS timeFilter,pageFilter CALL apoc.case([size(pageFilter)>5,\'RETURN SUBSTRING($pageFilter,0,size($pageFilter)-5) AS pageFilter\'],\'RETURN "" AS pageFilter\',{pageFilter:pageFilter}) YIELD value WITH entityRecognitionHit,timeFilter,value.pageFilter AS pageFilter WITH entityRecognitionHit,timeFilter,pageFilter CALL apoc.case([timeFilter<>"",\'RETURN apoc.map.setPairs({},[[\\\'上市日期\\\',[{category:\\\'node\\\',labels:[\\\'上市日期\\\'],properties_filter:[{value:$timeFilter}]}]]]) AS time\'],\'RETURN {} AS time\',{timeFilter:timeFilter}) YIELD value WITH value.time AS time,pageFilter,entityRecognitionHit CALL apoc.case([pageFilter<>"",\'RETURN apoc.map.setPairs({},[[\\\'文章页数\\\',[{category:\\\'node\\\',labels:[\\\'文章页数\\\'],properties_filter:[{value:$pageFilter}]}]]]) AS page\'],\'RETURN {} AS page\',{pageFilter:pageFilter}) YIELD value WITH value.page AS page,time,entityRecognitionHit RETURN apoc.map.setKey({},\'entities\',apoc.map.merge(apoc.map.merge(entityRecognitionHit.entities,time),page)) AS entityRecognitionHit',
'MAP',
[['entityRecognitionHit','MAP'],['time','MAP'],['page','MAP']],
false,
'问答解析的内容增加到entityRecognitionHit'
);
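The two parsers consumed by this function can be tried on their own, mirroring how the pipeline calls them; the sample question is arbitrary:
WITH LOWER('2023年三月六日上市的股票代码?') AS query
RETURN olab.nlp.timeparser(query) AS time, olab.nlp.pagenum.parse(query) AS page;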
Define how operators are parsed out of the query; currently only the COUNT aggregation operator is supported.
CALL apoc.custom.asFunction(
'inference.operators.qabot',
'RETURN \'[{"keywords":["最多","最大"],"operator":{"sort": 1,"agg_operator_field": "value","agg_operator": "MAX","agg_operator_type": "NODE"}},{"keywords":["最小","最少"],"operator":{"sort": 2,"agg_operator_field": "value","agg_operator": "MIN","agg_operator_type": "NODE"}},{"keywords":["平均"],"operator":{"sort": 3,"agg_operator_field": "value","agg_operator": "AVG","agg_operator_type": "NODE"}},{"keywords":["有多少","有多少只","有多少支","多少只","多少支","多少","一共"],"operator":{"sort": 5,"agg_operator_field": "value","agg_operator": "COUNT","agg_operator_type": "NODE"}}]\' AS weight',
'STRING',
NULL,
false,
'配置算子规则'
);
A helper function that parses the operator out of the query text.
CALL apoc.custom.asFunction(
'inference.operators.parse',
'WITH $query AS query,custom.inference.operators.qabot() AS list WITH query,apoc.convert.fromJsonList(list) AS list UNWIND list AS map WITH map,REDUCE(l=[],em IN apoc.coll.sortMaps(REDUCE(l=[],e IN map.keywords | l+{wd:e,len:LENGTH(e)}),\'len\')| l+em.wd) AS keywords,query UNWIND keywords AS keyword WITH map,keyword,query CALL apoc.case([query CONTAINS keyword,\'RETURN $keyword AS hit\'],\'RETURN \\\' \\\' AS hit\',{keyword:keyword}) YIELD value WITH map,keyword,query,REPLACE(query,value.hit,\' \') AS trim_query,value.hit AS hit WITH query,trim_query,map.operator AS operator ORDER BY map.operator.sort ASC WITH COLLECT({query:query,trim_query:trim_query,operator:operator}) AS list WITH FILTER(e IN list WHERE e.query<>e.trim_query)[0] AS map,list WITH map,list[0] AS smap CALL apoc.case([map IS NOT NULL,\'RETURN $map AS map\'],\'RETURN $smap AS map\',{map:map,smap:smap}) YIELD value WITH value.map AS map CALL apoc.case([map.trim_query=map.query,\'RETURN {} AS operator\'],\'RETURN $operator AS operator\',{trim_query:map.trim_query,operator:map.operator}) YIELD value RETURN map.query AS query,map.trim_query AS trim_query,value.operator AS operator',
'MAP',
[['query','STRING']],
false,
'算子解析'
);
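A quick way to see what the operator parser extracts, using one of the sample questions that appears later in this guide:
WITH custom.inference.operators.parse(LOWER('火力发电行业博士学历的男性高管有多少位?')) AS oper
RETURN oper.query AS query, oper.trim_query AS trim_query, oper.operator AS operator;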
Segmenting the query relies on a user dictionary; the dictionary entries come mainly from the graph data model and the entity data.
Download the folder below, place it under the root directory, and set the Python executable path and the dictionary location.
https://github.com/ongdb-contrib/ongdb-lab-apoc/tree/1.0/nlp
# nlp/nlp.properties
defined.dic.path=nlp/dic/dynamic/dynamic.dic
python.commands.path=E:\\software\\anaconda3\\python.exe
// dic/dynamic/dynamic.dic
// Words from the intent configuration
WITH custom.inference.intended.qabot() AS str
WITH apoc.convert.fromJsonList(str) as list
UNWIND list AS map
WITH map.label AS label,map.list as list,map
WHERE UPPER(map.parse_mode)<>'CONTAINS' AND UPPER(map.parse_mode)<>'EQUALS'
WITH apoc.coll.union([label],list) as list
UNWIND list AS wd
WITH collect(DISTINCT wd) AS list
RETURN olab.nlp.userdic.add('dynamic.dic',list,true,'UTF-8') AS words;
// Words from the graph data model
WITH custom.inference.search.qabot() AS str
WITH apoc.convert.fromJsonMap(str).graph AS graph
WITH graph.relationships AS relationships,graph.nodes AS nodes
WITH REDUCE(l=[],e IN relationships | l+e.type) AS relationships,REDUCE(l=[],e IN nodes | l+e.labels[0]) AS nodes
WITH apoc.coll.union(relationships,nodes) AS list
RETURN olab.nlp.userdic.add('dynamic.dic',list,true,'UTF-8') AS words;
// Words from the entity-matching rules
WITH custom.inference.match.qabot() AS str
WITH olab.map.keys(apoc.convert.fromJsonMap(str)) AS list
UNWIND list AS lb
WITH lb
WHERE lb<>'性别' AND lb<>'上市日期'
CALL apoc.cypher.run('MATCH (n:'+lb+') WHERE NOT n.value CONTAINS \' \' RETURN COLLECT(DISTINCT n.value) AS list',{}) YIELD value
WITH value.list AS list
RETURN olab.nlp.userdic.add('dynamic.dic',list,true,'UTF-8') AS words;
RETURN olab.nlp.userdic.add('dynamic.dic',['测试','胡永乐','性别'],true,'UTF-8') AS words;
RETURN olab.nlp.userdic.refresh();
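After refreshing the dictionary, segmentation can be checked against a sample question to confirm that domain words (e.g. 火力发电, 高管) are now recognized:
RETURN olab.hanlp.standard.segment('火力发电行业博士学历的男性高管有多少位?') AS words;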
This section shows how the following procedures are wrapped, together with the raw query statements behind them. Section 7.1.2 separately lists the procedure used when the query carries a classification-model result (classifier_type).
CALL custom.qabot.recommend_list('{qa}') YIELD raw_query,re_query,score RETURN raw_query,re_query,score;
CALL custom.qabot('{qa}') YIELD result RETURN result;
CALL custom.qabot.cypher('{qa}') YIELD cypher RETURN cypher;
CALL custom.qabot.graph('{qa}') YIELD path RETURN path;
// 1. Query sentence
WITH LOWER('火力发电行业博士学历的男性高管有多少位?') AS query
// 2. Custom configuration: graph data schema / ontology weights / entity-matching rules / expected intents
WITH query,
custom.inference.search.qabot() AS graphDataSchema,
custom.inference.weight.qabot() AS weight,
custom.inference.match.qabot() AS nodeHitsRules,
// The expected-intent definition supports a sort parameter
custom.inference.intended.qabot() AS intendedIntent,
custom.inference.operators.parse(query) AS oper
// 3. Custom sentence parsing: time / page number
WITH oper,oper.query AS query,oper.operator AS operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,
olab.nlp.timeparser(oper.query) AS time,olab.nlp.pagenum.parse(oper.query) AS page
// 4. Filter time words out of the query
WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:' '})) AS query
// 5. Segment the query after removing the time words
WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.hanlp.standard.segment(query) AS words
// 6. Keep only nouns that are not pure digits from the segmentation result
WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH 'n' AND olab.string.matchCnEn(mp.word)<>'') OR mp.nature='uw')| m.word) AS words
// 7. Entity recognition
WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words,
olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,'EXACT',words,{isMergeLabelHit:true,labelMergeDis:0.5}) AS entityRecognitionHits
// 8. Build the weighted search queue
WITH oper,operator,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits
CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value
WITH oper,operator,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit LIMIT 1
// 9. Merge the custom parsing results into entityRecognitionHit
WITH oper,operator,graphDataSchema,intendedIntent,words,custom.inference.parseadd.qabot(entityRecognitionHit,time,page).entityRecognitionHit AS entityRecognitionHit
// 10. Intent recognition
WITH oper,operator,graphDataSchema,intendedIntent,words,entityRecognitionHit,
apoc.convert.toJson(olab.intent.schema.parse(graphDataSchema,oper.query,words,intendedIntent)) AS intentSchema
WHERE SIZE(apoc.convert.fromJsonList(intendedIntent))>SIZE(apoc.convert.fromJsonMap(intentSchema).graph.nodes)
// 11. Graph-context semantic parsing
WITH operator,graphDataSchema,intentSchema,intendedIntent,
olab.semantic.schema(graphDataSchema,intentSchema,apoc.convert.toJson(entityRecognitionHit)) AS semantic_schema
// 12. Query translation (skip is not set; each sub-query fetches a limited batch of results)
WITH olab.semantic.cypher(apoc.convert.toJson(semantic_schema),intentSchema,-1,10,{},operator) AS cypher
WITH REPLACE(cypher,'RETURN n','RETURN DISTINCT n') AS cypher
// 13. Run the query (value is a MAP whose number of keys is at most the number of intent categories returned by parsing)
CALL apoc.cypher.run(cypher,{}) YIELD value WITH value SKIP 0 LIMIT 10
WITH olab.map.keys(value) AS keys,value
UNWIND keys AS key
WITH apoc.map.get(value,key) AS n
CALL apoc.case([apoc.coll.contains(['NODE'],apoc.meta.cypher.type(n)),'WITH $n AS n,LABELS($n) AS lbs WITH lbs[0] AS label,n.value AS value RETURN label+$sml+UPPER(TOSTRING(value)) AS result'],'WITH $n AS n RETURN TOSTRING(n) AS result',{n:n,sml:':'}) YIELD value
RETURN value.result AS result;
CALL apoc.custom.asProcedure(
'qabot',
'WITH LOWER($ask) AS query WITH query, custom.inference.search.qabot() AS graphDataSchema, custom.inference.weight.qabot() AS weight, custom.inference.match.qabot() AS nodeHitsRules, custom.inference.intended.qabot() AS intendedIntent, custom.inference.operators.parse(query) AS oper WITH oper,oper.query AS query,oper.operator AS operator,graphDataSchema,weight,nodeHitsRules,intendedIntent, olab.nlp.timeparser(oper.query) AS time,olab.nlp.pagenum.parse(oper.query) AS page WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:\' \'})) AS query WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, olab.hanlp.standard.segment(query) AS words WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH \'n\' AND olab.string.matchCnEn(mp.word)<>\'\') OR mp.nature=\'uw\')| m.word) AS words WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words, olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,\'EXACT\',words,{isMergeLabelHit:true,labelMergeDis:0.5}) AS entityRecognitionHits WITH oper,operator,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value WITH oper,operator,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit LIMIT 1 WITH oper,operator,graphDataSchema,intendedIntent,words,custom.inference.parseadd.qabot(entityRecognitionHit,time,page).entityRecognitionHit AS entityRecognitionHit WITH oper,operator,graphDataSchema,intendedIntent,words,entityRecognitionHit, apoc.convert.toJson(olab.intent.schema.parse(graphDataSchema,oper.query,words,intendedIntent)) AS intentSchema WHERE SIZE(apoc.convert.fromJsonList(intendedIntent))>SIZE(apoc.convert.fromJsonMap(intentSchema).graph.nodes) WITH operator,graphDataSchema,intentSchema,intendedIntent, olab.semantic.schema(graphDataSchema,intentSchema,apoc.convert.toJson(entityRecognitionHit)) AS semantic_schema WITH olab.semantic.cypher(apoc.convert.toJson(semantic_schema),intentSchema,-1,10,{},operator) AS cypher WITH REPLACE(cypher,\'RETURN n\',\'RETURN DISTINCT n\') AS cypher CALL apoc.cypher.run(cypher,{}) YIELD value WITH value SKIP 0 LIMIT 10 WITH olab.map.keys(value) AS keys,value UNWIND keys AS key WITH apoc.map.get(value,key) AS n CALL apoc.case([apoc.coll.contains([\'NODE\'],apoc.meta.cypher.type(n)),\'WITH $n AS n,LABELS($n) AS lbs WITH lbs[0] AS label,n.value AS value RETURN label+$sml+UPPER(TOSTRING(value)) AS result\'],\'WITH $n AS n RETURN TOSTRING(n) AS result\',{n:n,sml:\':\'}) YIELD value RETURN value.result AS result;',
'READ',
[['result','STRING']],
[['ask','STRING']],
'问答机器人'
);
CALL custom.qabot('火力发电行业博士学历的男性高管有多少位?') YIELD result RETURN result;
Question: 胡永乐是男性还是女性? (Is Hu Yongle male or female?)
Question category: 查询性别 (query gender)
// 1. Query sentence
WITH LOWER('胡永乐是男性还是女性?') AS query,'查询性别' AS classifier_type
// 2. Custom configuration: graph data schema / ontology weights / entity-matching rules / expected intents
WITH classifier_type,query,
custom.inference.search.qabot() AS graphDataSchema,
custom.inference.weight.qabot() AS weight,
custom.inference.match.qabot() AS nodeHitsRules,
// The expected-intent definition supports a sort parameter
custom.inference.intended.qabot() AS intendedIntent,
custom.inference.operators.parse(query) AS oper
// 3. Custom sentence parsing: time / page number
WITH classifier_type,oper,oper.query AS query,oper.operator AS operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,
olab.nlp.timeparser(oper.query) AS time,olab.nlp.pagenum.parse(oper.query) AS page
// 4. Filter time words out of the query
WITH classifier_type,oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:' '})) AS query
// 5. Segment the query after removing the time words
WITH classifier_type,oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.hanlp.standard.segment(query) AS words
// 6. Keep only nouns that are not pure digits from the segmentation result
WITH classifier_type,oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH 'n' AND olab.string.matchCnEn(mp.word)<>'') OR mp.nature='uw')| m.word) AS words
// 7. Entity recognition
WITH classifier_type,oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words,
olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,'EXACT',words,{isMergeLabelHit:true,labelMergeDis:0.5}) AS entityRecognitionHits
// 8. Build the weighted search queue
WITH classifier_type,oper,operator,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits
CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value
WITH classifier_type,oper,operator,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit LIMIT 1
// 9. Merge the custom parsing results into entityRecognitionHit
WITH classifier_type,oper,operator,graphDataSchema,intendedIntent,words,custom.inference.parseadd.qabot(entityRecognitionHit,time,page).entityRecognitionHit AS entityRecognitionHit
// 10. Intent recognition
WITH classifier_type,oper,operator,graphDataSchema,intendedIntent,words,entityRecognitionHit,
apoc.convert.toJson(olab.intent.schema.parse(graphDataSchema,oper.query,words,intendedIntent,classifier_type)) AS intentSchema
WHERE SIZE(apoc.convert.fromJsonList(intendedIntent))>SIZE(apoc.convert.fromJsonMap(intentSchema).graph.nodes)
// 11. Graph-context semantic parsing
WITH operator,graphDataSchema,intentSchema,intendedIntent,
olab.semantic.schema(graphDataSchema,intentSchema,apoc.convert.toJson(entityRecognitionHit)) AS semantic_schema
// 12. Query translation (skip is not set; each sub-query fetches a limited batch of results)
WITH olab.semantic.cypher(apoc.convert.toJson(semantic_schema),intentSchema,-1,10,{},operator) AS cypher
WITH REPLACE(cypher,'RETURN n','RETURN DISTINCT n') AS cypher
// 13. Run the query (value is a MAP whose number of keys is at most the number of intent categories returned by parsing)
CALL apoc.cypher.run(cypher,{}) YIELD value WITH value SKIP 0 LIMIT 10
WITH olab.map.keys(value) AS keys,value
UNWIND keys AS key
WITH apoc.map.get(value,key) AS n
CALL apoc.case([apoc.coll.contains(['NODE'],apoc.meta.cypher.type(n)),'WITH $n AS n,LABELS($n) AS lbs WITH lbs[0] AS label,n.value AS value RETURN label+$sml+UPPER(TOSTRING(value)) AS result'],'WITH $n AS n RETURN TOSTRING(n) AS result',{n:n,sml:':'}) YIELD value
RETURN value.result AS result;
// 1. Query sentence
WITH LOWER('火力发电行业博士学历的男性高管有多少位?') AS query
// 2. Custom configuration: graph data schema / ontology weights / entity-matching rules / expected intents
WITH query,
custom.inference.search.qabot() AS graphDataSchema,
custom.inference.weight.qabot() AS weight,
custom.inference.match.qabot() AS nodeHitsRules,
// The expected-intent definition supports a sort parameter
custom.inference.intended.qabot() AS intendedIntent,
custom.inference.operators.parse(query) AS oper
// 3. Custom sentence parsing: time / page number
WITH oper,oper.query AS query,oper.operator AS operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,
olab.nlp.timeparser(oper.query) AS time,olab.nlp.pagenum.parse(oper.query) AS page
// 4. Filter time words out of the query
WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:' '})) AS query
// 5. Segment the query after removing the time words
WITH oper,operator,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.hanlp.standard.segment(query) AS words
// 6. Keep only nouns that are not pure digits from the segmentation result
WITH oper,operator,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH 'n' AND olab.string.matchCnEn(mp.word)<>'') OR mp.nature='uw')| m.word) AS words
// 7. Entity recognition
WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words,
olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,'EXACT',words,{isMergeLabelHit:true,labelMergeDis:0.5}) AS entityRecognitionHits
// 8. Build the weighted search queue
WITH oper,operator,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits
CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value
WITH oper,operator,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit LIMIT 1
// 9. Merge the custom parsing results into entityRecognitionHit
WITH oper,operator,graphDataSchema,intendedIntent,words,custom.inference.parseadd.qabot(entityRecognitionHit,time,page).entityRecognitionHit AS entityRecognitionHit
// 10. Intent recognition
WITH operator,graphDataSchema,intendedIntent,words,entityRecognitionHit,
apoc.convert.toJson(olab.intent.schema.parse(graphDataSchema,oper.query,words,intendedIntent)) AS intentSchema
WHERE SIZE(apoc.convert.fromJsonList(intendedIntent))>SIZE(apoc.convert.fromJsonMap(intentSchema).graph.nodes)
// 11. Graph-context semantic parsing
WITH operator,graphDataSchema,intentSchema,intendedIntent,
olab.semantic.schema(graphDataSchema,intentSchema,apoc.convert.toJson(entityRecognitionHit)) AS semantic_schema
// 12. Query translation (skip is not set; each sub-query fetches a limited batch of results)
WITH olab.semantic.cypher(apoc.convert.toJson(semantic_schema),intentSchema,-1,10,{},operator) AS cypher
WITH REPLACE(cypher,'RETURN n','RETURN DISTINCT n') AS cypher
RETURN cypher;
CALL apoc.custom.asProcedure(
'qabot.cypher',
'WITH LOWER($ask) AS query WITH query, custom.inference.search.qabot() AS graphDataSchema, custom.inference.weight.qabot() AS weight, custom.inference.match.qabot() AS nodeHitsRules, custom.inference.intended.qabot() AS intendedIntent, custom.inference.operators.parse(query) AS oper WITH oper,oper.query AS query,oper.operator AS operator,graphDataSchema,weight,nodeHitsRules,intendedIntent, olab.nlp.timeparser(oper.query) AS time,olab.nlp.pagenum.parse(oper.query) AS page WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:\' \'})) AS query WITH oper,operator,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, olab.hanlp.standard.segment(query) AS words WITH oper,operator,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH \'n\' AND olab.string.matchCnEn(mp.word)<>\'\') OR mp.nature=\'uw\')| m.word) AS words WITH oper,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words, olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,\'EXACT\',words,{isMergeLabelHit:true,labelMergeDis:0.5}) AS entityRecognitionHits WITH oper,operator,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value WITH oper,operator,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit LIMIT 1 WITH oper,operator,graphDataSchema,intendedIntent,words,custom.inference.parseadd.qabot(entityRecognitionHit,time,page).entityRecognitionHit AS entityRecognitionHit WITH operator,graphDataSchema,intendedIntent,words,entityRecognitionHit, apoc.convert.toJson(olab.intent.schema.parse(graphDataSchema,oper.query,words,intendedIntent)) AS intentSchema WHERE SIZE(apoc.convert.fromJsonList(intendedIntent))>SIZE(apoc.convert.fromJsonMap(intentSchema).graph.nodes) WITH operator,graphDataSchema,intentSchema,intendedIntent, olab.semantic.schema(graphDataSchema,intentSchema,apoc.convert.toJson(entityRecognitionHit)) AS semantic_schema WITH olab.semantic.cypher(apoc.convert.toJson(semantic_schema),intentSchema,-1,10,{},operator) AS cypher WITH REPLACE(cypher,\'RETURN n\',\'RETURN DISTINCT n\') AS cypher RETURN cypher;',
'READ',
[['cypher','STRING']],
[['ask','STRING']],
'问答机器人:生成查询语句'
);
CALL custom.qabot.cypher('火力发电行业博士学历的男性高管有多少位?') YIELD cypher RETURN cypher;
// 1. Query sentence
WITH LOWER('火力发电行业博士学历的男性高管有多少位?') AS query
// 2. Custom configuration: graph data schema / ontology weights / entity-matching rules / expected intents
WITH query,
custom.inference.search.qabot() AS graphDataSchema,
custom.inference.weight.qabot() AS weight,
custom.inference.match.qabot() AS nodeHitsRules,
// The expected-intent definition supports a sort parameter
custom.inference.intended.qabot() AS intendedIntent
// 3. Custom sentence parsing: time / page number
WITH query AS oper,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,
olab.nlp.timeparser(query) AS time,olab.nlp.pagenum.parse(query) AS page
// 4. Filter time words out of the query
WITH oper,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:' '})) AS query
// 5. Segment the query after removing the time words
WITH oper,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.hanlp.standard.segment(query) AS words
// 6. Keep only nouns that are not pure digits from the segmentation result
WITH oper,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH 'n' AND olab.string.matchCnEn(mp.word)<>'') OR mp.nature='uw')| m.word) AS words
// 7. Entity recognition
WITH oper,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words,
olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,'EXACT',words,{isMergeLabelHit:true,labelMergeDis:0.5}) AS entityRecognitionHits
// 8. Build the weighted search queue
WITH oper,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits
CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value
WITH oper,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit LIMIT 1
// 9. Merge the custom parsing results into entityRecognitionHit
WITH oper,graphDataSchema,intendedIntent,words,custom.inference.parseadd.qabot(entityRecognitionHit,time,page).entityRecognitionHit AS entityRecognitionHit
// 10. Intent recognition
WITH graphDataSchema,intendedIntent,words,entityRecognitionHit,
apoc.convert.toJson(olab.intent.schema.parse(graphDataSchema,oper,words,intendedIntent)) AS intentSchema
WHERE SIZE(apoc.convert.fromJsonList(intendedIntent))>SIZE(apoc.convert.fromJsonMap(intentSchema).graph.nodes)
// 11. Graph-context semantic parsing
WITH graphDataSchema,intentSchema,intendedIntent,
olab.semantic.schema(graphDataSchema,intentSchema,apoc.convert.toJson(entityRecognitionHit)) AS semantic_schema
// 12. Query translation (skip is not set; each sub-query fetches a limited batch of results)
WITH olab.semantic.cypher(apoc.convert.toJson(semantic_schema),'',-1,10,{}) AS cypher
// 13. Run the query (value is a MAP whose number of keys is at most the number of intent categories returned by parsing)
CALL apoc.cypher.run(cypher,{}) YIELD value WITH value SKIP 0 LIMIT 10
WITH value.graph AS graph
UNWIND graph AS path
RETURN path;
CALL apoc.custom.asProcedure(
'qabot.graph',
'WITH LOWER($ask) AS query WITH query, custom.inference.search.qabot() AS graphDataSchema, custom.inference.weight.qabot() AS weight, custom.inference.match.qabot() AS nodeHitsRules, custom.inference.intended.qabot() AS intendedIntent WITH query AS oper,query,graphDataSchema,weight,nodeHitsRules,intendedIntent, olab.nlp.timeparser(query) AS time,olab.nlp.pagenum.parse(query) AS page WITH oper,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:\' \'})) AS query WITH oper,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, olab.hanlp.standard.segment(query) AS words WITH oper,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH \'n\' AND olab.string.matchCnEn(mp.word)<>\'\') OR mp.nature=\'uw\')| m.word) AS words WITH oper,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words, olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,\'EXACT\',words,{isMergeLabelHit:true,labelMergeDis:0.5}) AS entityRecognitionHits WITH oper,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value WITH oper,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit LIMIT 1 WITH oper,graphDataSchema,intendedIntent,words,custom.inference.parseadd.qabot(entityRecognitionHit,time,page).entityRecognitionHit AS entityRecognitionHit WITH graphDataSchema,intendedIntent,words,entityRecognitionHit, apoc.convert.toJson(olab.intent.schema.parse(graphDataSchema,oper,words,intendedIntent)) AS intentSchema WHERE SIZE(apoc.convert.fromJsonList(intendedIntent))>SIZE(apoc.convert.fromJsonMap(intentSchema).graph.nodes) WITH graphDataSchema,intentSchema,intendedIntent, olab.semantic.schema(graphDataSchema,intentSchema,apoc.convert.toJson(entityRecognitionHit)) AS semantic_schema WITH olab.semantic.cypher(apoc.convert.toJson(semantic_schema),\'\',-1,10,{}) AS cypher CALL apoc.cypher.run(cypher,{}) YIELD value WITH value SKIP 0 LIMIT 10 WITH value.graph AS graph UNWIND graph AS path RETURN path;',
'READ',
[['path','PATH']],
[['ask','STRING']],
'问答机器人:问答推理图谱'
);
CALL custom.qabot.graph('火力发电行业博士学历的男性高管有多少位?') YIELD path RETURN path;
// 1. Query sentence
WITH LOWER('2023年三月六日上市的股票代码?') AS query
// 2. Custom configuration: graph data schema / ontology weights / entity-matching rules / expected intents
WITH query,query AS raw_query,
custom.inference.search.qabot() AS graphDataSchema,
custom.inference.weight.qabot() AS weight,
custom.inference.match.qabot() AS nodeHitsRules,
// The expected-intent definition supports a sort parameter
custom.inference.intended.qabot() AS intendedIntent,
custom.inference.operators.parse(query) AS oper
// 3. Custom sentence parsing: time / page number
WITH raw_query,oper.query AS query,oper.operator AS operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,
olab.nlp.timeparser(oper.query) AS time,olab.nlp.pagenum.parse(oper.query) AS page
// 4. Filter time words out of the query
WITH raw_query,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:' '})) AS query
// 5. Segment the query after removing the time words
WITH raw_query,operator,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
olab.hanlp.standard.segment(query) AS words
// 6. Keep only nouns that are not pure digits from the segmentation result
WITH raw_query,operator,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,
EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH 'n' AND olab.string.matchCnEn(mp.word)<>'') OR mp.nature='uw')| m.word) AS words
// 7. Entity recognition
WITH raw_query,query,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words,
olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,'EXACT',words,{isMergeLabelHit:true,labelMergeDis:0.5}) AS entityRecognitionHits
// 8. Build the weighted search queue
WITH raw_query,query,operator,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits
CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value
WITH raw_query,query,operator,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit
WITH raw_query,entityRecognitionHit.entities AS map
WITH raw_query,olab.map.keys(map) AS keys,map
WITH raw_query,REDUCE(l=[],key IN keys | l+{raw:key,rep:FILTER(e IN apoc.map.get(map,key,NULL,FALSE) WHERE SIZE(key)>2)[0].labels[0]}) AS reps
WITH raw_query,FILTER(e IN reps WHERE e.rep IS NOT NULL) AS reps
WITH raw_query,olab.replace(raw_query,reps) AS re_query
WITH raw_query,re_query,olab.editDistance(raw_query,re_query) AS score WHERE score<1
RETURN raw_query,re_query,score ORDER BY score DESC
CALL apoc.custom.asProcedure(
'qabot.recommend_list',
'WITH LOWER($qa) AS query WITH query,query AS raw_query, custom.inference.search.qabot() AS graphDataSchema, custom.inference.weight.qabot() AS weight, custom.inference.match.qabot() AS nodeHitsRules, custom.inference.intended.qabot() AS intendedIntent, custom.inference.operators.parse(query) AS oper WITH raw_query,oper.query AS query,oper.operator AS operator,graphDataSchema,weight,nodeHitsRules,intendedIntent, olab.nlp.timeparser(oper.query) AS time,olab.nlp.pagenum.parse(oper.query) AS page WITH raw_query,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, olab.replace(query,REDUCE(l=[],mp IN time.list | l+{raw:mp.text,rep:\' \'})) AS query WITH raw_query,operator,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, olab.hanlp.standard.segment(query) AS words WITH raw_query,operator,query,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page, EXTRACT(m IN FILTER(mp IN words WHERE (mp.nature STARTS WITH \'n\' AND olab.string.matchCnEn(mp.word)<>\'\') OR mp.nature=\'uw\')| m.word) AS words WITH raw_query,query,operator,graphDataSchema,weight,nodeHitsRules,intendedIntent,time,page,words, olab.entity.recognition(graphDataSchema,nodeHitsRules,NULL,\'EXACT\',words,{isMergeLabelHit:true,labelMergeDis:0.4}) AS entityRecognitionHits WITH raw_query,query,operator,graphDataSchema,weight,intendedIntent,time,page,words,entityRecognitionHits CALL olab.entity.ptmd.queue(graphDataSchema,entityRecognitionHits,weight) YIELD value WITH raw_query,query,operator,graphDataSchema,intendedIntent,time,page,words,value AS entityRecognitionHit WITH raw_query,entityRecognitionHit.entities AS map WITH raw_query,olab.map.keys(map) AS keys,map WITH raw_query,REDUCE(l=[],key IN keys | l+{raw:key,rep:FILTER(e IN apoc.map.get(map,key,NULL,FALSE) WHERE SIZE(key)>2)[0].labels[0]}) AS reps WITH raw_query,FILTER(e IN reps WHERE e.rep IS NOT NULL) AS reps WITH raw_query,olab.replace(raw_query,reps) AS re_query WITH raw_query,re_query,olab.editDistance(raw_query,re_query) AS score WHERE score<1 AND score>0.6 RETURN DISTINCT raw_query,re_query,score ORDER BY score DESC LIMIT 10',
'READ',
[['raw_query','STRING'],['re_query','STRING'],['score','NUMBER']],
[['qa','STRING']],
'推荐问题列表:自动推荐'
);
CALL custom.qabot.recommend_list('2023年三月六日上市的股票代码?') YIELD raw_query,re_query,score RETURN raw_query,re_query,score
Some sample questions can be configured on the Graph QABot front-end page for quick browsing; the script below stores them, and a read-back query follows it.
WITH '[{"qa":"火力发电行业博士学历的男性高管有多少位?","label":"学历"},{"qa":"山西都有哪些上市公司?","label":"地域"},{"qa":"富奥股份的高管都是什么学历?","label":"学历"},{"qa":"中国宝安属于什么行业?","label":"股票"},{"qa":"建筑工程行业有多少家上市公司?","label":"行业"},{"qa":"刘卫国是哪个公司的高管?","label":"高管"},{"qa":"美丽生态上市时间是什么时候?","label":"时间"},{"qa":"山西的上市公司有多少家?","label":"地域"},{"qa":"博士学历的高管都有哪些?","label":"学历"},{"qa":"上市公司是博士学历的高管有多少个?","label":"学历"},{"qa":"刘卫国是什么学历?","label":"高管"},{"qa":"富奥股份的男性高管有多少个?","label":"高管"},{"qa":"同在火力发电行业的上市公司有哪些?","label":"行业"},{"qa":"同在火力发电行业的上市公司有多少家?","label":"行业"},{"qa":"大悦城和荣盛发展是同一个行业嘛?","label":"股票"},{"qa":"同在河北的上市公司有哪些?","label":"股票"},{"qa":"神州高铁是什么时候上市的?","label":"时间"},{"qa":"火力发电行业男性高管有多少个?","label":"高管"},{"qa":"2023年三月六日上市的股票代码?","label":"股票"},{"qa":"2023年三月六日上市的股票有哪些?","label":"股票"},{"qa":"2023年三月六日上市的股票有多少个?","label":"股票"},{"qa":"胡永乐是什么性别?","label":"性别"}]' AS list
WITH apoc.convert.fromJsonList(list) AS list
UNWIND list AS map
WITH map
MERGE (n:DEMO_QA {qa:map.qa,label:map.label});
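The stored sample questions can then be read back by the front end; a minimal query sketch:
MATCH (n:DEMO_QA)
RETURN n.label AS label, n.qa AS qa
ORDER BY label;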