Rasa: actions backed by a knowledge base

This article is translated from: https://rasa.com/docs/action-server/knowledge-bases
For study and reference only.

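1. Why introduce a knowledge base?

In a conversation, users do not always refer to objects by name. They often use referring expressions such as "the first one" or "it", so we need to keep track of the objects that were presented in order to resolve such mentions to the correct object.

Users may also want detailed information about an object during the conversation, for example: who stars in The Matrix? Because object information varies so much, hard-coding it is too much work, so Rasa provides a knowledge base integration to address this challenge.

To use this integration, create a custom action that inherits from ActionQueryKnowledgeBase.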

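2. Create a knowledge base

To get familiar with the integration, I will start with InMemoryKnowledgeBase. What does that mean? The knowledge base is held in memory rather than stored in a database. Only when the amount of data becomes very large do you need to create your own knowledge base (to be looked at later).

To initialize an InMemoryKnowledgeBase we need a JSON file. The example below contains information about three restaurants and three hotels. The top-level keys (restaurant, hotel) are the object types, and each maps to a list of objects made up of key-value pairs. Every object of the same type should have the same keys. The id and name keys are required; if you do not want to use them, you have to customize InMemoryKnowledgeBase.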

{
    "restaurant": [
        {
            "id": 0,
            "name": "Donath",
            "cuisine": "Italian",
            "outside-seating": true,
            "price-range": "mid-range"
        },
        {
            "id": 1,
            "name": "Berlin Burrito Company",
            "cuisine": "Mexican",
            "outside-seating": false,
            "price-range": "cheap"
        },
        {
            "id": 2,
            "name": "I due forni",
            "cuisine": "Italian",
            "outside-seating": true,
            "price-range": "mid-range"
        }
    ],
    "hotel": [
        {
            "id": 0,
            "name": "Hilton",
            "price-range": "expensive",
            "breakfast-included": true,
            "city": "Berlin",
            "free-wifi": true,
            "star-rating": 5,
            "swimming-pool": true
        },
        {
            "id": 1,
            "name": "Hilton",
            "price-range": "expensive",
            "breakfast-included": true,
            "city": "Frankfurt am Main",
            "free-wifi": true,
            "star-rating": 4,
            "swimming-pool": false
        },
        {
            "id": 2,
            "name": "B&B",
            "price-range": "mid-range",
            "breakfast-included": false,
            "city": "Berlin",
            "free-wifi": false,
            "star-rating": 1,
            "swimming-pool": false
        }
    ]
}
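
The constraints described above (every object needs id and name, and objects of the same type share the same keys) can be checked before the file is handed to InMemoryKnowledgeBase. The following is only a minimal sketch: validate_knowledge_base is a hypothetical helper, not part of rasa_sdk, and SAMPLE is a shortened stand-in for the JSON file.

```python
import json

# Hypothetical, shortened sample in the same shape as the JSON file above.
SAMPLE = """
{
    "restaurant": [
        {"id": 0, "name": "Donath", "cuisine": "Italian"},
        {"id": 1, "name": "Berlin Burrito Company", "cuisine": "Mexican"}
    ]
}
"""


def validate_knowledge_base(data):
    """Return a list of problems; an empty list means the data looks fine."""
    problems = []
    for object_type, objects in data.items():
        if not objects:
            continue
        # Use the first object of each type as the reference key set.
        expected_keys = set(objects[0])
        for obj in objects:
            for required in ("id", "name"):
                if required not in obj:
                    problems.append(f"{object_type}: {obj} is missing '{required}'")
            if set(obj) != expected_keys:
                problems.append(f"{object_type}: {obj} has different keys")
    return problems


data = json.loads(SAMPLE)
problems = validate_knowledge_base(data)
print(problems)
```

Note that json.loads also rejects trailing commas, so a check like this catches syntax slips in the file as well.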

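3. Define the NLU data

Define a new intent, query_knowledge_base, so that the bot knows the user wants to retrieve something from the knowledge base;
annotate the mentioned entities so that the model can detect referring expressions such as "the first one";
use synonyms extensively.
ActionQueryKnowledgeBase can handle two kinds of requests:

querying a list of objects of a given type; in the example above, asking which restaurants there are returns a list;
querying a specific attribute of one object, which is a finer-grained query, for example the cuisine of the restaurant Donath.

The intent should therefore contain many variations of both kinds of request.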
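
To make the two request types concrete, here is a deliberately simplified sketch of how an action could choose between them based on which slots are filled. choose_query is a hypothetical helper written for illustration; the decision logic inside the real ActionQueryKnowledgeBase is more involved.

```python
from typing import Optional


def choose_query(object_type: Optional[str], attribute: Optional[str]) -> str:
    """Pick the kind of knowledge base query (simplified illustration)."""
    if attribute is not None:
        # "What cuisine is it?" -> look up one attribute of one object.
        return "query_attribute"
    if object_type is not None:
        # "List some restaurants" -> list objects of that type.
        return "query_objects"
    # Nothing usable was extracted, so ask the user to rephrase.
    return "ask_rephrase"


print(choose_query("restaurant", None))
print(choose_query(None, "cuisine"))
```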


- intent: query_knowledge_base
  examples: |
    - what [restaurants]{"entity": "object_type", "value": "restaurant"} can you recommend?
    - list some [restaurants]{"entity": "object_type", "value": "restaurant"}
    - can you name some [restaurants]{"entity": "object_type", "value": "restaurant"} please?
    - can you show me some [restaurant]{"entity": "object_type", "value": "restaurant"} options
    - list [German]{"entity": "cuisine"} [restaurants]{"entity": "object_type", "value": "restaurant"}
    - do you have any [mexican]{"entity": "cuisine"} [restaurants]{"entity": "object_type", "value": "restaurant"}?
    - do you know the [price range]{"entity": "attribute", "value": "price-range"} of [that one]{"entity": "mention"}?
    - what [cuisine]{"entity": "attribute"} is [it]{"entity": "mention"}?
    - do you know what [cuisine]{"entity": "attribute"} the [last one]{"entity": "mention", "value": "LAST"} has?
    - does [Donath]{"entity": "restaurant"} have [outside seating]{"entity": "attribute", "value": "outside-seating"}?
    - what is the [price range]{"entity": "attribute", "value": "price-range"} of [Berlin Burrito Company]{"entity": "restaurant"}?
    - what is with [I due forni]{"entity": "restaurant"}?
    - Do you also have any [Vietnamese]{"entity": "cuisine"} [restaurants]{"entity": "object_type", "value": "restaurant"}?
    - What about any [Mexican]{"entity": "cuisine", "value": "mexican"} [restaurants]{"entity": "object_type", "value": "restaurant"}?
    - Do you also know some [Italian]{"entity": "cuisine"} [restaurants]{"entity": "object_type", "value": "restaurant"}?
    - can you tell me the [price range]{"entity": "attribute", "value": "price-range"} of [that restaurant]{"entity": "mention"}?
    - what [cuisine]{"entity": "attribute"} do [they]{"entity": "mention"} have?
    - what [hotels]{"entity": "object_type", "value": "hotel"} can you recommend?
    - please list some [hotels]{"entity": "object_type", "value": "hotel"} in [Frankfurt am Main]{"entity": "city"} for me
    - what [hotels]{"entity": "object_type", "value": "hotel"} do you know in [Berlin]{"entity": "city"}?
    - name some [hotels]{"entity": "object_type", "value": "hotel"} in [Berlin]{"entity": "city"}
    - show me some [hotels]{"entity": "object_type", "value": "hotel"}
    - what are [hotels]{"entity": "object_type", "value": "hotel"} in [Berlin]{"entity": "city"}
    - does the [last]{"entity": "mention", "value": "LAST"} one offer [breakfast]{"entity": "attribute", "value": "breakfast-included"}?
    - does the [second one]{"entity": "mention", "value": "2"} [include breakfast]{"entity": "attribute", "value": "breakfast-included"}?
    - what is the [price range]{"entity": "attribute", "value": "price-range"} of the [second]{"entity": "mention", "value": "2"} hotel?
    - does the [first]{"entity": "mention", "value": "1"} one have [wifi]{"entity": "attribute", "value": "free-wifi"}?
    - does the [third]{"entity": "mention", "value": "3"} one have a [swimming pool]{"entity": "attribute", "value": "swimming-pool"}?
    - what is the [star rating]{"entity": "attribute", "value": "star-rating"} of [Berlin Wall Hostel]{"entity": "hotel"}?
    - Does the [Hilton]{"entity": "hotel"} have a [swimming pool]{"entity": "attribute", "value": "swimming-pool"}?

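The following entities must be specified and annotated in the NLU data:

object_type: whenever the NLU data refers to a specific object type from the knowledge base, that object type should be marked as an entity, for example restaurant, which is a key in the knowledge base;
mention: if the user refers to an object with "the first one", "that one" or "it", these terms should be marked as mention;
attribute: all attributes in the knowledge base should be marked as attribute in the NLU data; synonyms can be used to map attribute names to the names used in the knowledge base.

These entities also have to be added to the domain: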

entities:
  - object_type
  - mention
  - attribute
 
slots:
  object_type:
    type: unfeaturized
  mention:
    type: unfeaturized
  attribute:
    type: unfeaturized

4. Create an action to query your knowledge base

action.py
from rasa_sdk.knowledge_base.storage import InMemoryKnowledgeBase
from rasa_sdk.knowledge_base.actions import ActionQueryKnowledgeBase


class MyKnowledgeBaseAction(ActionQueryKnowledgeBase):
    def __init__(self):
        # Load the knowledge base from the JSON file shown in section 2.
        knowledge_base = InMemoryKnowledgeBase("data.json")
        # The parent class needs the knowledge base to answer queries.
        super().__init__(knowledge_base)

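You must pass a knowledge base to the constructor of the parent class. Here it is an InMemoryKnowledgeBase, but it can also be your own knowledge base. Note that information can only be pulled from one knowledge base; using several knowledge bases at the same time is not yet supported.

Do not forget to add the action to the domain (ActionQueryKnowledgeBase uses the name action_query_knowledge_base by default).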

domain.yml
actions:
- action_query_knowledge_base

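5. Querying knowledge base objects

To query the knowledge base for any kind of object, the user's request must include the object type. Let's look at an example:

User: Can you please name some restaurants?

This question contains "restaurants"; the bot needs to extract this entity in order to form the query.

User: What Italian restaurant options in Berlin do I have?

In this question the user wants a list of restaurants that (1) serve Italian cuisine and (2) are located in Berlin. If the NER picks up these attributes, the action uses them to filter the restaurants in the knowledge base.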

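The NLU data has to support this accordingly:

nlu:
- intent: query_knowledge_base
  examples: |
    - What [Italian](cuisine) [restaurant](object_type) options in [Berlin](city) do I have?

The entity names cuisine and city must match the attribute names in the knowledge base, and both have to be added to the domain as entities and slots.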
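
The filtering step described above can be sketched in plain Python. This is only an illustration under assumptions: filter_objects is a hypothetical helper, the case-insensitive comparison is my choice, and HOTELS is a shortened copy of the hotel entries from section 2 (those are used because they carry a city key).

```python
# Shortened copy of the hotel objects from the knowledge base JSON.
HOTELS = [
    {"id": 0, "name": "Hilton", "city": "Berlin", "price-range": "expensive"},
    {"id": 1, "name": "Hilton", "city": "Frankfurt am Main", "price-range": "expensive"},
    {"id": 2, "name": "B&B", "city": "Berlin", "price-range": "mid-range"},
]


def filter_objects(objects, filters, limit=5):
    """Keep objects whose attributes match every extracted entity value."""
    matches = [
        obj
        for obj in objects
        if all(
            str(obj.get(key)).lower() == str(value).lower()
            for key, value in filters.items()
        )
    ]
    return matches[:limit]


berlin_hotels = filter_objects(HOTELS, {"city": "Berlin"})
print([hotel["id"] for hotel in berlin_hotels])
```

Each extracted entity becomes one key in filters, which is why the entity names have to match the attribute names in the knowledge base.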

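6. Querying an object's attributes

This kind of query is used when the user wants specific information about an object. The request should contain both the object and the attribute of interest, for example:

User: What is the cuisine of Berlin Burrito Company?

Here cuisine is the attribute of interest and Berlin Burrito Company is the object of interest.

The NLU data needs training examples for this as well: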

nlu:
- intent: query_knowledge_base
  examples: |
    - What is the [cuisine](attribute) of [Berlin Burrito Company](restaurant)?
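
As a rough illustration of what an attribute query boils down to, the sketch below finds the object by name and reads one attribute from it. get_attribute is a hypothetical helper and RESTAURANTS is a shortened copy of the data from section 2; the real action works through the knowledge base class rather than a plain list.

```python
# Shortened copy of the restaurant objects from the knowledge base JSON.
RESTAURANTS = [
    {"id": 0, "name": "Donath", "cuisine": "Italian"},
    {"id": 1, "name": "Berlin Burrito Company", "cuisine": "Mexican"},
    {"id": 2, "name": "I due forni", "cuisine": "Italian"},
]


def get_attribute(objects, name, attribute):
    """Return the attribute value of the object with the given name, or None."""
    for obj in objects:
        if obj["name"].lower() == name.lower():
            return obj.get(attribute)
    return None


print(get_attribute(RESTAURANTS, "Berlin Burrito Company", "cuisine"))
```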

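7. Resolving mentions

Continuing the example above, the user may not use a specific name to refer to a restaurant and may instead refer back to a previously listed object, for example:

User: What is the cuisine of the second restaurant you mentioned?

Rasa can resolve two kinds of mentions: (1) ordinal mentions such as "the first one", and (2) mentions such as "it" or "that one".

8. Ordinal mentions

When the user refers to an object by its position in a list, this is called an ordinal mention, for example: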

User: What restaurants in Berlin do you know?

Bot: Found the following objects of type ‘restaurant’: 1: I due forni 2: PastaBar 3: Berlin Burrito Company

User: Does the first one have outside seating?

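Ordinal mentions are typically used after a list of objects has been shown to the user. To resolve such mentions to actual objects, KnowledgeBase defines an ordinal mention mapping in /rasa-sdk/knowledge_base/storage.py: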

storage.py
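import random


class KnowledgeBase:
    def __init__(self) -> None:
        # Maps the value of a mention entity to a function that picks
        # the matching item from the list of objects shown last.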
        self.ordinal_mention_mapping = {
            "1": lambda l: l[0],
            "2": lambda l: l[1],
            "3": lambda l: l[2],
            "4": lambda l: l[3],
            "5": lambda l: l[4],
            "6": lambda l: l[5],
            "7": lambda l: l[6],
            "8": lambda l: l[7],
            "9": lambda l: l[8],
            "10": lambda l: l[9],
            "ANY": lambda l: random.choice(l),
            "LAST": lambda l: l[-1],
        }
        # ... (rest of the class omitted)

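If the user says "the first one", the entity value in the NLU data (a synonym) maps the phrase to the key "1" of this mapping:

nlu:
- intent: query_knowledge_base
  examples: |
    - Does the [first one]{"entity": "mention", "value": "1"} have [outside seating]{"entity": "attribute", "value": "outside-seating"}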
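
How such a mapping turns an entity value into an object can be sketched as follows. The mapping is a shortened copy of the one in storage.py; resolve_ordinal and the handling of out-of-range positions are assumptions made for illustration, not the library's code.

```python
import random

# Shortened copy of the ordinal mention mapping shown above.
ORDINAL_MENTION_MAPPING = {
    "1": lambda l: l[0],
    "2": lambda l: l[1],
    "3": lambda l: l[2],
    "ANY": lambda l: random.choice(l),
    "LAST": lambda l: l[-1],
}

# Objects in the order in which the bot listed them.
listed = ["I due forni", "PastaBar", "Berlin Burrito Company"]


def resolve_ordinal(mention, listed_objects):
    """Resolve a mention value such as "1" or "LAST" to a listed object."""
    resolver = ORDINAL_MENTION_MAPPING.get(mention)
    if resolver is None or not listed_objects:
        return None
    try:
        return resolver(listed_objects)
    except IndexError:
        # For example "3" when only two objects were listed.
        return None


print(resolve_ordinal("1", listed))
print(resolve_ordinal("LAST", listed))
```

Because the lookup key is the entity value rather than the surface text, every phrasing of "the first one" must be annotated with the value "1" in the training data.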

9. Other mentions

Consider the following conversation:

User: What is the cuisine of PastaBar?

Bot: PastaBar has an Italian cuisine.

User: Does it have wifi?

Bot: Yes.

User: Can you give me an address

If the NER detects "it" as a mention, the knowledge base action resolves it to the last object mentioned in the conversation, "PastaBar".
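
The idea of falling back to the last mentioned object can be sketched with a small piece of conversation state. ConversationMemory and its methods are hypothetical names used only for illustration; in Rasa this state is kept in slots managed by the action.

```python
class ConversationMemory:
    """Remembers the object most recently talked about (illustrative only)."""

    def __init__(self):
        self.last_object = None

    def remember(self, name):
        # Called whenever the bot answers a question about a specific object.
        self.last_object = name

    def resolve(self, mention):
        # Non-ordinal mentions such as "it" or "that one" carry no position,
        # so they resolve to whatever object was mentioned last.
        return self.last_object


memory = ConversationMemory()
memory.remember("PastaBar")          # "What is the cuisine of PastaBar?"
print(memory.resolve("it"))          # "Does it have wifi?"
```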
