Dialogue Systems with Rasa, Part 1: First Look and Getting It Running

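Contents

    • About Rasa
      • Rasa architecture
      • About dialogue systems
      • Installing rasa
    • Common rasa commands
    • rasa init
    • Running the examples
      • 1. rasa train: train the model
      • 2. rasa shell: start interactive mode
      • 3. rasa run actions
      • 4. Start the service on port 5005
      • 5. rasa visualize: generate a story graph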


About Rasa

Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

  • github: https://github.com/RasaHQ/rasa
  • Docs: https://rasa.com/docs/

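Rasa architecture

The Rasa framework has four parts:

  • Rasa NLU: extracts the user's intent and key contextual information
  • Rasa Core: chooses the best response and action based on the conversation history
  • Channels and actions: connect the bot to users and to backend services
  • Supporting systems such as the tracker store, lock store, and event broker

Related books and tutorials:

  • 《Rasa实战:构建开源对话机器人》 (Rasa in Practice: Building Open-Source Conversational Bots), by 孔晓泉 and 王冠
    https://book.douban.com/subject/35777057/
    Companion code:
    https://github.com/Chinese-NLP-book/rasa_chinese_book_code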

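About dialogue systems

Typical dialogue systems

  • IR-bot: retrieval-based question answering
    For example, single-turn Q&A that does not need to consult earlier context
  • Task-bot: task-oriented dialogue system
    For example, a ticket-booking system
  • Chitchat-bot: chitchat system
    For example, companion chatbots such as Microsoft XiaoIce

A task-oriented dialogue system mainly consists of:

  • Understanding module: ASR (automatic speech recognition) + SLU (spoken language understanding)
  • Dialogue management module: DST (dialogue state tracking) and DPO (dialogue policy optimization)
  • Generation module: NLG (natural language generation) + TTS (text-to-speech)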

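SLU: Spoken Language Understanding
Language understanding needs a formal way to represent what an utterance means. The traditional choice is a triple:

  • action: the intent
  • slot: a slot that needs to be filled
  • value: the value for that slot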
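To make the triple concrete, here is a minimal sketch of how such a semantic frame could be held in memory. The class name SemanticFrame, the intent book_ticket, and the slot names are all made up for illustration; this is not how Rasa stores its NLU output.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SemanticFrame:
    """Meaning of one utterance: an intent plus its slot/value pairs."""
    action: str                                          # the intent
    slots: Dict[str, str] = field(default_factory=dict)  # slot name -> value

    def triples(self) -> List[Tuple[str, str, str]]:
        """Flatten the frame into (action, slot, value) triples."""
        return [(self.action, slot, value) for slot, value in self.slots.items()]


# Hypothetical parse of "Book a ticket to Beijing for tomorrow".
frame = SemanticFrame(
    action="book_ticket",
    slots={"destination": "Beijing", "date": "tomorrow"},
)
print(frame.triples())
```

Each slot/value pair becomes one triple that shares the same action, which is the shape the list above describes.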

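DST: Dialogue State Tracking

  • The dialogue state should contain all the information needed to continue the conversation
  • The DST problem: given the latest system action and user action, update the dialogue state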
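The update step can be sketched as a plain function from (old state, latest system action, latest user action) to a new state. This is a hand-written illustration, not Rasa's tracker; the dictionary keys and the action names ask_destination and ask_date are assumptions.

```python
from typing import Any, Dict


def update_state(state: Dict[str, Any], system_action: str,
                 user_action: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dialogue state after one system/user exchange."""
    new_state = {
        # Keep the previous intent if the user did not express a new one.
        "intent": user_action.get("intent", state.get("intent")),
        # Copy the slots so the old state object is left untouched.
        "slots": dict(state.get("slots", {})),
        "last_system_action": system_action,
    }
    # Newly mentioned slot values overwrite older ones.
    new_state["slots"].update(user_action.get("slots", {}))
    return new_state


state = {"intent": None, "slots": {}, "last_system_action": None}
state = update_state(state, "ask_destination",
                     {"intent": "book_ticket", "slots": {"destination": "Beijing"}})
state = update_state(state, "ask_date", {"slots": {"date": "tomorrow"}})
print(state)
```

Real trackers are usually statistical or event-based; the point here is only that the state accumulates what earlier turns established.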

DPO: Dialogue Policy Optimization

  • How the system decides which action to take in response
    • Optimized as a sequential decision process: reinforcement learning

Natural language generation can also be done in several ways, for example:

  • Template-based
  • Grammar-rule-based
  • Generative-model-based
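Of the three, the template-based approach is the simplest to show. The sketch below fills slot values into fixed strings; the template names and wording are invented for the example and are not taken from any Rasa project.

```python
from typing import Dict, Optional

# Hypothetical system actions mapped to response templates.
TEMPLATES = {
    "ask_destination": "Where would you like to go?",
    "confirm_booking": "Booking a ticket to {destination} for {date}. Is that right?",
}


def generate(action: str, slots: Optional[Dict[str, str]] = None) -> str:
    """Render the template for a system action with the given slot values."""
    template = TEMPLATES[action]
    return template.format(**(slots or {}))


print(generate("ask_destination"))
print(generate("confirm_booking", {"destination": "Beijing", "date": "tomorrow"}))
```

A missing slot raises KeyError here, which is usually what you want while developing: it shows that the dialogue manager chose a response before the required slots were filled.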

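Installing rasa

On Linux, installing with pip is enough:

pip install rasa

During installation, pip may complain that some packages are missing (such as spacy) or that a version is unsuitable (in my case SQLAlchemy was reported as needing 2.0.0 or later); upgrading the package resolved it. Note, however, that the MovedIn20Warning in the rasa init log below recommends pinning "sqlalchemy<2.0", so check which version your Rasa release expects.
You can also create a separate environment (env) for the rasa project to avoid conflicts with other projects.
On macOS, whether I ran only this command or installed from source, pip reported success but calling rasa could still fail. I have not found the cause yet; feedback is welcome.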


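Common rasa commands

  • rasa init: creates a new project with example training data, configuration, and actions;
  • rasa train: trains a model using your NLU data, stories, and configuration;
  • rasa interactive: interactive training; talk to the bot, correct its mistakes, and export the conversation data;
  • rasa run: runs the rasa server;
  • rasa shell: starts a server as rasa run does, then opens a command-line chat with the bot;
  • rasa run actions: runs the rasa action server;
  • rasa x: starts the rasa x server;
  • rasa -h: prints help;
  • rasa telemetry: configures Rasa Open Source telemetry reporting;
  • rasa test: tests a Rasa model using test NLU data and stories;
  • rasa visualize: visualizes stories;
  • rasa data: utilities for training data;
  • rasa export: exports conversations through an event broker;
  • rasa evaluate: tools for evaluating models.

Every one of these commands accepts -h to show help and further options.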


rasa init

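Once rasa is installed, rasa init initializes a project, trains a model, and starts interactive mode. You can chat with the bot in English; to end the conversation, type /stop.
The log printed to the terminal shows the training process, where the model was saved, and so on.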

/home/xx/.local/lib/python3.7/site-packages/rasa/core/tracker_store.py:876: MovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
  Base: DeclarativeMeta = declarative_base()
┌────────────────────────────────────────────────────────────────────────────────┐
│ Rasa Open Source reports anonymous usage telemetry to help improve the product │
│ for all its users.                                                             │
│                                                                                │
│ If you'd like to opt-out, you can use `rasa telemetry disable`.                │
│ To learn more, check out https://rasa.com/docs/rasa/telemetry/telemetry.       │
└────────────────────────────────────────────────────────────────────────────────┘
Welcome to Rasa! 

To get started quickly, an initial project will be created.
If you need some help, check out the documentation at https://rasa.com/docs/rasa.
Now let's start! 

? Please enter a path where the project will be created [default: current directory]

Created project directory at '/home/xx/scode/rasa_demos/bot1'.
Finished creating project structure.
? Do you want to train an initial model?  Yes
Training an initial model...
/home/xx/.local/lib/python3.7/site-packages/past/types/oldstr.py:5: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  from collections import Iterable
/home/xx/.local/lib/python3.7/site-packages/past/builtins/misc.py:4: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  from collections import Mapping
The configuration for pipeline and policies was chosen automatically. It was written into the config file at 'config.yml'.
2023-02-01 19:30:10 INFO     rasa.engine.training.hooks  - Starting to train component 'RegexFeaturizer'.
2023-02-01 19:30:10 INFO     rasa.engine.training.hooks  - Finished training component 'RegexFeaturizer'.
2023-02-01 19:30:10 INFO     rasa.engine.training.hooks  - Starting to train component 'LexicalSyntacticFeaturizer'.
2023-02-01 19:30:10 INFO     rasa.engine.training.hooks  - Finished training component 'LexicalSyntacticFeaturizer'.
2023-02-01 19:30:10 INFO     rasa.engine.training.hooks  - Starting to train component 'CountVectorsFeaturizer'.
2023-02-01 19:30:10 INFO     rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer  - 80 vocabulary items were created for text attribute.
2023-02-01 19:30:10 INFO     rasa.engine.training.hooks  - Finished training component 'CountVectorsFeaturizer'.
2023-02-01 19:30:11 INFO     rasa.engine.training.hooks  - Starting to train component 'CountVectorsFeaturizer'.
2023-02-01 19:30:11 INFO     rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer  - 697 vocabulary items were created for text attribute.
2023-02-01 19:30:11 INFO     rasa.engine.training.hooks  - Finished training component 'CountVectorsFeaturizer'.
2023-02-01 19:30:11 INFO     rasa.engine.training.hooks  - Starting to train component 'DIETClassifier'.
Epochs: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:29<00:00,  3.41it/s, t_loss=1.1, i_acc=1]


2023-02-01 19:30:44 INFO     rasa.engine.training.hooks  - Finished training component 'DIETClassifier'.
2023-02-01 19:30:44 INFO     rasa.engine.training.hooks  - Starting to train component 'EntitySynonymMapper'.
2023-02-01 19:30:44 INFO     rasa.engine.training.hooks  - Finished training component 'EntitySynonymMapper'.
2023-02-01 19:30:44 INFO     rasa.engine.training.hooks  - Starting to train component 'ResponseSelector'.
2023-02-01 19:30:44 INFO     rasa.nlu.selectors.response_selector  - Retrieval intent parameter was left to its default value. This response selector will be trained on training examples combining all retrieval intents.
2023-02-01 19:30:44 INFO     rasa.engine.training.hooks  - Finished training component 'ResponseSelector'.

Processed story blocks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 949.87it/s, # trackers=1]
Processed story blocks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 435.97it/s, # trackers=3]
Processed story blocks: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 128.48it/s, # trackers=12]
Processed story blocks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 40.57it/s, # trackers=39]
Processed rules: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2016.49it/s, # trackers=1]
2023-02-01 19:30:44 INFO     rasa.engine.training.hooks  - Starting to train component 'MemoizationPolicy'.
Processed trackers: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 977.01it/s, # action=12]
Processed actions: 12it [00:00, 7184.08it/s, # examples=12]
2023-02-01 19:30:44 INFO     rasa.engine.training.hooks  - Finished training component 'MemoizationPolicy'.
2023-02-01 19:30:44 INFO     rasa.engine.training.hooks  - Starting to train component 'RulePolicy'.
Processed trackers: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 1059.17it/s, # action=5]
Processed actions: 5it [00:00, 12468.20it/s, # examples=4]
Processed trackers: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 1012.46it/s, # action=12]
Processed trackers: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 1449.81it/s]
Processed trackers: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 952.08it/s]


2023-02-01 19:30:45 INFO     rasa.engine.training.hooks  - Finished training component 'RulePolicy'.
2023-02-01 19:30:45 INFO     rasa.engine.training.hooks  - Starting to train component 'TEDPolicy'.

Processed trackers: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 120/120 [00:00<00:00, 1389.82it/s, # action=30]
Epochs: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:24<00:00,  4.15it/s, t_loss=0.7, loss=0.537, acc=1]

2023-02-01 19:31:10 INFO     rasa.engine.training.hooks  - Finished training component 'TEDPolicy'.
2023-02-01 19:31:10 INFO     rasa.engine.training.hooks  - Starting to train component 'UnexpecTEDIntentPolicy'.

2023-02-01 19:31:10 WARNING  rasa.shared.utils.common  - The UnexpecTED Intent Policy is currently experimental and might change or be removed in the future  Please share your feedback on it in the forum (https://forum.rasa.com) to help us make this feature ready for production.
Processed trackers: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 120/120 [00:00<00:00, 2348.53it/s, # intent=12]


Epochs: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:21<00:00,  4.70it/s, t_loss=0.125, loss=0.0121, acc=1]
2023-02-01 19:31:35 INFO     rasa.engine.training.hooks  - Finished training component 'UnexpecTEDIntentPolicy'.

Your Rasa model is trained and saved at 'models/20230201-193003-allegro-golf.tar.gz'.
? Do you want to speak to the trained assistant on the command line?  Yes

/home/xx/.local/lib/python3.7/site-packages/sanic_cors/extension.py:39: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  SANIC_VERSION = LooseVersion(sanic_version)
2023-02-01 19:37:27 INFO     root  - Connecting to channel 'cmdline' which was specified by the '--connector' argument. Any other channels will be ignored. To connect to all given channels, omit the '--connector' argument.
2023-02-01 19:37:27 INFO     root  - Starting Rasa server on http://0.0.0.0:5005
2023-02-01 19:37:29 INFO     rasa.core.processor  - Loading model models/20230201-193003-allegro-golf.tar.gz...
2023-02-01 19:37:50 WARNING  rasa.shared.utils.common  - The UnexpecTED Intent Policy is currently experimental and might change or be removed in the future  Please share your feedback on it in the forum (https://forum.rasa.com) to help us make this feature ready for production.
2023-02-01 19:38:02 INFO     root  - Rasa server is up and running.
Bot loaded. Type a message and press enter (use '/stop' to exit): 
Your input ->                                                                                                                                                                                                
Great, carry on!
Your input ->  hello                                                                                                                                                                                         
Hey! How are you?
Your input ->  are you ok                                                                                                                                                                                    
I am a bot, powered by Rasa.
Your input ->  what to eat tonight?                                                                                                                                                                          
Bye

The generated project has the following directory layout:

.
├── actions
│   ├── actions.py
│   ├── __init__.py
│   └── __pycache__
│       ├── actions.cpython-37.pyc
│       └── __init__.cpython-37.pyc
├── config.yml
├── credentials.yml
├── data
│   ├── nlu.yml
│   ├── rules.yml
│   └── stories.yml
├── domain.yml
├── endpoints.yml
├── models
│   └── 20230201-193003-allegro-golf.tar.gz
└── tests
    └── test_stories.yml

For a description of the data files, see: https://blog.csdn.net/lovechris00/article/details/128882414


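Running the examples

Download the source from https://github.com/RasaHQ/rasa. Under rasa-main/examples you will find the following examples:

concertbot  e2ebot  formbot knowledgebasebot  moodbot  nlg_server  reminderbot  responseselectorbot  rules

Enter one of them, such as reminderbot. Each example has its own README.md that you can follow.

1. rasa train: train the model

Run rasa train to train a model. The result is saved under reminderbot/models/. Mine was named 20230202-202229-binary-string.tar.gz, so the file name starts with a timestamp.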
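Because the archive names begin with a timestamp, they can be sorted to find the newest model. The sketch below assumes the prefix has the form YYYYMMDD-HHMMSS, which matches the two file names seen in this article; the helper names model_trained_at and latest_model are my own and are not part of Rasa.

```python
from datetime import datetime
from typing import Iterable


def model_trained_at(filename: str) -> datetime:
    """Parse the leading YYYYMMDD-HHMMSS stamp of a model archive name."""
    date_part, time_part = filename.split("-")[:2]
    return datetime.strptime(f"{date_part}-{time_part}", "%Y%m%d-%H%M%S")


def latest_model(filenames: Iterable[str]) -> str:
    """Return the archive name with the most recent timestamp."""
    return max(filenames, key=model_trained_at)


# The two model names that appear in this article.
names = [
    "20230201-193003-allegro-golf.tar.gz",
    "20230202-202229-binary-string.tar.gz",
]
print(model_trained_at(names[1]))
print(latest_model(names))
```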


2. rasa shell: start interactive mode

This requires a trained model in the models folder.
For example, in my case:

reminderbot$ rasa shell
/home/xx/.local/lib/python3.7/site-packages/rasa/core/tracker_store.py:876: MovedIn20Warning: The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base(). (deprecated since: 2.0) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
  Base: DeclarativeMeta = declarative_base()
/home/xx/.local/lib/python3.7/site-packages/past/types/oldstr.py:5: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  from collections import Iterable
/home/xx/.local/lib/python3.7/site-packages/past/builtins/misc.py:4: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  from collections import Mapping
/home/xx/.local/lib/python3.7/site-packages/sanic_cors/extension.py:39: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  SANIC_VERSION = LooseVersion(sanic_version)
2023-02-04 10:46:08 INFO     root  - Connecting to channel 'cmdline' which was specified by the '--connector' argument. Any other channels will be ignored. To connect to all given channels, omit the '--connector' argument.
2023-02-04 10:46:08 INFO     root  - Starting Rasa server on http://0.0.0.0:5005
2023-02-04 10:46:10 INFO     rasa.core.processor  - Loading model models/20230202-202229-binary-string.tar.gz...
/home/xx/.local/lib/python3.7/site-packages/rasa/utils/train_utils.py:531: UserWarning: constrain_similarities is set to `False`. It is recommended to set it to `True` when using cross-entropy loss.
  category=UserWarning,
2023-02-04 10:46:24 INFO     rasa.nlu.utils.spacy_utils  - Trying to load SpaCy model with name 'en_core_web_md'.
2023-02-04 10:46:27 INFO     rasa.nlu.utils.spacy_utils  - Trying to load SpaCy model with name 'en_core_web_md'.
2023-02-04 10:46:29 INFO     root  - Rasa server is up and running.
Bot loaded. Type a message and press enter (use '/stop' to exit): 
Your input ->  hello                                                                                                                 
What can I do for you?
Your input ->  /stop                                                                                                                 
2023-02-04 10:47:17 INFO     root  - Killing Sanic server now.

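3. rasa run actions

This runs the Rasa SDK action server.

It requires an actions/actions.py file in the current directory.

rasa run actions

Use the following command to see more options:

rasa run actions -h

The reminderbot example also includes callback_server.py. As far as I can tell from the example's README, it is a small server that receives the bot's replies when you use the callback channel. Start it in a separate terminal:

python callback_server.py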

4. Start the service on port 5005

rasa run --enable-api

This starts the service at http://0.0.0.0:5005. You can then send a request:

curl -XPOST http://localhost:5005/webhooks/callback/webhook \
-d '{"sender": "tester", "message": "hello"}' \
-H "Content-type: application/json"
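The same request can be sent from Python with only the standard library. This is a sketch that mirrors the curl call above: the URL, sender, and message are copied from it, while the helper names build_request and send are my own. Building the request is kept separate from sending it so the payload can be inspected without a running server.

```python
import json
import urllib.request

# Same endpoint as the curl example above.
WEBHOOK_URL = "http://localhost:5005/webhooks/callback/webhook"


def build_request(sender: str, message: str,
                  url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build the POST request with a JSON body, without sending it."""
    payload = json.dumps({"sender": sender, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request; needs a Rasa server listening on the target URL."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


request = build_request("tester", "hello")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# With the server running, call: send(request)
```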

If the port is already in use, you can find and kill the process that holds it:

$ lsof -i:5005
COMMAND   PID     USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
rasa    20886 newtranx   13u  IPv4 222171601      0t0  TCP *:5005 (LISTEN)

$ kill -9 20886
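Before starting the server you can also check from Python whether something is already listening on the port. This is a small sketch using only the socket module; the function name port_in_use is my own, and it only tells you that the port is taken, not which process holds it (lsof does that).

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


print(port_in_use(5005))
```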

5. rasa visualize: generate a story graph

rasa visualize

If your stories are stored somewhere other than the default data/ location, use the --stories flag to point to them.


2023-02-04 (Sat)
