目录
安装
各式各样的Adapter
MongoDB安装
基础版本
处理时间和数学计算的Adapter
导出语料到json文件
反馈式学习聊天机器人
使用Ubuntu数据集构建聊天机器人
一个中文的例子
利用已经提供好的小中文语料库
chatterbot是一款python接口的,基于一系列规则和机器学习算法完成的聊天机器人。具有结构清晰,可扩展性好,简单实用的特点。
开源库地址是:https://github.com/gunthercox/ChatterBot
语料库地址是:https://github.com/gunthercox/chatterbot-corpus/tree/master/chatterbot_corpus/data
用pip 安装即可
C:\Users\Administrator>pip install chatterbot
Collecting chatterbot
Downloading https://files.pythonhosted.org/packages/60/8e/cdcc5c8c97dc4d591dc96b6f452f31393fc1393bd1f37d2819bcce9d0d57/ChatterBot-0.8.7-py2.py3-none-any.whl (74kB)
100% |████████████████████████████████| 81kB 3.2MB/s
Collecting pymongo<4.0,>=3.3 (from chatterbot)
Downloading https://files.pythonhosted.org/packages/b2/8e/7171a56414354a4cb0862c9f2bde057c26f7cd0f28f982a3892fa0be5a89/pymongo-3.7.1-cp36-cp36m-win_amd64.whl (311kB)
100% |████████████████████████████████| 317kB 5.1MB/s
Collecting python-dateutil<2.7,>=2.6 (from chatterbot)
Downloading https://files.pythonhosted.org/packages/4b/0d/7ed381ab4fe80b8ebf34411d14f253e1cf3e56e2820ffa1d8844b23859a2/python_dateutil-2.6.1-py2.py3-none-any.whl (194kB)
100% |████████████████████████████████| 194kB 4.3MB/s
Requirement already satisfied: nltk<4.0,>=3.2 in c:\python\python36\lib\site-packages (from chatterbot) (3.3)
Collecting python-twitter<4.0,>=3.0 (from chatterbot)
Downloading https://files.pythonhosted.org/packages/e6/2c/9fc6565b57ce6f3cc8e20b6c4bde8960dd0857629d41654bce46a6dd0bf9/python_twitter-3.4.2-py2.py3-none-any.whl (61kB)
100% |████████████████████████████████| 71kB 8.0MB/s
Collecting mathparse<0.2,>=0.1 (from chatterbot)
Downloading https://files.pythonhosted.org/packages/ea/2d/43daf97570358559f5ea269a6bd85d0108a6d57121e8ee101d059056edd6/mathparse-0.1.1-py3-none-any.whl
Collecting sqlalchemy<1.3,>=1.2 (from chatterbot)
Downloading https://files.pythonhosted.org/packages/aa/cc/348eec885d81f7260b07d961b3ececfc0aa82f7d4a8f45ff997e0d3f44ba/SQLAlchemy-1.2.11.tar.gz (5.6MB)
100% |████████████████████████████████| 5.6MB 290kB/s
Collecting chatterbot-corpus<1.2,>=1.1 (from chatterbot)
Downloading https://files.pythonhosted.org/packages/a4/8e/0417039ff044c4b4f9c5d63ff8b661381e3d9e33733b70446388b162d585/chatterbot_corpus-1.1.2-py2.py3-none-any.whl (116kB)
100% |████████████████████████████████| 122kB 861kB/s
Requirement already satisfied: six>=1.5 in c:\python\python36\lib\site-packages (from python-dateutil<2.7,>=2.6->chatterbot) (1.11.0)
Collecting requests-oauthlib (from python-twitter<4.0,>=3.0->chatterbot)
Downloading https://files.pythonhosted.org/packages/94/e7/c250d122992e1561690d9c0f7856dadb79d61fd4bdd0e598087dce607f6c/requests_oauthlib-1.0.0-py2.py3-none-any.whl
Collecting future (from python-twitter<4.0,>=3.0->chatterbot)
Downloading https://files.pythonhosted.org/packages/00/2b/8d082ddfed935f3608cc61140df6dcbf0edea1bc3ab52fb6c29ae3e81e85/future-0.16.0.tar.gz (824kB)
100% |████████████████████████████████| 829kB 2.9MB/s
Requirement already satisfied: requests in c:\python\python36\lib\site-packages (from python-twitter<4.0,>=3.0->chatterbot) (2.19.1)
Collecting PyYAML<4.0,>=3.12 (from chatterbot-corpus<1.2,>=1.1->chatterbot)
Downloading https://files.pythonhosted.org/packages/4f/ca/5fad249c5032270540c24d2189b0ddf1396aac49b0bdc548162edcf14131/PyYAML-3.13-cp36-cp36m-win_amd64.whl (206kB)
100% |████████████████████████████████| 215kB 3.9MB/s
Collecting oauthlib>=0.6.2 (from requests-oauthlib->python-twitter<4.0,>=3.0->chatterbot)
Downloading https://files.pythonhosted.org/packages/e6/d1/ddd9cfea3e736399b97ded5c2dd62d1322adef4a72d816f1ed1049d6a179/oauthlib-2.1.0-py2.py3-none-any.whl (121kB)
100% |████████████████████████████████| 122kB 3.8MB/s
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\python\python36\lib\site-packages (from requests->python-twitter<4.0,>=3.0->chatterbot) (3.0.4)
Requirement already satisfied: urllib3<1.24,>=1.21.1 in c:\python\python36\lib\site-packages (from requests->python-twitter<4.0,>=3.0->chatterbot) (1.23)
Requirement already satisfied: certifi>=2017.4.17 in c:\python\python36\lib\site-packages (from requests->python-twitter<4.0,>=3.0->chatterbot) (2018.8.24)
Requirement already satisfied: idna<2.8,>=2.5 in c:\python\python36\lib\site-packages (from requests->python-twitter<4.0,>=3.0->chatterbot) (2.7)
Installing collected packages: pymongo, python-dateutil, oauthlib, requests-oauthlib, future, python-twitter, mathparse, sqlalchemy, PyYAML, chatterbot-corpus, chatterbot
Found existing installation: python-dateutil 2.7.3
Uninstalling python-dateutil-2.7.3:
Successfully uninstalled python-dateutil-2.7.3
Running setup.py install for future ... done
Running setup.py install for sqlalchemy ... done
Successfully installed PyYAML-3.13 chatterbot-0.8.7 chatterbot-corpus-1.1.2 future-0.16.0 mathparse-0.1.1 oauthlib-2.1.0 pymongo-3.7.1 python-dateutil-2.6.1 python-twitter-3.4.2 requests-oauthlib-1.0.0 sqlalchemy-1.2.11
You are using pip version 10.0.1, however version 18.0 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
chatterbot的聊天逻辑和输入输出以及存储,是由各种adapter来限定的,流程图如下:
ChatBot要使用
chatterbot.storage.MongoDatabaseAdapter,
必须先安装Mongo,
当然也可以跳过这一步,
直接使用默认的chatterbot.storage.SQLStorageAdapter
下载地址: 需要先注册一下
https://www.mongodb.com/download-center#enterprise
https://downloads.mongodb.com/win32/mongodb-win32-x86_64-enterprise-windows-64-4.0.2-signed.msi?_ga=2.119936011.2018295441.1536370911-184988497.1536370911
安装说明参看:
https://docs.mongodb.com/manual/tutorial/install-mongodb-enterprise-on-windows/?_ga=2.119936011.2018295441.1536370911-184988497.1536370911
直接点下一步安装就好了。
配置相关文件(建议在安装目录MongoDB文件夹下,与Server同级目录),
新建数据库目录:C:\Program Files\MongoDB\\MongoDB\data,
新建日志目录:C:\Program Files\MongoDB\\MongoDB\logs,此目录中新建mongodb.log文件,
新建配置文件:C:\Program Files\MongoDB\MongoDB\mongo.conf,内容如下:
##数据文件
dbpath=C:\Program Files\MongoDB\data
##日志文件
logpath=C:\Program Files\MongoDB\logs\mongodb.log
##错误日志采用追加模式
logappend=true
#启用日志文件,默认启用
journal=true
#这个选项可以过滤掉一些无用的日志信息,若需要调试使用请设置为false
quiet=true
#端口号默认为27001
port=27001
配置MongoDB环境变量:编辑path,添加安装目录C:\Program Files\MongoDB\Server\4.0\bin;
执行Mongo.conf文件,等待几分钟创建data文件完成,注意,因为我默认安装在C盘,所以需要以管理员身份启动cmd:
Microsoft Windows [版本 10.0.17134.228]
(c) 2018 Microsoft Corporation。保留所有权利。
C:\WINDOWS\system32>mongod.exe -f "C:\Program Files\MongoDB\mongo.conf"
2018-09-08T10:19:35.605+0800 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
注意:不要关闭这个cmd页面,以确保MongoDB是运行的,不然ChatterBot会失败
浏览器地址栏输入:http://127.0.0.1:27001/,出现下图表示配置成功:
测试MongoDB运行状态,执行如下命令,测试成功:
Microsoft Windows [版本 10.0.17134.228]
(c) 2018 Microsoft Corporation。保留所有权利。
C:\Users\hgy413>mongo
MongoDB shell version v4.0.2
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.2
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
http://docs.mongodb.org/
Questions? Try the support group
http://groups.google.com/group/mongodb-user
Server has startup warnings:
2018-09-08T09:55:22.304+0800 I CONTROL [initandlisten]
2018-09-08T09:55:22.304+0800 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2018-09-08T09:55:22.304+0800 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2018-09-08T09:55:22.304+0800 I CONTROL [initandlisten]
MongoDB Enterprise >
# -*- coding: utf-8 -*-
from chatterbot import ChatBot
# 构建ChatBot并指定Adapter
# https://chatterbot.readthedocs.io/en/stable/chatterbot.html
bot = ChatBot('Default Response Example Bot',
storage_adapter='chatterbot.storage.SQLStorageAdapter',
logic_adapters=[
{
'import_path': 'chatterbot.logic.BestMatch'
},
{
'import_path': 'chatterbot.logic.LowConfidenceAdapter',
'threshold': 0.65,
'default_response': 'I am sorry, but I do not understand.'
}
],
#给定的语料是个列表
trainer='chatterbot.trainers.ListTrainer'
)
# 手动给定一点语料用于训练
bot.train([
'How can I help you?',
'I want to create a chat bot',
'Have you read the documentation?',
'No, I have not',
'This should help get you started: http://chatterbot.rtfd.org/en/latest/quickstart.html'
])
# 给定问题并取回结果
question = "How do I make an omelette?"
print(question)
response = bot.get_response(question)
print(response)
print("\n")
question = "how to make a chat bot"
print(question)
response = bot.get_response(question)
print(response)
结果:
List Trainer: [####################] 100%
How do I make an omelette?
I am sorry, but I do not understand.
how to make a chat bot
Have you read the documentation?
再次运行
List Trainer: [####################] 100%
How do I make an omelette?
how to make a chat bot
how to make a chat bot
Have you read the documentation?
出现这样的结果是我运行了两次,它把我的输入how to make a chat bot记住了,并应用到了下一次会话
# -*- coding:utf-8 -*-
from chatterbot import ChatBot
bot = ChatBot(name="Math & Time Bot",
logic_adapters=[
"chatterbot.logic.MathematicalEvaluation",
"chatterbot.logic.TimeLogicAdapter"
],
input_adapter="chatterbot.input.VariableInputTypeAdapter",
output_adapter="chatterbot.output.OutputAdapter"
)
# 进行数学计算
question = "What is 4 + 9"
print(question)
response = bot.get_response(question)
print(response)
print("\n")
# 回答和时间相关的问题
question = "What time is it?"
print(question)
response = bot.get_response(question)
print(response)
结果:
What is 4 + 9
4 + 9 = 13
What time is it?
The current time is 11:03 AM
# -*- coding:utf-8 -*-
from chatterbot import ChatBot
'''
如果一个已经训练好的chatbot,你想取出它的语料,用于别的chatbot构建,可以这么做
'''
chatbot = ChatBot(
'Export Example Bot',
trainer='chatterbot.trainers.ChatterBotCorpusTrainer'
)
# 训练一下咯
chatbot.train("chatterbot.corpus.english")
# 把语料导出到json文件中
chatbot.trainer.export_for_training("./my_export.json")
结果:
Connected to pydev debugger (build 182.4129.34)
ai.yml Training: [####################] 100%
botprofile.yml Training: [####################] 100%
computers.yml Training: [####################] 100%
conversations.yml Training: [####################] 100%
emotion.yml Training: [####################] 100%
food.yml Training: [####################] 100%
gossip.yml Training: [####################] 100%
greetings.yml Training: [####################] 100%
history.yml Training: [####################] 100%
humor.yml Training: [####################] 100%
literature.yml Training: [####################] 100%
money.yml Training: [####################] 100%
movies.yml Training: [####################] 100%
politics.yml Training: [####################] 100%
psychology.yml Training: [####################] 100%
science.yml Training: [####################] 100%
sports.yml Training: [####################] 100%
trivia.yml Training: [####################] 100%
# -*- coding:utf-8 -*-
from chatterbot import ChatBot
import logging
"""
反馈式的聊天机器人,会根据你的反馈进行学习
"""
# 把下面这行前的注释去掉,可以把一些信息写入日志中
# logging.basicConfig(level=logging.INFO)
# 创建一个聊天机器人
bot = ChatBot(name="Feedback Learning Bot",
storage_adapter="chatterbot.storage.SQLStorageAdapter",
logic_adapters=[
'chatterbot.logic.BestMatch'
],
input_adapter='chatterbot.input.TerminalAdapter',
output_adapter='chatterbot.output.TerminalAdapter'
)
DEFAULT_SESSION_ID = bot.storage.create_conversation()
def get_feedback():
from chatterbot.utils import input_function
text = input_function()#等待输入Yes或No
if 'Yes' in text:
return True
elif 'No' in text:
return False
else:
print('Please type either "Yes" or "No"')
return get_feedback()
print('Type something to begin...')
# 每次用户有输入内容,这个循环就会开始执行
while True:
try:
input_statement = bot.input.process_input_statement()# 这里输入一点语句
statement, response = bot.generate_response(input_statement, DEFAULT_SESSION_ID)#得到response语句
print('\n Is "{}" this a coherent response to "{}"? \n'.format(response, input_statement))
if get_feedback():#得到反馈
bot.learn_response(response,input_statement)#学习
# Update the conversation history for the bot
# It is important that this happens last, after the learning step
bot.storage.add_to_conversation(CONVERSATION_ID, statement, response)
bot.output.process_response(response)
# 直到按ctrl-c 或者 ctrl-d 才会退出
except (KeyboardInterrupt, EOFError, SystemExit):
break
结果:
Type something to begin...
hello
Is "Hi" this a coherent response to "hello"?
Yes
Hi
ok
Is "" this a coherent response to "ok"?
No
这个训练集有点大,500+M,我们可以提前把它下载到py文件同目录的data文件夹内( ./data/ubuntu_dialogs.tgz)
# -*-coding:utf-8 -*-
from chatterbot import ChatBot
'''
这是一个使用Ubuntu语料构建聊天机器人的例子
'''
chatbot = ChatBot(name="Example Bot",
trainer='chatterbot.trainers.UbuntuCorpusTrainer')
# 使用Ubuntu数据集开始训练
chatbot.train()
# 我们来看看训练后的机器人的应答
response = chatbot.get_response('How are you doing today?')
print(response)
结果:训练集非常大,时间占用很长。
。。。。。。。。。。。。。。
i know it's extra work but i'm trying to isolate a certain prob 4
FAT and NTFS should mount in ubuntu no problems 4
i have NTFS backup drive and i have full control, didnt have to do anything either 4
do this sudo mkdir /media/temp 4
sudo mount -t ntfs-3g /dev/sdb1 /media/temp 4
how is your user permissions, have you fiddled with them? 4
was only trying to help, you found the fix :D 4
hello? 4
。。。。。。。。。。。。。。。。
注意chatterbot,中文聊天机器人的场景下一定要用python3.X,用python2.7会有编码问题
# -*- coding:utf-8 -*-
#手动设置一些语料
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer
Chinese_bot = ChatBot("Training demo")
Chinese_bot.set_trainer(ListTrainer)
Chinese_bot.train([
'你好',
'你好',
'有什么能帮你的?',
'想买数据科学的课程',
'具体是数据科学那块呢?',
'机器学习',
])
#测试一下
while True:
question = input("请输入:")
response = Chinese_bot.get_response(question)
print(response)
if 'bye' in question:
break
结果:
请输入:你好
你好
请输入:有什么可以帮你的
想买数据科学的课程
请输入:是数据科学哪块的呢?
机器学习
请输入:你是猪头吗
你好
请输入:具体是哪个猪头
机器学习
请输入:
# -*- coding: utf-8 -*-
from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer
chatbot = ChatBot(name="ChineseChatBot")
chatbot.set_trainer(ChatterBotCorpusTrainer)
# 使用中文语料库训练它
chatbot.train("chatterbot.corpus.chinese")
#测试一下
while True:
question = input("请输入:")
response = chatbot.get_response(question)
print(response)
if 'bye' in question:
break
结果:
请输入:你好
你好
请输入:我的名字叫goo
我也还不错
请输入:你知道我的名字了吗?
我对你的感情,是人类和bot之间独有的信任和友谊 你可以把它叫做爱。
请输入:我的名字叫什么?
吃喝睡 还有旅行。 你喜欢旅行吗?
在这里有个大一点的中文语料库,不过还是很乱:链接:https://pan.baidu.com/s/10SLUTsJ-1fW6hPq8fWo74w 密码:40z8
这是List,所以使用ListTrainer
# -*- coding: utf-8 -*-
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer
chatbot = ChatBot(name="ChineseChatBot")
chatbot.set_trainer(ListTrainer)
# 使用中文语料库训练它
#chatbot.train("chatterbot.corpus.chinese")
chatbot.train("D:/DownLoad/Chinese_dialog.txt")
#测试一下
while True:
question = input("请输入:")
response = chatbot.get_response(question)
print(response)
if 'bye' in question:
break