AI-ChatBot-1: Installing ChatterBot and MongoDB, with Simple Tests

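Contents

Installation

The various adapters

Installing MongoDB

Basic version

Adapters for time and math

Exporting the corpus to a JSON file

A chatbot that learns from feedback

Building a chatbot from the Ubuntu dataset

A Chinese example

Using the bundled small Chinese corpus
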

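ChatterBot is a chatbot library with a Python interface, built on a set of rules and machine learning algorithms. It has a clear structure, is easy to extend, and is simple and practical to use.

The open-source repository is at: https://github.com/gunthercox/ChatterBot

The corpus is at: https://github.com/gunthercox/chatterbot-corpus/tree/master/chatterbot_corpus/data

Installation

Install it with pip: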

C:\Users\Administrator>pip install chatterbot
Collecting chatterbot
  Downloading https://files.pythonhosted.org/packages/60/8e/cdcc5c8c97dc4d591dc96b6f452f31393fc1393bd1f37d2819bcce9d0d57/ChatterBot-0.8.7-py2.py3-none-any.whl (74kB)
    100% |████████████████████████████████| 81kB 3.2MB/s
Collecting pymongo<4.0,>=3.3 (from chatterbot)
  Downloading https://files.pythonhosted.org/packages/b2/8e/7171a56414354a4cb0862c9f2bde057c26f7cd0f28f982a3892fa0be5a89/pymongo-3.7.1-cp36-cp36m-win_amd64.whl (311kB)
    100% |████████████████████████████████| 317kB 5.1MB/s
Collecting python-dateutil<2.7,>=2.6 (from chatterbot)
  Downloading https://files.pythonhosted.org/packages/4b/0d/7ed381ab4fe80b8ebf34411d14f253e1cf3e56e2820ffa1d8844b23859a2/python_dateutil-2.6.1-py2.py3-none-any.whl (194kB)
    100% |████████████████████████████████| 194kB 4.3MB/s
Requirement already satisfied: nltk<4.0,>=3.2 in c:\python\python36\lib\site-packages (from chatterbot) (3.3)
Collecting python-twitter<4.0,>=3.0 (from chatterbot)
  Downloading https://files.pythonhosted.org/packages/e6/2c/9fc6565b57ce6f3cc8e20b6c4bde8960dd0857629d41654bce46a6dd0bf9/python_twitter-3.4.2-py2.py3-none-any.whl (61kB)
    100% |████████████████████████████████| 71kB 8.0MB/s
Collecting mathparse<0.2,>=0.1 (from chatterbot)
  Downloading https://files.pythonhosted.org/packages/ea/2d/43daf97570358559f5ea269a6bd85d0108a6d57121e8ee101d059056edd6/mathparse-0.1.1-py3-none-any.whl
Collecting sqlalchemy<1.3,>=1.2 (from chatterbot)
  Downloading https://files.pythonhosted.org/packages/aa/cc/348eec885d81f7260b07d961b3ececfc0aa82f7d4a8f45ff997e0d3f44ba/SQLAlchemy-1.2.11.tar.gz (5.6MB)
    100% |████████████████████████████████| 5.6MB 290kB/s
Collecting chatterbot-corpus<1.2,>=1.1 (from chatterbot)
  Downloading https://files.pythonhosted.org/packages/a4/8e/0417039ff044c4b4f9c5d63ff8b661381e3d9e33733b70446388b162d585/chatterbot_corpus-1.1.2-py2.py3-none-any.whl (116kB)
    100% |████████████████████████████████| 122kB 861kB/s
Requirement already satisfied: six>=1.5 in c:\python\python36\lib\site-packages (from python-dateutil<2.7,>=2.6->chatterbot) (1.11.0)
Collecting requests-oauthlib (from python-twitter<4.0,>=3.0->chatterbot)
  Downloading https://files.pythonhosted.org/packages/94/e7/c250d122992e1561690d9c0f7856dadb79d61fd4bdd0e598087dce607f6c/requests_oauthlib-1.0.0-py2.py3-none-any.whl
Collecting future (from python-twitter<4.0,>=3.0->chatterbot)
  Downloading https://files.pythonhosted.org/packages/00/2b/8d082ddfed935f3608cc61140df6dcbf0edea1bc3ab52fb6c29ae3e81e85/future-0.16.0.tar.gz (824kB)
    100% |████████████████████████████████| 829kB 2.9MB/s
Requirement already satisfied: requests in c:\python\python36\lib\site-packages (from python-twitter<4.0,>=3.0->chatterbot) (2.19.1)
Collecting PyYAML<4.0,>=3.12 (from chatterbot-corpus<1.2,>=1.1->chatterbot)
  Downloading https://files.pythonhosted.org/packages/4f/ca/5fad249c5032270540c24d2189b0ddf1396aac49b0bdc548162edcf14131/PyYAML-3.13-cp36-cp36m-win_amd64.whl (206kB)
    100% |████████████████████████████████| 215kB 3.9MB/s
Collecting oauthlib>=0.6.2 (from requests-oauthlib->python-twitter<4.0,>=3.0->chatterbot)
  Downloading https://files.pythonhosted.org/packages/e6/d1/ddd9cfea3e736399b97ded5c2dd62d1322adef4a72d816f1ed1049d6a179/oauthlib-2.1.0-py2.py3-none-any.whl (121kB)
    100% |████████████████████████████████| 122kB 3.8MB/s
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\python\python36\lib\site-packages (from requests->python-twitter<4.0,>=3.0->chatterbot) (3.0.4)
Requirement already satisfied: urllib3<1.24,>=1.21.1 in c:\python\python36\lib\site-packages (from requests->python-twitter<4.0,>=3.0->chatterbot) (1.23)
Requirement already satisfied: certifi>=2017.4.17 in c:\python\python36\lib\site-packages (from requests->python-twitter<4.0,>=3.0->chatterbot) (2018.8.24)
Requirement already satisfied: idna<2.8,>=2.5 in c:\python\python36\lib\site-packages (from requests->python-twitter<4.0,>=3.0->chatterbot) (2.7)
Installing collected packages: pymongo, python-dateutil, oauthlib, requests-oauthlib, future, python-twitter, mathparse, sqlalchemy, PyYAML, chatterbot-corpus, chatterbot
  Found existing installation: python-dateutil 2.7.3
    Uninstalling python-dateutil-2.7.3:
      Successfully uninstalled python-dateutil-2.7.3
  Running setup.py install for future ... done
  Running setup.py install for sqlalchemy ... done
Successfully installed PyYAML-3.13 chatterbot-0.8.7 chatterbot-corpus-1.1.2 future-0.16.0 mathparse-0.1.1 oauthlib-2.1.0 pymongo-3.7.1 python-dateutil-2.6.1 python-twitter-3.4.2 requests-oauthlib-1.0.0 sqlalchemy-1.2.11
You are using pip version 10.0.1, however version 18.0 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
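
After the install finishes, it can be useful to confirm from Python which versions actually ended up in the environment. The following is a minimal sketch that assumes Python 3.8 or later (for importlib.metadata); the helper name installed_version is made up for this illustration.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages named in the pip log above
    for name in ("chatterbot", "chatterbot-corpus", "pymongo"):
        print(name, installed_version(name))
```

The code samples in this article follow the ChatterBot 0.8.7 API shown in the pip log, so a different reported version is a hint that some of them may need adjusting.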

 

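The various adapters

ChatterBot's conversation logic, its input and output, and its storage are each defined by adapters. The flow chart is shown below:

[Figure 1: ChatterBot adapter flow chart (image not available)]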
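
The adapter flow described above can be imitated in a few lines of plain Python. This is a toy sketch, not ChatterBot's code: the classes ToyBot, LookupAdapter and FallbackAdapter are hypothetical, and the rule "take the answer with the highest confidence" is a simplification of what the real logic adapters do.

```python
class LookupAdapter:
    """Hypothetical logic adapter: answers from a fixed lookup table."""

    def __init__(self, table):
        self.table = table

    def process(self, text):
        # Return (confidence, response); confidence 0 means "no idea"
        if text in self.table:
            return 1.0, self.table[text]
        return 0.0, ""


class FallbackAdapter:
    """Hypothetical logic adapter: always offers a low-confidence default."""

    def __init__(self, default_response, confidence=0.1):
        self.default_response = default_response
        self.confidence = confidence

    def process(self, text):
        return self.confidence, self.default_response


class ToyBot:
    """Input step -> logic adapters -> storage step -> output step."""

    def __init__(self, logic_adapters):
        self.logic_adapters = logic_adapters
        self.storage = []  # stands in for a storage adapter

    def get_response(self, raw_input):
        text = raw_input.strip()  # input step: normalise the text
        candidates = [adapter.process(text) for adapter in self.logic_adapters]
        confidence, response = max(candidates, key=lambda c: c[0])
        self.storage.append((text, response))  # storage step
        return response  # output step


if __name__ == "__main__":
    bot = ToyBot([
        LookupAdapter({"hello": "Hi"}),
        FallbackAdapter("I am sorry, but I do not understand."),
    ])
    print(bot.get_response("hello"))
    print(bot.get_response("How do I make an omelette?"))
```

Swapping an adapter changes one stage without touching the others, which is the extensibility the library's design aims for.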

 

Installing MongoDB

To use chatterbot.storage.MongoDatabaseAdapter,
ChatterBot requires MongoDB to be installed first.
You can also skip this step
and use the default chatterbot.storage.SQLStorageAdapter.
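
For readers who do choose MongoDB, the sketch below shows one way the storage settings might be passed to ChatBot. It rests on an assumption: that ChatterBot 0.8.x reads the keyword arguments database_uri and database for the Mongo adapter, which is how I recall that release behaving; check the documentation for your version. The helper mongo_storage_kwargs and the database name are invented for this example.

```python
def mongo_storage_kwargs(host="127.0.0.1", port=27017,
                         database="chatterbot-database"):
    """Build ChatBot keyword arguments for the MongoDB storage adapter.

    Assumption: the 'database_uri' and 'database' keys are the ones
    ChatterBot 0.8.x reads; they are not confirmed by this article.
    """
    return {
        "storage_adapter": "chatterbot.storage.MongoDatabaseAdapter",
        "database_uri": "mongodb://{}:{}/".format(host, port),
        "database": database,
    }


# Intended usage (requires ChatterBot and a running mongod):
#
#     from chatterbot import ChatBot
#     bot = ChatBot("Mongo Example Bot", **mongo_storage_kwargs(port=27001))
```

The port defaults to MongoDB's standard 27017; pass port=27001 if you start mongod with the configuration file described in this section.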

Download page (registration is required first):

https://www.mongodb.com/download-center#enterprise

https://downloads.mongodb.com/win32/mongodb-win32-x86_64-enterprise-windows-64-4.0.2-signed.msi?_ga=2.119936011.2018295441.1536370911-184988497.1536370911

For installation instructions, see:

https://docs.mongodb.com/manual/tutorial/install-mongodb-enterprise-on-windows/?_ga=2.119936011.2018295441.1536370911-184988497.1536370911

Just click Next through the installer.

Then set up the supporting files (I suggest putting them in the MongoDB folder of the install directory, at the same level as the Server folder).

Create the data directory: C:\Program Files\MongoDB\data

Create the log directory: C:\Program Files\MongoDB\logs, and create a mongodb.log file inside it.

Create the configuration file: C:\Program Files\MongoDB\mongo.conf, with the following contents:

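## Data directory
dbpath=C:\Program Files\MongoDB\data

## Log file
logpath=C:\Program Files\MongoDB\logs\mongodb.log

## Append to the log file instead of overwriting it
logappend=true

# Enable journaling (on by default)
journal=true

# Filters out some unneeded log messages; set to false when debugging
quiet=true

# Port number (MongoDB's default is 27017; this file uses 27001)
port=27001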

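Configure the MongoDB environment variable: edit Path and add the install directory C:\Program Files\MongoDB\Server\4.0\bin.

Start mongod with the mongo.conf file and wait a few minutes for the data files to be created. Note that because I installed to the default location on drive C, cmd has to be started as administrator:

Microsoft Windows [Version 10.0.17134.228]
(c) 2018 Microsoft Corporation. All rights reserved.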

C:\WINDOWS\system32>mongod.exe -f "C:\Program Files\MongoDB\mongo.conf"
2018-09-08T10:19:35.605+0800 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'

Note: do not close this cmd window. MongoDB has to keep running, otherwise ChatterBot will fail.
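
Before starting a bot, one way to confirm from Python that mongod is accepting connections is a plain TCP check using only the standard library. This is a sketch; port_is_open is a name made up here, and 27001 is the port from the configuration file above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


if __name__ == "__main__":
    print("mongod reachable:", port_is_open("127.0.0.1", 27001))
```

A True result only shows that something is listening on the port; it does not prove that the listener is MongoDB.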

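Enter http://127.0.0.1:27001/ in the browser address bar; the page shown below indicates that the configuration succeeded:
[Figure 2: browser page returned by MongoDB at http://127.0.0.1:27001/ (image not available)]

To test that MongoDB is running, run the mongo shell; the output below shows success. Note that the shell connects to the default port 27017 unless told otherwise, as in this transcript; for an instance started with the configuration above, run mongo --port 27001.

Microsoft Windows [Version 10.0.17134.228]
(c) 2018 Microsoft Corporation. All rights reserved.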

C:\Users\hgy413>mongo
MongoDB shell version v4.0.2
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.2
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        http://docs.mongodb.org/
Questions? Try the support group
        http://groups.google.com/group/mongodb-user
Server has startup warnings:
2018-09-08T09:55:22.304+0800 I CONTROL  [initandlisten]
2018-09-08T09:55:22.304+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2018-09-08T09:55:22.304+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2018-09-08T09:55:22.304+0800 I CONTROL  [initandlisten]
MongoDB Enterprise >

 

Basic version

# -*- coding: utf-8 -*-
from chatterbot import ChatBot

# Build the ChatBot and specify its adapters
# https://chatterbot.readthedocs.io/en/stable/chatterbot.html
bot = ChatBot(
    'Default Response Example Bot',
    storage_adapter='chatterbot.storage.SQLStorageAdapter',
    logic_adapters=[
        {
            'import_path': 'chatterbot.logic.BestMatch'
        },
        {
            # Return the default response when confidence is below the threshold
            'import_path': 'chatterbot.logic.LowConfidenceAdapter',
            'threshold': 0.65,
            'default_response': 'I am sorry, but I do not understand.'
        }
    ],
    # The training corpus is supplied as a list
    trainer='chatterbot.trainers.ListTrainer'
)


# Provide a small hand-written corpus for training
bot.train([
    'How can I help you?',
    'I want to create a chat bot',
    'Have you read the documentation?',
    'No, I have not',
    'This should help get you started: http://chatterbot.rtfd.org/en/latest/quickstart.html'
])

# Ask a question and fetch the response
question = "How do I make an omelette?"
print(question)
response = bot.get_response(question)
print(response)

print("\n")
question = "how to make a chat bot"
print(question)
response = bot.get_response(question)
print(response)

Output:

List Trainer: [####################] 100%
How do I make an omelette?
I am sorry, but I do not understand.


how to make a chat bot
Have you read the documentation?

Running it a second time:

List Trainer: [####################] 100%
How do I make an omelette?
how to make a chat bot


how to make a chat bot
Have you read the documentation?

This happens because I ran the script twice: the bot remembered my input "how to make a chat bot" from the first run and used it in the next session.

 

Adapters for time and math

# -*- coding:utf-8 -*-
from chatterbot import ChatBot

bot = ChatBot(name="Math & Time Bot",
              logic_adapters=[
                  "chatterbot.logic.MathematicalEvaluation",
                  "chatterbot.logic.TimeLogicAdapter"
              ],
              input_adapter="chatterbot.input.VariableInputTypeAdapter",
              output_adapter="chatterbot.output.OutputAdapter"
              )

# Do a math calculation
question = "What is 4 + 9"
print(question)
response = bot.get_response(question)
print(response)
print("\n")

# Answer a time-related question
question = "What time is it?"
print(question)
response = bot.get_response(question)
print(response)

Output:

What is 4 + 9
4 + 9 = 13


What time is it?
The current time is 11:03 AM
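
To give a feel for what a math adapter has to do, here is a small standalone evaluator for expressions like the one above. It is an illustration only: ChatterBot delegates this work to the mathparse package, and the sketch below uses Python's ast module instead, which is my substitution. The names evaluate and answer are hypothetical, and Python 3.8 or later is assumed for ast.Constant.

```python
import ast
import operator

# Operators this sketch understands
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(expression):
    """Evaluate +, -, *, / on numeric literals without calling eval()."""

    def walk(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression: {!r}".format(expression))

    return walk(ast.parse(expression, mode="eval").body)


def answer(question):
    """Answer 'What is <expression>' questions; return None for anything else."""
    prefix = "What is "
    if not question.startswith(prefix):
        return None
    expression = question[len(prefix):].rstrip("?").strip()
    return "{} = {}".format(expression, evaluate(expression))


if __name__ == "__main__":
    print(answer("What is 4 + 9"))  # prints: 4 + 9 = 13
```

Walking the syntax tree rather than calling eval() keeps arbitrary user input from executing code, which matters for anything that reads text from a chat.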

Exporting the corpus to a JSON file

# -*- coding:utf-8 -*-
from chatterbot import ChatBot

# If you have an already-trained chatbot and want to extract its corpus
# to build another chatbot, you can do it like this.

chatbot = ChatBot(
    'Export Example Bot',
    trainer='chatterbot.trainers.ChatterBotCorpusTrainer'
)

# Train the bot on the English corpus
chatbot.train("chatterbot.corpus.english")

# Export the corpus to a JSON file
chatbot.trainer.export_for_training("./my_export.json")

Output:

Connected to pydev debugger (build 182.4129.34)
ai.yml Training: [####################] 100%
botprofile.yml Training: [####################] 100%
computers.yml Training: [####################] 100%
conversations.yml Training: [####################] 100%
emotion.yml Training: [####################] 100%
food.yml Training: [####################] 100%
gossip.yml Training: [####################] 100%
greetings.yml Training: [####################] 100%
history.yml Training: [####################] 100%
humor.yml Training: [####################] 100%
literature.yml Training: [####################] 100%
money.yml Training: [####################] 100%
movies.yml Training: [####################] 100%
politics.yml Training: [####################] 100%
psychology.yml Training: [####################] 100%
science.yml Training: [####################] 100%
sports.yml Training: [####################] 100%
trivia.yml Training: [####################] 100%

A chatbot that learns from feedback

# -*- coding:utf-8 -*-
from chatterbot import ChatBot
import logging

"""
A feedback-driven chatbot that learns from your feedback
"""
# Uncomment the line below to write some information to the log
# logging.basicConfig(level=logging.INFO)

# Create a chatbot
bot = ChatBot(name="Feedback Learning Bot",
              storage_adapter="chatterbot.storage.SQLStorageAdapter",
              logic_adapters=[
                  'chatterbot.logic.BestMatch'
              ],
              input_adapter='chatterbot.input.TerminalAdapter',
              output_adapter='chatterbot.output.TerminalAdapter'
              )

DEFAULT_SESSION_ID = bot.storage.create_conversation()


def get_feedback():
    from chatterbot.utils import input_function

    text = input_function()  # Wait for the user to type Yes or No
    if 'Yes' in text:
        return True
    elif 'No' in text:
        return False
    else:
        print('Please type either "Yes" or "No"')
        return get_feedback()


print('Type something to begin...')

# This loop runs each time the user enters something
while True:
    try:
        input_statement = bot.input.process_input_statement()  # Read a statement from the user
        statement, response = bot.generate_response(input_statement, DEFAULT_SESSION_ID)  # Get the response statement

        print('\n Is "{}" this a coherent response to "{}"? \n'.format(response, input_statement))

        if get_feedback():  # Ask for feedback
            bot.learn_response(response, input_statement)  # Learn from it
            # Update the conversation history for the bot
            # It is important that this happens last, after the learning step
            # (the original listing used an undefined name, CONVERSATION_ID, here)
            bot.storage.add_to_conversation(DEFAULT_SESSION_ID, statement, response)
        bot.output.process_response(response)

    # Exit only when ctrl-c or ctrl-d is pressed
    except (KeyboardInterrupt, EOFError, SystemExit):
        break

Output:

Type something to begin...
hello

 Is "Hi" this a coherent response to "hello"? 

Yes
Hi
ok

 Is "" this a coherent response to "ok"? 

No
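
The get_feedback function in the script above matches 'Yes' and 'No' as case-sensitive substrings and calls itself again on bad input. A slightly more forgiving variant might look like the sketch below. The names parse_feedback and ask_feedback are invented here, and accepting "y" and "n" is my own choice rather than anything ChatterBot prescribes.

```python
def parse_feedback(text):
    """Map a typed answer to True, False, or None if it is neither."""
    answer = text.strip().lower()
    if answer in ("yes", "y"):
        return True
    if answer in ("no", "n"):
        return False
    return None


def ask_feedback(read_line=input):
    """Keep asking until the answer is recognised, using a loop, not recursion."""
    while True:
        result = parse_feedback(read_line())
        if result is not None:
            return result
        print('Please type either "Yes" or "No"')
```

Passing the reader in as read_line makes the function easy to exercise without a terminal, for example with a lambda that pops answers from a list.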

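Building a chatbot from the Ubuntu dataset

This training set is fairly large (500+ MB), so we can download it ahead of time into a data folder next to the .py file (./data/ubuntu_dialogs.tgz).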

# -*- coding:utf-8 -*-
from chatterbot import ChatBot

# An example of building a chatbot from the Ubuntu corpus

chatbot = ChatBot(name="Example Bot",
                  trainer='chatterbot.trainers.UbuntuCorpusTrainer')

# Start training on the Ubuntu dataset
chatbot.train()

# See how the trained bot responds
response = chatbot.get_response('How are you doing today?')
print(response)

Output: the training set is very large, so training takes a long time.

......
i know it's extra work but i'm trying to isolate a certain prob 4
FAT and NTFS should mount in ubuntu no problems 4
i have NTFS backup drive and i have full control, didnt have to do anything either 4
do this sudo mkdir /media/temp 4
sudo mount -t ntfs-3g /dev/sdb1 /media/temp 4
how is your user permissions, have you fiddled with them? 4
was only trying to help, you found the fix :D 4
hello? 4
......

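A Chinese example

Note: for Chinese chatbots, ChatterBot must be run under Python 3.x; Python 2.7 runs into encoding problems.

# -*- coding:utf-8 -*-

# Set up a small corpus by hand
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

Chinese_bot = ChatBot("Training demo")
Chinese_bot.set_trainer(ListTrainer)
Chinese_bot.train([
    '你好',                    # Hello
    '你好',                    # Hello
    '有什么能帮你的?',          # How can I help you?
    '想买数据科学的课程',        # I'd like to buy a data science course
    '具体是数据科学那块呢?',     # Which area of data science exactly?
    '机器学习',                 # Machine learning
])

# Try it out; the prompt "请输入:" means "Please enter:"
while True:
    question = input("请输入:")
    response = Chinese_bot.get_response(question)
    print(response)
    if 'bye' in question:
        break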

Output:

请输入:你好
你好
请输入:有什么可以帮你的
想买数据科学的课程
请输入:是数据科学哪块的呢?
机器学习
请输入:你是猪头吗
你好
请输入:具体是哪个猪头
机器学习
请输入:
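
The transcript above shows that the bot always answers with whatever followed the closest known statement, even for unrelated questions. The sketch below imitates that with difflib from the standard library. It is a simplified stand-in, not ChatterBot's BestMatch code: the helpers similarity, closest_statement and reply are hypothetical, and using SequenceMatcher as the similarity measure is an assumption on my part.

```python
from difflib import SequenceMatcher

# The training list from the script above
CONVERSATION = [
    '你好',
    '你好',
    '有什么能帮你的?',
    '想买数据科学的课程',
    '具体是数据科学那块呢?',
    '机器学习',
]


def similarity(a, b):
    """Similarity ratio in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a, b).ratio()


def closest_statement(query, known_statements):
    """Return the known statement most similar to the query."""
    return max(known_statements, key=lambda s: similarity(query, s))


def reply(query, conversation):
    """Answer with the statement that follows the closest match."""
    # The last statement has no follow-up, so it cannot be matched against
    match = closest_statement(query, conversation[:-1])
    return conversation[conversation.index(match) + 1]


if __name__ == "__main__":
    for question in ('有什么可以帮你的', '你是猪头吗', '具体是哪个猪头'):
        print(question, '->', reply(question, CONVERSATION))
```

With this list, an unrelated question that shares a single character with '你好' still gets '你好' back, because there is no confidence threshold; adding a cutoff like the LowConfidenceAdapter threshold from the basic example is the usual remedy.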

Using the bundled small Chinese corpus

# -*- coding: utf-8 -*-
from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

chatbot = ChatBot(name="ChineseChatBot")
chatbot.set_trainer(ChatterBotCorpusTrainer)

# Train it on the Chinese corpus
chatbot.train("chatterbot.corpus.chinese")

# Try it out; the prompt "请输入:" means "Please enter:"
while True:
    question = input("请输入:")
    response = chatbot.get_response(question)
    print(response)
    if 'bye' in question:
        break

Output:

请输入:你好
你好
请输入:我的名字叫goo
我也还不错
请输入:你知道我的名字了吗?
我对你的感情,是人类和bot之间独有的信任和友谊 你可以把它叫做爱。
请输入:我的名字叫什么?
吃喝睡 还有旅行。 你喜欢旅行吗?

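There is a somewhat larger Chinese corpus here, although it is still quite messy. Link: https://pan.baidu.com/s/10SLUTsJ-1fW6hPq8fWo74w Password: 40z8

This corpus is a list of utterances, so use ListTrainer: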

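# -*- coding: utf-8 -*-
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

chatbot = ChatBot(name="ChineseChatBot")
chatbot.set_trainer(ListTrainer)

# Train it on the downloaded Chinese corpus.
# ListTrainer expects a list of statements, not a file path, so read the file
# first (assumption: the file is UTF-8 with one utterance per line).
with open("D:/DownLoad/Chinese_dialog.txt", encoding="utf-8") as corpus_file:
    dialog = [line.strip() for line in corpus_file if line.strip()]
chatbot.train(dialog)

# Try it out; the prompt "请输入:" means "Please enter:"
while True:
    question = input("请输入:")
    response = chatbot.get_response(question)
    print(response)
    if 'bye' in question:
        break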

 
