Writing Data to MongoDB in Scrapy

Preface

This article draws on Cui Qingcai's book 《Python3网络爬虫开发实战》 (Python 3 Web Crawler Development in Practice); for a deeper look at the Scrapy framework, please read the original book.
Reference link (Cui Qingcai's personal blog, well worth a visit): https://cuiqingcai.com/5052.html

1. Enable the pipeline in settings.py

ITEM_PIPELINES = {
    'tianmao.pipelines.TianmaoPipeline': 300,  # must be uncommented, or the pipeline never runs
}
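
The number 300 is the pipeline's priority: Scrapy runs all enabled item pipelines in ascending order of this value, which by convention lies in the 0–1000 range.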

2. Add the MongoDB configuration to settings.py

# MongoDB connection settings
HOST = "127.0.0.1"  # server address
PORT = 27017  # MongoDB's default port
USER = "username"
PWD = "password"
DB = "database name"
TABLE = "collection name"  # MongoDB calls tables "collections"

3. Import pymongo in pipelines.py and write the data to the database
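
If pymongo is not installed yet, install it first (assuming a pip-based environment):

pip install pymongo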

from pymongo import MongoClient

Use the following code as a reference:

class TianmaoPipeline(object):
    def __init__(self, host, port, user, pwd, db, table):
        self.host = host
        self.port = port
        self.user = user
        self.pwd = pwd
        self.db = db
        self.table = table

    @classmethod
    def from_crawler(cls, crawler):
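        # Read the connection settings defined in settings.py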
        HOST = crawler.settings.get('HOST')
        PORT = crawler.settings.get('PORT')
        USER = crawler.settings.get('USER')
        PWD = crawler.settings.get('PWD')
        DB = crawler.settings.get('DB')
        TABLE = crawler.settings.get('TABLE')
        return cls(HOST, PORT, USER, PWD, DB, TABLE)

    def open_spider(self, spider):
        # Open a single client for the whole crawl rather than one per item
        self.client = MongoClient('mongodb://%s:%s@%s:%s' % (self.user, self.pwd, self.host, self.port))

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # save() was removed in pymongo 4.x; insert_one() is the supported API
        self.client[self.db][self.table].insert_one(dict(item))
        return item  # pass the item on to any later pipelines
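
To confirm that items actually reach the database, you can query the collection from a plain Python shell. A minimal sketch, assuming the placeholder credentials and names from settings.py above:

from pymongo import MongoClient

# Connect with the same credentials the pipeline uses (placeholder values;
# substitute your own from settings.py)
client = MongoClient('mongodb://username:password@127.0.0.1:27017')
collection = client['database name']['collection name']

print(collection.count_documents({}))  # number of items stored so far
print(collection.find_one())           # inspect one stored document
client.close()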

4. Cause of the error

A single missing letter "E" in settings.py was the reason the data was never written to the database.
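
The original screenshots are not reproduced here, but the ITEM_PIPELINES key is a likely place for such a typo (an assumption; the text above only says an "E" was missing). Scrapy silently ignores setting names it does not recognize, so a misspelled key disables the pipeline without raising any error:

# Wrong: missing an "E"; Scrapy treats this as an unrelated custom setting,
# so no pipeline is registered and nothing reaches MongoDB
ITEM_PIPLINES = {
    'tianmao.pipelines.TianmaoPipeline': 300,
}

# Right
ITEM_PIPELINES = {
    'tianmao.pipelines.TianmaoPipeline': 300,
}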
