Chapter 6: The Scrapy Framework (14) 2020-03-16

14. Scrapy Framework: Hands-on: High-speed download of featured images from the zcool website (3)


settings.py configuration


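ITEM_PIPELINES = {
    # Custom subclass of ImagesPipeline that saves images in per-title folders.
    'imagedownload.pipelines.ImagedownloadPipeline': 300,
    # Built-in pipeline left disabled so that each image is downloaded only once.
    # 'scrapy.pipelines.images.ImagesPipeline': 1,
}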
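The pipeline code in this article reads settings.IMAGES_STORE, which the snippet above does not define. Below is a minimal sketch of the relevant part of settings.py, assuming the images go into a folder named "images" under the directory the crawl is started from. The folder name and the use of os.path.abspath are assumptions for illustration, not taken from the original project.

```python
import os

# Assumption: store downloads in an "images" folder under the directory the
# crawl is started from. The folder name is illustrative only.
IMAGES_STORE = os.path.abspath("images")

# Only the custom pipeline is enabled; it subclasses ImagesPipeline, so it
# already performs the download itself.
ITEM_PIPELINES = {
    "imagedownload.pipelines.ImagedownloadPipeline": 300,
}
```

ImagesPipeline needs the Pillow library to be installed, and by default it reads the URLs to download from the item's image_urls field.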


pipelines.py code


import os
import re

from scrapy.pipelines.images import ImagesPipeline

from imagedownload import settings


class ImagedownloadPipeline(ImagesPipeline):

    def get_media_requests(self, item, info):
        # Let the parent class build one request per image URL, then attach
        # the item to each request so file_path() can read its title later.
        media_requests = super().get_media_requests(item, info)
        for media_request in media_requests:
            media_request.item = item
        return media_requests

    # Scrapy 2.4 and later pass `item` as a keyword argument; it is unused here.
    def file_path(self, request, response=None, info=None, *, item=None):
        # The default path has the form "full/<hash of the URL>.jpg".
        origin_path = super().file_path(request, response, info)
        title = request.item['title']
        # Remove characters that are not allowed in Windows directory names.
        title = re.sub(r'[\\/:*?"<>|]', "", title)

        # Save the images of each work in a folder named after its title.
        save_path = os.path.join(settings.IMAGES_STORE, title)
        os.makedirs(save_path, exist_ok=True)
        image_name = origin_path.replace("full/", "")
        return os.path.join(save_path, image_name)
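The path logic in file_path can be tried out without running a crawl. The sketch below repeats the same two steps, stripping illegal characters from the title and dropping the "full/" prefix, as plain functions. The names sanitize_title and build_save_path, and the sample title and file name, are hypothetical and exist only for this illustration.

```python
import os
import re

# Characters that are not allowed in Windows file or directory names.
INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def sanitize_title(title):
    """Remove characters that cannot appear in a directory name."""
    return INVALID_CHARS.sub("", title).strip()


def build_save_path(images_store, title, origin_path):
    """Build <images_store>/<clean title>/<file name> from the default path."""
    save_dir = os.path.join(images_store, sanitize_title(title))
    # The default ImagesPipeline path starts with "full/"; keep only the name.
    image_name = origin_path.replace("full/", "")
    return os.path.join(save_dir, image_name)


if __name__ == "__main__":
    # Hypothetical title and hash-style file name, for illustration only.
    print(build_save_path("images", 'Poster: "Spring" 2020?', "full/0a1b2c.jpg"))
```

Keeping the cleaning step in its own function makes it easy to check against titles that contain slashes or quotes before starting a long crawl.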



Previous article: Chapter 6: The Scrapy Framework (13) 2020-03-15. Link:

https://www.jianshu.com/p/23a56b78deee

Next article: Chapter 6: The Scrapy Framework (15) 2020-03-17. Link:

https://www.jianshu.com/p/5053c6dddbcc



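The material above was collected from the internet and is shared for study and discussion only. If it infringes your rights, please send me a private message and I will remove it. Thank you.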
