Crawler with Scrapy: a Python Scrapy spider that crawls two URLs and downloads the page content

Contents

Output

Implementation code


Output

To be updated later...

Implementation code

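import scrapy


class DmozSpider(scrapy.Spider):
    # Name used to launch the spider, e.g. "scrapy crawl dmoz"
    name = "dmoz"
    # Restrict the crawl to the host used in start_urls
    allowed_domains = ["dmoztools.net"]
    # The comma between the two URLs is required; without it Python joins
    # the adjacent string literals into a single invalid URL
    start_urls = [
        "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/",
        "https://dmoztools.net/Computers/Programming/Languages/Python/Books/",
    ]

    def parse(self, response):
        # Use the last path segment ("Resources" or "Books") as the file name
        filename = response.url.split("/")[-2]
        # response.body is bytes, so the file is opened in binary mode
        with open(filename, "wb") as f:
            f.write(response.body)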
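
The file names written by parse() come from plain string handling, so that part can be checked without running a crawl. The following is only a minimal sketch: filename_for is a hypothetical helper (not part of Scrapy) that repeats the expression used in parse(), and the merged list is there to illustrate what happens when the comma between two URL literals is left out.

```python
# Hypothetical helper mirroring the expression used in parse(); not a Scrapy API.
def filename_for(url):
    # Both URLs end with "/", so the last element of the split is an empty
    # string and the second-to-last element is the final path segment.
    return url.split("/")[-2]


urls = [
    "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/",
    "https://dmoztools.net/Computers/Programming/Languages/Python/Books/",
]

filenames = [filename_for(u) for u in urls]
print(filenames)  # ['Resources', 'Books']

# With no comma between them, Python concatenates adjacent string literals,
# so this list holds ONE long string instead of two URLs.
merged = [
    "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/"
    "https://dmoztools.net/Computers/Programming/Languages/Python/Books/"
]
print(len(merged))  # 1
```

Assuming the spider is saved inside a Scrapy project, it would typically be started with "scrapy crawl dmoz", which should leave two files named Resources and Books in the working directory.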

 

 
