Python Crawler, Level 13 Project: Dangdang Book Bestseller List Crawler

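Exercise overview
Requirement:
Use Scrapy to crawl the first 3 pages of Dangdang's 2018 book bestseller list and collect the title, author, and price of each book.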

Link to Dangdang's 2018 book bestseller list:
http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1

Goals:
Practice defining an item
Practice writing the spiders file
Practice modifying the settings file

1. Create the Dangdang crawler project
In a terminal, run:

scrapy startproject dangdang

2. Create the spider file
Create a new file, dangdang.py, under the spiders directory

# dangdang.py
import scrapy
import bs4
from ..items import DangdangItem


class DangdangSpider(scrapy.Spider):
    name = 'dangdang'
    # allowed_domains takes bare domain names, not URLs. The original value
    # 'http://bang.dangdang.com' is what triggers the URLWarning in the run log below.
    allowed_domains = ['bang.dangdang.com']
    # Build the URLs of the first three pages; the page number is the last field of the URL.
    start_urls = []
    for i in range(1, 4):
        url = 'http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-{}'.format(i)
        start_urls.append(url)

    def parse(self, response):
        # Parse the downloaded page with BeautifulSoup.
        soup = bs4.BeautifulSoup(response.text, 'html.parser')
        # Each <li> inside the ranking <ul> holds one book.
        datas = soup.find('ul', class_='bang_list clearfix bang_list_mode').find_all('li')
        for data in datas:
            item = DangdangItem()
            item['name'] = data.find('div', class_='name').find('a')['title']
            item['author'] = data.find('div', class_='publisher_info').find('a').text
            item['price'] = data.find('div', class_='price').find('span', class_='price_n').text
            yield item
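
The spider's two class-level settings can be checked outside Scrapy. The following is a minimal standalone sketch, not part of the project files: it builds the three page URLs and derives a bare domain name from them, which is the form allowed_domains expects according to the URLWarning in the run log further down. The names BASE_URL, build_start_urls and domain_of are made up for this illustration.

```python
from urllib.parse import urlparse

# URL template taken from the exercise; the trailing field is the page number.
BASE_URL = 'http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-{}'


def build_start_urls(first_page=1, last_page=3):
    """Return the list-page URLs for first_page..last_page (inclusive)."""
    return [BASE_URL.format(page) for page in range(first_page, last_page + 1)]


def domain_of(url):
    """Return only the host name of a URL, without scheme or path."""
    return urlparse(url).hostname


start_urls = build_start_urls()
# Collect the distinct host names; this is the form allowed_domains accepts.
allowed_domains = sorted({domain_of(url) for url in start_urls})

for url in start_urls:
    print(url)
print(allowed_domains)
```

Running it prints the three list-page URLs followed by a one-element list holding the host name without the http:// prefix.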

3. Modify the items.py file

#items.py
import scrapy

class DangdangItem(scrapy.Item):
    name = scrapy.Field()
    author=scrapy.Field()
    price=scrapy.Field()
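
The item fields store whatever the spider assigns, so price ends up as raw text such as '¥41.10' (see the run log below). If a numeric value is wanted, one possible approach is a small conversion helper like the sketch below. The function name parse_price is hypothetical and not part of the original project, and the sketch assumes every price has the form of a currency sign followed by a decimal number.

```python
from decimal import Decimal


def parse_price(text):
    """Convert a price string such as '¥41.10' into a Decimal.

    Assumes the text is an optional yen/yuan sign (half-width or full-width)
    followed by a plain decimal number.
    """
    cleaned = text.strip().lstrip('¥￥').replace(',', '')
    return Decimal(cleaned)


print(parse_price('¥41.10'))
print(parse_price('¥177.50') + parse_price('¥19.80'))
```

Decimal is used instead of float so that amounts are not affected by binary rounding when summed.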
    

4. Modify the settings.py file

#settings.py
BOT_NAME = 'dangdang'

SPIDER_MODULES = ['dangdang.spiders']
NEWSPIDER_MODULE = 'dangdang.spiders'

USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'

ROBOTSTXT_OBEY = False

5. In the python\dangdang directory, run the command scrapy crawl dangdang

2019-08-23 23:04:03 [scrapy.utils.log] INFO: Scrapy 1.7.3 started (bot: dangdang)
2019-08-23 23:04:03 [scrapy.utils.log] INFO: Versions: lxml 4.4.1.0, libxml2 2.9.9, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 19.7.0, Python 3.7.4 (v3.7.4:e09359112e, Jul  8 2019, 14:54:52) - [Clang 6.0 (clang-600.0.57)], pyOpenSSL 19.0.0 (OpenSSL 1.1.1c  28 May 2019), cryptography 2.7, Platform Darwin-18.7.0-x86_64-i386-64bit
2019-08-23 23:04:03 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'dangdang', 'NEWSPIDER_MODULE': 'dangdang.spiders', 'SPIDER_MODULES': ['dangdang.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
2019-08-23 23:04:03 [scrapy.extensions.telnet] INFO: Telnet Password: fb325988170a6ad1
2019-08-23 23:04:03 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2019-08-23 23:04:03 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2019-08-23 23:04:03 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2019-08-23 23:04:03 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2019-08-23 23:04:03 [scrapy.core.engine] INFO: Spider opened
2019-08-23 23:04:03 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2019-08-23 23:04:03 [py.warnings] WARNING: /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/spidermiddlewares/offsite.py:61: URLWarning: allowed_domains accepts only domains, not URLs. Ignoring URL entry http://bang.dangdang.com in allowed_domains.
  warnings.warn(message, URLWarning)

2019-08-23 23:04:03 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2019-08-23 23:04:03 [scrapy.core.engine] DEBUG: Crawled (200)  (referer: None)
2019-08-23 23:04:03 [scrapy.core.engine] DEBUG: Crawled (200)  (referer: None)
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '东野圭吾', 'name': '东野圭吾:白夜行(易烊千玺、邓伦推荐,东野圭吾作品无冕之王)', 'price': '¥41.10'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '安托万·德·圣埃克苏佩里',
 'name': '小王子(畅销300万册,作者基金会官方认证简体中文版)【果麦经典】',
 'price': '¥19.80'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '刘慈欣', 'name': '三体:全三册 刘慈欣代表作,亚洲首部“雨果奖”获奖作品!', 'price': '¥46.50'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '钱钟书', 'name': '围城', 'price': '¥27.30'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '黄仁宇', 'name': '万历十五年 一本好书 腾讯视频栏目推荐', 'price': '¥16.10'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '加·泽文',
 'name': '岛上书店(每个人的生命中,都有无比艰难的那一年,将人生变得美好而辽阔。加·泽文感动全球千万读者的治愈小说!)',
 'price': '¥17.50'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '贾平凹',
 'name': '自在独行 贾平凹的独行世界(畅销300万册的国民精神读本,中国作家协会推荐精读)',
 'price': '¥19.50'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '姜自霞', 'name': '魔法拼音国(套装 共7册)', 'price': '¥49.00'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '余华', 'name': '活着(2017年新版)', 'price': '¥14.00'}
2019-08-23 23:04:03 [scrapy.core.engine] DEBUG: Crawled (200)  (referer: None)
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '张嘉佳', 'name': '云边有个小卖部', 'price': '¥21.00'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '路遥', 'name': '平凡的世界:全三册(朱一龙推荐,八年级下册自主阅读推荐)', 'price': '¥74.50'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '戴尔·卡耐基', 'name': '人性的弱点(薛之谦推荐,畅销100万册)【果麦经典】', 'price': '¥16.00'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '大冰', 'name': '我不(大冰作品。十个月狂销200万册,不容错过的奇书!)', 'price': '¥19.50'}
2019-08-23 23:04:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '毛姆',
 'name': '月亮和六便士(全新导读无删节详注版! 半年创当当110000名读者五星好评奇迹)看“一本好书”,在当当享阅读之趣',
 'price': '¥12.90'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '陈磊',
 'name': '半小时漫画中国史(修订版)(看半小时漫画,通五千年历史!《半小时漫画中国史》系列开篇之作)团购电话4001066666转6',
 'price': '¥19.90'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '怀特', 'name': '夏洛的网(新)', 'price': '¥19.50'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '克莱儿·麦克福尔',
 'name': '摆渡人2:重返荒原(系列畅销千万册。每一个镌刻着爱与善意的灵魂,都会成为我们生命中的摆渡人!《摆渡人》完结篇即将上市!)',
 'price': '¥21.40'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '东野圭吾', 'name': '东野圭吾:恶意(2016版,东野圭吾四大杰作之一)', 'price': '¥27.30'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '哥伦', 'name': '霍乱时期的爱情(2015版)  一本好书 腾讯视频栏目推荐', 'price': '¥34.20'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '埃德加·斯诺', 'name': '红星照耀中国(青少版)人民文学出版社', 'price': '¥28.10'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'author': '蔡崇达',
 'name': '皮囊(畅销300万册的国民读本,刘德华、李敬泽作序。繁体版面世即进入台湾诚品、博客来畅销榜单)',
 'price': '¥29.80'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '周国平', 'name': '我喜欢生命本来的样子(周国平经典散文作品集)', 'price': '¥21.10'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '乔安娜柯尔', 'name': '神奇校车·桥梁书版(全20册)', 'price': '¥75.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '郑利强', 'name': '我的第一本地理启蒙书', 'price': '¥24.90'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '太宰治', 'name': '人间失格(日本小说家太宰治的自传体小说,李现推荐)', 'price': '¥12.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '阿兰德·丹姆', 'name': '小熊和最好的爸爸(全7册)', 'price': '¥17.50'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '戴维·伽特森', 'name': '雪落香杉树 (福克纳奖得主,全球畅销500万册)', 'price': '¥23.90'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '张嘉骅', 'name': '少年读史记(套装全5册)', 'price': '¥50.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '乔安娜柯尔', 'name': '神奇校车·图画书版(全12册,新增《科学博览会》1册)', 'price': '¥99.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '大冰',
 'name': '你坏(大冰作品。一年销量突破280万册。用《你坏》向你说声:你好!)',
 'price': '¥19.80'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '东野圭吾',
 'name': '东野圭吾:解忧杂货店(胡歌、王俊凯、刘昊然倾情推荐,东野圭吾长篇小说代表作,这家店帮你找回内心流失的东西)',
 'price': '¥27.30'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '沈复',
 'name': '浮生六记(李现推荐版本,畅销300万册,1041725条当当读者五星好评。沈复给芸娘的绝美情书)【果麦经典】',
 'price': '¥14.60'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '简·尼尔森', 'name': '正面管教(修订版)', 'price': '¥19.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '陈卫平', 'name': '写给儿童的中国历史(全14册)', 'price': '¥177.50'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '毛姆',
 'name': '月亮与六便士(新版未删节!2018当当名著桂冠!2017豆瓣阅读桂冠!上海国际学校指定必读译本!)作家榜经典文库',
 'price': '¥19.90'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '克莱儿·麦克福尔',
 'name': '摆渡人(系列畅销千万册。如果命运是一条孤独的河流,谁会是你灵魂的摆渡人?《摆渡人》完结篇即将上市!)',
 'price': '¥18.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '加西亚·马尔克斯', 'name': '马尔克斯:百年孤独(50周年纪念版)', 'price': '¥41.30'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '卡勒德·胡赛尼', 'name': '追风筝的人(2018年新版)', 'price': '¥21.60'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '佐佐木圭一', 'name': '所谓情商高,就是会说话', 'price': '¥16.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'author': '李思圆', 'name': '生活需要仪式感 (把温暖和感动带给你在乎的人)', 'price': '¥18.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '海明威',
 'name': '诺奖少年版(全30册) 33万五星好评,日销ZUI高达50000册,系列销量突破4200000册,全国多所小学暑期阅读书目)',
 'price': '¥205.90'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '山下英子', 'name': '断舍离', 'price': '¥16.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '杨绛', 'name': '我们仨', 'price': '¥11.50'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '老杨的猫头鹰',
 'name': '好看的皮囊千篇一律,有趣的灵魂万里挑一(老杨的猫头鹰最新作品“醒脑之书”系列之三)',
 'price': '¥19.90'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '玛兹丽施', 'name': '如何说孩子才会听 怎么听孩子才肯说(全新修订版)', 'price': '¥18.40'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '王小波', 'name': '一只特立独行的猪', 'price': '¥22.80'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '莱曼·弗兰克·鲍姆',
 'name': '百年童话绘本·典藏版(全套30册)当当2018年度常青藤畅销书奖,台湾企鹅金牌畅销书,历时5年匠心绘制,上千张手绘插画美翻了!',
 'price': '¥236.80'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '杨绛', 'name': '我们仨(新版)', 'price': '¥14.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '莳田晋至', 'name': '在教室说错了没关系', 'price': '¥18.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '高春香',
 'name': '这就是二十四节气(中国二十四节气彩绘版,文津图书奖获奖绘本,共4册)',
 'price': '¥50.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '慕颜歌', 'name': '你的善良必须有点锋芒', 'price': '¥16.40'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '余华', 'name': '许三观卖血记(新版)', 'price': '¥16.00'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '陈磊',
 'name': '半小时漫画世界史(看半小时漫画,通五千年历史!其实是一本严谨的极简世界史!)',
 'price': '¥19.20'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '高铭', 'name': '天才在左 疯子在右(2018全新完整版)', 'price': '¥34.40'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '稻盛和夫',
 'name': '阿米巴经营——畅销十周年纪念版,当当全国独家(团购,请致电400-106-6666转6)',
 'price': '¥17.50'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '东野圭吾',
 'name': '东野圭吾:嫌疑人X的献身(王凯、张鲁一推荐,至为纯粹的爱情,绝好的诡计)',
 'price': '¥26.30'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '曹文轩', 'name': '曹文轩文集典藏版(全7册)', 'price': '¥65.50'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '史蒂芬·霍金', 'name': '时间简史(插图本)(央视《朗读者》推荐)', 'price': '¥32.40'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '安东尼·布朗', 'name': '我爸爸+我妈妈(全2册)', 'price': '¥48.60'}
2019-08-23 23:04:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'author': '陈卫平', 'name': '写给儿童的中国地理(全14册)', 'price': '¥196.00'}
2019-08-23 23:04:04 [scrapy.core.engine] INFO: Closing spider (finished)
2019-08-23 23:04:04 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 1026,
 'downloader/request_count': 3,
 'downloader/request_method_count/GET': 3,
 'downloader/response_bytes': 360462,
 'downloader/response_count': 3,
 'downloader/response_status_count/200': 3,
 'elapsed_time_seconds': 0.871348,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2019, 8, 23, 15, 4, 4, 218998),
 'item_scraped_count': 60,
 'log_count/DEBUG': 63,
 'log_count/INFO': 10,
 'log_count/WARNING': 1,
 'memusage/max': 54800384,
 'memusage/startup': 54796288,
 'response_received_count': 3,
 'scheduler/dequeued': 3,
 'scheduler/dequeued/memory': 3,
 'scheduler/enqueued': 3,
 'scheduler/enqueued/memory': 3,
 'start_time': datetime.datetime(2019, 8, 23, 15, 4, 3, 347650)}
2019-08-23 23:04:04 [scrapy.core.engine] INFO: Spider closed (finished)


Alternatively:
5. Create a main.py file in the same directory as scrapy.cfg

from scrapy import cmdline
cmdline.execute(['scrapy','crawl','dangdang'])

6. Run the main.py file

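The code still has an unresolved problem, and I don't know how to handle it.
I don't know why the crawl runs from the terminal, yet running main.py reports this error: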

Scrapy 1.7.3 - no active project

Unknown command: crawl

Use "scrapy" to see available commands
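
One likely cause, though not confirmed for this setup: Scrapy decides whether it is inside a project by looking for scrapy.cfg starting from the current working directory and moving upward. If main.py is launched with a different working directory (for example by an IDE), no scrapy.cfg is found, and project-only commands such as crawl become unavailable. The sketch below locates the project directory first and switches into it before calling cmdline.execute. The helper names find_project_root and run_spider are invented for this illustration.

```python
import os


def find_project_root(start_dir):
    """Walk upward from start_dir; return the first directory containing scrapy.cfg, or None."""
    current = os.path.abspath(start_dir)
    while True:
        if os.path.isfile(os.path.join(current, 'scrapy.cfg')):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without finding it
            return None
        current = parent


def run_spider(spider_name, start_dir):
    """Change into the project directory, then run the same command as in the terminal."""
    # Imported here so that find_project_root can be used without Scrapy installed.
    from scrapy import cmdline

    root = find_project_root(start_dir)
    if root is None:
        raise RuntimeError('scrapy.cfg not found above ' + start_dir)
    os.chdir(root)
    cmdline.execute(['scrapy', 'crawl', spider_name])


# In main.py one would then call, for example:
# run_spider('dangdang', os.path.dirname(os.path.abspath(__file__)))
```

If the error persists after this change, the working directory is probably not the cause, and it is worth checking that main.py really sits next to scrapy.cfg.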
