Problems encountered while crawling Zhihu------------4. An error hit when using redis: redis.exceptions.ResponseError

This error appeared while crawling Zhihu with scrapy-redis, once the amount of data stored in Redis grew large.
Reference for the fix: https://blog.csdn.net/song19890528/article/details/38536871
For a setup like this, a Redis cluster is really the better option; see Cui Qingcai's blog post: https://cuiqingcai.com/6058.html
2019-01-31 01:11:46 [twisted] CRITICAL: Unhandled error in Deferred:
2019-01-31 01:11:46 [twisted] CRITICAL: 
Traceback (most recent call last):
  File "D:\Python3\lib\site-packages\twisted\internet\task.py", line 517, in _oneWorkUnit
    result = next(self._iterator)
  File "D:\Python3\lib\site-packages\scrapy\utils\defer.py", line 63, in 
    work = (callable(elem, *args, **named) for elem in iterable)
  File "D:\Python3\lib\site-packages\scrapy\core\scraper.py", line 183, in _process_spidermw_output
    self.crawler.engine.crawl(request=output, spider=spider)
  File "D:\Python3\lib\site-packages\scrapy\core\engine.py", line 210, in crawl
    self.schedule(request, spider)
  File "D:\Python3\lib\site-packages\scrapy\core\engine.py", line 216, in schedule
    if not self.slot.scheduler.enqueue_request(request):
  File "D:\Python3\lib\site-packages\scrapy_redis\scheduler.py", line 162, in enqueue_request
    if not request.dont_filter and self.df.request_seen(request):
  File "D:\Python3\lib\site-packages\scrapy_redis\dupefilter.py", line 100, in request_seen
    added = self.server.sadd(self.key, fp)
  File "D:\Python3\lib\site-packages\redis\client.py", line 1821, in sadd
    return self.execute_command('SADD', name, *values)
  File "D:\Python3\lib\site-packages\redis\client.py", line 755, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "D:\Python3\lib\site-packages\redis\client.py", line 768, in parse_response
    response = connection.read_response()
  File "D:\Python3\lib\site-packages\redis\connection.py", line 638, in read_response
    raise response
redis.exceptions.ResponseError: MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
2019-01-31 01:11:46 [scrapy.core.scraper] ERROR: Spider error processing  (referer: https://zhihu.com/people/shui-shuo-cheng-xu-yuan-bu-hui-xie-wen-zhang/following)
Traceback (most recent call last):
  File "D:\Python3\lib\site-packages\scrapy\utils\defer.py", line 102, in iter_errback
    yield next(it)
GeneratorExit

Roughly translated, the error means: (MISCONF) Redis is configured to save RDB snapshots, but it is currently unable to persist to disk. Commands that may modify the data set are therefore disabled. Check the Redis logs for details about the error.

This is caused by a Redis snapshot (bgsave) being forcibly stopped: Redis can no longer persist to disk, so it starts rejecting write commands. Run the info command to check the status of the Redis snapshot, as follows:
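The original post does not include the info output here. As a sketch, on a machine with redis-cli available, the persistence section can be inspected like this (the field name below comes from the INFO persistence output of a live Redis instance, so this requires a running server):

```shell
# Inspect snapshot status; after a failed snapshot, an affected server
# typically reports rdb_last_bgsave_status:err instead of ok
redis-cli info Persistence | grep bgsave
```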

The solution is as follows:

Run the command: config set stop-writes-on-bgsave-error no

Note that this only tells Redis to keep accepting writes despite the failed snapshot; the underlying cause of the bgsave failure (for example insufficient disk space or memory) should still be investigated in the Redis logs.
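The same fix expressed as redis-cli commands (a config fragment, assuming a default local Redis instance; a CONFIG SET change does not survive a restart, so the permanent form goes into redis.conf):

```shell
# Temporary: allow writes again even though the last bgsave failed
redis-cli config set stop-writes-on-bgsave-error no

# Permanent alternative: add this line to redis.conf and restart Redis
# stop-writes-on-bgsave-error no
```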

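Until the server-side persistence problem is fixed, every write from the spider will crash like the traceback above. As a defensive sketch (this helper is not part of scrapy-redis; its name and retry policy are my own), a write can be retried with a short delay whenever the error message carries the MISCONF prefix, giving an operator time to apply the config fix:

```python
import time

MISCONF_PREFIX = "MISCONF"

def retry_on_misconf(op, retries=3, delay=1.0):
    """Run op(); if it raises an error whose message starts with
    'MISCONF' (Redis refusing writes after a failed snapshot),
    sleep and retry. Any other error, or the final failed attempt,
    is re-raised unchanged."""
    for attempt in range(retries):
        try:
            return op()
        except Exception as exc:
            if not str(exc).startswith(MISCONF_PREFIX) or attempt == retries - 1:
                raise
            time.sleep(delay)

# Simulated example: an operation that fails twice with MISCONF and then
# succeeds, standing in for a call like server.sadd(key, fp) in the
# scrapy-redis dupefilter.
attempts = {"n": 0}

def flaky_sadd():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise Exception("MISCONF Redis is configured to save RDB snapshots...")
    return 1

result = retry_on_misconf(flaky_sadd, retries=5, delay=0.01)
print(result)         # 1
print(attempts["n"])  # 3
```

This only buys time; if Redis keeps rejecting writes, the retries are exhausted and the original error still surfaces, which is the desired behavior.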