Problems encountered while crawling with Scrapy: Crawled (404) and [scrapy.spidermiddlewares.offsite] DEBUG: Filtered offsite request to

1. Error 1: the URL is wrong

Crawled (200) (referer: None)
DEBUG: Crawled (404) (referer: None)

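Fix: copy the complete URL exactly as it appears in the browser. The original URL had a stray trailing slash after teacher.shtml, and that request came back as 404.
start_urls = ['http://www.itcast.cn/channel/teacher.shtml/']  # original URL (returned 404)
start_urls = ['http://www.itcast.cn/channel/teacher.shtml#ajavaee']  # URL after the fix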
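The difference between the two URLs can be checked with the standard library. This is only a minimal sketch: the helper name url_sent_to_server is made up for illustration, and it relies on the general fact that the #fragment part of a URL is not sent to the server. So the change that likely matters here is the removed trailing slash, not the added #ajavaee.

```python
from urllib.parse import urldefrag


def url_sent_to_server(url: str) -> str:
    """Return the URL without its #fragment (hypothetical helper).

    HTTP clients do not transmit the fragment, so this is the part
    of the URL the server actually sees.
    """
    return urldefrag(url).url


before = "http://www.itcast.cn/channel/teacher.shtml/"
after = "http://www.itcast.cn/channel/teacher.shtml#ajavaee"

# The fixed URL loses its fragment; the broken one keeps its trailing slash.
print(url_sent_to_server(before))
print(url_sent_to_server(after))
```

If the two printed URLs differ only by a trailing slash, that slash is the first thing to compare against the address bar.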
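2. Error 2: [scrapy.spidermiddlewares.offsite] DEBUG: Filtered offsite request to
Fix: pass dont_filter=True to the request. The offsite middleware drops requests whose host is not covered by the spider's allowed_domains, and dont_filter=True lets the request through anyway. Note that it also disables the duplicate filter for that request. Adding the target domain to allowed_domains is an alternative.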
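# Inside a spider callback; needs `import scrapy` and `from copy import deepcopy`.
yield scrapy.Request(
    item["s_href"],
    callback=self.parse_book_list,
    meta={"item": deepcopy(item)},  # copy so later changes to item do not leak into this request
    dont_filter=True,  # bypass the offsite filter (and the duplicate filter)
)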
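To see why a request gets filtered in the first place, the sketch below approximates the offsite check in plain Python. It is a simplified illustration, not Scrapy's actual code: the function name is_offsite and the domains example.com and books.example.org are hypothetical. It assumes the usual rule that a host is allowed when it equals an entry in allowed_domains or is a subdomain of one, and that an empty allowed_domains list allows everything.

```python
from urllib.parse import urlparse


def is_offsite(url: str, allowed_domains: list[str]) -> bool:
    """Approximate the offsite check (hypothetical helper, not Scrapy's API)."""
    if not allowed_domains:
        # No restriction configured: nothing is treated as offsite.
        return False
    host = urlparse(url).hostname or ""
    # Allowed if the host is a listed domain or a subdomain of one.
    return not any(
        host == domain or host.endswith("." + domain)
        for domain in allowed_domains
    )


allowed = ["example.com"]  # hypothetical allowed_domains value
print(is_offsite("http://www.example.com/page", allowed))
print(is_offsite("http://books.example.org/list", allowed))
```

If the URL in item["s_href"] turns out to be offsite by this rule, extending allowed_domains with its domain is usually a narrower fix than dont_filter=True, since the duplicate filter stays active.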
