scrapy.crawler.CrawlerProcess

https://doc.scrapy.org/en/latest/topics/api.html#crawler-api

| Method | Description |
| --- | --- |
| `crawl(crawler_or_spidercls, *args, **kwargs)` | Starts a crawler with the given arguments. |
| `crawlers` | The set of crawlers that have been added via `crawl()`. |
| `create_crawler(crawler_or_spidercls)` | Creates a Crawler object. |
| `join()` | Returns a deferred that is fired when all managed crawlers have completed their executions. |
| `start(stop_after_crawl=True)` | Starts the Twisted reactor; if `stop_after_crawl` is True, the reactor is stopped once all crawlers have finished. |
| `stop()` | Stops all the crawls currently in progress. |

A minimal usage example, running one of the project's spiders from a standalone script:
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())

# 'followall' is the name of one of the spiders of the project.
process.crawl('followall', domain='scrapinghub.com')
process.start() # the script will block here until the crawling is finished
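
The example above only exercises crawl() and start(). The sketch below is a minimal illustration of the remaining methods from the table, assuming a project that defines two spiders; the names 'spider_one' and 'spider_two' are hypothetical placeholders. It builds one crawler explicitly with create_crawler(), schedules both spiders, inspects crawlers, and attaches a callback to the deferred returned by join().

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())

# create_crawler() builds a Crawler without starting it;
# crawl() accepts a spider name, a Spider subclass, or a Crawler object.
crawler = process.create_crawler('spider_one')   # hypothetical spider name
process.crawl(crawler)
process.crawl('spider_two', category='books')    # extra kwargs are passed to the spider

print(len(process.crawlers))  # crawlers added via crawl() so far: 2

# join() returns a deferred that fires once all managed crawlers have finished
process.join().addCallback(lambda _: print('all crawls finished'))

# start() runs the Twisted reactor; with stop_after_crawl=True (the default)
# the reactor stops automatically after all crawls complete
process.start()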
