WebCollector: Resumable Crawling

Reposted from:
http://datahref.com/archives/200

// Enable resumable mode before starting, so that existing crawl data is kept.
crawler.setResumable(true);
crawler.start(xxx);

Note that if you invoke the Crawler.start(int round) method in non-resumable mode, all of your history data will be deleted. Keep your crawler in resumable mode at all times if you don't want to lose your history data.

Resumable mode is not applicable to RamCrawler.

Make sure your crawler uses the same crawlpath as the previous crawling task.
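
As a rough safeguard against the data loss described above, one could check whether the crawlpath already holds data before deciding how to start the crawler. The sketch below is not part of WebCollector: the class CrawlPathCheck, the method hasHistory, and the default directory name "crawl" are all hypothetical, and treating "directory exists and is not empty" as "history exists" is an assumption about how the crawl data is stored on disk.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not a WebCollector class.
final class CrawlPathCheck {

    // Returns true if crawlPath is an existing directory with at least one entry.
    // Assumption: a non-empty crawlpath directory means earlier crawl data is present.
    public static boolean hasHistory(Path crawlPath) throws IOException {
        if (!Files.isDirectory(crawlPath)) {
            return false;
        }
        // Files.list must be closed, hence the try-with-resources.
        try (Stream<Path> entries = Files.list(crawlPath)) {
            return entries.findAny().isPresent();
        }
    }

    public static void main(String[] args) throws IOException {
        // "crawl" is a placeholder; pass the crawlpath used by the previous task.
        Path crawlPath = Paths.get(args.length > 0 ? args[0] : "crawl");
        boolean resumable = hasHistory(crawlPath);
        System.out.println("history found: " + resumable);
        // With WebCollector on the classpath, the flag would then be passed on:
        // crawler.setResumable(resumable);
        // crawler.start(xxx);
    }
}
```

If the check finds existing data, starting in resumable mode keeps it. Simply calling setResumable(true) every time, as recommended above, remains the simpler option.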
