org.archive.crawler.Heritrix

1、Ensure Java 1.6+ is in use, so that startup does not fail later with a cryptic error.
2、Set some system properties early:
ignoredSchemes, maxFormSize
3、Parse the command-line options.
4、Set the defaults, which apply until changed by command-line options:
authLogin, authPassword, jobsDir, properties, bindHosts, port, SSL options
6、Set the timezone.
7、Start Heritrix.
7.1、Create the engine.
7.2、Start the Restlet component.
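
Steps 1, 3, 4 and 6 above can be sketched roughly as follows. This is not the actual Heritrix source, only a minimal illustration of the same startup pattern. The class name StartupSketch, the "--name value" flag format, the default values and the choice of GMT are all assumptions made for the example. Steps 2 and 7 are left out because they depend on Heritrix internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical sketch of a launcher; not the real org.archive.crawler.Heritrix class.
public class StartupSketch {

    /** Returns true if a java.version string ("1.6.0_45", "17.0.2", ...) is 1.6 or newer. */
    public static boolean isJavaAtLeast16(String version) {
        String[] parts = version.split("[._-]");
        int major = Integer.parseInt(parts[0]);
        if (major > 1) {
            return true; // newer scheme: "9", "11.0.2", "17.0.2"
        }
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return major == 1 && minor >= 6; // older scheme: "1.6.0_45"
    }

    /** Starts from defaults, then lets "--name value" pairs override them. */
    public static Map<String, String> resolveOptions(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        // Placeholder defaults, chosen only for illustration.
        options.put("jobsDir", "jobs");
        options.put("bindHosts", "localhost");
        options.put("port", "8443");
        for (int i = 0; i + 1 < args.length; i += 2) {
            if (args[i].startsWith("--")) {
                options.put(args[i].substring(2), args[i + 1]);
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Step 1: fail early with a clear message instead of a cryptic error later.
        if (!isJavaAtLeast16(System.getProperty("java.version"))) {
            System.err.println("Java 1.6 or later is required.");
            return;
        }
        // Steps 3 and 4: defaults first, command-line values on top.
        Map<String, String> options = resolveOptions(args);
        // Step 6: fix the JVM-wide timezone (GMT is an assumption here).
        TimeZone.setDefault(TimeZone.getTimeZone("GMT"));
        System.out.println("Would start with options: " + options);
    }
}
```

Running the version check before anything else means an unsupported JVM is reported in plain words. Applying the defaults first and the command-line values second keeps the override order in one place.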
