failed with: java.lang.NullPointerException

To avoid the error above, set the following properties in the Nutch configuration file 'conf/nutch-site.xml'.

crawl-urlfilter.txt must also be configured to match the domains listed in urls/url.txt.
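As a rough sketch of what that filter entry might look like, assuming urls/url.txt contains a single seed such as http://www.example.com/ (a hypothetical domain used only for illustration, not taken from the original setup), crawl-urlfilter.txt could include:

```
# Accept URLs on example.com and its subdomains.
# example.com is a placeholder; replace it with the domain from urls/url.txt.
+^http://([a-z0-9]*\.)*example.com/

# Skip everything else.
-.
```

Rules are evaluated top to bottom and the first matching pattern wins, so the accept line has to come before the final catch-all reject.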


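<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>
  <name>http.agent.name</name>
  <value>MySearch</value>
  <description>My Search Engine</description>
</property>

<property>
  <name>http.agent.description</name>
  <value></value>
  <description>Further description of our bot- this text is used in
  the User-Agent header. It appears in parenthesis after the agent name.
  </description>
</property>

<property>
  <name>http.agent.url</name>
  <value></value>
  <description>A URL to advertise in the User-Agent header. This will
  appear in parenthesis after the agent name. Custom dictates that this
  should be a URL of a page explaining the purpose and behavior of this
  crawler.
  </description>
</property>

<property>
  <name>http.agent.email</name>
  <value></value>
  <description>An email address to advertise in the HTTP 'From' request
  header and User-Agent header. A good practice is to mangle this
  address (e.g. 'info at example dot com') to avoid spamming.
  </description>
</property>

</configuration>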