boilerpipe (Boilerplate Removal and Fulltext Extraction from HTML pages): Source Code Analysis
boilerpipe (1.1.0) is an open-source Java module: http://code.google.com/p/boilerpipe/
Usage example:
URL url = new URL("http://www.example.com/some-location/index.html");
// NOTE: Use ArticleExtractor unless DefaultExtractor gives better results fo
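To give a feel for what "boilerplate removal" means before diving into the source, here is a toy sketch in Java. It is not boilerpipe's code or API. The class BoilerplateSketch, the Block type, and both thresholds are hypothetical names and values chosen for illustration. It only shows the general idea of keeping text blocks that are long and contain few link words, and it assumes the page has already been split into blocks.

```java
import java.util.ArrayList;
import java.util.List;

public class BoilerplateSketch {
    // Hypothetical thresholds, picked only for this illustration.
    static final int MIN_WORDS = 10;
    static final double MAX_LINK_DENSITY = 0.33;

    /** A block of page text plus the number of its words that sit inside links. */
    public static final class Block {
        public final String text;
        public final int linkedWords;

        public Block(String text, int linkedWords) {
            this.text = text;
            this.linkedWords = linkedWords;
        }

        /** Number of whitespace-separated words in the block. */
        public int wordCount() {
            String t = text.trim();
            return t.isEmpty() ? 0 : t.split("\\s+").length;
        }

        /** Fraction of words that are link text; 0 for an empty block. */
        public double linkDensity() {
            int n = wordCount();
            return n == 0 ? 0.0 : (double) linkedWords / n;
        }
    }

    /** Treat a block as content if it is long enough and not dominated by links. */
    public static boolean isContent(Block b) {
        return b.wordCount() >= MIN_WORDS && b.linkDensity() <= MAX_LINK_DENSITY;
    }

    /** Join the text of all blocks classified as content, one block per line. */
    public static String extract(List<Block> blocks) {
        List<String> kept = new ArrayList<>();
        for (Block b : blocks) {
            if (isContent(b)) {
                kept.add(b.text);
            }
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Home About Contact Login", 4)); // navigation bar
        blocks.add(new Block("Boilerpipe extracts the main article text from a web page "
                + "and discards navigation menus and footers.", 0)); // article body
        blocks.add(new Block("Copyright Example Inc. All rights reserved", 0)); // footer
        System.out.println(extract(blocks));
    }
}
```

Running main prints only the article sentence: the navigation block is short and entirely link text, and the footer is too short. Real pages need a proper HTML parser and more careful features, which is what the library itself provides.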