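Requirement: boredom; this is just for fun.
Function: given a blog address, visit a random post on it to increase that post's view count.
Implementation: 1. Simulate the visit with HttpClient.
2. Given the blog's post-list URL, parse the HTML nodes with the HTMLParser library to collect every post URL, then visit the posts at random.
3. Use Thread.sleep to visit post URLs at a fixed interval.
4. Package the program as a Windows service with Java Service Wrapper so that it starts and runs automatically every day (optional).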
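The random selection in step 2 can be sketched with the JDK alone. This is a minimal illustration, not the post's actual class: RandomPostPicker, pick, and the example.com URLs are hypothetical names chosen for the sketch. The point it shows is that Random.nextInt(bound) returns a value from 0 up to but not including bound, so the full list size must be passed for the last post to be reachable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration; not part of the original program.
public class RandomPostPicker {

    /** Returns a random element of urls; every index, including the last, can be chosen. */
    public static String pick(List<String> urls, Random rng) {
        if (urls.isEmpty()) {
            throw new IllegalArgumentException("no post URLs to choose from");
        }
        // nextInt(bound) yields a value in [0, bound), so pass the full size.
        return urls.get(rng.nextInt(urls.size()));
    }

    public static void main(String[] args) {
        // Placeholder URLs, assumed for the example only.
        List<String> urls = Arrays.asList(
                "http://blog.example.com/post/1",
                "http://blog.example.com/post/2",
                "http://blog.example.com/post/3");
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(pick(urls, rng));
        }
    }
}
```

Passing size - 1 instead would never select the last post, and would throw IllegalArgumentException for a list with a single entry, because nextInt requires a positive bound.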
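Preparation:
The following libraries are needed (the code imports Apache HttpClient and HTMLParser); trim them to fit your setup.
Full code: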
package org.csdn.service;

import java.io.IOException;
import java.util.Random;

import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.CookieStore;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.client.protocol.ClientContext;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.http.protocol.HttpContext;
import org.apache.http.util.EntityUtils;
import org.htmlparser.Parser;
import org.htmlparser.filters.AndFilter;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.tags.LinkTag;
import org.htmlparser.tags.Span;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

public class RefreshCSDN extends Thread {

    private CookieStore cookieStore = new BasicCookieStore();
    private String blogURL = "";
    // URLs of all posts on the blog.
    private static String[] links = null;
    // Counter for successful visits.
    private int c = 1;

    public RefreshCSDN(String blogURL) throws Exception {
        this.blogURL = blogURL;
        this.getCSDNBlogList();
    }

    public void run() {
        while (true) {
            try {
                // Pick a random post from the list. nextInt(n) excludes n,
                // so pass the full length to make the last post reachable.
                String url = links[new Random().nextInt(links.length)];
                System.out.println(url);
                this.refreshCSDN(url);
                // Visit once every 30 seconds.
                Thread.sleep(30 * 1000);
            } catch (Exception e) {
                e.printStackTrace();
                // The pause below is 2 hours, so the message says so.
                System.out.println("Error; pausing for 2 hours before continuing.");
                try {
                    Thread.sleep(2 * 60 * 60 * 1000);
                } catch (InterruptedException e1) {
                    e1.printStackTrace();
                }
            }
        }
    }

    // Send the simulated request.
    @SuppressWarnings("deprecation")
    private HttpResponse execute(String url) throws ClientProtocolException, IOException {
        BasicHttpParams headerParams = new BasicHttpParams();
        headerParams.setParameter(ClientPNames.HANDLE_REDIRECTS, Boolean.TRUE);
        HttpClient httpClient = new DefaultHttpClient(new ThreadSafeClientConnManager());
        HttpGet httpGet = new HttpGet(url);
        httpGet.setParams(headerParams);
        // The Host header must be set, otherwise CSDN returns 403 Forbidden.
        // It is set as a request header here; a request parameter named "Host" is not sent.
        httpGet.setHeader("Host", "blog.csdn.net");
        httpGet.setHeader("User-Agent",
                "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0");
        HttpContext localContext = new BasicHttpContext();
        localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
        HttpResponse response = httpClient.execute(httpGet, localContext);
        System.out.println("Request URL:" + url);
        // Response status.
        System.out.println("[Status:" + response.getStatusLine().getStatusCode() + "]");
        return response;
    }

    public void refreshCSDN(String url) throws Exception {
        HttpResponse response = execute(url);
        int statusCode = response.getStatusLine().getStatusCode();
        // Consume the body so the underlying connection is released.
        EntityUtils.consume(response.getEntity());
        if (statusCode == 200) {
            System.out.println("Refresh succeeded: " + (c++) + " time(s).");
        }
        System.out.println("" + statusCode);
    }

    // Collect the URLs of all posts.
    private void getCSDNBlogList() throws Exception {
        HttpResponse response = execute(blogURL);
        String content = EntityUtils.toString(response.getEntity());
        NodeList nodeList = this.getNodeByClass(content, "span", "link_title");
        links = new String[nodeList.size()];
        for (int i = 0; i < nodeList.size(); i++) {
            Span span = (Span) nodeList.elementAt(i);
            LinkTag link = (LinkTag) span.getChildren().elementAt(0);
            links[i] = "http://blog.csdn.net" + link.getAttribute("href");
        }
    }

    // Find the HTML nodes with the given tag name and class.
    private NodeList getNodeByClass(String content, String tag, String className) {
        Parser parser = Parser.createParser(content, "utf-8");
        AndFilter filter = new AndFilter(new TagNameFilter(tag),
                new HasAttributeFilter("class", className));
        try {
            return parser.parse(filter);
        } catch (ParserException e) {
            e.printStackTrace();
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Set the URL of the blog's post list.
        RefreshCSDN csdn = new RefreshCSDN(
                "http://blog.csdn.net/programmer_sir?viewmode=contents");
        csdn.start();
        // A Thread object can be started only once. To run more threads, create and
        // start more RefreshCSDN instances; start too many and CSDN may come after you.
    }
}
To use the code above, just run its main method.
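If you want several worker threads, keep in mind that a java.lang.Thread object can be started only once; a second start() call on the same object throws IllegalThreadStateException. A minimal JDK-only sketch of starting one Thread per worker follows. WorkerStarter, startWorkers, and the thread names are hypothetical, and the task is a stand-in counter rather than a real HTTP request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not part of the original program.
public class WorkerStarter {

    /** Creates and starts count threads that each run task, and returns them. */
    public static List<Thread> startWorkers(int count, Runnable task) {
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // One fresh Thread object per worker, since start() works only once per object.
            Thread t = new Thread(task, "refresh-worker-" + i);
            t.start();
            workers.add(t);
        }
        return workers;
    }

    public static void main(String[] args) throws InterruptedException {
        // Shared counter; AtomicInteger keeps the count correct across threads.
        AtomicInteger visits = new AtomicInteger();
        // Stand-in task: the real one would request a post URL.
        List<Thread> workers = startWorkers(2, visits::incrementAndGet);
        for (Thread t : workers) {
            t.join();
        }
        System.out.println("visits = " + visits.get());
    }
}
```

A shared AtomicInteger is used here because a plain int field incremented from several threads can lose updates.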
Output:
Request URL:http://blog.csdn.net/programmer_sir?viewmode=contents
[Status:200]
http://blog.csdn.net/programmer_sir/article/details/9009729
Request URL:http://blog.csdn.net/programmer_sir/article/details/9009729
[Status:200]
Refresh succeeded: 1 time(s).
200
http://blog.csdn.net/programmer_sir/article/details/9049005
Request URL:http://blog.csdn.net/programmer_sir/article/details/9049005
[Status:200]
Refresh succeeded: 2 time(s).
200
http://blog.csdn.net/programmer_sir/article/details/10285231
Request URL:http://blog.csdn.net/programmer_sir/article/details/10285231
[Status:200]
Refresh succeeded: 3 time(s).
200
http://blog.csdn.net/programmer_sir/article/details/25710825
Request URL:http://blog.csdn.net/programmer_sir/article/details/25710825
[Status:200]
Refresh succeeded: 4 time(s).
200
http://blog.csdn.net/programmer_sir/article/details/18351107
Request URL:http://blog.csdn.net/programmer_sir/article/details/18351107
[Status:200]
Refresh succeeded: 5 time(s).
200
http://blog.csdn.net/programmer_sir/article/details/23603881
Request URL:http://blog.csdn.net/programmer_sir/article/details/23603881
[Status:200]
Refresh succeeded: 6 time(s).
200
Notes:
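1. An interval of 30 seconds to one minute between visits is recommended; if the interval is too short, CSDN will not count the visit.
2. http://blog.csdn.net/programmer_sir?viewmode=contents is my own post-list page; replace it with the URL of your own post list.
3. The code was last updated on September 10, 2014. If CSDN later restricts access by IP, you can use a Java proxy, in which case CSDN cannot limit the traffic. For how to use a Java proxy, see one of my much earlier posts.
4. If you have written only one post, CSDN's list markup does not seem to use a SPAN tag, so you will need to modify the following method: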
getNodeByClass
5. Programmers who only read and never comment are bound to have a hard time improving. Haha.
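The interval recommended in note 1 can be sketched as a small helper that picks a delay between 30 and 60 seconds. Randomizing the delay is an assumption added for this sketch (the original code sleeps a fixed 30 seconds), and VisitDelay, nextDelayMillis, and the two constants are hypothetical names.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper for illustration; not part of the original program.
public class VisitDelay {

    static final long MIN_MILLIS = 30_000L; // 30 seconds, lower end from note 1
    static final long MAX_MILLIS = 60_000L; // one minute, upper end from note 1

    /** Returns a delay in milliseconds between MIN_MILLIS and MAX_MILLIS, inclusive. */
    public static long nextDelayMillis() {
        // nextLong(origin, bound) excludes bound, so add 1 to make MAX_MILLIS reachable.
        return ThreadLocalRandom.current().nextLong(MIN_MILLIS, MAX_MILLIS + 1);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println("next delay: " + nextDelayMillis() + " ms");
        }
    }
}
```

In the visiting loop, the value would simply be passed to Thread.sleep in place of the fixed 30 * 1000.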