Simulating Requests to CSDN to Do a Little Something

 Motivation: boredom; this is just for fun.

 Function: given a blog address, visit a randomly chosen post to increase that post's view count.

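 Implementation:

 1. Use Apache HttpClient to simulate the visits.

 2. Given the URL of the blog's post list, use the HTMLParser library to walk the HTML nodes, collect every post URL, and visit the posts at random.

 3. Use Thread.sleep to space out the visits to the post URLs.

 4. Optionally, package the program as a Windows service with Java Service Wrapper so that it starts and runs automatically every day.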
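
Steps 2 and 3 come down to picking one URL at random and waiting before the next visit. Here is a minimal sketch of just that logic, using only the JDK. The class RandomVisitPlan, its method names, and the 30 to 60 second range are assumptions made for illustration; the program in this post uses a fixed 30-second sleep.

```java
import java.util.Random;

// Hypothetical helper for illustration; not part of the original program.
class RandomVisitPlan {

    // Returns one element chosen uniformly at random.
    // Random.nextInt(n) yields a value in 0..n-1, so passing links.length
    // covers every index, including the last one.
    static String pickRandom(String[] links, Random random) {
        if (links == null || links.length == 0) {
            throw new IllegalArgumentException("no post URLs available");
        }
        return links[random.nextInt(links.length)];
    }

    // Returns a delay in milliseconds between minMillis and maxMillis, inclusive.
    static long nextDelayMillis(Random random, int minMillis, int maxMillis) {
        return minMillis + random.nextInt(maxMillis - minMillis + 1);
    }

    public static void main(String[] args) {
        String[] links = {
            "http://blog.csdn.net/programmer_sir/article/details/9009729",
            "http://blog.csdn.net/programmer_sir/article/details/9049005"
        };
        Random random = new Random();
        for (int i = 0; i < 3; i++) {
            System.out.println("Would visit: " + pickRandom(links, random));
            // A real loop would call Thread.sleep(delay) here.
            long delay = nextDelayMillis(random, 30_000, 60_000);
            System.out.println("Would sleep for " + delay + " ms");
        }
    }
}
```

Varying the delay is a design choice of this sketch, not something the original program does.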

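Preparation:

Add the required libraries to the classpath, dropping any you do not need. Judging by the imports in the code below, that means at least Apache HttpClient 4.x (with HttpCore) and HTMLParser.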

 Full code:

package org.csdn.service;

import java.io.IOException;
import java.util.Random;

import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.CookieStore;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.client.protocol.ClientContext;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.http.protocol.HttpContext;
import org.apache.http.util.EntityUtils;
import org.htmlparser.Parser;
import org.htmlparser.filters.AndFilter;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.tags.LinkTag;
import org.htmlparser.tags.Span;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

public class RefreshCSDN extends Thread{

	private CookieStore cookieStore = new BasicCookieStore();
	private String blogURL = "";
	// URLs of all posts found on the blog list page.
	private static String [] links = null;
	// Counter printed after each successful visit.
	private int c = 1;

	public RefreshCSDN(String blogURL) throws Exception {
		this.blogURL = blogURL;
		this.getCSDNBlogList();
	}

	public void run(){
		while(true){
			try {
				// Pick a random post from the list. nextInt(n) returns a value
				// in 0..n-1, so links.length covers every post.
				String url = links[new Random().nextInt(links.length)];
				System.out.println(url);
				this.refreshCSDN(url);
				// Refresh once every 30 seconds.
				Thread.sleep(30*1000);
			} catch (Exception e) {
				e.printStackTrace();
				System.out.println("Error; pausing for 10 minutes before continuing.");
				try {
					// Pause for 10 minutes.
					Thread.sleep(10*60*1000);
				} catch (InterruptedException e1) {
					e1.printStackTrace();
				}
				continue;
			}
		}
	}
	
	// Send the simulated request.
	@SuppressWarnings("deprecation")
	private HttpResponse execute(String url)
			throws ClientProtocolException, IOException {
		// CSDN returns 403 Forbidden unless the request carries a Host header.
		// HttpClient derives that header from the URL; the "Host" entry below is
		// an HttpParams parameter and does not itself add a header.
		BasicHttpParams headerParams = new BasicHttpParams();
		headerParams.setParameter(ClientPNames.HANDLE_REDIRECTS, Boolean.TRUE);
		headerParams.setParameter("Host", "blog.csdn.net");
		
		HttpClient httpClient = new DefaultHttpClient(
				new ThreadSafeClientConnManager());
		HttpGet httpGet = new HttpGet(url);
		httpGet.setParams(headerParams);
		// Present the request as coming from a desktop browser.
		httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0");
		HttpContext localContext = new BasicHttpContext();
		// Reuse one cookie store across requests.
		localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
		HttpResponse response = httpClient.execute(httpGet, localContext);
		System.out.println("Request URL:"+url);
		// Print the response status.
		System.out.println("[Status:" + response.getStatusLine().getStatusCode() + "]");
		return response;
	}

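	public void refreshCSDN(String url) throws Exception {
		HttpResponse response = execute(url);
		int statusCode = response.getStatusLine().getStatusCode();
		// Release the connection once the status has been read
		// (EntityUtils.consume requires HttpCore 4.1 or later).
		EntityUtils.consume(response.getEntity());
		if(statusCode == 200){
			// Prints e.g. "成功Refresh:1次." ("Refresh succeeded: 1 time.")
			System.out.println("成功Refresh:" + (c++) +"次.");
		}
		System.out.println(""+statusCode);
	}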

	// Collect the URLs of all posts from the blog list page.
	private void getCSDNBlogList() throws Exception {
		HttpResponse response = execute(blogURL);
		String content = EntityUtils.toString(response.getEntity());
		NodeList nodeList = this.getNodeByClass(content, "span", "link_title");
		links = new String[nodeList.size()];
		for(int i =0; i < nodeList.size();i++){
			Span span = (Span)nodeList.elementAt(i);
			LinkTag link = (LinkTag)span.getChildren().elementAt(0);
			links[i] = "http://blog.csdn.net"+link.getAttribute("href");
		}
	}

	// Return the HTML nodes that match the given tag name and class attribute.
	private NodeList getNodeByClass(String content, String tag, String className) {
		Parser parser = Parser.createParser(content, "utf-8");
		AndFilter filter = new AndFilter(new TagNameFilter(tag),
				new HasAttributeFilter("class", className));
		try {
			return parser.parse(filter);
		} catch (ParserException e) {
			e.printStackTrace();
			return null;
		}
	}

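	public static void main(String[] args) throws Exception {
		// URL of the blog's post list page.
		String listURL = "http://blog.csdn.net/programmer_sir?viewmode=contents";
		new RefreshCSDN(listURL).start();
		// More threads are possible, but each needs its own instance because a
		// Thread can be started only once. Start too many and CSDN may throttle you.
		new RefreshCSDN(listURL).start();
	}
}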
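
The comment in main mentions running several threads. A java.lang.Thread object can be started only once; a second start() on the same object throws IllegalThreadStateException, so each worker needs its own Thread instance. Here is a minimal, JDK-only sketch of that rule. The class MultiWorkerDemo and its methods are hypothetical names, and the task is a stand-in for the refresh loop.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; none of these names come from the original program.
class MultiWorkerDemo {

    // Starts `count` separate Thread objects that each run the task once,
    // waits for all of them, and returns how many runs completed.
    static int runWorkers(int count, Runnable task) {
        AtomicInteger runs = new AtomicInteger();
        Thread[] workers = new Thread[count];
        for (int i = 0; i < count; i++) {
            workers[i] = new Thread(() -> {
                task.run();
                runs.incrementAndGet();
            }, "refresh-worker-" + i);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller.
                Thread.currentThread().interrupt();
            }
        }
        return runs.get();
    }

    // Returns true if calling start() twice on one Thread is rejected.
    static boolean secondStartFails() {
        Thread thread = new Thread(() -> { });
        thread.start();
        try {
            thread.start();
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        int completed = runWorkers(2,
                () -> System.out.println(Thread.currentThread().getName() + " running"));
        System.out.println("Workers completed: " + completed);
        System.out.println("Second start() rejected: " + secondStartFails());
    }
}
```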

Simply run the main method of the code above.

Output:

Request URL:http://blog.csdn.net/programmer_sir?viewmode=contents
[Status:200]
http://blog.csdn.net/programmer_sir/article/details/9009729
Request URL:http://blog.csdn.net/programmer_sir/article/details/9009729
[Status:200]
成功Refresh:1次.
200
http://blog.csdn.net/programmer_sir/article/details/9049005
Request URL:http://blog.csdn.net/programmer_sir/article/details/9049005
[Status:200]
成功Refresh:2次.
200
http://blog.csdn.net/programmer_sir/article/details/10285231
Request URL:http://blog.csdn.net/programmer_sir/article/details/10285231
[Status:200]
成功Refresh:3次.
200
http://blog.csdn.net/programmer_sir/article/details/25710825
Request URL:http://blog.csdn.net/programmer_sir/article/details/25710825
[Status:200]
成功Refresh:4次.
200
http://blog.csdn.net/programmer_sir/article/details/18351107
Request URL:http://blog.csdn.net/programmer_sir/article/details/18351107
[Status:200]
成功Refresh:5次.
200
http://blog.csdn.net/programmer_sir/article/details/23603881
Request URL:http://blog.csdn.net/programmer_sir/article/details/23603881
[Status:200]
成功Refresh:6次.
200

Notes:

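1. Refresh once every 30 seconds to one minute. If the interval is too short, CSDN does not increase the view count.

2. http://blog.csdn.net/programmer_sir?viewmode=contents is my own blog list page; substitute the URL of your blog list.

3. The code was last updated on September 10, 2014. If CSDN later restricts access by IP address, you can send the requests through a proxy from Java, so that a limit tied to a single IP no longer stops the refreshes. For how to use a proxy in Java, see one of my much earlier posts.

4. If you have written only one post, CSDN's list markup apparently does not use a SPAN tag, so you will need to change the following method.

getNodeByClass

5. Programmers who only read and never comment are bound to find it hard to improve. Haha.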
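
For note 3, here is one way to route a request through an HTTP proxy using only the JDK's java.net.Proxy and HttpURLConnection. It swaps in the JDK classes for HttpClient and is not necessarily the approach taken in the earlier post mentioned above. The class ProxyRequestSketch, its methods, the proxy host proxy.example.com, and port 8080 are placeholders assumed for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

// Hypothetical sketch; host, port, and names are placeholders.
class ProxyRequestSketch {

    // Describes an HTTP proxy without resolving the host name yet.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(host, port));
    }

    // Prepares (but does not send) a GET request routed through the proxy.
    static HttpURLConnection prepare(String url, Proxy proxy) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection(proxy);
        conn.setRequestMethod("GET");
        conn.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0");
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        Proxy proxy = httpProxy("proxy.example.com", 8080); // placeholder proxy
        HttpURLConnection conn = prepare(
                "http://blog.csdn.net/programmer_sir?viewmode=contents", proxy);
        System.out.println("Prepared " + conn.getRequestMethod() + " " + conn.getURL());
        // Calling conn.getResponseCode() would actually send the request.
    }
}
```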
