How to parse an HTML table in Java using HttpClient

1. Open MyEclipse, create a new Java Project, and enter a project name (for example, httpClientTest).

2. Open http://hc.apache.org/downloads.cgi and download the required HttpClient JAR files.

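3. In the new project, create a lib folder and copy the downloaded JARs into it. Then right-click the project and choose Build Path > Configure Build Path > Libraries > Add JARs, and add the JARs from the lib folder.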

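4. Create the ClientTest and ClientPojo classes. Part of the code is shown further down. (Parsing the HTML requires jsoup; download it yourself and import its JAR the same way as in the previous step.)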

The test address I chose is http://www.live.chinacourt.org/fygg/index/kindid/5.shtml; change it to suit your own project.
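One setting that usually has to change together with the address is the charset used to decode the response body: the author's code passes "GBK" to EntityUtils.toString. As a standalone illustration that uses only the JDK, here is a minimal sketch of what a mismatched charset does to the text. CharsetDemo and decode are hypothetical names and are not part of the article's project.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, used only to illustrate charset handling.
class CharsetDemo {

    // Decodes raw response bytes with the named charset.
    static String decode(byte[] body, String charsetName) {
        return new String(body, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Four Chinese characters written as Unicode escapes so the source
        // file compiles the same way under any source encoding.
        String text = "\u6cd5\u9662\u516c\u544a";
        // Simulate a response body sent by a GBK-encoded page.
        byte[] body = text.getBytes(Charset.forName("GBK"));

        // Decoding with the matching charset recovers the original text.
        System.out.println(decode(body, "GBK").equals(text));
        // Decoding the same bytes as UTF-8 does not.
        System.out.println(decode(body, "UTF-8").equals(text));
    }
}
```

If a different page comes back garbled, checking its declared charset (HTTP Content-Type header or meta tag) and passing that name instead of "GBK" is probably the first thing to try.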

package com.gsoft.getnotice;

import java.io.IOException;
import java.util.ArrayList;

import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HttpClient {

    /**
     * Fetches the page at the given URL and returns the text of its table cells,
     * one inner list per table row.
     * Example URL: http://www.live.chinacourt.org/fygg/index/kindid/5.shtml
     */
    public ArrayList<ArrayList<String>> getNoticeList(String url) {
        ArrayList<ArrayList<String>> lists = new ArrayList<ArrayList<String>>();
        HttpGet httpGet = new HttpGet(url);

        // try-with-resources closes both the response and the client
        try (CloseableHttpClient closeableHttpClient = HttpClientBuilder.create().build();
             CloseableHttpResponse httpResponse = closeableHttpClient.execute(httpGet)) {

            HttpEntity entity = httpResponse.getEntity();
            if (entity != null) {
                // Decode the response body as GBK
                String response = EntityUtils.toString(entity, "GBK");
                // Convert the HTML string into a jsoup Document
                Document doc = Jsoup.parse(response);
                // Look up the elements by class name
                Elements tables = doc.getElementsByClass("xian");
                Elements rows = tables.select("tr");

                for (Element row : rows) {
                    Elements cells = row.select("td");
                    // Store the td values of this row in a list
                    ArrayList<String> list = new ArrayList<String>();
                    // Start at index 1: the first cell of each row is skipped
                    for (int j = 1; j < cells.size(); j++) {
                        list.add(cells.get(j).text());
                    }
                    lists.add(list);
                }
            }
        } catch (IOException e) {
            // Also covers ClientProtocolException, which is a subclass of IOException
            e.printStackTrace();
        }

        return lists;
    }
}
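The article mentions a ClientPojo class but does not show it. As a rough sketch of how the rows returned by getNoticeList could be turned into objects, the following uses only the JDK. The three field names (party, court, date) and the NoticeMapper helper are assumptions made for illustration; the real column meanings depend on the page being parsed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical POJO: the column meanings below are assumed, not taken from the article.
class ClientPojo {
    final String party;
    final String court;
    final String date;

    ClientPojo(String party, String court, String date) {
        this.party = party;
        this.court = court;
        this.date = date;
    }

    @Override
    public String toString() {
        return party + " | " + court + " | " + date;
    }
}

// Hypothetical helper that maps raw rows to POJOs.
class NoticeMapper {

    // Converts raw rows into POJOs and skips rows with fewer than three cells,
    // such as header rows that contain only <th> elements and therefore
    // produce empty lists in getNoticeList.
    static List<ClientPojo> toPojos(List<? extends List<String>> rows) {
        List<ClientPojo> result = new ArrayList<ClientPojo>();
        for (List<String> row : rows) {
            if (row.size() < 3) {
                continue;
            }
            result.add(new ClientPojo(row.get(0), row.get(1), row.get(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in data; in the real project this would come from getNoticeList(url).
        List<List<String>> rows = new ArrayList<List<String>>();
        rows.add(new ArrayList<String>()); // empty header row
        rows.add(Arrays.asList("Party A", "Court B", "2015-01-01"));

        for (ClientPojo pojo : toPojos(rows)) {
            System.out.println(pojo);
        }
    }
}
```

Because toPojos accepts List<? extends List<String>>, the ArrayList<ArrayList<String>> returned by getNoticeList can be passed in directly.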

 

http://jingyan.baidu.com/article/22fe7ced2741043002617f1c.html
