Scraping data from a movie website with Java

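Use jsoup to fetch data from a dynamic website. jsoup does not execute JavaScript, so the category links are parsed from the HTML, while the per-category counts are requested directly from the URL that returns them as JSON.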

Selected code:

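import java.io.IOException;
import java.util.HashMap;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

/**
 * Collects the relative link and the display name of every category,
 * stores them in a map (href -> name), and returns that map.
 */
public class GetMoviesName {
    private String url;
    HashMap<String, String> hrefandname = new HashMap<>();

    public GetMoviesName(String url) {
        this.url = url;
    }

    public HashMap<String, String> getAllKinds() throws IOException {
        // Send a browser-like User-Agent and allow up to 10 seconds for the request.
        Document kinds = Jsoup.connect(url)
                .userAgent("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.78 Safari/537.36")
                .timeout(10000)
                .get();
        // Every category is an <a> inside the .types element under #content.
        Elements elements = kinds.select("#content .types a");
        for (Element element : elements) {
            String kindurl = element.attr("href");
            hrefandname.put(kindurl, element.text());
        }
        return hrefandname;
    }
}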
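
The map returned by getAllKinds() is keyed by relative links, so each key has to be resolved against the page URL before it can be requested. The following is a minimal sketch of that step using only the standard library. The class name KindLinkResolver, the host movie.example.com, and the sample href are hypothetical and do not come from the original project.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original project.
class KindLinkResolver {

    // Resolves every relative href (map key) against the base page URL
    // and returns a new map of absolute URL -> category name.
    // Note: URI.create throws IllegalArgumentException for hrefs that
    // contain illegal characters such as unescaped spaces.
    static Map<String, String> toAbsolute(String baseUrl, Map<String, String> hrefToName) {
        URI base = URI.create(baseUrl);
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : hrefToName.entrySet()) {
            String absolute = base.resolve(entry.getKey()).toString();
            result.put(absolute, entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data standing in for the result of getAllKinds().
        Map<String, String> kinds = new LinkedHashMap<>();
        kinds.put("/typerank?type=24", "Comedy");

        Map<String, String> absolute = toAbsolute("https://movie.example.com/chart", kinds);
        for (Map.Entry<String, String> entry : absolute.entrySet()) {
            System.out.println(entry.getValue() + " -> " + entry.getKey());
        }
    }
}
```

When the page is already loaded through jsoup, calling element.absUrl("href") instead of element.attr("href") is another way to get absolute links, since jsoup resolves them against the document's base URI.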
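// Excerpt: `document` is a String declared earlier, and `finalurl` is the count URL for one category.
try {
    // Fetch the number of movies in this category (total) and how many can be watched online (playable_count).
    // ignoreContentType(true) lets jsoup accept the JSON response; body() returns it as a String.
    document = Jsoup.connect(finalurl).timeout(10000).ignoreContentType(true).execute().body();
    // Example body: {"playable_count":18,"total":32,"unwatched_count":32}
    // That is, 18 movies playable online, 32 in total, 32 unwatched.
} catch (IOException e) {
    e.printStackTrace();
}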
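
The response body shown in the comment above is a small flat JSON object, so the counts can be read out of the string once it has been fetched. The sketch below uses a regular expression to stay free of extra dependencies. It is an assumption about how the values might be extracted, not the original project's method, and the class name CountParser and method intField are hypothetical. A real JSON library would be the more robust choice for anything beyond a flat object like this one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original project.
class CountParser {

    // Returns the integer value of the named field, or -1 if the field is absent.
    // Only suitable for flat JSON objects with plain integer values.
    static int intField(String json, String field) {
        Pattern pattern = Pattern.compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*(-?\\d+)");
        Matcher matcher = pattern.matcher(json);
        return matcher.find() ? Integer.parseInt(matcher.group(1)) : -1;
    }

    public static void main(String[] args) {
        // Same shape as the response body shown in the excerpt.
        String body = "{\"playable_count\":18,\"total\":32,\"unwatched_count\":32}";
        System.out.println("total = " + intField(body, "total"));
        System.out.println("playable_count = " + intField(body, "playable_count"));
        System.out.println("unwatched_count = " + intField(body, "unwatched_count"));
    }
}
```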
// Excerpt: fields of the class that holds one movie record.
    private int id;
    private String name;
    private String types;
    private String release_date;
    private double score;
    private String url;
    private String is_playable;

Source code download: https://github.com/YanKuan-IT/Jsuop-DouBanMoviesInfo.git
