[Jsoup-based] Android App "Baozou Jokes" Development (Part 2)

Continuing from the previous post.

The previous post covered the basic scraping of web page content; this post builds on that with some refinements.

Here is a screenshot of the result.

As in the previous post, we parse the returned HTML, pull out the data we want, put it into an Adapter, and display it in a ListView.

 Runnable runnable = new Runnable() {
        @Override
        public void run() {
            Message message = new Message();
            try {
                if (url.isEmpty()) {
                    return;
                }
                // Connect to the target page, spoofing a desktop Firefox User-Agent.
                Connection conn = Jsoup.connect(url);
                conn.header("User-Agent", "Mozilla/5.0 (X11; Linux x86_64; rv:32.0) Gecko/20100101 Firefox/32.0");
                // Execute the request as a GET and parse the response into a Document.
                Document doc = conn.get();
                // Pick out the span elements that hold the joke text.
                Elements elements = doc.select("span[id=text110]");
                Log.v(TAG, "size " + elements.size());
                all = elements.toString();
                message.what = WebActivity.FG;
            } catch (Exception e) {
                e.printStackTrace();
            }
            // Hand the result back to the UI thread.
            handler.sendMessage(message);
        }
    };
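The worker thread above only posts a Message; a Handler on the UI thread receives it and feeds the scraped HTML into the adapter. A minimal sketch of what that Handler might look like (the names `all` and `WebActivity.FG` follow the snippet above; the `adapter` field and the re-parsing step are assumptions for illustration):

```java
// Runs on the UI thread; updates the list when the worker signals success.
// Sketch only: assumes `all` holds the scraped HTML fragment and `adapter`
// is an ArrayAdapter<String> backing the ListView.
private Handler handler = new Handler(Looper.getMainLooper()) {
    @Override
    public void handleMessage(Message msg) {
        if (msg.what == WebActivity.FG) {
            // Parse the saved fragment again and split it into list items.
            Document doc = Jsoup.parse(all);
            List<String> jokes = new ArrayList<>();
            for (Element e : doc.select("span[id=text110]")) {
                jokes.add(e.text());
            }
            adapter.clear();
            adapter.addAll(jokes);
            adapter.notifyDataSetChanged();
        }
    }
};
```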

The previous post already explained how the fetching works, so I won't go into detail again here.

It's worth taking a look at some of Jsoup's source code:

 /**
     * Creates a new {@link Connection} to a URL. Use to fetch and parse a HTML page.
     * <p>
     * Use examples:
     * <ul>
     *  <li><code>Document doc = Jsoup.connect("http://example.com").userAgent("Mozilla").data("name", "jsoup").get();</code></li>
     *  <li><code>Document doc = Jsoup.connect("http://example.com").cookie("auth", "token").post();</code></li>
     * </ul>
     * @param url URL to connect to. The protocol must be {@code http} or {@code https}.
     * @return the connection. You can add data, cookies, and headers; set the user-agent, referrer, method; and then execute.
     */
    public static Connection connect(String url) {
        return HttpConnection.connect(url);
    }
This is how the HTML is fetched from the provided url.

/**
     * Set a request header.
     * @param name header name
     * @param value header value
     * @return this Connection, for chaining
     * @see org.jsoup.Connection.Request#headers()
     */
    public Connection header(String name, String value);
This sets a request header. Our app's call is shown below. From the User-Agent string you can read off some details: a Firefox browser (20100101 Firefox/32.0) running on Linux (Linux x86_64). It is a spoofed request, so the server sees what looks like a desktop browser rather than our app.

conn.header("User-Agent", "Mozilla/5.0 (X11; Linux x86_64; rv:32.0) Gecko/20100101 Firefox/32.0");

Next, we obtain the Document:

 /**
     * Execute the request as a GET, and parse the result.
     * @return parsed Document
     * @throws IOException on error
     */
    public Document get() throws IOException;
Once we have the Document, we can start analyzing it, querying the whole HTML tree for the parts that match our conditions:

/**
     * Find elements that match the {@link Selector} query, with this element as the starting context. Matched elements
     * may include this element, or any of its children.
     * <p/>
     * This method is generally more powerful to use than the DOM-type {@code getElementBy*} methods, because
     * multiple filters can be combined, e.g.:
     * <ul>
     * <li>{@code el.select("a[href]")} - finds links ({@code a} tags with {@code href} attributes)
     * <li>{@code el.select("a[href*=example.com]")} - finds links pointing to example.com (loosely)
     * </ul>
     * <p/>
     * See the query syntax documentation in {@link org.jsoup.select.Selector}.
     *
     * @param query a {@link Selector} query
     * @return elements that match the query (empty if none match)
     * @see org.jsoup.select.Selector
     */
    public Elements select(String query) {
        return Selector.select(query, this);
    }
The query our app uses is:

Elements elements = doc.select("span[id=text110]");
Checking this against the original page source, you can see the target content is pinned down by two conditions: the span tag and the attribute id=text110.
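To see the selector in action without hitting the network, we can run it against a static fragment. The markup below is made up to mimic the page's structure (the real page may differ; span with id=text110 is just the pattern the app relies on):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SelectDemo {
    public static void main(String[] args) {
        // Hypothetical fragment mimicking the target page's structure.
        String html = "<div><span id=text110>Joke one</span>"
                + "<span id=other>ignored</span>"
                + "<span id=text110>Joke two</span></div>";
        Document doc = Jsoup.parse(html);
        // Matches only spans whose id attribute equals text110.
        Elements elements = doc.select("span[id=text110]");
        for (Element e : elements) {
            System.out.println(e.text());
        }
    }
}
```

Running this prints the text of the two matching spans and skips the third, which is exactly how the app isolates the joke content from the rest of the page.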

[Figure: the original page source, showing the span element with id=text110]
With that, the scraping is complete and the extracted data can be displayed.


Download link:
http://download.csdn.net/detail/u011669081/9328481






