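Business background: how do we judge whether a search algorithm is good or bad? That is what competitor evaluation is for. For our own app, we call its API and capture the key fields of the first 6 cards. For competitors, whose APIs we cannot access, we use jsoup to crawl the front-end fields of their PC sites and store them in the fields we need, such as video duration, play count, like count, and type. Based on a batch of queries provided by the PM, we capture search data from multiple apps. Finally, everything is stored on OSS and handed to the PM's outsourced annotators for labeling (relevance, satisfaction, scoring).
jsoup reference: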
https://www.jianshu.com/p/fd5caaaa950d
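Pitfall:
The page source fetched by the crawler differs from the source you see when pressing F12. Why?
The source behind the page as finally displayed has been processed by the browser, whereas the source returned by a GET or POST request is what the server sent directly. It is normal for the two to differ.
Inspect Element (or developer tools such as Firebug) shows the live content after JavaScript has modified it, while View Page Source shows the HTTP response the browser originally received.
In other words, when the page loads, the browser renders it and fills in the content of the corresponding class elements, but a crawler has no rendering capability.
I did not know this at first, and found while crawling that some fields came back as null.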
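To make the difference concrete, here is a minimal sketch using only the JDK. The two HTML strings, the `play-count` class, and the `window.__DATA__` variable are hypothetical and stand in for a real page; the regex lookup is a crude demonstration, not a replacement for an HTML parser such as jsoup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RawVersusRendered {
    // Hypothetical server response: the element is empty, the data sits in a script.
    static final String SERVER_HTML =
        "<span class=\"play-count\"></span>"
        + "<script>window.__DATA__={\"playCount\":1234}</script>";

    // Hypothetical DOM after the browser has run the script.
    static final String RENDERED_HTML =
        "<span class=\"play-count\">1234</span>";

    /** Returns the text of the first element whose class attribute equals className, or "" if none. */
    static String textOfClass(String html, String className) {
        Pattern p = Pattern.compile(
            "class=\"" + Pattern.quote(className) + "\"[^>]*>([^<]*)<");
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        // What a crawler sees: the element exists but carries no text.
        System.out.println("server:   [" + textOfClass(SERVER_HTML, "play-count") + "]");
        // What developer tools show after rendering.
        System.out.println("rendered: [" + textOfClass(RENDERED_HTML, "play-count") + "]");
        // The value is still in the raw response, just inside the script.
        System.out.println("in script: " + SERVER_HTML.contains("\"playCount\":1234"));
    }
}
```

When a selector returns an empty value, it is worth searching the raw response for the value itself before assuming the selector is wrong.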
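For example, when crawling an iQiyi page, I tried a JS/HTML formatter (http://tool.chinaz.com/Tools/jsformat.aspx).
I tried a JSON formatter, but the content is HTML, not JSON (https://www.json.cn/#).
I tried VS Code...
In the end, viewing Elements in Chrome's developer tools worked best: the formatting is clear at a glance. Because searching in developer tools is slow, you can open View Page Source and search for elements there.
[Figure: element hierarchy of the page]
Learn jsoup before writing code. It is simple, and writing is far more efficient once you understand it.
This was my first crawler, so I debugged the existing competitor crawling code and wrote mine by imitating it.
Selectors: to select by class, call select(".classname") directly.
If you run into this error:
Fixing the error: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException
Reference: https://blog.csdn.net/u010248330/article/details/70161899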
javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1509)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:914)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:746)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:722)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:306)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:295)
at com.alibaba.pingce.jingpin.BliHandler.getBliPcResult(BliHandler.java:44)
at com.alibaba.pingce.jingpin.BliHandler.main(BliHandler.java:199)
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292)
at sun.security.validator.Validator.validate(Validator.java:260)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1491)
... 16 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382)
... 22 more
Exception in thread "main" java.lang.NullPointerException
at com.alibaba.pingce.jingpin.BliHandler.getBliPcResult(BliHandler.java:189)
at com.alibaba.pingce.jingpin.BliHandler.main(BliHandler.java:199)
Disconnected from the target VM, address: '127.0.0.1:56813', transport: 'socket'
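According to what I found online, this is a certificate problem, and you can add logic to your code that ignores certificates:
The following code was downloaded from the web: http://www.sojson.com/blog/195.html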
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSession;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
public class SslUtils {
public static void trustAllHttpsCertificates() throws Exception {
TrustManager[] trustAllCerts = new TrustManager[1];
TrustManager tm = new miTM();
trustAllCerts[0] = tm;
SSLContext sc = SSLContext.getInstance("SSL");
sc.init(null, trustAllCerts, null);
HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory());
}
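static class miTM implements X509TrustManager {
public X509Certificate[] getAcceptedIssuers() {
// Return an empty array rather than null; the X509TrustManager contract expects a non-null result
return new X509Certificate[0];
}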
public boolean isServerTrusted(X509Certificate[] certs) {
return true;
}
public boolean isClientTrusted(X509Certificate[] certs) {
return true;
}
public void checkServerTrusted(X509Certificate[] certs, String authType)
throws CertificateException {
return;
}
public void checkClientTrusted(X509Certificate[] certs, String authType)
throws CertificateException {
return;
}
}
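/**
* Ignores SSL certificate validation for HTTPS requests; must be called before openConnection.
* This trusts every certificate and host name, so it is only suitable for crawling and debugging.
* @throws Exception
*/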
public static void ignoreSsl() throws Exception{
HostnameVerifier hv = new HostnameVerifier() {
public boolean verify(String urlHostName, SSLSession session) {
return true;
}
};
trustAllHttpsCertificates();
HttpsURLConnection.setDefaultHostnameVerifier(hv);
}
}
// Just call it before URLConnection con = url.openConnection()
public static void main(String[] args) {
//String url="http://wx1.sinaimg.cn/mw690/006sl6kBgy1fel3aq0nyej30i20hxq7i.jpg";
String url="https://05.imgmini.eastday.com/mobile/20170413/20170413053046_4a5e70ed0b39c824517630e6954861f2_1.jpeg";
String downToFilePath="d:/download/image/";
String fileName="test";
try {
SslUtils.ignoreSsl();
} catch (Exception e) {
e.printStackTrace();
}
imageDownLoad(url, downToFilePath,fileName);
}
In your own code, call the utility method above and catch the exception it may throw.
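As a variation on the utility above, the sketch below builds a trust-all `SSLSocketFactory` without changing the JVM-wide defaults, so only connections that are explicitly given the factory skip certificate checks. The class name `TrustAllFactory` is hypothetical, and the sketch assumes your HTTP client accepts a per-connection socket factory (for example `HttpsURLConnection.setSSLSocketFactory`). Like the original, it disables certificate validation and should stay out of production paths.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

class TrustAllFactory {
    /** Builds a socket factory whose trust manager accepts every certificate chain. */
    static SSLSocketFactory create() {
        TrustManager[] trustAll = {
            new X509TrustManager() {
                @Override
                public void checkClientTrusted(X509Certificate[] chain, String authType) {
                    // Intentionally empty: accept any client certificate.
                }

                @Override
                public void checkServerTrusted(X509Certificate[] chain, String authType) {
                    // Intentionally empty: accept any server certificate.
                }

                @Override
                public X509Certificate[] getAcceptedIssuers() {
                    return new X509Certificate[0];
                }
            }
        };
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, trustAll, null);
            return context.getSocketFactory();
        } catch (GeneralSecurityException e) {
            // Wrap the checked exception so callers do not need a throws clause.
            throw new IllegalStateException("Could not build trust-all SSL context", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("factory created: " + (create() != null));
    }
}
```

Scoping the factory to one connection keeps other HTTPS calls in the same process validated as usual.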
BliHandler
package com.alibaba.pingce.jingpin;
import com.alibaba.algo.dao.SokuTopQueryCompareSnapshotInfoDao;
import com.alibaba.fastjson.JSONObject;
import com.alibaba.pingce.component.Constants;
import com.alibaba.pingce.model.JingPinModle;
import com.alibaba.util.http.handler.SslUtil;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
@Service
public class BliHandler {
@Autowired
SokuTopQueryCompareSnapshotInfoDao sokuTopQueryCompareSnapshotInfoDao;
public List<JingPinModle> getBliPcResult(String query, int num) {
List<JingPinModle> jingPinModles = new ArrayList<>();
try {
try {
SslUtil.ignoreSsl();
} catch (Exception e) {
e.printStackTrace();
}
// String url="http://so.iqiyi.com/so/q_"+ URLEncoder.encode ( query,"UTF-8" )+"?source=input&sr=1476998987782";
// String url = "https://search.bilibili.com/all?keyword=" + URLEncoder.encode(query, "UTF-8") + "&from_source=nav_suggest_new";
String url = "https://search.bilibili.com/all?keyword=" + query + "&from_source=nav_suggest_new";
// logger.info ( url );
// System.out.println("utl==" + url);
Document doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31").get();
// System.out.println("doc=="+doc);
HashMap<String, Integer> docSourceMap = new HashMap<>();
docSourceMap.put("bangumi-item-wrap", 1); // program
// docSourceMap.put("", 10); // program (head term)
docSourceMap.put("video-item matrix", 2); // ugc
// docSourceMap.put("", 12); // person
// docSourceMap.put("live-room-item", 98); // live stream
// docSourceMap.put("mixin-list",1111);
List<String> classes = new ArrayList<>();
classes.add("bangumi-item-wrap");
classes.add("video-item matrix");
classes.add("live-room-item");
// classes.add("mixin-list");
// Elements docList = doc.select ( "div[class=layout-main] > div" );
// Get the card lists of all types (program, ugc, etc.) in the search results for the current query
Elements docList = doc.select(".mixin-list");
// System.out.println("docList==" + docList);
// Get the program list from the card lists
// Elements bangumi_list = docList.select("." + classes.get(i));
// Elements bangumi_list = docList.select(".bangumi-list");
Elements bangumi_list = docList.select(".bangumi-item-wrap");
// Get the ugc list from the card lists
// Elements videoListClearfix = docList.select(".video-item");
// Form: tag[class=value], which matches the full class attribute value
Elements videoListClearfix = docList.select("li[class=video-item matrix]");
// Get the live-stream list from the card lists
Elements liveList = docList.select("ul[class=live-room-wrap clearfix]").select("li[class=live-room-item]");
for (int i = 0; i < classes.size(); i++) {
String title = "null";
String pic = "null";
String site = "null";
String time = "null";
String anchor = "null";
String timelength = "null";
String videoUrl = "null";
String type = "null";
String playCount = "null";
String headIcon = "null";
int rank = 1;
// for (Element element : docList) {
// Program cards: bangumi_list
if (!bangumi_list.isEmpty()) {
for (Element element : bangumi_list) {
// System.out.println("element==" + element);
if (jingPinModles.size() >= 5) {
break;
}
JSONObject curDoc = new JSONObject();
String figure = element.attr("class").trim();
// System.out.println("figure为==" + figure);
if (!classes.contains(figure)) {
continue;
}
Integer docSource = docSourceMap.get(figure);
// System.out.println("docSource为==" + docSource);
JingPinModle jingPinModle = new JingPinModle();
// Program: bangumi (anime series)
// Either form works for selecting by class: div[class=right-info] or .right-info
// String category = element.select("div[class=right-info]").select("span[class=bangumi-label]").text().trim();
String category = element.select(".right-info").select("span[class=bangumi-label]").text().trim();
// System.out.println("category==" + category);
if (!category.isEmpty()) {
type = "节目(番剧)";
} else {
type = "专题";
}
title = element.select(".right-info").select("a[href]").attr("title").trim();
site = "B站";
// attr() takes an attribute name, not a selector, so select the img first and then read its src
// (assumes the img sits inside the .lazy-img element)
pic = "http:" + element.select(".lazy-img img").attr("src").trim();
// Elements elements = element.select("a[class=left-img]");
//
// System.out.println("------------------------");
// for(Element element1:elements){
// System.out.println(JSONObject.toJSONString(element1.select("a").attr("href")));
// System.out.println("element1===="+element1);
// }
videoUrl = "http:" + element.select("a").attr("href").trim();
jingPinModle.setRank(rank++);
jingPinModle.setQuery(query);
jingPinModle.setVdo_title(title);
jingPinModle.setPic(pic);
jingPinModle.setSite(site);
jingPinModle.setCreate_time(time);
jingPinModle.setRel_people(anchor);
jingPinModle.setSeconds(timelength);
jingPinModle.setUrl(videoUrl);
jingPinModle.setType(type);
jingPinModles.add(jingPinModle);
// break;
}
}
// ugc cards: videoListClearfix
// Guard the index so get(i) cannot run past the end of the list
if (i < videoListClearfix.size()) {
Element element = videoListClearfix.get(i);
// System.out.println("element==" + element);
if (jingPinModles.size() >= 5) {
break;
}
JSONObject curDoc = new JSONObject();
String figure = element.attr("class").trim();
// System.out.println("figure为==" + figure);
if (!classes.contains(figure)) {
continue;
}
Integer docSource = docSourceMap.get(figure);
// System.out.println("docSource为==" + docSource);
JingPinModle jingPinModle = new JingPinModle();
// Title
title = element.select(".info").select(".headline").
select("a[class=title]").attr("title").trim();
// Upload time
time = element.select(".info").select(".tags").select("span[class=so-icon time]").text();
System.out.println("time==" + time);
// select("div[desc=发布时间]").select("span[class=so-icon time]").text().trim();
// Play count
playCount = element.select(".info").select(".tags").select("span[class=so-icon watch-num]").text();
// Uploader
anchor = element.select(".info").select(".tags").select("span[class=so-icon]").select("a[class=up-name]").text();
if (anchor.isEmpty()) {
anchor = element.select("div[class=result-right]").
select("div[desc=上传者]").select("a[class=uploader-name]").attr("title").trim();
}
// anchor = element.select ( "div[class=result-right]" ).select ( "div[class=qy-search-result-info uploader-ico]" ).
// select ( "span[class=info-uploader]" ).text().replace("+关注","").trim();
// Video duration
timelength = element.select(".img").select("span[class=so-imgTag_rb]").text();
// Video cover
pic = "http:" + element.select("div[class=result-figure]").select("img[class=qy-mod-cover]").attr("src").
trim();
videoUrl = "http:" + element.select("div[class=result-right]").
select("a[class=main-tit]").attr("href").trim();
type = "ugc";
site = "B站";
jingPinModle.setRank(rank++);
jingPinModle.setQuery(query);
jingPinModle.setVdo_title(title);
jingPinModle.setPic(pic);
jingPinModle.setSite(site);
jingPinModle.setCreate_time(time);
jingPinModle.setRel_people(anchor);
jingPinModle.setSeconds(timelength);
jingPinModle.setUrl(videoUrl);
jingPinModle.setType(type);
// Play count
jingPinModle.setPlay_count(playCount);
jingPinModles.add(jingPinModle);
// for (Element element : videoListClearfix) {
// }
}
}
} catch (Exception e) {
e.printStackTrace();
}
JingPinModle capture_model = new JingPinModle();
capture_model.setPic(sokuTopQueryCompareSnapshotInfoDao.selectUrlBySiteAndQuery(query, Constants.BliBli));
capture_model.setQuery(query);
capture_model.setRank(jingPinModles.size() + 1);
jingPinModles.add(capture_model);
return jingPinModles;
}
public static void main(String[] args) {
BliHandler handler = new BliHandler();
List<JingPinModle> modles = handler.getBliPcResult("辉夜大小姐", 5);
System.out.println(modles.size());
}
}
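getBliPcResult keeps a commented-out variant that URL-encodes the query before building the search URL. The sketch below shows that variant in isolation, using only the JDK. The class name `SearchUrlBuilder` is hypothetical. Whether explicit encoding is needed depends on the HTTP client, and the code above leaves the unencoded form active, so treat this as an option rather than a required fix.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SearchUrlBuilder {
    private static final String PREFIX = "https://search.bilibili.com/all?keyword=";
    private static final String SUFFIX = "&from_source=nav_suggest_new";

    /** Builds the search URL with the query encoded as UTF-8 form data. */
    static String build(String query) {
        try {
            // URLEncoder escapes reserved characters such as '&' and turns spaces into '+'.
            return PREFIX + URLEncoder.encode(query, "UTF-8") + SUFFIX;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("a b&c"));
    }
}
```

Encoding matters most when a query contains characters such as `&` or `#`, which would otherwise be read as part of the URL structure.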
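While debugging, I found that the image could not be fetched: it was null.
Below is the img field as captured in developer tools:
So I switched approach and parsed JSON instead of using jsoup: from the page source (View Page Source), cut out the JSON that starts after window.__INITIAL_STATE__= and ends just before ;(function(){var s;. The pic value is taken as follows (prepend https):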
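The cut described above can be sketched with plain string operations. The two markers come from the text; the class name `InitialStateExtractor` and the sample HTML in `main` are hypothetical.

```java
import java.util.Optional;

class InitialStateExtractor {
    static final String START = "window.__INITIAL_STATE__=";
    static final String END = ";(function(){var s;";

    /** Returns the JSON text between the two markers, or empty if either marker is missing. */
    static Optional<String> extract(String html) {
        int start = html.indexOf(START);
        if (start < 0) {
            return Optional.empty();
        }
        int from = start + START.length();
        // Search for the end marker only after the start marker.
        int end = html.indexOf(END, from);
        if (end < 0) {
            return Optional.empty();
        }
        return Optional.of(html.substring(from, end));
    }

    public static void main(String[] args) {
        // Hypothetical page fragment, shaped like the structure described in the text.
        String html = "<script>window.__INITIAL_STATE__={\"id\":1};(function(){var s;})();</script>";
        System.out.println(extract(html).orElse("not found"));
    }
}
```

The extracted string can then be handed to a JSON parser; with fastjson, which BliHandler already imports, `JSONObject.parseObject(json)` would be one option. Returning an empty `Optional` when a marker is missing keeps a changed page layout from surfacing as a StringIndexOutOfBoundsException.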
The video URL is built as follows: https://www.bilibili.com/video/av concatenated with the id from the JSON.
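A small sketch of the two concatenations. The class and method names are hypothetical, and it assumes the pic value in the JSON is protocol-relative (starting with `//`), which is why https needs to be prepended; if the value already carries a scheme, it is returned unchanged.

```java
class BiliUrls {
    /** Builds the video page URL from the numeric id found in the JSON. */
    static String videoUrl(long id) {
        return "https://www.bilibili.com/video/av" + id;
    }

    /** Prepends https: to a protocol-relative picture URL; other values pass through unchanged. */
    static String picUrl(String pic) {
        if (pic == null || pic.isEmpty()) {
            return "";
        }
        return pic.startsWith("//") ? "https:" + pic : pic;
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only.
        System.out.println(videoUrl(12345L));
        System.out.println(picUrl("//example.com/cover.jpg"));
    }
}
```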
The crawl results are as follows:
jsoup source code (org.jsoup.nodes.Node, decompiled):
//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//
package org.jsoup.nodes;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import org.jsoup.SerializationException;
import org.jsoup.helper.StringUtil;
import org.jsoup.helper.Validate;
import org.jsoup.nodes.Document.OutputSettings;
import org.jsoup.parser.Parser;
import org.jsoup.select.NodeFilter;
import org.jsoup.select.NodeTraversor;
import org.jsoup.select.NodeVisitor;
public abstract class Node implements Cloneable {
static final String EmptyString = "";
Node parentNode;
int siblingIndex;
protected Node() {
}
public abstract String nodeName();
protected abstract boolean hasAttributes();
public boolean hasParent() {
return this.parentNode != null;
}
public String attr(String attributeKey) {
Validate.notNull(attributeKey);
if (!this.hasAttributes()) {
return "";
} else {
String val = this.attributes().getIgnoreCase(attributeKey);
if (val.length() > 0) {
return val;
} else {
return attributeKey.startsWith("abs:") ? this.absUrl(attributeKey.substring("abs:".length())) : "";
}
}
}
public abstract Attributes attributes();
public Node attr(String attributeKey, String attributeValue) {
this.attributes().putIgnoreCase(attributeKey, attributeValue);
return this;
}
public boolean hasAttr(String attributeKey) {
Validate.notNull(attributeKey);
if (attributeKey.startsWith("abs:")) {
String key = attributeKey.substring("abs:".length());
if (this.attributes().hasKeyIgnoreCase(key) && !this.absUrl(key).equals("")) {
return true;
}
}
return this.attributes().hasKeyIgnoreCase(attributeKey);
}
public Node removeAttr(String attributeKey) {
Validate.notNull(attributeKey);
this.attributes().removeIgnoreCase(attributeKey);
return this;
}
public Node clearAttributes() {
Iterator it = this.attributes().iterator();
while(it.hasNext()) {
it.next();
it.remove();
}
return this;
}
public abstract String baseUri();
protected abstract void doSetBaseUri(String var1);
public void setBaseUri(final String baseUri) {
Validate.notNull(baseUri);
this.traverse(new NodeVisitor() {
public void head(Node node, int depth) {
node.doSetBaseUri(baseUri);
}
public void tail(Node node, int depth) {
}
});
}
public String absUrl(String attributeKey) {
Validate.notEmpty(attributeKey);
return !this.hasAttr(attributeKey) ? "" : StringUtil.resolve(this.baseUri(), this.attr(attributeKey));
}
protected abstract List ensureChildNodes();
public Node childNode(int index) {
return (Node)this.ensureChildNodes().get(index);
}
public List childNodes() {
return Collections.unmodifiableList(this.ensureChildNodes());
}
public List childNodesCopy() {
List nodes = this.ensureChildNodes();
ArrayList children = new ArrayList(nodes.size());
Iterator var3 = nodes.iterator();
while(var3.hasNext()) {
Node node = (Node)var3.next();
children.add(node.clone());
}
return children;
}
public abstract int childNodeSize();
protected Node[] childNodesAsArray() {
return (Node[])this.ensureChildNodes().toArray(new Node[this.childNodeSize()]);
}
public Node parent() {
return this.parentNode;
}
public final Node parentNode() {
return this.parentNode;
}
public Node root() {
Node node;
for(node = this; node.parentNode != null; node = node.parentNode) {
;
}
return node;
}
public Document ownerDocument() {
Node root = this.root();
return root instanceof Document ? (Document)root : null;
}
public void remove() {
Validate.notNull(this.parentNode);
this.parentNode.removeChild(this);
}
public Node before(String html) {
this.addSiblingHtml(this.siblingIndex, html);
return this;
}
public Node before(Node node) {
Validate.notNull(node);
Validate.notNull(this.parentNode);
this.parentNode.addChildren(this.siblingIndex, node);
return this;
}
public Node after(String html) {
this.addSiblingHtml(this.siblingIndex + 1, html);
return this;
}
public Node after(Node node) {
Validate.notNull(node);
Validate.notNull(this.parentNode);
this.parentNode.addChildren(this.siblingIndex + 1, node);
return this;
}
private void addSiblingHtml(int index, String html) {
Validate.notNull(html);
Validate.notNull(this.parentNode);
Element context = this.parent() instanceof Element ? (Element)this.parent() : null;
List nodes = Parser.parseFragment(html, context, this.baseUri());
this.parentNode.addChildren(index, (Node[])nodes.toArray(new Node[nodes.size()]));
}
public Node wrap(String html) {
Validate.notEmpty(html);
Element context = this.parent() instanceof Element ? (Element)this.parent() : null;
List wrapChildren = Parser.parseFragment(html, context, this.baseUri());
Node wrapNode = (Node)wrapChildren.get(0);
if (wrapNode != null && wrapNode instanceof Element) {
Element wrap = (Element)wrapNode;
Element deepest = this.getDeepChild(wrap);
this.parentNode.replaceChild(this, wrap);
deepest.addChildren(new Node[]{this});
if (wrapChildren.size() > 0) {
for(int i = 0; i < wrapChildren.size(); ++i) {
Node remainder = (Node)wrapChildren.get(i);
remainder.parentNode.removeChild(remainder);
wrap.appendChild(remainder);
}
}
return this;
} else {
return null;
}
}
public Node unwrap() {
Validate.notNull(this.parentNode);
List childNodes = this.ensureChildNodes();
Node firstChild = childNodes.size() > 0 ? (Node)childNodes.get(0) : null;
this.parentNode.addChildren(this.siblingIndex, this.childNodesAsArray());
this.remove();
return firstChild;
}
private Element getDeepChild(Element el) {
List children = el.children();
return children.size() > 0 ? this.getDeepChild((Element)children.get(0)) : el;
}
void nodelistChanged() {
}
public void replaceWith(Node in) {
Validate.notNull(in);
Validate.notNull(this.parentNode);
this.parentNode.replaceChild(this, in);
}
protected void setParentNode(Node parentNode) {
Validate.notNull(parentNode);
if (this.parentNode != null) {
this.parentNode.removeChild(this);
}
this.parentNode = parentNode;
}
protected void replaceChild(Node out, Node in) {
Validate.isTrue(out.parentNode == this);
Validate.notNull(in);
if (in.parentNode != null) {
in.parentNode.removeChild(in);
}
int index = out.siblingIndex;
this.ensureChildNodes().set(index, in);
in.parentNode = this;
in.setSiblingIndex(index);
out.parentNode = null;
}
protected void removeChild(Node out) {
Validate.isTrue(out.parentNode == this);
int index = out.siblingIndex;
this.ensureChildNodes().remove(index);
this.reindexChildren(index);
out.parentNode = null;
}
protected void addChildren(Node... children) {
List nodes = this.ensureChildNodes();
Node[] var3 = children;
int var4 = children.length;
for(int var5 = 0; var5 < var4; ++var5) {
Node child = var3[var5];
this.reparentChild(child);
nodes.add(child);
child.setSiblingIndex(nodes.size() - 1);
}
}
protected void addChildren(int index, Node... children) {
Validate.noNullElements(children);
List nodes = this.ensureChildNodes();
Node[] var4 = children;
int var5 = children.length;
for(int var6 = 0; var6 < var5; ++var6) {
Node child = var4[var6];
this.reparentChild(child);
}
nodes.addAll(index, Arrays.asList(children));
this.reindexChildren(index);
}
protected void reparentChild(Node child) {
child.setParentNode(this);
}
private void reindexChildren(int start) {
List childNodes = this.ensureChildNodes();
for(int i = start; i < childNodes.size(); ++i) {
((Node)childNodes.get(i)).setSiblingIndex(i);
}
}
public List siblingNodes() {
if (this.parentNode == null) {
return Collections.emptyList();
} else {
List nodes = this.parentNode.ensureChildNodes();
List siblings = new ArrayList(nodes.size() - 1);
Iterator var3 = nodes.iterator();
while(var3.hasNext()) {
Node node = (Node)var3.next();
if (node != this) {
siblings.add(node);
}
}
return siblings;
}
}
public Node nextSibling() {
if (this.parentNode == null) {
return null;
} else {
List siblings = this.parentNode.ensureChildNodes();
int index = this.siblingIndex + 1;
return siblings.size() > index ? (Node)siblings.get(index) : null;
}
}
public Node previousSibling() {
if (this.parentNode == null) {
return null;
} else {
return this.siblingIndex > 0 ? (Node)this.parentNode.ensureChildNodes().get(this.siblingIndex - 1) : null;
}
}
public int siblingIndex() {
return this.siblingIndex;
}
protected void setSiblingIndex(int siblingIndex) {
this.siblingIndex = siblingIndex;
}
public Node traverse(NodeVisitor nodeVisitor) {
Validate.notNull(nodeVisitor);
NodeTraversor.traverse(nodeVisitor, this);
return this;
}
public Node filter(NodeFilter nodeFilter) {
Validate.notNull(nodeFilter);
NodeTraversor.filter(nodeFilter, this);
return this;
}
public String outerHtml() {
StringBuilder accum = new StringBuilder(128);
this.outerHtml(accum);
return accum.toString();
}
protected void outerHtml(Appendable accum) {
NodeTraversor.traverse(new Node.OuterHtmlVisitor(accum, this.getOutputSettings()), this);
}
OutputSettings getOutputSettings() {
Document owner = this.ownerDocument();
return owner != null ? owner.outputSettings() : (new Document("")).outputSettings();
}
abstract void outerHtmlHead(Appendable var1, int var2, OutputSettings var3) throws IOException;
abstract void outerHtmlTail(Appendable var1, int var2, OutputSettings var3) throws IOException;
public <T extends Appendable> T html(T appendable) {
this.outerHtml(appendable);
return appendable;
}
public String toString() {
return this.outerHtml();
}
protected void indent(Appendable accum, int depth, OutputSettings out) throws IOException {
accum.append('\n').append(StringUtil.padding(depth * out.indentAmount()));
}
public boolean equals(Object o) {
return this == o;
}
public boolean hasSameValue(Object o) {
if (this == o) {
return true;
} else {
return o != null && this.getClass() == o.getClass() ? this.outerHtml().equals(((Node)o).outerHtml()) : false;
}
}
public Node clone() {
Node thisClone = this.doClone((Node)null);
LinkedList nodesToProcess = new LinkedList();
nodesToProcess.add(thisClone);
while(!nodesToProcess.isEmpty()) {
Node currParent = (Node)nodesToProcess.remove();
int size = currParent.childNodeSize();
for(int i = 0; i < size; ++i) {
List childNodes = currParent.ensureChildNodes();
Node childClone = ((Node)childNodes.get(i)).doClone(currParent);
childNodes.set(i, childClone);
nodesToProcess.add(childClone);
}
}
return thisClone;
}
public Node shallowClone() {
return this.doClone((Node)null);
}
protected Node doClone(Node parent) {
Node clone;
try {
clone = (Node)super.clone();
} catch (CloneNotSupportedException var4) {
throw new RuntimeException(var4);
}
clone.parentNode = parent;
clone.siblingIndex = parent == null ? 0 : this.siblingIndex;
return clone;
}
private static class OuterHtmlVisitor implements NodeVisitor {
private Appendable accum;
private OutputSettings out;
OuterHtmlVisitor(Appendable accum, OutputSettings out) {
this.accum = accum;
this.out = out;
out.prepareEncoder();
}
public void head(Node node, int depth) {
try {
node.outerHtmlHead(this.accum, depth, this.out);
} catch (IOException var4) {
throw new SerializationException(var4);
}
}
public void tail(Node node, int depth) {
if (!node.nodeName().equals("#text")) {
try {
node.outerHtmlTail(this.accum, depth, this.out);
} catch (IOException var4) {
throw new SerializationException(var4);
}
}
}
}
}