流烟默

Jsoup - 使用详解与爬虫

【1】简介

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.

jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.

scrape and parse HTML from a URL, file, or string
find and extract data, using DOM traversal or CSS selectors
manipulate the HTML elements, attributes, and text
clean user-submitted content against a safe white-list, to prevent XSS attacks
output tidy HTML

jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree.

教程地址如下 :

the website: https://jsoup.org/。

中文地址：http://www.open-open.com/jsoup/

翻译过来也就是说，通过jsoup你可以很方便的从url或者文件或者仅仅只是一段html中使用DOM/CSS/JS的知识解析出来你想要得到的东西。也可以使用它来防止XSS攻击。

至于为什么，则是jsoup会将得到的html解析为DOM Tree。拿到了DOM Tree，则一切就变得简单起来。

【2】Jsoup类

Jsoup源码如下：

一向认为，研究技术，首先看源码。

/**
 The core public access point to the jsoup functionality.

 @author Jonathan Hedley */
public class Jsoup {
    private Jsoup() {}

    /**
     Parse HTML into a Document. The parser will make a sensible, balanced document tree out of any HTML.

     @param html    HTML to parse
     @param baseUri The URL where the HTML was retrieved from. Used to resolve relative URLs to absolute URLs, that occur before the HTML declares a {@code } tag.
     @return sane HTML
     */
    public static Document parse(String html, String baseUri) {
        return Parser.parse(html, baseUri);
    }

    /**
     Parse HTML into a Document, using the provided Parser. You can provide an alternate parser, such as a simple XML
     (non-HTML) parser.

     @param html    HTML to parse
     @param baseUri The URL where the HTML was retrieved from. Used to resolve relative URLs to absolute URLs, that occur
     before the HTML declares a {@code } tag.
     @param parser alternate {@link Parser#xmlParser() parser} to use.
     @return sane HTML
     */
    public static Document parse(String html, String baseUri, Parser parser) {
        return parser.parseInput(html, baseUri);
    }

    /**
     Parse HTML into a Document. As no base URI is specified, absolute URL detection relies on the HTML including a
     {@code } tag.

     @param html HTML to parse
     @return sane HTML

     @see #parse(String, String)
     */
    public static Document parse(String html) {
        return Parser.parse(html, "");
    }

    /**
     * Creates a new {@link Connection} to a URL. Use to fetch and parse a HTML page.
     * 
     * Use examples:
     * 

     *  Document doc = Jsoup.connect("http://example.com").userAgent("Mozilla").data("name", "jsoup").get();
     *  Document doc = Jsoup.connect("http://example.com").cookie("auth", "token").post();
     * 
     * @param url URL to connect to. The protocol must be {@code http} or {@code https}.
     * @return the connection. You can add data, cookies, and headers; set the user-agent, referrer, method; and then execute.
     */
    public static Connection connect(String url) {
        return HttpConnection.connect(url);
    }

    /**
     Parse the contents of a file as HTML.

     @param in          file to load HTML from
     @param charsetName (optional) character set of file contents. Set to {@code null} to determine from {@code http-equiv} meta tag, if
     present, or fall back to {@code UTF-8} (which is often safe to do).
     @param baseUri     The URL where the HTML was retrieved from, to resolve relative links against.
     @return sane HTML

     @throws IOException if the file could not be found, or read, or if the charsetName is invalid.
     */
    public static Document parse(File in, String charsetName, String baseUri) throws IOException {
        return DataUtil.load(in, charsetName, baseUri);
    }

    /**
     Parse the contents of a file as HTML. The location of the file is used as the base URI to qualify relative URLs.

     @param in          file to load HTML from
     @param charsetName (optional) character set of file contents. Set to {@code null} to determine from {@code http-equiv} meta tag, if
     present, or fall back to {@code UTF-8} (which is often safe to do).
     @return sane HTML

     @throws IOException if the file could not be found, or read, or if the charsetName is invalid.
     @see #parse(File, String, String)
     */
    public static Document parse(File in, String charsetName) throws IOException {
        return DataUtil.load(in, charsetName, in.getAbsolutePath());
    }

     /**
     Read an input stream, and parse it to a Document.

     @param in          input stream to read. Make sure to close it after parsing.
     @param charsetName (optional) character set of file contents. Set to {@code null} to determine from {@code http-equiv} meta tag, if
     present, or fall back to {@code UTF-8} (which is often safe to do).
     @param baseUri     The URL where the HTML was retrieved from, to resolve relative links against.
     @return sane HTML

     @throws IOException if the file could not be found, or read, or if the charsetName is invalid.
     */
    public static Document parse(InputStream in, String charsetName, String baseUri) throws IOException {
        return DataUtil.load(in, charsetName, baseUri);
    }

    /**
     Read an input stream, and parse it to a Document. You can provide an alternate parser, such as a simple XML
     (non-HTML) parser.

     @param in          input stream to read. Make sure to close it after parsing.
     @param charsetName (optional) character set of file contents. Set to {@code null} to determine from {@code http-equiv} meta tag, if
     present, or fall back to {@code UTF-8} (which is often safe to do).
     @param baseUri     The URL where the HTML was retrieved from, to resolve relative links against.
     @param parser alternate {@link Parser#xmlParser() parser} to use.
     @return sane HTML

     @throws IOException if the file could not be found, or read, or if the charsetName is invalid.
     */
    public static Document parse(InputStream in, String charsetName, String baseUri, Parser parser) throws IOException {
        return DataUtil.load(in, charsetName, baseUri, parser);
    }

    /**
     Parse a fragment of HTML, with the assumption that it forms the {@code body} of the HTML.

     @param bodyHtml body HTML fragment
     @param baseUri  URL to resolve relative URLs against.
     @return sane HTML document

     @see Document#body()
     */
    public static Document parseBodyFragment(String bodyHtml, String baseUri) {
        return Parser.parseBodyFragment(bodyHtml, baseUri);
    }

    /**
     Parse a fragment of HTML, with the assumption that it forms the {@code body} of the HTML.

     @param bodyHtml body HTML fragment
     @return sane HTML document

     @see Document#body()
     */
    public static Document parseBodyFragment(String bodyHtml) {
        return Parser.parseBodyFragment(bodyHtml, "");
    }

    /**
     Fetch a URL, and parse it as HTML. Provided for compatibility; in most cases use {@link #connect(String)} instead.
     
     The encoding character set is determined by the content-type header or http-equiv meta tag, or falls back to {@code UTF-8}.

     @param url           URL to fetch (with a GET). The protocol must be {@code http} or {@code https}.
     @param timeoutMillis Connection and read timeout, in milliseconds. If exceeded, IOException is thrown.
     @return The parsed HTML.

     @throws java.net.MalformedURLException if the request URL is not a HTTP or HTTPS URL, or is otherwise malformed
     @throws HttpStatusException if the response is not OK and HTTP response errors are not ignored
     @throws UnsupportedMimeTypeException if the response mime type is not supported and those errors are not ignored
     @throws java.net.SocketTimeoutException if the connection times out
     @throws IOException if a connection or read error occurs

     @see #connect(String)
     */
    public static Document parse(URL url, int timeoutMillis) throws IOException {
        Connection con = HttpConnection.connect(url);
        con.timeout(timeoutMillis);
        return con.get();
    }

    /**
     Get safe HTML from untrusted input HTML, by parsing input HTML and filtering it through a white-list of permitted
     tags and attributes.

     @param bodyHtml  input untrusted HTML (body fragment)
     @param baseUri   URL to resolve relative URLs against
     @param whitelist white-list of permitted HTML elements
     @return safe HTML (body fragment)

     @see Cleaner#clean(Document)
     */
    public static String clean(String bodyHtml, String baseUri, Whitelist whitelist) {
        Document dirty = parseBodyFragment(bodyHtml, baseUri);
        Cleaner cleaner = new Cleaner(whitelist);
        Document clean = cleaner.clean(dirty);
        return clean.body().html();
    }

    /**
     Get safe HTML from untrusted input HTML, by parsing input HTML and filtering it through a white-list of permitted
     tags and attributes.

     @param bodyHtml  input untrusted HTML (body fragment)
     @param whitelist white-list of permitted HTML elements
     @return safe HTML (body fragment)

     @see Cleaner#clean(Document)
     */
    public static String clean(String bodyHtml, Whitelist whitelist) {
        return clean(bodyHtml, "", whitelist);
    }

    /**
     * Get safe HTML from untrusted input HTML, by parsing input HTML and filtering it through a white-list of
     * permitted tags and attributes.
     * 
The HTML is treated as a body fragment; it's expected the cleaned HTML will be used within the body of an
     * existing document. If you want to clean full documents, use {@link Cleaner#clean(Document)} instead, and add
     * structural tags (html, head, body etc) to the whitelist.
     *
     * @param bodyHtml input untrusted HTML (body fragment)
     * @param baseUri URL to resolve relative URLs against
     * @param whitelist white-list of permitted HTML elements
     * @param outputSettings document output settings; use to control pretty-printing and entity escape modes
     * @return safe HTML (body fragment)
     * @see Cleaner#clean(Document)
     */
    public static String clean(String bodyHtml, String baseUri, Whitelist whitelist, Document.OutputSettings outputSettings) {
        Document dirty = parseBodyFragment(bodyHtml, baseUri);
        Cleaner cleaner = new Cleaner(whitelist);
        Document clean = cleaner.clean(dirty);
        clean.outputSettings(outputSettings);
        return clean.body().html();
    }

    /**
     Test if the input body HTML has only tags and attributes allowed by the Whitelist. Useful for form validation.
     
The input HTML should still be run through the cleaner to set up enforced attributes, and to tidy the output.
     Assumes the HTML is a body fragment (i.e. will be used in an existing HTML document body.)
     @param bodyHtml HTML to test
     @param whitelist whitelist to test against
     @return true if no tags or attributes were removed; false otherwise
     @see #clean(String, org.jsoup.safety.Whitelist) 
     */
    public static boolean isValid(String bodyHtml, Whitelist whitelist) {
        return new Cleaner(whitelist).isValidBodyHtml(bodyHtml);
    }

}

其方法如下图所示：

方法名parse的多个重载方法，基本就是解析不同状况下的html。如有的提供了baseUrl，有的提供了parser。

最后四个方法在防止用户恶意输入html脚本或者过滤HTML时使用。

connet则是从url获取Document。示例如下：

        Connection conn = Jsoup.connect(url );
        String userAgent="Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36";
        // 修改http包中的header,伪装成浏览器进行抓取
        conn.header("User-Agent", userAgent);
        Document doc = null;
        try {
            doc = conn.get();
        } catch (IOException e) {
            e.printStackTrace();
        }

拿到Document之后，接下来我们需要做的就是获取具体Element，Node,

【3】Document

什么是Document？

W3C解释：

每个载入浏览器的 HTML 文档都会成为 Document 对象。

Document 对象使我们可以从脚本中对 HTML 页面中的所有元素进行访问。

提示：Document 对象是 Window 对象的一部分，可通过 window.document 属性对其进行访问。

源码方法如下图：

需要注意的是，Document是Element的子类。

/**
 A HTML Document.

 @author Jonathan Hedley, [email protected] */
public class Document extends Element {
    private OutputSettings outputSettings = new OutputSettings();
    private QuirksMode quirksMode = QuirksMode.noQuirks;
    private String location;
    private boolean updateMetaCharset = false;

    /**
     Create a new, empty Document.
     @param baseUri base URI of document
     @see org.jsoup.Jsoup#parse
     @see #createShell
     */
    public Document(String baseUri) {
        super(Tag.valueOf("#root", ParseSettings.htmlDefault), baseUri);
        this.location = baseUri;
    }

    /**
     Create a valid, empty shell of a document, suitable for adding more elements to.
     @param baseUri baseUri of document
     @return document with html, head, and body elements.
     */
    public static Document createShell(String baseUri) {
        Validate.notNull(baseUri);

        Document doc = new Document(baseUri);
        Element html = doc.appendElement("html");
        html.appendElement("head");
        html.appendElement("body");

        return doc;
    }
    //...
}

也可以这样理解，Document是许多个Element组成的。

【4】Element

什么是Element ？

W3C：

在 HTML DOM 中，Element 对象表示 HTML 元素。

Element 对象可以拥有类型为元素节点、文本节点、注释节点的子节点。

NodeList 对象表示节点列表，比如 HTML 元素的子节点集合。

元素也可以拥有属性。属性是属性节点。

源码如下：

Element继承自Node

/**
 * A HTML element consists of a tag name, attributes, and child nodes (including text nodes and
 * other elements).
 * 
 * From an Element, you can extract data, traverse the node graph, and manipulate the HTML.
 * 
 * @author Jonathan Hedley, [email protected]
 */
public class Element extends Node {
    private static final List EMPTY_NODES = Collections.emptyList();
    private static final Pattern classSplit = Pattern.compile("\\s+");
    private Tag tag;
    private WeakReference> shadowChildrenRef; // points to child elements shadowed from node children
    List childNodes;
    private Attributes attributes;
    private String baseUri;

    /**
     * Create a new, standalone element.
     * @param tag tag name
     */
    public Element(String tag) {
        this(Tag.valueOf(tag), "", new Attributes());
    }

    /**
     * Create a new, standalone Element. (Standalone in that is has no parent.)
     * 
     * @param tag tag of this element
     * @param baseUri the base URI
     * @param attributes initial attributes
     * @see #appendChild(Node)
     * @see #appendElement(String)
     */
    public Element(Tag tag, String baseUri, Attributes attributes) {
        Validate.notNull(tag);
        Validate.notNull(baseUri);
        childNodes = EMPTY_NODES;
        this.baseUri = baseUri;
        this.attributes = attributes;
        this.tag = tag;
    }

    /**
     * Create a new Element from a tag and a base URI.
     * 
     * @param tag element tag
     * @param baseUri the base URI of this element. It is acceptable for the base URI to be an empty
     *            string, but not null.
     * @see Tag#valueOf(String, ParseSettings)
     */
    public Element(Tag tag, String baseUri) {
        this(tag, baseUri, null);
    }
    //...
 }

其方法如下：

so many !

还没完，继续看Node类。

【5】Node

什么是Node ？

W3C：

在 HTML DOM （文档对象模型）中，每个部分都是节点：

文档本身是文档节点
所有 HTML 元素是元素节点
所有 HTML 属性是属性节点
HTML 元素内的文本是文本节点
注释是注释节点

源码如下：

/**
 The base, abstract Node model. Elements, Documents, Comments etc are all Node instances.

 @author Jonathan Hedley, [email protected] */
public abstract class Node implements Cloneable {
    static final String EmptyString = "";
    Node parentNode;
    int siblingIndex;

    /**
     * Default constructor. Doesn't setup base uri, children, or attributes; use with caution.
     */
    protected Node() {
    }

    /**
     Get the node name of this node. Use for debugging purposes and not logic switching (for that, use instanceof).
     @return node name
     */
    public abstract String nodeName();

    /**
     * Check if this Node has an actual Attributes object.
     */
    protected abstract boolean hasAttributes();

    public boolean hasParent() {
        return parentNode != null;
    }

    /**
     * Get an attribute's value by its key. Case insensitive
     * 
     * To get an absolute URL from an attribute that may be a relative URL, prefix the key with abs,
     * which is a shortcut to the {@link #absUrl} method.
     * 
     * E.g.:
     * String url = a.attr("abs:href");
     *
     * @param attributeKey The attribute key.
     * @return The attribute, or empty string if not present (to avoid nulls).
     * @see #attributes()
     * @see #hasAttr(String)
     * @see #absUrl(String)
     */
    public String attr(String attributeKey) {
        Validate.notNull(attributeKey);
        if (!hasAttributes())
            return EmptyString;

        String val = attributes().getIgnoreCase(attributeKey);
        if (val.length() > 0)
            return val;
        else if (attributeKey.startsWith("abs:"))
            return absUrl(attributeKey.substring("abs:".length()));
        else return "";
    }

    /**
     * Get all of the element's attributes.
     * @return attributes (which implements iterable, in same order as presented in original HTML).
     */
    public abstract Attributes attributes();

    /**
     * Set an attribute (key=value). If the attribute already exists, it is replaced. The attribute key comparison is
     * case insensitive.
     * @param attributeKey The attribute key.
     * @param attributeValue The attribute value.
     * @return this (for chaining)
     */
    public Node attr(String attributeKey, String attributeValue) {
        attributes().putIgnoreCase(attributeKey, attributeValue);
        return this;
    }
    //...
 }

到这里可以归纳一下，document，element，comment etc are all Node instance。

你甚至可以认为，一个Html Document中，一切皆为节点！

其方法如下：

至此，拥有了这些方法，只要拿到Document，就可以得到任何你想要的Document中的内容。

【6】使用Jsoup进行爬虫

爬虫，听起来很高大上的一个技术。其实不然，尤其是使用Jsoup，稍微研究一下就可以很容易掌握。

主要思路如下：

获取模板url
添加头部用户代理伪装浏览器

很多网站可能会做了防爬虫处理
拿到Document
建立自己的过滤规则
获取自己想要的数据持久化

这里有一个爬虫实战，爬取模板url并进行自动翻页获取指定内容操作。

点击下载：http://download.csdn.net/download/j080624/10214159

【Docker系列四】Docker 网络 Kwan的解忧杂货铺@新空间代码工作室 s4 Docker系列 docker 网络容器
欢迎来到我的博客，很高兴能够在这里和您见面！希望您在这里可以感受到一份轻松愉快的氛围，不仅可以获得有趣的内容和知识，也可以畅所欲言、分享您的想法和见解。推荐:kwan的首页,持续学习,不断总结,共同进步,活到老学到老导航檀越剑指大厂系列:全面总结java核心技术,jvm,并发编程redis,kafka,Spring,微服务等常用开发工具系列:常用的开发工具,IDEA,Mac,Alfred,Git,
Vue3前端开发：组件化设计与状态管理 caihuayuan4 面试题汇总与解析 spring sql java 大数据课程设计
Vue3前端开发：组件化设计与状态管理一、Vue3组件化设计组件基本概念与特点是一款流行的JavaScript框架，它支持组件化设计，这意味着我们可以将页面分解成多个独立的组件，每个组件负责一部分功能，通过组件的嵌套和复用，可以快速构建复杂的用户界面。组件化设计具有以下特点：组件示例组件选项在上面的代码示例中，我们通过Vue.component方法注册了一个名为my-component的组件，这是
AJAX（Asynchronous JavaScript and XML）详解与应用风亦辰739 javascript ajax xml
一、什么是AJAX？AJAX（AsynchronousJavaScriptandXML，异步JavaScript和XML）是一种用于创建异步Web应用程序的技术。它可以在不重新加载整个网页的情况下，与服务器进行数据交换，从而提供更好的用户体验。1.1AJAX的核心特点异步通信：数据请求不会阻塞页面，提升用户体验。减少服务器负担：只获取需要的数据，减少流量。提升用户体验：网页响应速度更快，减少页面刷
java选择语句 FAQEW java
Java选择结构深度解析一、if结构体系1.单条件判断//基础if结构intscore=85;if(score>=60){System.out.println("考试通过");}//判断空值（防御性编程）Stringtext=null;if(text!=null&&!text.isEmpty()){System.out.println(text.length());}执行流程：truefalse条
Unity 与 JavaScript 的通信交互：实现跨平台的双向通信 Front_Yue 3D技术实践指南 unity javascript 3d
前言在现代游戏开发和Web应用中，Unity和JavaScript的结合越来越常见。Unity是一个强大的跨平台游戏引擎，而JavaScript是Web开发的核心技术之一。通过Unity和JavaScript的通信交互，开发者可以实现从Unity到Web页面的功能扩展，或者从Web页面控制Unity的行为。这种双向通信的能力为开发者提供了更多的可能性，例如在Unity中嵌入Web视图，或者在Web
Java有哪些编程技巧？ java
Java编程技巧：提升效率与质量的实用指南在Java编程中，掌握一些高效的编程技巧不仅可以提高开发效率，还能提升代码的可读性、可维护性和性能。以下是一些实用的Java编程技巧，供开发者参考和应用。一、代码优化技巧（一）合理使用数据类型选择合适的数据类型：根据实际需求选择合适的数据类型。例如，如果只需要存储整数，且数值范围较小，可以使用int而不是long，以节省内存。使用包装类时需谨慎：Java的
Sa-Token v1.20.0 发布，新增临时Token认证
框架介绍Sa-Token是一个轻量级Java权限认证框架，主要解决：登录认证、权限认证、分布式Session会话、单点登录、OAuth2.0等一系列权限相关问题。框架针对踢人下线、自动续签、前后台分离、分布式会话……等常见业务进行N多适配，通过sa-token，你可以以一种极简的方式实现系统的权限认证部分Sa-Tokenv1.20.0版本更新包括以下内容：新增：新增Solon适配插件，感谢大佬@刘
关于Java的变量和常量的应用 MOSCATO, 新手 java 开发语言
在Java语言中，关于数据的存储和其他语言都大差不差，都是在磁盘中找到一个位置，把数据放进去，然后给这个位置做上标记，以便后续的查找，只不过各种语言都有自己的查找和标记的方式，这里讲到的Java则是通过JVM（Java虚拟机）来实现这个功能。话跑偏了，接下来是Java常量的介绍常量的定义在Java中，常量通常通过final关键字修饰。一旦被赋值后，其值就不能被修改。例如：finalintMAX_V
JavaScript反爬技术解析与应对不做超级小白 web逆向知识碎片 web前端 javascript 开发语言 ecmascript
JavaScript反爬技术解析与应对前言在当今Web爬虫与数据抓取的生态环境中，网站运营方日益关注数据安全与隐私保护，因此逐步采用多种反爬技术来限制非授权访问。本文从JavaScript角度出发，深入剖析主流反爬策略的技术原理，并探讨相应的绕过方案，以期为研究者和开发者提供系统性的理解与实践指导。1.JavaScript反爬技术概述1.1右键禁用与开发者工具防护部分网站采用JavaScript拦
Java：从入门到创新 java
Java：从入门到创新一、Java简介Java是一种广泛使用的高级编程语言，自1995年首次发布以来，一直深受开发者的喜爱。它由SunMicrosystems公司开发，后来被Oracle公司收购。Java的设计目标是简单、健壮、安全且跨平台，这些特性使其在企业级应用开发中占据重要地位。二、Java的主要特点（一）简单易学Java的语法与C语言和C++语言很接近，但丢弃了C++中一些复杂且容易出错的
[代码规范]1_良好的命名规范能减轻工作负担啾啾大学习编程通用代码规范 Java命名规范命名规范长命名方案
欢迎来到啾啾的博客，一个致力于构建完善的Java程序员知识体系的博客，记录学习的点滴，分享工作的思考、实用的技巧，偶尔分享一些杂谈。欢迎评论交流，感谢您的阅读。目录引言命名——提炼含义减少注释类名命名接口与实现类的命名方法命名的最佳实践1.方法名的结构2.参数与返回值的隐含3.避免缩写4.逻辑与副作用的体现5.条件判断方法长命名处理——实战答疑处理方法1.利用上下文环境简化名称2.使用领域术语或缩
GIS三维可视化进阶：Three.js集成Cesium引擎实现全球地形LOD与OGC标准服务调用贝格前端工场 javascript 开发语言 ecmascript
Three.js与Cesium引擎基础介绍Three.js是一款基于JavaScript的开源三维图形库，它提供了丰富的API用于创建和操作三维场景、物体、材质等。在Web端的三维可视化领域应用广泛，因其能够在浏览器中高效渲染复杂的三维模型和场景，大大降低了开发人员创建三维交互内容的门槛。通过简单的代码，即可实现如创建三维几何体（立方体、球体等）、为物体添加材质（如纹理材质、光照材质）以及设置相机
java语言map的五种遍历方法 0319zz Java细节 java 开发语言
publicstaticvoidmain(String[]args){Mapmap=newHashMapentry:map.entrySet()){Stringkey=entry.getKey();Integervalue=entry.getValue();System.out.println("Key:"+key+",Value:"+value);}//第二种：使用for-each循环和keyS
「JavaScript深入」Socket.IO：基于 WebSocket 的实时通信库八了个戒 JavaScript系列面试宝典大前端 javascript websocket 开发语言前端
Socket.IOSocket.IO的核心特性Socket.IO的架构解析Socket.IO的工作流程Socket.IO示例：使用Node.js搭建实时聊天服务器1.安装Socket.IO2.服务器端代码（Node.js）3.客户端代码（HTML+JavaScript）4.房间功能高级功能实现1.命名空间2.中间件3.二进制传输性能优化策略1.负载均衡2.资源管理3.监控与调试安全与可靠性1.安全
Apache大数据旭哥优选大数据选题 Apache大数据旭大数据定制选题 java hadoop spark 开发语言 idea hive 数据库架构
定制旭哥服务，一对一，无中介包安装+答疑+售后态度和技术都很重要定制按需求做要求不高就实惠一点定制需提前沟通好怎么做，这样才能避免不必要的麻烦python、flask、Django、mapreduce、mysqljava、springboot、vue、echarts、hadoop、spark、hive、hbase、flink、SparkStreaming、kafka、flume、sqoop分析+推
java简单的小程序_编写一个简单的入门java小程序雷幺幺 java简单的小程序
1.创建一个java程序的步骤a打开editplus软件，选择左上角的file选项，在弹出来的菜单中选择new然后再从弹出来的菜单中选择normaltextb按住ctrl+s快捷键，保存。1选择要保存的位置2给文件命名(以大写的字母开头)3选择文件的后缀，以.java后缀结尾c进行代码的编写，所有字符我们必须都是英文输入状态下的d打开控制台(win+r在弹出左下角的命令行中输入cmd)e找到jav
Java基础7（解耦、引入工厂模式、代理设计模式、适配器设计模式、内部类）孤影恋长风 java
类设计的注意事项：类的设计主要是父类的设计子类最好不要继承一个已经完全实现的类，因为一旦发达向上转型，所调用的方法，一定是被子类覆盖过的方法，所以只会继承抽象类和接口。解耦耦合度是什么？两个对象之间相互依赖的程度，是衡量代码独立性的一个指标。软件开发追求高/低耦合度？软件开发追求低耦合度怎么才能降低代码的耦合度？降低代码的耦合度是一个非常重要的实践，它有助于提高代码的可维护性、可读性和可扩展性。引
LeetCode 21Merge Two Sorted Lists 合并两个排序链表 Java 我欲混吃与等死 LeetCode leetcode 链表 java
题目：将两个已排序的链表合并在一起。举例1：输入：list1=[1,2,4],list2=[1,3,4];输出：[1,1,2,3,4,4];举例2：输入：list1=[],list2=[];输出：[]举例3：输入：list1=[],list2=[0];输出：[0]解题思路：遍历两个链表，比较节点值来合并链表，当其中一个链表遍历完成时，将另一个链表剩余部分拼入新链表。/***Definitionfo
Java后端开发技术详解小二爱编程· java 开发语言
Java作为一门成熟的编程语言，已广泛应用于后端开发领域。其强大的生态系统和广泛的支持库使得Java成为许多企业和开发者的首选后端开发语言。随着云计算、微服务架构和大数据技术的兴起，Java后端开发的技术栈也不断演进。本文将详细介绍Java后端开发的核心技术，包括Java基础、常见框架、数据库操作、缓存技术、异步编程等。1.Java基础：理解面向对象的编程Java是一种面向对象的编程语言，面向对象
【Linux】Hadoop-3.4.1的伪分布式集群的初步配置孤独打铁匠Julian Linux linux hadoop ubuntu
配置步骤一、检查环境JDK#目前还是JDK8最适合Hadoopjava-versionecho$JAVA_HOMEHadoophadoopversionecho$HADOOP_HOME二、配置SSH免密登录Hadoop需要通过SSH管理节点（即使在伪分布式模式下）sudoaptinstallopenssh-server#安装SSH服务（如未安装）cd~/.ssh/ssh-keygen-trsa#生
Spring Boot 项目 90% 存在这 15 个致命漏洞，你的代码在裸奔吗？风象南原创随笔 java spring boot 后端 web安全系统安全
文章首发公众号【风象南】SpringBoot作为一款广泛使用的Java开发框架，虽然为开发者提供了诸多便利，但也并非无懈可击，其安全漏洞问题不容忽视。本文将深入探讨SpringBoot常见的安全漏洞类型、产生原因以及相应的解决方案，帮助开发者更好地保障应用程序的安全。1.SQL注入漏洞漏洞描述：当应用程序使用用户输入的数据来构建SQL查询时，如果没有进行适当的过滤或转义，攻击者就可以通过构造恶意的
golang jwt挖坑 qiang527052 golang个人笔记 golang jwt
golangjwt使用golangjwt使用中遇到的一个坑，特此记录。具体描述：因为公司需要，现有架构jwt生成token的代码是java实现的，然后现在在golang中需要对此token进行解析。java用到的jar包：io.jsonwebtoken.jjwt0.9.0golang用到的库：github.com/dgrijalva/jwt-gojava生成token测试代码如下：publicst
入门级带你实现一个安卓智能家居APP（2）kotlin版本一粒程序米 android kotlin 智能家居 WiFi 单片机
前言上一篇写过java版本的实现，这一篇就写一下kotlin版本的吧。效果展示本APP是通过tcp/ip协议与连了WiFi的单片机通信。其实除了主活动类和新建项目时有一丢丢不同，其他的都是一样的哈~第一步：你得会一点点kotlin基础，建议看一本书，是郭霖大神些的《第一行代码》第三版，里面除了安卓的基础教学，还有kotlin的。第二步：建议看一本书，是郭霖大神些的《第一行代码》，先入门安卓基础。不
vscode设置console.log的快捷输出方式活宝小娜 vscode vscode ide 编辑器
vscode设置console.log的快捷输出方式编辑器中输入clg回车，可以直接输出console.log，并且同步输出变量的字符串和值1、打开vscode点击左上角的文件2、找到首选项3、点击用户代码配置4、在顶部输入框种输入javas，选择JavaScript选项5、打开里面注释的代码，写入如下内容{//Placeyoursnippetsforjavascripthere."Printto
【Java se】程序逻辑控制 MABO-mb java 开发语言前端
一、顺序结构顺序结构比较简单，按照代码书写的顺序一行一行执行。System.out.println("aaa");System.out.println("bbb");System.out.println("ccc");//运行结果aaabbbccc如果调整代码的书写顺序,则执行顺序也发生变化System.out.println("aaa");System.out.println("ccc");Sy
springboot基于bs 架构的母婴用户商城全程服务管理系统(源码+lw+部署文档+讲解等) 源码哆哆V+ymhydo Java毕设优质源码 spring boot 架构后端
具体实现截图技术栈后端框架SpringBoot采用springboot作为后台的框架，java框架具有简化配置和开发的效率。Spring框架目前是很多java开发者的首选框架，Spring主要有两大功能，控制反转和面向切面的编程。控制反转（IOC）可以实现代码的依赖注入，减少代码的耦合性，大大提高了软件质量，面向切面编程（AOP）主要是应用动态代理的技术对代码逻辑进行分离，可以实现对代码的重用，适
Java对象的hashcode 阿黄学技术 Java基础 java 开发语言
在Java中，hashcode和equals方法是Object类的两个重要方法，它们在处理对象比较和哈希集合（如HashMap、HashSet）时起着关键作用。对于equals大部分Java程序员都不陌生，它通常是比较两个对象的内容(值)是否相等(==双等于比较对象的内存地址)，如果是Object中的equals方法默认就是比较内存地址(在没有被重写的情况下和==一样)。hashCode方法返回对
Java中卫语句的设计思想而为. java 服务器开发语言
卫语句（GuardClauses）是一种通过提前返回简化条件嵌套、提升代码可读性的编程技巧。其核心思想是优先处理异常或边界情况，让主逻辑保持扁平化。以下是deepseek做出的设计思想详解：核心设计原则FailFast（快速失败）在函数入口处立即检查非法参数或无效状态，若不符合条件则提前终止（如返回、抛异常），避免后续无效操作。减少嵌套层级用卫语句替换多层if-else嵌套，将代码从“箭头型”结构
Java进阶面试速记登陆成功200 JAVA进阶开发语言 java
注解注解@Override类似一个标签,作用在方法上,表示此方法是从父类中重写而来注解是java中的标注方式,可以最用在类,方法,变量,参数成员上在编译期间,会被编译到字节码文件中,运行时通过反射机制获得注解内容,进行解析.内置注解java中内定好的注解例如@Override@Deprecated-标记过时方法。如果使用该方法，会报编译警告。@SuppressWarnings-指示编译器去忽略注解
手写promise ,实现 then ,catch,finally,resolve,reject,all,allSettled 会飞的鱼先生前端 javascript 开发语言
完整代码原生Promise的用法1.Promise是JavaScript中用于处理异步操作的重要工具。它代表了一个异步操作的最终完成或失败，并且使异步方法可以像同步方法那样返回值。resolve：当异步操作成功时调用的函数，用于将Promise的状态改为fulfilled，并将结果值传递给后续的.then()方法。reject：当异步操作失败时调用的函数，用于将Promise的状态改为reject
异常的核心类Throwable 无量 java 源码异常处理 exception
java异常的核心是Throwable，其他的如Error和Exception都是继承的这个类里面有个核心参数是detailMessage，记录异常信息，getMessage核心方法，获取这个参数的值，我们可以自己定义自己的异常类，去继承这个Exception就可以了，方法基本上，用父类的构造方法就OK，所以这么看异常是不是很easy package com.natsu;
mongoDB 游标（cursor）实现分页迭代开窍的石头 mongodb
上篇中我们讲了mongoDB 中的查询函数，现在我们讲mongo中如何做分页查询如何声明一个游标 var mycursor = db.user.find({_id:{$lte:5}}); 迭代显示游标数
MySQL数据库INNODB 表损坏修复处理过程 0624chenhong tomcat mysql
最近mysql数据库经常死掉，用命令net stop mysql命令也无法停掉，关闭Tomcat的时候，出现Waiting for N instance(s) to be deallocated 信息。查了下，大概就是程序没有对数据库连接释放，导致Connection泄露了。因为用的是开元集成的平台，内部程序也不可能一下子给改掉的，就验证一下咯。启动Tomcat,用户登录系统，用netstat -
剖析如何与设计人员沟通不懂事的小屁孩工作
最近做图烦死了，不停的改图，改图……。烦，倒不是因为改，而是反反复复的改，人都会死。很多需求人员不知该如何与设计人员沟通，不明白如何使设计人员知道他所要的效果，结果只能是沟通变成了扯淡，改图变成了应付。那应该如何与设计人员沟通呢？我认为设计人员与需求人员先天就存在语言障碍。对一个合格的设计人员来说，整天玩的都是点、线、面、配色，哪种构图看起来协调；哪种配色看起来合理心里跟明镜似的，
qq空间刷评论工具换个号韩国红果果 JavaScript
var a=document.getElementsByClassName('textinput'); var b=[]; for(var m=0;m<a.length;m++){ if(a[m].getAttribute('placeholder')!=null) b.push(a[m]) } var l
S2SH整合之session 灵静志远 spring AOP struts session
错误信息： Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'cartService': Scope 'session' is not active for the current thread; consider defining a scoped
xmp标签 a-john 标签
今天在处理数据的显示上遇到一个问题： var html = '<li><div class="pl-nr"><span class="user-name">' + user + '</span>' + text + '</div></li>'; ulComme
Ajax的常用技巧（2）---实现Web页面中的级联菜单 aijuans Ajax
在网络上显示数据，往往只显示数据中的一部分信息，如文章标题，产品名称等。如果浏览器要查看所有信息，只需点击相关链接即可。在web技术中，可以采用级联菜单完成上述操作。根据用户的选择，动态展开，并显示出对应选项子菜单的内容。在传统的web实现方式中，一般是在页面初始化时动态获取到服务端数据库中对应的所有子菜单中的信息，放置到页面中对应的位置，然后再结合CSS层叠样式表动态控制对应子菜单的显示或者隐
天-安-门，好高 atongyeye 情感
我是85后，北漂一族，之前房租1100，因为租房合同到期，再续，房租就要涨150。最近网上新闻，地铁也要涨价。算了一下，涨价之后，每次坐地铁由原来2块变成6块。仅坐地铁费用，一个月就要涨200。内心苦痛。晚上躺在床上一个人想了很久，很久。我生在农
android 动画百合不是茶 android 透明度平移缩放旋转
android的动画有两种 tween动画和Frame动画 tween动画;,透明度,缩放,旋转,平移效果 Animation 动画 AlphaAnimation 渐变透明度 RotateAnimation 画面旋转 ScaleAnimation 渐变尺寸缩放 TranslateAnimation 位置移动 Animation
查看本机网络信息的cmd脚本 bijian1013 cmd
@echo 您的用户名是：%USERDOMAIN%\%username%>"%userprofile%\网络参数.txt" @echo 您的机器名是：%COMPUTERNAME%>>"%userprofile%\网络参数.txt" @echo ___________________>>"%userprofile%\
plsql 清除登录过的用户征客丶 plsql
tools---preferences----logon history---history 把你想要删除的删除 -------------------------------------------------------------------- 若有其他凝问或文中有错误，请及时向我指出，我好及时改正，同时也让我们一起进步。 email ： binary_spac
【Pig一】Pig入门 bit1129 pig
Pig安装 1.下载pig wget http://mirror.bit.edu.cn/apache/pig/pig-0.14.0/pig-0.14.0.tar.gz 2. 解压配置环境变量如果Pig使用Map/Reduce模式，那么需要在环境变量中，配置HADOOP_HOME环境变量 expor
Java 线程同步几种方式 BlueSkator volatile synchronized ThredLocal ReenTranLock Concurrent
为何要使用同步？ java允许多线程并发控制，当多个线程同时操作一个可共享的资源变量时（如数据的增删改查），将会导致数据不准确，相互之间产生冲突，因此加入同步锁以避免在该线程没有完成操作之前，被其他线程的调用，从而保证了该变量的唯一性和准确性。 1.同步方法&
StringUtils判断字符串是否为空的方法（转帖） BreakingBad null StringUtils “”
转帖地址：http://www.cnblogs.com/shangxiaofei/p/4313111.html public static boolean isEmpty(String str) 　　判断某字符串是否为空，为空的标准是 str== null 或 str.length()== 0
编程之美-分层遍历二叉树 bylijinnan java 数据结构算法编程之美
import java.util.ArrayList; import java.util.LinkedList; import java.util.List; public class LevelTraverseBinaryTree { /** * 编程之美分层遍历二叉树 * 之前已经用队列实现过二叉树的层次遍历，但这次要求输出换行，因此要
jquery取值和ajax提交复习记录 chengxuyuancsdn jquery取值 ajax提交
// 取值 // alert($("input[name='username']").val()); // alert($("input[name='password']").val()); // alert($("input[name='sex']:checked").val()); // alert($("
推荐国产工作流引擎嵌入式公式语法解析器-IK Expression comsci java 应用服务器工作 Excel 嵌入式
这个开源软件包是国内的一位高手自行研制开发的，正如他所说的一样，我觉得它可以使一个工作流引擎上一个台阶。。。。。。欢迎大家使用，并提出意见和建议。。。 ----------转帖--------------------------------------------------- IK Expression是一个开源的（OpenSource），可扩展的（Extensible），基于java语言
关于系统中使用多个PropertyPlaceholderConfigurer的配置及PropertyOverrideConfigurer daizj spring
1、PropertyPlaceholderConfigurer Spring中PropertyPlaceholderConfigurer这个类，它是用来解析Java Properties属性文件值，并提供在spring配置期间替换使用属性值。接下来让我们逐渐的深入其配置。基本的使用方法是：(1) <bean id="propertyConfigurerForWZ&q
二叉树:二叉搜索树 dieslrae 二叉树
所谓二叉树,就是一个节点最多只能有两个子节点,而二叉搜索树就是一个经典并简单的二叉树.规则是一个节点的左子节点一定比自己小,右子节点一定大于等于自己(当然也可以反过来).在树基本平衡的时候插入,搜索和删除速度都很快,时间复杂度为O(logN).但是,如果插入的是有序的数据,那效率就会变成O(N),在这个时候,树其实变成了一个链表. tree代码:
C语言字符串函数大全 dcj3sjt126com c function
C语言字符串函数大全函数名: stpcpy 功能: 拷贝一个字符串到另一个用法: char *stpcpy(char *destin, char *source); 程序例: #include <stdio.h> #include <string.h> int main
友盟统计页面技巧 dcj3sjt126com 技巧
在基类调用就可以了, 基类ViewController示例代码 -(void)viewWillAppear:(BOOL)animated { [super viewWillAppear:animated]; [MobClick beginLogPageView:[NSString stringWithFormat:@"%@",self.class]];
window下在同一台机器上安装多个版本jdk，修改环境变量不生效问题处理办法 flyvszhb java jdk
window下在同一台机器上安装多个版本jdk，修改环境变量不生效问题处理办法本机已经安装了jdk1.7，而比较早期的项目需要依赖jdk1.6，于是同时在本机安装了jdk1.6和jdk1.7. 安装jdk1.6前，执行java -version得到 C:\Users\liuxiang2>java -version java version "1.7.0_21&quo
Java在创建子类对象的同时会不会创建父类对象 happyqing java 创建子类对象父类对象
1.在thingking in java 的第四版第六章中明确的说了，子类对象中封装了父类对象， 2."When you create an object of the derived class, it contains within it a subobject of the base class. This subobject is the sam
跟我学spring3 目录贴及电子书下载 jinnianshilongnian spring
一、《跟我学spring3》电子书下载地址：《跟我学spring3》（1-7 和 8-13） http://jinnianshilongnian.iteye.com/blog/pdf 跟我学spring3系列 word原版下载二、源代码下载最新依
第12章 Ajax（上） onestopweb Ajax
index.html <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/
BI and EIM 4.0 at a glance blueoxygen BO
http://www.sap.com/corporate-en/press.epx?PressID=14787 有机会研究下EIM家族的两个新产品~~~~ New features of the 4.0 releases of BI and EIM solutions include: Real-time in-memory computing –
Java线程中yield与join方法的区别 tomcat_oracle java
长期以来，多线程问题颇为受到面试官的青睐。虽然我个人认为我们当中很少有人能真正获得机会开发复杂的多线程应用(在过去的七年中，我得到了一个机会)，但是理解多线程对增加你的信心很有用。之前，我讨论了一个wait()和sleep()方法区别的问题，这一次，我将会讨论join()和yield()方法的区别。坦白的说，实际上我并没有用过其中任何一个方法，所以，如果你感觉有不恰当的地方，请提出讨论。 &nb
android Manifest.xml选项阿尔萨斯 Manifest
结构继承关系 public final class Manifest extends Objectjava.lang.Objectandroid.Manifest 内部类 class Manifest.permission权限 class Manifest.permission_group权限组构造函数 public Manifest () 详细 androi
Oracle实现类split函数的方 zhaoshijie oracle
关键字：Oracle实现类split函数的方项目里需要保存结构数据，批量传到后他进行保存，为了减小数据量，子集拼装的格式，使用存储过程进行保存。保存的过程中需要对数据解析。但是oracle没有Java中split类似的函数。从网上找了一个，也补全了一下。 CREATE OR REPLACE TYPE t_split_100 IS TABLE OF VARCHAR2(100); cr