java xpath

Vipan Singla e-mail: [email protected]

XML and XPath Usage

Most common DOM interfaces:

Node: The base datatype of the DOM.

Element: The vast majority of the objects you'll deal with are "Elements".

Attr: Represents an attribute of an "Element".

Text: The actual content of an "Element" or "Attr".

Document: Represents the entire XML document. A "Document" object is often referred to as a DOM tree.

Common DOM methods

Document.getDocumentElement()

Returns the root "element" of the document, that is, the top-level tag in the document. It is different from the "root" itself, which is just "/"; the root element resides below "/". Other nodes, such as comments or processing instructions, can also appear directly below "/".

Node.getFirstChild() and Node.getLastChild()

Returns the first or last child of a given Node.

Node.getNextSibling() and Node.getPreviousSibling()

Return the next or previous node (element, text, comment and so on) at the same level as the node itself in the document tree.

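Element.getAttribute(attrName)

For a given Element, returns the value of the attribute with the requested name as a String (an empty string if the attribute is absent). For example, to get the value of the attribute named id, use getAttribute("id"). If you want the Attr object itself, use getAttributeNode("id").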

getElementsByTagName("tag_name")

Retrieve all of the <tag_name> elements in the document. This method saves the trouble of writing code to traverse the entire tree. Or, you can use XPath. See below.
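The DOM methods listed above can be tried out with a short sketch like the following. The sample XML, the class name DomMethodsSketch and its helper methods are made up for illustration; the sample is written without whitespace between tags so that every child node is an element. In the standard DOM API, getAttribute is declared on Element and returns the attribute value as a String.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomMethodsSketch {

    // Hypothetical sample document (no whitespace between tags).
    static final String XML =
        "<library><book id=\"b1\"><title>One</title></book>"
        + "<book id=\"b2\"><title>Two</title></book></library>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    /** Walks the sample with the DOM methods listed above and reports what it finds. */
    static String describe() {
        Document doc = parse(XML);
        Element root = doc.getDocumentElement();          // <library>
        Element first = (Element) root.getFirstChild();   // <book id="b1">
        Node next = first.getNextSibling();               // <book id="b2">
        Element last = (Element) root.getLastChild();     // also <book id="b2">
        int titles = doc.getElementsByTagName("title").getLength();
        return root.getNodeName()
                + " first=" + first.getAttribute("id")
                + " next=" + ((Element) next).getAttribute("id")
                + " last=" + last.getAttribute("id")
                + " titles=" + titles;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

With a pretty-printed file, getFirstChild() would usually return a whitespace Text node instead of an element, so the casts above would need a node-type check.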

All Seven Kinds of Nodes

The root

Elements

Text

Attributes

Namespaces

Processing instructions

Comments

XPath Abbreviated Syntax Examples
In all cases below, the "context node" is the node you want to start searching from in a pre-parsed "Document" object. You must be holding a reference to the context node. Remember, a Document object is itself a type of Node. For example, in:
NodeIterator nl = XPathAPI.selectNodeIterator(node, "para");
the argument node is the context node you want to start searching from. You may obtain the "Document" object using:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(new File("C:\\some_dir\\some_file.xml"));
The parse method can also take an "InputStream", a URI string or an XML "InputSource" object. After you get the "Document" object, you should merge adjacent "Text" nodes into a single "Text" node (and drop empty ones) using:
doc.getDocumentElement().normalize();
Otherwise, an element's text may be split across several adjacent "Text" nodes, and you are going to have a tough time reaching the useful textual content within an element. Note that normalize() does not remove whitespace-only "Text" nodes that sit between elements.
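What normalize() does to adjacent "Text" nodes can be seen in a small sketch. The class name NormalizeSketch and the <note> element are hypothetical; the document is built in memory rather than parsed, so that two adjacent "Text" nodes are guaranteed to exist.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeSketch {

    /** Returns the child counts of <note> before and after normalize(). */
    static int[] childCounts() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element note = doc.createElement("note");
            doc.appendChild(note);
            // Two adjacent Text nodes, as can happen after programmatic edits.
            note.appendChild(doc.createTextNode("Hello, "));
            note.appendChild(doc.createTextNode("world"));
            int before = note.getChildNodes().getLength();
            note.normalize();   // merges the adjacent Text nodes into one
            int after = note.getChildNodes().getLength();
            return new int[] {before, after};
        } catch (Exception e) {
            throw new IllegalStateException("could not build sample document", e);
        }
    }

    public static void main(String[] args) {
        int[] counts = childCounts();
        System.out.println("before=" + counts[0] + " after=" + counts[1]);
    }
}
```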
para selects the "para" element children of the context node

* selects all element children of the context node

text() selects all text node children of the context node

@name selects the "name" attribute of the context node

@* selects all the attributes of the context node

para[1] selects the first "para" child of the context node

para[last()] selects the last "para" child of the context node

*/para selects all para grandchildren of the context node

/doc/chapter[5]/section[2] selects the second section of the fifth chapter of the doc

chapter//para selects the "para" element descendants of the "chapter" element children of the context node

//para selects all the para descendants of the "document root" and thus selects all "para" elements in the same document as the context node

//olist/item selects all the "item" elements in the same document as the context node that have an "olist" parent

. selects the context node itself

.//para selects the "para" element descendants of the context node

.. selects the parent of the context node

../@lang selects the "lang" attribute of the parent of the context node

para[@type="warning"] selects all "para" children of the context node that have a "type" attribute with value "warning"

para[@type="warning"][5] selects the fifth "para" child of the context node that has a "type" attribute with value "warning"

para[5][@type="warning"] selects the fifth "para" child of the context node if that child has a "type" attribute with value "warning"

chapter[title="Introduction"] selects the "chapter" children of the context node that have one or more "title" children with string-value equal to "Introduction" (Use this to match to a particular element which contains the text value you desire)

chapter[title] selects the "chapter" children of the context node that have one or more "title" children

employee[@secretary and @assistant] selects all the "employee" children of the context node that have both a "secretary" attribute and an "assistant" attribute
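A few of the expressions above, in particular the effect of predicate order, can be checked with a sketch like this one. It uses the JDK's built-in javax.xml.xpath API instead of Xalan's XPathAPI so that it runs without extra jars; the sample XML and the class name PredicateOrderSketch are assumptions made for illustration. The context node is the <doc> element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PredicateOrderSketch {

    // Hypothetical sample: the first and third "para" carry type="warning".
    static final String XML =
        "<doc><para type=\"warning\">w1</para><para>p2</para>"
        + "<para type=\"warning\">w2</para></doc>";

    /** Evaluates expr with <doc> as the context node; returns the string result. */
    static String eval(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] exprs = {
            "para[1]",
            "para[last()]",
            "para[@type='warning'][2]",   // second of the "warning" paras
            "para[2][@type='warning']"    // second para, only if it is a warning
        };
        for (String expr : exprs) {
            System.out.println(expr + " => '" + eval(expr) + "'");
        }
    }
}
```

The last expression yields an empty string here because the second "para" has no "type" attribute, while the third expression finds the second "para" among those that do.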

The default axis is "child". For example, a location path div/para is short for child::div/child::para.

The abbreviation for attribute:: is @. For example, a location path para[@type="warning"] is short for child::para[attribute::type="warning"].

// is short for /descendant-or-self::node()/. For example, //para is short for /descendant-or-self::node()/child::para. Here, even a "para" element that is a document element will be selected since the document element node is a child of the root node.

The location path //para[1] does not mean the same as the location path /descendant::para[1]. The latter selects the first descendant para element; the former selects all descendant para elements that are the first para children of their parents.
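The difference between //para[1] and /descendant::para[1] is easy to confirm by counting the selected nodes. The sketch below uses the JDK's javax.xml.xpath API (not Xalan's XPathAPI); the sample XML and the class name FirstParaSketch are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FirstParaSketch {

    // Hypothetical sample: two sections, each with its own first "para".
    static final String XML =
        "<doc><sec><para>a</para><para>b</para></sec>"
        + "<sec><para>c</para></sec></doc>";

    /** Returns how many nodes the given node-set expression selects. */
    static double count(String nodeSetExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate("count(" + nodeSetExpr + ")", doc, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + nodeSetExpr, e);
        }
    }

    public static void main(String[] args) {
        // Every para that is the first para child of its parent: a and c.
        System.out.println("//para[1]            -> " + count("//para[1]"));
        // Only the first para in document order: a.
        System.out.println("/descendant::para[1] -> " + count("/descendant::para[1]"));
    }
}
```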

A location step of . is short for self::node(). This is particularly useful in conjunction with //. For example, the location path .//para is short for self::node()/descendant-or-self::node()/child::para and so will select all para descendant elements of the context node.

Similarly, a location step of .. is short for parent::node(). For example, ../title is short for parent::node()/child::title and so will select the title children of the parent of the context node.
Demonstration Example of Using XPath in a Java Program
Save this code in XPathDemo.java file:
import java.io.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.w3c.dom.*;
import org.w3c.dom.traversal.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import org.apache.xpath.*;


/**
 * This class demonstrates how to use Java to parse an XML file and get
 * any element's content or attribute's value WITHOUT "walking the tree".
 * It uses XPath to achieve this goal.  Also shown is a trivial usage of
 * an XML transform to print the parsed XML file to console.
 *
 * Some of the program snippets are by http://xml.apache.org.
 *
 */
public class XPathDemo {

  public static void main(String[] args) {

    if (args.length < 2) {
      System.out.println("Usage: ");
      System.out.println(
        "java -classpath xerces.jar;.;xalan.jar "
        + " XPathDemo your-file.xml your-xpath-string");
      return;
    }

    try {

      /****************************************************************
       * How to turn an XML file into a document object in Java
       ****************************************************************/

      System.out.println("Parsing XML file " + args[0] + " ...");

      DocumentBuilderFactory docBuilderFactory =
                    DocumentBuilderFactory.newInstance();
      DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
      // Parse the XML file and build the Document object in RAM
      Document doc = docBuilder.parse(new File(args[0]));

      // Normalize text representation.
      // Collapses adjacent text nodes into one node.
      doc.getDocumentElement().normalize();
      /****************************************************************/


      /****************************************************************
       * How to use xpath to extract info from document object in Java
       ****************************************************************/
      String xpath = args[1];
      System.out.println("\nQuerying DOM using xpath string:" + xpath);

      // Catches the first node that meets the criteria of xpath string
      String str = XPathAPI.eval(doc, xpath).toString();
      System.out.println("=>" + str + "<=\n");
      /****************************************************************/


      /****************************************************************
       * How to get root node of the document object
       ****************************************************************/
      Node root = doc.getDocumentElement();
      System.out.println("\nRoot element of the doc is =>"
                + root.getNodeName() + "<=");
      /****************************************************************/


      /****************************************************************
       * How to print the parsed xml file right back to system out
       ****************************************************************/
      String xpathString = args[1];
      // Set up an identity transformer to use as serializer.
      // This one can write input to output stream
      Transformer serializer =
          TransformerFactory.newInstance().newTransformer();
      serializer.setOutputProperty(
            OutputKeys.OMIT_XML_DECLARATION, "yes");

      // Use the simple XPath API to select a nodeIterator.
      System.out.println("\nPrinting subtree under xpath =>"
                        + xpathString + "<=");
      NodeIterator nl = XPathAPI.selectNodeIterator(doc, xpathString);

      Node n;
      while ((n = nl.nextNode()) != null) {
        // Serialize the found nodes to System.out
        serializer.transform(
            new DOMSource(n),
            new StreamResult(System.out));
      }
      /****************************************************************/

    }
    catch (SAXParseException err) {
      String msg =
        "** SAXParseException"
          + ", line "
          + err.getLineNumber()
          + ", uri "
          + err.getSystemId()
          + "\n"
          + "   "
          + err.getMessage();
      System.out.println(msg);
      // print stack trace
      Exception x = err.getException();
      ((x == null) ? err : x).printStackTrace();
    }
    catch (SAXException e) {
      String msg = "SAXException";
      System.out.println(msg);
      Exception x = e.getException();
      ((x == null) ? e : x).printStackTrace();
    }
    catch (Exception e) {
      e.printStackTrace();
    }
    catch (Throwable t) {
      t.printStackTrace();
      String msg = "Some other exception while getting XML";
      System.out.println(msg);
    }
  }
}
Download Xalan from http://xml.apache.org, extract/unzip the downloaded file, find the xerces.jar and xalan.jar files, and copy them into the same directory where you saved XPathDemo.java (just to make the demonstration easier).
The download is about 7MB although the two files you need are about 2MB combined. The rest is documentation and the full Java source of Xalan!
Compile XPathDemo.java using (the ; classpath separator is for Windows; use : on Unix-like systems):
javac -classpath xerces.jar;.;xalan.jar XPathDemo.java
Get or create any XML file. Here is a simple example. Save it as, say, example.xml file in the same directory as the above files (just to make the demonstration easier).
<demo-xpath>
  <database-access db-name="db1">
    Here is to xpath!
    <username>scott</username>
    <password>tiger</password>
    May be some text here.
    Some more text here.
  </database-access>
  Last text line!
</demo-xpath>
Now you have Apache's XML parser in xerces.jar, the XPath API in xalan.jar, your Java program in XPathDemo.class and a sample XML file, example.xml. You can run your Java program, pass it the XML file name and any XPath string, and see what the program gives you! Some generic XPath strings to try are . for the current node (in this Java program, the same as the root node) and / for the root node.
Run XPathDemo using these commands one by one as examples:
java -classpath xerces.jar;.;xalan.jar XPathDemo example.xml /
java -classpath xerces.jar;.;xalan.jar XPathDemo example.xml .
java -classpath xerces.jar;.;xalan.jar XPathDemo example.xml /demo-xpath
java -classpath xerces.jar;.;xalan.jar XPathDemo example.xml //@db-name
java -classpath xerces.jar;.;xalan.jar XPathDemo example.xml //username
These runs will demonstrate different ways to use XPath to get the content of an element or the value of an attribute.
If you specify a non-existent element or attribute, the toString() method of the XObject obtained from the XPathAPI.eval(...) method returns an empty string rather than throwing a NullPointerException, by design. Actually, a subclass of XObject, XNull, is returned whose toString() method has been programmed to return an empty string. See Xalan's javadoc.
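The same kind of query can be sketched with the JDK's built-in javax.xml.xpath API, which needs no extra jars. This is an alternative to the Xalan XPathAPI calls used above, not the demo's own method; the class name JaxpQuerySketch is hypothetical, and the XML is the example.xml sample embedded as a string. With this API, too, a query that matches nothing evaluates to an empty string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class JaxpQuerySketch {

    // Contents of the example.xml sample shown above.
    static final String XML =
        "<demo-xpath>\n"
        + "  <database-access db-name=\"db1\">\n"
        + "    Here is to xpath!\n"
        + "    <username>scott</username>\n"
        + "    <password>tiger</password>\n"
        + "    May be some text here.\n"
        + "    Some more text here.\n"
        + "  </database-access>\n"
        + "  Last text line!\n"
        + "</demo-xpath>";

    /** Evaluates expr against the sample document and returns the string result. */
    static String eval(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("//username        => '" + eval("//username") + "'");
        System.out.println("//@db-name        => '" + eval("//@db-name") + "'");
        // No such element: the result is an empty string, not an exception.
        System.out.println("//no-such-element => '" + eval("//no-such-element") + "'");
    }
}
```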
Core Functions

Each function in the function library is specified using a function prototype, which gives the return type, the name of the function, and the type of the arguments. If an argument type is followed by a question mark, then the argument is optional; otherwise, the argument is required.

Node-Set Functions

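number last(): The context size, that is, the number of nodes in the node-set currently being evaluated. Used as a position, it identifies the last node.

number position(): The context position, that is, the 1-based position of the context node within the node-set currently being evaluated.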

number count(node-set): Number of nodes in the node-set.

node-set id(object): id("foo") selects the element with unique ID "foo" and id("foo")/child::para[position()=5] selects the fifth "para" child of the element with unique ID "foo".

string local-name(node-set?): Local part of the expanded-name of the node in the argument node-set that is first in document order. If the argument node-set is empty or the first node has no expanded-name, an empty string is returned. If the argument is omitted, it defaults to a node-set with the context node as its only member.

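string namespace-uri(node-set?): Namespace URI of the expanded-name of the node in the argument node-set that is first in document order. An empty string is returned if the node-set is empty, or if the first node has no expanded-name or no namespace URI. If the argument is omitted, it defaults to the context node.

string name(node-set?): Qualified name (QName) of the node in the argument node-set that is first in document order, including any namespace prefix (for example, xsl:template). If the argument is omitted, it defaults to the context node.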
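The name-related functions differ only when namespaces are involved, which a small sketch can show. It uses the JDK's javax.xml.xpath API with a namespace-aware parser; the sample XML, the urn:example namespace and the class name NodeSetFunctionsSketch are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NodeSetFunctionsSketch {

    // Hypothetical sample: one prefixed child followed by two unprefixed ones.
    static final String XML =
        "<r xmlns:p=\"urn:example\"><p:item/><item/><item/></r>";

    /** Evaluates expr against the sample document and returns the string result. */
    static String eval(String expr) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);   // needed for namespace-uri()
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] exprs = {
            "count(/*/*)",                // all element children of the root element
            "local-name(/*/*[1])",        // name without the prefix
            "name(/*/*[1])",              // qualified name, with the prefix
            "namespace-uri(/*/*[1])",
            "name(/*/*[position() = last()])"
        };
        for (String expr : exprs) {
            System.out.println(expr + " => '" + eval(expr) + "'");
        }
    }
}
```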

String Functions
string string(object?): Converts an object to a string as follows:

A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.

A number is converted to a string as follows:

NaN is converted to the string NaN

positive zero is converted to the string 0

negative zero is converted to the string 0

positive infinity is converted to the string Infinity

negative infinity is converted to the string -Infinity

if the number is an integer, the number is represented in decimal form as a Number with no decimal point and no leading zeros, preceded by a minus sign (-) if the number is negative

otherwise, the number is represented in decimal form as a Number including a decimal point with at least one digit before the decimal point and at least one digit after the decimal point, preceded by a minus sign (-) if the number is negative.

The boolean false value is converted to the string false. The boolean true value is converted to the string true.

An object of a type other than the four basic types is converted to a string in a way that is dependent on that type.

If the argument is omitted, it defaults to a node-set with the context node as its only member.
NOTE: The string function is not intended for converting numbers into strings for presentation to users. The format-number function and xsl:number element in [XSLT] provide this functionality.

string concat(string, string, string*): Concatenates its arguments.

boolean starts-with("string1", "string2"): Checks if "string1" starts with "string2".

boolean contains("string1", "string2"): Checks if "string1" contains "string2".

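string substring-before("string1", "string2"): Returns the part of "string1" that precedes the first occurrence of "string2", or an empty string if "string1" does not contain "string2".

string substring-after("string1", "string2"): Returns the part of "string1" that follows the first occurrence of "string2", or an empty string if "string1" does not contain "string2".
string substring(string, number1, number2?): Substring starting at position number1. If present, number2 is the length of the substring (not an end position); otherwise the substring runs to the end of the string. For example, substring("12345", 2, 3) returns "234".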
More precisely, each character in the string (see [3.6 Strings]) is considered to have a numeric position: the position of the first character is 1, the position of the second character is 2 and so on. This differs from Java and ECMAScript, in which the String.substring method treats the position of the first character as 0.
The returned substring contains those characters for which the position of the character is greater than or equal to the rounded value of the second argument and, if the third argument is specified, less than the sum of the rounded value of the second argument and the rounded value of the third argument; the comparisons and addition used for the above follow the standard IEEE 754 rules; rounding is done as if by a call to the round function. The following examples illustrate various unusual cases:
substring("12345", 1.5, 2.6) returns "234"

substring("12345", 0, 3) returns "12"

substring("12345", 0 div 0, 3) returns ""

substring("12345", 1, 0 div 0) returns ""

substring("12345", -42, 1 div 0) returns "12345"

substring("12345", -1 div 0, 1 div 0) returns ""
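The substring cases above can be run through an XPath engine to see the 1-based, length-oriented behaviour first-hand. This sketch uses the JDK's javax.xml.xpath API and evaluates each expression against an empty document, since none of them depends on a context node; the class name SubstringSketch is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class SubstringSketch {

    /** Evaluates a context-free XPath expression and returns the string result. */
    static String eval(String expr) {
        try {
            Document empty = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            return XPathFactory.newInstance().newXPath().evaluate(expr, empty);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] exprs = {
            "substring('12345', 2, 3)",      // start at position 2, take 3 characters
            "substring('12345', 1.5, 2.6)",  // arguments are rounded first
            "substring('12345', 0, 3)",      // position 0 lies before the first character
            "substring('12345', 0 div 0, 3)",
            "substring('12345', 1, 0 div 0)",
            "substring('12345', -42, 1 div 0)",
            "substring('12345', -1 div 0, 1 div 0)"
        };
        for (String expr : exprs) {
            System.out.println(expr + " => '" + eval(expr) + "'");
        }
    }
}
```

For comparison, Java's own "12345".substring(1, 4) also yields "234", but with a 0-based start index and an exclusive end index rather than a 1-based position and a length.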
number string-length(string?): Number of characters in the string. If no argument, returns length of string-value of context node.

string normalize-space(string?): Removes leading and trailing whitespace and replaces sequences of whitespace characters with a single space. If no argument is given, it operates on the string-value of the context node.

string translate(string, string1, string2): In "string", replaces occurrences of characters in "string1" with character at the corresponding position in "string2". For example, translate("bar","abc","ABC") returns the string BAr. If there is a character in the second argument string with no character at a corresponding position in the third argument string (because the second argument string is longer than the third argument string), then occurrences of that character in the first argument string are removed. For example, translate("--aaa--","abc-","ABC") returns "AAA". If a character occurs more than once in the second argument string, then the first occurrence determines the replacement character. If the third argument string is longer than the second argument string, then excess characters are ignored. Generally used for case-conversion.
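The string functions above can be exercised in one go with a sketch like the following. It uses the JDK's javax.xml.xpath API against an empty document; the class name StringFunctionsSketch and the sample strings (other than the two translate examples from the text) are made up for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class StringFunctionsSketch {

    static final String LOWER = "abcdefghijklmnopqrstuvwxyz";
    static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /** Evaluates a context-free XPath expression and returns the string result. */
    static String eval(String expr) {
        try {
            Document empty = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            return XPathFactory.newInstance().newXPath().evaluate(expr, empty);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] exprs = {
            "concat('a', 'b', 'c')",
            "starts-with('xpath', 'xp')",
            "contains('xpath', 'pat')",
            "substring-before('2024-05-17', '-')",
            "substring-after('2024-05-17', '-')",
            "string-length('abc')",
            "normalize-space('  a   b  ')",
            "translate('bar', 'abc', 'ABC')",
            "translate('--aaa--', 'abc-', 'ABC')",
            // XPath 1.0 has no upper-case(); translate() is the usual workaround.
            "translate('Hello', '" + LOWER + "', '" + UPPER + "')"
        };
        for (String expr : exprs) {
            System.out.println(expr + " => '" + eval(expr) + "'");
        }
    }
}
```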
Boolean Functions

boolean boolean(object): Converts object to a boolean as follows:

a number is true if and only if it is neither positive or negative zero nor NaN

a node-set is true if and only if it is non-empty

a string is true if and only if its length is non-zero

an object of a type other than the four basic types is converted to a boolean in a way that is dependent on that type

boolean not(boolean): Reverses the argument.

boolean true(): Returns true.

boolean false(): Returns false.

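boolean lang(string): Returns true if the language of the context node, as given by the nearest xml:lang attribute on the node or one of its ancestors, is the same as or a sublanguage of the argument string (ignoring case). For example, lang("en") is true for an element with xml:lang="en-US".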
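The conversion rules of boolean() can be checked with a short sketch. It uses the JDK's javax.xml.xpath API and asks for a BOOLEAN result; the expressions are evaluated against an empty document, so /* selects an empty node-set. The class name BooleanFunctionsSketch is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class BooleanFunctionsSketch {

    /** Evaluates expr against an empty document and returns the boolean result. */
    static boolean eval(String expr) {
        try {
            Document empty = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            return (Boolean) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, empty, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] exprs = {
            "boolean(0)",         // zero is false
            "boolean(0 div 0)",   // NaN is false
            "boolean(-3)",        // any other number is true
            "boolean('')",        // empty string is false
            "boolean('false')",   // any non-empty string is true, even 'false'
            "boolean(/*)",        // empty node-set is false
            "not(true())"
        };
        for (String expr : exprs) {
            System.out.println(expr + " => " + eval(expr));
        }
    }
}
```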

Number Functions

number number(object?): Converts object to a number as follows:

a string that consists of optional whitespace followed by an optional minus sign followed by a Number followed by whitespace is converted to a number that is nearest to the mathematical value represented by the string; any other string is converted to NaN

boolean true is converted to 1; boolean false is converted to 0

a node-set is first converted to a string as if by a call to the string function and then converted in the same way as a string argument

an object of a type other than the four basic types is converted to a number in a way that is dependent on that type

If the argument is omitted, it defaults to a node-set with the context node as its only member.

number sum(node-set): Sum total of all nodes in node-set after converting their string-values to numbers.

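number floor(number): The largest integer that is not greater than the argument. For example, floor(1.7) is 1 and floor(-1.5) is -2.

number ceiling(number): The smallest integer that is not less than the argument. For example, ceiling(1.2) is 2 and ceiling(-1.5) is -1.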

number round(number): The round function returns the number that is closest to the argument and that is an integer. If there are two such numbers, then the one that is closest to positive infinity is returned. If the argument is NaN, then NaN is returned. If the argument is positive infinity, then positive infinity is returned. If the argument is negative infinity, then negative infinity is returned. If the argument is positive zero, then positive zero is returned. If the argument is negative zero, then negative zero is returned. If the argument is less than zero, but greater than or equal to -0.5, then negative zero is returned.
NOTE: For these last two cases, the result of calling the round function is not the same as the result of adding 0.5 and then calling the floor function.
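The number functions can be tried with a sketch along these lines, using the JDK's javax.xml.xpath API and asking for a NUMBER result. The <prices> sample document and the class name NumberFunctionsSketch are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NumberFunctionsSketch {

    // Hypothetical sample document for sum().
    static final String XML = "<prices><p>10</p><p>20</p><p>30</p></prices>";

    /** Evaluates expr against the sample document and returns the numeric result. */
    static double eval(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] exprs = {
            "sum(//p)",
            "floor(1.7)",
            "ceiling(1.2)",
            "round(2.5)",      // ties go towards positive infinity
            "round(-2.5)",
            "number(' 12 ')",  // surrounding whitespace is allowed
            "number('abc')"    // not a number: NaN
        };
        for (String expr : exprs) {
            System.out.println(expr + " => " + eval(expr));
        }
    }
}
```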

Data Model

XPath operates on an XML document as a tree. For all seven types of node, there is a way of determining a string-value for a node of that type. For some types of node, the string-value is part of the node; for other types of node, the string-value is computed from the string-value of descendant nodes.
NOTE: For element nodes and root nodes, the string-value of a node is not the same as the string returned by the DOM nodeValue method (see [DOM]).

There is an ordering, document order, defined on all the nodes in the document corresponding to the order in which the first character of the XML representation of each node occurs in the XML representation of the document after expansion of general entities. Thus, the root node will be the first node. Element nodes occur before their children. Thus, document order orders element nodes in order of the occurrence of their start-tag in the XML (after expansion of entities). The attribute nodes and namespace nodes of an element occur before the children of the element. The namespace nodes are defined to occur before the attribute nodes. The relative order of namespace nodes is implementation-dependent. The relative order of attribute nodes is implementation-dependent. Reverse document order is the reverse of document order.

Root nodes and element nodes have an ordered list of child nodes.

Nodes never share children

Every node other than the root node has exactly one parent, which is either an element node or the root node. A root node or an element node is the parent of each of its child nodes. The descendants of a node are the children of the node and the descendants of the children of the node.

Root Node is the root of the tree. A root node does not occur except as the root of the tree. The element node for the document element is a child of the root node. The root node also has as children processing instruction and comment nodes for processing instructions and comments that occur in the prolog and after the end of the document element.
The string-value of the root node is the concatenation of the string-values of all text node descendants of the root node in document order.

The children of an element node are the element nodes, comment nodes, processing instruction nodes and text nodes for its content. Entity references to both internal and external entities are expanded. Character references are resolved.
The string-value of an element node is the concatenation of the string-values of all text node descendants of the element node in document order.
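The difference between the XPath string-value of an element and the DOM nodeValue can be seen in a minimal sketch. It uses the JDK's javax.xml.xpath API; the sample XML and the class name StringValueSketch are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StringValueSketch {

    // Hypothetical sample with mixed content: text, a child element, more text.
    static final String XML = "<a>x<b>y</b>z</a>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    /** XPath string-value of <a>: all descendant text nodes, concatenated. */
    static String xpathStringValue() {
        try {
            return XPathFactory.newInstance().newXPath().evaluate("string(/a)", parse());
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed", e);
        }
    }

    /** DOM nodeValue of the same element, which is null for element nodes. */
    static String domNodeValue() {
        return parse().getDocumentElement().getNodeValue();
    }

    public static void main(String[] args) {
        System.out.println("XPath string(/a): " + xpathStringValue());
        System.out.println("DOM getNodeValue(): " + domNodeValue());
    }
}
```

On the DOM side, Node.getTextContent() is the closest counterpart to the XPath string-value of an element.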

Each element node has an associated set of attribute nodes; the element is the parent of each of these attribute nodes; however, an attribute node is not a child of its parent element.
NOTE: This is different from the DOM, which does not treat the element bearing an attribute as the parent of the attribute.

Elements never share attribute nodes.

The = operator tests whether two nodes have the same value, not whether they are the same node. Thus attributes of two different elements may compare as equal using =, even though they are not the same node.
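That = compares values rather than node identity can be shown with two attributes that carry the same value on different elements. The sketch uses the JDK's javax.xml.xpath API; the sample XML and the class name AttributeEqualitySketch are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AttributeEqualitySketch {

    // Hypothetical sample: two different elements whose "id" attributes match.
    static final String XML = "<r><a id=\"7\"/><b id=\"7\"/></r>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    /** True if the XPath = operator considers the two attributes equal. */
    static boolean equalByValue() {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath()
                    .evaluate("/r/a/@id = /r/b/@id", parse(), XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed", e);
        }
    }

    /** True if both paths select the very same attribute node. */
    static boolean sameNode() {
        try {
            Document doc = parse();
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node a = (Node) xpath.evaluate("/r/a/@id", doc, XPathConstants.NODE);
            Node b = (Node) xpath.evaluate("/r/b/@id", doc, XPathConstants.NODE);
            return a.isSameNode(b);
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("equal by value: " + equalByValue());
        System.out.println("same node:      " + sameNode());
    }
}
```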

An attribute node has a string-value, which is the normalized attribute value. An attribute whose normalized value is an empty string is not treated specially: it results in an attribute node whose string-value is a zero-length string.

There is a comment node for every comment, except for any comment that occurs within the document type declaration.
The string-value of a comment is the content of the comment, not including the opening <!-- or the closing -->.

A text node never has an immediately following or preceding sibling that is another text node. The string-value of a text node is the character data. A text node always has at least one character of data.

A CDATA section is treated as if the <![CDATA[ and ]]> were removed and every occurrence of < and & were replaced by &lt; and &amp; respectively.

© Vipan Singla 2000

你可能感兴趣的:(xpath)

Python教程：一文了解使用Python处理XPath 旦莫 Python进阶 python 开发语言
目录1.环境准备1.1安装lxml1.2验证安装2.XPath基础2.1什么是XPath？2.2XPath语法2.3示例XML文档3.使用lxml解析XML3.1解析XML文档3.2查看解析结果4.XPath查询4.1基本路径查询4.2使用属性查询4.3查询多个节点5.XPath的高级用法5.1使用逻辑运算符5.2使用函数6.实战案例6.1从网页抓取数据6.1.1安装Requests库6.1.2代
Python爬虫解析工具之xpath使用详解 eqa11 python 爬虫开发语言
文章目录Python爬虫解析工具之xpath使用详解一、引言二、环境准备1、插件安装2、依赖库安装三、xpath语法详解1、路径表达式2、通配符3、谓语4、常用函数四、xpath在Python代码中的使用1、文档树的创建2、使用xpath表达式3、获取元素内容和属性五、总结Python爬虫解析工具之xpath使用详解一、引言在Python爬虫开发中，数据提取是一个至关重要的环节。xpath作为一门
爬虫技术抓取网站数据 Bearjumpingcandy 爬虫
爬虫技术是一种自动化获取网站数据的技术，它可以模拟人类浏览器的行为，访问网页并提取所需的信息。以下是爬虫技术抓取网站数据的一般步骤：发起HTTP请求：爬虫首先会发送HTTP请求到目标网站，获取网页的内容。解析HTML：获取到网页内容后，爬虫会使用HTML解析器解析HTML代码，提取出需要的数据。数据提取：通过使用XPath、CSS选择器或正则表达式等工具，爬虫可以从HTML中提取出所需的数据，如文
BeautifulSoup 和 Xpath 的性能比较木语沉心
一些说明:其实这篇文章并不是为了比较出结论，因为结论是显而易见的.性能比较Xpath必然是要比BeautifulSoup在时间和空间上都要性能更好一些。其中理由有很多，其中一个很明显的是BeautifulSoup在构建一个对象的时候需要传入一个参数以指定解析器，而在它支持的众多的解析器中，lxml是性能最佳的，那么BeautifulSoup对象的各种方法可以理解为是对lxml的封装，换句话说，Be
JDom解析xml文件的java.lang.NoClassDefFoundError问题轻口味常见问题 xml exception encoding class list thread
java代码为：importjava.io.IOException;importjava.util.List;importorg.jdom.Document;importorg.jdom.Element;importorg.jdom.JDOMException;importorg.jdom.input.SAXBuilder;importorg.jdom.xpath.XPath;publicclas
第五章 SqlSession 的创建过程 flying jiang MyBatis 3源码深度解析 java tomcat mybatis
在MyBatis3中，SqlSession的创建过程涉及到对MyBatis配置文件的解析，这通常是通过XPath（XMLPathLanguage）来完成的。XPath是一种在XML文档中查找信息的语言，MyBatis使用它来解析配置文件（如mybatis-config.xml）中的元素和属性。以下是SqlSession创建过程中XPath使用的简要概述：读取配置文件：MyBatis首先需要读取其配
【语句】如何将列表拼接成字符串并截取20个字符后面的青龙摄影 javascript html 前端
base_info="".join(tree.xpath('/html/head/script[4]/text()'))[20:]以下是对这个语句的详细讲解：tree.xpath('/html/head/script[4]/text()')部分：tree：通常是一个已经构建好的HTML文档树对象，它是通过相关的HTML解析库（比如lxml）对HTML文档进行解析后得到的。/html/head/sc
基础爬虫 requests selenium aiohttp BeautifulSoup pyQuery Xpath&CssSelector 肯定是疯了
http://47.101.52.166/blog/back/python/%E7%88%AC%E8%99%AB.html请求requestsseleniumaiohttp*处理BeautifulSouppyQueryXpath&CssSelector*存储pymysqlPyMongoredisaiomysql*Scrapy
python web自动化 gaoguide2015 自动化脚本 web html
1.python爬虫之模拟登陆csdn(登录、cookie)http://blog.csdn.net/yanggd1987/article/details/52127436?locationNum=32、xml解析：Python网页解析：BeautifulSoup与lxml.html方式对比（xpath）lxml库速度快，功能强大，推荐。http://blog.sina.com.cn/s/blog
【Python报错】已解决FileNotFoundError: [Errno 2] No such file or directory: PosixPath(‘xxx‘) 云天徽上 python chrome numpy pandas 机器学习
解决Python报错：FileNotFoundError:[Errno2]Nosuchfileordirectory:PosixPath(‘xxx’)在Python编程中，处理文件和目录是一项常见的任务。然而，当你尝试打开一个不存在的文件时，可能会遇到FileNotFoundError:[Errno2]Nosuchfileordirectory:PosixPath('xxx')的错误。本文将介绍这
python爬虫面试真题及答案_Python面试题爬虫篇(附答案) 朴少 python爬虫面试真题及答案
0|1第一部分必答题注意：第31题1分，其他题均每题3分。1，了解哪些基于爬虫相关的模块？-网络请求：urllib，requests，aiohttp-数据解析：re，xpath，bs4，pyquery-selenium-js逆向：pyexcJs2，常见的数据解析方式？-re、lxml、bs43，列举在爬虫过程中遇到的哪些比较难的反爬机制？-动态加载的数据-动态变化的请求参数-js加密-代理-coo
python爬亚马逊数据_python爬虫----（6. scrapy框架，抓取亚马逊数据） weixin_39628342 python爬亚马逊数据
利用xpath()分析抓取数据还是比较简单的，只是网址的跳转和递归等比较麻烦。耽误了好久，还是豆瓣好呀，URL那么的规范。唉，亚马逊URL乱七八糟的....可能对url理解还不够.amazon├──amazon│├──__init__.py│├──__init__.pyc│├──items.py│├──items.pyc│├──msic││├──__init__.py││└──pad_urls.p
Swift Cell重用池机制以及UINib 司南_01b7
functableView(_tableView:UITableView,cellForRowAtindexPath:IndexPath)->UITableViewCell{letreuseID="taskCell5555555"//务必填写模版nib名（此处仅限于有cell模版，若无可忽略）letnib=UINib(nibName:"test5TableViewCell",bundle:nil)
技术分享 | app自动化测试（Android）--元素定位方式与隐式等待霍格沃兹测试开发学社测试人社区软件测试技能自动化运维
本文节选自霍格沃兹测试开发学社内部教材元素定位是UI自动化测试中最关键的一步，假如没有定位到元素，也就无法完成对页面的操作。那么在页面中如何定位到想要的元素，本小节讨论Appium元素定位方式。Appium的元素定位方式定位页面的元素有很多方式，比如可以通过ID、accessibility_id、XPath等方式进行元素定位，还可以使用Android、iOS工作引擎里面提供的定位方式。隐式等待设置
XPath和BeautifulSoup4 优秀的人A
什么是XPath？XPath(XMLPathLanguage)是一门在XML文档中查找信息的语言，可用来在XML文档中对元素和属性进行遍历什么是XML?XML指可扩展标记语言XML是一种标记语言，很类似HTMLXML的设计宗旨是传输数据，而非显示数据XML的标签需要我们自行定义XML被设计为具有自我描述性XML是W3C的推荐标准XML和HTML的区别XML是可扩展标记语言，被设计为传输和存储数据，
爬虫实战：一键爬取指定网站所有图片（二）老童聊AI python 明哥陪你学Python python
前言：上一篇已经提到了实现单网页下载图片，本篇将继续讲解如何通过爬虫来实现全网站的下载。任务分析：1、已实现指定某一网页的图片下载2、通过获取页面的url，进行href元素值的读取，并写入到下一个Job当中，并执行读出。直接进入题：这次的功能其实比较简单，只用通过xml的值，采用xpath的方式进入读取就行了。上一篇我们定义了一个DownloadImage类，这次我们新建一个download_im
Python 爬虫框架 BugLovers python
Python中有许多强大且主流的爬虫框架，这些框架提供了更高级的功能，使得开发和维护爬虫变得更加容易。以下是一些常用的爬虫框架：1.Scrapy-简介:Scrapy是Python最流行的爬虫框架之一，设计用于快速、高效地从网站中提取数据。它支持各种功能，如处理请求、解析HTML、处理分页、去重、以及保存数据等。-特点:-支持多线程，性能高效。-内置支持XPath、CSS选择器。-具有丰富的扩展插件
collectionViewCell防止复用的两种方法 suiyuechenglao collectionView iOS collectionView 复用
collectionView防止cell复用的方法一：//在创建collectionView的时候注册cell（一个分区）UICollectionViewCell*cell=[collectionViewdequeueReusableCellWithReuseIdentifier:@“cell"forIndexPath:indexPath];for(UIView*viewincell.conten
Unable to evaluate expression using this context java丶小虫 java Xpath XML
UnabletoevaluateexpressionusingthiscontextJAVA语言使用Xpath解析XML格式字符串publicStringxmlText(Stringxml){Documentdoc=null;try{doc=DocumentHelper.parseText(xml);//转为xmlXPathFactoryfactory=XPathFactory.newInstan
python爬取豆瓣电影信息_Python|简单爬取豆瓣网电影信息 weixin_39528525 python爬取豆瓣电影信息
前言：在掌握一些基础的爬虫知识后，就可以尝试做一些简单的爬虫来练一练手。今天要做的是利用xpath库来进行简单的数据的爬取。我们爬取的目标是电影的名字、导演和演员的信息、评分和url地址。准备环境：Pycharm、python3、爬虫库request、xpath模块、lxml模块第一步：分析url,理清思路先搜索豆瓣电影top250，打开网站可以发现要爬取的数据不止存在单独的一页，而是存在十页当中
Windows自动化2️⃣元素定位分析+图片视频上传等唐古乌梁海测试 python windows 自动化
windows自动化,难点元素定位XPath轴(XPathAxes)可定义某个相对于当前节点的节点集：preceding-sibling选取当前节点之前的所有同级节点following-sibling选取当前节点之后的所有同级节点preceding选取文档中当前节点的开始标签之前的所有节点following选取文档中当前节点的结束标签之后的所有节点preceding-sibling，选取当前节点之
java selenium 元素点击不了马达马达达 selenium 测试工具
最近做了一个页面爬取，很有意思被机缘巧合下解决了。这个元素很奇怪，用xpath可以定位元素，但是就是click()不了。试过了网上搜的一些办法：//尝试一WebElementa_tag=driver.findElement(By.xpath("xxx"));a_tag.click();//点击不了，卡住//尝试二WebDriverWaitwait=newWebDriverWait(driver,1
xpath的使用走到哪，爬到哪 python python chrome selenium xml
XPath是xml的路径语言，也是一门在xml文档中查找信息的语言。一、xpath常用规则表达式描述nodename选取此节点的所有节点/从当前节点选取子节点（从根节点开始定位）//从当前节点选取子孙节点.选取当前节点..选取当前节点的父节点@选取属性
XPATH表达式定位页面元素 qq_41075467 #RIDE--元素定位自动化软件测试 Xpath表达式 RIDE元素定位
XPATH表达式定位页面元素XPATH表达式语法1.选取节点2.谓语：用来查找某个特定的节点或者包含某个制定的值的节点，嵌在[]中3.选取未知节点4.选取若干路径轴：可定义相对于当前节点的节点集运算符常用功能函数1.关于节点的函数2.类型转换函数3.布尔函数4.字符串函数自动化测试学习过程中会用到一些页面元素的定位方法，常见的有id定位，name定位，css定位，以及Xpath定位，这里介绍的是X
【iPhone16】iPhone16抢购脚本苹果官网抢购 iPhone16 pro max 腹有诗书气自华777 chrome python
fromseleniumimportwebdriverimporttimedefclick_element(driver,xpath):element=driver.find_element_by_xpath(xpath)driver.execute_script("arguments[0].click();",element)defmain():#设置浏览器驱动路径driver_path="./
爬虫技术抓取网站数据 Bearjumpingcandy 爬虫
爬虫技术是一种自动化获取网站数据的技术，它可以模拟人类浏览器的行为，访问网页并提取所需的信息。以下是爬虫技术抓取网站数据的一般步骤：发起HTTP请求：爬虫首先会发送HTTP请求到目标网站，获取网页的内容。解析HTML：获取到网页内容后，爬虫会使用HTML解析器解析HTML文档，提取出需要的数据。数据提取：通过使用XPath、CSS选择器或正则表达式等工具，爬虫可以从HTML文档中提取出所需的数据，
python爬虫常用的库一剑丶飘香 python 爬虫
Python爬虫常用的库包括但不限于以下几种：请求库：`urllib`：Python3自带的库，用于发送HTTP请求，但现在可能被`requests`替代。1`requests`：第三方库，功能强大，使用简单，是当前最常用的请求库。2`Selenium`：自动化测试工具，用于模拟用户操作浏览器，适用于复杂页面。解析库：`lxml`：第三方库，支持HTML和XML的解析，支持XPath的解析方
appium定位xpath报错的解决办法（亲测有效）error“:“invalid argument“,“message“:“Exception while reading JSON“ 空城雀 appium json
通过weditor定位xpath的元素，确定存在，但是代码运行就是报错：error":“invalidargument”,“message”:“ExceptionwhilereadingJSON”解决办法如下：进到python的安装目录python311\Lib\site-packages\selenium\common有个文件：exceptions.py编辑该文件，加入类classInvalid
Xpath和BeautifulSoup4 骚X
什么是Xpath?Xpath(XMLPathLanguage)是一门在XML文档中查找信息的语音,可用来在XML文档对元素和属性进行遍历什么是XML?XML指可扩展标记语音XML是一种标记语音,很类似HTMLXML的设计宗旨是传输数据,而非显示数据XML的标签需要我们自行定义XML被设计为具有自我描述性XML是W3C推荐标准XML和HTML的区别XML是可扩展标记语音,被设计为传输和存储数据,其焦
Jmeter基本使用 weixin_43973848 工具的使用 jmeter python 开发语言
jmeter用法一、环境信息了解二、jmeter的使用基本元件重要的三个组件基础页面功能介绍配置元件介绍参数化方式csv注意断言接口关联1.正则表达式2.xpath提取器3.json提取器jmeter连接数据库逻辑控制器1.if控制器2.循环控制器3.foreach控制器4.吞吐量控制器定时器断言&监听器几种查看结果的方式三、jmeter脚本编写脚本录制四、跨线程的变量调用方法1：设置全局属性调用