XML Path Language 1.0 Recommendation

XML Path Language (XPath)

Version 1.0

W3C Recommendation 16 November 1999

This version:

http://www.w3.org/TR/1999/REC-xpath-19991116
(available in XML or HTML)

Latest version:

http://www.w3.org/TR/xpath

Previous versions:

http://www.w3.org/TR/1999/PR-xpath-19991008
http://www.w3.org/1999/08/WD-xpath-19990813
http://www.w3.org/1999/07/WD-xpath-19990709
http://www.w3.org/TR/1999/WD-xslt-19990421

Editors:

James Clark <[email protected]>
Steve DeRose (Inso Corp. and Brown University) <[email protected]>

Copyright  ©  1999 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.


Abstract

XPath is a language for addressing parts of an XML document, designed to be used by both XSLT and XPointer.

Status of this document

This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from other documents. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.

The list of known errors in this specification is available at http://www.w3.org/1999/11/REC-xpath-19991116-errata.

Comments on this specification may be sent to [email protected]; archives of the comments are available.

The English version of this specification is the only normative version. However, for translations of this document, see http://www.w3.org/Style/XSL/translations.html.

A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.

This specification is joint work of the XSL Working Group and the XML Linking Working Group and so is part of the W3C Style activity and of the W3C XML activity.

Table of contents

1 Introduction
2 Location Paths
    2.1 Location Steps
    2.2 Axes
    2.3 Node Tests
    2.4 Predicates
    2.5 Abbreviated Syntax
3 Expressions
    3.1 Basics
    3.2 Function Calls
    3.3 Node-sets
    3.4 Booleans
    3.5 Numbers
    3.6 Strings
    3.7 Lexical Structure
4 Core Function Library
    4.1 Node Set Functions
    4.2 String Functions
    4.3 Boolean Functions
    4.4 Number Functions
5 Data Model
    5.1 Root Node
    5.2 Element Nodes
        5.2.1 Unique IDs
    5.3 Attribute Nodes
    5.4 Namespace Nodes
    5.5 Processing Instruction Nodes
    5.6 Comment Nodes
    5.7 Text Nodes
6 Conformance

Appendices

A References
    A.1 Normative References
    A.2 Other References
B XML Information Set Mapping (Non-Normative)


1 Introduction

XPath is the result of an effort to provide a common syntax and semantics for functionality shared between XSL Transformations [XSLT] and XPointer [XPointer]. The primary purpose of XPath is to address parts of an XML [XML] document. In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and booleans. XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document.

In addition to its use for addressing, XPath is also designed so that it has a natural subset that can be used for matching (testing whether or not a node matches a pattern); this use of XPath is described in XSLT.

XPath models an XML document as a tree of nodes. There are different types of nodes, including element nodes, attribute nodes and text nodes. XPath defines a way to compute a string-value for each type of node. Some types of nodes also have names. XPath fully supports XML Namespaces [XML Names]. Thus, the name of a node is modeled as a pair consisting of a local part and a possibly null namespace URI; this is called an expanded-name. The data model is described in detail in [5 Data Model].

The primary syntactic construct in XPath is the expression. An expression matches the production Expr. An expression is evaluated to yield an object, which has one of the following four basic types:

  • node-set (an unordered collection of nodes without duplicates)
  • boolean (true or false)
  • number (a floating-point number)
  • string (a sequence of UCS characters)

Expression evaluation occurs with respect to a context. XSLT and XPointer specify how the context is determined for XPath expressions used in XSLT and XPointer respectively. The context consists of:

  • a node (the context node)
  • a pair of non-zero positive integers (the context position and the context size)
  • a set of variable bindings
  • a function library
  • the set of namespace declarations in scope for the expression

The context position is always less than or equal to the context size.

The variable bindings consist of a mapping from variable names to variable values. The value of a variable is an object, which can be of any of the types that are possible for the value of an expression, and may also be of additional types not specified here.

The function library consists of a mapping from function names to functions. Each function takes zero or more arguments and returns a single result. This document defines a core function library that all XPath implementations must support (see [4 Core Function Library]). For a function in the core function library, arguments and result are of the four basic types. Both XSLT and XPointer extend XPath by defining additional functions; some of these functions operate on the four basic types; others operate on additional data types defined by XSLT and XPointer.

The namespace declarations consist of a mapping from prefixes to namespace URIs.

The variable bindings, function library and namespace declarations used to evaluate a subexpression are always the same as those used to evaluate the containing expression. The context node, context position, and context size used to evaluate a subexpression are sometimes different from those used to evaluate the containing expression. Several kinds of expressions change the context node; only predicates change the context position and context size (see [2.4 Predicates]). When the evaluation of a kind of expression is described, it will always be explicitly stated if the context node, context position, and context size change for the evaluation of subexpressions; if nothing is said about the context node, context position, and context size, they remain unchanged for the evaluation of subexpressions of that kind of expression.

XPath expressions often occur in XML attributes. The grammar specified in this section applies to the attribute value after XML 1.0 normalization. So, for example, if the grammar uses the character <, this must not appear in the XML source as < but must be quoted according to XML 1.0 rules by, for example, entering it as &lt;. Within expressions, literal strings are delimited by single or double quotation marks, which are also used to delimit XML attributes. To avoid a quotation mark in an expression being interpreted by the XML processor as terminating the attribute value the quotation mark can be entered as a character reference (&quot; or &apos;). Alternatively, the expression can use single quotation marks if the XML attribute is delimited with double quotation marks or vice-versa.
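The escaping rule above can be illustrated with a minimal Python sketch. It assumes the standard-library xml.etree.ElementTree parser; the element name rule and the attribute name test are hypothetical and do not come from this specification.

```python
import xml.etree.ElementTree as ET

# In the XML source, "<" must be written as "&lt;". The XPath string literal
# uses single quotes because the attribute is delimited by double quotes.
source = '<rule test="price &lt; 10 and @type=\'sale\'"/>'

element = ET.fromstring(source)

# After XML parsing, the attribute value is the XPath expression itself,
# which is the text that the XPath grammar applies to.
expression = element.get("test")
print(expression)  # prints: price < 10 and @type='sale'
```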

[$In short: the XPath described here is a standard independent of XML. When an XPath expression is written inside XML, XPath is the target language, that is, what the XML-escaped text must decode to.$]

One important kind of expression is a location path. A location path selects a set of nodes relative to the context node. The result of evaluating an expression that is a location path is the node-set containing the nodes selected by the location path. Location paths can recursively contain expressions that are used to filter sets of nodes. A location path matches the production LocationPath.

In the following grammar, the non-terminals QName and NCName are defined in [XML Names], and S is defined in [XML]. The grammar uses the same EBNF notation as [XML] (except that grammar symbols always have initial capital letters).

Expressions are parsed by first dividing the character string to be parsed into tokens and then parsing the resulting sequence of tokens. Whitespace can be freely used between tokens. The tokenization process is described in [3.7 Lexical Structure].

2 Location Paths

Although location paths are not the most general grammatical construct in the language (a LocationPath is a special case of an Expr), they are the most important construct and will therefore be described first.

Every location path can be expressed using a straightforward but rather verbose syntax. There are also a number of syntactic abbreviations that allow common cases to be expressed concisely. This section will explain the semantics of location paths using the unabbreviated syntax. The abbreviated syntax will then be explained by showing how it expands into the unabbreviated syntax (see [2.5 Abbreviated Syntax]).

Here are some examples of location paths using the unabbreviated syntax:

  • child::para selects the para element children of the context node
  • child::* selects all element children of the context node
  • child::text() selects all text node children of the context node
  • child::node() selects all the children of the context node, whatever their node type
  • attribute::name selects the name attribute of the context node
  • attribute::* selects all the attributes of the context node
  • descendant::para selects the para element descendants of the context node
  • ancestor::div selects all div ancestors of the context node
  • ancestor-or-self::div selects the div ancestors of the context node and, if the context node is a div element, the context node as well
  • descendant-or-self::para selects the para element descendants of the context node and, if the context node is a para element, the context node as well
  • self::para selects the context node if it is a para element, and otherwise selects nothing
  • child::chapter/descendant::para selects the para element descendants of the chapter element children of the context node [$/ denotes iterated evaluation$]
  • child::*/child::para selects all para grandchildren of the context node
  • / selects the document root (which is always the parent of the document element)
  • /descendant::para selects all the para elements in the same document as the context node [$evaluation changes the context node at each step, which is why it is described as iteration$]
  • /descendant::olist/child::item selects all the item elements that have an olist parent and that are in the same document as the context node
  • child::para[position()=1] selects the first para child of the context node
  • child::para[position()=last()] selects the last para child of the context node
  • child::para[position()=last()-1] selects the last but one para child of the context node
  • child::para[position()>1] selects all the para children of the context node other than the first para child of the context node
  • following-sibling::chapter[position()=1] selects the next chapter sibling of the context node
  • preceding-sibling::chapter[position()=1] selects the previous chapter sibling of the context node
  • /descendant::figure[position()=42] selects the forty-second figure element in the document
  • /child::doc/child::chapter[position()=5]/child::section[position()=2] selects the second section of the fifth chapter of the doc document element
  • child::para[attribute::type="warning"] selects all para children of the context node that have a type attribute with value warning
  • child::para[attribute::type='warning'][position()=5] selects the fifth para child of the context node that has a type attribute with value warning
  • child::para[position()=5][attribute::type="warning"] selects the fifth para child of the context node if that child has a type attribute with value warning
  • child::chapter[child::title='Introduction'] selects the chapter children of the context node that have one or more title children with string-value equal to Introduction
  • child::chapter[child::title] selects the chapter children of the context node that have one or more title children
  • child::*[self::chapter or self::appendix] selects the chapter and appendix children of the context node
  • child::*[self::chapter or self::appendix][position()=last()] selects the last chapter or appendix child of the context node

There are two kinds of location path: relative location paths and absolute location paths.

A relative location path consists of a sequence of one or more location steps separated by /. The steps in a relative location path are composed together from left to right. Each step in turn selects a set of nodes relative to a context node. An initial sequence of steps is composed together with a following step as follows. The initial sequence of steps selects a set of nodes relative to a context node. Each node in that set is used as a context node for the following step. The sets of nodes identified by that step are unioned together. The set of nodes identified by the composition of the steps is this union. For example, child::div/child::para selects the para element children of the div element children of the context node, or, in other words, the para element grandchildren that have div parents.
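The composition rule just described can be sketched in Python. This is a simplified model, not an XPath implementation: it assumes xml.etree.ElementTree elements as nodes, handles only the child axis with a name test, and uses a list (kept free of duplicates) to stand in for an unordered node-set. The helper names child_step and evaluate_relative_path are hypothetical.

```python
import xml.etree.ElementTree as ET

def child_step(name):
    """Return a step function selecting the element children called `name`."""
    def step(context_node):
        return [child for child in context_node if child.tag == name]
    return step

def evaluate_relative_path(context_node, steps):
    """Compose steps left to right, unioning the results of each step."""
    selected = [context_node]
    for step in steps:
        next_selected = []
        for node in selected:                # each node becomes a context node
            for result in step(node):
                if not any(result is seen for seen in next_selected):
                    next_selected.append(result)   # union without duplicates
        selected = next_selected
    return selected

doc = ET.fromstring(
    "<doc><div><para>a</para><para>b</para></div>"
    "<para>c</para><div><para>d</para></div></doc>"
)

# child::div/child::para, evaluated with the doc element as the context node
paras = evaluate_relative_path(doc, [child_step("div"), child_step("para")])
print([p.text for p in paras])  # prints: ['a', 'b', 'd']
```

The para with text c is not selected because its parent is doc, not a div.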

An absolute location path consists of / optionally followed by a relative location path. A / by itself selects the root node of the document containing the context node. If it is followed by a relative location path, then the location path selects the set of nodes that would be selected by the relative location path relative to the root node of the document containing the context node.

Location Paths

[1]   LocationPath           ::=   RelativeLocationPath
                                   | AbsoluteLocationPath
[2]   AbsoluteLocationPath   ::=   '/' RelativeLocationPath?
                                   | AbbreviatedAbsoluteLocationPath
[3]   RelativeLocationPath   ::=   Step
                                   | RelativeLocationPath '/' Step
                                   | AbbreviatedRelativeLocationPath

2.1 Location Steps

A location step has three parts:

  • an axis, which specifies the tree relationship between the nodes selected by the location step and the context node,
  • a node test, which specifies the node type and expanded-name of the nodes selected by the location step, and
  • zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step.

The syntax for a location step is the axis name and node test separated by a double colon, followed by zero or more expressions each in square brackets. For example, in child::para[position()=1], child is the name of the axis, para is the node test and [position()=1] is a predicate.

The node-set selected by the location step is the node-set that results from generating an initial node-set from the axis and node-test, and then filtering that node-set by each of the predicates in turn.

The initial node-set consists of the nodes having the relationship to the context node specified by the axis, and having the node type and expanded-name specified by the node test. For example, a location step descendant::para selects the para element descendants of the context node: descendant specifies that each node in the initial node-set must be a descendant of the context; para specifies that each node in the initial node-set must be an element named para. The available axes are described in [2.2 Axes]. The available node tests are described in [2.3 Node Tests]. The meaning of some node tests is dependent on the axis.

The initial node-set is filtered by the first predicate to generate a new node-set; this new node-set is then filtered using the second predicate, and so on. The final node-set is the node-set selected by the location step. The axis affects how the expression in each predicate is evaluated and so the semantics of a predicate is defined with respect to an axis. See [2.4 Predicates].
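The evaluation of a single location step can be sketched as follows. This is an illustrative model under stated assumptions: nodes are xml.etree.ElementTree elements (so only element nodes exist), the only axis shown is the forward axis descendant, and a predicate is modelled as a hypothetical Python callable taking the context node, context position and context size.

```python
import xml.etree.ElementTree as ET

def descendant_axis(context_node):
    """Element descendants of the context node, in document order."""
    nodes = list(context_node.iter())
    return nodes[1:]          # iter() yields the context node first; drop it

def name_test(name):
    return lambda node: node.tag == name

def evaluate_step(context_node, axis, node_test, predicates=()):
    # Initial node-set: nodes on the axis that pass the node test.
    node_set = [n for n in axis(context_node) if node_test(n)]
    # Each predicate filters the node-set produced by the previous one,
    # so positions and sizes are recomputed for every predicate.
    for predicate in predicates:
        size = len(node_set)
        node_set = [
            node
            for position, node in enumerate(node_set, start=1)
            if predicate(node, position, size)
        ]
    return node_set

def is_warning(node, position, size):
    return node.get("type") == "warning"     # attribute::type="warning"

def is_second(node, position, size):
    return position == 2                     # position()=2

doc = ET.fromstring(
    "<doc><para type='warning'>w1</para>"
    "<sec><para>n1</para><para type='warning'>w2</para></sec>"
    "<para type='warning'>w3</para></doc>"
)

# descendant::para[attribute::type="warning"][position()=2]
warning_then_second = evaluate_step(
    doc, descendant_axis, name_test("para"), [is_warning, is_second])
# descendant::para[position()=2][attribute::type="warning"]
second_then_warning = evaluate_step(
    doc, descendant_axis, name_test("para"), [is_second, is_warning])

print([p.text for p in warning_then_second])  # prints: ['w2']
print([p.text for p in second_then_warning])  # prints: []
```

The two results differ because the second para descendant (n1) has no type attribute, which shows why the order of predicates matters.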

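[$Evaluation is with respect to a context node, not a context node-set. The step is evaluated for each node in the node-set produced by the previous step; each evaluation yields a node-set, and the result of the whole location step is the union of these sets.$]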

Location Steps

[4]   Step            ::=   AxisSpecifier NodeTest Predicate*
                            | AbbreviatedStep
[5]   AxisSpecifier   ::=   AxisName '::'
                            | AbbreviatedAxisSpecifier

2.2 Axes

The following axes are available:

  • the child axis contains the children of the context node
  • the descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes [$presumably because children never include attribute or namespace nodes$]
  • the parent axis contains the parent of the context node, if there is one
  • the ancestor axis contains the ancestors of the context node; the ancestors of the context node consist of the parent of context node and the parent's parent and so on; thus, the ancestor axis will always include the root node, unless the context node is the root node
  • the following-sibling axis contains all the following siblings of the context node; if the context node is an attribute node or namespace node, the following-sibling axis is empty
  • the preceding-sibling axis contains all the preceding siblings of the context node; if the context node is an attribute node or namespace node, the preceding-sibling axis is empty
  • the following axis contains all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes [$in effect, the nodes that come after the context node when the context node and its subtree are treated as a single unit$]
  • the preceding axis contains all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes
  • the attribute axis contains the attributes of the context node; the axis will be empty unless the context node is an element
  • the namespace axis contains the namespace nodes of the context node; the axis will be empty unless the context node is an element
  • the self axis contains just the context node itself
  • the descendant-or-self axis contains the context node and the descendants of the context node
  • the ancestor-or-self axis contains the context node and the ancestors of the context node; thus, the ancestor axis will always include the root node

NOTE: The ancestor, descendant, following, preceding and self axes partition a document (ignoring attribute and namespace nodes): they do not overlap and together they contain all the nodes in the document.
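The partition property in the note can be checked on a small example. The sketch below is an approximation under stated assumptions: it uses xml.etree.ElementTree, which models only element nodes and has no separate root node, so the document element a stands at the top of the tree. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<a><b><c/><d/></b><e><f/></e><g/></a>")
order = list(doc.iter())                        # elements in document order
parent = {child: p for p in order for child in p}

def ancestors(node):
    result = []
    while node in parent:
        node = parent[node]
        result.append(node)
    return result

def descendants(node):
    return list(node.iter())[1:]                # drop the node itself

def following(node):
    skip = {node, *descendants(node)}
    start = order.index(node)
    return [n for n in order[start + 1:] if n not in skip]

def preceding(node):
    skip = set(ancestors(node))
    start = order.index(node)
    return [n for n in order[:start] if n not in skip]

def is_partition(node):
    parts = [ancestors(node), descendants(node), following(node),
             preceding(node), [node]]
    members = [n for part in parts for n in part]
    # No element is counted twice and no element is missing.
    return len(members) == len(order) and set(members) == set(order)

print(all(is_partition(node) for node in order))  # prints: True
```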

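[$Nodes on the ancestor axis start before the context node and end after it; nodes on the descendant axis start and end inside the context node; nodes on the preceding axis start and end before the context node; nodes on the following axis start and end after the context node. Because XML does not allow tags to overlap, and an end-tag must come after its start-tag in document order, this partition is complete.$]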

Axes

[6]   AxisName   ::=   'ancestor'
                       | 'ancestor-or-self'
                       | 'attribute'
                       | 'child'
                       | 'descendant'
                       | 'descendant-or-self'
                       | 'following'
                       | 'following-sibling'
                       | 'namespace'
                       | 'parent'
                       | 'preceding'
                       | 'preceding-sibling'
                       | 'self'

2.3 Node Tests

Every axis has a principal node type. If an axis can contain elements, then the principal node type is element; otherwise, it is the type of the nodes that the axis can contain. Thus,

  • For the attribute axis, the principal node type is attribute.
  • For the namespace axis, the principal node type is namespace.
  • For other axes, the principal node type is element.

A node test that is a QName is true if and only if the type of the node (see [5 Data Model]) is the principal node type and has an expanded-name equal to the expanded-name specified by the QName. For example, child::para selects the para element children of the context node; if the context node has no para children, it will select an empty set of nodes. attribute::href selects the href attribute of the context node; if the context node has no href attribute, it will select an empty set of nodes.

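[$A name test can also be used on the namespace axis to select namespace nodes; the name in the node test then refers to the prefix of the namespace declaration to be selected. It is not yet clear to me how to select the default namespace declaration this way.$]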

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.
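The effect of ignoring the default namespace can be observed with Python's xml.etree.ElementTree, which implements only a small subset of XPath but follows the same rule when no default is supplied by the caller. This is a sketch: the namespace URI urn:example:items and the prefix i are made up for illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<list xmlns="urn:example:items"><item>a</item><item>b</item></list>'
)

# An unprefixed name is not expanded with the document's default namespace,
# so it only matches elements whose namespace URI is null.
unprefixed = doc.findall("item")

# To match, a prefix must be bound in the namespace declarations supplied
# with the expression (the second argument plays the role of that context).
namespaces = {"i": "urn:example:items"}
prefixed = doc.findall("i:item", namespaces)

print(len(unprefixed), len(prefixed))  # prints: 0 2
```

The prefix used in the expression need not match any prefix in the document; only the namespace URI it expands to is compared.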

A node test * is true for any node of the principal node type. For example, child::* will select all element children of the context node, and attribute::* will select all attributes of the context node.

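[$The preceding, following, ancestor, descendant, parent and child axes never contain attribute or namespace nodes; they contain elements, text nodes, comments and processing instructions (and, for parent and ancestor, possibly the root node).$]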

A node test can have the form NCName:*. In this case, the prefix is expanded in the same way as with a QName, using the context namespace declarations. It is an error if there is no namespace declaration for the prefix in the expression context. The node test will be true for any node of the principal type whose expanded-name has the namespace URI to which the prefix expands, regardless of the local part of the name.

The node test text() is true for any text node. For example, child::text() will select the text node children of the context node. Similarly, the node test comment() is true for any comment node, and the node test processing-instruction() is true for any processing instruction. The processing-instruction() test may have an argument that is Literal; in this case, it is true for any processing instruction that has a name equal to the value of the Literal.

A node test node() is true for any node of any type whatsoever.

[7]   NodeTest   ::=   NameTest
                       | NodeType '(' ')'
                       | 'processing-instruction' '(' Literal ')'

2.4 Predicates

An axis is either a forward axis or a reverse axis. An axis that only ever contains the context node or nodes that are after the context node in document order is a forward axis. An axis that only ever contains the context node or nodes that are before the context node in document order is a reverse axis. Thus, the ancestor, ancestor-or-self, preceding, and preceding-sibling axes are reverse axes; all other axes are forward axes. Since the self axis always contains at most one node, it makes no difference whether it is a forward or reverse axis. The proximity position of a member of a node-set with respect to an axis is defined to be the position of the node in the node-set ordered in document order if the axis is a forward axis and ordered in reverse document order if the axis is a reverse axis. The first position is 1.
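Proximity positions on a forward and a reverse axis can be compared in a short Python sketch. It assumes xml.etree.ElementTree elements as nodes and computes the two sibling axes by hand; the book and chapter element names and the id values are hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<book><chapter id='c1'/><chapter id='c2'/><chapter id='c3'/>"
    "<chapter id='c4'/></book>"
)
siblings = list(doc)
context = siblings[2]                      # the chapter with id="c3"
index = siblings.index(context)

# preceding-sibling is a reverse axis: order nodes in reverse document order.
preceding_siblings = list(reversed(siblings[:index]))
# following-sibling is a forward axis: order nodes in document order.
following_siblings = siblings[index + 1:]

def proximity_positions(ordered_axis_nodes):
    """Map each proximity position (starting at 1) to the id found there."""
    return {pos: n.get("id") for pos, n in enumerate(ordered_axis_nodes, 1)}

print(proximity_positions(preceding_siblings))  # prints: {1: 'c2', 2: 'c1'}
print(proximity_positions(following_siblings))  # prints: {1: 'c4'}
```

On both axes position 1 is the node nearest to the context node, which is why preceding-sibling::chapter[position()=1] selects the previous chapter rather than the first chapter of the book.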

[$Document order here is reckoned by start-tags: a forward axis contains nodes that start after the context node starts, and a reverse axis contains nodes that start before it. The position index is therefore determined by the axis: taking the context node as the origin, nodes nearer to the context node come first.$]

A predicate filters a node-set with respect to an axis to produce a new node-set. For each node in the node-set to be filtered, the PredicateExpr is evaluated with that node as the context node, with the number of nodes in the node-set as the context size, and with the proximity position of the node in the node-set with respect to the axis as the context position; if PredicateExpr evaluates to true for that node, the node is included in the new node-set; otherwise, it is not included.

A PredicateExpr is evaluated by evaluating the Expr and converting the result to a boolean. If the result is a number, the result will be converted to true if the number is equal to the context position and will be converted to false otherwise; if the result is not a number, then the result will be converted as if by a call to the boolean function. Thus a location path para[3] is equivalent to para[position()=3].
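The conversion rule for predicate results can be sketched as a small Python function. This is an approximation: Python values stand in for XPath objects (a list for a node-set), and Python's bool is used as a stand-in for the boolean function, which agrees with it for booleans, strings and node-sets. The function name predicate_to_boolean is hypothetical.

```python
def predicate_to_boolean(value, context_position):
    """Convert the result of a PredicateExpr to a boolean."""
    # bool is a subclass of int in Python, so rule it out before the
    # number check; otherwise True would be compared with the position.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value == context_position
    # Not a number: convert as if by a call to the boolean function.
    return bool(value)

# para[3] behaves like para[position()=3]:
print(predicate_to_boolean(3, context_position=3))   # prints: True
print(predicate_to_boolean(3, context_position=2))   # prints: False
# An empty node-set (modelled as an empty list) converts to false:
print(predicate_to_boolean([], context_position=1))  # prints: False
```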

Predicates

[8]   Predicate       ::=   '[' PredicateExpr ']'
[9]   PredicateExpr   ::=   Expr

2.5 Abbreviated Syntax

Here are some examples of location paths using abbreviated syntax:

  • para selects the para element children of the context node
  • * selects all element children of the context node
  • text() selects all text node children of the context node
  • @name selects the name attribute of the context node
  • @* selects all the attributes of the context node
  • para[1] selects the first para child of the context node
  • para[last()] selects the last para child of the context node
  • */para selects all para grandchildren of the context node
  • /doc/chapter[5]/section[2] selects the second section of the fifth chapter of the doc
  • chapter//para selects the para element descendants of the chapter element children of the context node
  • //para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node
  • //olist/item selects all the item elements in the same document as the context node that have an olist parent
  • . selects the context node
  • .//para selects the para element descendants of the context node
  • .. selects the parent of the context node
  • ../@lang selects the lang attribute of the parent of the context node
  • para[@type="warning"] selects all para children of the context node that have a type attribute with value warning
  • para[@type="warning"][5] selects the fifth para child of the context node that has a type attribute with value warning
  • para[5][@type="warning"] selects the fifth para child of the context node if that child has a type attribute with value warning
  • chapter[title="Introduction"] selects the chapter children of the context node that have one or more title children with string-value equal to Introduction
  • chapter[title] selects the chapter children of the context node that have one or more title children
  • employee[@secretary and @assistant] selects all the employee children of the context node that have both a secretary attribute and an assistant attribute

The most important abbreviation is that child:: can be omitted from a location step. In effect, child is the default axis. For example, a location path div/para is short for child::div/child::para.

There is also an abbreviation for attributes: attribute:: can be abbreviated to @. For example, a location path para[@type="warning"] is short for child::para[attribute::type="warning"] and so selects para children with a type attribute with value equal to warning.

// is short for /descendant-or-self::node()/. For example, //para is short for /descendant-or-self::node()/child::para and so will select any para element in the document (even a para element that is a document element will be selected by //para since the document element node is a child of the root node); div//para is short for div/descendant-or-self::node()/child::para and so will select all para descendants of div children.

// is short for /descendant-or-self::node()/.

NOTE: The location path //para[1] does not mean the same as the location path /descendant::para[1]. The latter selects the first descendant para element; the former selects all descendant para elements that are the first para children of their parents.

A location step of . is short for self::node(). This is particularly useful in conjunction with //. For example, the location path .//para is short for

self::node()/descendant-or-self::node()/child::para

and so will select all para descendant elements of the context node.

Similarly, a location step of .. is short for parent::node(). For example, ../title is short for parent::node()/child::title and so will select the title children of the parent of the context node.
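The abbreviations above can be tried out with a minimal Python sketch. It assumes the standard-library xml.etree.ElementTree module, which supports only a subset of the abbreviated syntax rather than full XPath, and an invented sample document; the element names and text values are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for this illustration.
DOC = """
<doc>
  <div>
    <para type="warning">w1</para>
    <para>p2</para>
  </div>
  <div>
    <para>p3</para>
  </div>
</doc>
"""

root = ET.fromstring(DOC)

# .//para : all para descendants of the context node (here, the doc element)
all_paras = [p.text for p in root.findall(".//para")]

# .//para[1] : every para that is the first para child of its parent
first_children = [p.text for p in root.findall(".//para[1]")]

# Rough stand-in for /descendant::para[1]: the first para in document order.
# ElementTree has no descendant axis, so iteration order is used instead.
first_descendant = next(root.iter("para")).text

# para[@type='warning'] evaluated with each div in turn as the context node
warnings = [
    p.text
    for div in root.findall("div")
    for p in div.findall("para[@type='warning']")
]

print(all_paras, first_children, first_descendant, warnings)
```

The difference between first_children and first_descendant mirrors the NOTE about //para[1] and /descendant::para[1].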

Abbreviations

[10]   AbbreviatedAbsoluteLocationPath   ::=   '//' RelativeLocationPath
[11]   AbbreviatedRelativeLocationPath   ::=   RelativeLocationPath '//' Step
[12]   AbbreviatedStep   ::=   '.'
          | '..'
[13]   AbbreviatedAxisSpecifier   ::=   '@'?

3 Expressions

3.1 Basics

A VariableReference evaluates to the value to which the variable name is bound in the set of variable bindings in the context. It is an error if the variable name is not bound to any value in the set of variable bindings in the expression context.

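A variable reference evaluates to the value bound to the variable name in the context's set of variable bindings. It is an error if the variable name is not bound to any value in the variable bindings of the expression context.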

Parentheses may be used for grouping.        


[14]   Expr   ::=   OrExpr
[15]   PrimaryExpr   ::=   VariableReference
          | '(' Expr ')'
          | Literal
          | Number
          | FunctionCall

3.2 Function Calls

A FunctionCall expression is evaluated by using the FunctionName to identify a function in the expression evaluation context function library, evaluating each of the Arguments, converting each argument to the type required by the function, and finally calling the function, passing it the converted arguments. It is an error if the number of arguments is wrong or if an argument cannot be converted to the required type. The result of the FunctionCall expression is the result returned by the function.

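A function call is evaluated by first using the function name to identify a function in the function library of the expression evaluation context, then evaluating each argument and converting it to the type the function requires, and finally calling the function with the converted arguments. It is an error if the number of arguments does not match or if an argument cannot be converted to the required type. The result of the function call is the result returned by the function.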

An argument is converted to type string as if by calling the string function. An argument is converted to type number as if by calling the number function. An argument is converted to type boolean as if by calling the boolean function. An argument that is not of type node-set cannot be converted to a node-set.

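Argument conversion is defined in terms of the conversion functions string, number, and boolean. A value that is not a node-set cannot be converted to a node-set.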

[16]   FunctionCall   ::=   FunctionName '(' ( Argument ( ',' Argument )* )? ')'
[17]   Argument   ::=   Expr

3.3 Node-sets

A location path can be used as an expression. The expression returns the set of nodes selected by the path.

A location path can be used as an expression; the expression returns the node-set that the path selects.

The | operator computes the union of its operands, which must be node-sets.

Predicates are used to filter expressions in the same way that they are used in location paths. It is an error if the expression to be filtered does not evaluate to a node-set. The Predicate filters the node-set with respect to the child axis.

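The | operator computes the union of its operands, which must be node-sets. Predicates filter expressions just as they do in location paths. It is an error if the expression being filtered does not evaluate to a node-set. The predicate filters the node-set with respect to the child axis.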

NOTE: The meaning of a Predicate depends crucially on which axis applies. For example, preceding::foo[1] returns the first foo element in reverse document order, because the axis that applies to the [1] predicate is the preceding axis; by contrast, (preceding::foo)[1] returns the first foo element in document order, because the axis that applies to the [1] predicate is the child axis.

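Note: the meaning of a predicate depends on which axis applies. For example, preceding::foo[1] returns the first foo element in reverse document order, whereas (preceding::foo)[1] returns the first foo element in document order, because the axis that applies to [1] there is the child axis. In other words, a predicate that filters a parenthesized expression always applies the child axis.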
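The effect of the axis on a positional predicate can be modeled with a small Python sketch. It is not an XPath engine: node-sets are plain lists in document order, and the function name and the foo#N labels are hypothetical.

```python
def select_by_position(nodes_in_document_order, position, reverse_axis):
    """Return the node at a 1-based proximity position, or None.

    On a reverse axis such as preceding, positions count backwards from the
    context node. On a forward axis, including the child axis that applies
    to a filter expression, positions count in document order.
    """
    ordered = list(nodes_in_document_order)
    if reverse_axis:
        ordered.reverse()
    if 1 <= position <= len(ordered):
        return ordered[position - 1]
    return None


# Hypothetical foo elements that precede the context node, in document order.
preceding_foo = ["foo#1", "foo#2", "foo#3"]

# preceding::foo[1] : the predicate applies to the (reverse) preceding axis
step_result = select_by_position(preceding_foo, 1, reverse_axis=True)

# (preceding::foo)[1] : the predicate applies to the (forward) child axis
filter_result = select_by_position(preceding_foo, 1, reverse_axis=False)

print(step_result, filter_result)
```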

The / and // operators compose an expression and a relative location path. It is an error if the expression does not evaluate to a node-set. The / operator does composition in the same way as when / is used in a location path. As in location paths, // is short for /descendant-or-self::node()/.

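The / and // operators combine an expression with a relative location path. It is an error if the expression does not evaluate to a node-set. The / operator composes in the same way as / does in a location path. As in location paths, // is short for /descendant-or-self::node()/.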

There are no types of objects that can be converted to node-sets.

No type of object can be converted to a node-set.

[18]   UnionExpr   ::=   PathExpr
          | UnionExpr '|' PathExpr
[19]   PathExpr   ::=   LocationPath
          | FilterExpr
          | FilterExpr '/' RelativeLocationPath
          | FilterExpr '//' RelativeLocationPath
[20]   FilterExpr   ::=   PrimaryExpr
          | FilterExpr Predicate

3.4 Booleans

An object of type boolean can have one of two values, true and false.


An or expression is evaluated by evaluating each operand and converting its value to a boolean as if by a call to the boolean function. The result is true if either value is true and false otherwise. The right operand is not evaluated if the left operand evaluates to true.

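An or expression is evaluated by evaluating each operand and converting it to a boolean. The result is true if either operand is true, and false otherwise. The right operand is not evaluated if the left operand evaluates to true.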

An and expression is evaluated by evaluating each operand and converting its value to a boolean as if by a call to the boolean function. The result is true if both values are true and false otherwise. The right operand is not evaluated if the left operand evaluates to false.

For and, the right operand is not evaluated if the left operand evaluates to false.

An EqualityExpr (that is not just a RelationalExpr) or a RelationalExpr (that is not just an AdditiveExpr) is evaluated by comparing the objects that result from evaluating the two operands. Comparison of the resulting objects is defined in the following three paragraphs. First, comparisons that involve node-sets are defined in terms of comparisons that do not involve node-sets; this is defined uniformly for =, !=, <=, <, >= and >. Second, comparisons that do not involve node-sets are defined for = and !=. Third, comparisons that do not involve node-sets are defined for <=, <, >= and >.

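An equality expression or relational expression is evaluated by comparing the objects that result from evaluating its two operands. The comparison is defined in the following three paragraphs. First, comparisons that involve node-sets are reduced to comparisons that do not; this applies uniformly to =, !=, <=, <, >= and >. Second come = and != comparisons that do not involve node-sets. Third come <=, <, >= and > comparisons that do not involve node-sets.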

If both objects to be compared are node-sets, then the comparison will be true if and only if there is a node in the first node-set and a node in the second node-set such that the result of performing the comparison on the string-values of the two nodes is true. If one object to be compared is a node-set and the other is a number, then the comparison will be true if and only if there is a node in the node-set such that the result of performing the comparison on the number to be compared and on the result of converting the string-value of that node to a number using the number function is true. If one object to be compared is a node-set and the other is a string, then the comparison will be true if and only if there is a node in the node-set such that the result of performing the comparison on the string-value of the node and the other string is true. If one object to be compared is a node-set and the other is a boolean, then the comparison will be true if and only if the result of performing the comparison on the boolean and on the result of converting the node-set to a boolean using the boolean function is true.

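If both operands are node-sets, the comparison is true if and only if there is a node in the first node-set and a node in the second node-set whose string-values compare as true. If one operand is a node-set and the other is a number, the comparison is true if and only if some node in the node-set has a string-value that, once converted to a number, compares as true with the other operand. If one operand is a node-set and the other is a string, the comparison is true if and only if some node in the node-set has a string-value that compares as true with that string. If one operand is a node-set and the other is a boolean, the comparison is true if and only if the node-set, converted with the boolean function, compares as true with the other operand.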

When neither object to be compared is a node-set and the operator is = or !=, then the objects are compared by converting them to a common type as follows and then comparing them. If at least one object to be compared is a boolean, then each object to be compared is converted to a boolean as if by applying the boolean function. Otherwise, if at least one object to be compared is a number, then each object to be compared is converted to a number as if by applying the number function. Otherwise, both objects to be compared are converted to strings as if by applying the string function. The = comparison will be true if and only if the objects are equal; the != comparison will be true if and only if the objects are not equal. Numbers are compared for equality according to IEEE 754 [IEEE 754]. Two booleans are equal if either both are true or both are false. Two strings are equal if and only if they consist of the same sequence of UCS characters.

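If neither operand is a node-set and the operator is = or !=, the operands are first converted to a common type and then compared. If at least one is a boolean, both are converted to booleans; otherwise, if at least one is a number, both are converted to numbers; otherwise both are converted to strings. The = comparison is true if and only if the objects are equal. Strings are compared as sequences of UCS characters.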

NOTE: If $x is bound to a node-set, then $x="foo" does not mean the same as not($x!="foo"): the former is true if and only if some node in $x has the string-value foo; the latter is true if and only if all nodes in $x have the string-value foo.

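Note: if $x is bound to a node-set, $x="foo" is not equivalent to not($x!="foo"). The former is true if and only if some node in $x has the string-value foo; the latter is true if and only if all nodes in $x have the string-value foo.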
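The existential behavior of node-set comparisons can be sketched in Python as follows. The sketch assumes a node-set is represented simply as a list of the string-values of its nodes; the function names are hypothetical.

```python
def nodeset_equals(string_values, literal):
    """$x = literal : true if SOME node has that string-value."""
    return any(value == literal for value in string_values)


def nodeset_not_equals(string_values, literal):
    """$x != literal : true if SOME node has a different string-value."""
    return any(value != literal for value in string_values)


x = ["foo", "bar"]  # string-values of the nodes bound to $x (illustrative)

some_is_foo = nodeset_equals(x, "foo")            # $x = "foo"
all_are_foo = not nodeset_not_equals(x, "foo")    # not($x != "foo")

# With an empty node-set, = is false while not(... != ...) is true.
empty_eq = nodeset_equals([], "foo")
empty_not_ne = not nodeset_not_equals([], "foo")

print(some_is_foo, all_are_foo, empty_eq, empty_not_ne)
```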

When neither object to be compared is a node-set and the operator is <=, <, >= or >, then the objects are compared by converting both objects to numbers and comparing the numbers according to IEEE 754. The < comparison will be true if and only if the first number is less than the second number. The <= comparison will be true if and only if the first number is less than or equal to the second number. The > comparison will be true if and only if the first number is greater than the second number. The >= comparison will be true if and only if the first number is greater than or equal to the second number.

If neither operand is a node-set and the operator is <=, <, >= or >, both operands are converted to numbers and then compared.

NOTE: When an XPath expression occurs in an XML document, any < and <= operators must be quoted according to XML 1.0 rules by using, for example, &lt; and &lt;=. In the following example the value of the test attribute is an XPath expression:

<xsl:if test="@value &lt; 10">...</xsl:if>

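Note: when an XPath expression occurs in an XML document, the < and <= operators must be escaped according to the XML 1.0 rules. In the example above, the value of the test attribute is an XPath expression.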

[21]   OrExpr   ::=   AndExpr
          | OrExpr 'or' AndExpr
[22]   AndExpr   ::=   EqualityExpr
          | AndExpr 'and' EqualityExpr
[23]   EqualityExpr   ::=   RelationalExpr
          | EqualityExpr '=' RelationalExpr
          | EqualityExpr '!=' RelationalExpr
[24]   RelationalExpr   ::=   AdditiveExpr
          | RelationalExpr '<' AdditiveExpr
          | RelationalExpr '>' AdditiveExpr
          | RelationalExpr '<=' AdditiveExpr
          | RelationalExpr '>=' AdditiveExpr

NOTE: The effect of the above grammar is that the order of precedence is (lowest precedence first):

·   or

·   and

·   =, !=

·   <=, <, >=, >

and the operators are all left associative. For example, 3 > 2 > 1 is equivalent to (3 > 2) > 1, which evaluates to false.

The list above runs from lowest to highest precedence, and all of these operators are left associative.
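Why 3 > 2 > 1 evaluates to false can be traced with a short Python sketch. The helper name is hypothetical, and it models only the rule that non-node-set operands of > are converted to numbers.

```python
def xpath_greater_than(left, right):
    """Model of XPath > for operands that are not node-sets."""
    # Both operands are converted to numbers; boolean true becomes 1.
    return float(left) > float(right)


# XPath groups to the left: (3 > 2) > 1  ->  true > 1  ->  1 > 1  ->  false
left_associative = xpath_greater_than(xpath_greater_than(3, 2), 1)

# Python, by contrast, chains comparisons: (3 > 2) and (2 > 1)
python_chained = 3 > 2 > 1

print(left_associative, python_chained)
```

The contrast with Python's chained comparison is a reminder that the same text can mean different things in a host language and in XPath.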

3.5 Numbers

A number represents a floating-point number. A number can have any double-precision 64-bit format IEEE 754 value [IEEE 754]. These include a special "Not-a-Number" (NaN) value, positive and negative infinity, and positive and negative zero. See Section 4.2.3 of [JLS] for a summary of the key rules of the IEEE 754 standard.

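A number represents a floating-point number. It can have any IEEE 754 double-precision 64-bit value, including NaN, positive and negative infinity, and positive and negative zero.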

The numeric operators convert their operands to numbers as if by calling the number function.

The numeric operators automatically convert their operands to numbers.

The + operator performs addition.

The - operator performs subtraction.

NOTE: Since XML allows - in names, the - operator typically needs to be preceded by whitespace. For example, foo-bar evaluates to a node-set containing the child elements named foo-bar; foo - bar evaluates to the difference of the result of converting the string-value of the first foo child element to a number and the result of converting the string-value of the first bar child to a number.

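Note: because XML allows - in names, the - operator usually needs to be preceded by whitespace. For example, foo-bar evaluates to the node-set of child elements named foo-bar, whereas foo - bar evaluates to the difference between the number obtained from the string-value of the first foo child element and the number obtained from the string-value of the first bar child element.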

The div operator performs floating-point division according to IEEE 754.

The div operator performs division according to IEEE 754.

The mod operator returns the remainder from a truncating division. For example,

  • 5 mod 2 returns 1
  • 5 mod -2 returns 1
  • -5 mod 2 returns -1
  • -5 mod -2 returns -1

NOTE: This is the same as the % operator in Java and ECMAScript (JavaScript).

NOTE: This is not the same as the IEEE 754 remainder operation, which returns the remainder from a rounding division.

This differs from the IEEE 754 remainder operation.
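The three notions of remainder mentioned here can be compared in Python. This is a sketch using the standard math module: math.fmod truncates like the XPath mod operator, Python's own % operator floors instead, and math.remainder (Python 3.7 or later) implements the IEEE 754 remainder.

```python
import math

# Truncating remainder, matching the four mod examples above.
xpath_mod_examples = [
    math.fmod(5, 2),    # 5 mod 2
    math.fmod(5, -2),   # 5 mod -2
    math.fmod(-5, 2),   # -5 mod 2
    math.fmod(-5, -2),  # -5 mod -2
]

# Python's % takes the sign of the divisor, so it is NOT the XPath mod.
python_floored = -5 % 2

# IEEE 754 remainder rounds the quotient to the nearest integer.
truncating = math.fmod(5, 3)       # quotient truncated to 1
ieee_754 = math.remainder(5, 3)    # quotient rounded to 2

print(xpath_mod_examples, python_floored, truncating, ieee_754)
```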

Numeric Expressions

[25]   AdditiveExpr   ::=   MultiplicativeExpr
          | AdditiveExpr '+' MultiplicativeExpr
          | AdditiveExpr '-' MultiplicativeExpr
[26]   MultiplicativeExpr   ::=   UnaryExpr
          | MultiplicativeExpr MultiplyOperator UnaryExpr
          | MultiplicativeExpr 'div' UnaryExpr
          | MultiplicativeExpr 'mod' UnaryExpr
[27]   UnaryExpr   ::=   UnionExpr
          | '-' UnaryExpr

3.6 Strings

Strings consist of a sequence of zero or more characters, where a character is defined as in the XML Recommendation [XML]. A single character in XPath thus corresponds to a single Unicode abstract character with a single corresponding Unicode scalar value (see [Unicode]); this is not the same thing as a 16-bit Unicode code value: the Unicode coded character representation for an abstract character with Unicode scalar value greater than U+FFFF is a pair of 16-bit Unicode code values (a surrogate pair). In many programming languages, a string is represented by a sequence of 16-bit Unicode code values; implementations of XPath in such languages must take care to ensure that a surrogate pair is correctly treated as a single XPath character.

A string is a finite sequence of characters, where a character is defined as in the XML Recommendation.

NOTE: It is possible in Unicode for there to be two strings that should be treated as identical even though they consist of the distinct sequences of Unicode abstract characters. For example, some accented characters may be represented in either a precomposed or decomposed form. Therefore, XPath expressions may return unexpected results unless both the characters in the XPath expression and in the XML document have been normalized into a canonical form. See [Character Model].

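Note: in Unicode, two distinct sequences of abstract characters may represent the same string. For example, some accented characters have both a precomposed and a decomposed form. XPath expressions may therefore return unexpected results unless the characters in both the XPath expression and the XML document have been normalized.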
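Both points, surrogate pairs and normalization, can be observed with Python's standard library. This is only an illustration: Python 3 strings count code points, which happens to agree with the XPath notion of a character, and NFC is chosen here as one possible canonical form.

```python
import unicodedata

# MUSICAL SYMBOL G CLEF has a scalar value above U+FFFF.
clef = "\U0001D11E"
xpath_length = len(clef)                          # one XPath character
utf16_units = len(clef.encode("utf-16-le")) // 2  # a surrogate pair in UTF-16

# The same accented letter in precomposed and decomposed form.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

equal_raw = precomposed == decomposed
equal_after_nfc = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

print(xpath_length, utf16_units, equal_raw, equal_after_nfc)
```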

3.7 Lexical Structure

When tokenizing, the longest possible token is always returned.

When tokenizing, the lexer always returns the longest possible token.

For readability, whitespace may be used in expressions even though not explicitly allowed by the grammar: ExprWhitespace may be freely added within patterns before or after any ExprToken.

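For readability, whitespace may be used in expressions even where the grammar does not explicitly allow it: ExprWhitespace may be added freely before or after any token in patterns.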

The following special tokenization rules must be applied in the order specified to disambiguate the ExprToken grammar:

The following special tokenization rules must be applied in the order given to remove ambiguity from the ExprToken grammar:

  • If there is a preceding token and the preceding token is not one of @, ::, (, [, , or an Operator, then a * must be recognized as a MultiplyOperator and an NCName must be recognized as an OperatorName.
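  • That is, when a preceding token exists and is none of @, ::, (, [, , or an Operator, * must be read as the multiplication operator and an NCName must be read as an OperatorName.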
  • If the character following an NCName (possibly after intervening ExprWhitespace) is (, then the token must be recognized as a NodeType or a FunctionName.
  • If the character after an NCName is (, the token must be recognized as a node type or a function name.
  • If the two characters following an NCName (possibly after intervening ExprWhitespace) are ::, then the token must be recognized as an AxisName.
  • If the two characters after an NCName are ::, the token must be recognized as an axis name.
  • Otherwise, the token must not be recognized as a MultiplyOperator, an OperatorName, a NodeType, a FunctionName, or an AxisName.
  • Otherwise, the token must not be treated as a multiplication operator, operator name, node type, function name, or axis name.
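The disambiguation rules above can be sketched as a toy lexer in Python. It is deliberately incomplete and every name in it is hypothetical: it handles ASCII names only and omits literals, variable references, and prefixed names. The point is only to show how the preceding token and the following characters decide the kind of a * or a name.

```python
import re

# Alternatives are ordered so that the longest token wins (e.g. <= before <).
TOKEN_RE = re.compile(
    r"\s*(//|::|!=|<=|>=|\.\.|\d+(?:\.\d*)?|\.\d+"
    r"|[A-Za-z_][\w.-]*|[()\[\].@,/|+\-=<>*])"
)
OPERATOR_NAMES = {"and", "or", "mod", "div"}
SYMBOL_OPERATORS = {"/", "//", "|", "+", "-", "=", "!=", "<", "<=", ">", ">="}
NODE_TYPES = {"comment", "text", "processing-instruction", "node"}
# Preceding tokens after which * and names are never operators (rule 1).
NO_OPERATOR_AFTER = {"@", "::", "(", "[", ","}


def classify(expr):
    """Return (kind, text) pairs for a small subset of XPath expressions."""
    expr = expr.rstrip()
    tokens = []
    pos = 0
    while pos < len(expr):
        match = TOKEN_RE.match(expr, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}")
        text = match.group(1)
        pos = match.end()
        rest = expr[pos:].lstrip()  # lookahead past optional whitespace
        prev = tokens[-1] if tokens else None
        operator_expected = (
            prev is not None
            and prev[1] not in NO_OPERATOR_AFTER
            and prev[0] != "Operator"
        )
        is_name = text[0].isalpha() or text[0] == "_"
        if text == "*":
            kind = "Operator" if operator_expected else "NameTest"
        elif is_name and operator_expected:
            if text not in OPERATOR_NAMES:
                raise ValueError(f"expected an operator name, got {text!r}")
            kind = "Operator"
        elif is_name and rest.startswith("("):
            kind = "NodeType" if text in NODE_TYPES else "FunctionName"
        elif is_name and rest.startswith("::"):
            kind = "AxisName"
        elif is_name:
            kind = "NameTest"
        elif text in SYMBOL_OPERATORS:
            kind = "Operator"
        elif text[0].isdigit() or (len(text) > 1 and text[1].isdigit()):
            kind = "Number"
        else:
            kind = "Punctuation"
        tokens.append((kind, text))
    return tokens


print(classify("div div div"))
```

In div div div, the first and third tokens are name tests for elements called div, while the second is the division operator, because only the second has a preceding token that permits an operator.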

Expression Lexical Structure

[28]   ExprToken   ::=   '(' | ')' | '[' | ']' | '.' | '..' | '@' | ',' | '::'
          | NameTest
          | NodeType
          | Operator
          | FunctionName
          | AxisName
          | Literal
          | Number
          | VariableReference
[29]   Literal   ::=   '"' [^"]* '"'
          | "'" [^']* "'"
[30]   Number   ::=   Digits ('.' Digits?)?
          | '.' Digits
[31]   Digits   ::=   [0-9]+
[32]   Operator   ::=   OperatorName
          | MultiplyOperator
          | '/' | '//' | '|' | '+' | '-' | '=' | '!=' | '<' | '<=' | '>' | '>='
[33]   OperatorName   ::=   'and' | 'or' | 'mod' | 'div'
[34]   MultiplyOperator   ::=   '*'
[35]   FunctionName   ::=   QName - NodeType
[36]   VariableReference   ::=   '$' QName
[37]   NameTest   ::=   '*'
          | NCName ':' '*'
          | QName
[38]   NodeType   ::=   'comment'
          | 'text'
          | 'processing-instruction'
          | 'node'
[39]   ExprWhitespace   ::=   S

4 Core Function Library

This section describes functions that XPath implementations must always include in the function library that is used to evaluate expressions.

This section describes the functions that every XPath implementation must include in its function library.

Each function in the function library is specified using a function prototype, which gives the return type, the name of the function, and the type of the arguments. If an argument type is followed by a question mark, then the argument is optional; otherwise, the argument is required.

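Every function in the library is specified by a function prototype, which gives the return type, the function name, and the argument types. An argument type followed by a question mark marks the argument as optional; otherwise the argument is required.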

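[$Some functions can be called in two ways: relative to the context node, or relative to an argument. Such functions use the argument if one is given and the context node otherwise; in other words, their optional argument defaults to the context node. A separate point concerns location steps. A standard location step consists of an axis, a node test, and predicates; per production [19], a path expression may also begin with a filter expression in place of its first step. The step separator means the following: the whole location path is evaluated from left to right, and the node-set selected by one step supplies the context nodes for the next step. That is, the next step is evaluated once with each of those nodes as the context node, and the results are merged into a single node-set that is the result of that step.$]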

4.1 Node Set Functions


Function: number last()

The last function returns a number equal to the context size from the expression evaluation context, that is, the size of the set of context nodes.

Function: number position()

The position function returns a number equal to the context position from the expression evaluation context, that is, the position of the context node within the set of context nodes.

Function: number count(node-set)

The count function returns the number of nodes in the argument node-set.

Returns the size of the node-set.

Function: node-set id(object)

The id function selects elements by their unique ID (see [5.2.1 Unique IDs]). When the argument to id is of type node-set, then the result is the union of the result of applying id to the string-value of each of the nodes in the argument node-set. When the argument to id is of any other type, the argument is converted to a string as if by a call to the string function; the string is split into a whitespace-separated list of tokens (whitespace is any sequence of characters matching the production S); the result is a node-set containing the elements in the same document as the context node that have a unique ID equal to any of the tokens in the list.

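The id function selects elements by their unique ID. When the argument is a node-set, the result is the union of the results of applying id to the string-value of each node in that node-set. When the argument is of any other type, it is converted to a string, and the string is split into a whitespace-separated list of tokens. The result is a node-set containing the elements whose unique ID appears in that list (for example, id("id1 id2 id3")).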

  • id("foo") selects the element with unique ID foo
  • id("foo")/child::para[position()=5] selects the fifth para child of the element with unique ID foo

Function: string local-name(node-set?)

The local-name function returns the local part of the expanded-name of the node in the argument node-set that is first in document order. If the argument node-set is empty or the first node has no expanded-name, an empty string is returned. If the argument is omitted, it defaults to a node-set with the context node as its only member.

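The local-name function returns the local part of the expanded-name of the first node in the argument node-set. If the argument node-set is empty or the first node has no expanded-name, an empty string is returned. If the argument is omitted, it defaults to a node-set containing only the context node.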

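[$The node-set returned by the preceding expression is not itself part of the evaluation context for the next step; its member nodes are. The context size, however, is the size of that set, and the context position is the position of a node within it.$]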

Function: string namespace-uri(node-set?)

The namespace-uri function returns the namespace URI of the expanded-name of the node in the argument node-set that is first in document order. If the argument node-set is empty, the first node has no expanded-name, or the namespace URI of the expanded-name is null, an empty string is returned. If the argument is omitted, it defaults to a node-set with the context node as its only member.

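The namespace-uri function returns the namespace URI of the expanded-name of the first node in the argument node-set. If the node-set is empty, the first node has no expanded-name, or the namespace URI of the expanded-name is null, an empty string is returned. If the argument is omitted, it defaults to a node-set whose only member is the context node.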

NOTE: The string returned by the namespace-uri function will be empty except for element nodes and attribute nodes.

Note: the string returned by namespace-uri is empty for any node that is not an element or attribute node.

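[$This is easy to get wrong. Things changed in XPath 2.0: more data types were added, so the corresponding function returns an xs:anyURI value, and it appears to accept only a single node as its argument.$]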

Function: string name(node-set?)

The name function returns a string containing a QName representing the expanded-name of the node in the argument node-set that is first in document order. The QName must represent the expanded-name with respect to the namespace declarations in effect on the node whose expanded-name is being represented. Typically, this will be the QName that occurred in the XML source. This need not be the case if there are namespace declarations in effect on the node that associate multiple prefixes with the same namespace. However, an implementation may include information about the original prefix in its representation of nodes; in this case, an implementation can ensure that the returned string is always the same as the QName used in the XML source. If the argument node-set is empty or the first node has no expanded-name, an empty string is returned. If the argument is omitted, it defaults to a node-set with the context node as its only member.

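The name function returns a string containing a QName that represents the expanded-name of the first node in the argument node-set. The QName must represent the expanded-name with respect to the namespace declarations in effect on that node. Usually this is the QName that appears in the XML source, but it need not be when the namespace declarations in effect on the node bind several prefixes to the same namespace. An implementation may record the original prefix so that the returned string always matches the QName used in the XML source. If the argument node-set is empty or the first node has no expanded-name, an empty string is returned. If the argument is omitted, it defaults to a node-set whose only member is the context node.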

[$In practice, implementations generally return the QName as written in the XML source.$]

NOTE: The string returned by the name function will be the same as the string returned by the local-name function except for element nodes and attribute nodes.

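[For nodes other than element and attribute nodes, name and local-name always return the same string. That string is not always empty: processing instruction nodes and namespace nodes have an expanded-name with a local part.]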

[$Put simply, the functions above all extract some property of a node.$]

4.2 String Functions


Function: string string(object?)

The string function converts an object to a string as follows:


  • A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
  • A number is converted to a string as follows
    • NaN is converted to the string NaN
    • positive zero is converted to the string 0
    • negative zero is converted to the string 0
    • positive infinity is converted to the string Infinity
    • negative infinity is converted to the string -Infinity
    • if the number is an integer, the number is represented in decimal form as a Number with no decimal point and no leading zeros, preceded by a minus sign (-) if the number is negative
    • otherwise, the number is represented in decimal form as a Number including a decimal point with at least one digit before the decimal point and at least one digit after the decimal point, preceded by a minus sign (-) if the number is negative; there must be no leading zeros before the decimal point apart possibly from the one required digit immediately before the decimal point; beyond the one required digit after the decimal point there must be as many, but only as many, more digits as are needed to uniquely distinguish the number from all other IEEE 754 numeric values.
  • The boolean false value is converted to the string false. The boolean true value is converted to the string true.
  • An object of a type other than the four basic types is converted to a string in a way that is dependent on that type.

If the argument is omitted, it defaults to a node-set with the context node as its only member.


NOTE: The string function is not intended for converting numbers into strings for presentation to users. The format-number function and xsl:number element in [XSLT] provide this functionality.

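Note: the string function is not meant for converting numbers to strings for display to users. The format-number function and the xsl:number element in XSLT provide that functionality.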
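The number-to-string rules can be approximated in Python as follows. This is a sketch under two assumptions: that Python's repr of a float (the shortest digits that round-trip) matches the "as many, but only as many" digits requirement, and that the hypothetical helper name is free to choose. Exponent notation is removed because an XPath Number has none.

```python
import math
from decimal import Decimal


def xpath_string_from_number(x):
    """Approximate the XPath string() conversion for a float."""
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    if x == 0:
        return "0"              # positive and negative zero both give "0"
    if x == math.floor(x):
        return str(int(x))      # integers: no decimal point, no exponent
    # repr() yields the shortest digits that round-trip; formatting the
    # Decimal with "f" expands any exponent such as 1e-07.
    return format(Decimal(repr(x)), "f")


print(xpath_string_from_number(1e-7), xpath_string_from_number(1e22))
```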

Function: string concat(string, string, string*)

The concat function returns the concatenation of its arguments.

Returns the concatenation of its string arguments.

Function: boolean starts-with(string, string)

The starts-with function returns true if the first argument string starts with the second argument string, and otherwise returns false.

Returns true if the first argument starts with the second argument, and false otherwise.

Function: boolean contains(string, string)

The contains function returns true if the first argument string contains the second argument string, and otherwise returns false.

Returns true if the first argument contains the second argument as a substring, and false otherwise.

Function: string substring-before(string, string)

The substring-before function returns the substring of the first argument string that precedes the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string. For example, substring-before("1999/04/01","/") returns 1999.

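Returns the empty string if the first argument does not contain the second; otherwise returns the part of the first argument that precedes the first occurrence of the second argument.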

Function: string substring-after(string, string)

The substring-after function returns the substring of the first argument string that follows the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string. For example, substring-after("1999/04/01","/") returns 04/01, and substring-after("1999/04/01","19") returns 99/04/01.

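Returns the empty string if the first argument does not contain the second; otherwise returns the part of the first argument that follows the first occurrence of the second argument.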
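A minimal Python sketch of substring-before and substring-after, built on str.find. The function names are hypothetical; the examples are the ones given in the text above.

```python
def substring_before(s, sep):
    """Part of s before the first occurrence of sep, or "" if absent."""
    index = s.find(sep)
    return s[:index] if index >= 0 else ""


def substring_after(s, sep):
    """Part of s after the first occurrence of sep, or "" if absent."""
    index = s.find(sep)
    return s[index + len(sep):] if index >= 0 else ""


print(substring_before("1999/04/01", "/"))
print(substring_after("1999/04/01", "/"))
print(substring_after("1999/04/01", "19"))
```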

Function: string substring(string, number, number?)

The substring function returns the substring of the first argument starting at the position specified in the second argument with length specified in the third argument. For example, substring("12345",2,3) returns "234". If the third argument is not specified, it returns the substring starting at the position specified in the second argument and continuing to the end of the string. For example, substring("12345",2) returns "2345".

This is the substring function; positions are counted from 1.

More precisely, each character in the string (see [3.6 Strings]) is considered to have a numeric position: the position of the first character is 1, the position of the second character is 2 and so on.

More precisely, every character in the string has a numeric position: the first is 1, and so on.

NOTE: This differs from Java and ECMAScript, in which the String.substring method treats the position of the first character as 0.

Note: this differs from Java and JavaScript, where substring positions start at 0.

The returned substring contains those characters for which the position of the character is greater than or equal to the rounded value of the second argument and, if the third argument is specified, less than the sum of the rounded value of the second argument and the rounded value of the third argument; the comparisons and addition used for the above follow the standard IEEE 754 rules; rounding is done as if by a call to the round function. The following examples illustrate various unusual cases:

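If the second and third arguments are not integers, they are rounded first: the result contains the characters whose position is greater than or equal to the rounded second argument and less than the sum of the rounded second and third arguments.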

  • substring("12345", 1.5, 2.6) returns "234"
  • substring("12345", 0, 3) returns "12"
  • substring("12345", 0 div 0, 3) returns ""
  • substring("12345", 1, 0 div 0) returns ""
  • substring("12345", -42, 1 div 0) returns "12345"
  • substring("12345", -1 div 0, 1 div 0) returns ""
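The unusual cases above follow directly from the position rule, as this Python sketch suggests. The helper names are hypothetical; math.nan and math.inf stand in for 0 div 0 and 1 div 0, and the rounding helper ignores the sign of zero because it does not matter here.

```python
import math


def round_half_up(x):
    """Round like the XPath round function, ignoring the sign of zero."""
    if math.isnan(x) or math.isinf(x):
        return x
    lower = math.floor(x)
    return lower + 1 if x - lower >= 0.5 else lower


def xpath_substring(s, start, length=None):
    first = round_half_up(start)
    # With no third argument there is no upper bound on the position.
    limit = math.inf if length is None else first + round_half_up(length)
    # Comparisons with NaN are always false, so NaN bounds select nothing.
    return "".join(
        ch for pos, ch in enumerate(s, start=1) if pos >= first and pos < limit
    )


print(xpath_substring("12345", 1.5, 2.6))
print(xpath_substring("12345", -42, math.inf))
```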

Function: number string-length(string?)

The string-length returns the number of characters in the string (see [3.6 Strings]). If the argument is omitted, it defaults to the context node converted to a string, in other words the string-value of the context node.

Returns the number of characters in the string. The default argument is the string-value of the context node.

Function: string normalize-space(string?)

The normalize-space function returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space. Whitespace characters are the same as those allowed by the S production in XML. If the argument is omitted, it defaults to the context node converted to a string, in other words the string-value of the context node.

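This is essentially a trim operation, except that each run of consecutive whitespace characters inside the string is also replaced by a single space.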
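A minimal Python sketch of normalize-space, assuming the four whitespace characters of the XML S production (space, tab, carriage return, line feed). The function name is hypothetical. Note that the common idiom " ".join(s.split()) is not quite equivalent, because str.split also treats other characters, such as the no-break space, as whitespace.

```python
import re


def normalize_space(s):
    """Collapse runs of XML whitespace and strip both ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


print(repr(normalize_space("  a \t\n b  ")))
```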

Function: string translate(string, string, string)

The translate function returns the first argument string with occurrences of characters in the second argument string replaced by the character at the corresponding position in the third argument string. For example, translate("bar","abc","ABC") returns the string BAr. If there is a character in the second argument string with no character at a corresponding position in the third argument string (because the second argument string is longer than the third argument string), then occurrences of that character in the first argument string are removed. For example, translate("--aaa--","abc-","ABC") returns "AAA". If a character occurs more than once in the second argument string, then the first occurrence determines the replacement character. If the third argument string is longer than the second argument string, then excess characters are ignored.

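The translate function returns a modified copy of its first argument in which each character that occurs in the second argument is replaced by the character at the same position in the third argument. For example, translate("bar","abc","ABC") returns BAr. If a character in the second argument has no counterpart in the third argument, it is removed from the first argument. For example, translate("--aaa--","abc-","ABC") returns AAA. If a character occurs more than once in the second argument, its first occurrence determines the replacement; if the third argument is longer than the second, the excess characters are ignored.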

NOTE: The translate function is not a sufficient solution for case conversion in all languages. A future version of XPath may provide additional functions for case conversion.
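The replacement and removal rules of translate map naturally onto Python's str.translate, as in this sketch. The function name is hypothetical; the two examples are the ones given in the text above.

```python
def xpath_translate(s, from_chars, to_chars):
    """Model of the XPath translate function."""
    table = {}
    for index, ch in enumerate(from_chars):
        if ord(ch) in table:
            continue  # the first occurrence determines the replacement
        # None tells str.translate to delete the character.
        table[ord(ch)] = to_chars[index] if index < len(to_chars) else None
    return s.translate(table)


print(xpath_translate("bar", "abc", "ABC"))
print(xpath_translate("--aaa--", "abc-", "ABC"))
```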


4.3 Boolean Functions


Function: boolean boolean(object)

The boolean function converts its argument to a boolean as follows:


  • a number is true if and only if it is neither positive or negative zero nor NaN
  • a node-set is true if and only if it is non-empty
  • a string is true if and only if its length is non-zero
  • an object of a type other than the four basic types is converted to a boolean in a way that is dependent on that type
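The conversion rules in this list can be sketched in Python. The sketch assumes XPath values are modeled as Python bool, float or int, str, and list (for a node-set); the function name is hypothetical. It differs from Python's own bool() in one case: NaN converts to false.

```python
import math


def xpath_boolean(value):
    """Model of the XPath boolean function for the four basic types."""
    if isinstance(value, bool):          # check bool before int
        return value
    if isinstance(value, (int, float)):
        return not (value == 0 or math.isnan(value))
    if isinstance(value, str):
        return len(value) > 0
    if isinstance(value, (list, tuple, set, frozenset)):
        return len(value) > 0            # node-set modeled as a collection
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(xpath_boolean("false"), xpath_boolean(math.nan), bool(math.nan))
```

The string "false" converts to true because it is non-empty, which is a common source of surprises.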

Function: boolean not(boolean)

The not function returns true if its argument is false, and false otherwise.

Logical negation.

Function: boolean true()

The true function returns true.

Always true.

Function: boolean false()

The false function returns false.

Always false.

Function: boolean lang(string)

The lang function returns true or false depending on whether the language of the context node as specified by xml:lang attributes is the same as or is a sublanguage of the language specified by the argument string. The language of the context node is determined by the value of the xml:lang attribute on the context node, or, if the context node has no xml:lang attribute, by the value of the xml:lang attribute on the nearest ancestor of the context node that has an xml:lang attribute. If there is no such attribute, then lang returns false. If there is such an attribute, then lang returns true if the attribute value is equal to the argument ignoring case, or if there is some suffix starting with - such that the attribute value is equal to the argument ignoring that suffix of the attribute value and ignoring case. For example, lang("en") would return true if the context node is any of these five elements:

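The lang function tests whether the language of the context node, as specified by xml:lang attributes, is the same as or a sublanguage of the language given by the argument string. The language of the context node is determined by the xml:lang attribute on the context node or, if it has none, by the xml:lang attribute on the nearest ancestor that has one. If there is no such attribute, lang returns false. If there is one, lang returns true when the attribute value equals the argument ignoring case, or when it equals the argument after a suffix starting with - is ignored, again ignoring case.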

For example, lang("en") returns true when the context node is any of the following five elements:

<para xml:lang="en"/>

<div xml:lang="en"><para/></div>

<para xml:lang="EN"/>

<para xml:lang="en-us"/>
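The matching step of lang can be sketched in Python. The sketch covers only the comparison between the effective xml:lang value and the argument; locating the nearest xml:lang attribute on the context node or its ancestors is assumed to have happened already, and the function name is hypothetical.

```python
def lang_matches(effective_xml_lang, argument):
    """Compare an inherited xml:lang value with the lang() argument.

    effective_xml_lang is None when neither the context node nor any
    ancestor carries an xml:lang attribute.
    """
    if effective_xml_lang is None:
        return False
    value = effective_xml_lang.lower()
    wanted = argument.lower()
    # Equal ignoring case, or equal once a suffix starting with - is ignored.
    return value == wanted or value.startswith(wanted + "-")


print(lang_matches("en-us", "en"), lang_matches("english", "en"))
```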

4.4 Number Functions


Function: number number(object?)

The number function converts its argument to a number as follows:


  • a string that consists of optional whitespace followed by an optional minus sign followed by a Number followed by whitespace is converted to the IEEE 754 number that is nearest (according to the IEEE 754 round-to-nearest rule) to the mathematical value represented by the string; any other string is converted to NaN
  • boolean true is converted to 1; boolean false is converted to 0
  • a node-set is first converted to a string as if by a call to the string function and then converted in the same way as a string argument
  • an object of a type other than the four basic types is converted to a number in a way that is dependent on that type

If the argument is omitted, it defaults to a node-set with the context node as its only member.

NOTE: The number function should not be used for conversion of numeric data occurring in an element in an XML document unless the element is of a type that represents numeric data in a language-neutral format (which would typically be transformed into a language-specific format for presentation to a user). In addition, the number function cannot be used unless the language-neutral format used by the element is consistent with the XPath syntax for a Number.
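The string case of the number function can be sketched in Python with a regular expression for the Number production. The sketch assumes that trailing whitespace is optional, like leading whitespace; the names are hypothetical. A plain float(s) call would be too permissive, since Python also accepts forms such as "1e3", "+1", and "inf".

```python
import math
import re

# optional whitespace, optional minus sign, Number, optional whitespace
NUMBER_RE = re.compile(
    r"[ \t\r\n]*(-?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+))[ \t\r\n]*"
)


def xpath_number_from_string(s):
    """Model of number() applied to a string; non-matching input gives NaN."""
    match = NUMBER_RE.fullmatch(s)
    return float(match.group(1)) if match else math.nan


print(xpath_number_from_string(" 12 "), xpath_number_from_string("1e3"))
```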

[Translator's note: the meaning of this note was unclear to the translator, so it is left untranslated.]

Function: number sum(node-set)

The sum function returns the sum, for each node in the argument node-set, of the result of converting the string-values of the node to a number.

Returns the sum of the numbers obtained by converting the string-value of each node in the argument node-set.

Function: number floor(number)

The floor function returns the largest (closest to positive infinity) number that is not greater than the argument and that is an integer.

Returns the largest integer that is not greater than the argument.

Function: number ceiling(number)

The ceiling function returns the smallest (closest to negative infinity) number that is not less than the argument and that is an integer.

Returns the smallest integer that is not less than the argument.

Function: number round(number)

The round function returns the number that is closest to the argument and that is an integer. If there are two such numbers, then the one that is closest to positive infinity is returned. If the argument is NaN, then NaN is returned. If the argument is positive infinity, then positive infinity is returned. If the argument is negative infinity, then negative infinity is returned. If the argument is positive zero, then positive zero is returned. If the argument is negative zero, then negative zero is returned. If the argument is less than zero, but greater than or equal to -0.5, then negative zero is returned.

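The round function returns the integer closest to its argument. If two integers are equally close, the one closest to positive infinity is returned. NaN gives NaN, positive infinity gives positive infinity, and negative infinity gives negative infinity. Positive zero gives positive zero and negative zero gives negative zero. If the argument is less than zero but greater than or equal to -0.5, negative zero is returned.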

NOTE: For these last two cases, the result of calling the round function is not the same as the result of adding 0.5 and then calling the floor function.

Note: in these last two cases, round does not give the same result as adding 0.5 and then calling floor, which would yield positive zero.
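
The special cases of round are easy to get wrong, so here is a small Python sketch of the rules as listed above. The name xpath_round is hypothetical and the code is an illustration, not a reference implementation; math.copysign is used only to make the sign of zero visible.

```python
import math


def xpath_round(x):
    """Round to the nearest integer, ties toward positive infinity."""
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x                     # NaN, infinities and both zeros unchanged
    if -0.5 <= x < 0.0:
        return -0.0                  # negative zero, not positive zero
    lower = math.floor(x)
    # Compare the fractional part instead of computing floor(x + 0.5),
    # which can round up incorrectly for values just below 0.5.
    return float(lower + 1) if x - lower >= 0.5 else float(lower)


print(xpath_round(2.5), xpath_round(-2.5), xpath_round(-0.2))
# floor(x + 0.5) loses the sign of zero for the same input:
print(math.copysign(1.0, xpath_round(-0.2)),
      math.copysign(1.0, math.floor(-0.2 + 0.5)))
```

The last line shows the difference the NOTE refers to: the sketch returns negative zero for -0.2, whereas adding 0.5 and calling floor gives a zero with a positive sign.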

5 Data Model


XPath operates on an XML document as a tree. This section describes how XPath models an XML document as a tree. This model is conceptual only and does not mandate any particular implementation. The relationship of this model to the XML Information Set [XML Infoset] is described in [B XML Information Set Mapping].

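XPath treats an XML document as a tree. This section describes how XPath models an XML document as a tree. The model is purely conceptual and does not require any particular implementation.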

XML documents operated on by XPath must conform to the XML Namespaces Recommendation [XML Names].

The tree contains nodes. There are seven types of node:

The tree is made up of nodes, of which there are seven types.

  • root nodes
  • element nodes
  • text nodes
  • attribute nodes
  • namespace nodes
  • processing instruction nodes
  • comment nodes

For every type of node, there is a way of determining a string-value for a node of that type. For some types of node, the string-value is part of the node; for other types of node, the string-value is computed from the string-value of descendant nodes.

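Every type of node has a way of determining its string-value. For some types the string-value is part of the node; for others it is computed from the string-values of descendant nodes.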

NOTE: For element nodes and root nodes, the string-value of a node is not the same as the string returned by the DOM nodeValue method (see [DOM]).

Note: for element nodes and root nodes, the string-value is not the same as the value returned by the DOM nodeValue method.

Some types of node also have an expanded-name, which is a pair consisting of a local part and a namespace URI. The local part is a string. The namespace URI is either null or a string. The namespace URI specified in the XML document can be a URI reference as defined in [RFC2396]; this means it can have a fragment identifier and can be relative. A relative URI should be resolved into an absolute URI during namespace processing: the namespace URIs of expanded-names of nodes in the data model should be absolute. Two expanded-names are equal if they have the same local part, and either both have a null namespace URI or both have non-null namespace URIs that are equal.

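Some types of node also have an expanded-name, a pair made up of a local part and a namespace URI. The local part is a string; the namespace URI is either null or a string. The namespace URI given in the XML document may be a URI reference as defined in RFC 2396, so it may carry a fragment identifier and may be relative. A relative URI should be resolved to an absolute URI during namespace processing: the namespace URIs of expanded-names in the data model should be absolute. Two expanded-names are equal if their local parts are the same and their namespace URIs are either both null or both non-null and equal. [Translator's note: the translator was not entirely sure of this paragraph.]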
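
As an illustration of expanded-name equality, the sketch below uses Python's standard xml.etree.ElementTree, which stores an element name as "{namespace-uri}local". The helper expanded_name and the urn:example:ns URI are assumptions made for this example; ElementTree is not an XPath 1.0 data model implementation, but its names behave the same way for this purpose.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<r xmlns:a="urn:example:ns" xmlns:b="urn:example:ns">'
    "<a:item/><b:item/><item/></r>"
)
first, second, third = list(doc)


def expanded_name(tag):
    """Split an ElementTree tag into (namespace URI or None, local part)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return (uri, local)
    return (None, tag)


# Different prefixes bound to the same URI give equal expanded-names.
print(expanded_name(first.tag) == expanded_name(second.tag))
# Same local part but a null namespace URI: not equal.
print(expanded_name(first.tag) == expanded_name(third.tag))
```

The prefix plays no role in the comparison; only the local part and the namespace URI do.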

There is an ordering, document order, defined on all the nodes in the document corresponding to the order in which the first character of the XML representation of each node occurs in the XML representation of the document after expansion of general entities. Thus, the root node will be the first node. Element nodes occur before their children. Thus, document order orders element nodes in order of the occurrence of their start-tag in the XML (after expansion of entities). The attribute nodes and namespace nodes of an element occur before the children of the element. The namespace nodes are defined to occur before the attribute nodes. The relative order of namespace nodes is implementation-dependent. The relative order of attribute nodes is implementation-dependent. Reverse document order is the reverse of document order.

Root nodes and element nodes have an ordered list of child nodes. Nodes never share children: if one node is not the same node as another node, then none of the children of the one node will be the same node as any of the children of another node. Every node other than the root node has exactly one parent, which is either an element node or the root node. A root node or an element node is the parent of each of its child nodes. The descendants of a node are the children of the node and the descendants of the children of the node.

5.1 Root Node

The root node is the root of the tree. A root node does not occur except as the root of the tree. The element node for the document element is a child of the root node. The root node also has as children processing instruction and comment nodes for processing instructions and comments that occur in the prolog and after the end of the document element.

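The root node is the root of the tree, and a root node occurs only there. The element node for the document element is a child of the root node. The root node also has, as children, processing instruction and comment nodes for the processing instructions and comments that occur in the prolog and after the end of the document element.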

The string-value of the root node is the concatenation of the string-values of all text node descendants of the root node in document order.

The string-value of the root node is the concatenation, in document order, of the string-values of all its text node descendants.

The root node does not have an expanded-name.

The root node has no expanded-name.

5.2 Element Nodes

There is an element node for every element in the document. An element node has an expanded-name computed by expanding the QName of the element specified in the tag in accordance with the XML Namespaces Recommendation [XML Names]. The namespace URI of the element's expanded-name will be null if the QName has no prefix and there is no applicable default namespace.

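Every element in the document has a corresponding element node. An element node has an expanded-name, obtained by expanding its QName. The namespace URI of the expanded-name is null if the QName has no prefix and no default namespace applies. [Translator's note: a default namespace does apply to element names in the data model; it is unprefixed names in XPath 1.0 expressions that are not resolved against a default namespace. XPath 2.0 adds a default element namespace for expressions.]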

NOTE: In the notation of Appendix A.3 of [XML Names], the local part of the expanded-name corresponds to the type attribute of the ExpEType element; the namespace URI of the expanded-name corresponds to the ns attribute of the ExpEType element, and is null if the ns attribute of the ExpEType element is omitted.

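[Translator's note: the translator was unsure what ExpEType is; according to the note above, it is an element in the notation of Appendix A.3 of [XML Names].]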

The children of an element node are the element nodes, comment nodes, processing instruction nodes and text nodes for its content. Entity references to both internal and external entities are expanded. Character references are resolved.

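The children of an element node are the element nodes, comment nodes, processing instruction nodes and text nodes that make up its content. References to both internal and external entities are expanded, and character references are resolved.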

The string-value of an element node is the concatenation of the string-values of all text node descendants of the element node in document order.

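The string-value of an element node is the concatenation, in document order, of the string-values of all its text node descendants.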
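
The following Python sketch illustrates the string-value of an element node and contrasts it with DOM nodeValue. It relies on the standard library only; the sample markup is made up for the example, and xml.etree.ElementTree is used as a convenient stand-in for the data model rather than as an XPath implementation.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

SOURCE = "<p>Hello, <b>XPath</b> world<!-- ignored -->!</p>"

# itertext() yields the text of all descendants in document order;
# the comment contributes nothing because it is not a text node.
elem = ET.fromstring(SOURCE)
string_value = "".join(elem.itertext())
print(string_value)

# The DOM nodeValue of an element is null (None in Python), not its text.
dom_elem = minidom.parseString(SOURCE).documentElement
print(dom_elem.nodeValue)
```

The text inside the nested b element is included, while the comment content is not.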

5.2.1 Unique IDs

An element node may have a unique identifier (ID). This is the value of the attribute that is declared in the DTD as type ID. No two elements in a document may have the same unique ID. If an XML processor reports two elements in a document as having the same unique ID (which is possible only if the document is invalid) then the second element in document order must be treated as not having a unique ID.

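An element node may have a unique identifier (ID): the value of the attribute declared in the DTD as type ID. No two elements in a document may have the same unique ID. If an XML processor reports two elements with the same unique ID, which can happen only in an invalid document, the second element in document order must be treated as having no unique ID.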

NOTE: If a document does not have a DTD, then no element in the document will have a unique ID.

Note: if a document has no DTD, no element in it has a unique ID.
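
One way to picture the duplicate-ID rule is an index built in document order in which the first element to claim an ID keeps it. The sketch below is hypothetical: the function name and the input format (a list of (element label, ID value or None) pairs in document order) are assumptions for illustration, and ID values are assumed to have been identified from the DTD already.

```python
def index_unique_ids(elements_in_document_order):
    """Map each ID value to the first element that carries it."""
    ids = {}
    for label, id_value in elements_in_document_order:
        # A later element reusing an ID is treated as having no unique ID.
        if id_value is not None and id_value not in ids:
            ids[id_value] = label
    return ids


elements = [("chapter[1]", "intro"), ("para[1]", None), ("chapter[2]", "intro")]
print(index_unique_ids(elements))
```

Here only chapter[1] is reachable through the ID "intro"; chapter[2] is treated as if it had no unique ID.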

5.3 Attribute Nodes

Each element node has an associated set of attribute nodes; the element is the parent of each of these attribute nodes; however, an attribute node is not a child of its parent element.

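Each element node has an associated set of attribute nodes. The element is the parent of each of these attribute nodes, but an attribute node is not a child of its parent element.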

NOTE: This is different from the DOM, which does not treat the element bearing an attribute as the parent of the attribute (see [DOM]).

This differs from the DOM, which does not treat the element bearing an attribute as the parent of that attribute.

Elements never share attribute nodes: if one element node is not the same node as another element node, then none of the attribute nodes of the one element node will be the same node as the attribute nodes of another element node.

Elements never share attribute nodes: each attribute node belongs to exactly one element node.

NOTE: The = operator tests whether two nodes have the same value, not whether they are the same node. Thus attributes of two different elements may compare as equal using =, even though they are not the same node.

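Note: the = operator tests whether two nodes have the same value, not whether they are the same node. Attributes of different elements may therefore compare as equal with =, even though they are not the same node.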

A defaulted attribute is treated the same as a specified attribute. If an attribute was declared for the element type in the DTD, but the default was declared as #IMPLIED, and the attribute was not specified on the element, then the element's attribute set does not contain a node for the attribute.

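An attribute that takes its default value is treated the same as an explicitly specified one. If an attribute was declared for the element type in the DTD with the default declared as #IMPLIED, and the attribute was not specified on the element, the element's attribute set contains no node for it.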

Some attributes, such as xml:lang and xml:space, have the semantics that they apply to all elements that are descendants of the element bearing the attribute, unless overridden with an instance of the same attribute on another descendant element. However, this does not affect where attribute nodes appear in the tree: an element has attribute nodes only for attributes that were explicitly specified in the start-tag or empty-element tag of that element or that were explicitly declared in the DTD with a default value.

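Some attributes, such as xml:lang and xml:space, have semantics that apply to all descendant elements of the element bearing the attribute, unless overridden. This does not affect where attribute nodes appear in the tree: an element has an attribute node only for attributes explicitly specified in its start-tag or empty-element tag, or explicitly declared in the DTD with a default value (an #IMPLIED declaration supplies no default value).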

An attribute node has an expanded-name and a string-value. The expanded-name is computed by expanding the QName specified in the tag in the XML document in accordance with the XML Namespaces Recommendation [XML Names]. The namespace URI of the attribute's name will be null if the QName of the attribute does not have a prefix.

An attribute node has an expanded-name and a string-value. If the QName of the attribute has no prefix, the namespace URI of the attribute's name is null.

NOTE: In the notation of Appendix A.3 of [XML Names], the local part of the expanded-name corresponds to the name attribute of the ExpAName element; the namespace URI of the expanded-name corresponds to the ns attribute of the ExpAName element, and is null if the ns attribute of the ExpAName element is omitted.

An attribute node has a string-value. The string-value is the normalized value as specified by the XML Recommendation [XML]. An attribute whose normalized value is a zero-length string is not treated specially: it results in an attribute node whose string-value is a zero-length string.

NOTE: It is possible for default attributes to be declared in an external DTD or an external parameter entity. The XML Recommendation does not require an XML processor to read an external DTD or an external parameter entity unless it is validating. A stylesheet or other facility that assumes that the XPath tree contains default attribute values declared in an external DTD or parameter entity may not work with some non-validating XML processors.

There are no attribute nodes corresponding to attributes that declare namespaces (see [XML Names]).

Namespace declarations do not give rise to attribute nodes.
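
Python's standard xml.etree.ElementTree happens to follow the same two conventions, which makes it handy for a quick illustration: namespace declarations do not show up among an element's attributes, and an unprefixed attribute has no namespace even when a default namespace is in scope. The URIs below are made up for the example.

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring(
    '<e xmlns="urn:example:default" xmlns:p="urn:example:p" a="1" p:b="2"/>'
)

# The element name picks up the default namespace ...
print(elem.tag)
# ... but the xmlns declarations are not attributes, and the
# unprefixed attribute "a" stays in no namespace.
print(dict(elem.attrib))
```

Only a and p:b appear as attributes; the two declarations are absent.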

5.4 Namespace Nodes

Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope for the element (including the xml prefix, which is implicitly declared by the XML Namespaces Recommendation [XML Names]) and one for the default namespace if one is in scope for the element. The element is the parent of each of these namespace nodes;

Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope.

however, a namespace node is not a child of its parent element. Elements never share namespace nodes: if one element node is not the same node as another element node, then none of the namespace nodes of the one element node will be the same node as the namespace nodes of another element node. This means that an element will have a namespace node:

An element therefore has a namespace node in the following cases:

  • for every attribute on the element whose name starts with xmlns:;
  • for every attribute on an ancestor element whose name starts with xmlns: unless the element itself or a nearer ancestor redeclares the prefix;
  • for an xmlns attribute, if the element or some ancestor has an xmlns attribute, and the value of the xmlns attribute for the nearest such element is non-empty

NOTE: An attribute xmlns="" "undeclares" the default namespace (see [XML Names]).

Note: the attribute xmlns="" cancels the default namespace declaration.
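
The three cases above, together with the implicit xml prefix and the xmlns="" rule, can be sketched as a walk from the root down to the element. The function name and the input format (one dictionary of declarations per element, outermost first, with the key "" standing for a plain xmlns attribute) are assumptions for this illustration.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"


def in_scope_namespaces(declarations_root_to_element):
    """Return {prefix: URI}, one entry per namespace node of the element."""
    scope = {"xml": XML_NS}          # the xml prefix is implicitly declared
    for declarations in declarations_root_to_element:
        for prefix, uri in declarations.items():
            if prefix == "" and uri == "":
                scope.pop("", None)  # xmlns="" undeclares the default namespace
            else:
                scope[prefix] = uri  # a nearer declaration overrides an outer one
    return scope


chain = [
    {"": "urn:example:default", "p": "urn:example:one"},  # root element
    {"p": "urn:example:two"},                             # nearer ancestor
    {"": ""},                                             # the element itself
]
print(in_scope_namespaces(chain))
```

For this chain the element ends up with two namespace nodes: one for xml and one for p bound to the nearer declaration. There is none for the default namespace.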

A namespace node has an expanded-name: the local part is the namespace prefix (this is empty if the namespace node is for the default namespace); the namespace URI is always null.

The string-value of a namespace node is the namespace URI that is being bound to the namespace prefix; if it is relative, it must be resolved just like a namespace URI in an expanded-name.

5.5 Processing Instruction Nodes

There is a processing instruction node for every processing instruction, except for any processing instruction that occurs within the document type declaration.

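Every processing instruction gives rise to a processing instruction node, except those that occur within the document type declaration.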

A processing instruction has an expanded-name: the local part is the processing instruction's target; the namespace URI is null. The string-value of a processing instruction node is the part of the processing instruction following the target and any whitespace. It does not include the terminating ?>.

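A processing instruction has an expanded-name: the local part is the processing instruction's target and the namespace URI is null. Its string-value is the part of the processing instruction that follows the target and any whitespace after it, not including the terminating ?>.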
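
For a concrete picture, the sketch below parses a small document with Python's standard xml.dom.minidom. The DOM target and data properties line up with the local part and string-value described above; the sample document is invented for the example, and minidom is a DOM implementation rather than an XPath data model.

```python
from xml.dom import Node, minidom

doc = minidom.parseString('<?xml version="1.0"?><?style    href="a.css"?><r/>')

# The XML declaration produces no node: the first child is the real PI.
pi = doc.firstChild
print(pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE)
print(pi.target)   # local part of the expanded-name
print(pi.data)     # string-value: text after the target and whitespace
print(len(doc.childNodes))
```

The document node has exactly two children here, the style processing instruction and the r element, which also illustrates that the XML declaration is not a processing instruction.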

NOTE: The XML declaration is not a processing instruction. Therefore, there is no processing instruction node corresponding to the XML declaration.

Note: the XML declaration is not a processing instruction, so no processing instruction node corresponds to it.

5.6 Comment Nodes

There is a comment node for every comment, except for any comment that occurs within the document type declaration.

Every comment gives rise to a comment node, except comments that occur within the document type declaration.

The string-value of a comment is the content of the comment not including the opening <!-- or the closing -->.

The string-value of a comment is the content of the comment, excluding the opening <!-- and the closing -->.

A comment node does not have an expanded-name.

A comment node has no expanded-name.

5.7 Text Nodes

Character data is grouped into text nodes. As much character data as possible is grouped into each text node: a text node never has an immediately following or preceding sibling that is a text node. The string-value of a text node is the character data. A text node always has at least one character of data.

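Character data is grouped into text nodes, with as much character data as possible in each one: a text node never has a text node as its immediately preceding or following sibling. The string-value of a text node is its character data, and a text node always holds at least one character.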

Each character within a CDATA section is treated as character data. Thus, <![CDATA[<]]> in the source document will be treated the same as &lt;. Both will result in a single < character in a text node in the tree. Thus, a CDATA section is treated as if the <![CDATA[ and ]]> were removed and every occurrence of < and & were replaced by &lt; and &amp; respectively.

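Every character in a CDATA section is treated as character data. So the CDATA section <![CDATA[<]]> in the source document is equivalent to &lt;: both put a single < character into a text node in the tree. A CDATA section is thus handled as if its opening and closing delimiters were removed and every < and & were replaced by &lt; and &amp; respectively.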
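
A short check with Python's standard xml.etree.ElementTree shows the same behaviour: CDATA sections, entity references and plain text all end up as ordinary character data. This is an illustration with made-up input, not a test of any XPath processor.

```python
import xml.etree.ElementTree as ET

cdata = ET.fromstring("<a><![CDATA[<]]></a>")
escaped = ET.fromstring("<a>&lt;</a>")
mixed = ET.fromstring("<a>1 <![CDATA[<]]> 2 &amp; 3</a>")

# Both spellings give the same single "<" character.
print(cdata.text == escaped.text)
# Adjacent character data is merged into one run of text.
print(mixed.text)
# Writing the tree back out escapes the "<" again.
print(ET.tostring(cdata, encoding="unicode"))
```

On output the < character is escaped again, since a literal < cannot appear unescaped in character data.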

NOTE: When a text node that contains a < character is written out as XML, the < character must be escaped by, for example, using &lt;, or including it in a CDATA section.

Note: when a text node containing < is written out as XML, the < must be escaped.

Characters inside comments, processing instructions and attribute values do not produce text nodes. Line-endings in external entities are normalized to #xA as specified in the XML Recommendation [XML].

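Characters inside comments, processing instructions and attribute values do not produce text nodes. Line endings in external entities are normalized to #xA, as the XML Recommendation specifies.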

A text node does not have an expanded-name.

A text node has no expanded-name.

6 Conformance

XPath is intended primarily as a component that can be used by other specifications. Therefore, XPath relies on specifications that use XPath (such as [XPointer] and [XSLT]) to specify criteria for conformance of implementations of XPath and does not define any conformance criteria for independent implementations of XPath.

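XPath is mainly intended as a component of other specifications. It therefore relies on the specifications that use it to define conformance criteria for implementations, and defines none for independent implementations.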


A References

A.1 Normative References

IEEE 754

Institute of Electrical and Electronics Engineers. IEEE Standard for Binary Floating-Point Arithmetic. ANSI/IEEE Std 754-1985.

RFC2396

T. Berners-Lee, R. Fielding, and L. Masinter. Uniform Resource Identifiers (URI): Generic Syntax. IETF RFC 2396. See http://www.ietf.org/rfc/rfc2396.txt.

XML

World Wide Web Consortium. Extensible Markup Language (XML) 1.0. W3C Recommendation. See http://www.w3.org/TR/1998/REC-xml-19980210

XML Names

World Wide Web Consortium. Namespaces in XML. W3C Recommendation. See http://www.w3.org/TR/REC-xml-names

A.2 Other References

Character Model

World Wide Web Consortium. Character Model for the World Wide Web. W3C Working Draft. See http://www.w3.org/TR/WD-charmod

DOM

World Wide Web Consortium. Document Object Model (DOM) Level 1 Specification. W3C Recommendation. See http://www.w3.org/TR/REC-DOM-Level-1

JLS

J. Gosling, B. Joy, and G. Steele. The Java Language Specification. See http://java.sun.com/docs/books/jls/index.html.

ISO/IEC 10646

ISO (International Organization for Standardization). ISO/IEC 10646-1:1993, Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane. International Standard. See http://www.iso.ch/cate/d18741.html.

TEI

C.M. Sperberg-McQueen, L. Burnard. Guidelines for Electronic Text Encoding and Interchange. See http://etext.virginia.edu/TEI.html.

Unicode

Unicode Consortium. The Unicode Standard. See http://www.unicode.org/unicode/standard/standard.html.

XML Infoset

World Wide Web Consortium. XML Information Set. W3C Working Draft. See http://www.w3.org/TR/xml-infoset

XPointer

World Wide Web Consortium. XML Pointer Language (XPointer). W3C Working Draft. See http://www.w3.org/TR/WD-xptr

XQL

J. Robie, J. Lapp, D. Schach. XML Query Language (XQL). See http://www.w3.org/TandS/QL/QL98/pp/xql.html

XSLT

World Wide Web Consortium. XSL Transformations (XSLT). W3C Recommendation. See http://www.w3.org/TR/xslt

B XML Information Set Mapping (Non-Normative)

The nodes in the XPath data model can be derived from the information items provided by the XML Information Set [XML Infoset] as follows:

NOTE: A new version of the XML Information Set Working Draft, which will replace the May 17 version, was close to completion at the time when the preparation of this version of XPath was completed and was expected to be released at the same time or shortly after the release of this version of XPath. The mapping is given for this new version of the XML Information Set Working Draft. If the new version of the XML Information Set Working Draft has not yet been released, W3C members may consult the internal Working Group version http://www.w3.org/XML/Group/1999/09/WD-xml-infoset-19990915.html (members only).

  • The root node comes from the document information item. The children of the root node come from the children and children - comments properties.
  • An element node comes from an element information item. The children of an element node come from the children and children - comments properties. The attributes of an element node come from the attributes property. The namespaces of an element node come from the in-scope namespaces property. The local part of the expanded-name of the element node comes from the local name property. The namespace URI of the expanded-name of the element node comes from the namespace URI property. The unique ID of the element node comes from the children property of the attribute information item in the attributes property that has an attribute type property equal to ID.
  • An attribute node comes from an attribute information item. The local part of the expanded-name of the attribute node comes from the local name property. The namespace URI of the expanded-name of the attribute node comes from the namespace URI property. The string-value of the node comes from concatenating the character code property of each member of the children property.
  • A text node comes from a sequence of one or more consecutive character information items. The string-value of the node comes from concatenating the character code property of each of the character information items.
  • A processing instruction node comes from a processing instruction information item. The local part of the expanded-name of the node comes from the target property. (The namespace URI part of the expanded-name of the node is null.) The string-value of the node comes from the content property. There are no processing instruction nodes for processing instruction items that are children of document type declaration information item.
  • A comment node comes from a comment information item. The string-value of the node comes from the content property. There are no comment nodes for comment information items that are children of document type declaration information item.
  • A namespace node comes from a namespace declaration information item. The local part of the expanded-name of the node comes from the prefix property. (The namespace URI part of the expanded-name of the node is null.) The string-value of the node comes from the namespace URI property.
