XML Study Notes - Chapter 2: XML Documents

When choosing between elements and attributes for data storage, keep in mind that an attribute is only intended to contain a single value, while an element can contain multiple values through nested elements. Elements can also represent document structure through nesting and can be extended, while an attribute is limited to the element that contains it.
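As a small sketch of that trade-off, the Python snippet below reads one attribute value and several nested element values with the standard library's xml.etree.ElementTree. The document and its names (book, isbn, authors, author) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element and attribute name here is invented.
DOC = """
<book isbn="0-0000-0000-0">
  <authors>
    <author>First Author</author>
    <author>Second Author</author>
  </authors>
</book>
"""

book = ET.fromstring(DOC)

# An attribute holds exactly one string value.
isbn = book.get("isbn")

# Nested elements can repeat, so they can carry several values.
authors = [a.text for a in book.findall("./authors/author")]

print(isbn)
print(authors)
```

Adding a third author only needs one more author element, whereas packing several authors into a single attribute would force the application to invent its own separator.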

Unicode defines how the characters of the text are encoded, while xml:lang tells parsers and applications which language the content of an element is written in, so that language-specific rules can be applied to it. The declared language also applies to nested elements and attributes until the element's tag is closed or another xml:lang attribute is encountered.
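The inheritance rule can be sketched in Python. ElementTree does not resolve the inherited language by itself, so the helper below (effective_langs, a name invented for this note) walks the tree and carries the nearest xml:lang value down to each descendant. The sample document is also invented.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document used only to show the scoping rule.
DOC = """
<doc xml:lang="en">
  <title>Hello</title>
  <section xml:lang="fr">
    <para>Bonjour</para>
  </section>
  <note>Back to English</note>
</doc>
"""

def effective_langs(elem, inherited=None, out=None):
    """Collect (tag, language in effect) for elem and all its descendants."""
    if out is None:
        out = []
    # An xml:lang on the element itself overrides the inherited value.
    lang = elem.get(XML_LANG, inherited)
    out.append((elem.tag, lang))
    for child in elem:
        effective_langs(child, lang, out)
    return out

langs = effective_langs(ET.fromstring(DOC))
for tag, lang in langs:
    print(tag, lang)
```

The para element reports fr because it sits inside the section that redeclares the language, and note falls back to en once that section is closed.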

Language codes can be defined in several ways. Some are completely standardized, as in the case of the International Organization for Standardization (ISO) 639 language codes (make sure you use the two-character ISO 639 codes and not the three-character ISO 639-2 codes) and the ISO 3166 country codes; any combination of the two is a legal xml:lang language identifier. You can also use a registered IANA name tag (which can be a linguistic or a computer language), or make one up using x- or X- as a prefix, as long as the name hasn't already been registered as part of the ISO or IANA languages.
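A rough way to tell these three kinds of identifier apart is a shape check like the Python sketch below. It rests on assumptions: the i- prefix for IANA-registered tags follows the grammar in the first edition of the XML 1.0 Recommendation, the function name classify and the sample tags are invented, and nothing here checks that a code is actually registered with ISO or IANA.

```python
import re

# Two-letter ISO 639 language code, optionally followed by an
# ISO 3166 country code, e.g. "en" or "en-US".
ISO_639 = re.compile(r"[A-Za-z]{2}(-[A-Za-z]{2})?")
# Assumption: IANA-registered tags use an "i-" or "I-" prefix.
IANA = re.compile(r"[iI]-[A-Za-z]+")
# User-defined tags use an "x-" or "X-" prefix.
USER = re.compile(r"[xX]-[A-Za-z]+")

def classify(tag):
    """Return the kind of language identifier, judged by shape only."""
    if ISO_639.fullmatch(tag):
        return "iso639"
    if IANA.fullmatch(tag):
        return "iana"
    if USER.fullmatch(tag):
        return "user"
    return "unknown"

for tag in ["en", "en-US", "i-navajo", "x-klingon", "eng"]:
    print(tag, classify(tag))
```

The three-letter tag eng comes back as unknown, which matches the advice above to stay with the two-character codes.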

To maintain text spacing through XML document manipulation and future reformatting, the xml:space="preserve" attribute can be used to make sure that the spacing and line breaks stay intact.

The xml:space="default" attribute can also be defined, but it is purely optional because it doesn't tell the parser to do anything it wouldn't do anyway. Some parsers may ignore xml:space, but most are good XML citizens and respect the text formatting if the value is set to "preserve".

The spacing around text that is part of the text formatting (spaces, tabs, and line breaks) is referred to as "whitespace".
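Here is a minimal Python sketch of an application that honors xml:space. ElementTree hands the text over unchanged, so the decision to collapse or keep whitespace is made by the helper rendered_text, a name invented for this note. For brevity it only looks at the element's own attribute and does not follow inheritance from ancestors.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the reserved XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

# Hypothetical document: "para" and "code" are invented element names.
DOC = """<doc>
  <para>  loose     text  </para>
  <code xml:space="preserve">  x  =  1
  y  =  2</code>
</doc>"""

def rendered_text(elem):
    """Return elem.text, collapsing whitespace unless xml:space is 'preserve'."""
    mode = elem.get(XML_SPACE, "default")
    text = elem.text or ""
    if mode == "preserve":
        return text
    # Default handling chosen for this sketch: trim and collapse runs.
    return " ".join(text.split())

root = ET.fromstring(DOC)
para = rendered_text(root.find("para"))
code = rendered_text(root.find("code"))

print(repr(para))
print(repr(code))
```

The para text is collapsed to single spaces, while the code text keeps its leading spaces and its line break exactly as written.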

The xmlns attribute declares the namespace for an XML document or a portion of an XML document: xmlns on its own declares a default namespace, and xmlns:prefix binds a prefix to a namespace.
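To see what a namespace declaration does to element names, the Python sketch below parses a small document and prints each tag as ElementTree stores it, with the namespace URI in braces in front of the local name. The URIs and element names are placeholders invented for this note.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; they are only names and need not resolve.
DOC = """
<order xmlns="http://example.com/ns/order"
       xmlns:c="http://example.com/ns/customer">
  <c:name>Ada</c:name>
  <total>42</total>
</order>
"""

root = ET.fromstring(DOC)

# Unprefixed elements pick up the default namespace declared with xmlns;
# c:name picks up the namespace bound to the "c" prefix.
tags = [elem.tag for elem in root.iter()]

for tag in tags:
    print(tag)
```

The prefix itself disappears after parsing: what identifies the element is the pair of namespace URI and local name.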

Often the URL in the namespace also resolves to a website that provides documentation about the namespace, information about the encoding types identified in the namespace, and so on. In some cases, however, the URL does not resolve to an actual document and is only used as a placeholder when declaring the namespace name; documentation can be published there at a future date if it is needed.

Namespaces are useful for identifying sections of documents that are being parsed, transformed, or manipulated in some other way. The parser or transformation engine can identify groups of elements and attributes by their namespace instead of by their element names alone, and this helps to keep logical portions of an XML document together during manipulation.
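Grouping by namespace can be sketched as follows. The document reuses the same local names (name, address) in two hypothetical namespaces, and the helper texts_in_namespace, invented for this note, collects only the elements that belong to one of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefixes and URIs used only for this illustration.
NS = {
    "b": "http://example.com/ns/billing",
    "s": "http://example.com/ns/shipping",
}

DOC = """
<invoice xmlns:b="http://example.com/ns/billing"
         xmlns:s="http://example.com/ns/shipping">
  <b:name>Ada</b:name>
  <b:address>1 Billing Street</b:address>
  <s:name>Grace</s:name>
  <s:address>2 Shipping Road</s:address>
</invoice>
"""

def texts_in_namespace(root, uri):
    """Return the text of every element whose tag belongs to the given namespace."""
    marker = "{" + uri + "}"
    return [e.text for e in root.iter() if e.tag.startswith(marker)]

root = ET.fromstring(DOC)

# All billing elements, regardless of their local names.
billing = texts_in_namespace(root, NS["b"])

# A single element looked up by prefix through a prefix-to-URI mapping.
shipping_name = root.find("s:name", NS).text

print(billing)
print(shipping_name)
```

Without the namespaces, the two name elements would be indistinguishable by tag alone.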

A URI is one of the basic components of a namespace. URIs (Uniform Resource Identifiers) are the general format that covers both Uniform Resource Locators (URLs) and Uniform Resource Names (URNs). The main difference is that a URL specifies a location-specific resource on the Web, while a URN names a resource without saying where it is located. Either a URL or a URN can be used as the URI of a namespace.
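The difference shows up when the two kinds of URI are split into their parts. The Python sketch below uses urllib.parse.urlparse from the standard library; the two sample URIs and the helper name kind are invented, and the rule "the urn scheme means URN, anything else is treated as a URL" is a simplification made for this note.

```python
from urllib.parse import urlparse

def kind(uri):
    """Label a URI as 'URN' or 'URL' by its scheme (a deliberate simplification)."""
    return "URN" if urlparse(uri).scheme.lower() == "urn" else "URL"

# Hypothetical namespace names, one of each kind.
url_ns = "http://www.example.com/ns/book"
urn_ns = "urn:example:book"

# A URL carries a network location; a URN does not.
url_host = urlparse(url_ns).netloc
urn_host = urlparse(urn_ns).netloc

print(kind(url_ns), repr(url_host))
print(kind(urn_ns), repr(urn_host))
```

Either string works as a namespace name, because it only has to be unique and is never fetched.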

The URI value of the namespace can contain anything that the W3C Namespaces Recommendation allows. The prefix bound to it is a different matter: because the prefix is used in element names, it has to adhere to the W3C XML name rules for characters.
