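I. XML Constraints
1. DTD Constraints
Basic concepts
A DTD (Document Type Definition) constrains the structure of an XML document; you can think of it as the template that the document must follow. For example, a DTD can require that the root element is <students>, that <students> contains one or more <student> elements, that each <student> has the child elements <name>, <age>, and <sex>, and that <student> carries a number attribute.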
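(1) Constraints defined inside the XML file
- Location: the internal DTD goes below the XML declaration and above the root element;
- Syntax: the declarations go between "<!DOCTYPE root-element-name [" and "]>";
- Scope: valid only for the current XML document.
DTDTest01.xml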
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE bookstore [
    <!ELEMENT bookstore (book+)>
    <!ELEMENT book (title,author,year,price)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT year (#PCDATA)>
    <!ELEMENT price (#PCDATA)>
    <!ATTLIST book category CDATA #REQUIRED>
    <!ATTLIST title lang CDATA #IMPLIED>
]>
<bookstore>
    <book category="history">
        <title>Zi zhi tong jian</title>
        <author>Guang Si ma</author>
        <year>1048</year>
        <price>888</price>
    </book>
    <book category="Web">
        <title>Python</title>
        <author>Smith</author>
        <year>2005</year>
        <price>45</price>
    </book>
    <book category="Cooking">
        <title>Every Day</title>
        <author>J.P</author>
        <year>2012</year>
        <price>32.5</price>
    </book>
</bookstore>
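One way to confirm that a document really satisfies its internal DTD is to parse it with validation switched on. The sketch below uses the JDK's built-in JAXP parser; the class name InternalDtdCheck, the validate helper, and the shortened DTD with only two child elements are assumptions made for illustration, not part of the original example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class InternalDtdCheck {

    // Shortened, hypothetical DTD in the same style as DTDTest01.xml.
    static final String DOCTYPE = "<!DOCTYPE bookstore ["
            + "<!ELEMENT bookstore (book+)>"
            + "<!ELEMENT book (title,author)>"
            + "<!ELEMENT title (#PCDATA)>"
            + "<!ELEMENT author (#PCDATA)>"
            + "<!ATTLIST book category CDATA #REQUIRED>"
            + "]>";

    static final String VALID = DOCTYPE
            + "<bookstore><book category=\"Web\">"
            + "<title>Python</title><author>Smith</author>"
            + "</book></bookstore>";

    // Missing the required category attribute and the author child.
    static final String INVALID = DOCTYPE
            + "<bookstore><book><title>Python</title></book></bookstore>";

    /** Parses the XML with DTD validation on and returns the validity errors found. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // check the document against its DOCTYPE
        DocumentBuilder builder = factory.newDocumentBuilder();

        List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("valid document errors:   " + validate(VALID));
        System.out.println("invalid document errors: " + validate(INVALID));
    }
}
```

Validity errors are collected rather than thrown here so that every violation is visible in one pass; a well-formedness problem still aborts the parse through fatalError.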
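(2) Constraints defined in a separate DTD file
- Location: a file on the local disk;
- Syntax: declare the elements and attributes directly, with no DOCTYPE wrapper;
- Scope: any local XML document can reference this DTD file.
1> demo.dtd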
<!ELEMENT bookstore (book+)>
<!ELEMENT book (title,author,year,price)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST book category CDATA #REQUIRED>
<!ATTLIST title lang CDATA #IMPLIED>
2> DTDTest02.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE bookstore SYSTEM "demo.dtd">
<bookstore>
    <book category="history">
        <title>Zi zhi tong jian</title>
        <author>Guang Si ma</author>
        <year>1048</year>
        <price>888</price>
    </book>
    <book category="Web">
        <title>Python</title>
        <author>Smith</author>
        <year>2005</year>
        <price>45</price>
    </book>
    <book category="Cooking">
        <title>Every Day</title>
        <author>J.P</author>
        <year>2012</year>
        <price>32.5</price>
    </book>
</bookstore>
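When the DTD lives in its own file, the parser has to locate it through the system identifier in the DOCTYPE line. The following sketch shows one possible way to validate such a document with the JDK's JAXP parser while serving the DTD from memory through an EntityResolver, so it runs without any files on disk. The class name ExternalDtdCheck, the in-memory DTD string, and the shortened element list are assumptions for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class ExternalDtdCheck {

    // Hypothetical stand-in for the contents of demo.dtd (shortened).
    static final String DEMO_DTD =
            "<!ELEMENT bookstore (book+)>"
            + "<!ELEMENT book (title,author)>"
            + "<!ELEMENT title (#PCDATA)>"
            + "<!ELEMENT author (#PCDATA)>"
            + "<!ATTLIST book category CDATA #REQUIRED>";

    static final String VALID = "<!DOCTYPE bookstore SYSTEM \"demo.dtd\">"
            + "<bookstore><book category=\"Web\">"
            + "<title>Python</title><author>Smith</author>"
            + "</book></bookstore>";

    // The required category attribute is missing.
    static final String INVALID = "<!DOCTYPE bookstore SYSTEM \"demo.dtd\">"
            + "<bookstore><book>"
            + "<title>Python</title><author>Smith</author>"
            + "</book></bookstore>";

    /** Parses and validates the XML; throws SAXParseException on the first validity error. */
    static Document parseValidated(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Answer requests for demo.dtd from memory; anything else falls back to the default lookup.
        builder.setEntityResolver((publicId, systemId) ->
                systemId != null && systemId.endsWith("demo.dtd")
                        ? new InputSource(new StringReader(DEMO_DTD))
                        : null);

        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) throws SAXParseException { throw e; }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseValidated(VALID);
        System.out.println("root element: " + doc.getDocumentElement().getTagName());
        try {
            parseValidated(INVALID);
        } catch (SAXParseException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The resolver matches on the end of the system identifier because the parser may expand a relative name such as demo.dtd into an absolute URI before asking for it.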
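2. Schema
Basic concepts
XML Schema (XSD) is a newer way to constrain XML documents;
Schema is considerably more powerful than DTD: it supports data types (such as date and double) and namespaces;
A schema is itself an XML document, but its file extension is .xsd rather than .xml.
(1) demo.xsd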
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.example.org/demo"
        elementFormDefault="qualified">
    <element name="bookstore">
        <complexType>
            <sequence maxOccurs="3" minOccurs="1">
                <element name="book">
                    <complexType>
                        <sequence>
                            <element name="title">
                                <complexType>
                                    <simpleContent>
                                        <extension base="string">
                                            <attribute name="lang" type="string"></attribute>
                                        </extension>
                                    </simpleContent>
                                </complexType>
                            </element>
                            <element name="author" type="string"></element>
                            <element name="year" type="date"></element>
                            <element name="price" type="double"></element>
                        </sequence>
                        <attribute name="category" type="string" use="required"></attribute>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>
</schema>
(2) bookstore.xml
<?xml version="1.0" encoding="UTF-8"?>
<bookstore xmlns="http://www.example.org/demo"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.example.org/demo demo.xsd">
    <book category="COOKING">
        <title>Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005-10-10</year>
        <price>30.00</price>
    </book>
    <book category="CHILDREN">
        <title>Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2004-10-10</year>
        <price>29.99</price>
    </book>
    <book category="WEB">
        <title>Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003-10-10</year>
        <price>39.95</price>
    </book>
</bookstore>
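To see the schema's type checking in action, a document can be validated programmatically. The sketch below uses the JDK's javax.xml.validation API with a shortened, in-memory schema in the same style as demo.xsd; the class name XsdCheck, the isValid helper, and the trimmed element list are assumptions for illustration. It shows that a year written as a full date passes, while a bare year such as 2003 fails because the element is typed as date.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class XsdCheck {

    // Shortened, hypothetical schema modelled on demo.xsd.
    static final String XSD =
            "<schema xmlns=\"http://www.w3.org/2001/XMLSchema\""
            + " targetNamespace=\"http://www.example.org/demo\""
            + " elementFormDefault=\"qualified\">"
            + "<element name=\"bookstore\"><complexType><sequence maxOccurs=\"3\">"
            + "<element name=\"book\"><complexType><sequence>"
            + "<element name=\"title\" type=\"string\"/>"
            + "<element name=\"year\" type=\"date\"/>"
            + "<element name=\"price\" type=\"double\"/>"
            + "</sequence>"
            + "<attribute name=\"category\" type=\"string\" use=\"required\"/>"
            + "</complexType></element>"
            + "</sequence></complexType></element>"
            + "</schema>";

    /** Builds a one-book document in the schema's namespace with the given year text. */
    static String bookstoreWithYear(String year) {
        return "<bookstore xmlns=\"http://www.example.org/demo\">"
                + "<book category=\"WEB\"><title>Learning XML</title>"
                + "<year>" + year + "</year><price>39.95</price></book>"
                + "</bookstore>";
    }

    /** Returns true if the XML validates against XSD, false on any schema violation. */
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no custom ErrorHandler, the validator throws on the first error.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("2003-10-10 valid? " + isValid(bookstoreWithYear("2003-10-10")));
        System.out.println("2003 valid?       " + isValid(bookstoreWithYear("2003")));
    }
}
```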
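II. XML Parsing
1. Parsing with DOM4J
Basic concepts
DOM4J is an open-source XML parsing library written for Java developers. It is not the W3C DOM API, although its tree model is similar. Because it was designed around Java conventions such as collections and iterators, Java developers usually find DOM4J more convenient than DOM.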
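Methods:
elements(): returns all child elements of the element
element("title"): returns the first child element with the given name
elements("title"): returns all child elements named title
getName(): returns the name of the element or attribute
attribute("category"): returns the Attribute object with the given name
getValue(): returns the value of an attribute
attributeValue("category"): returns the value of the named attribute directly
getText(): returns the text content of the element
elementText("title"): returns the text of the named child element directly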
import java.util.Iterator;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class DOMTest01 {
    public static void main(String[] args) throws Exception {
        // Create the XML reader
        SAXReader sax = new SAXReader();
        // Read the XML file
        Document doc = sax.read("WebRoot/dom4j/bookstore.xml");
        // Get the root element
        Element rootElement = doc.getRootElement();

        // Iterate over the child elements of the root
        for (Iterator<?> i = rootElement.elementIterator(); i.hasNext();) {
            Element element = (Element) i.next();
            System.out.println(element);
        }
        System.out.println("-------------------------------------------------");

        // Get all child elements of the root as a list
        List<Element> els = rootElement.elements();
        // Get the second child element (index 1)
        Element elem1 = els.get(1);
        System.out.println(elem1);
        System.out.println("-----------------------------------------");

        // Get the element's name
        String name1 = elem1.getName();
        System.out.println(name1);
        System.out.println("---------------------------------------------");

        // Get the Attribute object, then read its value
        Attribute attribute1 = elem1.attribute("category");
        String value = attribute1.getValue();
        System.out.println(value);
        System.out.println("-----------------------------------------------");

        // Read the attribute value directly
        String value2 = elem1.attributeValue("category");
        System.out.println(value2);
        System.out.println("--------------------------------");

        // Get the child element, then read its text
        Element element = elem1.element("title");
        String text = element.getText();
        System.out.println(text);
        System.out.println("-------------------------------------------");

        // Read the child element's text directly
        String elementText = elem1.elementText("title");
        System.out.println(elementText);
    }
}
/* Output (object hash codes vary between runs):
org.dom4j.tree.DefaultElement@24e2dae9 [Element: <book attributes: [org.dom4j.tree.DefaultAttribute@299209ea [Attribute: name category value "COOKING"]]/>]
org.dom4j.tree.DefaultElement@32c8f6f8 [Element: <book attributes: [org.dom4j.tree.DefaultAttribute@27ce2dd4 [Attribute: name category value "CHILDREN"]]/>]
org.dom4j.tree.DefaultElement@5122cdb6 [Element: <book attributes: [org.dom4j.tree.DefaultAttribute@43ef9157 [Attribute: name category value "WEB"]]/>]
-------------------------------------------------
org.dom4j.tree.DefaultElement@32c8f6f8 [Element: <book attributes: [org.dom4j.tree.DefaultAttribute@27ce2dd4 [Attribute: name category value "CHILDREN"]]/>]
-----------------------------------------
book
---------------------------------------------
CHILDREN
-----------------------------------------------
CHILDREN
--------------------------------
Harry Potter
-------------------------------------------
Harry Potter
*/
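For comparison, here is a rough sketch of the same three lookups (element name, category attribute, title text of the second book) written against the W3C DOM API that ships with the JDK instead of DOM4J. The class name W3cDomComparison, the secondBook helper, and the inline sample document are assumptions for illustration. Note how child access goes through NodeList and index-based item() calls rather than typed lists.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class W3cDomComparison {

    // Hypothetical sample data: a trimmed bookstore without a namespace.
    static final String SAMPLE =
            "<bookstore>"
            + "<book category=\"COOKING\"><title>Everyday Italian</title></book>"
            + "<book category=\"CHILDREN\"><title>Harry Potter</title></book>"
            + "<book category=\"WEB\"><title>Learning XML</title></book>"
            + "</bookstore>";

    /** Returns {element name, category attribute, title text} of the second book. */
    static String[] secondBook(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // W3C DOM returns a NodeList, so each item must be cast to Element.
        NodeList books = doc.getDocumentElement().getElementsByTagName("book");
        Element book = (Element) books.item(1);

        String name = book.getTagName();
        String category = book.getAttribute("category");
        String title = book.getElementsByTagName("title").item(0).getTextContent();
        return new String[] { name, category, title };
    }

    public static void main(String[] args) throws Exception {
        for (String part : secondBook(SAMPLE)) {
            System.out.println(part);
        }
    }
}
```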
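2. Parsing with XPath
Basic concepts
XPath is a language for locating nodes in an XML document. dom4j supports XPath, but the Jaxen JAR must be on the classpath in addition to dom4j itself. Unprefixed names in an XPath expression match only elements that are in no namespace, so the expressions below assume a bookstore.xml without the default namespace declaration used in the Schema example; the sample output, which prints the years as 2005, 2004, and 2003, suggests that such a file was used.
dom4j provides two methods for XPath queries:
selectNodes: returns all nodes that match the expression
selectSingleNode: returns a single matching node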
import java.util.List;

import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class XpathTest01 {
    public static void main(String[] args) throws Exception {
        // Create the XML reader
        SAXReader sax = new SAXReader();
        // Read the XML file
        Document document = sax.read("WebRoot/dom4j/bookstore.xml");

        /*
         * Pattern: //BBB
         * Selects every BBB element, at any depth:
         * <AAA>
         *   <BBB/>
         *   <CCC/>
         *   <BBB/>
         *   <DDD> <BBB/> </DDD>
         *   <CCC> <DDD> <BBB/> <BBB/> </DDD> </CCC>
         * </AAA>
         */
        // Select multiple elements. selectNodes returns a raw List in dom4j 1.x
        // and List<Node> in dom4j 2.x, so iterate as Object and cast.
        List<?> years = document.selectNodes("//year");
        for (Object node : years) {
            System.out.println(((Element) node).getText());
        }
        System.out.println("--------------------------------------");

        /*
         * Pattern: //BBB[@name='bbb']
         * Selects BBB elements whose name attribute is exactly 'bbb':
         * <AAA>
         *   <BBB id = "b1"/>
         *   <BBB name = " bbb "/>
         *   <BBB name = "bbb"/>
         * </AAA>
         */
        // Select a single element
        Element ele = (Element) document.selectSingleNode("//book[@category='WEB']/price");
        System.out.println(ele.getText());
    }
}
/* Output:
2005
2004
2003
--------------------------------------
39.95
*/
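The same two expressions can be tried without dom4j or Jaxen by using the javax.xml.xpath API bundled with the JDK; this is a swapped-in engine, not the dom4j calls shown above. In the sketch below, the class name JdkXPathSketch, the texts helper, and the inline sample documents are assumptions for illustration. It also shows why a plain //year finds nothing once the document declares a default namespace, and one workaround based on local-name().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class JdkXPathSketch {

    // Hypothetical sample without a namespace, with plain years.
    static final String PLAIN =
            "<bookstore>"
            + "<book category=\"COOKING\"><year>2005</year><price>30.00</price></book>"
            + "<book category=\"CHILDREN\"><year>2004</year><price>29.99</price></book>"
            + "<book category=\"WEB\"><year>2003</year><price>39.95</price></book>"
            + "</bookstore>";

    // Same shape, but every element is in the default namespace.
    static final String NAMESPACED =
            "<bookstore xmlns=\"http://www.example.org/demo\">"
            + "<book category=\"WEB\"><year>2003-10-10</year><price>39.95</price></book>"
            + "</bookstore>";

    /** Evaluates the expression and returns the text content of every matching node. */
    static List<String> texts(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(texts(PLAIN, "//year"));
        System.out.println(texts(PLAIN, "//book[@category='WEB']/price"));
        // Unprefixed names do not match elements that are in a namespace.
        System.out.println(texts(NAMESPACED, "//year"));
        // local-name() ignores the namespace and matches on the element name alone.
        System.out.println(texts(NAMESPACED, "//*[local-name()='year']"));
    }
}
```

Matching on local-name() is the quickest workaround for a sketch; binding a prefix to the namespace URI is the more precise option when several namespaces share element names.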