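2.1) dom4j is an object model that represents an XML tree in memory. It offers an easy-to-use API with powerful features for processing and manipulating XML, and it works together with XPath, XSLT, SAX, JAXP and DOM.
2.2) dom4j is designed around interfaces so that the implementation strategy stays highly configurable. To create your own XML tree implementation you only need to provide a DocumentFactory implementation. This makes it easy to reuse dom4j code when you extend dom4j with the features you need.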
import java.io.File;
import java.net.URL;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class DeployFileLoaderSample {

    /** dom4j object model representation of an XML document. Note: we use the interface(!), not its implementation. */
    private Document doc;

    /**
     * Loads a document from a file.
     * @param aFile the data source
     * @throws DocumentException if parsing fails
     */
    public void parseWithSAX(File aFile) throws DocumentException {
        SAXReader xmlReader = new SAXReader();
        this.doc = xmlReader.read(aFile);
    }

    /**
     * Loads a document from a URL.
     * @param aURL the data source
     * @throws DocumentException if parsing fails
     */
    public void parseWithSAX(URL aURL) throws DocumentException {
        SAXReader xmlReader = new SAXReader();
        this.doc = xmlReader.read(aURL);
    }

    public Document getDoc() {
        return doc;
    }
}
java.lang.String - a SystemId is a String that contains a URI, e.g. a URL to an XML file
java.net.URL - represents a Uniform Resource Locator (a form of Uniform Resource Identifier); encapsulates a URL
java.io.InputStream - an open input stream that transports XML data
java.io.Reader - more compatible; has the ability to specify the encoding scheme
org.xml.sax.InputSource - a single input source for an XML entity
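The difference between a Reader and an InputStream in the list above can be sketched with the JDK alone. The example below is only an illustration under assumptions: it uses the JDK's DocumentBuilder as a stand-in for dom4j's SAXReader (both accept an org.xml.sax.InputSource), and the class name InputSourceKinds and its methods are hypothetical. A Reader delivers characters that are already decoded, while an InputStream delivers raw bytes, so the encoding has to be supplied or declared.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of dom4j.
class InputSourceKinds {
    // Sample document with one non-ASCII character (e with acute accent).
    static final String XML = "<project><name>caf\u00e9</name></project>";

    /** Parses from a Reader: the characters are already decoded. */
    static String nameFromReader() {
        return parse(new InputSource(new StringReader(XML)));
    }

    /** Parses from an InputStream: raw bytes, so the encoding is set on the InputSource. */
    static String nameFromStream() {
        byte[] bytes = XML.getBytes(StandardCharsets.ISO_8859_1);
        InputSource source = new InputSource(new ByteArrayInputStream(bytes));
        source.setEncoding("ISO-8859-1");
        return parse(source);
    }

    // Parses the source with the JDK parser and returns the text of <name>.
    private static String parse(InputSource source) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(source);
            return doc.getElementsByTagName("name").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameFromReader());
        System.out.println(nameFromStream());
    }
}
```

Both methods should yield the same element text; only the way the encoding reaches the parser differs.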
2.1) Adding the new method gives DeployFileCreator more flexibility; the code is the same as the listing above.
3) The test case is as follows:
@Test
public void readXML() {
    String base = System.getProperty("user.dir") + File.separator + "src" + File.separator;
    DeployFileLoaderSample sample = new DeployFileLoaderSample();
    try { // via parameter of URL type.
        sample.parseWithSAX(new URL("file:" + base + "pom.xml"));
        Document doc = sample.getDoc();
        System.out.println(doc.asXML());
    } catch (Exception e) {
        e.printStackTrace();
    }

    try { // via parameter of File type.
        sample.parseWithSAX(new File(base + "pom.xml"));
        Document doc = sample.getDoc();
        System.out.println(doc.asXML());
    } catch (Exception e) {
        e.printStackTrace();
    }
}
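【2】Integrating dom4j with other XML APIs
1) intro: dom4j also provides classes for integrating with the two original XML processing APIs, SAX and DOM.
2) The DOMReader class lets you convert an existing DOM tree into a dom4j tree. You can convert a DOM document, a branch of DOM nodes, or a single element. The code is as follows: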
import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.dom4j.Document;
import org.dom4j.io.DOMReader;

public class DOMIntegratorSample {

    public DOMIntegratorSample() {}

    /** parses the resource at the URL into a W3C DOM document; returns null on failure */
    public org.w3c.dom.Document parse(URL url) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(url.toString());
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

    /** converts a W3C DOM document into a dom4j document */
    public Document buildDocument(org.w3c.dom.Document domDocument) {
        DOMReader xmlReader = new DOMReader();
        return xmlReader.read(domDocument);
    }
}

// Test case (field and method of the test class).
public String base = System.getProperty("user.dir") + File.separator + "src" + File.separator;

@Test
public void testIntegrate() {
    DOMIntegratorSample sample = new DOMIntegratorSample();
    try {
        org.w3c.dom.Document doc = sample.parse(new URL("file:" + base + "pom.xml"));
        Document doc4j = sample.buildDocument(doc);
        System.out.println(doc4j.asXML());
    } catch (Exception e) {
        e.printStackTrace();
    }
}
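To make the terms "DOM document", "node branch" and "single element" from this section concrete, here is a minimal sketch that walks a W3C DOM tree with JDK classes only. It is not how DOMReader is implemented; it merely shows the kind of traversal any DOM-to-dom4j conversion has to perform. The class DomWalkSketch and its methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of dom4j.
class DomWalkSketch {
    /** Parses an XML string into a W3C DOM document using the JDK parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    /** Renders the element structure at and below a node as an indented outline. */
    static String outline(Node node, int depth) {
        StringBuilder sb = new StringBuilder();
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            sb.append(node.getNodeName()).append('\n');
            depth++; // children of an element are indented one level deeper
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            sb.append(outline(c, depth));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Document doc = parse("<root><author><name/></author><author/></root>");
        System.out.print(outline(doc, 0));                                       // whole document
        System.out.print(outline(doc.getDocumentElement().getFirstChild(), 0));  // one branch
    }
}
```

The same method works on the whole document or on any branch, because both are just nodes in the DOM API.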
【3】The secret of DocumentFactory
1) intro: creating a Document from scratch. The code is as follows:
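import org.dom4j.Document;
import org.dom4j.DocumentFactory;

public class GranuatedDeployFileCreator {
    private DocumentFactory factory;
    private Document doc;

    public GranuatedDeployFileCreator() {
        this.factory = DocumentFactory.getInstance(); // singleton accessor
    }

    public void generateDoc(String aRootElement) {
        doc = factory.createDocument();
        doc.addElement(aRootElement); // adds the root element
    }

    // Accessor required by the test case below.
    public Document getDoc() {
        return doc;
    }
}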
1.1) The test case is as follows:
@Test
public void testGenerateDoc() {
    GranuatedDeployFileCreator creator = new GranuatedDeployFileCreator();
    creator.generateDoc("project");
    Document doc = creator.getDoc();
    System.out.println(doc.asXML());
}
2) The Document and Element interfaces offer many helper methods for building XML documents dynamically in a simple way:
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Foo {

    public Foo() {}

    public Document createDocument() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement("root");

        Element author2 = root.addElement("author")
                .addAttribute("name", "Toby")
                .addAttribute("location", "Germany")
                .addText("Tobias Rademacher");

        Element author1 = root.addElement("author")
                .addAttribute("name", "James")
                .addAttribute("location", "UK")
                .addText("James Strachan");

        return document;
    }
}
2.1) The test case is as follows:
@Test
public void testCreateDocByHelper() {
    Foo foo = new Foo();
    Document doc = foo.createDocument();
    System.out.println(doc.asXML());
}
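2.2) dom4j is an interface-based API. This means that DocumentFactory and the reader classes in dom4j always work with the org.dom4j interfaces rather than with their implementation classes. The Collections API and the W3C DOM take the same approach.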
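The Collections API comparison can be illustrated in a few lines. This is only a sketch of the general principle of programming against an interface, using java.util.List; the class InterfaceBasedSketch and its methods are hypothetical and have nothing to do with dom4j's own classes.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper class illustrating interface-based design.
class InterfaceBasedSketch {
    /** Adds two sample names; works with any List implementation. */
    static List<String> fill(List<String> target) {
        target.add("Toby");
        target.add("James");
        return target;
    }

    /** Joins the names; depends only on the List interface. */
    static String join(List<String> names) {
        StringBuilder sb = new StringBuilder();
        for (String name : names) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The calling code is identical for both implementations.
        System.out.println(join(fill(new ArrayList<>())));
        System.out.println(join(fill(new LinkedList<>())));
    }
}
```

Code written against org.dom4j.Document and org.dom4j.Element benefits in the same way: the concrete tree implementation can be swapped without touching the callers.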
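2.3) Once you have parsed or created a document, you will want to serialize it to disk or to a plain stream. dom4j provides a set of classes that serialize your dom4j tree in four ways: XML, HTML, DOM, and SAX events.
【4】Serializing to XML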
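1) intro: the XMLWriter constructor accepts an OutputStream or a Writer, together with a given character encoding. A Writer is easier to use than an OutputStream because it is character-based and therefore has fewer encoding problems. The write() method of XMLWriter is overloaded, so you can write out dom4j objects one by one as needed.
2) The code is as follows: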
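Why a byte stream raises encoding questions that a Writer hides can be shown with plain java.io classes. The sketch below is an illustration only and does not use XMLWriter; EncodingSketch and encode are hypothetical names. It writes the same text through a Writer bound to two different charsets and compares the resulting bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of dom4j.
class EncodingSketch {
    /** Writes text through a Writer bound to the given charset and returns the raw bytes. */
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // try-with-resources closes (and therefore flushes) the writer.
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new IllegalStateException("writing failed", e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e with acute accent
        System.out.println(encode(text, StandardCharsets.UTF_8).length);      // two bytes
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length); // one byte
    }
}
```

The characters handed to the Writer are the same in both calls; only the byte representation changes, which is why the encoding chosen for the stream must match the one announced to the reader of the file.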
import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Serializes a document to XML.
public class DeployFileCreator3 {
    private Document doc;

    public DeployFileCreator3(Document doc) {
        this.doc = doc;
    }

    public void serializetoXML(OutputStream out, String aEncodingScheme) throws Exception {
        OutputFormat outformat = OutputFormat.createPrettyPrint();
        outformat.setEncoding(aEncodingScheme);
        XMLWriter writer = new XMLWriter(out, outformat);
        writer.write(this.doc);
        writer.flush();
        writer.close(); // also closes the underlying stream
    }
}
3) Test case:
@Test
public void testSerializetoXML() {
    Foo foo = new Foo();
    Document doc = foo.createDocument();
    DeployFileCreator3 creator = new DeployFileCreator3(doc);
    try {
        creator.serializetoXML(new FileOutputStream(base + "serializable.xml"), "UTF-8");
        System.out.println("serializable successfully.");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <author name="Toby" location="Germany">Tobias Rademacher</author>
  <author name="James" location="UK">James Strachan</author>
</root>
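【4.1】Customizing the output format
1) intro: in other words, you can define the output format of the XML yourself, including the encoding (aEncodingScheme).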
import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// customize output format.
public class DeployFileCreator4 {
    private Document doc;
    private OutputFormat outFormat;

    public DeployFileCreator4(Document doc) {
        this.outFormat = OutputFormat.createPrettyPrint();
        this.doc = doc;
    }

    public DeployFileCreator4(Document doc, OutputFormat outFormat) {
        this.doc = doc;
        this.outFormat = outFormat;
    }

    public void writeAsXML(OutputStream out) throws Exception {
        XMLWriter writer = new XMLWriter(out, this.outFormat);
        writer.write(this.doc);
        writer.flush(); // make sure buffered output reaches the stream
    }

    public void writeAsXML(OutputStream out, String encoding) throws Exception {
        this.outFormat.setEncoding(encoding);
        this.writeAsXML(out);
    }
}
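2) An interesting feature of OutputFormat is the ability to set the character encoding. It is good practice to set the encoding of XMLWriter through this mechanism, because the same encoding is then used both for writing to the OutputStream and for the XML declaration in the output.
3) Test case: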
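The point that the declared encoding and the bytes on the stream should agree can be checked with a small round trip. The sketch below deliberately uses JAXP's Transformer from the JDK as a stand-in, not dom4j's OutputFormat, so it shows the principle rather than dom4j's behavior. DeclaredEncodingSketch, serialize and readBack are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class; uses JAXP, not dom4j.
class DeclaredEncodingSketch {
    /** Serializes <root>text</root>; one setting drives both the declaration and the bytes. */
    static byte[] serialize(String text, String encoding) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.setTextContent(text);
            doc.appendChild(root);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    /** Parses the bytes again; the parser relies on the encoding named in the declaration. */
    static String readBack(byte[] bytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize("caf\u00e9", "ISO-8859-1");
        System.out.println(readBack(bytes));
    }
}
```

Because a single property controls both the XML declaration and the byte encoding, the parser can recover the original text; setting the two independently is where mismatches usually creep in.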
@Test
public void testCustomizeOutputFormat() {
    Foo foo = new Foo();
    Document doc = foo.createDocument();
    OutputFormat format = OutputFormat.createCompactFormat();
    format.setEncoding("UTF-8");
    DeployFileCreator4 creator = new DeployFileCreator4(doc, format);
    try {
        creator.writeAsXML(new FileOutputStream(base + "customizeFormat.xml"));
        System.out.println("successful customize format");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
<?xml version="1.0" encoding="UTF-8"?>
<root><author name="Toby" location="Germany">Tobias Rademacher</author><author name="James" location="UK">James Strachan</author></root>
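【5】Printing HTML
1) intro: HTMLWriter takes a dom4j tree and formats it to a stream as HTML. This formatter is similar to XMLWriter, but it outputs the text of CDATA and entity sections rather than their serialized XML form, and it supports the many HTML elements that have no end tag, such as <br>.
2) The code is as follows: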
import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.HTMLWriter;
import org.dom4j.io.OutputFormat;

public class PrintHTML {
    private Document doc;
    private OutputFormat outFormat;

    public PrintHTML(Document doc) {
        this.outFormat = OutputFormat.createPrettyPrint();
        this.doc = doc;
    }

    public PrintHTML(Document doc, OutputFormat outFormat) {
        this.doc = doc;
        this.outFormat = outFormat;
    }

    public void writeAsHTML(OutputStream out) throws Exception {
        HTMLWriter writer = new HTMLWriter(out, this.outFormat);
        writer.write(this.doc);
        writer.flush();
    }
}
3) Test case:
@Test
public void testPrintHTML() {
    Foo foo = new Foo();
    Document doc = foo.createDocument();
    PrintHTML creator = new PrintHTML(doc);
    try {
        creator.writeAsHTML(new FileOutputStream(base + "printHtml.html"));
        System.out.println("PrintHTML successfully");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
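The HTML convention of elements without an end tag, mentioned in the intro of this section, can also be observed without dom4j. The sketch below is a stand-in only: it uses the "html" output method of JAXP's Transformer from the JDK, not HTMLWriter, and the class HtmlOutputSketch is a hypothetical name. It builds a tiny tree containing a br element and serializes it as HTML.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class; uses JAXP, not dom4j's HTMLWriter.
class HtmlOutputSketch {
    /** Builds html/body with a br element and serializes it with the HTML output method. */
    static String toHtml() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            doc.appendChild(html);
            html.appendChild(body);
            body.appendChild(doc.createTextNode("line one"));
            body.appendChild(doc.createElement("br"));
            body.appendChild(doc.createTextNode("line two"));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "html"); // HTML rules instead of XML rules
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml());
    }
}
```

With XML output rules the empty element would be written in self-closing form; HTML output rules write the br start tag with no end tag, which is the behavior the intro attributes to HTMLWriter.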