dom4j-cookbook

【0】README
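1) This article is based on http://dom4j.sourceforge.net/dom4j-1.6.1/cookbook.html
2) intro:
2.1) dom4j is an object model that represents an XML tree in memory. It offers an easy-to-use API with powerful features for processing, manipulating, and navigating XML, and it works together with XPath, XSLT, SAX, JAXP, and DOM;
2.2) dom4j is designed around interfaces so that the implementation strategy is highly configurable. You can create your own XML tree implementation simply by providing a DocumentFactory implementation, which makes it easy to reuse dom4j code when extending it with the features you need;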

【1】Reading XML data
1) intro: dom4j ships with a set of builder classes that parse XML data and create a tree-like object structure. The following code reads XML data:
import java.io.File;
import java.net.URL;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class DeployFileLoaderSample {
 /** dom4j object model representation of an XML document. Note: we use the interface, not its implementation. */
 private Document doc;
   /**
    * Loads a document from a file.
    * @param aFile the data source
    * @throws DocumentException if parsing fails
    */
 public void parseWithSAX(File aFile) throws DocumentException {
  SAXReader xmlReader = new SAXReader();
  this.doc = xmlReader.read(aFile);
 }
 /**
  * Loads a document from a URL.
  * @param aURL the data source
  * @throws DocumentException if parsing fails
  */
 public void parseWithSAX(URL aURL) throws DocumentException {
  SAXReader xmlReader = new SAXReader();
  this.doc = xmlReader.read(aURL);
 }
 public Document getDoc() {
  return doc;
 }
}
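2) The code above shows how to use SAXReader to build a complete dom4j tree from a given file. The org.dom4j.io package contains a set of classes for creating and serializing XML objects. The read() method is overloaded so that you can pass objects representing different kinds of sources: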

java.lang.String - a SystemId, i.e. a String that contains a URI, e.g. a URL to an XML file
java.net.URL - represents a Uniform Resource Locator; encapsulates a URL
java.io.InputStream - an open input stream that transports XML data
java.io.Reader - more compatible; lets you specify the encoding scheme
org.xml.sax.InputSource - a single input source for an XML entity
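
The Reader entry in the list above is the one that lets the caller choose the character encoding. The following is a minimal sketch of that idea using only JDK classes (the JAXP DocumentBuilder rather than dom4j), so it runs without extra dependencies. The class name ReaderSourceSketch, the method rootText, and the sample XML are assumptions made for illustration. The same kind of InputSource could also be passed to SAXReader.read(InputSource).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of dom4j or of the original article.
final class ReaderSourceSketch {

    /** Decodes the bytes with the given charset, then parses the resulting characters. */
    static String rootText(byte[] xmlBytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(xmlBytes), charset)) {
            // Character-stream source: the parser receives characters that are already decoded.
            InputSource source = new InputSource(reader);
            org.w3c.dom.Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(source);
            return dom.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // No XML declaration, and the bytes are ISO-8859-1 rather than UTF-8.
        byte[] latin1 = "<name>caf\u00e9</name>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(latin1, StandardCharsets.ISO_8859_1));
    }
}
```

Because the charset is chosen when the Reader is created, the document does not need an encoding declaration for the non-ASCII character to be read correctly.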

2.1) Adding these overloads as separate methods gives DeployFileLoaderSample more flexibility; the code is the same as shown above;

3) Test case:

@Test
 public void readXML() {
  String base = System.getProperty("user.dir") + File.separator
    + "src" + File.separator;
 
  DeployFileLoaderSample sample = new DeployFileLoaderSample();
  try { // via parameter of URL type.
   sample.parseWithSAX(new URL("file:" + base + "pom.xml"));
   Document doc = sample.getDoc();
   System.out.println(doc.asXML());
  } catch (Exception e) {
   e.printStackTrace();
  }
 
  try { // via parameter of File type.
   sample.parseWithSAX(new File(base + "pom.xml"));
   Document doc = sample.getDoc();
   System.out.println(doc.asXML());
  } catch (Exception e) {
   e.printStackTrace();
  }
 }

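【2】Integrating dom4j with other XML APIs

1) intro: dom4j also provides classes for integrating with the two original XML processing APIs, SAX and DOM.

2) DOMReader class: lets you convert an existing DOM tree into a dom4j tree. You can convert a DOM document, a branch of DOM nodes, or a single element. The code is as follows: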

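import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.dom4j.Document;
import org.dom4j.io.DOMReader;

public class DOMIntegratorSample {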
 
 public DOMIntegratorSample() {}
 
 public org.w3c.dom.Document parse(URL url) {
  DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
  try {
   DocumentBuilder builder = factory.newDocumentBuilder();
   return builder.parse(url.toString());
  } catch (Exception e) {
   e.printStackTrace();
   return null;
  }
 }
 
 /** Converts a W3C DOM document into a dom4j document. */
 public Document buildDocument(org.w3c.dom.Document domDocument) {
  DOMReader xmlReader = new DOMReader();
  return xmlReader.read(domDocument);
 }
}

public String base = System.getProperty("user.dir") + File.separator
   + "src" + File.separator;

@Test // test case
 public void testIntegrate() {
  DOMIntegratorSample sample = new DOMIntegratorSample();
  try {
   org.w3c.dom.Document doc = sample.parse(new URL("file:"+ base + "pom.xml"));
   Document doc4j  = sample.buildDocument(doc);
   System.out.println(doc4j.asXML());
  } catch (Exception e) {
   e.printStackTrace();
  }
 }

【3】The secret of DocumentFactory

1) intro: creating a Document from scratch. The code is as follows:

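import org.dom4j.Document;
import org.dom4j.DocumentFactory;

public class GranuatedDeployFileCreator {
 private DocumentFactory factory;
 private Document doc;

 public GranuatedDeployFileCreator() {
  this.factory = DocumentFactory.getInstance(); // shared singleton factory
 }
 public void generateDoc(String aRootElement) {
  doc = factory.createDocument();
  doc.addElement(aRootElement); // the new element becomes the document root
 }
 /** Returns the generated document; used by the test case. */
 public Document getDoc() {
  return doc;
 }
}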

1.1) Test case:

@Test
 public void testGenerateDoc() {
  GranuatedDeployFileCreator creator = new GranuatedDeployFileCreator();
 
  creator.generateDoc("project");
  Document doc = creator.getDoc();
  System.out.println(doc.asXML());
 }

2) The Document and Element interfaces offer many helper methods that make it simple to build an XML document dynamically;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Foo {
 
 public Foo() {}
 
 public Document createDocument() {
  Document document = DocumentHelper.createDocument();
  Element root = document.addElement("root");
  Element author2 = root.addElement("author").addAttribute("name", "Toby").addAttribute("location", "Germany")
    .addText("Tobias Rademacher");
  Element author1 = root.addElement("author").addAttribute("name", "James").addAttribute("location", "UK")
    .addText("James Strachan");
  return document;
 }
}

2.1) Test case:

@Test
 public void testCreateDocByHelper() {
  Foo foo = new Foo();
 
  Document doc = foo.createDocument();
  System.out.println(doc.asXML());
 }

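2.2) dom4j is an interface-based API. This means that DocumentFactory and the reader classes in dom4j always work with the org.dom4j interfaces rather than with their implementation classes. The Collections API and the W3C DOM take the same approach;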
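
The interface-first style can be seen in the two other APIs this point names. The following is a minimal sketch using only the JDK; the class InterfaceStyleSketch and its method names are hypothetical. The caller works with java.util.List and org.w3c.dom.Document and never names the concrete classes behind them.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration, not taken from the dom4j cookbook.
final class InterfaceStyleSketch {

    /** Collections API: the result is typed by the List interface, not by ArrayList. */
    static List<String> authors() {
        List<String> names = new ArrayList<>();
        names.add("Toby");
        names.add("James");
        return names;
    }

    /** W3C DOM: the factory decides which Document implementation is returned. */
    static Document emptyDom(String rootName) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = dom.createElement(rootName);
            dom.appendChild(root);
            return dom;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no JAXP parser available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(authors());
        System.out.println(emptyDom("root").getDocumentElement().getTagName());
    }
}
```

dom4j follows the same pattern: DocumentFactory and DocumentHelper hand back org.dom4j.Document and org.dom4j.Element, as in the Foo class above.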

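2.3) Once you have parsed or created a document, you will want to serialize it to disk or to a plain stream. dom4j provides a set of classes that serialize your dom4j tree in four ways: XML, HTML, DOM, and SAX events;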


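【4】Serializing to XML

1) intro: pass an output stream to the XMLWriter constructor together with the desired character encoding. A Writer is easier to use than an OutputStream because a Writer is character-based and therefore has fewer encoding problems. The write() method of XMLWriter is overloaded, so you can write out individual dom4j objects as needed;

2) The code is as follows:

import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Serializes a document to XML.
public class DeployFileCreator3 {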
 private Document doc;
 public DeployFileCreator3(Document doc) {
  this.doc = doc;
 }
 
 public void serializetoXML(OutputStream out, String aEncodingScheme) throws Exception {
  OutputFormat outformat = OutputFormat.createPrettyPrint();
  outformat.setEncoding(aEncodingScheme);
  XMLWriter writer = new XMLWriter(out, outformat);
  writer.write(this.doc);
  writer.flush();
  writer.close();
 }
 
}

3) Test case and the resulting file:

@Test
 public void testSerializetoXML() {
  Foo foo = new Foo();
 
  Document doc = foo.createDocument();
  DeployFileCreator3 creator = new DeployFileCreator3(doc);
  try {
   creator.serializetoXML(new FileOutputStream(base + "serializable.xml"),
     "UTF-8");
   System.out.println("serializable successfully.");
  } catch (Exception e) {
   e.printStackTrace();
  }
 }

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <author name="Toby" location="Germany">Tobias Rademacher</author>
  <author name="James" location="UK">James Strachan</author>
</root>

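【4.1】Customizing the output format

1) intro: in other words, you can define the output format of the XML, including the encoding (aEncodingScheme in the previous example);

import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

 // customize output format.
public class DeployFileCreator4 {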
 private Document doc;
 private OutputFormat outFormat;
 public DeployFileCreator4(Document doc) {
  this.outFormat = OutputFormat.createPrettyPrint();
  this.doc = doc;
 }
 public DeployFileCreator4(Document doc, OutputFormat outFormat) {
  this.doc = doc;
  this.outFormat = outFormat;
 }
 public void writeAsXML(OutputStream out) throws Exception {
  XMLWriter writer = new XMLWriter(out, this.outFormat);
  writer.write(this.doc);
  writer.flush(); // make sure buffered output reaches the stream
 }
 public void writeAsXML(OutputStream out, String encoding) throws Exception {
  this.outFormat.setEncoding(encoding);
  this.writeAsXML(out);
 }
}

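2) An interesting feature of OutputFormat is the ability to set the character encoding. It is good practice to set the XMLWriter encoding through this mechanism, because the same encoding is then used both to create the writer around the OutputStream and to emit the XML declaration.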
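
Why the declared encoding and the stream encoding should come from the same place can be shown without dom4j. The following is a minimal sketch with plain JDK classes; EncodingMismatchSketch and roundTrip are hypothetical names. Text is encoded with one charset and decoded with another, which is roughly what happens when the XML declaration names a different encoding from the one used to write the bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not part of dom4j.
final class EncodingMismatchSketch {

    /** Encodes text with one charset and decodes the bytes with another. */
    static String roundTrip(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        String name = "Andr\u00e9"; // contains one non-ASCII character

        // Same charset on both sides: the text survives.
        System.out.println(name.equals(
                roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8))); // prints true

        // Written as ISO-8859-1 but read as UTF-8: the accented character is lost.
        System.out.println(name.equals(
                roundTrip(name, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8))); // prints false
    }
}
```

Setting the encoding once on OutputFormat and letting XMLWriter build its writer from it, as DeployFileCreator4 does, keeps the two sides in sync.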

3) Test case and the resulting file:

@Test
 public void testCustomizeOutputFormat() {
  Foo foo = new Foo();
 
  Document doc = foo.createDocument();
  OutputFormat format = OutputFormat.createCompactFormat();
  format.setEncoding("UTF-8");
  DeployFileCreator4 creator = new DeployFileCreator4(
    doc, format);
  try {
   creator.writeAsXML(new FileOutputStream(base + "customizeFormat.xml"));
   System.out.println("successful customize format");
  } catch (Exception e) {
   e.printStackTrace();
  }
 }

<?xml version="1.0" encoding="UTF-8"?>
<root><author name="Toby" location="Germany">Tobias Rademacher</author><author name="James" location="UK">James Strachan</author></root>

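【5】Printing HTML

1) intro: HTMLWriter takes a dom4j tree and formats it to a stream as HTML. This formatter is similar to XMLWriter, but it outputs the text of CDATA and entity sections rather than their serialized XML form, and it supports the HTML elements that have no closing tag, such as <br>;

2) The code is as follows:

import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.HTMLWriter;
import org.dom4j.io.OutputFormat;

public class PrintHTML {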
	private Document doc;
	private OutputFormat outFormat;

	public PrintHTML(Document doc) {
		this.outFormat = OutputFormat.createPrettyPrint();
		this.doc = doc;
	}

	public PrintHTML(Document doc, OutputFormat outFormat) {
		this.doc = doc;
		this.outFormat = outFormat;
	}

	public void writeAsHTML(OutputStream out) throws Exception {
		HTMLWriter writer = new HTMLWriter(out, this.outFormat);
		writer.write(this.doc);
		writer.flush();
	}
}

3) Test case:

@Test
	public void testPrintHTML() {
		Foo foo = new Foo();
		
		Document doc = foo.createDocument();
		PrintHTML creator = new PrintHTML(doc);
		try {
			creator.writeAsHTML(new FileOutputStream(base + "printHtml.html"));
			System.out.println("PrintHTML successfully");
		} catch (Exception e) {
			e.printStackTrace();
		}
	}



