dom4j Study Notes


I. Introduction to dom4j


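dom4j is a library started by developers who split off from the JDOM project; it is used in Hibernate and JAXM, among others.
In terms of performance, the usual ranking is dom4j > JDOM > JAXP, although actual results depend on the document and the workload.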

II. The dom4j API


The DocumentHelper class provides:


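(1)Document document = DocumentHelper.createDocument(); // creates an empty Document, typically the starting point for a new XML document

(2)Element element = DocumentHelper.createElement("name"); // creates a detached element (tag) with the given name

(3)Document document = DocumentHelper.parseText(xml); // parses an XML string into a DOM tree rooted at a Document; throws DocumentException on malformed input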


The SAXReader class provides:


(1)SAXReader reader = new SAXReader();

(2)Document document = reader.read(new File("1.xml")); // reads and parses 1.xml and returns the resulting Document
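
As a minimal sketch of the two calls above, the following reads XML through SAXReader from an in-memory string instead of a file. It assumes dom4j 2.x is on the classpath; the class name SaxReaderSketch and the helper rootName are made up for illustration.

```java
import java.io.StringReader;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class SaxReaderSketch {
    // Parses the given XML text with SAXReader and returns the root element's name.
    static String rootName(String xml) throws DocumentException {
        SAXReader reader = new SAXReader();
        // read(Reader) works like read(File), but takes its input from memory.
        Document document = reader.read(new StringReader(xml));
        return document.getRootElement().getName();
    }

    public static void main(String[] args) throws DocumentException {
        System.out.println(rootName("<people><person/></people>"));
    }
}
```

Reading from a StringReader keeps the example independent of any file on disk; for real files, reader.read(new File(...)) is used exactly as in the list above.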


The Document interface provides:


(1)String text = document.asXML(); // serializes the DOM tree to an XML string

(2)Element root = document.getRootElement(); // returns the root element


The Element interface provides:


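(1)Element newelem = elem.addElement("child"); // appends a child element named child and returns the new element

(2)newelem.addAttribute("name","value"); // adds an attribute to the element

(3)newelem.addText("xxxx"); // appends text content to the element

(4)newelem.getText(); // returns the element's text content

(5)String value = newelem.attributeValue("name"); // returns the value of the name attribute, or null if it is absent

(6)Iterator iter = newelem.attributeIterator(); // iterator over the element's attributes

(7)List childs = newelem.elements(); // all child elements

(8)Element child = newelem.element("name"); // the first child element named <name>, or null if there is none

(9)List childs = newelem.elements("name"); // all child elements named <name>

(10)newelem.remove(elem); // removes the child element elem from newelem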
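
The Element calls listed above can be combined as in the following sketch. It assumes dom4j 2.x (where elements(...) returns List<Element>); the class ElementApiSketch, its methods, and the sample data are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class ElementApiSketch {
    // Builds <people> with two <person> children, each with an id attribute and text.
    static Document build() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement("people");
        root.addElement("person").addAttribute("id", "1").addText("Ann");
        root.addElement("person").addAttribute("id", "2").addText("Bob");
        return document;
    }

    // Collects "id=text" for every <person> child of the root element.
    static List<String> describe(Document document) {
        List<String> out = new ArrayList<>();
        for (Element person : document.getRootElement().elements("person")) {
            out.add(person.attributeValue("id") + "=" + person.getText());
        }
        return out;
    }

    // Removes the first <person> child of the root, if there is one.
    static void removeFirstPerson(Document document) {
        Element root = document.getRootElement();
        Element first = root.element("person");
        if (first != null) {
            root.remove(first);
        }
    }

    public static void main(String[] args) {
        Document document = build();
        System.out.println(describe(document));
        removeFirstPerson(document);
        System.out.println(describe(document));
    }
}
```

With dom4j 1.6.x the same calls exist but return raw List values, so the loop would need a cast to Element.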


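The XMLWriter class provides:


(1)XMLWriter writer = new XMLWriter(out, format); // out is an OutputStream, format is an OutputFormat

(2)writer.write(document); // writes the document to the output

(3)writer.close(); // closes the XMLWriter and the underlying stream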


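The OutputFormat class provides:


(1)OutputFormat format = OutputFormat.createPrettyPrint(); // indented, human-readable output

(2)OutputFormat format = OutputFormat.createCompactFormat(); // compact output without extra whitespace

(3)format.setEncoding("UTF-8"); // sets the encoding declared in the <?xml ... ?> header; the default is UTF-8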
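
To make the difference between the two formats concrete, here is a small sketch that renders the same document both ways into memory. It assumes dom4j 2.x; the class OutputFormatSketch and the helper render are illustrative names, not part of dom4j.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class OutputFormatSketch {
    // Builds <a><b>1</b></a>.
    static Document sample() {
        Document document = DocumentHelper.createDocument();
        Element a = document.addElement("a");
        a.addElement("b").addText("1");
        return document;
    }

    // Writes the document with the given format and returns the result as a string.
    static String render(Document document, OutputFormat format) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        XMLWriter writer = new XMLWriter(bytes, format);
        writer.write(document);
        writer.close();
        // Decode with the same encoding the format declares (UTF-8 by default).
        return bytes.toString(format.getEncoding());
    }

    public static void main(String[] args) throws IOException {
        System.out.println(render(sample(), OutputFormat.createPrettyPrint()));
        System.out.println(render(sample(), OutputFormat.createCompactFormat()));
    }
}
```

The pretty format puts each element on its own indented line, while the compact format keeps the elements on one line.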


The Attribute interface provides:


(1)attr.setValue("value"); // sets the attribute's value

(2)String value = attr.getValue(); // returns the attribute's value


III. CRUD with dom4j


1.Create


Creating a document:

	// Builds a small <person> document and writes it to output.xml.
	private static void create() throws Exception {
		Document document = DocumentHelper.createDocument();
		Element person = DocumentHelper.createElement("person");
		document.add(person);	// person becomes the root element
		person.addElement("name").addAttribute("a", "x").addText("xiazdong");
		person.addElement("age").addText("20");
		OutputFormat format = OutputFormat.createPrettyPrint();
		format.setEncoding("utf-8");
		XMLWriter writer = new XMLWriter(new FileOutputStream("output.xml"),format);
		writer.write(document);
		writer.close();
	}

Inserting:

private static void insert(Document document) throws Exception {
	Element root = document.getRootElement();
	List list =  root.elements("person");
	Element person = (Element)list.get(1);	// the second <person> element
	Element tmpElement = person.addElement("tmpChild");
	tmpElement.setText("tmp");				// set the element's text
	tmpElement.addAttribute("tmpname", "tmpvalue");	// add an attribute
	Element tmp2 = DocumentHelper.createElement("tmpChild2");	// create a detached element
	tmp2.setText("tmp2");
	list.add(1,tmp2);	// insert at the given position; the list is backed by root, so the tree changes too
	XMLWriter writer = new XMLWriter(new FileOutputStream("1.xml"));
	writer.write(document);
	writer.close();
}
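
The positional insert relies on the list returned by elements() being backed by the parent element. The sketch below isolates that behavior in memory, assuming dom4j 2.x; the class InsertAtIndexSketch and the method insertSecond are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class InsertAtIndexSketch {
    // Inserts <tmpChild2> as the second child of the root and
    // returns the child element names in document order.
    static List<String> insertSecond(String xml) throws DocumentException {
        Document document = DocumentHelper.parseText(xml);
        Element root = document.getRootElement();

        Element extra = DocumentHelper.createElement("tmpChild2");
        extra.setText("tmp2");

        // The returned list is backed by root: adding to it adds to the tree.
        List<Element> children = root.elements();
        children.add(1, extra);

        // Re-query the tree to show that the change is visible there.
        List<String> names = new ArrayList<>();
        for (Element child : root.elements()) {
            names.add(child.getName());
        }
        return names;
    }

    public static void main(String[] args) throws DocumentException {
        System.out.println(insertSecond("<root><person/><person/></root>"));
    }
}
```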

2.Read


private static void read(Document document) throws Exception{
	Element root = document.getRootElement();
	Element person = (Element) root.elements("person").get(1);
	String value = person.element("name").getText();
	String attri = person.element("name").attributeValue("a");
	System.out.println(value);
	System.out.println(attri);
}


3.Update


private static void update(Document document) throws Exception{
	Element root = document.getRootElement();
	Element name = root.element("person").element("name");
	name.setText("xiazdong");				// update the element's text
	name.attribute("a").setValue("bb");		// update the attribute's value
	OutputFormat format = OutputFormat.createPrettyPrint();
	format.setEncoding("utf-8");
	XMLWriter writer = new XMLWriter(new FileOutputStream("1.xml"),format);
	writer.write(document);
	writer.close();
}

4.Delete

private static void delete(Document document) throws Exception {
	Element root = document.getRootElement();
	Element name = root.element("person").element("name");
	name.remove(name.attribute("a"));		// remove the attribute
	name.getParent().remove(name);			// remove the element
	OutputFormat format = OutputFormat.createPrettyPrint();
	format.setEncoding("utf-8");
	XMLWriter writer = new XMLWriter(new FileOutputStream("1.xml"),format);
	writer.write(document);
	writer.close();
}

IV. Garbled text (encoding problems)



When the document contains Chinese text, the written output may come out garbled.



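Solution:

format.setEncoding("UTF-8");

In addition, write the output through a byte stream (an OutputStream) rather than a character stream (a Writer), so that XMLWriter encodes the bytes with the same encoding that is declared in the XML header.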
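
A minimal sketch of this fix, assuming dom4j 2.x and a JDK that ships the GBK charset: the document is written to a byte stream, and the bytes only decode correctly with the encoding set on the OutputFormat. The class EncodingSketch and its methods are illustrative; the Chinese sample text is written with Unicode escapes so the source file's own encoding does not matter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class EncodingSketch {
    // Builds <name>中文</name> (the text is given as Unicode escapes).
    static Document sample() {
        Document document = DocumentHelper.createDocument();
        document.addElement("name").addText("\u4e2d\u6587");
        return document;
    }

    // Writes the document to a byte stream using the given encoding.
    static byte[] write(Document document, String encoding) throws IOException {
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding(encoding);   // declared in the header and used for the bytes
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        XMLWriter writer = new XMLWriter(bytes, format);
        writer.write(document);
        writer.close();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] gbk = write(sample(), "GBK");
        // Decoding with the declared encoding restores the text ...
        System.out.println(new String(gbk, "GBK"));
        // ... while decoding with a different one produces garbled characters.
        System.out.println(new String(gbk, "UTF-8"));
    }
}
```

The same mismatch happens when a Writer with the platform default encoding sits between XMLWriter and the file, which is why the byte stream is preferred.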




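Addendum: handling very large files with dom4j (for example, 100 GB)


The DOM approach reads the whole XML file into memory, so a file that is too large causes problems.

We can solve this with an ElementHandler: each branch node is processed as soon as it has been read, and then discarded.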

SAXReader reader = new SAXReader();
reader.addHandler("/subwaycard/card",    // fires for each <card> child of <subwaycard>
	new ElementHandler() {
		public void onEnd(ElementPath path) {   // called at </card>
			Element card = path.getCurrent();  // the <card> element that just ended
			// ... process card here ...
			card.detach();  // prune card from the DOM tree so its memory can be reclaimed
		}
		public void onStart(ElementPath path) { // called at <card>

		}

});

The handler's callbacks are invoked during the call Document document = reader.read(new File("1.xml"));
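
To see the pruning at work without a large file, the following sketch runs the same kind of handler over a small in-memory document and reports how many <card> elements the handler saw and how many remain in the tree afterwards. It assumes dom4j 2.x; the class CardHandlerSketch and the method countAndPrune are made-up names.

```java
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.ElementHandler;
import org.dom4j.ElementPath;
import org.dom4j.io.SAXReader;

public class CardHandlerSketch {
    // Returns { cards seen by the handler, cards left in the tree after parsing }.
    static int[] countAndPrune(String xml) throws DocumentException {
        final AtomicInteger seen = new AtomicInteger();
        SAXReader reader = new SAXReader();
        reader.addHandler("/subwaycard/card", new ElementHandler() {
            @Override
            public void onStart(ElementPath path) {
                // nothing to do when <card> opens
            }

            @Override
            public void onEnd(ElementPath path) {
                Element card = path.getCurrent();
                seen.incrementAndGet();   // stand-in for real per-card processing
                card.detach();            // drop the card so the tree stays small
            }
        });
        // The handler callbacks run while read(...) is parsing.
        Document document = reader.read(new StringReader(xml));
        int left = document.getRootElement().elements("card").size();
        return new int[] { seen.get(), left };
    }

    public static void main(String[] args) throws DocumentException {
        int[] result = countAndPrune(
                "<subwaycard><card id='1'/><card id='2'/><card id='3'/></subwaycard>");
        System.out.println("seen=" + result[0] + ", left=" + result[1]);
    }
}
```

Because every card is detached as soon as it ends, the tree never holds more than one card at a time, which is what keeps memory use bounded for large inputs.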


The following program performs a delete operation on a large file:

package org.xiazdong.xml;

import java.io.File;
import java.io.FileOutputStream;

import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.ElementHandler;
import org.dom4j.ElementPath;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public class ElementHandlerTest {
	public static void main(String[] args) throws Exception {
		SAXReader reader = new SAXReader();
		// Remove every <card> under <subwaycard> as soon as it has been parsed.
		reader.addHandler("/subwaycard/card",
				new ElementHandler() {
					public void onEnd(ElementPath path) {
						Element card = path.getCurrent();
						card.detach();	// prune the card from the tree
					}

					public void onStart(ElementPath path) {
						// nothing to do when <card> opens
					}

				});
		// The handler runs during read(); the returned document has no <card> elements left.
		Document document = reader.read(new File("1.xml"));
		OutputFormat format = OutputFormat.createPrettyPrint();
		format.setEncoding("GBK");
		// Reading has finished by now, so 1.xml can be overwritten.
		XMLWriter writer = new XMLWriter(new FileOutputStream("1.xml"),
				format);
		writer.write(document);
		writer.close();
	}
}


