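In Java, the four most commonly used ways to parse XML are SAX, DOM, DOM4J, and JDOM. SAX parsing is event-based and well suited to large documents. DOM, by contrast, loads the whole XML document into memory and builds a DOM tree, so parsing a large file can easily run out of memory. In real projects, choose whichever approach best fits the task.
Suppose we have the following XML file: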
<xml>
<userId><![CDATA[jianggujin]]></userId>
<userName><![CDATA[蒋固金]]></userName>
<birthday>1994-12-01</birthday>
</xml>
Looking at this XML, we can model it as an entity class named UserInfo.
import java.text.MessageFormat;

public class UserInfo {
    private String userId;
    private String userName;
    private String birthday;

    public String getUserId()
    {
        return userId;
    }

    public void setUserId(String userId)
    {
        this.userId = userId;
    }

    public String getUserName()
    {
        return userName;
    }

    public void setUserName(String userName)
    {
        this.userName = userName;
    }

    public String getBirthday()
    {
        return birthday;
    }

    public void setBirthday(String birthday)
    {
        this.birthday = birthday;
    }

    @Override
    public String toString()
    {
        return MessageFormat.format("[userId:{0},userName:{1},birthday:{2}]",
                userId, userName, birthday);
    }
}
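For convenience, the XML is extracted into an interface so that it does not have to be repeated in every parser class; each parser only needs to implement this interface.
public interface XMLContent {
    public String XML = "<xml><userId><![CDATA[jianggujin]]></userId><userName><![CDATA[蒋固金]]></userName><birthday>1994-12-01</birthday></xml>";
}
Next come simple examples of parsing this XML in each of the four ways.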
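The SAX approach needs a handler that extends org.xml.sax.helpers.DefaultHandler. The DefaultHandler methods we may need to override are:
Return type | Method | Description |
---|---|---|
void | startDocument() | Receives notification of the beginning of the document |
void | startElement(String uri, String localName, String qName, Attributes attributes) | Receives notification of the start of an element |
void | characters(char[] ch, int start, int length) | Receives notification of character data inside an element |
void | endElement(String uri, String localName, String qName) | Receives notification of the end of an element |
void | endDocument() | Receives notification of the end of the document |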
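To make the order of these callbacks concrete, here is a minimal sketch that only records which callback fired and when. The class name SaxEventOrderDemo and the helper method record are hypothetical and exist only for this illustration; the shortened sample XML is also an assumption.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class (not part of the original article):
// it records every SAX callback in the order the parser invokes it.
public class SaxEventOrderDemo extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        events.add("startElement:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the slice [start, start + length) of the array belongs to this event.
        events.add("characters:" + new String(ch, start, length));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement:" + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Parses the given XML string and returns the recorded callback names.
    public static List<String> record(String xml) throws Exception {
        SaxEventOrderDemo handler = new SaxEventOrderDemo();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : record("<xml><birthday>1994-12-01</birthday></xml>")) {
            System.out.println(event);
        }
    }
}
```

The document callbacks bracket everything else, and each element's start and end callbacks bracket the callbacks of its children. How many characters events a parser emits for one element is not fixed, so the sketch makes no assumption about that.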
Usage example:
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Parses XML with SAX.
 *
 * @author jianggujin
 */
public class SaxParser extends DefaultHandler implements XMLContent {
    private UserInfo userInfo = null;
    // Name of the element currently being read; null between elements.
    private String currentTag = null;

    @Override
    public void startDocument() throws SAXException
    {
        userInfo = new UserInfo();
        System.out.println("startDocument...");
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException
    {
        System.out.println("startElement...");
        currentTag = qName;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException
    {
        System.out.println("characters...");
        // A parser may deliver one element's text in several chunks; this
        // simple version assumes a single call per element.
        String value = new String(ch, start, length);
        System.out.println(value);
        if ("userId".equals(currentTag))
        {
            userInfo.setUserId(value);
        }
        else if ("userName".equals(currentTag))
        {
            userInfo.setUserName(value);
        }
        else if ("birthday".equals(currentTag))
        {
            userInfo.setBirthday(value);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException
    {
        System.out.println("endElement...");
        currentTag = null;
    }

    @Override
    public void endDocument() throws SAXException
    {
        System.out.println("endDocument...");
    }

    private UserInfo getUserInfo()
    {
        return userInfo;
    }

    public static UserInfo parse(InputStream inputStream) throws SAXException,
            ParserConfigurationException, IOException
    {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser()
                .getXMLReader();
        SaxParser handler = new SaxParser();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(inputStream));
        return handler.getUserInfo();
    }

    public static UserInfo parse(String xml) throws SAXException,
            ParserConfigurationException, IOException
    {
        return parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println(parse(XML));
    }
}
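One caveat about the handler above: the SAX API allows a parser to deliver an element's text through several characters() calls, so assigning the value on every call can end up keeping only the last chunk. The following is a minimal sketch of one common remedy, which buffers the text in a StringBuilder and stores it in endElement. The class name BufferingSaxDemo and the use of a Map in place of UserInfo are assumptions made to keep the sketch self-contained; they are not taken from the original code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant (not from the original article): collects element text
// across all characters() calls and stores it once the element ends.
public class BufferingSaxDemo extends DefaultHandler {
    private final Map<String, String> fields = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // A new element starts: discard whatever text was collected before it.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be invoked more than once for a single element, so append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only the three leaf elements of the sample document are stored.
        if ("userId".equals(qName) || "userName".equals(qName) || "birthday".equals(qName)) {
            fields.put(qName, text.toString());
        }
        text.setLength(0);
    }

    // Parses the XML string and returns element name -> text for the known fields.
    public static Map<String, String> parse(String xml) throws Exception {
        BufferingSaxDemo handler = new BufferingSaxDemo();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<xml><userId><![CDATA[jianggujin]]></userId>"
                + "<birthday>1994-12-01</birthday></xml>";
        System.out.println(parse(xml));
    }
}
```

The same buffering idea could be applied to SaxParser itself by calling the UserInfo setters from endElement instead of from characters.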
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/**
 * Parses XML with DOM.
 *
 * @author jianggujin
 */
public class DomParser implements XMLContent {
    public static void main(String[] args) throws Exception
    {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // The whole document is read into memory as a tree.
        Document document = builder.parse(new ByteArrayInputStream(XML
                .getBytes("UTF-8")));
        UserInfo userInfo = new UserInfo();
        // Each tag occurs once in the sample, so item(0) is the only match.
        userInfo.setUserId(document.getElementsByTagName("userId").item(0)
                .getTextContent());
        userInfo.setUserName(document.getElementsByTagName("userName").item(0)
                .getTextContent());
        userInfo.setBirthday(document.getElementsByTagName("birthday").item(0)
                .getTextContent());
        System.out.println(userInfo);
    }
}
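Because DOM keeps the whole tree in memory, the same fields can also be read by walking the children of the root element rather than searching by tag name for each one. The sketch below illustrates that idea under stated assumptions: the class name DomTreeWalkDemo and the helper toMap are hypothetical, and a Map stands in for UserInfo so the example compiles on its own.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class (not from the original article): walks the DOM tree
// and collects each child element of the root as name -> text.
public class DomTreeWalkDemo {
    public static Map<String, String> toMap(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = document.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes and comments between elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(child.getNodeName(), child.getTextContent());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<xml><userId><![CDATA[jianggujin]]></userId>"
                + "<birthday>1994-12-01</birthday></xml>";
        System.out.println(toMap(xml));
    }
}
```

Walking the children once avoids a separate document-wide search per field, which may matter when an element has many fields; for the three-field sample either style works.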
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

/**
 * Parses XML with dom4j.
 *
 * @author jianggujin
 */
public class Dom4jParser implements XMLContent {
    public static void main(String[] args) throws Exception
    {
        Document document = DocumentHelper.parseText(XML);
        Element rootElement = document.getRootElement();
        UserInfo userInfo = new UserInfo();
        // elementText returns the text of the named direct child element.
        userInfo.setUserId(rootElement.elementText("userId"));
        userInfo.setUserName(rootElement.elementText("userName"));
        userInfo.setBirthday(rootElement.elementText("birthday"));
        System.out.println(userInfo);
    }
}
import java.io.ByteArrayInputStream;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.input.SAXBuilder;

/**
 * Parses XML with JDOM.
 *
 * @author jianggujin
 */
public class JDomParser implements XMLContent {
    public static void main(String[] args) throws Exception
    {
        SAXBuilder builder = new SAXBuilder();
        Document document = builder.build(new ByteArrayInputStream(XML
                .getBytes("UTF-8")));
        Element rootElement = document.getRootElement();
        UserInfo userInfo = new UserInfo();
        // getChildText returns the text of the named direct child element.
        userInfo.setUserId(rootElement.getChildText("userId"));
        userInfo.setUserName(rootElement.getChildText("userName"));
        userInfo.setBirthday(rootElement.getChildText("birthday"));
        System.out.println(userInfo);
    }
}
Parsing XML in Java