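At work we sometimes have to deal with XML, for example in Java configuration files or in hardware-facing interface protocols, which usually exchange XML rather than JSON. To make the data easy to work with, we need to convert it into our own entity objects. So the question is: how do we efficiently parse XML into entity class objects?
There are currently four popular XML parsers in Java:
1. DOM parser
2. SAX parser
3. StAX parser
4. JAXB (not tested here, since it is somewhat more involved to use)
Besides these four, GitHub and other open-source platforms host many open-source XML parsing libraries. This post walks through the DOM, SAX, and StAX parsers with code, and then JDOM, a third-party library.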
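The DOM parser is the easiest Java XML parser to learn. It loads the whole XML file into memory, and we then traverse it node by node to parse the XML. The DOM parser works well for small files, but as file size grows it becomes slower and consumes more memory.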
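Before the full example, here is a minimal sketch of what node-by-node traversal looks like. The class name `DomWalkSketch` and the method `childElementNames` are made up for illustration, and the XML is parsed from a string rather than a file to keep the sketch self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the original post.
public class DomWalkSketch {

    // Returns the names of the element children of the root, in document order.
    public static List<String> childElementNames(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> names = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Whitespace between tags appears as text nodes, so filter on node type.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee/> <Employee/></Employees>";
        System.out.println(childElementNames(xml)); // prints [Employee, Employee]
    }
}
```

The node-type check matters in practice: without it, the whitespace between the two Employee tags would be visited as a text node.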
The test code is as follows.
Create a test file named employee.xml:
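<Employees>
    <Employee>
        <name>Pankaj</name>
        <age>544</age>
        <role>Java Developer</role>
        <gender>Male</gender>
    </Employee>
    <Employee>
        <name>Lisa</name>
        <age>35</age>
        <role>CSS Developer</role>
        <gender>Female</gender>
    </Employee>
</Employees>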
The DOMParse class:
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DOMParse {
    // The DOM parser suits small XML documents. Because it loads the complete file into
    // memory, it is a poor fit for large files; use the SAX parser for those.
    public static void main(String[] args) {
        String filePath = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        File xmlFile = new File(filePath);
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(xmlFile);
            doc.getDocumentElement().normalize();
            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            NodeList nodeList = doc.getElementsByTagName("Employee");
            // The XML is now loaded in memory as a Document; convert it to a list of objects
            List<Employee> empList = new ArrayList<>();
            for (int i = 0; i < nodeList.getLength(); i++) {
                empList.add(getEmployee(nodeList.item(i)));
            }
            // Print the employee list
            for (Employee emp : empList) {
                System.out.println(emp.toString());
            }
        } catch (SAXException | ParserConfigurationException | IOException e1) {
            e1.printStackTrace();
        }
    }

    // Builds an Employee from one <Employee> node
    private static Employee getEmployee(Node node) {
        Employee emp = new Employee();
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            emp.setName(getTagValue("name", element));
            emp.setAge(Integer.parseInt(getTagValue("age", element)));
            emp.setGender(getTagValue("gender", element));
            emp.setRole(getTagValue("role", element));
        }
        return emp;
    }

    // Returns the text of the first child element named tag; assumes the element exists and is non-empty
    private static String getTagValue(String tag, Element element) {
        NodeList nodeList = element.getElementsByTagName(tag).item(0).getChildNodes();
        Node node = nodeList.item(0);
        return node.getNodeValue();
    }
}
Output:
Root element :Employees
Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
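The Employee class itself is not shown in this post. The following is a minimal sketch of what it might look like, inferred only from the setters the parsers call and from the output format above; the field types and the getters are assumptions.

```java
// Hypothetical reconstruction of the Employee bean used throughout the examples.
// Field names and types are inferred from the setter calls and the printed output.
public class Employee {
    private String name;
    private int age;
    private String gender;
    private String role;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public String getGender() { return gender; }
    public void setGender(String gender) { this.gender = gender; }

    public String getRole() { return role; }
    public void setRole(String role) { this.role = role; }

    // Matches the "Employee:: Name=... Age=... Gender=... Role=..." lines in the output.
    @Override
    public String toString() {
        return "Employee:: Name=" + name + " Age=" + age
                + " Gender=" + gender + " Role=" + role;
    }
}
```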
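The Java SAX parser provides an API for parsing XML documents. Unlike the DOM parser, it does not load the complete XML into memory; it reads the document sequentially. It is an event-based parser, so we have to implement our own handler class to parse the XML file. For large XML files it outperforms the DOM parser in both time and memory use.
javax.xml.parsers.SAXParser provides methods for parsing XML documents with an event handler. This class wraps an XMLReader implementation and provides overloaded parse() methods that read XML from a File, an InputStream, a SAX InputSource, or a String URI.
The actual parsing work is done by the handler class, which we have to write ourselves. A handler implements the org.xml.sax.ContentHandler interface. This interface contains callback methods that are notified when an event occurs, for example start of document, end of document, start of element, end of element, and character data.
org.xml.sax.helpers.DefaultHandler provides a default implementation of the ContentHandler interface, and we can extend it to create our own handler. Extending this class is recommended because we usually need to implement only a few of the methods, which keeps the code cleaner and easier to maintain.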
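To illustrate why extending DefaultHandler keeps things small, here is a minimal sketch of a handler that overrides a single callback and counts elements with a given name. The names `ElementCounterSketch`, `CountingHandler`, and `count` are invented for this example, and the XML is read from a string via InputSource.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example, not part of the original post.
public class ElementCounterSketch {

    // Only startElement is overridden; every other callback keeps
    // the no-op behavior inherited from DefaultHandler.
    static class CountingHandler extends DefaultHandler {
        private final String target;
        private int count;

        CountingHandler(String target) {
            this.target = target;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if (qName.equals(target)) {
                count++;
            }
        }

        int getCount() {
            return count;
        }
    }

    // Counts how many elements named elementName occur in the given XML text.
    public static int count(String xml, String elementName) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            CountingHandler handler = new CountingHandler(elementName);
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.getCount();
        } catch (Exception e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee/><Employee/></Employees>";
        System.out.println(count(xml, "Employee")); // prints 2
    }
}
```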
We reuse the same employee.xml file.
Create our own handler class, EmployeeXMLHandler:
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EmployeeXMLHandler extends DefaultHandler {
    // List to hold the Employee objects
    private List<Employee> empList = null;
    private Employee emp = null;

    // Getter for the employee list
    public List<Employee> getEmpList() {
        return empList;
    }

    // Flags recording which field element we are currently inside
    boolean bAge = false;
    boolean bName = false;
    boolean bGender = false;
    boolean bRole = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        if (qName.equalsIgnoreCase("Employee")) {
            // Start of a new employee: create the object
            emp = new Employee();
            // Lazily initialize the list
            if (empList == null)
                empList = new ArrayList<>();
        } else if (qName.equalsIgnoreCase("name")) {
            // Set the flag so that characters() knows which field to fill
            bName = true;
        } else if (qName.equalsIgnoreCase("age")) {
            bAge = true;
        } else if (qName.equalsIgnoreCase("gender")) {
            bGender = true;
        } else if (qName.equalsIgnoreCase("role")) {
            bRole = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equalsIgnoreCase("Employee")) {
            // End of an employee: add the object to the list
            empList.add(emp);
        }
    }

    // Assumes each field's text arrives in a single characters() call,
    // which holds for the short values in employee.xml
    @Override
    public void characters(char ch[], int start, int length) throws SAXException {
        if (bAge) {
            // Inside <age>: set the employee's age
            emp.setAge(Integer.parseInt(new String(ch, start, length)));
            bAge = false;
        } else if (bName) {
            emp.setName(new String(ch, start, length));
            bName = false;
        } else if (bRole) {
            emp.setRole(new String(ch, start, length));
            bRole = false;
        } else if (bGender) {
            emp.setGender(new String(ch, start, length));
            bGender = false;
        }
    }
}
The test class XMLParserSAX:
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

public class XMLParserSAX {
    public static void main(String[] args) {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxParserFactory.newSAXParser();
            EmployeeXMLHandler handler = new EmployeeXMLHandler();
            saxParser.parse(new File("D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml"), handler);
            // Get the employee list collected by the handler
            List<Employee> empList = handler.getEmpList();
            // Print the employee information
            for (Employee emp : empList)
                System.out.println(emp);
        } catch (ParserConfigurationException | IOException | SAXException e) {
            e.printStackTrace();
        }
    }
}
Output:
Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
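SAX parser methods to override
The important methods to override are startElement(), endElement(), and characters().
When SAXParser parses the document, it calls startElement() each time it finds a start element. We override this method to set the boolean flags that identify which element we are in. We also use it to create a new Employee object each time an Employee start element is found. The attributes argument could be used here to read an attribute such as id, although this example does not do so.
SAXParser calls characters() when it finds character data inside an element. We use the boolean flags to set the value on the correct field of the Employee object.
endElement() is where we add the Employee object to the list, whenever we reach the Employee end tag.
SAXParserFactory provides factory methods for obtaining a SAXParser instance. We pass the File object to the parse method together with an EmployeeXMLHandler instance, which handles the callback events.
SAXParser is a little confusing at first, but if you are working with large XML documents it reads XML more efficiently than the DOM parser.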
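One caveat about the flag-based approach: the SAX API allows a parser to deliver the text of a single element in several characters() calls, so a flag that is cleared on the first call may keep only part of the value. A common alternative, sketched below under that assumption, is to buffer text in a StringBuilder and read it in endElement(). The names `BufferedTextSketch`, `TextHandler`, and `names` are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variation on the handler above, not part of the original post.
public class BufferedTextSketch {

    static class TextHandler extends DefaultHandler {
        private final StringBuilder buffer = new StringBuilder();
        private final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            // A new element begins: discard any text collected so far.
            buffer.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once per element, so append instead of assigning.
            buffer.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // The complete text of the element is available only here.
            if (qName.equals("name")) {
                names.add(buffer.toString().trim());
            }
        }

        List<String> getNames() {
            return names;
        }
    }

    // Collects the text of every <name> element in the given XML.
    public static List<String> names(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            TextHandler handler = new TextHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.getNames();
        } catch (Exception e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Text containing an entity reference is a typical case that parsers may split into chunks.
        String xml = "<Employees><Employee><name>Tom &amp; Jerry</name></Employee></Employees>";
        System.out.println(names(xml)); // prints [Tom & Jerry]
    }
}
```

For the short values in employee.xml the flag-based handler works as shown; buffering mainly pays off with long text or text that contains entity references.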
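The Streaming API for XML (StAX) is another way to process XML in Java. StAX contains two sets of APIs: a cursor-based API and an iterator-based API.
We again use the employee.xml file from above for testing.
Create the StaxXMLReader class, which uses the iterator-based API (XMLEventReader):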
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class StaxXMLReader {
    public static void main(String[] args) {
        String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        List<Employee> empList = parseXML(fileName);
        for (Employee emp : empList) {
            System.out.println(emp.toString());
        }
    }

    private static List<Employee> parseXML(String fileName) {
        List<Employee> empList = new ArrayList<>();
        Employee emp = null;
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        try {
            XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
            while (xmlEventReader.hasNext()) {
                XMLEvent xmlEvent = xmlEventReader.nextEvent();
                if (xmlEvent.isStartElement()) {
                    StartElement startElement = xmlEvent.asStartElement();
                    String tag = startElement.getName().getLocalPart();
                    if (tag.equals("Employee")) {
                        emp = new Employee();
                        // If Employee carried an id attribute, it could be read here:
                        // Attribute idAttr = startElement.getAttributeByName(new QName("id"));
                        // if (idAttr != null) { emp.setId(Integer.parseInt(idAttr.getValue())); }
                    } else if (tag.equals("age")) {
                        // Set the other fields from the child elements
                        xmlEvent = xmlEventReader.nextEvent();
                        // Note: if <age> may be empty, the next event is its end element
                        // rather than character data, so check for that first.
                        if (xmlEvent.isEndElement()) {
                            emp.setAge(1000); // placeholder default for an empty age
                        } else {
                            emp.setAge(Integer.parseInt(xmlEvent.asCharacters().getData()));
                        }
                    } else if (tag.equals("name")) {
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setName(xmlEvent.asCharacters().getData());
                    } else if (tag.equals("gender")) {
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setGender(xmlEvent.asCharacters().getData());
                    } else if (tag.equals("role")) {
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setRole(xmlEvent.asCharacters().getData());
                    }
                }
                // If the Employee end element is reached, add the employee to the list
                if (xmlEvent.isEndElement()) {
                    EndElement endElement = xmlEvent.asEndElement();
                    System.out.println("End tag reached: " + endElement.getName().getLocalPart());
                    if (endElement.getName().getLocalPart().equals("Employee")) {
                        empList.add(emp);
                    }
                }
            }
        } catch (FileNotFoundException | XMLStreamException e) {
            e.printStackTrace();
        }
        return empList;
    }
}
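When we use the StAX parser, we first create an XMLInputFactory to read the XML file. We can then choose the cursor-based API by creating an XMLStreamReader object to read the file. The XMLStreamReader next() method advances to the next parsing event and returns an int value that identifies the event type. Common event types include Start Document, Start Element, Characters, End Element, and End Document. XMLStreamConstants contains the int constants that can be used to handle each event according to its type.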
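The cursor loop described above can be sketched in a few lines. This example is only illustrative: the names `CursorEventsSketch` and `startElementNames` are made up, and the XML comes from a string rather than a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example, not part of the original post.
public class CursorEventsSketch {

    // Collects the local name of every start element, in document order.
    public static List<String> startElementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // next() moves the cursor forward and returns the event type as an int.
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(startElementNames("<a><b>x</b><c/></a>")); // prints [a, b, c]
    }
}
```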
The test class StaxXMLReader2, which uses the cursor-based API:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxXMLReader2 {
    // Flags recording which field element the cursor is currently inside
    private static boolean bName;
    private static boolean bAge;
    private static boolean bGender;
    private static boolean bRole;

    public static void main(String[] args) {
        String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        List<Employee> empList = parseXML(fileName);
        for (Employee emp : empList) {
            System.out.println(emp.toString());
        }
    }

    private static List<Employee> parseXML(String fileName) {
        List<Employee> empList = new ArrayList<>();
        Employee emp = null;
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileInputStream(fileName));
            int event = xmlStreamReader.getEventType();
            while (true) {
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        if (xmlStreamReader.getLocalName().equals("Employee")) {
                            emp = new Employee();
                            // emp.setId(Integer.parseInt(xmlStreamReader.getAttributeValue(0)));
                        } else if (xmlStreamReader.getLocalName().equals("name")) {
                            bName = true;
                        } else if (xmlStreamReader.getLocalName().equals("age")) {
                            bAge = true;
                        } else if (xmlStreamReader.getLocalName().equals("role")) {
                            bRole = true;
                        } else if (xmlStreamReader.getLocalName().equals("gender")) {
                            bGender = true;
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (bName) {
                            emp.setName(xmlStreamReader.getText());
                            bName = false;
                        } else if (bAge) {
                            emp.setAge(Integer.parseInt(xmlStreamReader.getText()));
                            bAge = false;
                        } else if (bGender) {
                            emp.setGender(xmlStreamReader.getText());
                            bGender = false;
                        } else if (bRole) {
                            emp.setRole(xmlStreamReader.getText());
                            bRole = false;
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        // End of an employee: add the object to the list
                        if (xmlStreamReader.getLocalName().equals("Employee")) {
                            empList.add(emp);
                        }
                        break;
                }
                if (!xmlStreamReader.hasNext())
                    break;
                event = xmlStreamReader.next();
            }
        } catch (FileNotFoundException | XMLStreamException e) {
            e.printStackTrace();
        }
        return empList;
    }
}
Output:
Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
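As a possible simplification of the flag-based cursor loop, XMLStreamReader also offers getElementText(), which reads the text of a text-only element and leaves the cursor on its end tag. The sketch below assumes the same name and age elements as employee.xml; the names `ElementTextSketch` and `namesAndAges` are invented, and joining the two values with a colon is just for display.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example, not part of the original post.
public class ElementTextSketch {

    // Returns one "name:age" string per <Employee> element.
    public static List<String> namesAndAges(String xml) {
        List<String> result = new ArrayList<>();
        String name = null;
        String age = null;
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = reader.getLocalName();
                    if (tag.equals("name")) {
                        // Reads the element's text and moves the cursor to its end tag.
                        name = reader.getElementText();
                    } else if (tag.equals("age")) {
                        // An empty <age></age> yields an empty string, so no flag is needed.
                        age = reader.getElementText();
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("Employee")) {
                    result.add(name + ":" + age);
                    name = null;
                    age = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee><name>Lisa</name><age>35</age></Employee></Employees>";
        System.out.println(namesAndAges(xml)); // prints [Lisa:35]
    }
}
```

Because getElementText() consumes the whole element, the end tags of name and age are never seen by the loop; only the Employee end tag is.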
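JDOM provides an excellent Java XML parser API that makes it easy to read, edit, and write XML documents. It offers wrapper classes that let us choose the underlying implementation from the SAX parser, the DOM parser, the StAX event parser, and the StAX stream parser.
Add the Maven dependency: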
<dependency>
    <groupId>org.jdom</groupId>
    <artifactId>jdom2</artifactId>
    <version>2.0.6</version>
</dependency>
The test class JDOMXMLReader:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.DOMBuilder;
import org.jdom2.input.SAXBuilder;
import org.jdom2.input.StAXEventBuilder;
import org.jdom2.input.StAXStreamBuilder;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class JDOMXMLReader {
    // With JDOM it is easy to switch between the SAX, DOM, and StAX parsers.
    public static void main(String[] args) {
        final String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        org.jdom2.Document jdomDoc;
        try {
            // A JDOM Document can be created from the DOM, SAX, and StAX builder classes
            jdomDoc = useDOMParser(fileName);
            // jdomDoc = useSAXParser(fileName);
            // jdomDoc = useSTAXParser(fileName, "stream");
            Element root = jdomDoc.getRootElement();
            List<Element> empListElements = root.getChildren("Employee");
            List<Employee> empList = new ArrayList<>();
            for (Element empElement : empListElements) {
                Employee emp = new Employee();
                // emp.setId(Integer.parseInt(empElement.getAttributeValue("id")));
                emp.setAge(Integer.parseInt(empElement.getChildText("age")));
                emp.setName(empElement.getChildText("name"));
                emp.setRole(empElement.getChildText("role"));
                emp.setGender(empElement.getChildText("gender"));
                empList.add(emp);
            }
            // Print the employee list
            for (Employee emp : empList)
                System.out.println(emp);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Get a JDOM document from the DOM parser
    private static org.jdom2.Document useDOMParser(String fileName)
            throws ParserConfigurationException, SAXException, IOException {
        // Create the W3C DOM Document first, then convert it
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(new File(fileName));
        DOMBuilder domBuilder = new DOMBuilder();
        return domBuilder.build(doc);
    }

    // Get a JDOM document from the SAX parser
    private static org.jdom2.Document useSAXParser(String fileName) throws JDOMException, IOException {
        SAXBuilder saxBuilder = new SAXBuilder();
        return saxBuilder.build(new File(fileName));
    }

    // Get a JDOM document from the StAX stream parser or the StAX event parser
    private static org.jdom2.Document useSTAXParser(String fileName, String type)
            throws FileNotFoundException, XMLStreamException, JDOMException {
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        if (type.equalsIgnoreCase("stream")) {
            StAXStreamBuilder staxBuilder = new StAXStreamBuilder();
            XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileInputStream(fileName));
            return staxBuilder.build(xmlStreamReader);
        }
        StAXEventBuilder staxBuilder = new StAXEventBuilder();
        XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
        return staxBuilder.build(xmlEventReader);
    }
}
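The benefit of JDOM is that we can easily switch between the SAX, DOM, and StAX parsers, and we can expose an interface that lets the client application choose the implementation.
The complete test code is available at: https://github.com/bo-zhang-1/Xml-Parser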