Parsing an XML message that contains CDATA with dom4j in Java


    • Background
    • Solution
      • Generate entity classes matching the node structure
        • Parser class
      • Notes
        • XStream requires three jars
      • Summary

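Background

A couple of days ago at work I called an external web service and found that its response format differed from the usual one. Parsing it kept failing, so I am writing down what I learned.
Message format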

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
   <soap:Body>
      <ns1:result xmlns:ns1="http://baidu.com/">
         <return><![CDATA[<?xml version="1.0" encoding="UTF-8"?><Response>
  <responsebody>
    <respInfo>
      <resultCode>1</resultCode>
      <resultMsg>success</resultMsg>
      <users>
        <user>
          <name>张三</name>
          <sex></sex>
          <age>18</age>
          <stuends>
            <stuend>
              <score>80</score>
              <height>20</height>
            </stuend>
          </stuends>
        </user>
      </users>
    </respInfo>
  </responsebody>
</Response>]]></return>
      </ns1:result>
   </soap:Body>
</soap:Envelope>

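Approach
Notice that in the middle of the XML message, part of the data is wrapped in <![CDATA[ ... ]]>, so the usual parsing approach does not work. We first need to get past the outer XML header, navigate to the return node, and then process the data inside the CDATA section.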
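The two-stage idea can be sketched without any third-party library. The following is a minimal illustration that uses the JDK's built-in DOM parser rather than dom4j, so it is not the code used later in this post; the class name TwoStageParseSketch, its method names, and the shortened sample message are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class: uses the JDK DOM parser, not dom4j.
class TwoStageParseSketch {

    // Shortened version of the SOAP response shown above.
    static final String SOAP_XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
            + "<soap:Body><ns1:result xmlns:ns1=\"http://baidu.com/\">"
            + "<return><![CDATA[<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<Response><responsebody><respInfo>"
            + "<resultCode>1</resultCode><resultMsg>success</resultMsg>"
            + "</respInfo></responsebody></Response>]]></return>"
            + "</ns1:result></soap:Body></soap:Envelope>";

    // Parse an XML string into a DOM document, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Stage 1: the text of <return> is the CDATA payload, i.e. the inner XML document.
    static String extractInnerXml(String soapXml) {
        Document outer = parse(soapXml);
        return outer.getElementsByTagName("return").item(0).getTextContent();
    }

    // Stage 2: parse the payload as its own document and read one field from it.
    static String readResultMsg(String soapXml) {
        Document inner = parse(extractInnerXml(soapXml));
        return inner.getElementsByTagName("resultMsg").item(0).getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(readResultMsg(SOAP_XML)); // prints: success
    }
}
```

The point of the sketch is only the order of operations: the outer document is parsed once to obtain a string, and that string is then parsed as a second, independent document.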

Solution

Generate entity classes matching the node structure

First create the entity classes (Lombok is required).
Response.java


import lombok.*;

/**
 * @Author: xs
 * @Description: Parse an XML message that contains CDATA with dom4j in Java
 * @Date:Create:in 2020/6/28 15:48
 * @Modified By:
 */
@Builder
@Data
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@NoArgsConstructor
public class Response {
    private ResponseBody responsebody;
}

ResponseBody.java


import lombok.*;

/**
 * @Author: xs
 * @Description:
 * @Date:Create:in 2020/6/28 15:49
 * @Modified By:
 */
@Builder
@Data
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@NoArgsConstructor
public class ResponseBody {
    private RespInfo respInfo;
}

RespInfo.java

import lombok.*;

import java.util.List;

/**
 * @Author: xs
 * @Description:
 * @Date:Create:in 2020/6/28 15:52
 * @Modified By:
 */
@Builder
@Data
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@NoArgsConstructor
public class RespInfo {
    private String resultCode;
    private String resultMsg;
    private List<Users> users;
}

Users.java

import lombok.*;

import java.util.List;

/**
 * @Author: xs
 * @Description:
 * @Date:Create:in 2020/6/28 15:53
 * @Modified By:
 */
@Builder
@Data
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@NoArgsConstructor
public class Users {
    private String name;
    private String sex;
    private String age;
    private List<Stuends> stuends;
}

Stuends.java

import lombok.*;

/**
 * @Author: xs
 * @Description:
 * @Date:Create:in 2020/6/28 15:55
 * @Modified By:
 */
@Builder
@Data
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@NoArgsConstructor
public class Stuends {
    private String score;
    private String height;
}

Parser class

testXml.java


import com.thoughtworks.xstream.XStream;
import com.thoughtworks.xstream.io.xml.DomDriver;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;


/**
 * @Author: xs
 * @Description:
 * @Date:Create:in 2020/6/28 15:56
 * @Modified By:
 */
public class testXml {

    public static void main(String[] args) {
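        String str = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">\n" +
                "   <soap:Body>\n" +
                "      <ns1:result xmlns:ns1=\"http://baidu.com/\">\n" +
                "         <return><![CDATA[<?xml version=\"1.0\" encoding=\"UTF-8\"?><Response>\n" +
                "  <responsebody>\n" +
                "    <respInfo>\n" +
                "      <resultCode>1</resultCode>\n" +
                "      <resultMsg>success</resultMsg>\n" +
                "      <users>\n" +
                "        <user>\n" +
                "          <name>张三</name>\n" +
                "          <sex></sex>\n" +
                "          <age>18</age>\n" +
                "          <stuends>\n" +
                "            <stuend>\n" +
                "              <score>80</score>\n" +
                "              <height>20</height>\n" +
                "            </stuend>\n" +
                "          </stuends>\n" +
                "        </user>\n" +
                "      </users>\n" +
                "    </respInfo>\n" +
                "  </responsebody>\n" +
                "</Response>]]></return>\n" +
                "      </ns1:result>\n" +
                "   </soap:Body>\n" +
                "</soap:Envelope>";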
        Response response = test(str);
        String name = response.getResponsebody().getRespInfo().getUsers().get(0).getName();
        System.out.println("name: "+name);
    }

    public static Response test(String xml){
        Response response = new Response();
        try{
            Document document = DocumentHelper.parseText(xml);
            String returnStr = document.getRootElement().element("Body").element("result").element("return").getText();
            // XStream is created with new DomDriver() here because the xpp3_min jar is missing;
            // with that jar on the classpath, new DomDriver() can be omitted.
            XStream xStream = new XStream(new DomDriver());
            xStream.alias("Response",Response.class);
            xStream.alias("responsebody",ResponseBody.class);
            xStream.alias("respInfo",RespInfo.class);
            xStream.alias("user",Users.class);
            xStream.alias("stuend",Stuends.class);

            Document document1 = DocumentHelper.parseText(returnStr);
            String returnXml = document1.getRootElement().asXML();
            response = (Response) xStream.fromXML(returnXml);
        }catch (Exception e){
            e.printStackTrace();
        }
        return response;
    }
}

Output

name: 张三

Process finished with exit code 0

Notes

XStream requires three jars

<dependency>
        <groupId>com.thoughtworks.xstream</groupId>
        <artifactId>xstream</artifactId>
        <version>1.4</version>
</dependency>

<dependency>
       <groupId>xmlpull</groupId>
       <artifactId>xmlpull</artifactId>
       <version>1.1.3.1</version>
</dependency>

<dependency>
       <groupId>xpp3</groupId>
       <artifactId>xpp3_min</artifactId>
       <version>1.1.4c</version>
</dependency>

If xpp3_min is missing, XStream reports the error below; passing new DomDriver() when creating the XStream instance avoids it.
[Screenshot 1: error message]
Using asXML() at the return level

String returnStr = document.getRootElement().element("Body").element("result").element("return").asXML();

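This fails with the following error:
[Screenshot 2: error message]
The reason is the difference between the two methods. getText() returns only the text content of the current node itself, without descending into child elements; if the node contains nothing but child elements, the result is empty or whitespace only, not null. For the message above, getText() on the return node yields:

<?xml version="1.0" encoding="UTF-8"?><Response>
  <responsebody>
    <respInfo>
      <resultCode>1</resultCode>
      <resultMsg>success</resultMsg>
      <users>
        <user>
          <name>张三</name>
          <sex></sex>
          <age>18</age>
          <stuends>
            <stuend>
              <score>80</score>
              <height>20</height>
            </stuend>
          </stuends>
        </user>
      </users>
    </respInfo>
  </responsebody>
</Response>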

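asXML(), in contrast, returns the whole node serialized as a string, from its opening tag to its closing tag, so the <return> tag and the CDATA markers are still present:

<return><![CDATA[<?xml version="1.0" encoding="UTF-8"?><Response>
  <responsebody>
    <respInfo>
      <resultCode>1</resultCode>
      <resultMsg>success</resultMsg>
      <users>
        <user>
          <name>张三</name>
          <sex></sex>
          <age>18</age>
          <stuends>
            <stuend>
              <score>80</score>
              <height>20</height>
            </stuend>
          </stuends>
        </user>
      </users>
    </respInfo>
  </responsebody>
</Response>]]></return>

Parsing that string again produces a document whose root is <return>, not <Response>, so asXML() cannot be used at this level.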
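To make the consequence concrete, here is a small sketch of what a second parse sees when the wrapper is kept. It uses the JDK DOM parser instead of dom4j so that it runs without extra jars, and the class WrapperRootSketch and its rootName helper are hypothetical names introduced only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class: uses the JDK DOM parser, not dom4j.
class WrapperRootSketch {

    // Parse the given XML string and return the name of its root element.
    static String rootName(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getNodeName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // What the text of <return> looks like: the bare inner document.
        String inner = "<Response><resultMsg>success</resultMsg></Response>";
        // What a serialized <return> element looks like: wrapper and CDATA markers included.
        String wrapped = "<return><![CDATA[" + inner + "]]></return>";

        System.out.println(rootName(inner));   // prints: Response
        System.out.println(rootName(wrapped)); // prints: return
    }
}
```

With the wrapper in place, the inner markup is just character data of <return>, so no alias registered for Response can ever match the root.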
Using getText() at the Response level

String returnXml = document1.getRootElement().getText();

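This fails with the following error:
[Screenshot 3: error message]
The reason is that getText() returns no markup here: <Response> contains only child elements, so its own text is just whitespace.
[Screenshot 4]
asXML(), on the other hand, returns: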

<Response>
  <responsebody>
    <respInfo>
      <resultCode>1</resultCode>
      <resultMsg>success</resultMsg>
      <users>
        <user>
          <name>张三</name>
          <sex></sex>
          <age>18</age>
          <stuends>
            <stuend>
              <score>80</score>
              <height>20</height>
            </stuend>
          </stuends>
        </user>
      </users>
    </respInfo>
  </responsebody>
</Response>
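The "own text only" behaviour described for getText() can be imitated with the JDK DOM API, which makes it easy to see why the result is useful on return but useless on Response. This is only an approximation written for illustration, not dom4j's implementation; the class DirectTextSketch and the helper directText are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class: uses the JDK DOM parser, not dom4j.
class DirectTextSketch {

    // Concatenate only the text and CDATA nodes that are direct children of the
    // root element; child elements are skipped rather than descended into.
    static String directText(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        StringBuilder text = new StringBuilder();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // <return> holds a CDATA node directly, so its own text is the payload.
        System.out.println(directText("<return><![CDATA[<Response/>]]></return>"));
        // <Response> holds only elements and indentation, so its own text is blank.
        String response = "<Response>\n  <resultMsg>success</resultMsg>\n</Response>";
        System.out.println(directText(response).trim().isEmpty()); // prints: true
    }
}
```

This is why the two levels need different calls: the payload lives in the text of return, while the content of Response lives in its child elements.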

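Summary

Going from the input xml to returnStr, we strip the outer XML declaration and SOAP envelope and reach the CDATA payload; going from returnStr to returnXml, we strip the XML declaration inside the CDATA payload.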
