XML文件的解析以及XML外部实体注入防护

XML外部实体注入
例:

InputStream is = Test01.class.getClassLoader().getResourceAsStream("evil.xml");//source
XMLInputFactory xmlFactory  = XMLInputFactory.newInstance();  
XMLEventReader reader = xmlFactory.createXMLEventReader(is);  //sink

如果evil.xml文件中包含如下内容,就可能会造成xml外部实体注入

 
 
 ]><foo>&xxe;foo>

XML文件的解析与XXE防护

DOM
DOM的全称是Document Object Model,也即文档对象模型。在应用程序中,基于DOM的XML分析器将一个XML文档转换成一个对象模型的集合(通常称DOM树),应用程序正是通过对这个对象模型的操作,来实现对XML文档数据的操作。

import javax.xml.parsers.DocumentBuilder;  
import javax.xml.parsers.DocumentBuilderFactory;  

...
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();  
        System.out.println("class name: " + dbf.getClass().getName());  
        // step 2:获得具体的dom解析器  
        DocumentBuilder db = dbf.newDocumentBuilder(); 
        // step3: 解析一个xml文档,获得Document对象(根结点)  
        Document document = db.parse(new File("candidate.xml"));  
        NodeList list = document.getElementsByTagName("PERSON");  


防护建议1

 DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
      // 这是优先选择. 如果不允许DTDs (doctypes) ,几乎可以阻止所有的XML实体攻击
      String FEATURE = "http://apache.org/xml/features/disallow-doctype-decl";
      dbf.setFeature(FEATURE, true);

        catch (ParserConfigurationException e) {
            // This should catch a failed setFeature feature
            logger.info("ParserConfigurationException was thrown. The feature '" +
                        FEATURE +
                        "' is probably not supported by your XML processor.");
            ...
        }
        catch (SAXException e) {
            // On Apache, this should be thrown when disallowing DOCTYPE
            logger.warning("A DOCTYPE was passed into the XML document");
            ...
        }
        catch (IOException e) {
            // XXE that points to a file that doesn't exist
            logger.error("IOException occurred, XXE may still possible: " + e.getMessage());
            ...
        }



防护建议2

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
      // 如果不能完全禁用DTDs,最少采取以下措施
      FEATURE = "http://xml.org/sax/features/external-general-entities";
      dbf.setFeature(FEATURE, false);

      FEATURE = "http://xml.org/sax/features/external-parameter-entities";
      dbf.setFeature(FEATURE, false);

      // and these as well, per Timothy Morgan's 2014 paper: "XML Schema, DTD, and Entity Attacks" (see reference below)
      dbf.setXIncludeAware(false);
      dbf.setExpandEntityReferences(false);

      // And, per Timothy Morgan: "If for some reason support for inline DOCTYPEs are a requirement, then ensure the entity settings are disabled (as shown above) and beware that SSRF attacks(http://cwe.mitre.org/data/definitions/918.html) and denial of service attacks (such as billion laughs or decompression bombs via "jar:") are a risk."
      ...

        catch (ParserConfigurationException e) {
            // This should catch a failed setFeature feature
            logger.info("ParserConfigurationException was thrown. The feature '" +
                        FEATURE +
                        "' is probably not supported by your XML processor.");
            ...
        }
        catch (SAXException e) {
            // On Apache, this should be thrown when disallowing DOCTYPE
            logger.warning("A DOCTYPE was passed into the XML document");
            ...
        }
        catch (IOException e) {
            // XXE that points to a file that doesn't exist
            logger.error("IOException occurred, XXE may still possible: " + e.getMessage());
            ...
        }



SAX
SAX的全称是Simple APIs for XML,也即XML简单应用程序接口。与DOM不同,SAX提供的访问模式是一种顺序模式,这是一种快速读写XML数据的方式。当使用SAX分析器对XML文档进行分析时,会触发一系列事件,并激活相应的事件处理函数,应用程序通过这些事件处理函数实现对XML文档的访问,因而SAX接口也被称作事件驱动接口。

import javax.xml.parsers.SAXParser;  
import javax.xml.parsers.SAXParserFactory;  

        SAXParserFactory factory = SAXParserFactory.newInstance();  
        //step2: 获得SAX解析器实例  
        SAXParser parser = factory.newSAXParser();  
        //step3: 开始进行解析  
        parser.parse(new File("student.xml"), new MyHandler());  

防护建议

参考DocumentBuilderFactory



JDOM
JDOM(Java-based Document Object Model)是一个开源项目,它基于树型结构,利用纯JAVA的技术对XML文档实现解析、生成、序列化以及多种操作。

import org.jdom.Attribute;  
import org.jdom.Document;  
import org.jdom.Element;  
import org.jdom.input.SAXBuilder;  
import org.jdom.output.Format;  
import org.jdom.output.XMLOutputter;  

...
        SAXBuilder builder = new SAXBuilder();  
        Document doc = builder.build(new File("jdom.xml"));  
        Element element = doc.getRootElement();  



DOM4J
DOM4J(Document Object Model for Java),采用Java集合框架,并完全支持DOM、SAX和JAXP

StAX
StAX(Streaming API for XML) 就是一种拉分析式的XML解析技术(基于流模型中拉模型的分析方式就称为拉分析)。StAX包括两套处理XML的API,分别提供了不同程度的抽象。它们是:基于指针的API和基于迭代器的API。
可以让我们使用基于指针的API的接口是javax.xml.stream.XMLStreamReader(很遗憾,你不能直接实例化它),要得到它的实例,我们需要借助于javax.xml.stream.XMLInputFactory类。

//获得一个XMLInputFactory实例
XMLInputFactory factory = XMLInputFactory.newInstance();  
//开始解析
XMLStreamReader reader = factory.createXMLStreamReader(new FileReader("users.xml"));  



防护建议

XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);  //会完全禁止DTD
XMLStreamReader reader = factory.createXMLStreamReader(new FileReader("users.xml"));  

参考 java解析xml文件的几种方式
参考 XML External Entity (XXE) Processing

你可能感兴趣的:(web安全,代码审计)