XML External Entity (XXE) Injection
Example:
InputStream is = Test01.class.getClassLoader().getResourceAsStream("evil.xml");//source
XMLInputFactory xmlFactory = XMLInputFactory.newInstance();
XMLEventReader reader = xmlFactory.createXMLEventReader(is); //sink
If evil.xml contains content like the following, parsing it can lead to XML external entity injection:
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "...">]><foo>&xxe;</foo>
DOM
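DOM stands for Document Object Model. A DOM-based XML parser converts an XML document into a collection of objects (usually called the DOM tree), and the application reads and modifies the XML data by operating on that object model.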
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
...
// step 1: obtain a DOM parser factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
System.out.println("class name: " + dbf.getClass().getName());
// step 2: obtain a concrete DOM parser
DocumentBuilder db = dbf.newDocumentBuilder();
// step 3: parse an XML document to obtain the Document object (the root node)
Document document = db.parse(new File("candidate.xml"));
NodeList list = document.getElementsByTagName("PERSON");
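To make the DOM tree model concrete, here is a minimal sketch that walks the parsed tree and collects text from child elements. The sample document, the NAME element, and the class and method names are assumptions made for illustration; only the PERSON lookup comes from the snippet above. The sketch also rejects DOCTYPE declarations so that it is not itself exposed to XXE.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomTreeSketch {
    // Hypothetical sample document; only the PERSON tag name comes from the text.
    static final String SAMPLE =
        "<PEOPLE><PERSON><NAME>Alice</NAME></PERSON>"
        + "<PERSON><NAME>Bob</NAME></PERSON></PEOPLE>";

    /** Parses the XML into a DOM tree and returns the NAME text of each PERSON. */
    static List<String> personNames(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations so this sketch is not XXE-prone.
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document document = db.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // The whole document is now in memory; navigate it as a tree.
            NodeList persons = document.getElementsByTagName("PERSON");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < persons.getLength(); i++) {
                Element person = (Element) persons.item(i);
                names.add(person.getElementsByTagName("NAME").item(0).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(personNames(SAMPLE));
    }
}
```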
Mitigation 1
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Declared outside the try block so the catch clauses can reference it
String FEATURE = null;
try {
    // Preferred option: if DTDs (DOCTYPEs) are disallowed, almost all XML entity attacks are prevented
    FEATURE = "http://apache.org/xml/features/disallow-doctype-decl";
    dbf.setFeature(FEATURE, true);
    // Create the DocumentBuilder and parse the document here; parsing is what can raise SAXException or IOException
} catch (ParserConfigurationException e) {
    // This should catch a failed setFeature feature
    logger.info("ParserConfigurationException was thrown. The feature '" +
        FEATURE +
        "' is probably not supported by your XML processor.");
    ...
}
catch (SAXException e) {
    // On Apache, this should be thrown when disallowing DOCTYPE
    logger.warning("A DOCTYPE was passed into the XML document");
    ...
}
catch (IOException e) {
    // XXE that points to a file that doesn't exist
    logger.error("IOException occurred, XXE may still be possible: " + e.getMessage());
    ...
}
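As a rough check of the effect of this setting, the following sketch parses an in-memory document with and without a DOCTYPE. The class name, method name, and the placeholder entity path are assumptions made for illustration. With the JDK's built-in parser, the document containing a DOCTYPE should be rejected with a SAXException before any entity is resolved.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class DisallowDoctypeSketch {
    static final String FEATURE = "http://apache.org/xml/features/disallow-doctype-decl";

    // Hypothetical payload; the SYSTEM path is a placeholder that is never read.
    static final String WITH_DOCTYPE =
        "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///nonexistent/placeholder\">]>"
        + "<foo>&xxe;</foo>";

    /** Returns true if the hardened parser accepts the document, false if it rejects it. */
    static boolean accepts(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setFeature(FEATURE, true);
            DocumentBuilder db = dbf.newDocumentBuilder();
            db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Raised for a DOCTYPE (or for any malformed input).
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain document accepted: " + accepts("<foo>ok</foo>"));
        System.out.println("DOCTYPE document accepted: " + accepts(WITH_DOCTYPE));
    }
}
```

The parser's default error handler may also print a fatal-error line to standard error when it rejects the DOCTYPE; that is expected.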
Mitigation 2
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Declared outside the try block so the catch clauses can reference it
String FEATURE = null;
try {
    // If DTDs cannot be disabled completely, take at least the following measures
    FEATURE = "http://xml.org/sax/features/external-general-entities";
    dbf.setFeature(FEATURE, false);
    FEATURE = "http://xml.org/sax/features/external-parameter-entities";
    dbf.setFeature(FEATURE, false);
    // and these as well, per Timothy Morgan's 2014 paper: "XML Schema, DTD, and Entity Attacks" (see reference below)
    dbf.setXIncludeAware(false);
    dbf.setExpandEntityReferences(false);
    // And, per Timothy Morgan: "If for some reason support for inline DOCTYPEs are a requirement, then ensure the entity settings are disabled (as shown above) and beware that SSRF attacks (http://cwe.mitre.org/data/definitions/918.html) and denial of service attacks (such as billion laughs or decompression bombs via "jar:") are a risk."
    ...
} catch (ParserConfigurationException e) {
    // This should catch a failed setFeature feature
    logger.info("ParserConfigurationException was thrown. The feature '" +
        FEATURE +
        "' is probably not supported by your XML processor.");
    ...
}
catch (SAXException e) {
    // On Apache, this should be thrown when disallowing DOCTYPE
    logger.warning("A DOCTYPE was passed into the XML document");
    ...
}
catch (IOException e) {
    // XXE that points to a file that doesn't exist
    logger.error("IOException occurred, XXE may still be possible: " + e.getMessage());
    ...
}
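The sketch below tries these settings against a document whose external entity points at a temporary file. The class name, method names, and marker string are assumptions made for illustration. The idea is that the DOCTYPE is still accepted, but the parser should not fetch the external entity, so the marker should never appear in the parsed text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class ExternalEntitiesOffSketch {
    // Hypothetical marker standing in for sensitive file contents.
    static final String MARKER = "dom-demo-marker";

    /** Parses with external entities disabled and returns the root element's text. */
    static String rootText(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
            dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            dbf.setXIncludeAware(false);
            dbf.setExpandEntityReferences(false);
            Document doc = dbf.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            String text = doc.getDocumentElement().getTextContent();
            return text == null ? "" : text;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    /** Writes the marker to a temp file, references it as an external entity, and parses. */
    static String textOfEntityDocument() {
        try {
            Path secret = Files.createTempFile("xxe-demo", ".txt");
            try {
                Files.write(secret, MARKER.getBytes(StandardCharsets.UTF_8));
                String xml = "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"" + secret.toUri() + "\">]>"
                    + "<foo>&xxe;</foo>";
                return rootText(xml);
            } finally {
                Files.deleteIfExists(secret);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = textOfEntityDocument();
        System.out.println("marker leaked: " + text.contains(MARKER));
    }
}
```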
SAX
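SAX stands for Simple API for XML. Unlike DOM, SAX accesses the document sequentially, which makes it a fast way to process XML data. As a SAX parser reads an XML document, it fires a series of events and invokes the corresponding event handlers; the application accesses the document through these handlers, which is why SAX is also called an event-driven interface.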
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
// step 1: obtain a SAX parser factory
SAXParserFactory factory = SAXParserFactory.newInstance();
// step 2: obtain a SAX parser instance
SAXParser parser = factory.newSAXParser();
// step 3: start parsing
parser.parse(new File("student.xml"), new MyHandler());
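The snippet above relies on a MyHandler class that is not shown. As a minimal sketch of the event-driven model, the handler below records each callback in order. Its body, the sample document, and the enclosing class name are assumptions made for illustration, not the original MyHandler.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventsSketch {
    /** Hypothetical handler: records every start tag, end tag, and non-blank text run. */
    static class MyHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Note: a parser may deliver one text run in several chunks.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text:" + text);
            }
        }
    }

    /** Parses the XML sequentially and returns the events in the order they fired. */
    static List<String> eventsFor(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Reject DOCTYPE declarations so this sketch is not XXE-prone.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            SAXParser parser = factory.newSAXParser();
            MyHandler handler = new MyHandler();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eventsFor("<students><student>Ann</student></students>"));
    }
}
```

Unlike DOM, nothing is retained after a callback returns unless the handler stores it, which is why the handler keeps its own list.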
Mitigation
See the DocumentBuilderFactory section above; SAXParserFactory accepts the same feature settings through setFeature.
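A minimal sketch of how the DocumentBuilderFactory settings might carry over to SAXParserFactory, which also exposes setFeature. The class name, method name, and placeholder entity path are assumptions made for illustration. It disallows DOCTYPEs and, as a fallback layer, disables external entities.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxHardeningSketch {
    // Hypothetical payload; the SYSTEM path is a placeholder that is never read.
    static final String WITH_DOCTYPE =
        "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///nonexistent/placeholder\">]>"
        + "<foo>&xxe;</foo>";

    /** Builds a SAX parser with the same features used for DocumentBuilderFactory. */
    static SAXParser hardenedParser() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);
        return factory.newSAXParser();
    }

    /** Returns true if the hardened parser accepts the document, false if it rejects it. */
    static boolean accepts(String xml) {
        try {
            hardenedParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // DOCTYPE present or input malformed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain document accepted: " + accepts("<foo>ok</foo>"));
        System.out.println("DOCTYPE document accepted: " + accepts(WITH_DOCTYPE));
    }
}
```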
JDOM
JDOM (Java-based Document Object Model) is an open-source project. It uses a tree structure and pure Java to parse, generate, serialize, and otherwise manipulate XML documents.
import org.jdom.Attribute;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;
...
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(new File("jdom.xml"));
Element element = doc.getRootElement();
DOM4J
DOM4J (Document Object Model for Java) builds on the Java Collections Framework and fully supports DOM, SAX, and JAXP.
StAX
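StAX (Streaming API for XML) is a pull-parsing technology for XML: in a stream-based model, parsing in which the application pulls events from the parser is called pull parsing. StAX comprises two APIs for processing XML that offer different levels of abstraction: a cursor-based API and an iterator-based API.
The cursor-based API is exposed through the javax.xml.stream.XMLStreamReader interface. It cannot be instantiated directly; to obtain an instance, use the javax.xml.stream.XMLInputFactory class.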
// obtain an XMLInputFactory instance
XMLInputFactory factory = XMLInputFactory.newInstance();
// start parsing
XMLStreamReader reader = factory.createXMLStreamReader(new FileReader("users.xml"));
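To illustrate what "pulling" means with the cursor-based API, here is a minimal sketch in which the application advances the cursor itself and inspects only the events it cares about. The sample document, the user element with its name attribute, and the class and method names are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxCursorSketch {
    // Hypothetical sample input standing in for users.xml.
    static final String SAMPLE =
        "<users><user name=\"alice\"/><user name=\"bob\"/></users>";

    /** Advances the cursor through the document and collects each user's name attribute. */
    static List<String> userNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Disable DTD support so this sketch is not XXE-prone.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                // The application, not the parser, decides when to move forward.
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "user".equals(reader.getLocalName())) {
                        names.add(reader.getAttributeValue(null, "name"));
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(userNames(SAMPLE));
    }
}
```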
Mitigation
XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // disables DTD support entirely
XMLStreamReader reader = factory.createXMLStreamReader(new FileReader("users.xml"));
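The following sketch applies the same property to the iterator-based API (XMLEventReader) and checks that an external entity pointing at a temporary file is not read. The class name, method names, and marker string are assumptions made for illustration. Setting IS_SUPPORTING_EXTERNAL_ENTITIES to false is an additional standard StAX property that is not part of the snippet above. Depending on the implementation, the parser may either reject the entity reference outright or simply not expand it, so the sketch treats both outcomes as safe.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StaxHardeningSketch {
    // Hypothetical marker standing in for sensitive file contents.
    static final String MARKER = "stax-demo-marker";

    /** Returns all character data, or an empty Optional if the parser rejects the input. */
    static Optional<String> readText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            // Extra layer beyond the original snippet: never resolve external entities.
            factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            try {
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isCharacters()) {
                        text.append(event.asCharacters().getData());
                    }
                }
            } finally {
                reader.close();
            }
            return Optional.of(text.toString());
        } catch (XMLStreamException e) {
            return Optional.empty();
        }
    }

    /** Writes the marker to a temp file, references it as an external entity, and parses. */
    static boolean leaksMarker() {
        try {
            Path secret = Files.createTempFile("xxe-demo", ".txt");
            try {
                Files.write(secret, MARKER.getBytes(StandardCharsets.UTF_8));
                String xml = "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"" + secret.toUri() + "\">]>"
                    + "<foo>&xxe;</foo>";
                return readText(xml).orElse("").contains(MARKER);
            } finally {
                Files.deleteIfExists(secret);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("marker leaked: " + leaksMarker());
    }
}
```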
Reference: java解析xml文件的几种方式 (Several ways to parse XML files in Java)
Reference: XML External Entity (XXE) Processing