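I originally wanted dom4j to read the XML straight from a URL, but doing so fails with:
org.dom4j.DocumentException: Error on line 1 of document : Content is not allowed in prolog. Nested exception: Content is not allowed in prolog.
After googling, the explanation I found was that there is an extra space somewhere. But how am I supposed to deal with an extra space in a URL?
Any pointers from the experts would be appreciated.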
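To illustrate what the parser is objecting to, here is a minimal sketch that uses the JDK's built-in parser instead of dom4j, so it runs without extra jars. The class name `PrologDemo` and the helper `isWellFormed` are made up for this illustration. It feeds the parser text that does not begin with XML markup (the address string itself) and then a small well-formed document; the first is rejected before any element is read, which is the situation the "Content is not allowed in prolog" message describes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class PrologDemo {
    // Hypothetical helper: returns true if the text parses as XML, false otherwise.
    public static boolean isWellFormed(String text)
            throws IOException, ParserConfigurationException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (SAXException e) {
            // The parser gives up as soon as it sees non-markup text at the start.
            System.out.println("Rejected: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // The address itself is not XML, so parsing its bytes fails on line 1.
        System.out.println(isWellFormed(
                "http://www.google.com/ig/api?hl=zh-cn&weather=changzhou"));
        // A real document is accepted.
        System.out.println(isWellFormed("<xml_api_reply><weather/></xml_api_reply>"));
    }
}
```

If the same error shows up with dom4j, one thing worth checking is whether the parser is being handed the response body or something else, such as the URL text or bytes that precede the first `<`.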
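Some XML contains Chinese text; Google's weather API serves as the example here.
So how do you get dom4j to read Chinese text in XML? (You need the two required jars, which you can download from the attachment.)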
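import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

private Document readXML(String url) throws IOException, DocumentException {
    SAXReader reader = new SAXReader();
    // Open the URL and read the response body. Calling url.getBytes() would
    // only yield the bytes of the address string, which is not XML.
    try (InputStream in = new URL(url).openStream()) {
        // Decode the response as GBK so the Chinese text is read correctly.
        InputStreamReader strInStream = new InputStreamReader(in, "GBK");
        return reader.read(strInStream);
    }
}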
With this, the Chinese text is read correctly.
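The reason the charset passed to `InputStreamReader` matters can be shown with a small in-memory sketch. It uses the JDK's own DOM parser rather than dom4j so it runs without extra jars; the class `CharsetDemo`, the method `readDataAttribute`, and the one-element sample document are all assumptions made for illustration, not part of the weather API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CharsetDemo {
    // Hypothetical helper: decode the bytes with the given charset, parse them,
    // and return the root element's "data" attribute.
    public static String readDataAttribute(byte[] xmlBytes, String charsetName)
            throws Exception {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(xmlBytes), Charset.forName(charsetName));
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(reader));
        return doc.getDocumentElement().getAttribute("data");
    }

    public static void main(String[] args) throws Exception {
        // \u6674 is the character for "sunny"; the sample element is made up.
        String xml = "<condition data=\"\u6674\"/>";
        byte[] gbkBytes = xml.getBytes(Charset.forName("GBK"));

        // Decoding with the charset the bytes were written in recovers the text.
        System.out.println(readDataAttribute(gbkBytes, "GBK").equals("\u6674"));
        // Decoding with a different charset parses, but the text is garbled.
        System.out.println(readDataAttribute(gbkBytes, "ISO-8859-1").equals("\u6674"));
    }
}
```

Because the parser receives a `Reader`, it works with characters that have already been decoded, so the charset given to `InputStreamReader` has to match the bytes actually sent by the server.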
Document weatherDoc = null;
try {
    weatherDoc = readXML("http://www.google.com/ig/api?hl=zh-cn&weather=changzhou");
    //weatherDoc = readXML(new File("c://api.xml"));
} catch (DocumentException | IOException e) {
    e.printStackTrace();
}
if (weatherDoc != null) {
    //List list = weatherDoc.selectNodes("//xml_api_reply/weather/forecast_information");
    Node node = weatherDoc.selectSingleNode("//xml_api_reply/weather/current_conditions/condition");
    // selectSingleNode returns null when nothing matches the XPath
    if (node != null) {
        System.out.println(node.getName());
        // Read the element's "data" attribute
        String name = node.valueOf("@data");
        System.out.println(name);
    }
}
The @data value is printed: today's weather is “晴” (sunny).
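For trying the XPath without network access, the lookup can be sketched against an inline sample using the JDK's `javax.xml.xpath` package in place of dom4j. The class `XPathDemo`, the method `conditionData`, and the sample document are assumptions: the element nesting is inferred from the XPath in the snippet above, not copied from a real API response.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathDemo {
    // Hypothetical helper: returns the "data" attribute of the condition
    // element, or an empty string when the path matches nothing.
    public static String conditionData(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(
                "//xml_api_reply/weather/current_conditions/condition/@data",
                new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Sample shaped after the XPath used in the post; \u6674 means "sunny".
        String sample = "<xml_api_reply><weather><current_conditions>"
                + "<condition data=\"\u6674\"/>"
                + "</current_conditions></weather></xml_api_reply>";
        System.out.println(conditionData(sample));
    }
}
```

Unlike dom4j's `selectSingleNode`, which returns null for a missing node, this string-valued evaluation returns an empty string when nothing matches, so the two need different checks.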