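SAX is based on an event-callback model. It is generally faster than DOM (which builds the whole document tree first), and it gives you more control over the parsing process. In Java you can use the built-in javax.xml.parsers.SAXParser or the parser that ships with Apache Xerces.
Parsing comes down to two things: a parser (SAXParser or XMLReader) and an event handler, also called a callback.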
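To make the parser-plus-handler pairing concrete, here is a minimal sketch that parses a small XML string and records which callbacks fire, in order. The class name SaxEventTrace, the trace method and the sample XML are made up for illustration. Only the JDK's own SAX classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX callbacks in the order they fire.
class SaxEventTrace {

    /** Parses the given XML text and returns the callbacks in firing order. */
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<String>();

        // The handler (callback object): the parser calls these methods as it reads.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) belongs to this event.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }
        };

        // The parser: created through the factory, then given input plus handler.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(trace("<item><title>Hello</title></item>"));
    }
}
```

The handler never returns anything to the parser. It only reacts to events, which is why state such as "which tag am I in" has to be kept in the handler itself.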
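Things to note:
1:
When a start tag is read, the parser calls public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {} and then public void characters(char[] ch, int start, int length) throws SAXException {} for the text that follows the tag.
2:
When an end tag is read, the parser calls
public void endElement(String uri, String localName, String qName) throws SAXException {}, and public void characters(char[] ch, int start, int length) throws SAXException {} may be called again afterwards.
The characters received at that point carry no real content. They are typically just the whitespace (line breaks and indentation) between tags.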
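3: The objects being built do not need to be passed back out of the callback object. The handler works on the same list that the calling code created (the constructor argument is a reference to that list, so both sides see the same object), and whatever the handler adds is visible to the caller. The code below illustrates this.
The following is the callback class, also called the handler or business object: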
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class RssItemHandler extends DefaultHandler {
    // Elements inside <item> whose text is collected.
    private static final List<String> ALLOWED =
            Arrays.asList("link", "pubDate", "description", "title");

    // The caller's list; finished items are appended to it.
    private final List<RssItem> items;
    private RssItem it = null;
    private String currentTag = null;
    // characters() may deliver one element's text in several chunks, so buffer it.
    private final StringBuilder text = new StringBuilder();

    public RssItemHandler(List<RssItem> items) {
        super();
        this.items = items;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        if ("item".equals(qName)) {
            it = new RssItem();
        }
        // Exact name match; a substring test would also accept names such as "tit".
        if (it != null && ALLOWED.contains(qName)) {
            currentTag = qName;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (currentTag != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (it != null && qName.equals(currentTag)) {
            String content = text.toString();
            if ("title".equals(currentTag)) {
                it.setTitle(content);
            } else if ("link".equals(currentTag)) {
                try {
                    it.setUrl(new URL(content));
                } catch (MalformedURLException e) {
                    e.printStackTrace();
                }
            } else if ("description".equals(currentTag)) {
                it.setDescription(content);
            }
            // pubDate is recognized but not stored here.
            currentTag = null;
        }
        if ("item".equals(qName)) {
            System.out.println(it.toString());
            items.add(it);
            it = null;
            currentTag = null;
        }
    }
}
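import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public static void main(String[] args) throws IOException {
    String xmluri = "http://news.csdn.net/rss_news.html";
    // The handler appends to this list, so the results are available here after parsing.
    List<RssItem> its = new ArrayList<>();
    // try-with-resources (Java 7) closes the stream once parsing is done.
    try (InputStream in = new URL(xmluri).openStream()) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        InputSource is = new InputSource(in);
        DefaultHandler dh = new RssItemHandler(its);
        parser.parse(is, dh);
        System.out.println(its.size());
        for (RssItem ri : its) {
            System.out.println("[RV]" + ri.toString());
        }
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }
}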
4: Prefer passing an InputSource as the first argument of the parse method.
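One reason an InputSource is convenient is that it can wrap either a byte stream or a character stream, and it can carry an explicit encoding and a system id. The sketch below shows both forms. The class name InputSourceDemo, the countElements helper, the sample XML and the example.com system id are all invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: the same parse call works for bytes and for characters.
class InputSourceDemo {

    /** Counts the elements in the document described by the given InputSource. */
    static int countElements(InputSource source) throws Exception {
        final int[] count = {0};
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(source, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rss><channel><item/><item/></channel></rss>";

        // Byte stream: the encoding can be stated explicitly.
        InputStream bytes = new ByteArrayInputStream(xml.getBytes("UTF-8"));
        InputSource fromBytes = new InputSource(bytes);
        fromBytes.setEncoding("UTF-8");
        // Hypothetical system id, used in error messages and for resolving relative URIs.
        fromBytes.setSystemId("http://example.com/feed.xml");

        // Character stream: the text is already decoded, so no encoding is needed.
        InputSource fromChars = new InputSource(new StringReader(xml));

        System.out.println(countElements(fromBytes));
        System.out.println(countElements(fromChars));
    }
}
```

Because countElements only sees an InputSource, the caller decides where the data comes from (a URL stream, a file, an in-memory string) without the parsing code changing.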
Finally, the runtime environment: Java 7
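As a footnote to notes 1 and 2, the following sketch collects the raw text of every characters call for an indented document, so the whitespace-only calls between tags become visible. The class name WhitespaceEvents, the rawChunks method and the sample XML are assumptions made for this illustration. How the parser splits text into chunks is up to the implementation, so the exact number of calls may differ between parsers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: shows that characters() also fires for whitespace between tags.
class WhitespaceEvents {

    /** Returns the untrimmed text of every characters() call, in order. */
    static List<String> rawChunks(String xml) throws Exception {
        final List<String> chunks = new ArrayList<String>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                chunks.add(new String(ch, start, length));
            }
        });
        return chunks;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<item>\n  <title>Hi</title>\n</item>";
        for (String chunk : rawChunks(xml)) {
            // Make line breaks visible and flag whitespace-only chunks.
            String visible = chunk.replace("\n", "\\n");
            boolean blank = chunk.trim().isEmpty();
            System.out.println("[" + visible + "]" + (blank ? " (whitespace only)" : ""));
        }
    }
}
```

A handler that only cares about element content can skip the whitespace-only chunks, or buffer text between startElement and endElement and ignore everything else.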