可扩展标记语言(XML)是一个以机器可读的形式编码文档的规则的集合。XML是一种流行的在网络上分享数据的格式。频繁地更新他们的内容的网站,比如新闻站点或博客,经常提供一个XML feed,以使外部的程序能够了解内容变化的最新情况。对于联网apps来说,上传和解析XML数据是一种常见的任务。这一节将解释如何解析XML文档并使用它们的数据。
我们建议使用XmlPullParser,它是Android上解析XML的一种高效的和可维护的方式。在历史上,Android有这个接口的两种实现:
两种中的任何一种选择都挺好。这一节的例子使用了ExpatPullParser,通过Xml.newPullParser()。
解析一个feed的第一步是确定你对哪些fields感兴趣。Parser提取那些fields的数据,并忽略其余的。
这里是一个在示例app中解析的feed的摘录。在feed中,到StackOverflow.com的每一个post以一个entry标签的形式出现,其包含了一些嵌套的标签:
<?xml version="1.0" encoding="utf-8"?> <feed xmlns="http://www.w3.org/2005/Atom" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" ..."> <title type="text">newest questions tagged android - Stack Overflow</title> ... <entry> ... </entry> <entry> <id>http://stackoverflow.com/q/9439999</id> <re:rank scheme="http://stackoverflow.com">0</re:rank> <title type="text">Where is my data file?</title> <category scheme="http://stackoverflow.com/feeds/tag?tagnames=android&sort=newest/tags" term="android"/> <category scheme="http://stackoverflow.com/feeds/tag?tagnames=android&sort=newest/tags" term="file"/> <author> <name>cliff2310</name> <uri>http://stackoverflow.com/users/1128925</uri> </author> <link rel="alternate" href="http://stackoverflow.com/questions/9439999/where-is-my-data-file" /> <published>2012-02-25T00:30:54Z</published> <updated>2012-02-25T00:30:54Z</updated> <summary type="html"> <p>I have an Application that requires a data file...</p> </summary> </entry> <entry> ... </entry> ... </feed>
示例app为entry标签提取数据,及它的嵌套标签的标题,链接,和总结。
下一步是实例化一个parser,并启动解析过程。在这个代码片段中,一个parser被初始化为不处理命名空间,并使用提供的InputStream作为它的输入。它通过一个对nextTag()的调用来启动解析过程,并调用readFeed()方法,后者会提取并处理app感兴趣的数据:
public class StackOverflowXmlParser { // We don't use namespaces private static final String ns = null; public List parse(InputStream in) throws XmlPullParserException, IOException { try { XmlPullParser parser = Xml.newPullParser(); parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, false); parser.setInput(in, null); parser.nextTag(); return readFeed(parser); } finally { in.close(); } } ... }
readFeed()方法执行实际的处理feed的工作。它查找标记为“entry”的元素作为一个递归地feed处理的起点。如果一个标签不是一个entry标签,它就跳过它。一旦整个feed都被递归地处理完了,则readFeed()返回一个List,其中包含了它从feed中提取entries(包含嵌套的数据成员)。然后这个List被parser返回。
private List readFeed(XmlPullParser parser) throws XmlPullParserException, IOException { List entries = new ArrayList(); parser.require(XmlPullParser.START_TAG, ns, "feed"); while (parser.next() != XmlPullParser.END_TAG) { if (parser.getEventType() != XmlPullParser.START_TAG) { continue; } String name = parser.getName(); // Starts by looking for the entry tag if (name.equals("entry")) { entries.add(readEntry(parser)); } else { skip(parser); } } return entries; }
解析一个XML feed的步骤如下:
这个代码片段显示了parser如何解析entries,titles,links和summaries。
public static class Entry { public final String title; public final String link; public final String summary; private Entry(String title, String summary, String link) { this.title = title; this.summary = summary; this.link = link; } } // Parses the contents of an entry. If it encounters a title, summary, or link tag, hands them off // to their respective "read" methods for processing. Otherwise, skips the tag. private Entry readEntry(XmlPullParser parser) throws XmlPullParserException, IOException { parser.require(XmlPullParser.START_TAG, ns, "entry"); String title = null; String summary = null; String link = null; while (parser.next() != XmlPullParser.END_TAG) { if (parser.getEventType() != XmlPullParser.START_TAG) { continue; } String name = parser.getName(); if (name.equals("title")) { title = readTitle(parser); } else if (name.equals("summary")) { summary = readSummary(parser); } else if (name.equals("link")) { link = readLink(parser); } else { skip(parser); } } return new Entry(title, summary, link); } // Processes title tags in the feed. private String readTitle(XmlPullParser parser) throws IOException, XmlPullParserException { parser.require(XmlPullParser.START_TAG, ns, "title"); String title = readText(parser); parser.require(XmlPullParser.END_TAG, ns, "title"); return title; } // Processes link tags in the feed. private String readLink(XmlPullParser parser) throws IOException, XmlPullParserException { String link = ""; parser.require(XmlPullParser.START_TAG, ns, "link"); String tag = parser.getName(); String relType = parser.getAttributeValue(null, "rel"); if (tag.equals("link")) { if (relType.equals("alternate")){ link = parser.getAttributeValue(null, "href"); parser.nextTag(); } } parser.require(XmlPullParser.END_TAG, ns, "link"); return link; } // Processes summary tags in the feed. private String readSummary(XmlPullParser parser) throws IOException, XmlPullParserException { parser.require(XmlPullParser.START_TAG, ns, "summary"); String summary = readText(parser); parser.require(XmlPullParser.END_TAG, ns, "summary"); return summary; } // For the tags title and summary, extracts their text values. private String readText(XmlPullParser parser) throws IOException, XmlPullParserException { String result = ""; if (parser.next() == XmlPullParser.TEXT) { result = parser.getText(); parser.nextTag(); } return result; } ... }
上面描述的XML解析的其中一步是paser跳过它不感兴趣的标签。这里是parser的skip()方法:
private void skip(XmlPullParser parser) throws XmlPullParserException, IOException { if (parser.getEventType() != XmlPullParser.START_TAG) { throw new IllegalStateException(); } int depth = 1; while (depth != 0) { switch (parser.next()) { case XmlPullParser.END_TAG: depth--; break; case XmlPullParser.START_TAG: depth++; break; } } }
这是它如何工作的:
这样的话,如果当前的元素具有嵌套的元素,则depth的值将一直为非0,直到parser消费掉了所有在最初的START_TAG和与其匹配的END_TAG之间的所有的events。比如,考虑parser如何跳过<author>元素,它具有2个嵌套的元素,<name>和<uri>:
例子应用程序在一个AsyncTask中获取并解析XML feed。这将把处理过程从主UI线程中分离。当处理过程完成时,app更新main activity (NetworkActivity)中的UI。
在下面所示的摘录中,loadPage()方法做了如下的事情:
public class NetworkActivity extends Activity { public static final String WIFI = "Wi-Fi"; public static final String ANY = "Any"; private static final String URL = "http://stackoverflow.com/feeds/tag?tagnames=android&sort=newest"; // Whether there is a Wi-Fi connection. private static boolean wifiConnected = false; // Whether there is a mobile connection. private static boolean mobileConnected = false; // Whether the display should be refreshed. public static boolean refreshDisplay = true; public static String sPref = null; ... // Uses AsyncTask to download the XML feed from stackoverflow.com. public void loadPage() { if((sPref.equals(ANY)) && (wifiConnected || mobileConnected)) { new DownloadXmlTask().execute(URL); } else if ((sPref.equals(WIFI)) && (wifiConnected)) { new DownloadXmlTask().execute(URL); } else { // show error } }
下面所示的AsyncTask子类,DownloadXmlTask,实现了如下的AsyncTask方法:
// Implementation of AsyncTask used to download XML feed from stackoverflow.com. private class DownloadXmlTask extends AsyncTask<String, Void, String> { @Override protected String doInBackground(String... urls) { try { return loadXmlFromNetwork(urls[0]); } catch (IOException e) { return getResources().getString(R.string.connection_error); } catch (XmlPullParserException e) { return getResources().getString(R.string.xml_error); } } @Override protected void onPostExecute(String result) { setContentView(R.layout.main); // Displays the HTML string in the UI via a WebView WebView myWebView = (WebView) findViewById(R.id.webview); myWebView.loadData(result, "text/html", null); } }
下面是方法loadXmlFromNetwork(),它会在DownloadXmlTask中调用。它做了如下的事情:
// Uploads XML from stackoverflow.com, parses it, and combines it with // HTML markup. Returns HTML string. private String loadXmlFromNetwork(String urlString) throws XmlPullParserException, IOException { InputStream stream = null; // Instantiate the parser StackOverflowXmlParser stackOverflowXmlParser = new StackOverflowXmlParser(); List<Entry> entries = null; String title = null; String url = null; String summary = null; Calendar rightNow = Calendar.getInstance(); DateFormat formatter = new SimpleDateFormat("MMM dd h:mmaa"); // Checks whether the user set the preference to include summary text SharedPreferences sharedPrefs = PreferenceManager.getDefaultSharedPreferences(this); boolean pref = sharedPrefs.getBoolean("summaryPref", false); StringBuilder htmlString = new StringBuilder(); htmlString.append("<h3>" + getResources().getString(R.string.page_title) + "</h3>"); htmlString.append("<em>" + getResources().getString(R.string.updated) + " " + formatter.format(rightNow.getTime()) + "</em>"); try { stream = downloadUrl(urlString); entries = stackOverflowXmlParser.parse(stream); // Makes sure that the InputStream is closed after the app is // finished using it. } finally { if (stream != null) { stream.close(); } } // StackOverflowXmlParser returns a List (called "entries") of Entry objects. // Each Entry object represents a single post in the XML feed. // This section processes the entries list to combine each entry with HTML markup. // Each entry is displayed in the UI as a link that optionally includes // a text summary. for (Entry entry : entries) { htmlString.append("<p><a href='"); htmlString.append(entry.link); htmlString.append("'>" + entry.title + "</a></p>"); // If the user set the preference to include summary text, // adds it to the display. if (pref) { htmlString.append(entry.summary); } } return htmlString.toString(); } // Given a string representation of a URL, sets up a connection and gets // an input stream. private InputStream downloadUrl(String urlString) throws IOException { URL url = new URL(urlString); HttpURLConnection conn = (HttpURLConnection) url.openConnection(); conn.setReadTimeout(10000 /* milliseconds */); conn.setConnectTimeout(15000 /* milliseconds */); conn.setRequestMethod("GET"); conn.setDoInput(true); // Starts the query conn.connect(); return conn.getInputStream(); }译自: http://developer.android.com/training/basics/network-ops/xml.html
Done.