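Related: XML Parsing on iOS (1): TBXML (example: printing XML content and storing it in an array)
Documentation for basic use of the libxml library is available at http://xmlsoft.org/.
Preparation: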
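I. Reading XML files with libxml
Reading XML requires a reader. Two ways of creating one are covered here: from a file and from memory. Each is a single call to an API provided by libxml, as the code shows:
1. Creating a reader from a file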
xmlTextReaderPtr reader = xmlNewTextReaderFilename(xmlfile);
2. Creating a reader from memory
// char *memory, int size
xmlTextReaderPtr reader = xmlReaderForMemory(memory, size, NULL, "UTF-8", 0);
As the code above shows, creating a reader is very easy.
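3. Reading data from the reader
Once the reader exists, its helper functions can be used to read the XML data. What I describe here is reading the XML as a sequential stream of nodes, without building an XML document model. This is the most basic approach and also the easiest to understand.
To read data, call xmlTextReaderRead, which advances the reader to the next node. Every node in the XML passes through the reader in document order. At each position we can check the type of the current node, take out the data we need, and then free whatever memory the reader allocated for us. Here is an example of reading: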
ret = xmlTextReaderRead(reader);
if (ret == 0)
    return 0;       /* end of document */
if (ret != 1)
    return -2;      /* parse error */
element = xmlTextReaderName(reader);    /* allocated copy, must be freed */
if (element != NULL) {
    ntype = xmlTextReaderNodeType(reader);
    if (strcmp((const char *) element, "param-name") == 0) {
        if (XML_READER_TYPE_ELEMENT == ntype) {
            /*......*/
        }
    }
    xmlFree(element);   /* free on every path, not only when the name matches */
}
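xmlTextReaderRead takes one argument, the text reader pointer created earlier. It returns 1 when a node was read successfully, 0 when the end of the document has been reached, and -1 on error. After a successful read, xmlTextReaderName returns the name of the node at the current position, and xmlTextReaderNodeType returns its type. The possible types are defined by the following enum: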
/**
 * xmlReaderTypes:
 *
 * Predefined constants for the different types of nodes.
 */
typedef enum {
    XML_READER_TYPE_NONE = 0,
    XML_READER_TYPE_ELEMENT = 1,
    XML_READER_TYPE_ATTRIBUTE = 2,
    XML_READER_TYPE_TEXT = 3,
    XML_READER_TYPE_CDATA = 4,
    XML_READER_TYPE_ENTITY_REFERENCE = 5,
    XML_READER_TYPE_ENTITY = 6,
    XML_READER_TYPE_PROCESSING_INSTRUCTION = 7,
    XML_READER_TYPE_COMMENT = 8,
    XML_READER_TYPE_DOCUMENT = 9,
    XML_READER_TYPE_DOCUMENT_TYPE = 10,
    XML_READER_TYPE_DOCUMENT_FRAGMENT = 11,
    XML_READER_TYPE_NOTATION = 12,
    XML_READER_TYPE_WHITESPACE = 13,
    XML_READER_TYPE_SIGNIFICANT_WHITESPACE = 14,
    XML_READER_TYPE_END_ELEMENT = 15,
    XML_READER_TYPE_END_ENTITY = 16,
    XML_READER_TYPE_XML_DECLARATION = 17
} xmlReaderTypes;
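The reader supports the node types listed above. Data has to be read according to the current type, because different types are read in different ways. For example, xmlTextReaderReadString only returns the text content of an element node (XML_READER_TYPE_ELEMENT) or a text node (XML_READER_TYPE_TEXT). Note that the reader visits nodes strictly in document order, so code should never assume which node or node type comes next. Check it every time to keep the software stable.
The string returned by xmlTextReaderReadString (an xmlChar*) is memory allocated by the library and must be released with xmlFree, otherwise it leaks.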
4. Releasing the reader and cleaning up
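xmlTextReaderClose(reader);     /* must come before xmlFreeTextReader */
xmlFreeTextReader(reader);
/* Library-wide cleanup: call only when the program is completely done with libxml */
xmlDictCleanup();
xmlCleanupParser();
xmlMemoryDump();
xmlCleanupCharEncodingHandlers();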
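When xmlTextReaderClose is used, pay attention to the order: it must be called before xmlFreeTextReader, because the reader no longer exists once it has been freed.
II. Examples
1. Suppose the XML is at http://cdn.domain.com/ipad/settings/config.xml and has the following format: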
<?xml version="1.0"?>
<settings>
    <popupAd>
        <show>1</show>
        <count>3</count>
    </popupAd>
</settings>
Reading it:
.h
#import <Foundation/Foundation.h>
#include <libxml/xmlreader.h>

@interface NewsFeedParser : NSObject {
}
- (void)readXml;
@end
.m
- (void)readXml {
    NSURLResponse *response = nil;
    NSError *error = nil;
    NSURLRequest *request = [NSURLRequest requestWithURL:[NSURL URLWithString:@"http://cdn.domain.com/ipad/settings/config.xml"]];
    NSData *settingData = [NSURLConnection sendSynchronousRequest:request returningResponse:&response error:&error];
    if (settingData == nil) {
        NSLog(@"Failed to download setting config xml: %@", error);
        return;
    }
    xmlTextReaderPtr reader = xmlReaderForMemory([settingData bytes], (int)[settingData length], NULL, NULL,
        (XML_PARSE_NOENT | XML_PARSE_NOBLANKS | XML_PARSE_NOCDATA | XML_PARSE_NOERROR | XML_PARSE_NOWARNING));
    if (!reader) {
        NSLog(@"Failed to load setting config xml !");
        return;
    }
    NSMutableDictionary *config = [NSMutableDictionary dictionary];
    // xmlTextReaderRead returns 1 while nodes remain, 0 at the end, -1 on error
    while (xmlTextReaderRead(reader) == 1) {
        if (xmlTextReaderNodeType(reader) != XML_READER_TYPE_ELEMENT)
            continue;
        const char *name = (const char *)xmlTextReaderConstName(reader);
        if (name == NULL)
            continue;
        NSString *currentTagName = [NSString stringWithCString:name encoding:NSUTF8StringEncoding];
        if ([currentTagName isEqualToString:@"show"] || [currentTagName isEqualToString:@"count"]) {
            // The returned string is allocated by libxml and must be freed
            xmlChar *value = xmlTextReaderReadString(reader);
            if (value != NULL) {
                NSString *currentTagValue = [NSString stringWithCString:(const char *)value encoding:NSUTF8StringEncoding];
                //NSLog(@"===> TagName: %@",currentTagName);
                //NSLog(@"===> TagValue: %@",currentTagValue);
                if (currentTagValue != nil)
                    [config setObject:currentTagValue forKey:currentTagName];
                xmlFree(value);
            }
        }
    }
    xmlFreeTextReader(reader);
    NSLog(@"======> %@", [config objectForKey:@"show"]);
    NSLog(@"======> %@", [config objectForKey:@"count"]);
}
2. Suppose the XML is at http://www.domain.com/feed/ipad/marketchart/main.rss and has the following format:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
<channel>
    <atom:link href="http://www.domain.com/feed/ipad/marketchart/main.rss" rel="self" type="application/rss+xml" />
    <title><![CDATA[domain.com : Market data]]></title>
    <description><![CDATA[Market data RSS Feed ]]></description>
    <link>http://www.domain.com/feed/ipad/marketchart/main.rss</link>
    <copyright>All articles are copyrighted by IBTimes.com</copyright>
    <image>
        <url>http://img.domain.com/www/site/2010/main/images/heading_editores_pick_logo.png</url>
        <title><![CDATA[domain.com : Market data]]></title>
        <link>http://www.domain.com/feed/ipad/marketchart/main.rss</link>
    </image>
    <item>
        <title><![CDATA[DOW]]></title>
        <last>13085.53</last>
        <symbol>^DJI</symbol>
        <change>-114.02</change>
        <description><![CDATA[ description ]]></description>
        <pubDate>Wed, 04 Apr 2012 14:51:00 EDT</pubDate>
    </item>
    <item>
        <title><![CDATA[NYSE]]></title>
        <last>8151.97</last>
        <symbol>$NYA</symbol>
        <change>0</change>
        <description><![CDATA[ description ]]></description>
        <pubDate>Fri, 24 Feb 2012 17:05:00 EST</pubDate>
    </item>
    <item>
        <title><![CDATA[NASDAQ]]></title>
        <last>3063.63</last>
        <symbol>$COMP</symbol>
        <change>-49.94</change>
        <description><![CDATA[ description ]]></description>
        <pubDate>Wed, 04 Apr 2012 14:45:00 EDT</pubDate>
    </item>
    <item>
        <title><![CDATA[S&P 500]]></title>
        <last>1399.64</last>
        <symbol>$SPX</symbol>
        <change>-13.74</change>
        <description><![CDATA[ description ]]></description>
        <pubDate>Wed, 04 Apr 2012 14:45:00 EDT</pubDate>
    </item>
</channel>
</rss>
Looping over the feed one item at a time:
NewsFeedParser.h
#import <Foundation/Foundation.h>
#include <libxml/xmlreader.h>

@interface NewsFeedParser : NSObject {
}
+ (NSMutableArray *)parseFeedFromUrl:(NSString *)url;
@end
NewsFeedParser.m
#import "NewsFeedParser.h"

@implementation NewsFeedParser

+ (NSMutableArray *)parseFeedFromUrl:(NSString *)url {
    NSMutableArray *itemsArray = [NSMutableArray array];
    //NSLog(@"NewsFeedParser:%@ begin\n", url);
    NSURLRequest *request = [NSURLRequest requestWithURL:[NSURL URLWithString:url]];
    NSURLResponse *response = nil;
    NSError *error = nil;
    NSData *xmlData = [NSURLConnection sendSynchronousRequest:request returningResponse:&response error:&error];
    if (xmlData == nil) {
        NSLog(@"Failed to download feed: %@", error);
        return itemsArray;
    }
    xmlTextReaderPtr reader = xmlReaderForMemory([xmlData bytes], (int)[xmlData length], NULL, NULL,
        (XML_PARSE_NOENT | XML_PARSE_NOBLANKS | XML_PARSE_NOCDATA | XML_PARSE_NOERROR | XML_PARSE_NOWARNING));
    if (!reader) {
        NSLog(@"Failed to load xmlreader");
        return itemsArray;
    }
    NSString *currentTagName = nil;
    NSMutableDictionary *currentItem = nil;
    NSString *currentTagValue = nil;
    bool itemStarted = false;
    bool authorStarted = false;
    bool categoryStarted = false;
    const char *temp;
    // xmlTextReaderRead returns 1 while nodes remain, 0 at the end, -1 on error
    while (xmlTextReaderRead(reader) == 1) {
        int type = xmlTextReaderNodeType(reader);
        switch (type) {
            case XML_READER_TYPE_END_ELEMENT:
                temp = (const char *)xmlTextReaderConstName(reader);
                currentTagName = temp ? [NSString stringWithCString:temp encoding:NSUTF8StringEncoding] : nil;
                if ([currentTagName isEqualToString:@"item"]) {
                    itemStarted = false;
                }
                continue;
            case XML_READER_TYPE_ELEMENT:
                // We are starting an element
                temp = (const char *)xmlTextReaderConstName(reader);
                currentTagName = temp ? [NSString stringWithCString:temp encoding:NSUTF8StringEncoding] : nil;
                if ([currentTagName isEqualToString:@"item"]) {
                    //NSLog(@"Item begin\n");
                    currentItem = [NSMutableDictionary dictionary];
                    [itemsArray addObject:currentItem];
                    itemStarted = true;
                    authorStarted = false;
                    categoryStarted = false;
                }
                if ([currentTagName isEqualToString:@"author"]) {
                    authorStarted = true;
                    categoryStarted = false;
                }
                if ([currentTagName isEqualToString:@"category"]) {
                    categoryStarted = true;
                    authorStarted = false;
                }
                if (itemStarted && [currentTagName isEqualToString:@"media:content"]) {
                    // xmlTextReaderGetAttribute returns an allocated copy, or NULL
                    xmlChar *urlAttr = xmlTextReaderGetAttribute(reader, BAD_CAST "url");
                    if (urlAttr != NULL) {
                        currentTagValue = [NSString stringWithCString:(const char *)urlAttr encoding:NSUTF8StringEncoding];
                        [currentItem setValue:currentTagValue forKey:currentTagName];
                        //NSLog(@"%@ - %@\n", currentTagName, currentTagValue);
                        xmlFree(urlAttr);
                        currentTagValue = nil;
                    }
                }
                continue;
            case XML_READER_TYPE_TEXT:
                // The current tag has a text value, stick it into the current item
                if (!itemStarted || currentTagName == nil)
                    continue;
                temp = (const char *)xmlTextReaderConstValue(reader);
                if (temp == NULL)
                    continue;
                currentTagValue = [NSString stringWithCString:temp encoding:NSUTF8StringEncoding];
                //NSLog(@"%@ - %@\n", currentTagName, currentTagValue);
                if ([currentTagName isEqualToString:@"name"]) {
                    if (authorStarted)
                        [currentItem setValue:currentTagValue forKey:@"author"];
                    if (categoryStarted)
                        [currentItem setValue:currentTagValue forKey:@"category"];
                } else {
                    [currentItem setValue:currentTagValue forKey:currentTagName];
                }
                currentTagValue = nil;
                currentTagName = nil;
                continue;
            case XML_READER_TYPE_ATTRIBUTE:
                //temp = (char*)xmlTextReaderConstValue(reader);
                //NSLog(@"%s\n", temp);
            default:
                continue;
        }
    }
    xmlFreeTextReader(reader);
    //NSLog(@"NewsFeedParser:%@ done\n", url);
    return itemsArray;
}

@end
ViewController.m
#import "NewsFeedParser.h"

- (void)loadMarketData {
    dispatch_async(dispatch_get_global_queue(0, 0), ^{
        NSMutableArray *items = [NewsFeedParser parseFeedFromUrl:@"http://www.domain.com/feed/ipad/marketchart/main.rss"];
        if ([items count] == 0) return;
        NSLog(@"Title => %@", [[items objectAtIndex:0] objectForKey:@"title"]);
        NSLog(@"Last=> %@", [[items objectAtIndex:0] objectForKey:@"last"]);
        NSLog(@"Change=> %@", [[items objectAtIndex:0] objectForKey:@"change"]);
    });
}
Some other functions:
libxml provides a set of .NET-style functions that read and parse an XML file as a stream.
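<libxml/xmlreader.h>

xmlTextReader
xmlTextReaderPtr
    // The reader structure and the pointer to it
xmlTextReaderPtr xmlReaderForFile (const char * filename, const char * encoding, int options)
    // Opens an XML file and returns a reader, ready to start parsing
int xmlTextReaderRead (xmlTextReaderPtr reader)
    // Moves to the next node in document order (the next node, not the next sibling)
int xmlTextReaderNext (xmlTextReaderPtr reader)
    // Moves to the next node at the same level, skipping the current node's subtree
int xmlTextReaderNodeType (xmlTextReaderPtr reader)
    // Returns the type of the current node
xmlChar *xmlTextReaderGetAttribute (xmlTextReaderPtr reader, const xmlChar * name)
    // Returns the named attribute of the current node (free the result with xmlFree)
xmlChar *xmlTextReaderReadString (xmlTextReaderPtr reader)
    // Reads the text under the current node (free the result with xmlFree)
xmlNodePtr xmlTextReaderExpand (xmlTextReaderPtr reader)
    // Expands the current node into a node object (use with care)
int xmlTextReaderHasValue (xmlTextReaderPtr reader)
    // Tells whether the current node can have a text value
int xmlTextReaderHasAttributes (xmlTextReaderPtr reader)
    // Tells whether the current node has attributes
int xmlTextReaderMoveToAttribute (xmlTextReaderPtr reader, const xmlChar * name)
    // Moves the cursor to the attribute of the current node with the given name
int xmlTextReaderMoveToAttributeNo (xmlTextReaderPtr reader, int no)
    // Moves the cursor to the attribute of the current node with the given index
int xmlTextReaderMoveToElement (xmlTextReaderPtr reader)
    // Moves the cursor back to the element that owns the current attribute
int xmlTextReaderMoveToFirstAttribute (xmlTextReaderPtr reader)
    // Moves the cursor to the first attribute of the current node
int xmlTextReaderMoveToNextAttribute (xmlTextReaderPtr reader)
    // Moves the cursor to the next attribute of the current node
xmlChar *xmlTextReaderName (xmlTextReaderPtr reader)
    // Returns the name of the current node (free the result with xmlFree)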
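libxml defines its own character type, xmlChar, which is really just unsigned char.
libxml also provides a macro for converting a char* to an xmlChar*. It has an amusing name, BAD_CAST, and is nothing more than a cast to xmlChar* (that is, unsigned char*).
To make xmlChar strings easier to work with, libxml provides its own string functions, whose definitions closely resemble the string functions of the standard C library.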
xmlChar *xmlStrcat (xmlChar * cur, const xmlChar * add)
const xmlChar *xmlStrchr (const xmlChar * str, xmlChar val)
int xmlStrcmp (const xmlChar * str1, const xmlChar * str2)
int xmlStrlen (const xmlChar * str)
xmlChar *xmlStrncat (xmlChar * cur, const xmlChar * add, int len)
int xmlStrncmp (const xmlChar * str1, const xmlChar * str2, int len)
const xmlChar *xmlStrstr (const xmlChar * str, const xmlChar * val)
For more functions, see:
http://xmlsoft.org/html/libxml-xmlstring.html