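In .NET development we often need to read and manipulate XML files, for example configuration files (web.config and app.config) or XML files that hold business settings. I used to read and write XML directly with DataSet. When the file is small the read performance is acceptable, but when the file is large reading becomes very slow. In my spare time I put together this short summary of reading XML files with XmlReader, XmlDocument, and DataSet, along with a simple performance comparison.
XmlReader is a reader that provides fast, non-cached, forward-only access to XML data. It can only read XML, not modify it, and we have to control ourselves how the information of each XML node is extracted. It is well suited to reading very large XML files.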
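XmlReader has a read-only NodeType property of type XmlNodeType. It tells us the type of the current node, so we can extract the node's information according to its type and our specific needs. More details can be found in the MSDN Library. Reading an XML file with XmlReader looks like this: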
    // Requires: using System; using System.Collections.Generic; using System.Xml;
    static List<Dictionary<string, string>> XMLReaderTest(string xmlPath)
    {
        List<Dictionary<string, string>> entityInfo = new List<Dictionary<string, string>>();
        using (XmlReader reader = new XmlTextReader(xmlPath))
        {
            Dictionary<string, string> xmlValue = null; // fields of the record currently being read
            string key = string.Empty;                  // name of the most recently opened field element
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        if (string.Equals(reader.LocalName, "BE_WorkStation_ACInstance", StringComparison.OrdinalIgnoreCase))
                        {
                            // A new record starts.
                            xmlValue = new Dictionary<string, string>();
                        }
                        else if (!string.Equals(reader.LocalName, "EntitySchema", StringComparison.OrdinalIgnoreCase))
                        {
                            // Any element other than the root is a field of the current record.
                            key = reader.LocalName;
                        }
                        break;
                    case XmlNodeType.EndElement:
                        if (string.Equals(reader.LocalName, "BE_WorkStation_ACInstance", StringComparison.OrdinalIgnoreCase))
                        {
                            // The record is complete; store it.
                            if (xmlValue != null)
                            {
                                entityInfo.Add(xmlValue);
                                xmlValue = null;
                            }
                        }
                        break;
                    case XmlNodeType.Text:
                        if (xmlValue != null)
                        {
                            // Empty elements produce no Text node, so they get no entry.
                            // The indexer (instead of Add) avoids an exception on a repeated key.
                            xmlValue[key] = reader.Value;
                        }
                        break;
                    default:
                        break;
                }
            }
        }
        return entityInfo;
    }
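XmlDocument represents an XML document as an in-memory tree, and lets us manipulate the XML document in much the same way JavaScript manipulates an HTML document. It is fairly efficient when reading a single small XML file. Reading an XML file with XmlDocument looks like this: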
    static List<Dictionary<string, string>> XMLDocumentTest(string xmlPath)
    {
        List<Dictionary<string, string>> entityInfo = new List<Dictionary<string, string>>();
        using (XmlReader reader = new XmlTextReader(xmlPath))
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(reader);
            // Iterate the record elements under the root <EntitySchema>.
            // doc.ChildNodes would only yield the XML declaration and the root element itself.
            XmlNodeList nodeList = doc.DocumentElement.ChildNodes;
            foreach (XmlNode node in nodeList)
            {
                if (node.NodeType != XmlNodeType.Element)
                {
                    continue; // skip comments and other non-element nodes
                }
                var xmlValue = new Dictionary<string, string>();
                foreach (XmlNode child in node.ChildNodes)
                {
                    // Empty elements are stored with an empty string as their value.
                    xmlValue[child.LocalName] = child.InnerText;
                }
                entityInfo.Add(xmlValue);
            }
        }
        return entityInfo;
    }
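The DataSet type provides a ReadXml method that reads an XML schema and data into the DataSet. DataSet is comparatively slow at reading XML files. Reading an XML file with DataSet looks like this: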
    // Requires: using System.Data; Field<T> is an extension method from System.Data.DataSetExtensions.
    static List<Dictionary<string, string>> DataSetTest(string xmlPath)
    {
        List<Dictionary<string, string>> entityInfo = new List<Dictionary<string, string>>();
        DataSet ds = new DataSet();
        // Read the schema of the XML file first.
        using (XmlReader reader = new XmlTextReader(xmlPath))
        {
            ds.ReadXmlSchema(reader);
        }
        // Turn off notifications, index maintenance and constraints while loading.
        foreach (DataTable dt in ds.Tables)
        {
            dt.BeginLoadData();
        }
        // Read the data itself.
        using (XmlReader reader = new XmlTextReader(xmlPath))
        {
            ds.ReadXml(reader);
        }
        foreach (DataTable dt in ds.Tables)
        {
            dt.EndLoadData();
        }
        if (ds.Tables.Count > 0)
        {
            // Each row of the first table is one record; each column is one field.
            DataTable dt = ds.Tables[0];
            foreach (DataRow row in dt.Rows)
            {
                var xmlValue = new Dictionary<string, string>();
                foreach (DataColumn col in dt.Columns)
                {
                    xmlValue[col.ColumnName] = row.Field<string>(col);
                }
                entityInfo.Add(xmlValue);
            }
        }
        return entityInfo;
    }
A fragment of the XML file being read:
    <?xml version="1.0" encoding="utf-8" ?>
    <EntitySchema>

      <BE_WorkStation_ACInstance>
        <FieldName>MS_ACInstanceOID</FieldName>
        <FieldChinese>主键</FieldChinese>
        <FieldType>45</FieldType>
        <FieldLength>500</FieldLength>
        <DecLength>0</DecLength>
        <CodeTable>选择</CodeTable>
        <fUseCodeTable>false</fUseCodeTable>
        <fDisplay>true</fDisplay>
        <DefaultValue></DefaultValue>
        <FieldKind>0</FieldKind>
        <fCanModify>true</fCanModify>
        <ForeignKeyField></ForeignKeyField>
        <LookupKeyField></LookupKeyField>
        <LookupDataSet></LookupDataSet>
        <LookupResultField>选择列</LookupResultField>
        <fForeignKey>false</fForeignKey>
        <SqlColumn>MS_ACInstance.MS_ACInstanceOID as MS_ACInstanceOID</SqlColumn>
        <ifSubmit>true</ifSubmit>
        <SubmitColumnName></SubmitColumnName>
        <ifReadData>true</ifReadData>
        <ifChangeDataSet>true</ifChangeDataSet>
        <ColumnNumRule></ColumnNumRule>
        <EditorBusinessElement></EditorBusinessElement>
        <InputRule></InputRule>
        <InputRuleHelper></InputRuleHelper>
        <ifCheckInputNull>false</ifCheckInputNull>
        <EncryptRule></EncryptRule>
      </BE_WorkStation_ACInstance>

      <BE_WorkStation_ACInstance>
        <FieldName>sys_rank</FieldName>
        <FieldChinese>排序标识</FieldChinese>
        <FieldType>2</FieldType>
        <FieldLength>500</FieldLength>
        <DecLength>0</DecLength>
        <CodeTable>选择</CodeTable>
        <fUseCodeTable>false</fUseCodeTable>
        <fDisplay>true</fDisplay>
        <DefaultValue></DefaultValue>
        <FieldKind>0</FieldKind>
        <fCanModify>true</fCanModify>
        <ForeignKeyField></ForeignKeyField>
        <LookupKeyField></LookupKeyField>
        <LookupDataSet></LookupDataSet>
        <LookupResultField>选择列</LookupResultField>
        <fForeignKey>false</fForeignKey>
        <SqlColumn>MS_ACInstance.BASE_Rank as sys_rank</SqlColumn>
        <ifSubmit>true</ifSubmit>
        <SubmitColumnName>BASE_Rank</SubmitColumnName>
        <ifReadData>true</ifReadData>
        <ifChangeDataSet>true</ifChangeDataSet>
        <ColumnNumRule></ColumnNumRule>
        <EditorBusinessElement></EditorBusinessElement>
        <InputRule></InputRule>
        <InputRuleHelper></InputRuleHelper>
        <ifCheckInputNull>false</ifCheckInputNull>
        <EncryptRule></EncryptRule>
      </BE_WorkStation_ACInstance>

      ......
    </EntitySchema>
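Each of the three methods above was used to read an 11 MB XML file 100 times, and the total time was averaged over the runs. The averages were 0.4696 s, 0.5604 s, and 2.3166 s respectively. When reading large files, XmlReader is the fastest, XmlDocument comes second, and DataSet is the slowest.
[Figure: total read time for each of the three methods]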
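The post does not show how the timings were taken. The following is only a minimal sketch of one way to average a reader over repeated runs with Stopwatch, not the measurement code actually used. The class name XmlReadBenchmark, the method AverageSeconds, the stand-in delegate, and the path "sample.xml" are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class XmlReadBenchmark
{
    // Hypothetical helper: calls `read` on `xmlPath` `iterations` times and
    // returns the mean wall-clock time per call, in seconds.
    public static double AverageSeconds(
        Func<string, List<Dictionary<string, string>>> read,
        string xmlPath,
        int iterations)
    {
        if (read == null) throw new ArgumentNullException(nameof(read));
        if (iterations <= 0) throw new ArgumentOutOfRangeException(nameof(iterations));

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            read(xmlPath);
        }
        watch.Stop();
        return watch.Elapsed.TotalSeconds / iterations;
    }

    static void Main()
    {
        // Stand-in reader so the sketch runs on its own. In practice, pass
        // XMLReaderTest, XMLDocumentTest or DataSetTest and a real file path.
        Func<string, List<Dictionary<string, string>>> standIn =
            path => new List<Dictionary<string, string>>();

        double average = AverageSeconds(standIn, "sample.xml", 100);
        Console.WriteLine("average: {0:F4} s", average);
    }
}
```

With the three methods in scope, a call such as `XmlReadBenchmark.AverageSeconds(XMLReaderTest, xmlPath, 100)` would give one figure per method. Results will vary with the machine, disk caching, and JIT warm-up, so the first run is often worth discarding.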
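That completes this brief introduction to three ways of reading XML files. It also serves as a record of accumulated experience and knowledge.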
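The copyright of this article is shared by the author and Cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise this notice must be retained and a link to the original article must be given in a prominent position on the page. Otherwise the author reserves the right to pursue legal liability.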