Two HTML Parsers for C#


 

1) Html Agility Pack

http://htmlagilitypack.codeplex.com/

Sample code:

 

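using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
// SelectNodes returns null when no node matches the XPath query.
HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        // FixLink is a user-defined helper (not shown) that returns the corrected URL.
        att.Value = FixLink(att);
    }
}
doc.Save("file.htm");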
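The sample above calls FixLink without defining it. Here is a minimal sketch of what such a helper might look like, assuming its job is to turn relative href values into absolute URLs. The class name LinkFixer, the base address, and the string-based signature are assumptions made for illustration and are not part of Html Agility Pack. With this signature, the call site would pass att.Value rather than att.

```csharp
using System;

public static class LinkFixer
{
    // Hypothetical base address: replace it with the URL the page was loaded from.
    private static readonly Uri BaseUri = new Uri("http://www.oschina.net/");

    // Resolves an href against BaseUri. Absolute URLs pass through unchanged,
    // and values that cannot be resolved are returned as-is.
    public static string FixLink(string href)
    {
        if (string.IsNullOrWhiteSpace(href))
        {
            return href;
        }

        if (Uri.TryCreate(BaseUri, href, out Uri resolved))
        {
            return resolved.AbsoluteUri;
        }

        return href;
    }
}
```

For example, LinkFixer.FixLink("/news/1") would return "http://www.oschina.net/news/1", while an href that is already absolute, such as "https://example.com/a", comes back unchanged.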

 

 

2) NSoup, the .NET port of jsoup

http://nsoup.codeplex.com/

Of the two, I recommend NSoup. A Document can be obtained in any of the following ways:

// Option 1: parse an HTML string that is already in memory.
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(HtmlString);

// Option 2: let NSoup fetch the page itself.
NSoup.Nodes.Document doc = NSoup.NSoupClient.Connect("http://www.oschina.net/").Get();

// Option 3: download the bytes with WebClient, decode them as UTF-8, then parse.
WebClient webClient = new WebClient();
String HtmlString = Encoding.GetEncoding("utf-8").GetString(webClient.DownloadData("http://www.oschina.net/"));
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(HtmlString);

// Option 4: parse the response stream of a WebRequest directly.
WebRequest webRequest = WebRequest.Create("http://www.oschina.net/");
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(webRequest.GetResponse().GetResponseStream(), "utf-8");

 
