Parsing HTML tags with PHP Simple HTML DOM Parser

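I tried out PHP Simple HTML DOM Parser for parsing HTML pages and liked it. It builds a DOM tree from the page, which makes it easy to pick out the content you need. It works well for scraping.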
 
An example follows; you can also download the archive from SourceForge and look at the examples it contains:

Scraping data with PHP Simple HTML DOM Parser

 
PHP Simple HTML DOM Parser, written in PHP5+, lets you manipulate HTML in a very easy way. Because it tolerates invalid HTML, this parser is better than other PHP scripts that rely on complicated regexes to extract information from web pages.
Before you can get the necessary info, a DOM has to be created from either a URL or a file. The following script extracts links and images from a website:
 
PHP code
    // Load the library (simple_html_dom.php ships in the download archive)
    include 'simple_html_dom.php';

    // Create DOM from URL or file
    $html = file_get_html('http://www.microsoft.com/');

    // Extract links
    foreach ($html->find('a') as $element)
        echo $element->href . '<br>';

    // Extract images
    foreach ($html->find('img') as $element)
        echo $element->src . '<br>';
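The loops above assume the page loaded and that every anchor has an href. A minimal sketch of a more defensive variant follows, working on an inline string so no network access is needed. The helper name collect_links is hypothetical, and the script assumes simple_html_dom.php from the archive sits next to it. Recent releases of the library return false from str_get_html() for empty or oversized input, which is what the check guards against.

```php
<?php
// Assumption: simple_html_dom.php from the downloaded archive is in the same directory.
include 'simple_html_dom.php';

// Hypothetical helper: return the href of every link found in an HTML string.
function collect_links($markup)
{
    // str_get_html() may return false (empty or oversized input), so check first.
    $html = str_get_html($markup);
    if ($html === false) {
        return array();
    }

    $links = array();
    // The attribute selector a[href] matches only anchors that carry an href.
    foreach ($html->find('a[href]') as $element) {
        $links[] = $element->href;
    }

    // Release the parser's internal references once we are done with the DOM.
    $html->clear();

    return $links;
}

print_r(collect_links('<p><a href="/a">A</a> <a name="x">no href</a> <a href="/b">B</a></p>'));
```

Returning an array instead of echoing inside the loop keeps the scraping step separate from the output step, so the same helper can feed a page, a file, or a database.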
The parser can also be used to modify HTML elements:
 
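PHP code
    // Create DOM from string
    $html = str_get_html('<div id="simple">Simple</div><div>Parser</div>');

    // Add a class to the second div (index 1)
    $html->find('div', 1)->class = 'bar';

    // Replace the content of the div whose id is "simple"
    $html->find('div[id=simple]', 0)->innertext = 'Foo';

    // Output: <div id="simple">Foo</div><div class="bar">Parser</div>
    echo $html;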
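Two further edits use the same property-assignment style: adding an attribute that does not exist yet, and removing an element by blanking its outertext. The sketch below is illustrative only. The markup, the ad class, and the title value are made up for the example, and simple_html_dom.php is again assumed to sit next to the script.

```php
<?php
// Assumption: simple_html_dom.php from the downloaded archive is in the same directory.
include 'simple_html_dom.php';

// Made-up markup for illustration.
$html = str_get_html('<div id="simple">Simple</div><div class="ad">Banner</div>');

// Assigning to an attribute that is not present adds it to the element.
$html->find('div[id=simple]', 0)->title = 'greeting';

// Assigning an empty outertext drops the element from the rendered output.
foreach ($html->find('div.ad') as $element) {
    $element->outertext = '';
}

// save() returns the modified document as a string; pass a file path to write it to disk instead.
$result = $html->save();
echo $result;
```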