BeautifulSoup Study Notes

from BeautifulSoup import BeautifulSoup
import re
 
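doc = ['<html><head><title>Page title</title></head>',
       '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
       '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
       '</html>']
soup = BeautifulSoup(''.join(doc))

# Prints the parse tree as indented HTML, one tag or string per line.
print soup.prettify()

print soup.contents[0].name
# html

print soup.contents[0].contents[0].name
# head

# Walk the direct children of <html> and print each tag name.
for i in range(len(soup.contents[0])):
    print soup.contents[0].contents[i].name
# head
# body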
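The code in these notes uses the BeautifulSoup 3 import path and Python 2 print statements. As a rough sketch only, the same navigation might look like this under Python 3 with the third-party bs4 package. The closing </p> and </body> tags, the explicit "html.parser" argument, and the variable names are assumptions made here so that the result does not depend on which parser is installed; they do not come from the original notes.

```python
from bs4 import BeautifulSoup  # assumption: the bs4 package is installed

# Same sample document, written with explicit closing tags (an assumption).
html = (
    '<html><head><title>Page title</title></head>'
    '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.</p>'
    '<p id="secondpara" align="blah">This is paragraph <b>two</b>.</p>'
    '</body></html>'
)

# Name the parser explicitly so bs4 does not pick one based on what is installed.
soup = BeautifulSoup(html, "html.parser")

pretty = soup.prettify()               # indented rendering of the parse tree
root_name = soup.contents[0].name      # the top-level tag
child_names = [child.name for child in soup.contents[0].contents]

print(pretty)
print(root_name)
print(child_names)
```

Iterating over `.contents` directly avoids the index arithmetic of `range(len(...))` while visiting the same children.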


 

titleTag = soup.html.head.title
titleTag
# <title>Page title</title>
 
titleTag.string
# u'Page title'
 
len(soup('p'))
# 2
 
soup.findAll('p', align="center")
# [<p id="firstpara" align="center">This is paragraph <b>one</b>.</p>]

soup.find('p', align="center")
# <p id="firstpara" align="center">This is paragraph <b>one</b>.</p>

soup('p', align="center")[0]['id']
# u'firstpara'

soup.find('p', align=re.compile('^b.*'))['id']
# u'secondpara'

soup.find('p').b.string
# u'one'

soup('p')[1].b.string
# u'two'
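The regular-expression query returns the second paragraph because only its align value starts with "b". The sketch below reproduces that filtering with the standard re module alone, assuming the pattern is applied to each attribute value with search(), as the Beautiful Soup 4 documentation describes. The align_values dictionary is a hypothetical stand-in for the attributes of the two paragraphs, not part of the library.

```python
import re

# Hypothetical stand-in: paragraph id -> value of its align attribute.
align_values = {'firstpara': 'center', 'secondpara': 'blah'}

# Same pattern as in the notes: the value must begin with "b".
pattern = re.compile('^b.*')

# Keep the ids whose align value matches the pattern.
matching_ids = sorted(
    para_id for para_id, align in align_values.items()
    if pattern.search(align)
)

print(matching_ids)  # ['secondpara']
```

Because the pattern is anchored with ^, "center" is rejected, so only 'secondpara' remains.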
