BeautifulSoup4
Like lxml, Beautiful Soup is an HTML/XML parser; its main job is likewise to parse HTML/XML documents and extract data from them.
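from bs4 import BeautifulSoup

# Read the local HTML file and parse it with Python's built-in parser
with open('test.html', 'r', encoding='utf-8') as f:
    soup = BeautifulSoup(f.read(), 'html.parser')
# CSS selector: <a> inside <h3> inside div.article-info inside each <li> of a <ul>
titles = soup.select('ul > li > div.article-info > h3 > a')
for title in titles:
    print(title.text)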
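Since the contents of test.html are not shown, here is a minimal self-contained sketch of the same selector run against an inline HTML string. The markup in SAMPLE_HTML and the helper name extract_titles are assumptions made up for illustration; only BeautifulSoup, select, get_text, and get come from the library itself.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for test.html (the real file is not shown).
SAMPLE_HTML = """
<ul>
  <li>
    <div class="article-info">
      <h3><a href="/post/1">First post</a></h3>
    </div>
  </li>
  <li>
    <div class="article-info">
      <h3><a href="/post/2">Second post</a></h3>
    </div>
  </li>
</ul>
"""


def extract_titles(html):
    """Return (link text, href) pairs for every article title link."""
    soup = BeautifulSoup(html, 'html.parser')
    # '>' is the child combinator: each element must be a direct child
    # of the one before it, so nested or unrelated <a> tags are skipped.
    links = soup.select('ul > li > div.article-info > h3 > a')
    return [(a.get_text(strip=True), a.get('href')) for a in links]


titles = extract_titles(SAMPLE_HTML)
for text, href in titles:
    print(text, href)
```

Returning a list of tuples instead of printing inside the loop keeps the extraction step reusable, for example when the titles need to be written to a file afterwards.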
BeautifulSoup detailed tutorial
BeautifulSoup usage examples