Parsing the Baidu Tieba page source with XPath and scraping thread titles

# Fetch the page source
import requests
from lxml import etree

url = "https://tieba.baidu.com/f?kw=%E5%A4%A7%E6%95%B0%E6%8D%AE&ie=utf-8&pn=0"
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
htmlStr = response.content.decode("utf-8")
# print(htmlStr)

# Parse the HTML and select the text of the title links with XPath
content = etree.HTML(htmlStr)
titles = content.xpath("//li/div/div/div/div/a/text()")
print(titles)
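
To see what the XPath expression selects without depending on the live page, here is a minimal offline sketch. The HTML snippet and the helper name `extract_titles` are made up for illustration; the real Tieba markup is different and may change, so treat this only as a demonstration of how `//li/div/div/div/div/a/text()` matches nested elements.

```python
from lxml import etree

# Hypothetical markup, invented for this example; it only mimics the
# li > div > div > div > div > a nesting that the XPath expects.
SAMPLE_HTML = """
<html><body><ul>
  <li><div><div><div><div><a href="/p/1">First thread</a></div></div></div></div></li>
  <li><div><div><div><div><a href="/p/2">  Second thread  </a></div></div></div></div></li>
  <li><span><a href="/other">Not a title</a></span></li>
</ul></body></html>
"""


def extract_titles(html_text):
    """Return the stripped text of every <a> matched by the XPath."""
    tree = etree.HTML(html_text)
    raw = tree.xpath("//li/div/div/div/div/a/text()")
    # Drop surrounding whitespace and skip empty text nodes
    return [text.strip() for text in raw if text.strip()]


titles = extract_titles(SAMPLE_HTML)
print(titles)
```

The third `<li>` is skipped because its link sits under a `<span>` rather than four nested `<div>` elements. If the same call returns an empty list on the real page, the fetched HTML probably does not have this exact nesting, and printing `htmlStr` is a quick way to check.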
