Extracting commented-out code from HTML in Python

Suppose the commented-out snippet looks like this:

html="""

"""

If you pass this snippet straight to pyquery and try to select from it:

from pyquery import PyQuery as pq

response = pq(html)("div.forum_content")
print(response)

it raises an error: lxml.etree.ParserError: Document is empty
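The error is consistent with how HTML parsers generally treat comments: markup inside `<!-- ... -->` is kept as opaque comment text and is never turned into elements, so a document that consists only of a comment has no element to select from. The sketch below illustrates that behavior with the standard library's `html.parser` instead of lxml, so it shows the general idea rather than lxml's internals. The `NodeCounter` class and the `sample` string are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser


class NodeCounter(HTMLParser):
    """Record every start tag and every comment the parser reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.comments = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_comment(self, data):
        self.comments.append(data)


# Hypothetical input: the div exists only inside an HTML comment.
sample = '<!-- <div class="forum_content">hidden</div> -->'

counter = NodeCounter()
counter.feed(sample)
counter.close()

print(counter.start_tags)  # no elements were seen
print(counter.comments)    # the div markup arrives as plain comment text
```

The parser reports one comment and no start tags, which is why the commented-out markup has to be pulled out and parsed a second time before any selector can match it.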

Solution: use bs4 to pull out the commented-out snippet first, then parse it with pyquery and extract.

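from pyquery import PyQuery as pq
from bs4 import BeautifulSoup, Comment

soup = BeautifulSoup(html, 'html.parser')

# Collect the text of every comment node (find_all/string replace the deprecated findAll/text)
res = ''.join(soup.find_all(string=lambda text: isinstance(text, Comment)))

# Parse the recovered markup and select from it
response = pq(res)("div.forum_content")
print(response)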

Result: the element is extracted as expected.
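Since the sample HTML above did not survive, here is a minimal self-contained sketch of the same two-step idea with an input you can actually run. It assumes a made-up `SAMPLE_HTML` string, and it re-parses the recovered markup with BeautifulSoup's own `select` instead of pyquery so that only one third-party package is needed. The helper names `extract_commented_markup` and `select_in_comments` are hypothetical.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical sample input: the target div exists only inside a comment.
SAMPLE_HTML = """
<html><body>
<!-- <div class="forum_content"><p>Hello from a comment</p></div> -->
<p>Visible text</p>
</body></html>
"""


def extract_commented_markup(html):
    """Return the text of every HTML comment in `html`, joined into one string."""
    soup = BeautifulSoup(html, "html.parser")
    comments = soup.find_all(string=lambda node: isinstance(node, Comment))
    return "".join(comments)


def select_in_comments(html, css_selector):
    """Re-parse the commented-out markup and run a CSS selector on it."""
    inner = BeautifulSoup(extract_commented_markup(html), "html.parser")
    return inner.select(css_selector)


matches = select_in_comments(SAMPLE_HTML, "div.forum_content")
for tag in matches:
    print(tag.get_text(strip=True))
```

Joining all comments into one string is convenient for a page with a single commented-out block. If a page contains several unrelated comments, it may be safer to parse each comment separately.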
