The following code example shows how to use the pseudo-class selectors of the pyquery parsing library to work with HTML nodes:
Source code:
from pyquery import PyQuery as py
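html = '''
<ul>
<li class="li1">this is li1.</li>
<li class="li2">this is li2.</li>
<li class="li3">this is li3.</li>
<li class="li4">this is li4.</li>
<li class="li5">this is li5.</li>
</ul>
'''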
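doc = py(html)
# :first-child / :last-child match the first / last child of the parent
li = doc('li:first-child')
print('First')
print(li)
li = doc('li:last-child')
print('Last')
print(li)
# :nth-child counts from 1
li = doc('li:nth-child(2)')
print('Second')
print(li)
# :gt uses a zero-based index, so gt(3) matches everything after the 4th node
li = doc('li:gt(3)')
print('After the 4th (exclusive)')
print(li)
li = doc('li:nth-child(2n)')
print('Even positions')
print(li)
# :contains matches nodes whose text includes the given string
li = doc('li:contains(li4)')
print('Text contains li4')
print(li)
Output:
First
<li class="li1">this is li1.</li>
Last
<li class="li5">this is li5.</li>
Second
<li class="li2">this is li2.</li>
After the 4th (exclusive)
<li class="li5">this is li5.</li>
Even positions
<li class="li2">this is li2.</li>
<li class="li4">this is li4.</li>
Text contains li4
<li class="li4">this is li4.</li>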