[playwright] Ways to open different links

every blog every motto: You can do more than you think.
https://blog.csdn.net/weixin_39190382?type=blog

0. Preface

This post compares two ways of opening different pages:

browser.new_page()
page = context.new_page()

1. Visiting different URLs

1.1 Method 1

browser.new_page()
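Each call to browser.new_page() creates the page in its own new browser context, so in headed mode every link opens in a separate browser window.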

from playwright.sync_api import sync_playwright

url = 'https://www.baidu.com/'

def run(playwright):
    chromium = playwright.chromium  # or "firefox" or "webkit".
    browser = chromium.launch(headless=False, slow_mo=100)
    page = browser.new_page()
    page.goto(url)

    page.wait_for_load_state('networkidle')
    elements = page.query_selector_all('#s-top-left > a')
    print(elements)
    for ele in elements:
        href = ele.get_attribute('href')
        text = ele.inner_text()
        print(href, text)
        # Each browser.new_page() call creates the page in a new browser context
        new_page = browser.new_page()
        new_page.goto(href)

    browser.close()

with sync_playwright() as playwright:
    run(playwright)
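
The href read in the loop above is passed straight to goto(), but get_attribute('href') returns None when the attribute is missing, and a relative or javascript: link is not something goto() can open. A minimal sketch of a filter that could sit in front of goto(), assuming a hypothetical helper named normalize_href (it is not part of Playwright, and the sample links below are made up for illustration):

```python
from urllib.parse import urljoin, urlparse


def normalize_href(base_url, href):
    """Return an absolute http(s) URL for href, or None if it should be skipped.

    normalize_href is a hypothetical helper, not a Playwright API.
    """
    # A missing attribute arrives as None; treat blank values the same way
    href = (href or "").strip()
    if not href:
        return None
    # Resolve relative and protocol-relative links against the page URL
    absolute = urljoin(base_url, href)
    # Skip schemes such as "javascript:" or "mailto:"
    if urlparse(absolute).scheme not in ("http", "https"):
        return None
    return absolute


base = "https://www.baidu.com/"
# Sample values are assumptions, not links taken from the real page
for raw in ["http://news.baidu.com", "//map.baidu.com", "/more/", "javascript:;", None]:
    print(raw, "->", normalize_href(base, raw))
```

Inside the loop, the result would be checked before navigating, and links for which the helper returns None would simply be skipped.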


1.2 Method 2

page = context.new_page()
Open multiple tabs in a single browser window by creating every page from the same context.

from playwright.sync_api import sync_playwright

url = 'https://www.baidu.com/'

def run(playwright):

    chromium = playwright.chromium  # or "firefox" or "webkit".
    browser = chromium.launch(headless=False, slow_mo=100)
    context = browser.new_context()
    page = context.new_page()

    page.goto(url)
    # Wait for the page to finish loading
    page.wait_for_load_state('networkidle')
    elements = page.query_selector_all('#s-top-left > a')
    print(elements)
    for ele in elements:
        href = ele.get_attribute('href')
        text = ele.inner_text()
        print(href, text)
        # Open each link in a new tab of the same context
        new_tab = context.new_page()
        new_tab.goto(href)

    browser.close()

with sync_playwright() as playwright:
    run(playwright)

