Scraping Dynamic Web Pages (Part 2)

9. Explicit and Implicit Waits in Selenium

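More and more pages load their content with Ajax, so the data you want may not be in the DOM yet when the page load returns, and you cannot know in advance when it will arrive. Looking up an element before it appears raises NoSuchElementException, so Selenium provides two ways to wait.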

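(1) Implicit wait: driver.implicitly_wait(). It sets a timeout, in seconds, that applies to every later element lookup on this driver. Sample code: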

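# -*- coding: utf-8 -*-

from selenium import webdriver

# driver_path is where chromedriver is stored
driver_path = r"D:\Python3\chromedriver_win32\chromedriver.exe"
# Note: executable_path is the Selenium 3 style. Selenium 4 deprecated it in
# favor of webdriver.Chrome(service=Service(driver_path)), where Service is
# imported from selenium.webdriver.chrome.service.
driver = webdriver.Chrome(executable_path=driver_path)

# Implicit wait: element lookups keep retrying for up to 5 seconds before failing
driver.implicitly_wait(5)

driver.get("https://www.douban.com/")
# driver.quit()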
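
The listing above sets the implicit wait but never looks up an element, which is the only place the setting takes effect. The sketch below shows that step. The helper name find_by_id_with_implicit_wait is hypothetical (it is not part of Selenium), and the string "id" is the locator strategy that By.ID stands for.

```python
def find_by_id_with_implicit_wait(driver, element_id, seconds=5):
    """Set an implicit wait on `driver`, then look up an element by its id."""
    # The implicit wait is a driver-wide setting: it applies to later
    # element lookups, not to driver.get().
    driver.implicitly_wait(seconds)
    # If the element is not there yet, Selenium keeps retrying for up to
    # `seconds` before raising NoSuchElementException.
    return driver.find_element("id", element_id)


# Typical use with a real browser (assumes chromedriver can be located):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get("https://www.douban.com/")
#   box = find_by_id_with_implicit_wait(driver, "form_email")
#   driver.quit()
```

Because the setting lives on the driver, calling implicitly_wait() once right after creating the driver is normally enough. The helper sets it on every call only to keep the example in one piece.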

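(2) Explicit wait: WebDriverWait(driver, 10).until(). It blocks until the given condition holds and raises TimeoutException if that has not happened within 10 seconds. Sample code: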

# -*- coding: utf-8 -*-

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait  # explicit wait
from selenium.webdriver.support import expected_conditions as EC  # wait conditions
from selenium.webdriver.common.by import By

# driver_path is where chromedriver is stored
driver_path = r"D:\Python3\chromedriver_win32\chromedriver.exe"
# Note: executable_path is the Selenium 3 style. Selenium 4 deprecated it in
# favor of webdriver.Chrome(service=Service(driver_path)).
driver = webdriver.Chrome(executable_path=driver_path)
driver.get("https://www.douban.com/")

# Explicit wait: block until the condition holds, for at most 10 seconds
WebDriverWait(driver, 10).until(
    # EC.presence_of_all_elements_located((By.ID, '14fgrrr'))  # no such ID: waits 10 seconds, then raises TimeoutException
    EC.presence_of_all_elements_located((By.ID, 'form_email'))  # ID exists: returns as soon as it is found, no exception
)

driver.quit()
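
To make the behavior of the explicit wait easier to picture, here is a simplified, pure-Python sketch of the polling idea behind it. This is not Selenium's actual implementation: poll_until is a hypothetical name, and the real WebDriverWait passes the driver to the condition, ignores NoSuchElementException by default, and raises TimeoutException rather than the built-in TimeoutError. The 0.5-second polling interval matches WebDriverWait's default.

```python
import time


def poll_until(condition, timeout=10.0, poll_frequency=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Condition met: return immediately, without using up the timeout.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # Not ready yet: sleep briefly, then check again.
        time.sleep(poll_frequency)
```

This is why the existing-ID case in the listing returns right away while the missing-ID case takes the full 10 seconds: the timeout is an upper bound, not a fixed delay.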

10. Opening Multiple Windows and Switching Between Pages in Selenium
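
As a minimal sketch of what this heading refers to, and not necessarily the approach the original post goes on to use: a new window can be opened with JavaScript, after which the driver has to be switched to it explicitly. The helper name open_in_new_window is hypothetical. execute_script, window_handles and switch_to.window are standard WebDriver members.

```python
def open_in_new_window(driver, url):
    """Open `url` in a new window or tab and switch the driver to it.

    Returns the handle of the new window.
    """
    # Remember the windows that already exist so the new one can be identified.
    known = set(driver.window_handles)
    # window.open() is plain JavaScript; the driver itself stays on the old window.
    driver.execute_script("window.open(arguments[0]);", url)
    # The new window is the only handle that was not known before.
    new_handle = next(h for h in driver.window_handles if h not in known)
    # Without this switch, later commands would still act on the old window.
    driver.switch_to.window(new_handle)
    return new_handle
```

To go back to the first page, save driver.current_window_handle before calling the helper and pass it to driver.switch_to.window() afterwards.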

 
