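Selenium Study Notes: Selenium Operations in Detail

Selenium Operations Study Notes

Goal: use the Selenium automated testing tool to simulate a person operating a browser (registering on and logging in to web pages, scrolling drop-down lists, making selections, clicking with the mouse, and so on), as groundwork for later writing a small personal crawler that can get past anti-crawler mechanisms.
Environment: Windows, Python, PyCharm
Resource: https://www.selenium.dev/documentation/en/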

Main content

1. Locating elements

This is one of the most important topics: how to locate elements in a web page.

1.1 Locating a single element

driver.find_element(By.ID, "cheese")

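Tip: driver is an instance of WebDriver, and By is imported with from selenium.webdriver.common.by import By.

Once the element above has been located, the search can be narrowed further by starting from that result:

cheese = driver.find_element(By.ID, "cheese")
cheddar = cheese.find_element(By.ID, "cheddar")

The same element can also be located in a single call with a CSS selector:

cheddar = driver.find_element(By.CSS_SELECTOR, "#cheese #cheddar")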

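1.2 Locating multiple elements

Suppose the page contains a list element with the id cheese whose entries are li elements.

To locate every li item under cheese:

mucho_cheese = driver.find_elements(By.CSS_SELECTOR, "#cheese li")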

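1.3 Element selection strategies

WebDriver has eight built-in element selection strategies:

Locator              Description
class name           Locates elements whose class name contains the search value (compound class names are not permitted)
css selector         Locates elements matching a CSS selector
id                   Locates elements whose id attribute matches the search value
name                 Locates elements whose name attribute matches the search value
link text            Locates anchor elements whose visible text matches the search value
partial link text    Locates anchor elements whose visible text contains the search value; if several match, only the first is selected
tag name             Locates elements whose tag name matches the search value
xpath                Locates elements matching an XPath expression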
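
The eight strategy names in the table are plain strings, and in the Python bindings the By constants hold these same strings (for example By.ID == "id"). The sketch below uses that to check a strategy name before passing it to the driver. STRATEGIES and locate_all are hypothetical names used for illustration, not part of Selenium.

```python
# The eight locator strategy strings from the table above.
STRATEGIES = {
    "class name", "css selector", "id", "name",
    "link text", "partial link text", "tag name", "xpath",
}


def locate_all(driver, strategy, value):
    """Hypothetical helper: validate the strategy name, then delegate to the driver."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown locator strategy: {strategy!r}")
    # driver is assumed to be a WebDriver (or anything with find_elements).
    return driver.find_elements(strategy, value)
```

With a real driver, locate_all(driver, "css selector", "#cheese li") would return the same list as the find_elements call in 1.2.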

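1.4 Other locators (relative locators)

Once the element you want has been located, the following methods locate other elements by their position relative to it (Python method names; the Java equivalents of the third and fourth are toLeftOf() and toRightOf()):

above()          locates an element above the current element
below()          locates an element below the current element
to_left_of()     locates an element to the left of the current element
to_right_of()    locates an element to the right of the current element
near()           locates an element at most 50 pixels away from the current element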

2. Actions

2.1 Locating an element and typing text

name = "Charles"
driver.find_element(By.NAME, "name").send_keys(name) 

2.2 Drag and drop

from selenium.webdriver import ActionChains

source = driver.find_element(By.ID, "source")
target = driver.find_element(By.ID, "target")
ActionChains(driver).drag_and_drop(source, target).perform()

2.3 Clicking an element

driver.find_element(By.CSS_SELECTOR, "input[type='submit']").click()
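
Typing and clicking are usually combined when filling in a login or registration form. The following is a minimal sketch, assuming every field can be found by its name attribute. fill_form and its parameters are hypothetical names. The locator strings "name" and "css selector" stand in for By.NAME and By.CSS_SELECTOR so that the helper needs no imports.

```python
def fill_form(driver, values, submit_css="input[type='submit']"):
    """Type each value into the field with the matching name, then click submit.

    values maps a field's name attribute to the text to type into it.
    """
    for field_name, text in values.items():
        driver.find_element("name", field_name).send_keys(text)
    # Click the submit control once every field has been filled in.
    driver.find_element("css selector", submit_css).click()
```

For the example in 2.1 this would be called as fill_form(driver, {"name": "Charles"}).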

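3. Instantiating WebDriver

from selenium.webdriver import Chrome

driver = Chrome()

Alternatively, use the driver as a context manager so that the browser is closed automatically when the block ends:

from selenium.webdriver import Chrome

with Chrome() as driver:
    # your code inside this indent
    driver.get("https://selenium.dev")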

4. Browser operations

4.1 Opening a web page

driver.get("https://selenium.dev")

4.2 Getting the current URL

Read the current URL from the browser's address bar:

driver.current_url

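4.3 Pressing the browser's back button

driver.back()

4.4 Pressing the browser's forward button

driver.forward()

4.5 Refreshing the current page

driver.refresh()

4.6 Reading the current page title from the browser

driver.title

4.7 Getting the handle of the current window

driver.current_window_handle

4.8 Switching to a new window opened by clicking a link

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC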

# Start the driver
with webdriver.Chrome() as driver:
    # Open URL
    driver.get("https://seleniumhq.github.io")

    # Setup wait for later
    wait = WebDriverWait(driver, 10)

    # Store the ID of the original window
    original_window = driver.current_window_handle

    # Check we don't have other windows open already
    assert len(driver.window_handles) == 1

    # Click the link which opens in a new window
    driver.find_element(By.LINK_TEXT, "new window").click()

    # Wait for the new window or tab
    wait.until(EC.number_of_windows_to_be(2))

    # Loop through until we find a new window handle
    for window_handle in driver.window_handles:
        if window_handle != original_window:
            driver.switch_to.window(window_handle)
            break

    # Wait for the new tab to finish loading content
    wait.until(EC.title_is("SeleniumHQ Browser Automation"))
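
The loop over window handles in the example above can be pulled out into a small reusable function. This is only a sketch: switch_to_new_window is a hypothetical name, and driver is assumed to be a WebDriver, or anything else that exposes window_handles and switch_to.window.

```python
def switch_to_new_window(driver, known_handles):
    """Switch to the first window handle not in known_handles and return it.

    Returns None, without switching, if no new window has appeared.
    """
    known = set(known_handles)
    for handle in driver.window_handles:
        if handle not in known:
            driver.switch_to.window(handle)
            return handle
    return None
```

In the example above it would be called as switch_to_new_window(driver, [original_window]) after the wait for two windows has succeeded.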

4.9 Creating a new window and switching to it

# Opens a new tab and switches to new tab
driver.switch_to.new_window('tab')

# Opens a new window and switches to new window
driver.switch_to.new_window('window')

4.10 Closing a window or tab

#Close the tab or window
driver.close()

#Switch back to the old tab or window
driver.switch_to.window(original_window)

4.11 Quitting the browser

driver.quit()
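
If a script raises an exception before it reaches driver.quit(), the browser process may be left running. One way to guard against that is a try/finally block, sketched here with a hypothetical helper run_and_quit. make_driver is assumed to be any callable that returns a WebDriver, such as Chrome.

```python
def run_and_quit(make_driver, task):
    """Create a driver, run task(driver), and always call driver.quit()."""
    driver = make_driver()
    try:
        return task(driver)
    finally:
        # Runs whether task returned normally or raised an exception.
        driver.quit()
```

With Selenium installed, this could be used as run_and_quit(Chrome, lambda d: d.get("https://selenium.dev")).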

4.12 Frames and iframes: clicking a button inside an iframe

The code is as follows.

Method 1: switch using a WebElement

# Store iframe web element
iframe = driver.find_element(By.CSS_SELECTOR, "#modal > iframe")

# switch to selected iframe
driver.switch_to.frame(iframe)

# Now click on button
driver.find_element(By.TAG_NAME, 'button').click()

Method 2: switch using the frame's name or id

# Switch frame by id
driver.switch_to.frame('buttonframe')

# Now, Click on the button
driver.find_element(By.TAG_NAME, 'button').click()  

Method 3: switch using the frame's index

# Switch to the second frame
driver.switch_to.frame(1)

4.13 Leaving a frame or frameset

driver.switch_to.default_content()
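
Entering a frame (4.12) and leaving it again (4.13) always come as a pair, so the two calls can be wrapped in a context manager. This is a sketch under that assumption. in_frame is a hypothetical name, and frame_reference may be a WebElement, a name or id string, or an index, as in the three methods in 4.12.

```python
from contextlib import contextmanager


@contextmanager
def in_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Always return to the top-level document, even after an error.
        driver.switch_to.default_content()
```

With this helper, Method 2 could be written as: with in_frame(driver, "buttonframe"): followed by the indented click on the button.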

5. Window management

5.1 Getting the window size

# Access each dimension individually
width = driver.get_window_size().get("width")
height = driver.get_window_size().get("height")

# Or store the dimensions and query them later
size = driver.get_window_size()
width1 = size.get("width")
height1 = size.get("height")  

5.2 Setting the window size

driver.set_window_size(1024, 768)

5.3 Getting the window position

# Access each dimension individually
x = driver.get_window_position().get('x')
y = driver.get_window_position().get('y')

# Or store the dimensions and query them later
position = driver.get_window_position()
x1 = position.get('x')
y1 = position.get('y')  

5.4 Setting the window position

# Move the window to the top left of the primary monitor
driver.set_window_position(0, 0)

5.5 Maximizing the window

driver.maximize_window()

5.6 Minimizing the window

driver.minimize_window()

5.7 Making the window fullscreen

driver.fullscreen_window()

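6. Waits

Example: the test page is saved as a file at the path file:///race_condition.html. The page is titled "Race Condition Example"; its script adds a p element with the text "Hello from JavaScript!" and sets the global variable initialised, which the examples below wait for.

6.1 Explicit waits (waiting for the page to finish loading)

Method 1: wait for a custom condition function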

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

def document_initialised(driver):
    return driver.execute_script("return initialised")

driver.get("file:///race_condition.html")
# WebDriverWait requires a timeout in seconds
WebDriverWait(driver, timeout=10).until(document_initialised)
el = driver.find_element(By.TAG_NAME, "p")
assert el.text == "Hello from JavaScript!"

Method 2: wait for the element itself

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

driver.get("file:///race_condition.html")
el = WebDriverWait(driver, timeout=10).until(lambda d: d.find_element(By.TAG_NAME, "p"))
assert el.text == "Hello from JavaScript!"

Method 3: choose the timeout yourself

WebDriverWait(driver, timeout=3).until(some_condition)

6.2 Implicit waits

driver = Chrome()
driver.implicitly_wait(10)
driver.get("http://somedomain/url_that_delays_loading")
my_dynamic_element = driver.find_element(By.ID, "myDynamicElement")

6.3 FluentWait

from selenium.common.exceptions import ElementNotVisibleException, ElementNotSelectableException

driver = Chrome()
driver.get("http://somedomain/url_that_delays_loading")
wait = WebDriverWait(driver, 10, poll_frequency=1, ignored_exceptions=[ElementNotVisibleException, ElementNotSelectableException])
element = wait.until(EC.element_to_be_clickable((By.XPATH, "//div")))
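
To make the roles of timeout, poll_frequency and ignored_exceptions concrete, here is a simplified polling loop in plain Python. It illustrates the idea only and is not Selenium's actual implementation. wait_until is a hypothetical name, and the built-in TimeoutError stands in for Selenium's own timeout exception.

```python
import time


def wait_until(condition, driver, timeout=10.0, poll_frequency=0.5,
               ignored_exceptions=()):
    """Call condition(driver) repeatedly until it returns a truthy value.

    Exceptions listed in ignored_exceptions count as "not ready yet".
    TimeoutError is raised if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition(driver)
            if result:
                return result
        except tuple(ignored_exceptions):
            pass  # swallow the exception and poll again
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_frequency)
```

A shorter poll_frequency notices the condition sooner at the cost of more calls, and a longer timeout tolerates slower pages.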

7. JavaScript alerts, prompts and confirmations

7.1 Handling alerts

# Click the link to activate the alert
driver.find_element(By.LINK_TEXT, "See an example alert").click()

# Wait for the alert to be displayed and store it in a variable
alert = wait.until(expected_conditions.alert_is_present())

# Store the alert text in a variable
text = alert.text

# Press the OK button
alert.accept()  

7.2 Handling confirmations

# Click the link to activate the alert
driver.find_element(By.LINK_TEXT, "See a sample confirm").click()

# Wait for the alert to be displayed
wait.until(expected_conditions.alert_is_present())

# Store the alert in a variable for reuse
alert = driver.switch_to.alert

# Store the alert text in a variable
text = alert.text

# Press the Cancel button
alert.dismiss()  

7.3 Handling prompts

# Click the link to activate the alert
driver.find_element(By.LINK_TEXT, "See a sample prompt").click()

# Wait for the alert to be displayed
wait.until(expected_conditions.alert_is_present())

# Store the alert in a variable for reuse
# (requires: from selenium.webdriver.common.alert import Alert)
alert = Alert(driver)

# Type your message
alert.send_keys("Selenium")

# Press the OK button
alert.accept()  
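
The three cases above differ only in whether text is typed and whether the dialog is accepted or dismissed, so they can share one helper. This is a sketch. respond_to_dialog is a hypothetical name, and alert is assumed to be the object obtained from driver.switch_to.alert or returned by the wait.

```python
def respond_to_dialog(alert, accept=True, reply=None):
    """Read a dialog's text, optionally type a reply, then accept or dismiss it."""
    text = alert.text
    if reply is not None:
        alert.send_keys(reply)  # only meaningful for prompts
    if accept:
        alert.accept()          # the OK button
    else:
        alert.dismiss()         # the Cancel button
    return text
```

For instance, the confirmation in 7.2 corresponds to respond_to_dialog(alert, accept=False) and the prompt in 7.3 to respond_to_dialog(alert, reply="Selenium").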

8. HTTP proxy

from selenium import webdriver

PROXY = ""  # fill in the proxy address in host:port form
webdriver.DesiredCapabilities.CHROME['proxy'] = {
    "httpProxy": PROXY,
    "ftpProxy": PROXY,
    "sslProxy": PROXY,
    "proxyType": "MANUAL",
}

with webdriver.Chrome() as driver:
    # Open URL
    driver.get("https://selenium.dev")  

9. Page loading strategy

9.1 normal (wait for the entire page to load)

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'normal'
driver = webdriver.Chrome(options=options)
# Navigate to url
driver.get("http://www.google.com")
driver.quit()  

9.2 eager (wait only until the HTML document has been loaded and parsed; stylesheets, images and subframes are not waited for)

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'eager'
driver = webdriver.Chrome(options=options)
# Navigate to url
driver.get("http://www.google.com")
driver.quit()  

9.3 none (wait only until the initial page has been downloaded)

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'none'
driver = webdriver.Chrome(options=options)
# Navigate to url
driver.get("http://www.google.com")
driver.quit()  
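
As I understand the Selenium documentation, each strategy corresponds to a value of document.readyState that get() waits for: complete for normal, interactive for eager, and no particular state for none. The sketch below only models that mapping in plain Python and does not talk to a browser. READY_STATES, READY_STATE_BY_STRATEGY and satisfies_strategy are hypothetical names.

```python
# Order in which document.readyState progresses while a page loads.
READY_STATES = ["loading", "interactive", "complete"]

# readyState each strategy waits for; None means "do not wait at all".
READY_STATE_BY_STRATEGY = {
    "normal": "complete",
    "eager": "interactive",
    "none": None,
}


def satisfies_strategy(strategy, ready_state):
    """Return True if a page in ready_state is loaded enough for strategy."""
    required = READY_STATE_BY_STRATEGY[strategy]  # KeyError for unknown names
    if required is None:
        return True
    return READY_STATES.index(ready_state) >= READY_STATES.index(required)
```

Under this model a page whose readyState is interactive is ready for eager and none but not yet for normal.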

10. Web elements

10.1 Finding an element

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()

driver.get("http://www.google.com")

# Get search box element from webElement 'q' using Find Element
search_box = driver.find_element(By.NAME, "q")

search_box.send_keys("webdriver")  

10.2 Finding multiple elements

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()

# Navigate to Url
driver.get("https://www.example.com")

# Get all the elements available with tag name 'p'
elements = driver.find_elements(By.TAG_NAME, 'p')

for e in elements:
    print(e.text)

10.3 Finding a single element within an element

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("http://www.google.com")
search_form = driver.find_element(By.TAG_NAME, "form")
search_box = search_form.find_element(By.NAME, "q")
search_box.send_keys("webdriver")  

10.4 Finding multiple elements within an element

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.example.com")

# Get element with tag name 'div'
element = driver.find_element(By.TAG_NAME, 'div')

# Get all the elements available with tag name 'p'
elements = element.find_elements(By.TAG_NAME, 'p')
for e in elements:
    print(e.text)

10.5 Getting the active element

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.google.com")
driver.find_element(By.CSS_SELECTOR, '[name="q"]').send_keys("webElement")

# Get attribute of current active element
attr = driver.switch_to.active_element.get_attribute("title")
print(attr)

11. Keyboard actions

11.1 sendKeys

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome()

# Navigate to url

driver.get("http://www.google.com")

# Enter "webdriver" text and perform "ENTER" keyboard action

driver.find_element(By.NAME, "q").send_keys("webdriver" + Keys.ENTER)  

11.2 keyDown

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome()

# Navigate to url

driver.get("http://www.google.com")

# Enter "webdriver" text and perform "ENTER" keyboard action

driver.find_element(By.NAME, "q").send_keys("webdriver" + Keys.ENTER)

# Perform action ctrl + A (modifier CONTROL + Alphabet A) to select the page

webdriver.ActionChains(driver).key_down(Keys.CONTROL).send_keys("a").perform()  

11.3 keyUp

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome()

# Navigate to url

driver.get("http://www.google.com")

# Store google search box WebElement

search = driver.find_element(By.NAME, "q")

action = webdriver.ActionChains(driver)

# Enters text "qwerty" with keyDown SHIFT key and after keyUp SHIFT key (QWERTYqwerty)

action.key_down(Keys.SHIFT).send_keys_to_element(search, "qwerty").key_up(Keys.SHIFT).send_keys("qwerty").perform()  

11.4 clear

from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()

# Navigate to url

driver.get("http://www.google.com")
# Store 'SearchInput' element

SearchInput = driver.find_element(By.NAME, "q")
SearchInput.send_keys("selenium")
# Clears the entered text

SearchInput.clear()  
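
send_keys typically adds to whatever the field already contains, so replacing a value means clearing the field first. A minimal sketch of that pairing follows. replace_text is a hypothetical name, and element is assumed to be an editable WebElement such as the search box above.

```python
def replace_text(element, new_text):
    """Clear an editable element, then type new_text into it."""
    element.clear()
    element.send_keys(new_text)
```

For the example above, replace_text(SearchInput, "webdriver") would leave only "webdriver" in the search box.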
