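Selenium Notes (1): Installation and Basic Usage
Introduction
Selenium is a tool for testing web applications.
Selenium tests run directly in the browser, just as a real user would operate it. Supported browsers include IE (7, 8, 9, 10, 11), Firefox, Safari, Chrome, and Opera.
Its main uses are browser compatibility testing (checking that your application works well across different browsers and operating systems) and functional testing (building regression tests that verify software behaviour against user requirements).
In web scraping, it is used to simulate a normal user visiting a page and to collect the data.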
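Installation
ChromeDriver (browser driver) installation
To drive Chrome with Selenium you need to download chromedriver, and the chromedriver version must match your Chrome version; a mismatch causes errors at runtime.
ChromeDriver download page: https://chromedriver.storage.googleapis.com/index.html
ChromeDriver to Chrome version mapping:
chromedriver version | Supported Chrome versions |
---|---|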
v2.37 | v64-66 |
v2.36 | v63-65 |
v2.35 | v62-64 |
v2.34 | v61-63 |
v2.33 | v60-62 |
v2.32 | v59-61 |
v2.31 | v58-60 |
v2.30 | v58-60 |
v2.29 | v56-58 |
v2.28 | v55-57 |
v2.27 | v54-56 |
v2.26 | v53-55 |
v2.25 | v53-55 |
v2.24 | v52-54 |
v2.23 | v51-53 |
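The lookup in the table can also be done in code. The following is a minimal sketch that copies the ranges from the table into a dictionary; the names DRIVER_TO_CHROME and compatible_drivers are made up for this illustration and are not part of Selenium or chromedriver.

```python
# Version ranges copied from the mapping table: driver -> (lowest, highest) Chrome major version.
DRIVER_TO_CHROME = {
    "2.37": (64, 66),
    "2.36": (63, 65),
    "2.35": (62, 64),
    "2.34": (61, 63),
    "2.33": (60, 62),
    "2.32": (59, 61),
    "2.31": (58, 60),
    "2.30": (58, 60),
    "2.29": (56, 58),
    "2.28": (55, 57),
    "2.27": (54, 56),
    "2.26": (53, 55),
    "2.25": (53, 55),
    "2.24": (52, 54),
    "2.23": (51, 53),
}


def compatible_drivers(chrome_major):
    """Return the chromedriver versions whose range covers chrome_major (hypothetical helper)."""
    return [
        driver
        for driver, (low, high) in DRIVER_TO_CHROME.items()
        if low <= chrome_major <= high
    ]


if __name__ == "__main__":
    # Chrome 64 falls inside three ranges in the table.
    print(compatible_drivers(64))
```

An empty list means the table has no entry for that Chrome version, in which case the download page is the place to check.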
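Mac/Linux
After downloading and unzipping, move the file into the /usr/local/bin directory and it is ready to use.
Windows
After downloading and unzipping, move the file into a folder that is on your PATH environment variable, for example your Python installation folder.
Alternatively, place the driver file in the same folder as your script.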
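Whether the driver really ended up on the PATH can be checked before starting a browser. This is a small sketch using only the standard library; find_driver is a hypothetical helper name, and "chromedriver" is assumed to be the file name of the unpacked driver.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver not found on PATH")
    else:
        print("chromedriver found at", path)
```

On Windows, shutil.which also tries the usual executable extensions, so the same call should find chromedriver.exe.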
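Selenium installation
Installing Selenium is simple; pip handles it directly.
pip install selenium
Basic usage
Running Chrome headless
Headless mode is a Chrome feature released in 2017. It requires a Chrome version above 57 on Unix-like systems and above 58 on Windows.
The code for running Chrome headless with Selenium is as follows: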
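from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Create a startup-options object
chrome_options = Options()
# Run the browser without a visible window
chrome_options.add_argument('--headless')
# The official docs say this flag will go away in later versions, but it is still needed for now
chrome_options.add_argument('--disable-gpu')
# Fix the window size when configuring the browser; different sizes can change how the page is parsed
chrome_options.add_argument('--window-size=1366,768')
# Start the browser (older Selenium 3 releases spell this keyword chrome_options=)
browser = webdriver.Chrome(options=chrome_options)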
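Running the code above opens a blank page in a headless Chrome; remove the --headless line to see the browser window.
A simple Selenium example
This example opens the Baidu home page, types Python into the search box, and clicks the search button.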
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

# Open a Chrome browser
browser = webdriver.Chrome()
# Request the Baidu home page
browser.get('https://www.baidu.com')
# Locate the search box (named search_box so the built-in input() is not shadowed)
search_box = WebDriverWait(browser, 10).until(
    EC.presence_of_element_located((By.XPATH, '//*[@id="kw"]'))
)
# Type Python into the search box
search_box.send_keys('Python')
# Locate the search button
button = WebDriverWait(browser, 10).until(
    EC.element_to_be_clickable((By.XPATH, '//*[@id="su"]'))
)
# Click the search button once
button.click()
browser.quit()
The same flow as a standalone script:
# -*- coding: utf-8 -*-
# 斌彬's computer
# @Time : 2018/9/6 0006 5:08
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

# Open Chrome
drt = webdriver.Chrome()
drt.get('http://www.baidu.com')
# Locate the search box
search_box = WebDriverWait(drt, 10).until(EC.presence_of_element_located((By.XPATH, '//input[@id="kw"]')))
search_box.send_keys('123')
# Locate the "百度一下" (search) button
btn = WebDriverWait(drt, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="su"]')))
btn.click()
# Close the browser
# drt.quit()
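Selenium Notes (2): Chrome WebDriver Startup Options
Different WebDrivers in Selenium may expose different methods, and the same operation can produce different results. This note covers how to use Chrome().
For other WebDrivers, consult the official documentation.
Chrome WebDriver Options
Introduction
Options is Chrome's startup-parameter object. Use its add_argument() method to add startup arguments, then pass the Options object in when creating the WebDriver so that Chrome starts with those arguments.
Example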
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Create a startup-options object
chrome_options = Options()
# Add a startup argument
chrome_options.add_argument('--window-size=1366,768')
# Pass the options object to Chrome to start it with the given window size
# (older Selenium 3 releases spell this keyword chrome_options=)
browser = webdriver.Chrome(options=chrome_options)
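Common startup arguments
Startup argument | Effect |
---|---|
--user-agent="" | Sets the User-Agent request header |
--window-size=1366,768 | Sets the browser window size |
--headless | Runs without a visible window |
--start-maximized | Starts maximized |
--incognito | Incognito mode |
--disable-javascript | Disables JavaScript |
--disable-infobars | Hides the "Chrome is being controlled by automated test software" notice |
The full list of startup arguments is available at: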
https://peter.sh/experiments/chromium-command-line-switches/
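When several switches are combined, it can help to collect them in one place before handing them to add_argument(). The sketch below is one possible way to do that; build_arguments and apply_arguments are hypothetical helper names, and only switches from the table are used.

```python
def build_arguments(headless=False, window_size=None, user_agent=None, incognito=False):
    """Collect Chrome switches from the table into a list of strings (hypothetical helper)."""
    arguments = []
    if headless:
        arguments.append("--headless")
    if window_size is not None:
        width, height = window_size
        arguments.append("--window-size={},{}".format(width, height))
    if user_agent:
        # No shell quoting is needed here; the whole string is passed as one argument.
        arguments.append("--user-agent={}".format(user_agent))
    if incognito:
        arguments.append("--incognito")
    return arguments


def apply_arguments(options, arguments):
    """Call add_argument() once per switch on any Options-like object."""
    for argument in arguments:
        options.add_argument(argument)
    return options


if __name__ == "__main__":
    print(build_arguments(headless=True, window_size=(1366, 768)))
```

With Selenium installed, apply_arguments(Options(), build_arguments(headless=True)) would return an Options object ready to pass to webdriver.Chrome().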
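Disabling image loading
Disabling image loading in Chrome takes a more involved setting, shown below:
# options is the Options object created earlier
prefs = {
    'profile.default_content_setting_values': {
        'images': 2
    }
}
options.add_experimental_option('prefs', prefs)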
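Disabling browser pop-ups
Pop-ups often appear while using the browser; the following option blocks notification pop-ups:
# options is the Options object created earlier
prefs = {
    'profile.default_content_setting_values': {
        'notifications': 2
    }
}
options.add_experimental_option('prefs', prefs)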
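Both snippets store their dictionary under the same option name, 'prefs'. As far as I can tell, add_experimental_option() keeps one value per name, so a second call with 'prefs' would replace the first rather than merge with it. A sketch of building a single combined dictionary instead (build_prefs is a hypothetical helper name):

```python
def build_prefs(block_images=False, block_notifications=False):
    """Return one prefs dict covering both settings (hypothetical helper)."""
    content_settings = {}
    if block_images:
        content_settings["images"] = 2  # 2 blocks the content type, as in the snippets
    if block_notifications:
        content_settings["notifications"] = 2
    return {"profile.default_content_setting_values": content_settings}


if __name__ == "__main__":
    print(build_prefs(block_images=True, block_notifications=True))
```

The result can then be passed once: options.add_experimental_option('prefs', build_prefs(True, True)).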
Full documentation
class selenium.webdriver.chrome.options.Options
Bases: object
Methods
- __init__()
- add_argument(argument)
  Adds an argument to the list.
  Args: Sets the arguments
- add_encoded_extension(extension)
  Adds Base64 encoded string with extension data to a list that will be used to extract it to the ChromeDriver.
  Args: extension: Base64 encoded string with extension data
- add_experimental_option(name, value)
  Adds an experimental option which is passed to chrome.
  Args: name: The experimental option name. value: The option value.
- add_extension(extension)
  Adds the path to the extension to a list that will be used to extract it to the ChromeDriver.
  Args: extension: path to the *.crx file
- set_headless(headless=True)
  Sets the headless argument.
  Args: headless: boolean value indicating to set the headless option
- to_capabilities()
  Creates a capabilities with all the options that have been set and returns a dictionary with everything
Values
- KEY = 'goog:chromeOptions'
- arguments
  Returns a list of arguments needed for the browser
- binary_location
  Returns the location of the binary otherwise an empty string
- debugger_address
  Returns the address of the remote devtools instance
- experimental_options
  Returns a dictionary of experimental options for chrome.
- extensions
  Returns a list of encoded extensions that will be loaded into chrome
- headless
  Returns whether or not the headless argument is set
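The Chrome WebDriver object
Introduction
This object inherits from selenium.webdriver.remote.webdriver.WebDriver, which is covered in the next note; as a subclass, Chrome's WebDriver adds a few methods.
Specifying the location of chromedriver.exe
chromedriver.exe usually lives in a folder on the PATH, but to simplify deployment or packaging you can put chromedriver.exe in the project directory and pass its path when creating the Chrome WebDriver object.
For example: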
from selenium import webdriver
browser = webdriver.Chrome(executable_path='chromedriver.exe')
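A relative path such as 'chromedriver.exe' depends on the directory the script is started from. One way around that, sketched here with the standard library only, is to build the path from the project directory; bundled_driver_path is a hypothetical helper, and the file names are assumed to be the ones the driver archive unpacks to.

```python
import sys
from pathlib import Path


def bundled_driver_path(project_dir, platform=sys.platform):
    """Return the expected driver location inside project_dir (hypothetical helper)."""
    filename = "chromedriver.exe" if platform.startswith("win") else "chromedriver"
    return Path(project_dir) / filename


if __name__ == "__main__":
    # Path(__file__).parent is the folder that contains this script.
    print(bundled_driver_path(Path(__file__).resolve().parent))
```

The result would be passed as executable_path=str(bundled_driver_path(...)) when creating the Chrome WebDriver.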
Full documentation
class selenium.webdriver.chrome.webdriver.WebDriver(executable_path='chromedriver', port=0, options=None, service_args=None, desired_capabilities=None, service_log_path=None, chrome_options=None)
Bases: selenium.webdriver.remote.webdriver.WebDriver
Controls the ChromeDriver and allows you to drive the browser.
You will need to download the ChromeDriver executable from http://chromedriver.storage.googleapis.com/index.html
- __init__(executable_path='chromedriver', port=0, options=None, service_args=None, desired_capabilities=None, service_log_path=None, chrome_options=None)
  Creates a new instance of the chrome driver.
  Starts the service and then creates new instance of chrome driver.
  Args:
  - executable_path - path to the executable. If the default is used it assumes the executable is in the $PATH
  - port - port you would like the service to run, if left as 0, a free port will be found.
  - desired_capabilities: Dictionary object with non-browser specific capabilities only, such as "proxy" or "loggingPref".
  - options: this takes an instance of ChromeOptions
- create_options()
- get_network_conditions()
  Gets Chrome network emulation settings.
  Returns: A dict. For example:
  {'latency': 4, 'download_throughput': 2, 'upload_throughput': 2, 'offline': False}
- launch_app(id)
  Launches Chrome app specified by id.
- quit()
  Closes the browser and shuts down the ChromeDriver executable that is started when starting the ChromeDriver
- set_network_conditions(**network_conditions)
  Sets Chrome network emulation settings.
  Args:
  - network_conditions: A dict with conditions specification.
  Usage:
  driver.set_network_conditions(
      offline=False,
      latency=5,  # additional latency (ms)
      download_throughput=500 * 1024,  # maximal throughput
      upload_throughput=500 * 1024)  # maximal throughput
  Note: 'throughput' can be used to set both (for download and upload).
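Selenium Notes (3): Remote WebDriver
Introduction
selenium.webdriver.remote.webdriver.WebDriver is the parent class of all the other WebDrivers; for example, the Chrome WebDriver and the Firefox WebDriver both inherit from it. It implements the methods shared by every WebDriver.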
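Common operations
- get(url)
  Loads the given URL in the current browser session.
  Usage:
  driver.get('https://www.baidu.com')
- close()
  Closes the current browser window.
- quit()
  Quits the webdriver and closes all windows.
- refresh()
  Refreshes the current page.
- title
  The title of the current page.
- page_source
  The rendered source of the current page.
- current_url
  The URL of the current page.
- window_handles
  The handles of all windows in the current session.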
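Because quit() is what shuts the browser and driver process down, it is worth making sure it runs even when a step in between fails. The sketch below shows one way to do that with a context manager; managed is a hypothetical helper name, and FakeDriver is a stand-in so the example runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def managed(driver):
    """Yield the driver and always call quit() afterwards, even on errors."""
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver:
    """Stand-in for a real WebDriver so the sketch runs without a browser."""

    def __init__(self):
        self.visited = []
        self.closed = False

    def get(self, url):
        self.visited.append(url)

    def quit(self):
        self.closed = True


if __name__ == "__main__":
    fake = FakeDriver()
    with managed(fake) as driver:
        driver.get("https://www.baidu.com")
    print(fake.visited, fake.closed)
```

With a real browser the call would look like: with managed(webdriver.Chrome()) as driver: ...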
Finding elements
The WebDriver object has built-in methods for locating element nodes, which are convenient to use.
Finding a single element
The following methods find a single element:
Method | Effect |
---|---|
find_element_by_xpath() | Find by XPath |
find_element_by_class_name() | Find by class attribute |
find_element_by_css_selector() | Find by CSS selector |
find_element_by_id() | Find by id |
find_element_by_link_text() | Find by link text |
find_element_by_name() | Find by name attribute |
find_element_by_partial_link_text() | Find by partial match on link text |
find_element_by_tag_name() | Find by tag name |
Each lookup returns a WebElement object.
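Finding multiple elements
The methods above return the first matching element. To return every matching element, use the find_elements_by_* methods.
Note: adding an s after element in each name gives the corresponding multi-element method.
These methods return a list of WebElement objects.
Finding via the "private" methods
Besides the lookups above, there are two methods that the documentation of this Selenium version labels 'private', find_element() and find_elements(). In Selenium 4 the find_element_by_* helpers were deprecated and later removed, which leaves these two as the standard lookup methods.
Example:
from selenium.webdriver.common.by import By
driver.find_element(By.XPATH, '//button[text()="Some text"]')
driver.find_elements(By.XPATH, '//button')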
By
The By class holds the constants passed in when locating elements. It has the following attributes:
ID = "id"
XPATH = "xpath"
LINK_TEXT = "link text"
PARTIAL_LINK_TEXT = "partial link text"
NAME = "name"
TAG_NAME = "tag name"
CLASS_NAME = "class name"
CSS_SELECTOR = "css selector"
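Since the By attributes are plain strings, a locator is just a (strategy, value) pair. The sketch below builds such pairs without importing Selenium; STRATEGIES and locator are hypothetical names, and the string values are copied from the list of By attributes.

```python
# String values copied from the By attributes.
STRATEGIES = {
    "id": "id",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "name": "name",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def locator(kind, value):
    """Return a (strategy, value) tuple for find_element() (hypothetical helper)."""
    if kind not in STRATEGIES:
        raise ValueError("unknown locator kind: {!r}".format(kind))
    return (STRATEGIES[kind], value)


if __name__ == "__main__":
    print(locator("css_selector", "#kw"))
```

A tuple built this way can be unpacked into the call, for example driver.find_element(*locator("id", "kw")), or passed as-is to an expected condition.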
Working with cookies
- add_cookie(cookie_dict)
  Adds a cookie to the current session.
  cookie_dict: a dictionary that must contain the keys "name" and "value"; optional keys are "path", "domain", "secure", and "expiry".
  Usage:
  driver.add_cookie({'name': 'foo', 'value': 'bar'})
  driver.add_cookie({'name': 'foo', 'value': 'bar', 'path': '/'})
  driver.add_cookie({'name': 'foo', 'value': 'bar', 'path': '/', 'secure': True})
- get_cookie(name)
  Returns a single cookie by name, or None if it does not exist.
- get_cookies()
  Returns all cookies as a list of dictionaries.
- delete_all_cookies()
  Deletes all cookies.
- delete_cookie(name)
  Deletes the cookie with the given name.
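A common use of these methods is to save the cookies from one session and add them back in a later one. The following is a rough sketch of the saving and loading part only, using JSON text; check_cookie, cookies_to_json, and cookies_from_json are hypothetical helper names, and the only rule enforced is the "name" and "value" requirement described for add_cookie().

```python
import json

REQUIRED_KEYS = {"name", "value"}


def check_cookie(cookie):
    """Raise ValueError unless the dict has the required keys, then return it."""
    missing = REQUIRED_KEYS - set(cookie)
    if missing:
        raise ValueError("cookie is missing keys: {}".format(sorted(missing)))
    return cookie


def cookies_to_json(cookies):
    """Serialize a list of cookie dicts, e.g. the result of get_cookies()."""
    return json.dumps([check_cookie(c) for c in cookies])


def cookies_from_json(text):
    """Load cookie dicts back so each one can be passed to add_cookie()."""
    return [check_cookie(c) for c in json.loads(text)]


if __name__ == "__main__":
    saved = cookies_to_json([{"name": "foo", "value": "bar", "path": "/"}])
    print(cookies_from_json(saved))
```

In a real session the loaded cookies would be added with a loop such as for c in cookies_from_json(text): driver.add_cookie(c), after first opening a page on the matching domain.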
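Taking screenshots
- get_screenshot_as_base64()
  Returns a screenshot of the current window as a base64-encoded string.
- get_screenshot_as_file(filename)
  Saves a screenshot of the current window as a PNG image. filename is the path to save to and should end with .png. Returns False if an IO error occurs, otherwise True.
  Usage:
  driver.get_screenshot_as_file('/Screenshots/foo.png')
- get_screenshot_as_png()
  Returns a screenshot of the current window as PNG binary data.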
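The base64 and PNG variants carry the same image in two encodings. As an illustration, the sketch below decodes a base64 string back to bytes and checks for the PNG file signature; decode_screenshot and looks_like_png are hypothetical helper names, and the sample data is a stand-in rather than a real screenshot.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first eight bytes of every PNG file


def decode_screenshot(b64_text):
    """Turn a base64 string, such as one from get_screenshot_as_base64(), into raw bytes."""
    return base64.b64decode(b64_text)


def looks_like_png(data):
    """Return True if the bytes start with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


if __name__ == "__main__":
    # Stand-in for real screenshot data: the signature followed by dummy bytes.
    sample = base64.b64encode(PNG_SIGNATURE + b"payload").decode("ascii")
    print(looks_like_png(decode_screenshot(sample)))
```

The decoded bytes can be written to disk with open(path, "wb"), which should give the same kind of file that get_screenshot_as_file() produces.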
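Getting window information
- get_window_position(windowHandle='current')
  Returns the x, y coordinates of the current window.
- get_window_rect()
  Returns the x, y coordinates of the current window together with its height and width.
- get_window_size(windowHandle='current')
  Returns the height and width of the current window.
Switching
- switch_to_frame(frame_reference)
  Switches focus to the given child frame. The full documentation below marks it as deprecated in favour of driver.switch_to.frame.
- switch_to_window(window_name)
  Switches to another window. The full documentation below marks it as deprecated in favour of driver.switch_to.window.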
Executing JavaScript
- execute_async_script(script, *args)
  Executes JavaScript asynchronously in the current window/frame.
  script: the JavaScript to execute.
  *args: the arguments to pass to your JavaScript.
  Usage:
  script = "var callback = arguments[arguments.length - 1]; "
  script2 = "window.setTimeout(function(){ callback('timeout') }, 3000);"
  driver.execute_async_script(script + script2)
- execute_script(script, *args)
  Executes JavaScript synchronously in the current window/frame.
  script: the JavaScript to execute.
  *args: the arguments to pass to your JavaScript.
Full documentation
class selenium.webdriver.remote.webdriver.WebDriver(command_executor='http://127.0.0.1:4444/wd/hub', desired_capabilities=None, browser_profile=None, proxy=None, keep_alive=False, file_detector=None, options=None)
Bases: object
Controls a browser by sending commands to a remote server. This server is expected to be running the WebDriver wire protocol as defined at https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol.
Attributes:
- session_id - String ID of the browser session started and controlled by this WebDriver.
- capabilities - Dictionary of effective capabilities of this browser session as returned by the remote server. See https://github.com/SeleniumHQ/selenium/wiki/DesiredCapabilities
- command_executor - remote_connection.RemoteConnection object used to execute commands.
- error_handler - errorhandler.ErrorHandler object used to handle errors.
- __init__(command_executor='http://127.0.0.1:4444/wd/hub', desired_capabilities=None, browser_profile=None, proxy=None, keep_alive=False, file_detector=None, options=None)
  Create a new driver that will issue commands using the wire protocol.
  Args:
  - command_executor - Either a string representing URL of the remote server or a custom remote_connection.RemoteConnection object. Defaults to 'http://127.0.0.1:4444/wd/hub'.
  - desired_capabilities - A dictionary of capabilities to request when starting the browser session. Required parameter.
  - browser_profile - A selenium.webdriver.firefox.firefox_profile.FirefoxProfile object. Only used if Firefox is requested. Optional.
  - proxy - A selenium.webdriver.common.proxy.Proxy object. The browser session will be started with given proxy settings, if possible. Optional.
  - keep_alive - Whether to configure remote_connection.RemoteConnection to use HTTP keep-alive. Defaults to False.
  - file_detector - Pass custom file detector object during instantiation. If None, then default LocalFileDetector() will be used.
  - options - instance of a driver options.Options class
- add_cookie(cookie_dict)
  Adds a cookie to your current session.
  Args:
  - cookie_dict: A dictionary object, with required keys - "name" and "value"; optional keys - "path", "domain", "secure", "expiry"
  Usage:
  driver.add_cookie({'name': 'foo', 'value': 'bar'})
  driver.add_cookie({'name': 'foo', 'value': 'bar', 'path': '/'})
  driver.add_cookie({'name': 'foo', 'value': 'bar', 'path': '/', 'secure': True})
- back()
  Goes one step backward in the browser history.
  Usage:
  driver.back()
- close()
  Closes the current window.
  Usage:
  driver.close()
- create_web_element(element_id)
  Creates a web element with the specified element_id.
- delete_all_cookies()
  Delete all cookies in the scope of the session.
  Usage:
  driver.delete_all_cookies()
- delete_cookie(name)
  Deletes a single cookie with the given name.
  Usage:
  driver.delete_cookie('my_cookie')
- execute(driver_command, params=None)
  Sends a command to be executed by a command.CommandExecutor.
  Args:
  - driver_command: The name of the command to execute as a string.
  - params: A dictionary of named parameters to send with the command.
  Returns:
  The command's JSON response loaded into a dictionary object.
- execute_async_script(script, *args)
  Asynchronously Executes JavaScript in the current window/frame.
  Args:
  - script: The JavaScript to execute.
  - *args: Any applicable arguments for your JavaScript.
  Usage:
  script = ("var callback = arguments[arguments.length - 1]; "
            "window.setTimeout(function(){ callback('timeout') }, 3000);")
  driver.execute_async_script(script)
- execute_script(script, *args)
  Synchronously Executes JavaScript in the current window/frame.
  Args:
  - script: The JavaScript to execute.
  - *args: Any applicable arguments for your JavaScript.
  Usage:
  driver.execute_script('return document.title;')
- file_detector_context(*args, **kwds)
  Overrides the current file detector (if necessary) in limited context. Ensures the original file detector is set afterwards.
  Example:
  with webdriver.file_detector_context(UselessFileDetector):
      someinput.send_keys('/etc/hosts')
  Args:
  - file_detector_class - Class of the desired file detector. If the class is different from the current file_detector, then the class is instantiated with args and kwargs and used as a file detector during the duration of the context manager.
  - args - Optional arguments that get passed to the file detector class during instantiation.
  - kwargs - Keyword arguments, passed the same way as args.
- find_element(by='id', value=None)
  'Private' method used by the find_element_by_* methods.
  Usage:
  Use the corresponding find_element_by_* instead of this.
  Return type: WebElement
- forward()
  Goes one step forward in the browser history.
  Usage:
  driver.forward()
- fullscreen_window()
  Invokes the window manager-specific 'full screen' operation
- get(url)
  Loads a web page in the current browser session.
- get_cookie(name)
  Get a single cookie by name. Returns the cookie if found, None if not.
  Usage:
  driver.get_cookie('my_cookie')
- get_cookies()
  Returns a set of dictionaries, corresponding to cookies visible in the current session.
  Usage:
  driver.get_cookies()
- get_log(log_type)
  Gets the log for a given log type
  Args:
  - log_type: type of log that which will be returned
  Usage:
  driver.get_log('browser')
  driver.get_log('driver')
  driver.get_log('client')
  driver.get_log('server')
- get_screenshot_as_base64()
  Gets the screenshot of the current window as a base64 encoded string which is useful in embedded images in HTML.
  Usage:
  driver.get_screenshot_as_base64()
- get_screenshot_as_file(filename)
  Saves a screenshot of the current window to a PNG image file. Returns False if there is any IOError, else returns True. Use full paths in your filename.
  Args:
  - filename: The full path you wish to save your screenshot to. This should end with a .png extension.
  Usage:
  driver.get_screenshot_as_file('/Screenshots/foo.png')
- get_screenshot_as_png()
  Gets the screenshot of the current window as a binary data.
  Usage:
  driver.get_screenshot_as_png()
- get_window_position(windowHandle='current')
  Gets the x,y position of the current window.
  Usage:
  driver.get_window_position()
- get_window_rect()
  Gets the x, y coordinates of the window as well as height and width of the current window.
  Usage:
  driver.get_window_rect()
- get_window_size(windowHandle='current')
  Gets the width and height of the current window.
  Usage:
  driver.get_window_size()
- implicitly_wait(time_to_wait)
  Sets a sticky timeout to implicitly wait for an element to be found, or a command to complete. This method only needs to be called one time per session. To set the timeout for calls to execute_async_script, see set_script_timeout.
  Args:
  - time_to_wait: Amount of time to wait (in seconds)
  Usage:
  driver.implicitly_wait(30)
- maximize_window()
  Maximizes the current window that webdriver is using
- minimize_window()
  Invokes the window manager-specific 'minimize' operation
- quit()
  Quits the driver and closes every associated window.
  Usage:
  driver.quit()
- refresh()
  Refreshes the current page.
  Usage:
  driver.refresh()
- save_screenshot(filename)
  Saves a screenshot of the current window to a PNG image file. Returns False if there is any IOError, else returns True. Use full paths in your filename.
  Args:
  - filename: The full path you wish to save your screenshot to. This should end with a .png extension.
  Usage:
  driver.save_screenshot('/Screenshots/foo.png')
- set_page_load_timeout(time_to_wait)
  Set the amount of time to wait for a page load to complete before throwing an error.
  Args:
  - time_to_wait: The amount of time to wait
  Usage:
  driver.set_page_load_timeout(30)
- set_script_timeout(time_to_wait)
  Set the amount of time that the script should wait during an execute_async_script call before throwing an error.
  Args:
  - time_to_wait: The amount of time to wait (in seconds)
  Usage:
  driver.set_script_timeout(30)
- set_window_position(x, y, windowHandle='current')
  Sets the x,y position of the current window. (window.moveTo)
  Args:
  - x: the x-coordinate in pixels to set the window position
  - y: the y-coordinate in pixels to set the window position
  Usage:
  driver.set_window_position(0,0)
- set_window_rect(x=None, y=None, width=None, height=None)
  Sets the x, y coordinates of the window as well as height and width of the current window.
  Usage:
  driver.set_window_rect(x=10, y=10)
  driver.set_window_rect(width=100, height=200)
  driver.set_window_rect(x=10, y=10, width=100, height=200)
- set_window_size(width, height, windowHandle='current')
  Sets the width and height of the current window. (window.resizeTo)
  Args:
  - width: the width in pixels to set the window to
  - height: the height in pixels to set the window to
  Usage:
  driver.set_window_size(800,600)
- start_client()
  Called before starting a new session. This method may be overridden to define custom startup behavior.
- start_session(capabilities, browser_profile=None)
  Creates a new session with the desired capabilities.
  Args:
  - browser_name - The name of the browser to request.
  - version - Which browser version to request.
  - platform - Which platform to request the browser on.
  - javascript_enabled - Whether the new session should support JavaScript.
  - browser_profile - A selenium.webdriver.firefox.firefox_profile.FirefoxProfile object. Only used if Firefox is requested.
- stop_client()
  Called after executing a quit command. This method may be overridden to define custom shutdown behavior.
- switch_to_active_element()
  Deprecated use driver.switch_to.active_element
- switch_to_alert()
  Deprecated use driver.switch_to.alert
- switch_to_default_content()
  Deprecated use driver.switch_to.default_content
- switch_to_frame(frame_reference)
  Deprecated use driver.switch_to.frame
- switch_to_window(window_name)
  Deprecated use driver.switch_to.window
- application_cache
  Returns a ApplicationCache Object to interact with the browser app cache
- current_url
  Gets the URL of the current page.
  Usage:
  driver.current_url
- current_window_handle
  Returns the handle of the current window.
  Usage:
  driver.current_window_handle
- desired_capabilities
  returns the drivers current desired capabilities being used
- file_detector
- log_types
  Gets a list of the available log types
  Usage:
  driver.log_types
- mobile
- name
  Returns the name of the underlying browser for this instance.
  Usage:
  name = driver.name
- orientation
  Gets the current orientation of the device
  Usage:
  orientation = driver.orientation
- page_source
  Gets the source of the current page.
  Usage:
  driver.page_source
- switch_to
  Returns:
  - SwitchTo: an object containing all options to switch focus into
  Usage:
  element = driver.switch_to.active_element
  alert = driver.switch_to.alert
  driver.switch_to.default_content()
  driver.switch_to.frame('frame_name')
  driver.switch_to.frame(1)
  driver.switch_to.frame(driver.find_elements_by_tag_name("iframe")[0])
  driver.switch_to.parent_frame()
  driver.switch_to.window('main')
- title
  Returns the title of the current page.
  Usage:
  title = driver.title
- window_handles
  Returns the handles of all windows within the current session.
  Usage:
  driver.window_handles
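Selenium Notes (4): WebElement
A WebElement is a page element returned by the find methods. It provides methods for interacting with the element, such as clicking and clearing.
Methods
- clear()
  Clear: clears the text if the current element contains any.
- click()
  Click: clicks the current element.
- get_attribute(name)
  Get attribute: returns an attribute or property of the element.
  It first tries the property with the given name; if there is no such property, it returns the attribute with that name, and None if neither exists.
- screenshot(filename)
  Screenshot: saves a screenshot of the current element as a PNG; an absolute path is recommended. (The author found that it did not work in Chrome but did in Firefox.)
- send_keys(value)
  Simulated typing: sends simulated keyboard input to the current element.
  The author observed that this method seemed buggy and unusable in Chrome at the time of writing, although send_keys() on the search box is used in the Baidu example in part (1).
- submit()
  Submit: submits the form.
Page elements also provide the find_elements_by_* family of lookup methods, which restrict the search to the current element.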
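Attributes
- text
  The text content of the current element.
- tag_name
  The tag name of the current element.
- size
  The size of the current element.
- screenshot_as_png
  A screenshot of the current element as PNG binary data.
- screenshot_as_base64
  A screenshot of the current element as a base64-encoded string.
- rect
  A dictionary with the size and position of the current element.
- parent
  The WebDriver instance this element was found from (not the parent node in the DOM).
- location
  The position of the current element.
- id
  The id of the current element, used mainly internally by Selenium; it can be used to check whether two elements are the same element.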
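Keys
We often need to simulate keyboard input. For ordinary text, pass the string to send_keys().
Sometimes, though, special keys are needed, and that is what the Keys class is for.
Short example
from selenium.webdriver.common.keys import Keys
elem.send_keys(Keys.CONTROL, 'c')
Attributes
The Keys class has many attributes, each corresponding to one key. All of them are listed below: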
ADD = u'\ue025'
ALT = u'\ue00a'
ARROW_DOWN = u'\ue015'
ARROW_LEFT = u'\ue012'
ARROW_RIGHT = u'\ue014'
ARROW_UP = u'\ue013'
BACKSPACE = u'\ue003'
BACK_SPACE = u'\ue003'
CANCEL = u'\ue001'
CLEAR = u'\ue005'
COMMAND = u'\ue03d'
CONTROL = u'\ue009'
DECIMAL = u'\ue028'
DELETE = u'\ue017'
DIVIDE = u'\ue029'
DOWN = u'\ue015'
END = u'\ue010'
ENTER = u'\ue007'
EQUALS = u'\ue019'
ESCAPE = u'\ue00c'
F1 = u'\ue031'
F10 = u'\ue03a'
F11 = u'\ue03b'
F12 = u'\ue03c'
F2 = u'\ue032'
F3 = u'\ue033'
F4 = u'\ue034'
F5 = u'\ue035'
F6 = u'\ue036'
F7 = u'\ue037'
F8 = u'\ue038'
F9 = u'\ue039'
HELP = u'\ue002'
HOME = u'\ue011'
INSERT = u'\ue016'
LEFT = u'\ue012'
LEFT_ALT = u'\ue00a'
LEFT_CONTROL = u'\ue009'
LEFT_SHIFT = u'\ue008'
META = u'\ue03d'
MULTIPLY = u'\ue024'
NULL = u'\ue000'
NUMPAD0 = u'\ue01a'
NUMPAD1 = u'\ue01b'
NUMPAD2 = u'\ue01c'
NUMPAD3 = u'\ue01d'
NUMPAD4 = u'\ue01e'
NUMPAD5 = u'\ue01f'
NUMPAD6 = u'\ue020'
NUMPAD7 = u'\ue021'
NUMPAD8 = u'\ue022'
NUMPAD9 = u'\ue023'
PAGE_DOWN = u'\ue00f'
PAGE_UP = u'\ue00e'
PAUSE = u'\ue00b'
RETURN = u'\ue006'
RIGHT = u'\ue014'
SEMICOLON = u'\ue018'
SEPARATOR = u'\ue026'
SHIFT = u'\ue008'
SPACE = u'\ue00d'
SUBTRACT = u'\ue027'
TAB = u'\ue004'
UP = u'\ue013'
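Selenium Notes (5): Action Chains
Introduction
Most page interaction, such as clicking, can be done through WebElement methods. Sometimes, however, more complex actions are needed, such as dragging, double-clicking, or pressing and holding.
That is where Action Chains come in.
Short example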
from selenium.webdriver import ActionChains
element = driver.find_element_by_name("source")
target = driver.find_element_by_name("target")
actions = ActionChains(driver)
actions.drag_and_drop(element, target)
actions.perform()
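After importing ActionChains, create an action-chain object, passing the webdriver in as the argument, and assign it to a variable named actions.
Then call the action methods on that actions variable.
Note: calling an action method does not execute it immediately; the actions are stored, in the order your code calls them, in a queue inside the ActionChains object. They run one after another only when you call perform().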
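The queue-then-perform behaviour can be illustrated with a toy model. The class below is only a sketch of the pattern and not Selenium's implementation; MiniChain and its log attribute are invented for this example, and the "actions" merely append text to a list.

```python
class MiniChain:
    """Toy model of the queue-then-perform pattern; not Selenium's implementation."""

    def __init__(self):
        self._queue = []  # actions waiting to run
        self.log = []     # record of what has actually been executed

    def click(self, name):
        self._queue.append(lambda: self.log.append("click " + name))
        return self  # returning self is what allows chained calls

    def send_keys(self, text):
        self._queue.append(lambda: self.log.append("type " + text))
        return self

    def perform(self):
        # Run the stored actions in the order they were queued.
        for action in self._queue:
            action()


if __name__ == "__main__":
    chain = MiniChain().click("menu").send_keys("abc")
    print(chain.log)  # nothing has run yet
    chain.perform()
    print(chain.log)
```

Nothing is recorded in log until perform() is called, which mirrors why forgetting perform() on a real ActionChains leaves the page untouched.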
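Common action methods
- click(on_element=None)
  Left-clicks the given element; if no element is passed, clicks at the current mouse position.
- context_click(on_element=None)
  Right-click.
- double_click(on_element=None)
  Double-click.
- click_and_hold(on_element=None)
  Presses the mouse button and holds it down.
- drag_and_drop(source, target)
  Presses and holds on the source element, moves to the target element, and releases.
- drag_and_drop_by_offset(source, xoffset, yoffset)
  Presses and holds on the source element, moves to the position offset from the source element by xoffset and yoffset, and releases.
- send_keys(*keys_to_send)
  Sends keys to the currently focused element.
- send_keys_to_element(element, *keys_to_send)
  Sends keys to the given element.
- reset_actions()
  Clears the actions that have been stored.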
Full documentation
class selenium.webdriver.common.action_chains.ActionChains(driver)
Bases: object
ActionChains are a way to automate low level interactions such as mouse movements, mouse button actions, key press, and context menu interactions. This is useful for doing more complex actions like hover over and drag and drop.
Generate user actions.
When you call methods for actions on the ActionChains object, the actions are stored in a queue in the ActionChains object. When you call perform(), the events are fired in the order they are queued up.
ActionChains can be used in a chain pattern:
menu = driver.find_element_by_css_selector(".nav")
hidden_submenu = driver.find_element_by_css_selector(".nav #submenu1")
ActionChains(driver).move_to_element(menu).click(hidden_submenu).perform()
Or actions can be queued up one by one, then performed.:
menu = driver.find_element_by_css_selector(".nav")
hidden_submenu = driver.find_element_by_css_selector(".nav #submenu1")
actions = ActionChains(driver)
actions.move_to_element(menu)
actions.click(hidden_submenu)
actions.perform()
Either way, the actions are performed in the order they are called, one after another.
- __init__(driver)
  Creates a new ActionChains.
  Args:
  - driver: The WebDriver instance which performs user actions.
- click(on_element=None)
  Clicks an element.
  Args:
  - on_element: The element to click. If None, clicks on current mouse position.
- click_and_hold(on_element=None)
  Holds down the left mouse button on an element.
  Args:
  - on_element: The element to mouse down. If None, clicks on current mouse position.
- context_click(on_element=None)
  Performs a context-click (right click) on an element.
  Args:
  - on_element: The element to context-click. If None, clicks on current mouse position.
- double_click(on_element=None)
  Double-clicks an element.
  Args:
  - on_element: The element to double-click. If None, clicks on current mouse position.
- drag_and_drop(source, target)
  Holds down the left mouse button on the source element, then moves to the target element and releases the mouse button.
  Args:
  - source: The element to mouse down.
  - target: The element to mouse up.
- drag_and_drop_by_offset(source, xoffset, yoffset)
  Holds down the left mouse button on the source element, then moves to the target offset and releases the mouse button.
  Args:
  - source: The element to mouse down.
  - xoffset: X offset to move to.
  - yoffset: Y offset to move to.
- key_down(value, element=None)
  Sends a key press only, without releasing it. Should only be used with modifier keys (Control, Alt and Shift).
  Args:
  - value: The modifier key to send. Values are defined in Keys class.
  - element: The element to send keys. If None, sends a key to current focused element.
  Example, pressing ctrl+c:
  ActionChains(driver).key_down(Keys.CONTROL).send_keys('c').key_up(Keys.CONTROL).perform()
- key_up(value, element=None)
  Releases a modifier key.
  Args:
  - value: The modifier key to send. Values are defined in Keys class.
  - element: The element to send keys. If None, sends a key to current focused element.
  Example, pressing ctrl+c:
  ActionChains(driver).key_down(Keys.CONTROL).send_keys('c').key_up(Keys.CONTROL).perform()
- move_by_offset(xoffset, yoffset)
  Moving the mouse to an offset from current mouse position.
  Args:
  - xoffset: X offset to move to, as a positive or negative integer.
  - yoffset: Y offset to move to, as a positive or negative integer.
- move_to_element(to_element)
  Moving the mouse to the middle of an element.
  Args:
  - to_element: The WebElement to move to.
- move_to_element_with_offset(to_element, xoffset, yoffset)
  Move the mouse by an offset of the specified element. Offsets are relative to the top-left corner of the element.
  Args:
  - to_element: The WebElement to move to.
  - xoffset: X offset to move to.
  - yoffset: Y offset to move to.
- pause(seconds)
  Pause all inputs for the specified duration in seconds
- perform()
  Performs all stored actions.
- release(on_element=None)
  Releasing a held mouse button on an element.
  Args:
  - on_element: The element to mouse up. If None, releases on current mouse position.
- reset_actions()
  Clears actions that are already stored on the remote end.
- send_keys(*keys_to_send)
  Sends keys to current focused element.
  Args:
  - keys_to_send: The keys to send. Modifier keys constants can be found in the 'Keys' class.
- send_keys_to_element(element, *keys_to_send)
  Sends keys to an element.
  Args:
  - element: The element to send keys.
  - keys_to_send: The keys to send. Modifier keys constants can be found in the 'Keys' class.
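Selenium Notes (6): Waits
Introduction
While Selenium drives the browser, each time a URL is requested Selenium waits for the page to finish loading before handing control back to our program.
However, because ajax and other JavaScript load content asynchronously, the element we want to operate on has often not appeared yet, which raises an error. To address this, Selenium provides several ways of waiting so that we can act on an element only after it has loaded.
Explicit waits
Example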
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("http://somedomain/url_that_delays_loading")
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
finally:
    driver.quit()
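In this example the element is no longer looked up with a find_element_by_* call; WebDriverWait is used instead.
The code in the try block waits at most 10 seconds before raising a TimeoutException. During those 10 seconds, WebDriverWait by default runs the callable passed to until every 500 ms. Here that callable is EC.presence_of_element_located, which checks whether the element has been loaded, and the element to check is identified with a locator such as By.ID.
In other words, within 10 seconds the presence of the element is checked every 0.5 seconds by default; once it exists, it is assigned to the variable element. If the element still does not exist after 10 seconds, a timeout exception is raised.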
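The polling behaviour described here can be sketched in a few lines of plain Python. This is a simplified model for illustration, not the code WebDriverWait actually uses; wait_until and WaitTimeout are hypothetical names, and unlike the real class this sketch does not swallow any exceptions raised by the condition.

```python
import time


class WaitTimeout(Exception):
    """Raised by this sketch when the condition never becomes truthy."""


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() every `poll` seconds until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # hand the truthy value (e.g. an element) back to the caller
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within {} s".format(timeout))
        time.sleep(poll)


if __name__ == "__main__":
    attempts = []

    def ready():
        # Pretend the element shows up on the third check.
        attempts.append(1)
        return "element" if len(attempts) >= 3 else None

    print(wait_until(ready, timeout=2.0, poll=0.01), len(attempts))
```

The condition is a zero-argument callable here; the real until() passes the driver to its callable, which is how expected conditions get access to the page.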
Expected Conditions
The expected_conditions module provides many common conditions that are ready to use:
- title_is
- title_contains
- presence_of_element_located
- visibility_of_element_located
- visibility_of
- presence_of_all_elements_located
- text_to_be_present_in_element
- text_to_be_present_in_element_value
- frame_to_be_available_and_switch_to_it
- invisibility_of_element_located
- element_to_be_clickable
- staleness_of
- element_to_be_selected
- element_located_to_be_selected
- element_selection_state_to_be
- element_located_selection_state_to_be
- alert_is_present
Example:
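from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# driver is an existing WebDriver instance
wait = WebDriverWait(driver, 10)
# Wait until the element can be clicked
element = wait.until(EC.element_to_be_clickable((By.ID, 'someid')))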
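Implicit waits
With an implicit wait, lookup operations on the webdriver such as find_element poll for a set period when the element cannot be found straight away.
The default is 0; it can be set as follows: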
from selenium import webdriver

driver = webdriver.Chrome()
driver.implicitly_wait(10)  # in seconds
driver.get("http://somedomain/url_that_delays_loading")
myDynamicElement = driver.find_element_by_id("myDynamicElement")
Selenium笔记(7)异常
完整文档
Exceptions that may happen in all the webdriver code.
- exception selenium.common.exceptions.ElementClickInterceptedException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  The Element Click command could not be completed because the element receiving the events is obscuring the element that was requested clicked.
- exception selenium.common.exceptions.ElementNotInteractableException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.InvalidElementStateException
  Thrown when an element is present in the DOM but interactions with that element will hit another element due to paint order.
- exception selenium.common.exceptions.ElementNotSelectableException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.InvalidElementStateException
  Thrown when trying to select an unselectable element. For example, selecting a 'script' element.
- exception selenium.common.exceptions.ElementNotVisibleException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.InvalidElementStateException
  Thrown when an element is present on the DOM, but it is not visible, and so is not able to be interacted with. Most commonly encountered when trying to click or read text of an element that is hidden from view.
- exception selenium.common.exceptions.ErrorInResponseException(response, msg)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when an error has occurred on the server side. This may happen when communicating with the firefox extension or the remote driver server.
  __init__(response, msg)
- exception selenium.common.exceptions.ImeActivationFailedException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when activating an IME engine has failed.
- exception selenium.common.exceptions.ImeNotAvailableException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when IME support is not available. This exception is thrown for every IME-related method call if IME support is not available on the machine.
- exception selenium.common.exceptions.InsecureCertificateException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Navigation caused the user agent to hit a certificate warning, which is usually the result of an expired or invalid TLS certificate.
- exception selenium.common.exceptions.InvalidArgumentException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  The arguments passed to a command are either invalid or malformed.
- exception selenium.common.exceptions.InvalidCookieDomainException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when attempting to add a cookie under a different domain than the current URL.
- exception selenium.common.exceptions.InvalidCoordinatesException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  The coordinates provided to an interactions operation are invalid.
- exception selenium.common.exceptions.InvalidElementStateException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
- exception selenium.common.exceptions.InvalidSelectorException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.NoSuchElementException
  Thrown when the selector which is used to find an element does not return a WebElement. Currently this only happens when the selector is an xpath expression and it is either syntactically invalid (i.e. it is not an xpath expression) or the expression does not select WebElements (e.g. "count(//input)").
- exception selenium.common.exceptions.InvalidSessionIdException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Occurs if the given session id is not in the list of active sessions, meaning the session either does not exist or that it's not active.
- exception selenium.common.exceptions.InvalidSwitchToTargetException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when frame or window target to be switched doesn't exist.
- exception selenium.common.exceptions.JavascriptException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  An error occurred while executing JavaScript supplied by the user.
- exception selenium.common.exceptions.MoveTargetOutOfBoundsException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when the target provided to the ActionChains move() method is invalid, i.e. out of document.
- exception selenium.common.exceptions.NoAlertPresentException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when switching to an alert that is not present. This can be caused by calling an operation on the Alert() class when an alert is not yet on the screen.
- exception selenium.common.exceptions.NoSuchAttributeException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when the attribute of element could not be found. You may want to check if the attribute exists in the particular browser you are testing against. Some browsers may have different property names for the same property. (IE8's .innerText vs. Firefox .textContent)
- exception selenium.common.exceptions.NoSuchCookieException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  No cookie matching the given path name was found amongst the associated cookies of the current browsing context's active document.
- exception selenium.common.exceptions.NoSuchElementException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when element could not be found. If you encounter this exception, you may want to check the following:
    - Check your selector used in your find_by…
    - Element may not yet be on the screen at the time of the find operation (webpage is still loading); see selenium.webdriver.support.wait.WebDriverWait() for how to write a wait wrapper to wait for an element to appear.
- exception selenium.common.exceptions.NoSuchFrameException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.InvalidSwitchToTargetException
  Thrown when frame target to be switched doesn't exist.
- exception selenium.common.exceptions.NoSuchWindowException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.InvalidSwitchToTargetException
  Thrown when window target to be switched doesn't exist. To find the current set of active window handles, you can get a list of the active window handles in the following way:
  print(driver.window_handles)
- exception selenium.common.exceptions.RemoteDriverServerException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
- exception selenium.common.exceptions.ScreenshotException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  A screen capture was made impossible.
- exception selenium.common.exceptions.SessionNotCreatedException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  A new session could not be created.
- exception selenium.common.exceptions.StaleElementReferenceException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when a reference to an element is now "stale". Stale means the element no longer appears on the DOM of the page. Possible causes of StaleElementReferenceException include, but are not limited to:
    - You are no longer on the same page, or the page may have refreshed since the element was located.
    - The element may have been removed and re-added to the screen since it was located, such as an element being relocated. This can happen typically with a javascript framework when values are updated and the node is rebuilt.
    - Element may have been inside an iframe or another context which was refreshed.
- exception selenium.common.exceptions.TimeoutException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when a command does not complete in enough time.
- exception selenium.common.exceptions.UnableToSetCookieException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when a driver fails to set a cookie.
- exception selenium.common.exceptions.UnexpectedAlertPresentException(msg=None, screen=None, stacktrace=None, alert_text=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when an unexpected alert has appeared. Usually raised when an unexpected modal is blocking webdriver from executing any more commands.
  __init__(msg=None, screen=None, stacktrace=None, alert_text=None)
- exception selenium.common.exceptions.UnexpectedTagNameException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  Thrown when a support class did not get an expected web element.
- exception selenium.common.exceptions.UnknownMethodException(msg=None, screen=None, stacktrace=None)
  Bases: selenium.common.exceptions.WebDriverException
  The requested command matched a known URL but did not match a method for that URL.
- exception selenium.common.exceptions.WebDriverException(msg=None, screen=None, stacktrace=None)
  Bases: Exception
  Base webdriver exception.
  __init__(msg=None, screen=None, stacktrace=None)
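Several of these exceptions, StaleElementReferenceException in particular, are often transient: locating the element again and repeating the action may succeed. Below is a small generic sketch of that retry pattern. The helper name retry and its parameters are hypothetical, and the block deliberately avoids importing Selenium so that it runs on its own; the exception type to retry on is passed in by the caller.

```python
import time


def retry(action, retry_on, attempts=3, delay=0.5):
    """Run action(); if it raises `retry_on`, wait `delay` seconds and try again.

    `retry_on` is an exception class or a tuple of classes. The last failure
    is re-raised once `attempts` runs have been used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay)
```

With Selenium, the action should locate the element again on every attempt instead of reusing an old reference, for example retry(lambda: driver.find_element(By.ID, 'someid').click(), StaleElementReferenceException). Catching the base class WebDriverException would cover every exception listed above, but it also hides real bugs, so retrying on the narrowest type that fits is usually the safer choice.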
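Selenium Notes (8): Common Pitfalls
XPath lookups cannot return node attributes directly
When we use XPath elsewhere, we can usually read a node's attribute directly with @class, as shown below:
page.xpath('//div/a/@class')
Selenium does not support this usage. You have to locate the node first and then call get_attribute(name) to read the attribute:
driver.find_element(By.XPATH, '//div/a').get_attribute('class')
Likewise, Selenium does not support expressions whose result comes from XPath functions such as string() or text(). A lookup can only return element nodes, so read the element's text property after locating it.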
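As a sketch of the workaround, the two helpers below collect attribute values and text from every matched element, which is roughly what @class and text() would return in a plain XPath library. The function names are made up for illustration, and driver is assumed to be a started WebDriver instance. The string "xpath" is the value of By.XPATH; it is written out so the block needs no imports.

```python
def attribute_values(driver, xpath, name):
    """Return attribute `name` of every element matched by `xpath`."""
    elements = driver.find_elements("xpath", xpath)
    return [element.get_attribute(name) for element in elements]


def element_texts(driver, xpath):
    """Return the visible text of every element matched by `xpath`."""
    return [element.text for element in driver.find_elements("xpath", xpath)]
```

For example, attribute_values(driver, '//div/a', 'class') stands in for the unsupported '//div/a/@class'.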
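Element still not found even with WebDriverWait
Quite often a simple element cannot be found even though an explicit wait is in place. If a careful review shows nothing wrong with the code, the cause is usually one of the following:
- Because of the resolution (window size) setting, the element is currently not visible.
- Some elements are only loaded after the page has been scrolled down.
- The element is briefly covered by another element and therefore cannot be located.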
1. Resolution
In this case, set the window size so that the element is displayed within the page.
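One possible way to do this at runtime, rather than through a startup argument, is to check the current window size and enlarge it when needed. The helper below is only a sketch: ensure_window_size is a hypothetical name, driver is assumed to be a started WebDriver instance, and the defaults reuse the 1366x768 size from the headless example earlier in these notes.

```python
def ensure_window_size(driver, min_width=1366, min_height=768):
    """Enlarge the browser window if it is smaller than the given size.

    Returns the (width, height) in effect after the call.
    """
    size = driver.get_window_size()  # dict with "width" and "height" keys
    width = max(size["width"], min_width)
    height = max(size["height"], min_height)
    if (width, height) != (size["width"], size["height"]):
        driver.set_window_size(width, height)
    return width, height
```

Calling it right after the browser starts keeps the layout, and therefore the lookup results, consistent between runs.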
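2. The page needs to be scrolled
For performance reasons, some pages do not load elements that lie below the current screen; they are only loaded as the page is scrolled down.
The Selenium API used in these notes has no dedicated method for scrolling down, so we use JavaScript to scroll the page:
driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
Some scrolling approaches found online do not work in Chrome, but this one does.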
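On pages that keep loading more content, a single scroll is often not enough. The sketch below repeats the same JavaScript call until the page height stops growing. The name scroll_to_bottom, the pause length and the round limit are assumptions to adapt, and driver is assumed to be a started WebDriver instance.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down until the page height stops growing.

    Returns the number of scroll rounds that were performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return round_number
        last_height = new_height
    return max_rounds
```

The max_rounds limit keeps the loop from running forever on pages with endless feeds.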
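3. Covered by another element
Sometimes, because of pop-up elements, the element we want cannot be found if we keep using EC.presence_of_element_located(). In that case we should change the condition used to test for the element:
element = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.XPATH, ''))
)
EC.visibility_of_element_located() returns the element only once it has become visible.
When an element cannot be found or cannot be interacted with, choose the explicit-wait condition flexibly to fit the situation at hand.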