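I recently needed to use selenium, and it threw an error on the very first run...
Prerequisites:
1. Install selenium
2. Download the chromedriver build that matches your Chrome version
The code is just a simple demo: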
from selenium import webdriver
import time
browser = webdriver.Chrome()
browser.get('http://www.baidu.com/')
time.sleep(10)
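Running it raises an error:
The advice online is to add chromedriver to the PATH environment variable, but after doing that it still failed!!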
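Before digging into the source, it can help to check what the Python process itself sees on PATH. The snippet below is a minimal sketch using only the standard library; the helper names find_on_path and path_entries are made up for illustration. One possible reason for "still failing" (an assumption, since the post does not say exactly what was added) is that PATH must contain the folder holding chromedriver, not the executable file itself, and that an already-open terminal or IDE keeps its old PATH until restarted.

```python
import os
import shutil


def find_on_path(name="chromedriver"):
    """Return the full path that `name` resolves to on PATH, or None."""
    # On Windows, shutil.which also tries the extensions in PATHEXT (e.g. .exe).
    return shutil.which(name)


def path_entries():
    """List the directories in this process's PATH, in search order."""
    raw = os.environ.get("PATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    print("chromedriver resolves to:", find_on_path())
    for entry in path_entries():
        print("  ", entry)
```

If the first line prints None, the process running your script cannot see chromedriver, no matter what the system settings dialog shows.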
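So let's just read the source code:
This walk-through is long and tedious; if you'd rather skip the details, jump straight to the solution at the end~~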
1. Entry point of the error:
browser = webdriver.Chrome()
2. First, look at where the exception is raised:
File "E:\APP_Install\python_install\lib\site-packages\selenium\webdriver\common\service.py", line 83, in start
    os.path.basename(self.path), self.start_error_message)
Corresponding code:
def start(self):
    """
    Starts the Service.
    :Exceptions:
     - WebDriverException : Raised either when it can't start the service
       or when it can't connect to the service
    """
    try:
        cmd = [self.path]
        cmd.extend(self.command_line_args())
        self.process = subprocess.Popen(cmd, env=self.env,
                                        close_fds=system() != 'Windows',
                                        stdout=self.log_file,
                                        stderr=self.log_file,
                                        stdin=PIPE,
                                        creationflags=self.creationflags)
    except TypeError:
        raise
    except OSError as err:
        if err.errno == errno.ENOENT:
            raise WebDriverException(
                "'%s' executable needs to be in PATH. %s" % (
                    os.path.basename(self.path), self.start_error_message)
            )
        elif err.errno == errno.EACCES:
            raise WebDriverException(
                "'%s' executable may have wrong permissions. %s" % (
                    os.path.basename(self.path), self.start_error_message)
            )
        else:
            raise
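The start() method above turns an OSError from subprocess.Popen into a WebDriverException. The same mechanism can be reproduced without selenium. This is a rough sketch, not selenium's code: try_start is a hypothetical helper, and the executable name in the usage note is deliberately bogus.

```python
import errno
import subprocess


def try_start(executable):
    """Try to launch `executable` and return a short diagnosis string."""
    try:
        process = subprocess.Popen(
            [executable],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError as err:
        # Same two errno branches that selenium's start() checks.
        if err.errno == errno.ENOENT:
            return "not found"
        if err.errno == errno.EACCES:
            return "permission denied"
        raise
    # The process did start: stop it again, this is only a probe.
    process.kill()
    process.wait()
    return "started"
```

Calling try_start("no-such-driver-binary-12345") takes the ENOENT branch, which is the branch that produces the "executable needs to be in PATH" message in selenium.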
3. The try block in start() failed, and the culprit is clearly self.path. Inside the Service class, self.path = executable, so trace back up to the caller of start() to find where executable is passed in:
File "E:\APP_Install\python_install\lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 90, in __init__
    self.service.start()
Relevant code:
class ChromiumDriver(RemoteWebDriver):
    """
    Controls the WebDriver instance of ChromiumDriver and allows you to drive the browser.
    """
    def __init__(self, browser_name, vendor_prefix,
                 port=DEFAULT_PORT, options: BaseOptions = None, service_args=None,
                 desired_capabilities=None, service_log_path=DEFAULT_SERVICE_LOG_PATH,
                 service: Service = None, keep_alive=DEFAULT_KEEP_ALIVE):
        if desired_capabilities:
            warnings.warn('desired_capabilities has been deprecated, please pass in a Service object',
                          DeprecationWarning, stacklevel=2)
        if port != DEFAULT_PORT:
            warnings.warn('port has been deprecated, please pass in a Service object',
                          DeprecationWarning, stacklevel=2)
        self.port = port
        if service_log_path != DEFAULT_SERVICE_LOG_PATH:
            warnings.warn('service_log_path has been deprecated, please pass in a Service object',
                          DeprecationWarning, stacklevel=2)
        if keep_alive != DEFAULT_KEEP_ALIVE and type(self) == __class__:
            warnings.warn('keep_alive has been deprecated, please pass in a Service object',
                          DeprecationWarning, stacklevel=2)
        else:
            keep_alive = True
        self.vendor_prefix = vendor_prefix
        _ignore_proxy = None
        if not options:
            options = self.create_options()
        if desired_capabilities:
            for key, value in desired_capabilities.items():
                options.set_capability(key, value)
        if options._ignore_local_proxy:
            _ignore_proxy = options._ignore_local_proxy
        if not service:
            raise AttributeError('service cannot be None')
        self.service = service
        self.service.start()
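4. At first glance nothing looks wrong: nothing special is defined or done before self.service.start(), and no "self.path" is passed in. This is where Python inheritance comes into play. I looked at the parent class RemoteWebDriver and found nothing that would cause the error, so the subclass is the next place to look. We have overlooked one thing so far, the entry point of the error: browser = webdriver.Chrome()
The code behind webdriver.Chrome() is as follows: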
E:\APP_Install\python_install\Lib\site-packages\selenium\webdriver\chrome\webdriver.py
class WebDriver(ChromiumDriver):
    def __init__(self, executable_path=DEFAULT_EXECUTABLE_PATH, port=DEFAULT_PORT,
                 options: Options = None, service_args=None,
                 desired_capabilities=None, service_log_path=DEFAULT_SERVICE_LOG_PATH,
                 chrome_options=None, service: Service = None, keep_alive=DEFAULT_KEEP_ALIVE):
        if executable_path != 'chromedriver':
            warnings.warn('executable_path has been deprecated, please pass in a Service object',
                          DeprecationWarning, stacklevel=2)
        if chrome_options:
            warnings.warn('use options instead of chrome_options',
                          DeprecationWarning, stacklevel=2)
            options = chrome_options
        if keep_alive != DEFAULT_KEEP_ALIVE:
            warnings.warn('keep_alive has been deprecated, please pass in a Service object',
                          DeprecationWarning, stacklevel=2)
        else:
            keep_alive = True
        if not service:
            service = Service(executable_path, port, service_args, service_log_path)
        super(WebDriver, self).__init__(DesiredCapabilities.CHROME['browserName'], "goog",
                                        port, options,
                                        service_args, desired_capabilities,
                                        service_log_path, service, keep_alive)
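5. Now it all connects: the subclass calls the parent class's __init__ method:
super(WebDriver, self).__init__  # ties back to step 3
6. Now focus on the arguments passed to the Service class, i.e. service, the second-to-last argument of super(WebDriver, self).__init__: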
service = Service(executable_path, port, service_args, service_log_path)
Corresponding code:
E:\APP_Install\python_install\Lib\site-packages\selenium\webdriver\chrome\service.py
class Service(service.ChromiumService):
    def __init__(self, executable_path: str = DEFAULT_EXECUTABLE_PATH,
                 port: int = 0, service_args: List[str] = None,
                 log_path: str = None, env: dict = None):
        super(Service, self).__init__(
            executable_path,
            port,
            service_args,
            log_path,
            env,
            "Please see https://chromedriver.chromium.org/home")
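Some readers may have a question here:
The Service here is not the same class as the Service in step 2!
Correct. There are two more levels of inheritance in between, which you can follow by Ctrl-clicking into each call:
1.super(Service, self).__init__
2.service.Service.__init__(self, executable_path, port=port, env=env, start_error_message=start_error_message)
7. From the chain in step 6 we know that the self.path that failed in the step 2 Service is the executable_path passed here. Let's look at how this parameter is defined: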
executable_path=DEFAULT_EXECUTABLE_PATH
8. Digging all the way down, here is the definition of that constant:
DEFAULT_EXECUTABLE_PATH = "chromedriver"
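9. And there it is: no executable_path was passed in, so the default "chromedriver" was used, and the operating system could not find an executable with that name on PATH, which is what raised the error. Let's confirm with the documentation of this parameter: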
- executable_path - Deprecated: path to the executable. If the default is used it assumes the executable is in the $PATH
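To make the chain from steps 6 to 9 easier to follow, here is a stripped-down model of the three Service layers. The class names BaseService, ChromiumService and ChromeService are simplified stand-ins written for this post, not selenium's real classes; only the way executable_path is forwarded is kept.

```python
DEFAULT_EXECUTABLE_PATH = "chromedriver"


class BaseService:
    """Stand-in for the common Service class whose start() raised the error."""

    def __init__(self, executable, port=0, env=None, start_error_message=""):
        self.path = executable  # this is the self.path used by start()
        self.port = port
        self.env = env
        self.start_error_message = start_error_message


class ChromiumService(BaseService):
    """Stand-in for the middle layer: forwards executable_path unchanged."""

    def __init__(self, executable_path, port=0, service_args=None,
                 log_path=None, env=None, start_error_message=""):
        self.service_args = service_args or []
        BaseService.__init__(self, executable_path, port=port, env=env,
                             start_error_message=start_error_message)


class ChromeService(ChromiumService):
    """Stand-in for the Chrome Service: this is where the default comes from."""

    def __init__(self, executable_path=DEFAULT_EXECUTABLE_PATH, port=0,
                 service_args=None, log_path=None, env=None):
        super().__init__(executable_path, port, service_args, log_path, env,
                         "Please see https://chromedriver.chromium.org/home")


if __name__ == "__main__":
    print(ChromeService().path)
    print(ChromeService(r"C:\tools\chromedriver.exe").path)
```

With no argument, path ends up as the bare name "chromedriver", which only works if the OS can resolve it through PATH. With an explicit path (the C:\tools location is just an example), that exact file is launched instead.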
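10. At this point there are two possible fixes:
1. Change DEFAULT_EXECUTABLE_PATH directly to the chromedriver path on your machine
Ha, but that only works on your own machine, so the better option is:
2. Pass executable_path from the script. It is the first parameter of webdriver.Chrome(), so just pass the path in and you're done: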
from selenium import webdriver
import time
browser = webdriver.Chrome(r"F:\google_download\chromedriver_99.0/chromedriver.exe")
browser.get('http://www.baidu.com/')
time.sleep(10)
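As the warning text in step 4 says, executable_path is deprecated in favour of passing a Service object, and newer selenium releases may reject the positional path entirely. Below is a hedged sketch of that variant. resolve_driver_path and make_browser are hypothetical helpers, and the CHROMEDRIVER_PATH environment variable is an invented convention, not something selenium reads. The selenium imports sit inside make_browser so that the path helper can be used on its own.

```python
import os
import shutil


def resolve_driver_path(explicit_path=None, env_var="CHROMEDRIVER_PATH"):
    """Return an absolute chromedriver path, or raise FileNotFoundError.

    Lookup order: explicit argument, then the (hypothetical) environment
    variable named by env_var, then a normal PATH lookup.
    """
    candidates = [
        explicit_path,
        os.environ.get(env_var),
        shutil.which("chromedriver"),
    ]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate):
            return os.path.abspath(candidate)
    raise FileNotFoundError("chromedriver not found; pass its full path explicitly")


def make_browser(driver_path=None):
    """Create a Chrome WebDriver through a Service object."""
    # Imported here so resolve_driver_path works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    service = Service(executable_path=resolve_driver_path(driver_path))
    return webdriver.Chrome(service=service)
```

Usage would then look like browser = make_browser(r"F:\google_download\chromedriver_99.0/chromedriver.exe"), followed by the same browser.get(...) call as before. The up-front os.path.isfile check means a wrong path fails with a clear message before selenium ever tries to start the process.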