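Installing Playwright:
Python 3.7 or later is required.
If Python is not installed or the installed version is too old, download it from the official Python website.
Install Playwright together with the browser binaries for Chromium, Firefox, and WebKit.
Install with pip: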
pip install --upgrade pip
pip install playwright
playwright install
Install with conda:
conda config --add channels conda-forge
conda config --add channels microsoft
conda install playwright
playwright install
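Before moving on, it can help to confirm that the interpreter meets the version requirement and that the package is importable. The following is a minimal sketch, not part of Playwright itself; the helper names `python_is_supported` and `playwright_is_installed` are made up for this illustration, and the check does not verify that the browser binaries were downloaded.

```python
import sys
from importlib import util

# Minimum Python version stated in this post.
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info):
    """Return True if the given (major, minor, ...) version is new enough."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def playwright_is_installed():
    """Return True if the 'playwright' package can be found by the importer."""
    return util.find_spec("playwright") is not None


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("playwright importable:", playwright_is_installed())
```

If the second check reports False, the pip or conda install probably ran against a different interpreter or environment.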
Once the install succeeds, import the module and run a simple synchronous example:
from playwright.sync_api import sync_playwright
with sync_playwright() as sp:
    browser = sp.chromium.launch()
    page = browser.new_page()
    page.goto("http://playwright.dev")
    print(page.title())
    browser.close()
The same example can also be written asynchronously:
import asyncio
from playwright.async_api import async_playwright
async def main():
    async with async_playwright() as ap:
        browser = await ap.chromium.launch()
        page = await browser.new_page()
        await page.goto("http://playwright.dev")
        print(await page.title())
        await browser.close()
# Run the coroutine; without this call main() is never executed
asyncio.run(main())
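Before running, you can pass options when launching the browser as needed:
# Playwright runs headless by default; pass headless=False to show the browser UI. slow_mo slows each operation down by the given number of milliseconds
sp.chromium.launch(headless=False, slow_mo=50)
# Pause for a fixed time in milliseconds; use this instead of time.sleep(). In the async API, write: await page.wait_for_timeout(5000)
page.wait_for_timeout(5000)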
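One way to keep these options in a single place is to build the keyword arguments from a debug flag and pass them to `launch()`. This is only a sketch: `launch_options` and `open_title` are hypothetical helper names, the 50 ms and 1000 ms values are arbitrary, and `open_title` needs Playwright and its browsers installed to actually run.

```python
def launch_options(debug=False, slow_mo_ms=50):
    """Build keyword arguments for chromium.launch().

    In debug mode the browser window is shown and every operation is
    slowed down by slow_mo_ms milliseconds; otherwise run headless.
    """
    if debug:
        return {"headless": False, "slow_mo": slow_mo_ms}
    return {"headless": True}


def open_title(url, debug=False):
    """Open url and return the page title (requires installed browsers)."""
    # Imported here so the helper above can be used without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as sp:
        browser = sp.chromium.launch(**launch_options(debug))
        try:
            page = browser.new_page()
            page.goto(url)
            # Fixed pause in milliseconds, in place of time.sleep().
            page.wait_for_timeout(1000)
            return page.title()
        finally:
            browser.close()
```

Calling `open_title("http://playwright.dev", debug=True)` would then open a visible browser window with slowed-down operations.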
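Playwright can also record a script, generating code automatically from your actions in the browser.
Run in a terminal:
playwright codegen https://www.bilibili.com
The window on the right is the Inspector, a Playwright GUI tool that helps you author and debug Playwright scripts.
Using Playwright in Python's interactive mode (REPL) is much the same as using it in an IDE.
Note that because the instance is not created through the Playwright context manager, it has to be stopped manually at the end.
Synchronous mode: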
>>> from playwright.sync_api import sync_playwright
>>> playwright = sync_playwright().start()
# Use playwright.chromium, playwright.firefox or playwright.webkit
# Pass headless=False to launch() to see the browser UI
>>> browser = playwright.chromium.launch()
>>> page = browser.new_page()
>>> page.goto("http://whatsmyuseragent.org/")
>>> page.screenshot(path="example.png")
>>> browser.close()
>>> playwright.stop()
Asynchronous mode. First start the asyncio REPL:
python -m asyncio
Then run:
>>> from playwright.async_api import async_playwright
>>> playwright = await async_playwright().start()
>>> browser = await playwright.chromium.launch()
>>> page = await browser.new_page()
>>> await page.goto("http://whatsmyuseragent.org/")
>>> await page.screenshot(path="example.png")
>>> await browser.close()
>>> await playwright.stop()
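When the same start()/stop() style is used in a script rather than in the REPL, wrapping the work in try/finally ensures the browser and the Playwright driver are shut down even if a step fails. The sketch below assumes the synchronous API; `take_screenshot` and its `start_playwright` parameter are hypothetical names introduced here so the cleanup logic can be exercised without a real browser.

```python
def take_screenshot(url, path, start_playwright=None):
    """Save a screenshot of url to path, always cleaning up afterwards."""
    if start_playwright is None:
        # Imported lazily; this default requires Playwright to be installed.
        from playwright.sync_api import sync_playwright

        def start_playwright():
            return sync_playwright().start()

    playwright = start_playwright()
    try:
        browser = playwright.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            page.screenshot(path=path)
        finally:
            # Close the browser even if goto() or screenshot() raised.
            browser.close()
    finally:
        # Counterpart of start(); needed because no 'with' block is used.
        playwright.stop()
```

In most scripts the `with sync_playwright() as sp:` form is simpler, since it performs the same cleanup automatically.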
Playwright can also be combined with PyInstaller to build a standalone executable:
PLAYWRIGHT_BROWSERS_PATH=0 playwright install chromium
pyinstaller -F main.py
If the pyinstaller module is not installed, install it manually:
pip install pyinstaller
Quoted from the official documentation:
incompatible with SelectorEventLoop of asyncio on Windows
Playwright runs the driver in a subprocess, so it requires ProactorEventLoop of asyncio on Windows because SelectorEventLoop does not supports async subprocesses.
On Windows Python 3.7, Playwright sets the default event loop to ProactorEventLoop as it is default on Python 3.8+.
Threading
Playwright’s API is not thread-safe. If you are using Playwright in a multi-threaded environment, you should create a playwright instance per thread. See threading issue for more details.
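In short: when running asyncio programs on Windows, use ProactorEventLoop rather than SelectorEventLoop, because SelectorEventLoop does not support asynchronous subprocesses. Also note that when Playwright is used from multiple threads, each thread must create its own Playwright instance.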
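The per-thread rule can be sketched as follows: each worker thread opens its own `sync_playwright()` context and never shares a browser or page with another thread. This is an illustrative sketch rather than an official pattern; `fetch_title`, `fetch_titles`, and the `open_playwright` parameter are hypothetical names, and the default path requires Playwright and its browsers to be installed.

```python
import threading


def fetch_title(url, results, lock, open_playwright=None):
    """Thread worker: creates its own Playwright instance for one URL."""
    if open_playwright is None:
        # Imported lazily; requires Playwright to be installed.
        from playwright.sync_api import sync_playwright
        open_playwright = sync_playwright

    # A fresh Playwright instance that is used only by this thread.
    with open_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            title = page.title()
        finally:
            browser.close()

    # Only the plain result is shared between threads.
    with lock:
        results[url] = title


def fetch_titles(urls, open_playwright=None):
    """Fetch the title of every URL, using one thread per URL."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=fetch_title,
                         args=(url, results, lock, open_playwright))
        for url in urls
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Only plain values such as the title string cross thread boundaries here; the Playwright objects themselves stay inside the thread that created them.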