Puppeteer
随着phantomjs的bug越来越多,也无人维护后,谷歌推出的Puppeteer已达60Kstar,替代是毫无疑问。如果不考虑浏览器兼容问题,Puppeteer的简洁的接口,更简单的Js语法,做自动化测试也是不错的方案。
Puppeteer是什么?
Puppeteer是一个Node库,它提供了高级API来通过DevTools协议控制Chrome或Chromium,简单理解成我们日常使用的Chrome的无界面版本,可以使用js接口进行进行操控。意味凡是Chrome浏览器能干的事情,Puppeteer都能出色的完成,比如:
安装Puuueteer
依赖npm,建议使用cnpmnpm i puppeteer
或者yarn add puppeteer
安装好后,会自动安装Chromium,大约100多M,目前puppeteer的版本已到3.0.2。
Puppeteer API结构
Puppeteer API是分层的,和浏览器保持一致,如图所示(褪色实体当前未在Puppeteer中表示):
Puppeteer API结构
Puppeteer 入门教程和实践
Puppeteer 官方推荐的是使用高版本 Node 的 async/await 语法
const puppeteer = require('puppeteer'); (async () => { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto('https://www.baidu.com'); //打开一个网页 // 执行其他脚步... await browser.close(); })();
网站截图:
const puppeteer = require('puppeteer'); const iPhone = puppeteer.devices['iPhone 8']; //指定为手机端 (async () => { //const browser = await puppeteer.launch(); const browser = await puppeteer.launch({ headless: false //可以看到打开浏览器效果,默认值true }); const page = await browser.newPage(); await page.emulate(iPhone); await page.goto('https://www.baidu.com'); //quality 指定照片质量0-100,不支持png格式图片 await page.screenshot({ path: 'baidu.jpg', quality: 100, fullPage: true }); await browser.close(); })();
生成PDF:
const puppeteer = require('puppeteer'); (async () => { const browser = await puppeteer.launch(); const page = await browser.newPage(); //waitUntil:networkidle2 会一直等待,直到页面加载后同时没有存在 2 个以上的资源请求,这个种状态持续至少 500 ms await page.goto('https://www.baidu.com', {waitUntil: 'networkidle2'}); await page.pdf({path: 'baidu.pdf', format: 'A4'}); await browser.close(); })();
测试Chrome扩展程序:
Chrome / Chromium中的扩展程序当前仅在non-headless模式下工作
const puppeteer = require('puppeteer'); (async () => { const pathToExtension = require('path').join(__dirname, 'my-extension'); const browser = await puppeteer.launch({ headless: false, args: [ `--disable-extensions-except=${pathToExtension}`, `--load-extension=${pathToExtension}` ] }); const targets = await browser.targets(); const backgroundPageTarget = targets.find(target => target.type() === 'background_page'); const backgroundPage = await backgroundPageTarget.page(); // Test the background page as you would any other page. await browser.close(); })();
高级版抓图淘宝网站
const puppeteer = require('puppeteer');
(async() => {
// 启动Chromium
const browser = await puppeteer.launch({
ignoreHTTPSErrors: true,
headless: false,
args: ['--no-sandbox']
});
// 打开新页面
const page = await browser.newPage();
// 设置页面分辨率
await page.setViewport({
width: 1920,
height: 1080
});
let request_url = 'https://www.taobao.com';
// 访问
await page.goto(request_url, {
waitUntil: 'domcontentloaded'
}).catch(err => console.log(err));
await page.waitFor(1000);
let title = await page.title();
console.log(title);
// 网页加载最大高度
const max_height_px = 20000;
// 滚动高度
let scrollStep = 1080;
let height_limit = false;
let mValues = {
'scrollEnable': true,
'height_limit': height_limit
};
while (mValues.scrollEnable) {
mValues = await page.evaluate((scrollStep, max_height_px, height_limit) =>{
if (document.scrollingElement) {
let scrollTop = document.scrollingElement.scrollTop;
document.scrollingElement.scrollTop = scrollTop + scrollStep;
if (null != document.body && document.body.clientHeight > max_height_px) { height_limit = true; } else if (document.scrollingElement.scrollTop + scrollStep > max_height_px) { height_limit = true; } let scrollEnableFlag = false; if (null != document.body) { scrollEnableFlag = document.body.clientHeight > scrollTop + 1081 && !height_limit; } else { scrollEnableFlag = document.scrollingElement.scrollTop + scrollStep > scrollTop + 1081 && !height_limit; } return { 'scrollEnable': scrollEnableFlag, 'height_limit': height_limit, 'document_scrolling_Element_scrollTop': document.scrollingElement.scrollTop };
}
},
scrollStep, max_height_px, height_limit);
await sleep(800);
}
try {
await page.screenshot({
path: "taobao1.jpg",
quality: 10,
fullPage: true
}).catch(err => {
console.log('截图失败');
console.log(err);
});
await page.waitFor(5000);
} catch (e) {
console.log('执行异常');
} finally {
await browser.close();
}
})();
//延时函数
function sleep(delay) {
return new Promise((resolve, reject) => {
setTimeout(() => {
try {
resolve(1)
} catch (e) {
reject(0)
}
}, delay)
})
}
原文链接:https://www.ffeeii.com/1824.html