WebDriver and Selenium Usage Summary

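Maven dependencies

    <properties>
        <selenium.version>2.53.1</selenium.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-java</artifactId>
            <version>${selenium.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-remote-driver</artifactId>
            <version>${selenium.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-server</artifactId>
            <version>${selenium.version}</version>
        </dependency>

        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.17</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.17.0</version>
        </dependency>
    </dependencies>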

Local deployment


Downloading the driver

Download the chromedriver release that matches your installed Chrome version.

Download location

http://chromedriver.storage.googleapis.com/index.html


If that site is blocked, use the mirror in mainland China:

https://npm.taobao.org/mirrors/chromedriver/

Checking the Chrome version

chrome://version/

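You can also check the version under Help > About Google Chrome, but opening that page triggers a browser update, so avoid it unless you actually intend to update.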

Usage example

    public  static void startChrome(){
        System.out.println("start chrome browser...");
        System.setProperty("webdriver.chrome.driver","C:\\chromedriver_win32_3_102.0.5\\chromedriver.exe"); // path to the driver binary
        DesiredCapabilities capabilities = DesiredCapabilities.chrome();
        WebDriver driver = new ChromeDriver(capabilities);
        driver.get("http://www.baidu.com/");
        System.out.println("start chrome browser succeed...");
        driver.quit(); // quit() ends the session and stops chromedriver; close() only closes the current window
    }

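Local deployment with yum

CentOS, Chrome, chromedriver, Python, and Xvfb without a graphical desktop; requires CentOS 7 or later.

Installing Xvfb

Xvfb is a virtual display server that implements the X server protocol and runs entirely in memory. A browser that shows a window needs an X display service, so install Xvfb as follows.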

yum install Xvfb -y
yum install libXfont -y
yum install xorg-x11-fonts* -y

To avoid obscure failures caused by missing dependencies, you can also run:

yum install zlib-devel bzip2 bzip2-devel readline-devel sqlite sqlite-devel openssl-devel xz xz-devel -y


Installing google-chrome

wget https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm
yum install google-chrome-stable_current_x86_64.rpm

chromedriver-to-Chrome version mapping for Selenium

https://sites.google.com/a/chromium.org/chromedriver/downloads

If that site is blocked, use the mirror in mainland China:

https://npm.taobao.org/mirrors/chromedriver/


Starting Xvfb

Xvfb -ac :99 -screen 0 1280x1024x16 & export DISPLAY=:99

Test manually by starting google-chrome:
[root@linjie test]# google-chrome

[9311:9311:0521/164942.275865:ERROR:zygote_host_impl_linux.cc(88)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.


Checking the Chrome version

(base) [root@localhost ~]# google-chrome --version
Google Chrome 109.0.5414.74 
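Since the driver has to match the installed browser, it can help to compare the two version strings before launching anything. The sketch below is a hypothetical helper (`ChromeVersionCheck` is not part of Selenium). It assumes the usual chromedriver rule that the first three version components (major.minor.build) must agree, and it takes the text printed by `google-chrome --version` and `chromedriver --version` as input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Selenium.
final class ChromeVersionCheck {
    // Finds the first four-part version such as 109.0.5414.74 anywhere in the text.
    private static final Pattern VERSION =
            Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)");

    /** Returns "major.minor.build" from version output, or null if no version is found. */
    static String buildPrefix(String text) {
        Matcher m = VERSION.matcher(text);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + "." + m.group(2) + "." + m.group(3);
    }

    /** True when both texts contain a version and their major.minor.build parts agree. */
    static boolean sameBuild(String chromeVersionText, String driverVersionText) {
        String chrome = buildPrefix(chromeVersionText);
        String driver = buildPrefix(driverVersionText);
        return chrome != null && chrome.equals(driver);
    }
}
```

For example, `ChromeVersionCheck.sameBuild("Google Chrome 109.0.5414.74", "ChromeDriver 81.0.4044.69")` returns false, which indicates that a different driver release should be downloaded.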

Handling the "DevToolsActivePort file doesn't exist" error:

selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)

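1. Download the driver

Replace the version in the URL below with the release that matches your installed Chrome; 81.0.4044.69 is only an example.

wget https://chromedriver.storage.googleapis.com/81.0.4044.69/chromedriver_linux64.zip
unzip chromedriver_linux64.zip

2. Copy the driver into place

chmod +x chromedriver
mv -f chromedriver /usr/local/share/chromedriver
ln -s /usr/local/share/chromedriver /usr/local/bin/chromedriver
ln -s /usr/local/share/chromedriver /usr/bin/chromedriver

3. Make sure chromedriver is on PATH

PATH entries must be directories, so add the directory that holds the symlink rather than the path of the binary itself:

export PATH=/usr/local/bin:$PATH

4. Add option.add_argument('--no-sandbox')

When Chrome is launched as root, as in the manual test above, it refuses to start unless the sandbox is disabled. Adding option.add_argument('--no-sandbox') resolves the DevToolsActivePort error.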

Test it from Python

# coding=utf-8
from selenium import webdriver

option = webdriver.ChromeOptions()
option.add_argument('--headless')
option.add_argument('--no-sandbox')
# Pass the options with the "options" keyword; "chrome_options" is deprecated.
driver = webdriver.Chrome(options=option)
driver.get('https://www.google.com')
print(driver.title)
# Quit at the end so the browser and driver processes exit.
driver.quit()

Source:

Installing Chrome (Headless) and using it from Python - Tencent Cloud Developer Community

Grid deployment


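Deploying Grid with Docker

1. Enable IPv4 forwarding

Install Docker on the server first; configuring a registry mirror inside mainland China is recommended (I use Alibaba Cloud's).
Docker relies on port mapping, so the server must have IPv4 forwarding enabled:

vim /etc/sysctl.conf
# Add the following setting
net.ipv4.ip_forward=1
# Restart the network service
systemctl restart network
# A result of 1 means forwarding is enabled
sysctl net.ipv4.ip_forward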

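2. Pull the images

# Pull the hub image

docker pull selenium/hub

# Pull the Chrome node image (the browser runs inside the container, with no physical display)

docker pull selenium/node-chrome

# List the images that are now available locally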

docker images

3. Create the network

docker network create grid


4. Create the hub container

If no tag is specified, this step pulls the latest image and starts it.

docker run -d -p 4442-4444:4442-4444 --net grid --name selenium-hub selenium/hub

5. Create the node containers

docker run -d --net grid -e SE_EVENT_BUS_HOST=selenium-hub  --shm-size=2g  -e SE_EVENT_BUS_PUBLISH_PORT=4442  -e SE_EVENT_BUS_SUBSCRIBE_PORT=4443  --name selenium-node-chrome1 selenium/node-chrome

# For more nodes, run the same command again with a different --name, for example:

docker run -d --net grid -e SE_EVENT_BUS_HOST=selenium-hub  --shm-size=2g  -e SE_EVENT_BUS_PUBLISH_PORT=4442  -e SE_EVENT_BUS_SUBSCRIBE_PORT=4443  --name selenium-node-chrome2 selenium/node-chrome

6. Open the console

http://192.168.10.130:4444/ui#
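Besides the web console, Selenium Grid 4 exposes a `/status` endpoint that returns JSON containing a `ready` flag, so readiness can also be checked from code before a test run starts. The sketch below is a hypothetical helper (`GridStatus` is not a Selenium class). It needs Java 11 or later for `java.net.http`, and it uses a crude text match instead of a JSON parser, so treat it as an illustration rather than a robust client.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Selenium.
final class GridStatus {
    // Matches "ready": true, allowing whitespace around the colon.
    private static final Pattern READY = Pattern.compile("\"ready\"\\s*:\\s*true");

    /** Returns true if the /status JSON text reports the Grid as ready. */
    static boolean isReady(String statusJson) {
        return statusJson != null && READY.matcher(statusJson).find();
    }

    /** Fetches gridBase + "/status", where gridBase is e.g. "http://192.168.10.130:4444". */
    static boolean fetchReady(String gridBase) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(gridBase + "/status"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200 && isReady(response.body());
    }
}
```

Calling `GridStatus.fetchReady("http://192.168.10.130:4444")` against the hub started above should return true once at least one node has registered.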


Usage example

    public  static void startRemoteChrome() throws MalformedURLException {
        System.out.println("start chrome browser...");
        DesiredCapabilities capabilities = DesiredCapabilities.chrome();
        WebDriver driver = new RemoteWebDriver(new URL("http://192.168.10.130:4444/wd/hub"),capabilities);
        driver.get("http://www.baidu.com/");
        System.out.println("start chrome browser succeed...");
        driver.quit(); // quit() ends the remote session so the node is freed for the next test
    }

DesiredCapabilities configuration

  public static WebDriver initDriver(String ip, String port, String domain, String cookieStr) {
        DesiredCapabilities capabilities = DesiredCapabilities.chrome();
        capabilities.setCapability("applicationCacheEnabled", true);

        capabilities.setCapability(CapabilityType.ForSeleniumServer.AVOIDING_PROXY, true);
        capabilities.setCapability(CapabilityType.ForSeleniumServer.ONLY_PROXYING_SELENIUM_TRAFFIC, true);
        // System.setProperty("http.nonProxyHosts", "localhost");
        Proxy proxy = new Proxy();
        proxy.setHttpProxy(ip + ":" + port).setFtpProxy(ip + ":" + port).setSslProxy(ip + ":" + port);
        capabilities.setCapability(CapabilityType.PROXY, proxy);
        ChromeOptions options = new ChromeOptions();
        options.addArguments("--proxy-server=http://" + ip + ":" + port); // route traffic through the proxy
        options.addArguments("--lang=" + "zh-CN"); // set the browser language
//		options.addArguments("--disable-javascript"); // disable JavaScript
//		options.addArguments("--disable-plugins"); // disable plugins
//		options.addArguments("--disable-images"); // disable images
//		options.addArguments("--headless"); // headless mode
//		options.addArguments("--disable-gpu"); // disable GPU acceleration
        options.addArguments("--test-type", /* "--start-maximized", */ "no-default-browser-check"); // test mode; skip the default-browser check
        //options.addArguments("--test-type", "--ignore-certificate-errors"); // suppress Chrome's certificate error warnings
        // options.addArguments("user-data-dir=C:/Users/VULCAN/AppData/Local/Google/Chrome/User Data");
        // Starts Chrome with the default user's profile (bookmarks, extensions, proxy settings, and so on); close any Chrome window that already uses this profile before running
        //options.addArguments("user-data-dir=C:\\dev\\scoped_dir1808_11424");
        // Same as above, pointing at a different profile directory
        // options.addArguments("--test-type", "--incognito"); // start in incognito mode
        // options.addArguments("--test-type", "--disable-plugins"); // disable plugins
        // To add an extension, pass the path of its .crx file
        // options.addExtensions(new File("C:\\Users\\swang\\AppData\\Local\\Google\\Chrome\\UserData\\Default\\Extensions\\ijaobnmmgonppmablhldddpfmgpklbfh\\1.6.0_0.crx"));
        // On non-Windows systems, default to headless mode
        if(!windows){
            options.addArguments("--window-size=1920,5000"); // tall viewport so screenshots capture the whole page
            options.addArguments("--start-fullscreen");
            options.addArguments("--start-maximized");
            // headless settings
            options.addArguments("--headless"); // headless mode
            options.addArguments("--disable-gpu"); // disable GPU acceleration
            options.addArguments("--no-sandbox"); // disable the sandbox
        }
        options.addArguments("--user-agent=" + userAgent);
        capabilities.setCapability(ChromeOptions.CAPABILITY, options);
        //WebDriver driver = new ChromeDriver(options); // local deployment
        // To call a remote Grid instead, pass its URL, e.g. http://x.x.x.x:4444/wd/hub
        WebDriver driver = new RemoteWebDriver(service.getUrl(), capabilities);
        // Set timeouts
        driver.manage().timeouts().implicitlyWait(200, TimeUnit.SECONDS);
        driver.manage().timeouts().pageLoadTimeout(200, TimeUnit.SECONDS);
        driver.manage().timeouts().setScriptTimeout(200, TimeUnit.SECONDS);
        // Load cookies
        if (cookieStr != null && domain != null) {
            driver.get(domain);
            loadCookiesString(driver, cookieStr);
            driver.navigate().refresh();
        }
        return driver;
    }
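The platform-dependent switches in the method above can be collected in one place, which makes them easier to review and to test without starting a browser. The following is a minimal sketch under stated assumptions: `ChromeArgs` and `build` are hypothetical names, only a subset of the switches is shown, and `--window-size` is the spelling Chrome recognizes for the viewport switch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Selenium.
final class ChromeArgs {
    /**
     * Returns Chrome command-line switches. The headless-related switches are
     * added only when the host is not Windows, mirroring the method above.
     */
    static List<String> build(boolean windows, String userAgent) {
        List<String> args = new ArrayList<>();
        args.add("--lang=zh-CN");
        if (!windows) {
            args.add("--window-size=1920,5000"); // tall viewport for full-page screenshots
            args.add("--headless");
            args.add("--disable-gpu");
            args.add("--no-sandbox");
        }
        args.add("--user-agent=" + userAgent);
        return args;
    }
}
```

The resulting list can be handed to Selenium in one call with `options.addArguments(ChromeArgs.build(windows, userAgent))`, since `ChromeOptions.addArguments` also accepts a `List<String>`.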

Java Selenium wrapper utilities

Wrapping the basic selenium2java methods - Zhihu

Wrapping JavaScript calls in selenium2java

SeleniumUtils

package com.isi.utils;

import org.apache.commons.io.FileUtils;
import org.apache.log4j.Logger;
import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeDriverService;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.interactions.Actions;
import org.openqa.selenium.remote.CapabilityType;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.Date;
import java.util.Random;
import java.util.StringTokenizer;
import java.util.concurrent.TimeUnit;

/**
 * Created by VULCAN on 2017/7/4.
 */
public class SeleniumUtils {
    private static String userAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.101 Safari/537.36";
    private static final Logger logger = Logger.getLogger(SeleniumUtils.class);
    private static  boolean windows;

    static {
        String os = System.getProperty("os.name");
        if (os.toLowerCase().contains("windows")) {
//            System.setProperty("webdriver.chrome.driver", "C:/dev/chromedriver.exe");
            System.setProperty("webdriver.chrome.driver", "C:\\chromedriver_win32_3_102.0.5\\chromedriver.exe");
            windows=true;
        } else {
            System.setProperty("webdriver.chrome.driver", "/home/spider/test/chromedriver");
        }
        initservice();

    }

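    /**
     * Shared ChromeDriverService, started by initservice().
     * It is declared without an initializer on purpose: static initializers run in textual
     * order, so an explicit "= null" here would run after the static block above and
     * discard the service that was just started.
     */
    public static ChromeDriverService service;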
    private static ChromeDriverService initservice() {
         service = new ChromeDriverService.Builder().usingDriverExecutable(new File(System.getProperty("webdriver.chrome.driver"))).usingAnyFreePort().build();
        try {
            service.start();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return service;
    }


    public static synchronized WebDriver initDriver() {
//		ChromeDriverService service=initservice();
        DesiredCapabilities capabilities = DesiredCapabilities.chrome();
        capabilities.setCapability("applicationCacheEnabled", true);
        capabilities.setCapability("chrome.switches", Arrays.asList("--start-maximized", "--start-fullscreen"));
        ChromeOptions options = new ChromeOptions();
        options.addArguments("--lang=" + "zh-CN"); // set the browser language
        options.addArguments("--test-type", "--start-maximized" /*"no-default-browser-check"*/); // test mode; start maximized
        options.addArguments("--test-type", "--ignore-certificate-errors"); // suppress Chrome's certificate error warnings
        // options.addArguments("user-data-dir=C:/Users/VULCAN/AppData/Local/Google/Chrome/User Data");
        // Starts Chrome with the default user's profile (bookmarks, extensions, proxy settings, and so on); close any Chrome window that already uses this profile before running
//		 options.addArguments("user-data-dir=C:\\dev\\scoped_dir25620_6285");
        // Same as above, pointing at a different profile directory
        // options.addArguments("--test-type", "--incognito"); // start in incognito mode
        // options.addArguments("--test-type", "--disable-plugins"); // disable plugins
        // options.addExtensions(new File("C:\\Users\\swang\\AppData\\Local\\Google\\Chrome\\UserData\\Default\\Extensions\\ijaobnmmgonppmablhldddpfmgpklbfh\\1.6.0_0.crx")); // to add an extension, pass the path of its .crx file
        if(!windows){
            options.addArguments("--window-size=1920,5000"); // tall viewport so screenshots capture the whole page
            options.addArguments("--start-fullscreen");
            options.addArguments("--start-maximized");
            // headless settings
            options.addArguments("--headless"); // headless mode
            options.addArguments("--disable-gpu"); // disable GPU acceleration
            options.addArguments("--no-sandbox"); // disable the sandbox
        }
        options.addArguments("--user-agent=" + userAgent);
        capabilities.setCapability(ChromeOptions.CAPABILITY, options);
        WebDriver driver = new ChromeDriver(options);
//		WebDriver driver = new RemoteWebDriver(service.getUrl(), capabilities);
        driver.manage().timeouts().implicitlyWait(100, TimeUnit.SECONDS);
        driver.manage().timeouts().pageLoadTimeout(100, TimeUnit.SECONDS);
        driver.manage().timeouts().setScriptTimeout(100, TimeUnit.SECONDS);
        return driver;
    }

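    /**
     * Creates a WebDriver that goes through the given proxy and optionally preloads cookies.
     *
     * @param ip        proxy host
     * @param port      proxy port
     * @param domain    URL to open before the cookies are added; may be null
     * @param cookieStr cookies in the format read by loadCookiesString; may be null
     * @return the configured driver
     */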
    public static WebDriver initDriver(String ip, String port, String domain, String cookieStr) {
        DesiredCapabilities capabilities = DesiredCapabilities.chrome();
        capabilities.setCapability("applicationCacheEnabled", true);

        capabilities.setCapability(CapabilityType.ForSeleniumServer.AVOIDING_PROXY, true);
        capabilities.setCapability(CapabilityType.ForSeleniumServer.ONLY_PROXYING_SELENIUM_TRAFFIC, true);
        // System.setProperty("http.nonProxyHosts", "localhost");
        Proxy proxy = new Proxy();
        proxy.setHttpProxy(ip + ":" + port).setFtpProxy(ip + ":" + port).setSslProxy(ip + ":" + port);
        capabilities.setCapability(CapabilityType.PROXY, proxy);
        ChromeOptions options = new ChromeOptions();
        options.addArguments("--proxy-server=http://" + ip + ":" + port); // route traffic through the proxy
        options.addArguments("--lang=" + "zh-CN"); // set the browser language
//		options.addArguments("--disable-javascript"); // disable JavaScript
//		options.addArguments("--disable-plugins"); // disable plugins
//		options.addArguments("--disable-images"); // disable images
//		options.addArguments("--headless"); // headless mode
//		options.addArguments("--disable-gpu"); // disable GPU acceleration
        options.addArguments("--test-type", /* "--start-maximized", */ "no-default-browser-check"); // test mode; skip the default-browser check
        //options.addArguments("--test-type", "--ignore-certificate-errors"); // suppress Chrome's certificate error warnings
        // options.addArguments("user-data-dir=C:/Users/VULCAN/AppData/Local/Google/Chrome/User Data");
        // Starts Chrome with the default user's profile (bookmarks, extensions, proxy settings, and so on); close any Chrome window that already uses this profile before running
        //options.addArguments("user-data-dir=C:\\dev\\scoped_dir1808_11424");
        // Same as above, pointing at a different profile directory
        // options.addArguments("--test-type", "--incognito"); // start in incognito mode
        // options.addArguments("--test-type", "--disable-plugins"); // disable plugins
        // To add an extension, pass the path of its .crx file
        // options.addExtensions(new File("C:\\Users\\swang\\AppData\\Local\\Google\\Chrome\\UserData\\Default\\Extensions\\ijaobnmmgonppmablhldddpfmgpklbfh\\1.6.0_0.crx"));
        // On non-Windows systems, default to headless mode
        if(!windows){
            options.addArguments("--window-size=1920,5000"); // tall viewport so screenshots capture the whole page
            options.addArguments("--start-fullscreen");
            options.addArguments("--start-maximized");
            // headless settings
            options.addArguments("--headless"); // headless mode
            options.addArguments("--disable-gpu"); // disable GPU acceleration
            options.addArguments("--no-sandbox"); // disable the sandbox
        }
        options.addArguments("--user-agent=" + userAgent);
        capabilities.setCapability(ChromeOptions.CAPABILITY, options);
        //WebDriver driver = new ChromeDriver(options); // local deployment
        // To call a remote Grid instead, pass its URL, e.g. http://x.x.x.x:4444/wd/hub
        WebDriver driver = new RemoteWebDriver(service.getUrl(), capabilities);
        // Set timeouts
        driver.manage().timeouts().implicitlyWait(200, TimeUnit.SECONDS);
        driver.manage().timeouts().pageLoadTimeout(200, TimeUnit.SECONDS);
        driver.manage().timeouts().setScriptTimeout(200, TimeUnit.SECONDS);
        // Load cookies
        if (cookieStr != null && domain != null) {
            driver.get(domain);
            loadCookiesString(driver, cookieStr);
            driver.navigate().refresh();
        }
        return driver;
    }

    public static void loadCookiesString(WebDriver driver, String cookieContent) {
        // Each line holds cookies as repeated groups of six ';'-separated fields:
        // name;value;domain;path;expiry;secure
        String[] lineArr = cookieContent.split("\n");
        for (String line : lineArr) {
            StringTokenizer str = new StringTokenizer(line, ";");
            while (str.hasMoreTokens()) {
                String name = str.nextToken();
                String value = str.nextToken();
                String domain = str.nextToken();
                String path = str.nextToken();
                // The expiry field is consumed to keep the tokens aligned but is not parsed,
                // so every cookie is added without an expiry date.
                str.nextToken();
                Date expiry = null;
                boolean isSecure = Boolean.parseBoolean(str.nextToken());
                Cookie ck = new Cookie(name, value, domain, path, expiry, isSecure);
                driver.manage().addCookie(ck);
            }
        }
        driver.navigate().refresh();
    }


    public static String getCookieStr(WebDriver driver) {
//        driver.navigate().refresh();
        // Note: this writes "name=value;domain;path;expiry;secure" per cookie, whereas
        // loadCookiesString expects name and value as separate ';'-delimited fields.
        StringBuilder cookieStr = new StringBuilder();
        for (Cookie ck : driver.manage().getCookies()) {
            cookieStr.append(ck.getName()).append("=").append(ck.getValue()).append(";")
                    .append(ck.getDomain()).append(";")
                    .append(ck.getPath()).append(";")
                    .append(ck.getExpiry()).append(";")
                    .append(ck.isSecure()).append(";");
        }
        if (cookieStr.length() == 0) {
            return null;
        }
        // Drop the trailing ';'
        return cookieStr.substring(0, cookieStr.length() - 1);
    }


    public static String getTotalCookieStr(WebDriver driver) {
        driver.navigate().refresh();
        String cookieStr = driver.manage().getCookies().toString();
        cookieStr = cookieStr.substring(1, cookieStr.length() - 1);

        return cookieStr;
    }


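    private static Random random = new Random();

    /**
     * Scrolls the page to the bottom repeatedly.
     *
     * @param driver   the driver whose page is scrolled
     * @param pageNum  number of times to scroll to the bottom
     * @param waitTime base wait in milliseconds after each scroll; a random delay of under 5 seconds is added
     */
    public static void rollPage(WebDriver driver, int pageNum, int waitTime) {
        // Keep scrolling down so that further content is loaded
        String url = driver.getCurrentUrl();
        JavascriptExecutor js = (JavascriptExecutor) driver;
        for (int i = 0; i < pageNum; i++) {
            // Use JavaScript to scroll to the bottom of the page
            js.executeScript("window.scrollTo(0,document.body.scrollHeight)");
            logger.info(url + " page num::::::::" + (i + 1));
            sleep(waitTime + random.nextInt(5000));
        }
    }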

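    /**
     * Shuts the driver down so that no browser or driver processes are left behind.
     *
     * @param driver the driver to shut down
     */
    public static synchronized void close(WebDriver driver) {
        if (driver != null) {
            try {
                // quit() closes every window and ends the session; calling close() afterwards would throw.
                driver.quit();
//				service.stop();
            } catch (Exception e) {
                logger.warn("close webdriver error " + e.getMessage());
            }
        }
    }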


    public static File getTotalImg(WebDriver driver) throws IOException {
        driver.manage().window().fullscreen();
        JavascriptExecutor js = (JavascriptExecutor) driver;
//        Object width=js.executeScript("return document.body.scrollWidth");
//        Object height=js.executeScript("return document.body.scrollHeight");
        Object width = js.executeScript("return (document.documentElement.scrollWidth>document.documentElement.clientWidth) ? document.documentElement.scrollWidth : document.documentElement.clientWidth");
        Object height = js.executeScript("return (document.documentElement.scrollHeight >document.documentElement.clientHeight) ? document.documentElement.scrollHeight : document.documentElement.clientHeight");

        int w = Integer.parseInt(width.toString());
        int h = Integer.parseInt(height.toString());

        System.out.println("width::::" + width);
        System.out.println("height::::" + height);

        driver.manage().window().setSize(new Dimension(w, h));

        TakesScreenshot takesScreenshot = (TakesScreenshot) driver;
        return takesScreenshot.getScreenshotAs(OutputType.FILE);
    }

    /**
     * Takes a full-page screenshot.
     *
     * @param driver   the driver showing the page
     * @param destPath path the screenshot file is moved to
     * @throws IOException if the screenshot file cannot be moved
     */
    public synchronized static void screenshot(WebDriver driver, String destPath) throws IOException {
        // Remember the original window size
        Dimension winSize = driver.manage().window().getSize();
        File outFile = getTotalImg(driver);
        FileUtils.moveFile(outFile, new File(destPath));
        // Restore the original window size
        driver.manage().window().setSize(winSize);
    }

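    /**
     * Right-clicks the first element with the given tag name.
     * @param driver  the driver to act on
     * @param tagName tag name of the target element
     */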
    public static void contextClickByTagName(WebDriver driver,String tagName) {
        Actions action = new Actions(driver);
        action.contextClick(driver.findElement(By.tagName(tagName))).build().perform();

    }

    public static void sleep(int time){
        try {
            Thread.sleep(time);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException {
//		 for(int i=0;i<10;i++){
//			 WebDriver webDriver = initDriver();
//			 driver.get("http://www.baidu.com");
//			 ThreadUtil.sleep(1000);
//			 driver.quit();
//		 }

        // Load cookies

//		 QuestCookieDao questCookieDao = AppUtils.daoFactory(QuestCookieDao.class);
//		 QuestCookie questCookie = questCookieDao.getCookiesByRand2();
//		 us.codecraft.webmagic.proxy.Proxy proxy = ProxyUtils.getProxyByDomain();
//		 String host = proxy.getHost();
//		 String port = String.valueOf(proxy.getPort());
//		 String cookies = questCookie.getCookies();
//		 SeleniumUtils.initDriver(host,port,"http://www.zujuan.com/",cookies);


        WebDriver driver = initDriver();
        String url = "http://www.bubuko.com/infodetail-186319.html";
        driver.get(url);

        screenshot(driver, "D:/test.png");


//		 sleep(10000);

//		 String url2 = "http://www.qichacha.com/search?key=腾讯";
//		 driver.navigate().to(url2);

//		 String cookieStr = getCookieStr(driver);
//		 System.out.println(cookieStr);

//		 driver.navigate().refresh();
//		 System.out.println("--------------------------------");
//		 cookieStr = getCookieStr(driver);
//		 System.out.println(cookieStr);

    }


}
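The six-field cookie format consumed by `loadCookiesString` can be checked without a browser. The sketch below is a hypothetical, Selenium-free parser (`CookieLines` and `Fields` are illustrative names) that splits one line into groups of name;value;domain;path;expiry;secure and rejects lines whose field count is not a multiple of six. It assumes that no field contains a `;` and that the final field of a line is not empty.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of SeleniumUtils.
final class CookieLines {
    /** Plain holder for one parsed cookie; the expiry is kept as raw text. */
    static final class Fields {
        final String name;
        final String value;
        final String domain;
        final String path;
        final String expiry;
        final boolean secure;

        Fields(String name, String value, String domain, String path,
               String expiry, boolean secure) {
            this.name = name;
            this.value = value;
            this.domain = domain;
            this.path = path;
            this.expiry = expiry;
            this.secure = secure;
        }
    }

    /** Parses one line of ';'-separated cookies, six fields per cookie. */
    static List<Fields> parse(String line) {
        List<Fields> cookies = new ArrayList<>();
        if (line == null || line.trim().isEmpty()) {
            return cookies;
        }
        String[] t = line.split(";");
        if (t.length % 6 != 0) {
            throw new IllegalArgumentException(
                    "expected a multiple of 6 fields but found " + t.length);
        }
        for (int i = 0; i < t.length; i += 6) {
            cookies.add(new Fields(t[i], t[i + 1], t[i + 2], t[i + 3],
                    t[i + 4], Boolean.parseBoolean(t[i + 5])));
        }
        return cookies;
    }
}
```

Validating the field count up front gives a clear error message for malformed input instead of a `NoSuchElementException` part-way through a line.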

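Avoiding Selenium detection

Many Python articles on hiding Selenium fingerprints rely on stealth.min.js, which is itself JavaScript that originated in the Node.js ecosystem for masking automation traits. The Java and Python approaches to hiding Selenium traits therefore both trace back to Node.js.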

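Dependencies

    <properties>
        <selenium.version>4.2.2</selenium.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-java</artifactId>
            <version>${selenium.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-remote-driver</artifactId>
            <version>${selenium.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-server</artifactId>
            <version>2.53.1</version>
        </dependency>

        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>5.6.0</version>
        </dependency>
    </dependencies>

Code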

import cn.hutool.core.io.FileUtil;
import cn.hutool.core.map.MapBuilder;
import org.openqa.selenium.Dimension;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class BotTest {
    public static void main(String[] args) throws IOException {
        WebDriver driver = getWebDriver();
        driver.get("https://bot.sannysoft.com/");
    }
    /**
     * Creates the web driver.
     *
     * @return the browser driver
     */
    public static WebDriver getWebDriver() throws IOException {
        String webDriverDir = "C:\\chromedriver_win32_3_102.0.5\\chromedriver.exe";
        // Point Selenium at the driver binary
        System.setProperty("webdriver.chrome.driver", webDriverDir);
        // Configure the browser options
        ChromeOptions options = new ChromeOptions();
        Map<String, Object> prefs = new HashMap<>();
        prefs.put("credentials_enable_service", false);
        prefs.put("profile.password_manager_enabled", false);

        // On recent Chrome versions, excludeSwitches = ["enable-automation"] alone no longer hides
        // window.navigator.webdriver. It is kept here, together with the other options, to remove
        // the "Chrome is being controlled by automated test software" infobar.

        options.setExperimentalOption("excludeSwitches", Arrays.asList("enable-automation"));
        options.addArguments("--disable-blink-features");
        options.addArguments("--disable-blink-features=AutomationControlled");
        options.setExperimentalOption("useAutomationExtension", false);

        options.setExperimentalOption("prefs", prefs);
        // Create the driver
        WebDriver driver = new ChromeDriver(options);
        driver.manage().window().setSize(new Dimension(1280, 1024));

        // Read the script that masks Selenium's fingerprint traits
        String js = FileUtil.readString(new File("C:\\tmp\\team.min.js"), "utf-8");

        // MapBuilder comes from the hutool utility library
        Map<String, Object> commandMap = MapBuilder.create(new LinkedHashMap<String, Object>()).put("source", js)
                .build();
        // executeCdpCommand does not exist in Selenium 3; it requires Selenium 4
        ((ChromeDriver) driver).executeCdpCommand("Page.addScriptToEvaluateOnNewDocument", commandMap);
        return driver;
    }
}
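If hutool is not otherwise needed, the parameter map for `Page.addScriptToEvaluateOnNewDocument` can be built with the JDK alone. The sketch below is a hypothetical helper (`CdpScriptParams` is an illustrative name); the only key it sets is `source`, as in the code above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Selenium or hutool.
final class CdpScriptParams {
    /** Wraps JavaScript source text in the map shape the CDP command expects. */
    static Map<String, Object> forSource(String js) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("source", js);
        return params;
    }

    /** Reads the script file as UTF-8 and wraps its contents. */
    static Map<String, Object> forFile(Path scriptFile) throws IOException {
        byte[] bytes = Files.readAllBytes(scriptFile);
        return forSource(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

With this in place, the call becomes `((ChromeDriver) driver).executeCdpCommand("Page.addScriptToEvaluateOnNewDocument", CdpScriptParams.forFile(Paths.get("C:\\tmp\\team.min.js")))`, and the hutool imports are no longer required for this step.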

Test site

Antibot: https://bot.sannysoft.com/

stealth.min.js

https://github.com/requireCool/stealth.min.js

PhantomJS Usage Summary

PhantomJS is a scriptable headless browser built on WebKit that exposes a JavaScript API. It uses QtWebKit for its core browser functionality and relies on WebKit to parse and execute JavaScript.

See: PhantomJS usage summary (csdncjh's blog, CSDN): https://blog.csdn.net/csdncjh/article/details/125382427

Sources

Four ways to deploy Selenium - Zhihu

Selenium Grid Docker deployment + Python demo + configuration walkthrough

Hiding all Selenium fingerprints and traits in Java Selenium (fx9590's blog, CSDN)
