CAPTCHAs are one of the most common anti-scraping measures. This time I worked on getting past a slider CAPTCHA, testing against this demo page:
https://promotion.aliyun.com/ntms/act/captchaIntroAndDemo.html
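Since the slider has to be dragged, we need Selenium to simulate a browser. Many sites block Selenium, though: when a browser is driven by Selenium, typing window.navigator.webdriver in the console returns true, whereas in a non-Selenium environment the value is undefined.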
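The same check can be run from Python instead of the console. This is a minimal sketch, assuming `driver` is any Selenium WebDriver instance; `looks_automated` is a hypothetical helper name and not part of Selenium. Selenium maps a JavaScript `undefined` result to `None`, so only an explicit `true` counts as flagged.

```python
def looks_automated(driver):
    """Return True when the page sees window.navigator.webdriver as true."""
    # execute_script runs the snippet in the current page and returns its value;
    # JavaScript undefined/null come back to Python as None.
    value = driver.execute_script("return window.navigator.webdriver")
    return value is True
```

Calling `looks_automated(browser)` right after `browser.get(...)` gives a quick way to see whether a given set of browser options changes what the site can observe.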
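This post drives the browser in developer mode (the excludeSwitches / enable-automation option in the code), so that the site does not identify it as Selenium-driven. Checking window.navigator.webdriver in the console again showed the non-Selenium result, so driving the browser in developer mode avoids this detection.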
When simulating the drag trajectory, I chose to move fast first and then slow down. Here is the code.

```python
import random
import time

from selenium import webdriver
from selenium.webdriver import ActionChains, ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait

# Use developer mode when creating the browser object
option = ChromeOptions()
option.add_experimental_option('excludeSwitches', ['enable-automation'])
browser = webdriver.Chrome(options=option)
wait = WebDriverWait(browser, 40)
items = []


def get_track(distance):
    """Build the list of per-step x offsets: large steps first, smaller ones after."""
    track = []
    current = 0
    mid = distance * (3 / 5)  # switch phases at 3/5 of the distance
    t = 3
    v = 0
    v1 = 0
    a1 = 4
    a = 8
    while current < distance:
        if current < mid:
            # first phase: acceleration a
            move = v * t + 1 / 2 * a * t * t
            current += move
            track.append(round(move))
            v = v + a * t
        else:
            # second phase: starts again from v1 = 0 with the smaller acceleration a1
            move1 = v1 * t + 1 / 2 * a1 * t * t
            current += move1
            track.append(round(move1))
            v1 = v1 + a1 * t
    return track


def huadong():
    slider = browser.find_element(By.XPATH, "//*[@id='nc_2_n1z']")  # find the slider button
    ActionChains(browser).click_and_hold(slider).perform()
    track = get_track(700)  # simulated trajectory, fast first and then slow
    for x in track:
        # move while the button is still held; drag_and_drop_by_offset would release after every step
        ActionChains(browser).move_by_offset(xoffset=x, yoffset=random.randint(1, 3)).perform()
    ActionChains(browser).release().perform()


def login():
    browser.get('https://promotion.aliyun.com/ntms/act/captchaIntroAndDemo.html')
    time.sleep(10)
    huadong1 = browser.find_element(By.XPATH, '//*[@id="tab2"]/a')
    huadong1.click()
    huadong()


def main():
    login()
    time.sleep(10)
    browser.close()


if __name__ == '__main__':
    main()
```

Of course, there are other approaches online that can also get past this kind of CAPTCHA. If I get them working later, I will share them too.
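One detail about `get_track` above: after the 3/5 point it restarts from zero velocity with a smaller acceleration, rather than applying a negative one. For comparison, here is a rough sketch of a track that keeps its velocity and really decelerates in the second phase. The name `build_track` and all of its parameters (`dt`, `accel`, `decel`, `switch_ratio`) are assumptions made for illustration, not values from the code above or from any library, and the sketch has not been tuned against the demo page.

```python
def build_track(distance, dt=0.2, accel=40.0, decel=60.0, switch_ratio=0.6):
    """Return positive integer x offsets that sum exactly to `distance`.

    Hypothetical parameters: dt is the time per step, accel/decel are the
    accelerations of the two phases, switch_ratio is where braking starts.
    """
    track = []
    travelled = 0.0   # exact (float) distance covered so far
    emitted = 0       # integer pixels already placed in the track
    v = 0.0
    switch_at = distance * switch_ratio
    while travelled < distance:
        a = accel if travelled < switch_at else -decel
        move = v * dt + 0.5 * a * dt * dt
        move = max(move, 1.0)            # never stall or move backwards
        v = max(v + a * dt, 0.0)         # velocity carries over between phases
        travelled = min(travelled + move, distance)  # do not overshoot
        step = round(travelled) - emitted
        if step:                         # skip steps that round to zero pixels
            track.append(step)
            emitted += step
    return track
```

Each offset could then be passed to `ActionChains(browser).move_by_offset(...)` in the same loop as in `huadong()`. Because the offsets sum exactly to the requested distance, the slider does not overshoot the end of the bar.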