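Recap
- Part 1: CAPTCHA
- http://www.jianshu.com/p/811505323f51
- Part 2: The MmEwMD parameter
- http://www.jianshu.com/p/d8ce2d729683
- Requests to https://passport.zhaopin.com/chk/verify?callback=jsonpCallback pass verification even when the MmEwMD parameter is omitted, so the hard work in Part 2 turned out to be unnecessary >_<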
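Automatic login
After the site's redesign, the cookie handling was changed, so the login method from Part 1 now fails.
Accessing the login page with requests
import requests
# A plain HTTP session: it fetches the page but never runs its JavaScript
session = requests.session()
session.get("https://passport.zhaopin.com/org/login")
print(session.cookies)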
<RequestsCookieJar[<Cookie ...>, <Cookie ...>]>
Accessing the login page with webdriver
from selenium import webdriver
# A real browser session: the page's JavaScript runs and sets more cookies
driver = webdriver.Chrome()
driver.get("https://passport.zhaopin.com/org/login")
print(driver.get_cookies())
[
{
u'domain':u'passport.zhaopin.com',
u'secure':False,
u'value':u'14RuoIMWv_Ekd*****zr..5EYPpMWwe9J*****P.IebLEO7rIH7*****D4ypbCEbU9dM*****Z_PH.v1T6FOKlPA95.iyVY7pOqn0zSTtr16BvmyTPL..fd7z0q8KOfQGEjNtT*****MUibORXa9HX0yVYmsH36dqSn*****X1NGgEosf6UWy5Jc3biPt6Ay3oSW3V0Qeug.77JOVMRskbxLkWYN*****NlsbTw7e4YU8xQA',
u'expiry':1806250515,
u'path':u'/',
u'httpOnly':False,
u'name':u'FSSBBIl1UgzbN7N443T'
},
{
u'domain':u'.zhaopin.com',
u'secure':False,
u'value':u'95841*****0.1490890515',
u'expiry':1490892315,
u'path':u'/',
u'httpOnly':False,
u'name':u'dyweb'
},
{
u'domain':u'.zhaopin.com',
u'name':u'__utmc',
u'value':u'269****210',
u'path':u'/',
u'httpOnly':False,
u'secure':False
},
{
u'domain':u'.zhaopin.com',
u'secure':False,
u'value':u'1',
u'expiry':1490891115,
u'path':u'/',
u'httpOnly':False,
u'name':u'__utmt'
},
{
u'domain':u'.zhaopin.com',
u'name':u'dywec',
u'value':u'95*****23',
u'path':u'/',
u'httpOnly':False,
u'secure':False
},
{
u'domain':u'.zhaopin.com',
u'secure':False,
u'value':u'958****23.34768****001208000.14****515.14908****5.1490890515.1',
u'expiry':1553962515,
u'path':u'/',
u'httpOnly':False,
u'name':u'dywea'
},
{
u'domain':u'.zhaopin.com',
u'secure':False,
u'value':u'26992121*****15.1.1.ut*****ccn=(direct)|utmcmd=(none)',
u'expiry':1506658515,
u'path':u'/',
u'httpOnly':False,
u'name':u'__utmz'
},
{
u'domain':u'.zhaopin.com',
u'secure':False,
u'value':u'269921****545150.14908****.14908905****0890515.1',
u'expiry':1553962515,
u'path':u'/',
u'httpOnly':False,
u'name':u'__utma'
},
{
u'domain':u'.zhaopin.com',
u'secure':False,
u'value':u'95841923.149****.1.1.dywecsr=(d****eccn=(direct)|dywecm****dywectr=undefined',
u'expiry':1506658515,
u'path':u'/',
u'httpOnly':False,
u'name':u'dywez'
},
{
u'domain':u'.zhaopin.com',
u'secure':False,
u'value':u'269***.10.149****515',
u'expiry':1490892315,
u'path':u'/',
u'httpOnly':False,
u'name':u'__utmb'
},
{
u'domain':u'passport.zhaopin.com',
u'secure':False,
u'value':u'Te1E****xf2cCnriBjVesgm****87w_2eh3h7Xv****7NhNUQKba',
u'expiry':1806250444.416624,
u'path':u'/',
u'httpOnly':True,
u'name':u'FSSBBIl1UgzbN7N443S'
}
]
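- requests is only an HTTP library: when it requests a URL it just fetches the content and does not execute the JavaScript on the page.
- webdriver drives a real browser (Chrome here, which can also be run headless, i.e. without a UI). After loading a URL, the browser executes the page just as it would for a normal user, which is why many more cookies show up. Those extra cookies are the key to a successful login.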
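To see exactly which cookies only the browser receives, the two results can be compared by name. The following is a minimal sketch, not the author's code: the helper `extra_cookie_names` is a hypothetical name, and the sample data uses placeholder values plus an assumed list of cookie names for the requests side.

```python
def extra_cookie_names(requests_cookie_names, webdriver_cookies):
    """Return, sorted, the cookie names the browser has but requests lacks."""
    browser_names = {cookie['name'] for cookie in webdriver_cookies}
    return sorted(browser_names - set(requests_cookie_names))


# Hypothetical sample in the shape returned by driver.get_cookies();
# the values are placeholders, not real cookies.
webdriver_cookies = [
    {'name': 'FSSBBIl1UgzbN7N443S', 'value': 'placeholder-s'},
    {'name': 'FSSBBIl1UgzbN7N443T', 'value': 'placeholder-t'},
    {'name': '__utma', 'value': 'placeholder-utma'},
]
# Assumed for illustration only: the names a plain requests session received.
requests_cookie_names = ['FSSBBIl1UgzbN7N443S']

print(extra_cookie_names(requests_cookie_names, webdriver_cookies))
```

With live objects, the first argument could be `session.cookies.keys()` and the second `driver.get_cookies()`.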
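Combining WebDriver and Requests
import requests
from selenium import webdriver

session = requests.session()
driver = webdriver.Chrome()
# Let the browser load the page so that its JavaScript sets the extra cookies
driver.get("https://passport.zhaopin.com/org/login")
cookies = driver.get_cookies()
# Copy every browser cookie into the requests session
for cookie in cookies:
    session.cookies.set(cookie['name'], cookie['value'])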
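Copying only the name and value drops each cookie's domain and expiry. If that matters, one option is to select the cookies first. The sketch below is an assumption-laden illustration rather than the author's method: `cookies_for_host` is a hypothetical helper, its domain matching is deliberately simplified (it ignores path, secure, and other cookie rules), and the sample values are placeholders.

```python
import time


def cookies_for_host(webdriver_cookies, host, now=None):
    """Pick the unexpired cookies whose domain covers `host`.

    Simplified matching: path, secure flag, and other rules are ignored.
    """
    now = time.time() if now is None else now
    selected = {}
    for cookie in webdriver_cookies:
        domain = cookie.get('domain', '').lstrip('.')
        if not domain:
            continue  # no domain recorded, cannot decide
        if host != domain and not host.endswith('.' + domain):
            continue  # cookie belongs to a different site
        expiry = cookie.get('expiry')  # absent for session cookies
        if expiry is not None and expiry <= now:
            continue  # already expired
        selected[cookie['name']] = cookie['value']
    return selected


# Hypothetical sample in the shape returned by driver.get_cookies()
sample = [
    {'name': 'dyweb', 'value': 'placeholder', 'domain': '.zhaopin.com',
     'expiry': 1490892315},
    {'name': '__utmc', 'value': 'placeholder', 'domain': '.zhaopin.com'},
    {'name': 'other', 'value': 'placeholder', 'domain': '.example.com'},
]
# `now` is pinned to a time before the dyweb expiry so the demo is repeatable
print(cookies_for_host(sample, 'passport.zhaopin.com', now=1490890515))
```

The resulting dict can be handed to requests through the `cookies=` argument, for example `session.get(url, cookies=selected)`.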
Java WebDriver
<dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>3.3.1</version>
</dependency>
import org.openqa.selenium.Cookie;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import java.util.Iterator;
import java.util.Set;
public class TestWebDriver {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("https://passport.zhaopin.com/org/login");
        // Collect every cookie the browser holds after the page has loaded
        Set<Cookie> cookies = driver.manage().getCookies();
        Iterator<Cookie> itr = cookies.iterator();
        while (itr.hasNext()) {
            Cookie c = itr.next();
            System.out.println("-----Cookies Detail-----");
            System.out.println("Cookie Name: " + c.getName()
                    + "\n\tCookie Domain: " + c.getDomain()
                    + "\n\tCookie Value: " + c.getValue()
                    + "\n\tPath: " + c.getPath()
                    + "\n\tExpiry Date: " + c.getExpiry()
                    + "\n\tSecure: " + c.isSecure());
        }
    }
}
Installing WebDriver for Python
- http://selenium-python.readthedocs.io/installation.html
- https://github.com/mozilla/geckodriver/releases