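This article covers:
1. Simulating a login with selenium
2. Getting the cookies after the simulated login
3. Storing the cookies in a Python requests session for further crawling.
Step-by-step code:
1. Simulating a login with selenium: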
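import time
from selenium import webdriver

# This snippet runs inside a class method; self.login_url holds the login page URL.
self.driver = webdriver.PhantomJS(executable_path="phantomjs.exe")
self.driver.get(self.login_url)
ck1 = self.driver.get_cookies()  # cookies before login, kept for comparison
# Fill in the user name
elem_user = self.driver.find_element_by_xpath('//input[@id="loginname"]')
elem_user.send_keys('[email protected]')
time.sleep(1)
# Fill in the password
elem_pwd = self.driver.find_element_by_xpath('//input[@id="nloginpwd"]')
elem_pwd.send_keys('32mcymcymcy')
time.sleep(1)
# Click the login button
self.driver.find_element_by_xpath('//div[@class="login-btn"]/a[@id="loginsubmit"]').click()
time.sleep(3)
# A redirect away from the login page is taken as a successful login
url = self.driver.current_url
if url != self.login_url:
    print("Login succeeded.")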
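The fixed time.sleep(3) before the URL check may be too short on a slow connection. A minimal sketch of polling instead, assuming only that the driver object exposes current_url as in the code above. The helper name wait_for_url_change and its timeout and poll parameters are hypothetical, not part of selenium:

```python
import time


def wait_for_url_change(driver, login_url, timeout=10.0, poll=0.5):
    """Poll driver.current_url until it differs from login_url.

    Hypothetical helper (not a selenium API). Returns True if the URL
    changed within `timeout` seconds, False otherwise.
    """
    deadline = time.time() + timeout
    while True:
        if driver.current_url != login_url:
            return True
        if time.time() >= deadline:
            return False
        time.sleep(poll)
```

Inside the class, wait_for_url_change(self.driver, self.login_url) could stand in for the sleep-and-compare lines. Like the original check, it only detects a redirect; it does not prove that the credentials were accepted.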
2. Getting the cookies after the simulated login, and 3. storing them in a Python requests session for further crawling:
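import requests

# Collect the cookies of the logged-in browser as a name -> value dict.
# Building the dict directly avoids splitting on ':', which breaks values that contain ':'.
cook_map = {}
for item in self.driver.get_cookies():
    cook_map[item["name"]] = item["value"]
# The same cookies as one "name=value; ..." string, the form used in a raw Cookie header
cookiestr = '; '.join(name + "=" + value for name, value in cook_map.items())
print(cook_map)
# Convert the dict into a CookieJar and attach it to the requests session
# (self.session is a requests.Session() created elsewhere in the class)
cookies = requests.utils.cookiejar_from_dict(cook_map, cookiejar=None, overwrite=True)
self.session.cookies = cookies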
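cookiejar_from_dict keeps only cookie names and values, so the domain and path recorded by the browser are dropped. A minimal sketch of an alternative that carries them over, assuming the list-of-dicts shape returned by get_cookies(). The function name selenium_cookies_to_session and the sample cookie values are invented for illustration:

```python
import requests


def selenium_cookies_to_session(selenium_cookies, session=None):
    """Copy cookies from driver.get_cookies() into a requests.Session.

    Hypothetical helper. Each item is expected to be a dict with at least
    "name" and "value"; "domain" and "path" are used when present.
    """
    if session is None:
        session = requests.Session()
    for item in selenium_cookies:
        session.cookies.set(
            item["name"],
            item["value"],
            domain=item.get("domain", ""),
            path=item.get("path", "/"),
        )
    return session


# Sample data in the shape get_cookies() returns (values are invented).
sample = [
    {"name": "sid", "value": "abc:123", "domain": ".example.com", "path": "/"},
    {"name": "lang", "value": "en"},
]
session = selenium_cookies_to_session(sample)
print(session.cookies.get_dict())
```

Passing session=self.session would fill the existing session instead of creating a new one; later calls such as self.session.get(...) then send the cookies to matching hosts.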
References: http://blog.csdn.net/falseen/article/details/46962011
http://blog.csdn.net/warrior_zhang/article/details/50198699
Note:
Please read the referenced articles first; this article assumes that basic background.