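Python QR-Code Login, Saving and Validating Cookies: Weibo (Part 5)

Logging in to the Weibo web version by QR code with Python

  • 1. Find the URL that generates the QR code
  • 2. Find the URL that confirms the QR code
  • 3. Find the remaining URLs to obtain the login information
  • 4. Save the cookies and verify that they are still valid (logged in)
    • Complete code
  • 5. More articles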

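1. Find the URL that generates the QR code

  • This article covers scanning with the Weibo app; scanning with other authorized apps will be covered in a later update.
  • Use the browser's developer tools (F12) or a packet-capture tool such as Fiddler to find the first URL
import time
import requests

session = requests.session()
# headers is the request-header dict defined in the complete code below
loginurl = 'https://login.sina.com.cn/sso/qrcode/image?entry=weibo&size=180&callback=STK_{}'
# headers must be passed to session.get(), not to str.format() (where it is silently ignored)
texturl = session.get(loginurl.format(time.time() * 1000), headers=headers).text
print(texturl)
# Output: texturl = window.STK_16149…… && STK_1614992087527.347({"retcode":20000000,"msg":"succ","data":{"qrid":"2MjNgQt……","image":"\……"}});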

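  • image turns out to be the QR code picture, and qrid is the parameter needed for the next step
  • Strip the callback wrapper and decode the JSON to get image and qrid, then carry on with the next request
session = requests.session()
loginurl = 'https://login.sina.com.cn/sso/qrcode/image?entry=weibo&size=180&callback=STK_{}'
texturl = session.get(loginurl.format(time.time() * 1000), headers=headers).text
# Locate the JSONP prefix "window.STK_... && STK_...(" and keep only the JSON payload
xx = re.search(r"window.STK_\d+.\d+ && STK_\d+.\d+\(?", texturl)
# Slice off the matched prefix (str.lstrip would strip a character set, not a prefix), then drop the trailing ");"
x = json.loads(texturl[xx.end():].strip().rstrip(");"))
qrid = x['data']['qrid']
image = x['data']['image']
  • Request the image URL directly and open the QR code picture so it can be scanned
# image is a protocol-relative URL, so prepend the scheme
imageurl = session.get('https:' + image, headers=headers).content
# showpng is the Thread subclass defined in the complete code below
t = showpng(imageurl)
t.start()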
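
Unwrapping the JSONP response is easy to get subtly wrong, because `str.lstrip` and `str.rstrip` remove any characters from a set rather than a literal prefix or suffix. The helper below is a minimal sketch of a stricter alternative; `parse_jsonp` and `JSONP_RE` are names introduced here for illustration, and the pattern assumes the two response shapes shown in this article (with and without the `window.… &&` guard).

```python
import json
import re

# Optional "window.NAME && " guard, then the callback name, then the JSON
# payload in parentheses, e.g. "window.STK_1 && STK_1({...});" or "STK_1({...});".
JSONP_RE = re.compile(
    r"^\s*(?:window\.[\w.]+\s*&&\s*)?[\w.]+\((.*)\)\s*;?\s*$", re.S
)


def parse_jsonp(text):
    """Return the JSON payload wrapped inside a JSONP callback as a Python object."""
    match = JSONP_RE.match(text)
    if match is None:
        raise ValueError("response is not in the expected JSONP form")
    return json.loads(match.group(1))


# Why not lstrip? It strips a character set, so it can eat into the payload:
# "STK_1(1234)".lstrip("STK_1(") gives "234)", not "1234)".
```

With this helper, `x = parse_jsonp(texturl)` would replace the search-and-strip pair, and the same call would work for the later responses as well.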

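2. Find the URL that confirms the QR code

  • We now have the QR code image URL; next comes the URL that checks whether the scan has been confirmed
qridurl = 'https://login.sina.com.cn/sso/qrcode/check?entry=weibo&qrid={}&callback=STK_{}'
  • It takes two variable parameters:

qrid and callback

  • qrid was obtained above; callback is just a timestamp
while True:
    # headers must be passed to session.get(), not to str.format()
    dateurl = session.get(qridurl.format(qrid, time.time() * 1000), headers=headers).text
    # Output: dateurl = window.STK_1614995146615.8608 && STK_1614995146615.8608({"retcode":50114001,"msg":"\u672a\u4f7f\u7528","data":null});
    xx = re.search(r"window.STK_\d+.\d+ && STK_\d+.\d+\(?", dateurl)
    # Slice off the JSONP prefix and the trailing ");" to leave plain JSON
    x = json.loads(dateurl[xx.end():].strip().rstrip(");"))
    retcode = x['retcode']
    if '50114001' in str(retcode):
        print('QR code is still valid, please scan it!')
    elif '50114002' in str(retcode):
        print('Scanned, please confirm on your phone!')
    elif '50114004' in str(retcode):
        print('QR code has expired, please run the script again!')
        break  # stop polling so the script can be re-run
    elif '20000000' in str(retcode):
        print('Confirmed, login successful!')
        break
    else:
        print('Other status', retcode)
    time.sleep(5)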
  • The retcode values break down as follows:

50114001: QR code not yet scanned
50114002: QR code scanned but not yet confirmed
20000000: QR code confirmed
50114004: QR code expired
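
The status codes above can be kept in one lookup table, and the polling loop can be given a time limit so that it does not run forever. The following is only a sketch: `wait_for_confirmation`, the `check` callable (assumed to return the already-parsed check response as a dict), and the 180-second default timeout are all assumptions made for illustration, not values taken from Weibo.

```python
import time

# Status codes as listed above.
QR_STATUS = {
    50114001: "waiting for scan",
    50114002: "scanned, waiting for confirmation",
    50114004: "expired",
    20000000: "confirmed",
}


def wait_for_confirmation(check, interval=5.0, timeout=180.0, sleep=time.sleep):
    """Poll check() until the QR code is confirmed, then return the alt token.

    check is a hypothetical callable that returns the parsed check response.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        payload = check()
        retcode = int(payload["retcode"])
        if retcode == 20000000:
            return payload["data"]["alt"]
        if retcode == 50114004:
            raise RuntimeError("QR code expired; request a new one")
        print(QR_STATUS.get(retcode, "unknown retcode {}".format(retcode)))
        sleep(interval)
    raise TimeoutError("QR code was not confirmed in time")
```

Passing `sleep` as a parameter keeps the function easy to exercise without waiting; in real use the default `time.sleep` applies.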

  • The login has succeeded, but we still do not have the real cookies we need

3. Find the remaining URLs to obtain the login information

  • Once the QR code is confirmed, the response carries a new parameter, alt:

window.STK_161498904169123 && STK_161498904169123({"retcode":20000000,"msg":"succ","data":{"alt":"ALT-NjE0Njc……"}});

  • The packet capture shows that the next URL is:
alturl = 'https://login.sina.com.cn/sso/login.php?entry=weibo&returntype=TEXT&crossdomain=1&cdult=3&domain=weibo.com&alt=ALT-NjE0Njc4MTM……&savestate=30&callback=STK_161498904169125'
name          value
entry         weibo
returntype    TEXT
crossdomain   1
cdult         3
domain        weibo.com
alt           ALT-NjE0Njc4M……
savestate     30
callback      STK_161498904169125
  • Substitute alt into that URL and request it to get new data
# x is the parsed response of the confirmed check request
alt = x['data']['alt']
alturl = 'https://login.sina.com.cn/sso/login.php?entry=weibo&returntype=TEXT&crossdomain=1&cdult=3&domain=weibo.com&alt={}&savestate=30&callback=STK_{}'.format(alt,int(time.time() * 100000))
crossDomainUrlList = session.get(alturl,headers=headers).text
print(crossDomainUrlList)
# Output: crossDomainUrlList = STK_161500519933534({"retcode":"0","uid":"61……","nick":"\u7528……","crossDomainUrlList":["https:\/\/……"]});
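
The parameter table above maps directly onto a query string, so the URL can also be assembled from a dict rather than one long format string. This is a sketch under the assumption that the parameter values stay as captured; `build_alt_url` and `LOGIN_ENDPOINT` are names introduced here, and the optional `now` argument exists only to make the result reproducible.

```python
import time
from urllib.parse import urlencode

LOGIN_ENDPOINT = "https://login.sina.com.cn/sso/login.php"


def build_alt_url(alt, now=None):
    """Build the login URL from the captured parameters and the alt token."""
    now = time.time() if now is None else now
    params = {
        "entry": "weibo",
        "returntype": "TEXT",
        "crossdomain": 1,
        "cdult": 3,
        "domain": "weibo.com",
        "alt": alt,
        "savestate": 30,
        # Same callback scheme as the article: "STK_" plus a scaled timestamp.
        "callback": "STK_{}".format(int(now * 100000)),
    }
    return LOGIN_ENDPOINT + "?" + urlencode(params)
```

`urlencode` also percent-encodes any characters in alt that are not URL-safe, which a plain `str.format` does not do.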

  • The response contains four useful URLs:

1.https://passport.9797……
2.https://passport.krco……
3.https://passport.weib……
4.https://passport.weibo.com/wbsso……

  • Requesting these four URLs one after another yields the final cookies we want
alturl = 'https://login.sina.com.cn/sso/login.php?entry=weibo&returntype=TEXT&crossdomain=1&cdult=3&domain=weibo.com&alt={}&savestate=30&callback=STK_{}'.format(alt,int(time.time() * 100000))
crossDomainUrl = session.get(alturl,headers=headers).text
# This response has no "window.… &&" guard, only "STK_<digits>("
pp = re.search(r"STK_\d+\(?", crossDomainUrl)
# Slice off the JSONP prefix and the trailing ");" to leave plain JSON
p = json.loads(crossDomainUrl[pp.end():].strip().rstrip(");"))
crossDomainUrlList = p['crossDomainUrlList']
session.get(crossDomainUrlList[0], headers=headers)
# The second URL only works with "&action=login" appended
session.get(crossDomainUrlList[1]+'&action=login', headers=headers)
session.get(crossDomainUrlList[2], headers=headers)
session.get(crossDomainUrlList[3], headers=headers)
print(session.cookies)
  • Requesting the second URL kept failing. After comparing it with the captured traffic, I found that, surprisingly, "&action=login" has to be appended to it. With that suffix the request succeeds and the session ends up with the complete set of cookies
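
The four requests can also be written as a loop, which keeps working if the list length changes. This is a sketch: `visit_cross_domain_urls` is a name introduced here, and it assumes, as the code above does, that the URL needing the extra suffix is always the second one in the list.

```python
def visit_cross_domain_urls(session, urls, headers=None):
    """Request each cross-domain URL in turn so the session collects its cookies.

    The second URL gets "&action=login" appended, mirroring the workaround above.
    Returns the list of URLs actually requested.
    """
    visited = []
    for index, url in enumerate(urls):
        if index == 1:
            url += "&action=login"
        session.get(url, headers=headers)
        visited.append(url)
    return visited
```

Called as `visit_cross_domain_urls(session, crossDomainUrlList, headers)`, it would replace the four explicit `session.get` lines.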

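4. Save the cookies and verify that they are still valid (logged in)

  • Saving and validating the cookies is not covered in detail here; the complete code should be self-explanatory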
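
One detail of saving cookies is worth seeing in isolation: `http.cookiejar.LWPCookieJar.save()` skips session cookies (those marked discard) unless `ignore_discard=True` is passed. The sketch below demonstrates this with two made-up cookies; `make_cookie`, `saved_names`, and the cookie names and values are hypothetical and are not real Weibo cookies.

```python
import http.cookiejar as cookielib
import os
import tempfile


def make_cookie(name, value, expires=None):
    """Build a cookie for .weibo.com; expires=None makes it a session cookie."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=".weibo.com", domain_specified=True, domain_initial_dot=True,
        path="/", path_specified=True,
        secure=False, expires=expires, discard=expires is None,
        comment=None, comment_url=None, rest={},
    )


def saved_names(ignore_discard):
    """Save two cookies, reload them from disk, and return the names that survived."""
    path = os.path.join(tempfile.mkdtemp(), "wbcookies.txt")
    jar = cookielib.LWPCookieJar(filename=path)
    jar.set_cookie(make_cookie("session_only", "hypothetical-value"))
    # 4102444800 is 2100-01-01 UTC, so this cookie is persistent.
    jar.set_cookie(make_cookie("persistent", "hypothetical-value", expires=4102444800))
    jar.save(ignore_discard=ignore_discard)
    reloaded = cookielib.LWPCookieJar(filename=path)
    reloaded.load(ignore_discard=True)
    return sorted(cookie.name for cookie in reloaded)
```

Here `saved_names(True)` keeps both cookies while `saved_names(False)` keeps only the persistent one, so whether `ignore_discard=True` is needed depends on whether the login cookies are session cookies.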

Complete code

# -*- coding: utf-8 -*-
import json
import re
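import agent  # not a standard-library module; presumably the author's own helper that supplies a User-Agent string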
from threading import Thread
import time
import requests
from io import BytesIO
import http.cookiejar as cookielib
from PIL import Image
import os

requests.packages.urllib3.disable_warnings()

headers = {
     'User-Agent': agent.get_user_agents(), 'Referer': "https://weibo.com/"}

class showpng(Thread):
    def __init__(self, data):
        Thread.__init__(self)
        self.data = data

    def run(self):
        img = Image.open(BytesIO(self.data))
        img.show()


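def islogin(session):
    """Load saved cookies (if any) and check whether they are still logged in."""
    try:
        session.cookies.load(ignore_discard=True)
    except Exception:
        # On the first run the cookie file is empty and cannot be loaded
        pass
    # This account endpoint returns code '100000' only for a logged-in session
    loginurl = session.get("https://account.weibo.com/set/aj/iframe/schoollist?province=11&city=&type=1&_t=0&__rnd={}".format(int(time.time() * 1000)), headers=headers).json()['code']
    if loginurl == '100000':
        print('Cookies are valid, no need to scan the QR code!')
        return session, True
    else:
        print('Cookies have expired, please scan the QR code to log in again!')
        return session, False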


def wblogin():
    """Return a session logged in to Weibo, scanning a QR code only if needed."""
    if not os.path.exists('wbcookies.txt'):
        with open("wbcookies.txt", 'w') as f:
            f.write("")
    session = requests.session()
    session.cookies = cookielib.LWPCookieJar(filename='wbcookies.txt')
    session, status = islogin(session)
    if not status:
        # Step 1: request the QR code and show it
        loginurl = 'https://login.sina.com.cn/sso/qrcode/image?entry=weibo&size=180&callback=STK_{}'
        # headers must be passed to session.get(), not to str.format()
        texturl = session.get(loginurl.format(int(time.time() * 1000)), headers=headers).text
        xx = re.search(r"window.STK_\d+.\d+ && STK_\d+.\d+\(?", texturl)
        # Slice off the JSONP prefix and the trailing ");" to leave plain JSON
        x = json.loads(texturl[xx.end():].strip().rstrip(");"))
        qrid = x['data']['qrid']
        image = x['data']['image']
        imageurl = session.get('https:' + image, headers=headers).content
        t = showpng(imageurl)
        t.start()
        # Step 2: poll the check URL until the scan is confirmed
        qridurl = 'https://login.sina.com.cn/sso/qrcode/check?entry=weibo&qrid={}&callback=STK_{}'
        while True:
            dateurl = session.get(qridurl.format(qrid, int(time.time() * 1000)), headers=headers).text
            xx = re.search(r"window.STK_\d+.\d+ && STK_\d+.\d+\(?", dateurl)
            x = json.loads(dateurl[xx.end():].strip().rstrip(");"))
            retcode = x['retcode']
            if '50114001' in str(retcode):
                print('QR code is still valid, please scan it!')
            elif '50114002' in str(retcode):
                print('Scanned, please confirm on your phone!')
            elif '50114004' in str(retcode):
                print('QR code has expired, please run the script again!')
                break  # stop polling so the script can be re-run
            elif '20000000' in str(retcode):
                # Step 3: exchange alt for the cross-domain URLs and visit each one
                alt = x['data']['alt']
                alturl = 'https://login.sina.com.cn/sso/login.php?entry=weibo&returntype=TEXT&crossdomain=1&cdult=3&domain=weibo.com&alt={}&savestate=30&callback=STK_{}'.format(
                    alt, int(time.time() * 100000))
                crossDomainUrl = session.get(alturl, headers=headers).text
                pp = re.search(r"STK_\d+\(?", crossDomainUrl)
                p = json.loads(crossDomainUrl[pp.end():].strip().rstrip(");"))
                crossDomainUrlList = p['crossDomainUrlList']
                session.get(crossDomainUrlList[0], headers=headers)
                # The second URL only works with "&action=login" appended
                session.get(crossDomainUrlList[1] + '&action=login', headers=headers)
                session.get(crossDomainUrlList[2], headers=headers)
                session.get(crossDomainUrlList[3], headers=headers)
                print('Confirmed, login successful!')
                break
            else:
                print('Other status', retcode)
            time.sleep(5)
        # Step 4: keep session cookies too, matching load(ignore_discard=True) in islogin
        session.cookies.save(ignore_discard=True)
    return session

if __name__ == '__main__':
    wblogin()
    

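5. More articles

  1. Douyin (Part 1)
  2. Kuaishou (Part 2)
  3. Weishi (Part 3)
  4. WeChat Official Accounts (Part 4)
  5. Bilibili (Part 6)
  6. WeChat Channels (Part 7)
  • A follow-up column will cover collecting data from each platform in bulk after login (likes, play counts, comments, images, videos, music, and so on). Remember to follow!
  • If this article helped you, would you mind giving it a like? (●’◡’●)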
