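Take the official Discuz site as an example. Clicking the login button at the top right of the page opens a login window with a CAPTCHA. Once the CAPTCHA has been entered, the site checks whether it is correct, and then the login is submitted. First, use packet capture to see what data the browser and the server exchange during this process.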
Packet capture analysis
The whole process produces 5 requests:
1
The first is a GET request, which returns a block of HTML:
<div id="main_messaqge_LZH8S">
<div id="layer_login_LZH8S">
<h3 class="flb">
<em id="returnmessage_LZH8S"> 用户登录</em>
<span><a href="javascript:;" class="flbc" onclick="hideWindow('login', 0, 1);" title="关闭">关闭</a></span>
</h3>
<form method="post" autocomplete="off" name="login" id="loginform_LZH8S" class="cl" onsubmit="pwdclear = 1;ajaxpost('loginform_LZH8S', 'returnmessage_LZH8S', 'returnmessage_LZH8S', 'onerror');return false;" action="member.php?mod=logging&action=login&loginsubmit=yes&handlekey=login&loginhash=LZH8S">
<div class="c cl">
<input type="hidden" name="formhash" value="41969484" />
<input type="hidden" name="referer" value="http://www.discuz.net/forum.php" />
<div class="rfm">
<table>
<tr>
<th>
<span class="login_slct">
<select name="loginfield" style="float: left;" width="45" id="loginfield_LZH8S">
<option value="username">用户名</option>
<option value="email">Email</option>
</select>
</span>
</th>
<td><input type="text" name="username" id="username_LZH8S" autocomplete="off" size="30" class="px p_fre" tabindex="1" value="" /></td>
<td class="tipcol"><a href="member.php?mod=register">立即注册</a></td>
</tr>
</table>
</div>
<div class="rfm">
<table>
<tr>
<th><label for="password3_LZH8S">密码:</label></th>
<td><input type="password" id="password3_LZH8S" name="password" onfocus="clearpwd()" size="30" class="px p_fre" tabindex="1" /></td>
<td class="tipcol"><a href="javascript:;" onclick="display('layer_login_LZH8S');display('layer_lostpw_LZH8S');" title="找回密码">找回密码</a></td>
</tr>
</table>
</div>
<div class="rfm">
<table>
<tr>
<th>安全提问:</th>
<td><select id="loginquestionid_LZH8S" width="213" name="questionid" onchange="if($('loginquestionid_LZH8S').value > 0) {$('loginanswer_row_LZH8S').style.display='';} else {$('loginanswer_row_LZH8S').style.display='none';}">
<option value="0">安全提问(未设置请忽略)</option>
<option value="1">母亲的名字</option>
<option value="2">爷爷的名字</option>
<option value="3">父亲出生的城市</option>
<option value="4">您其中一位老师的名字</option>
<option value="5">您个人计算机的型号</option>
<option value="6">您最喜欢的餐馆名称</option>
<option value="7">驾驶执照最后四位数字</option>
</select></td>
</tr>
</table>
</div>
<div class="rfm" id="loginanswer_row_LZH8S" style="display:none">
<table>
<tr>
<th>答案:</th>
<td><input type="text" name="answer" id="loginanswer_LZH8S" autocomplete="off" size="30" class="px p_fre" tabindex="1" /></td>
</tr>
</table>
</div>
<span id="seccode_cSA"></span>
<script type="text/javascript" reload="1">updateseccode('cSA', '', 'member::logging');</script>
<div class="rfm bw0">
<table>
<tr>
<th></th>
<td><label for="cookietime_LZH8S"><input type="checkbox" class="pc" name="cookietime" id="cookietime_LZH8S" tabindex="1" value="2592000" />自动登录</label></td>
</tr>
</table>
</div>
<div class="rfm mbw bw0">
<table width="100%">
<tr>
<th> </th>
<td>
<button class="pn pnc" type="submit" name="loginsubmit" value="true" tabindex="1"><strong>登录</strong></button>
</td>
<td> </td>
</tr>
</table>
</div>
<div class="rfm bw0 ">
<hr class="l" />
<table>
<tr>
<th>快捷登录:</th>
<td>
<a href="http://www.discuz.net/connect.php?mod=login&op=init&referer=http%3A%2F%2Fwww.discuz.net%2Fforum.php&statfrom=login" target="_top" rel="nofollow"><img src="static/image/common/qq_login.gif" class="vm" /></a>
<a href="plugin.php?id=wechat:login"><img src="source/plugin/wechat/image/wechat_login.png" class="vm" /></a>
</td>
</tr>
</table>
</div>
</div>
</form>
</div>
<div id="layer_lostpw_LZH8S" style="display: none;">
<h3 class="flb">
<em id="returnmessage3_LZH8S">找回密码</em>
<span><a href="javascript:;" class="flbc" onclick="hideWindow('login')" title="关闭">关闭</a></span>
</h3>
<form method="post" autocomplete="off" id="lostpwform_LZH8S" class="cl" onsubmit="ajaxpost('lostpwform_LZH8S', 'returnmessage3_LZH8S', 'returnmessage3_LZH8S', 'onerror');return false;" action="member.php?mod=lostpasswd&lostpwsubmit=yes&infloat=yes">
<div class="c cl">
<input type="hidden" name="formhash" value="41969484" />
<input type="hidden" name="handlekey" value="lostpwform" />
<div class="rfm">
<table>
<tr>
<th><span class="rq">*</span><label for="lostpw_email">Email:</label></th>
<td><input type="text" name="email" id="lostpw_email" size="30" value="" tabindex="1" class="px p_fre" /></td>
</tr>
</table>
</div>
<div class="rfm">
<table>
<tr>
<th><label for="lostpw_username">用户名:</label></th>
<td><input type="text" name="username" id="lostpw_username" size="30" value="" tabindex="1" class="px p_fre" /></td>
</tr>
</table>
</div>
<div class="rfm mbw bw0">
<table>
<tr>
<th></th>
<td><button class="pn pnc" type="submit" name="lostpwsubmit" value="true" tabindex="100"><span>提交</span></button></td>
</tr>
</table>
</div>
</div>
</form>
</div>
</div>
<div id="layer_message_LZH8S" style="display: none;">
<h3 class="flb" id="layer_header_LZH8S">
<em>用户登录</em>
<span><a href="javascript:;" class="flbc" onclick="hideWindow('login')" title="关闭">关闭</a></span>
</h3>
<div class="c"><div class="alert_right">
<div id="messageleft_LZH8S"></div>
<p class="alert_btnleft" id="messageright_LZH8S"></p>
</div>
</div>
<script type="text/javascript" reload="1">
var pwdclear = 0;
function initinput_login() {
    document.body.focus();
    if($('loginform_LZH8S')) {
        $('loginform_LZH8S').username.focus();
    }
    simulateSelect('loginfield_LZH8S');
}
initinput_login();
function clearpwd() {
    if(pwdclear) {
        $('password3_LZH8S').value = '';
    }
    pwdclear = 0;
}
</script>
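This is evidently the markup of the login window.
Note in particular that it contains
loginhash=LZH8S
<input type="hidden" name="formhash" value="41969484" />
The string 'LZH8S' appears throughout the markup as an id suffix, and formhash appears twice.
After a refresh, both loginhash and formhash change.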
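Because both values change on every refresh, they have to be read from each fresh response. The following is a minimal sketch that pulls them out with regular expressions instead of fixed character offsets. `extract_hashes` is a hypothetical helper name, and the patterns assume the markup looks like the capture above.

```python
import re

def extract_hashes(html):
    """Return (loginhash, formhash) from the login-window HTML; None for a value that is missing."""
    # loginhash appears as a query parameter in the form's action URL
    login_match = re.search(r'loginhash=(\w+)', html)
    # formhash appears as the value of a hidden input
    form_match = re.search(r'name="formhash"\s+value="(\w+)"', html)
    loginhash = login_match.group(1) if login_match else None
    formhash = form_match.group(1) if form_match else None
    return loginhash, formhash

# Fragment taken from the captured response
sample = ('action="member.php?mod=logging&action=login&loginsubmit=yes'
          '&handlekey=login&loginhash=LZH8S"> '
          '<input type="hidden" name="formhash" value="41969484" />')
print(extract_hashes(sample))
```

Matching on the surrounding text keeps working if the hashes ever differ in length, which slicing a fixed number of characters would not.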
2
The second GET request returns a piece of JavaScript:
(The HTML tags inside the string literals were lost in this capture; only the text between them remains.)
if($('seccode_cSA')) {
    if(!$('vseccode_cSA')) {
        var sectpl = seccheck_tpl['cSA'] != '' ? seccheck_tpl['cSA'].replace(//g, 'codecSA') : '';
        var sectplcode = sectpl != '' ? sectpl.split(' ') : Array(' ',': ',' ','');
        var string = '' + sectplcode[0] + '验证码' + sectplcode[1] + '' +
            ' 换一个' +
            '' +
            sectplcode[2] + '请输入下面动画图片中的字符 ' + sectplcode[3];
        evalscript(string);
        $('seccode_cSA').innerHTML = string;
    } else {
        var string = '请输入下面动画图片中的字符 ';
        evalscript(string);
        $('vseccode_cSA').innerHTML = string;
    }

}
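The update value in the image URL generated by this script changes after a page refresh and after clicking "换一个" (get another CAPTCHA):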
src="misc.php?mod=seccode&update=18644&idhash=cSA"
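The update value can be read out of the script text in the same spirit. This is a small sketch, assuming the value is always a run of digits after `update=`; `extract_update` is a hypothetical helper name.

```python
import re

def extract_update(script_text):
    """Return the digits following 'update=' in the script text, or None if absent."""
    match = re.search(r'update=(\d+)', script_text)
    return match.group(1) if match else None

print(extract_update('src="misc.php?mod=seccode&update=18644&idhash=cSA"'))
```

Requiring digits after `update=` also avoids a false match on a parameter such as `action=update`, where the word is followed by `&` rather than `=`.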
3
The headers of the third GET request include
GET http://www.discuz.net/misc.php?mod=seccode&update=18644&idhash=cSA HTTP/1.1
Accept: image/webp,image/*,*/*;q=0.8
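The Accept header shows that the browser expects an image in response, and the request URL carries the update value from the previous response.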
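The Accept header only states what the client is willing to receive, so it can be worth confirming what actually came back. A minimal sketch that inspects the leading bytes of the response body; `sniff_image_type` is a hypothetical helper, and it covers only three common formats.

```python
def sniff_image_type(data):
    """Guess the image format from the first bytes of the payload; None if unrecognized."""
    if data[:6] in (b'GIF87a', b'GIF89a'):
        return 'gif'
    if data[:8] == b'\x89PNG\r\n\x1a\n':
        return 'png'
    if data[:3] == b'\xff\xd8\xff':
        return 'jpeg'
    return None

print(sniff_image_type(b'GIF89a\x64\x00\x1e\x00'))
```

If the server replies with an error message instead of a picture, the function returns None and the text of the response can be printed for inspection.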
4
The fourth GET request has the URL
GET http://www.discuz.net/misc.php?mod=seccode&action=check&inajax=1&modid=member::logging&idhash=cSA&secverify=e38p
The trailing secverify=e38p is the CAPTCHA text that was typed in. The returned text is
<?xml version="1.0" encoding="gbk"?> <root><![CDATA[succeed]]></root>
Judging from action=check in the URL and from the returned text, this request checks whether the entered CAPTCHA is correct.
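A script can make the same decision by reading the message out of the reply. The sketch below assumes the reply wraps its message in a CDATA section inside `<root>` and that a correct CAPTCHA yields exactly `succeed`; both helper names are hypothetical.

```python
import re

def cdata_text(response_text):
    """Return the content of the first CDATA section, or None if there is none."""
    match = re.search(r'<!\[CDATA\[(.*?)\]\]>', response_text, re.S)
    return match.group(1) if match else None

def captcha_accepted(response_text):
    """True when the check request reports success."""
    return cdata_text(response_text) == 'succeed'

reply = '<?xml version="1.0" encoding="gbk"?> <root><![CDATA[succeed]]></root>'
print(captcha_accepted(reply))
```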
5
The fifth request is a POST. Its headers and body are
POST http://www.discuz.net/member.php?mod=logging&action=login&loginsubmit=yes&handlekey=login&loginhash=LZH8S&inajax=1 HTTP/1.1
Host: www.discuz.net
Connection: keep-alive
Content-Length: 200
Cache-Control: max-age=0
Origin: http://www.discuz.net
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: http://www.discuz.net/forum.php
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.8
Cookie: t7asq_4ad6_saltkey=X3kj05VV; t7asq_4ad6_lastvisit=1496539605; t7asq_4ad6_nofavfid=1; t7asq_4ad6_ulastactivity=1496722517%7C0; t7asq_4ad6_lastcheckfeed=3051978%7C1496722517; t7asq_4ad6_security_cookiereport=798bNiNKvpmS%2BJb6aMC1CK0u2rBbgm%2Bl4RHOagb%2FADNc1uz1WEc6; t7asq_4ad6_connect_is_bind=0; t7asq_4ad6_forum_lastvisit=D_10_1496722811; pgv_pvi=7116158144; pgv_info=ssi=s5527714008; t7asq_4ad6_seccode=14415.a12e22e835ed3df84a; t7asq_4ad6_lastact=1496730222%09misc.php%09seccode

formhash=41969484&referer=http%3A%2F%2Fwww.discuz.net%2Fforum.php&loginfield=username&username=123&password=123&questionid=0&answer=&seccodehash=cSA&seccodemodid=member%3A%3Alogging&seccodeverify=e38p
The request URL contains loginhash=LZH8S. The submitted data contains formhash=41969484, the CAPTCHA as seccodeverify=e38p, and the username and password in plain text.
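The body of this request is ordinary form encoding, which can be reproduced with the standard library. The sketch below rebuilds the captured body from its fields and measures its length; the variable names are illustrative.

```python
from urllib.parse import urlencode

# Form fields in the order they appear in the captured request
fields = {
    'formhash': '41969484',
    'referer': 'http://www.discuz.net/forum.php',
    'loginfield': 'username',
    'username': '123',
    'password': '123',
    'questionid': '0',
    'answer': '',
    'seccodehash': 'cSA',
    'seccodemodid': 'member::logging',
    'seccodeverify': 'e38p',
}

body = urlencode(fields)   # percent-encodes ':' and '/' in the values
print(body)
print(len(body))           # size of the body in bytes (ASCII only)
```

The length matches the Content-Length: 200 header of the captured request. It depends on the submitted values, so a different username, password, or CAPTCHA changes it.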
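Because the username and password were made up, the returned text reports a failure ("Login failed, you have 4 more attempts"):
<?xml version="1.0" encoding="gbk"?> <root><![CDATA[登录失败,您还可以尝试 4 次]]></root>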
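Next comes the simulated login. Note that the values of loginhash, formhash, and update have to be fetched first.
Simulated login
The 5 requests in the login flow each carry different headers. When simulating them, copy the headers from the capture exactly.
Define the functions: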
import requests

session = requests.session()

USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'

# Get loginhash and formhash from the login window
def get_login_window():
    url = 'http://www.discuz.net/member.php?mod=logging&action=login&infloat=yes&handlekey=login&inajax=1&ajaxtarget=fwin_content_login'
    headers = {
        'Host': 'www.discuz.net',
        'Connection': 'keep-alive',
        'User-Agent': USER_AGENT,
        'X-Requested-With': 'XMLHttpRequest',
        'Accept': '*/*',
        'Referer': 'http://www.discuz.net/forum.php',
        'Accept-Encoding': 'gzip, deflate, sdch',
        'Accept-Language': 'zh-CN,zh;q=0.8',
    }
    # Clear the previous headers
    session.headers.clear()
    # Install the headers for this request
    session.headers.update(headers)
    r = session.get(url)
    # loginhash: the 5 characters after 'loginhash='
    p = r.text.find('loginhash') + len('loginhash') + 1
    loginhash = r.text[p:p+5]
    # formhash: the 8 characters after 'formhash" value="'
    p = r.text.find('formhash') + len('formhash" value="')
    formhash = r.text[p:p+8]
    return (loginhash, formhash)

# Get the update value (reuses the headers set by the previous call)
def get_code_info():
    url = 'http://www.discuz.net/misc.php?mod=seccode&action=update&idhash=cSA&0.3916181418197131&modid=member::logging'
    r = session.get(url)
    # update: the 5 characters after 'update='
    p = r.text.find('update=')
    update = r.text[p+7:p+12]
    return update

# Download the CAPTCHA image
def get_code(update):
    url = 'http://www.discuz.net/misc.php?mod=seccode&update=' + update + '&idhash=cSA'
    headers = {
        'Host': 'www.discuz.net',
        'Connection': 'keep-alive',
        'User-Agent': USER_AGENT,
        'Accept': 'image/webp,image/*,*/*;q=0.8',
        'Referer': 'http://www.discuz.net/forum.php',
        'Accept-Encoding': 'gzip, deflate, sdch',
        'Accept-Language': 'zh-CN,zh;q=0.8',
    }
    session.headers.clear()
    session.headers.update(headers)
    r = session.get(url)
    if r.content[:3] == b'GIF':
        # Save the CAPTCHA image
        with open('code.gif', 'wb') as file:
            file.write(r.content)
    else:
        # Not an image: print the error message
        print(r.text)

# Check whether the CAPTCHA is correct
# The CAPTCHA text `code` is recognized by a human :)
def check_code(code):
    url = 'http://www.discuz.net/misc.php?mod=seccode&action=check&inajax=1&modid=member::logging&idhash=cSA&secverify=' + code
    headers = {
        'Host': 'www.discuz.net',
        'Connection': 'keep-alive',
        'User-Agent': USER_AGENT,
        'Accept': 'image/webp,image/*,*/*;q=0.8',
        'Referer': 'http://www.discuz.net/forum.php',
        'Accept-Encoding': 'gzip, deflate, sdch',
        'Accept-Language': 'zh-CN,zh;q=0.8',
    }
    session.headers.clear()
    session.headers.update(headers)
    r = session.get(url)
    return r.text

# Simulate the login
def login(loginhash, formhash, code, username, password):
    url = 'http://www.discuz.net/member.php?mod=logging&action=login&loginsubmit=yes&handlekey=login&loginhash=' + loginhash + '&inajax=1'
    data = {'formhash': formhash,
            'referer': 'http://www.discuz.net/forum.php',
            'loginfield': 'username',
            'username': username,
            'password': password,
            'questionid': '0',
            'answer': '',
            'seccodehash': 'cSA',
            'seccodemodid': 'member::logging',
            'seccodeverify': code}
    # Content-Length is left out: requests computes it from the actual body,
    # and a hard-coded value is only right for one particular set of inputs
    headers = {
        'Host': 'www.discuz.net',
        'Connection': 'keep-alive',
        'Cache-Control': 'max-age=0',
        'Origin': 'http://www.discuz.net',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': USER_AGENT,
        'Content-Type': 'application/x-www-form-urlencoded',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Referer': 'http://www.discuz.net/forum.php',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.8',
    }
    session.headers.clear()
    session.headers.update(headers)
    r = session.post(url, data)
    print(r.text)
Run the simulated login:
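username = 'your_username'  # placeholder: replace with a real account name
password = 'your_password'  # placeholder: replace with the real password
(loginhash, formhash) = get_login_window()
get_code(get_code_info())
code = input()  # read the CAPTCHA from code.gif by eye :)
check_code(code)  # the reply should contain [CDATA[succeed]]
login(loginhash, formhash, code, username, password)  # on success: 欢迎您回来,现在将转入登录前页面 ("Welcome back, returning to the page you came from")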
Test:
The following page is only shown to logged-in users
url='http://www.discuz.net/home.php?mod=space&do=pm'
When logged in, its title is "username - Discuz! 官方站 - Powered by Discuz!"
When not logged in, the title is "提示信息 - Discuz! 官方站 - Powered by Discuz!" (提示信息 means "notice")
session.headers.clear()
r=session.get(url)
p=r.text.find('<title>')+len('<title>')
print(r.text[p:r.text.find('<',p)])
If this prints "username - Discuz! 官方站 - Powered by Discuz!", the login succeeded.
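The title comparison can be wrapped in a small function so that a script can act on the result. This is a sketch under the assumption stated above, namely that the page title starts with 提示信息 when the user is not logged in; `page_title` and `looks_logged_in` are hypothetical names.

```python
import re

def page_title(html):
    """Return the text of the <title> element, or None if the page has none."""
    match = re.search(r'<title>(.*?)</title>', html, re.S | re.I)
    return match.group(1).strip() if match else None

def looks_logged_in(html):
    """True when the page has a title that does not start with the notice prefix."""
    title = page_title(html)
    return title is not None and not title.startswith('提示信息')

print(looks_logged_in('<html><head><title>username - Discuz! 官方站 - Powered by Discuz!</title></head></html>'))
```

With the session from the steps above, the check would be called as `looks_logged_in(r.text)`.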