A Solution to the 5-Second Browser Security Check


While writing a crawler recently, I ran into a "browser security check" page:
[Screenshot: the browser security check page]

Capturing and analyzing the traffic with Fiddler:
[Screenshot: Fiddler capture of the requests]

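We can see that the first request to the page returns status code 503, because it carries no cookie.
About 5 seconds later the browser automatically jumps to the third request, whose URL is: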
http://www.machineryinfo.net/cdn-cgi/l/chk_jschl?jschl_vc=59dc97dc709d5d42b4907dea5483df37&pass=1514166702.677-IddoE%2FcQMm&jschl_answer=544
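
To make the three query parameters easier to see, the URL above can be taken apart with the standard `URL` class. This is only a minimal sketch; the variable names `challengeUrl` and `params` are my own and do not come from the original crawler.

```javascript
// URL of the third request, copied from the capture above.
const challengeUrl = new URL(
  'http://www.machineryinfo.net/cdn-cgi/l/chk_jschl' +
  '?jschl_vc=59dc97dc709d5d42b4907dea5483df37' +
  '&pass=1514166702.677-IddoE%2FcQMm' +
  '&jschl_answer=544'
);

// searchParams percent-decodes the values, so %2F comes back as "/".
const params = Object.fromEntries(challengeUrl.searchParams);

console.log(challengeUrl.pathname); // /cdn-cgi/l/chk_jschl
console.log(params.jschl_vc);       // 59dc97dc709d5d42b4907dea5483df37
console.log(params.pass);           // 1514166702.677-IddoE/cQMm
console.log(params.jschl_answer);   // 544
```

Note that `pass` contains a `/` once decoded, which is why it shows up as `%2F` in the captured URL.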

Let's look at the response body of the first request:


<html lang="en-US">
<head>
  <meta charset="UTF-8" />
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
  <meta name="robots" content="noindex, nofollow" />
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
  <title>安全检查中...</title>
  <style type="text/css">
    html, body {width: 100%; height: 100%; margin: 0; padding: 0;}
    body {background-color: #ffffff; font-family: Helvetica, Arial, sans-serif; font-size: 100%;}
    h1 {font-size: 1.5em; color: #404040; text-align: center;}
    p {font-size: 1em; color: #404040; text-align: center; margin: 10px 0 0 0;}
    #spinner {margin: 0 auto 30px auto; display: block;}
    .attribution {margin-top: 20px;}
  </style>

    <script type="text/javascript">
  //<![CDATA[
  (function(){
    var a = function() {try{return !!window.addEventListener} catch(e) {return !1} },
    b = function(b, c) {a() ? document.addEventListener("DOMContentLoaded", b, c) : document.attachEvent("onreadystatechange", b)};
    b(function(){
      var a = document.getElementById('yjs-content');a.style.display = 'block';
      setTimeout(function(){
        var s,t,o,p,b,r,e,a,k,i,n,g,f, uyKdOwL={"qeULyPtNFAQh":+((+!![]+[])+(!+[]+!![]+!![]+!![]))};
        t = document.createElement('div');
        t.innerHTML="<a href='/'>x</a>";
        t = t.firstChild.href;r = t.match(/https?:\/\//)[0];
        t = t.substr(r.length); t = t.substr(0,t.length-1);
        a = document.getElementById('jschl-answer');
        f = document.getElementById('challenge-form');
        ;uyKdOwL.qeULyPtNFAQh*=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]));uyKdOwL.qeULyPtNFAQh-=+!![];uyKdOwL.qeULyPtNFAQh+=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]));a.value = parseInt(uyKdOwL.qeULyPtNFAQh, 10) + t.length; '; 121'
        f.submit();
      }, 4000);
    }, false);
  })();
  //]]>
</script>


</head>
<body>
  <table width="100%" height="100%" cellpadding="20">
    <tr>
      <td align="center" valign="middle">
          <div class="yjs-browser-verification yjs-im-under-attack">
  <noscript><h1 data-translate="turn_on_js" style="color:#bd2426;">请打开浏览器的javascript,然后刷新浏览器</h1></noscript>
  <div id="yjs-content" style="display:none">
    <div>
      <div class="bubbles"></div>
      <div class="bubbles"></div>
      <div class="bubbles"></div>
    </div>
    <h1>machineryinfo.net <span data-translate="checking_browser">浏览器安全检查中...</span></h1>
    <p data-translate="process_is_automatic"></p>
    <p data-translate="allow_5_secs">还剩 5 秒…</p>
  </div>
  <form id="challenge-form" action="/cdn-cgi/l/chk_jschl" method="get">
    <input type="hidden" name="jschl_vc" value="59dc97dc709d5d42b4907dea5483df37"/>
    <input type="hidden" name="pass" value="1514166702.677-IddoE/cQMm"/>
    <input type="hidden" id="jschl-answer" name="jschl_answer"/>
  </form>
</div>


          <div class="attribution"><a href="http://su.baidu.com/" target="_blank" style="font-size: 12px;"></a></div>
      </td>
    </tr>
  </table>
</body>
</html>

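All three parameters of the third request can be found in this source: jschl_vc and pass are the values of two hidden inputs in challenge-form, and jschl_answer is the hidden input that the script fills in before submitting the form.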
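
The hidden form fields can be pulled out of the response body with a small helper. The sketch below is an assumption about how one might do it, not the original crawler code: `extractHiddenInputs` is a hypothetical name, and the regular expressions only handle double-quoted attributes as they appear in this particular page.

```javascript
// Collect name/value pairs from every <input> tag in an HTML string.
// Inputs without a value attribute (such as jschl_answer) map to ''.
function extractHiddenInputs(html) {
  const fields = {};
  for (const tag of html.match(/<input\b[^>]*>/gi) || []) {
    const name = /\bname="([^"]*)"/.exec(tag);
    const value = /\bvalue="([^"]*)"/.exec(tag);
    if (name) fields[name[1]] = value ? value[1] : '';
  }
  return fields;
}

// The three inputs of challenge-form, copied from the response body above.
const formHtml = `
  <input type="hidden" name="jschl_vc" value="59dc97dc709d5d42b4907dea5483df37"/>
  <input type="hidden" name="pass" value="1514166702.677-IddoE/cQMm"/>
  <input type="hidden" id="jschl-answer" name="jschl_answer"/>
`;

const fields = extractHiddenInputs(formHtml);
console.log(fields.jschl_vc); // 59dc97dc709d5d42b4907dea5483df37
console.log(fields.pass);     // 1514166702.677-IddoE/cQMm
```

A real HTML parser would be more robust; regular expressions are used here only to keep the sketch dependency-free.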



The jschl-answer value is the one we have to compute ourselves:

uyKdOwL={"qeULyPtNFAQh":+((+!![]+[])+(!+[]+!![]+!![]+!![]))};
uyKdOwL.qeULyPtNFAQh*=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]));
uyKdOwL.qeULyPtNFAQh-=+!![];
uyKdOwL.qeULyPtNFAQh+=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]));
a.value = parseInt(uyKdOwL.qeULyPtNFAQh, 10) + t.length;
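
To see how these statements reduce to plain numbers, each obfuscated operand can be evaluated on its own in Node or a browser console. The sketch below does that; the variable names (`initial`, `multiplier`, and so on) are mine, and the host name is taken from the request URL shown earlier.

```javascript
// Building blocks of the obfuscation:
//   !![]   -> true   (an array is truthy)
//   +!![]  -> 1      (unary + converts true to 1)
//   !+[]   -> true   (+[] is 0, and !0 is true)
//   x + [] -> string (adding an empty array stringifies x)
const initial    = +((+!![]+[])+(!+[]+!![]+!![]+!![]));               // "1" + 4 -> 14
const multiplier = +((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![])); // "3" + 5 -> 35
const decrement  = +!![];                                              // 1
const increment  = +((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]));      // "3" + 4 -> 34

// In the page script, t ends up as the host name
// (scheme and trailing slash stripped from the resolved href).
const host = new URL('http://www.machineryinfo.net/').host;

// Same order as the page script: *=, then -=, then +=, then + t.length.
const answer = initial * multiplier - decrement + increment + host.length;

console.log(initial, multiplier, decrement, increment); // 14 35 1 34
console.log(host.length);                               // 21
console.log(answer);                                    // 544
```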

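Reading the JavaScript in the source shows that jschl-answer is simply a.value. t.length is 21 (t is the host name, www.machineryinfo.net), and uyKdOwL.qeULyPtNFAQh is the result of a chain of arithmetic steps. An expression such as +((+!![]+[])+(!+[]+!![]+!![]+!![])) can be regarded as obfuscated JavaScript; running it directly in the Chrome console shows that it is just 14. The remaining steps are *= 35, -= 1 and += 34, so the value becomes 14 * 35 - 1 + 34 = 523. Adding t.length (21) gives 544, which is exactly the last parameter in the request URL shown earlier.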
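
A crawler cannot open a console by hand, so the same calculation has to be done in code. The following is a minimal sketch, assuming the challenge script always has the shape shown above (one `NAME={"KEY":EXPR}` initializer followed by `*=`, `-=` and `+=` steps). `evalObfuscated` and `solveChallenge` are hypothetical names of my own. Because the page content is untrusted, every expression is checked against a whitelist of the six characters this obfuscation uses before it is evaluated.

```javascript
// Only the characters [ ] ( ) + ! may appear in an operand.
const SAFE_EXPR = /^[\[\]()+!]+$/;

function evalObfuscated(expr) {
  if (!SAFE_EXPR.test(expr)) {
    throw new Error('unexpected characters in expression: ' + expr);
  }
  return Function('"use strict"; return (' + expr + ');')();
}

function solveChallenge(script, host) {
  // Initial value: NAME={"KEY":EXPR}
  const init = /(\w+)=\{"(\w+)":([^}]+)\}/.exec(script);
  if (!init) throw new Error('challenge initializer not found');
  const [, objName, key, firstExpr] = init;
  let value = evalObfuscated(firstExpr);

  // Follow-up steps: NAME.KEY*=EXPR;  NAME.KEY-=EXPR;  NAME.KEY+=EXPR;
  const stepRe = new RegExp(objName + '\\.' + key + '([*+\\-])=([^;]+);', 'g');
  let step;
  while ((step = stepRe.exec(script)) !== null) {
    const operand = evalObfuscated(step[2]);
    if (step[1] === '*') value *= operand;
    else if (step[1] === '-') value -= operand;
    else value += operand;
  }

  // Mirrors: a.value = parseInt(..., 10) + t.length
  return parseInt(value, 10) + host.length;
}

// The relevant statements, copied from the response body above.
const challengeScript = `
  uyKdOwL={"qeULyPtNFAQh":+((+!![]+[])+(!+[]+!![]+!![]+!![]))};
  ;uyKdOwL.qeULyPtNFAQh*=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]));uyKdOwL.qeULyPtNFAQh-=+!![];uyKdOwL.qeULyPtNFAQh+=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]));a.value = parseInt(uyKdOwL.qeULyPtNFAQh, 10) + t.length;
`;

console.log(solveChallenge(challengeScript, 'www.machineryinfo.net')); // 544
```

If the page ever uses other operators or characters, the whitelist makes the sketch fail loudly rather than run unknown code.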
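Once we have the URL, we can simulate the request ourselves. Wait 4 seconds before sending it, to mimic the setTimeout call in the JS. The response headers of that request carry a new cookie field, and once that cookie is added to our request headers the page can be accessed normally.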
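
The pieces needed for this last step can be sketched as three small helpers. None of them comes from the original crawler: `buildChallengeUrl`, `sleep` and `toCookieHeader` are hypothetical names, and the actual HTTP call is left to whatever client the crawler uses.

```javascript
// Build the chk_jschl URL from the page URL, the hidden form fields and the answer.
// URLSearchParams percent-encodes the "/" in pass as %2F.
function buildChallengeUrl(pageUrl, fields, answer) {
  const url = new URL('/cdn-cgi/l/chk_jschl', pageUrl);
  url.searchParams.set('jschl_vc', fields.jschl_vc);
  url.searchParams.set('pass', fields.pass);
  url.searchParams.set('jschl_answer', String(answer));
  return url.toString();
}

// Promise-based delay, to mimic the 4000 ms setTimeout in the page script.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Turn a list of Set-Cookie header values into a single Cookie request header,
// keeping only the leading name=value pair of each entry.
function toCookieHeader(setCookieValues) {
  return setCookieValues.map((v) => v.split(';')[0].trim()).join('; ');
}

const replayUrl = buildChallengeUrl(
  'http://www.machineryinfo.net/',
  { jschl_vc: '59dc97dc709d5d42b4907dea5483df37', pass: '1514166702.677-IddoE/cQMm' },
  544
);
console.log(replayUrl);
```

A crawler would then `await sleep(4000)`, request `replayUrl`, pass the Set-Cookie values of the response through `toCookieHeader`, and send the result as the `Cookie` header on later requests.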
