Simulating a Zhihu Login with HttpClient 4.4.1

When I tried to log in today it no longer worked: Zhihu has changed its login flow, so treat the code below as reference only.
July 19, 2015, 21:17:00

1. What form data does the login POST need?

You can find this out by capturing the login request with Wireshark. The request needs the following fields:

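"_xsrf" = xxxx (this value changes, so you first have to fetch the source of the Zhihu home page to obtain it)
"email" = your email address
"password" = your password
"rememberme" = "y" ("n" also works)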
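
To make the shape of the request body concrete, here is a minimal sketch that encodes these four fields the way an application/x-www-form-urlencoded POST carries them. It uses only the JDK, not HttpClient; the class name LoginFormSketch and its two helper methods are hypothetical names chosen for illustration, and every argument value is a placeholder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of HttpClient or of Zhihu's API.
final class LoginFormSketch {

    /** Builds the four login fields listed above, in the same order. */
    static Map<String, String> loginFields(String xsrf, String email, String password) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("_xsrf", xsrf);
        fields.put("email", email);
        fields.put("password", password);
        fields.put("rememberme", "y");
        return fields;
    }

    /** Encodes the fields as key=value pairs joined by '&', keeping insertion order. */
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }
}
```

Calling LoginFormSketch.encode(LoginFormSketch.loginFields(token, email, password)) gives the string that goes into the POST body. In the login code later in this post, UrlEncodedFormEntity does this encoding for you.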

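  • Getting the _xsrf value:
String xsrfValue = responseHtml.split("<input type=\"hidden\" name=\"_xsrf\" value=\"")[1].split("\"/>")[0];

responseHtml is the source of the home page. The _xsrf value is cut out of it with two splits that rely on exactly how the page writes the hidden input tag.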
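
The split-based extraction only works when the page's markup matches the literal string character for character. If it does not, split returns a single-element array and the [1] index throws ArrayIndexOutOfBoundsException. As a hedged alternative, the sketch below uses a regular expression that tolerates optional whitespace around the = signs and returns null when no token is found. The class name XsrfExtractor is hypothetical, and the pattern assumes the name attribute appears before value, as in the snippet above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; assumes name="_xsrf" precedes value="..." in the tag.
final class XsrfExtractor {

    // Matches <input ... name="_xsrf" ... value="TOKEN", allowing whitespace around '='.
    private static final Pattern XSRF_INPUT = Pattern.compile(
            "<input[^>]*\\bname\\s*=\\s*\"_xsrf\"[^>]*\\bvalue\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    /** Returns the token, or null when the page contains no matching input tag. */
    static String extract(String html) {
        Matcher matcher = XSRF_INPUT.matcher(html);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

With this helper, XsrfExtractor.extract(responseHtml) could stand in for the double split, and a null result becomes a clear signal that the page layout has changed.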

2. My login code

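import java.io.IOException;
import java.util.LinkedList;
import java.util.List;

import org.apache.http.Consts;
import org.apache.http.NameValuePair;
import org.apache.http.client.config.CookieSpecs;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

public class ZhihuLogin {
    public static void main(String[] args) {
        String email = "your email";       // placeholder: fill in your own account
        String password = "your password"; // placeholder

        RequestConfig requestConfig = RequestConfig.custom().setCookieSpec(CookieSpecs.STANDARD_STRICT).build();
        CloseableHttpClient httpClient = HttpClients.custom().setDefaultRequestConfig(requestConfig).build();

        HttpGet get = new HttpGet("http://www.zhihu.com/");
        try {
            // Fetch the home page and cut the _xsrf token out of its HTML.
            CloseableHttpResponse response = httpClient.execute(get);
            String responseHtml = EntityUtils.toString(response.getEntity());
            String xsrfValue = responseHtml.split("<input type=\"hidden\" name=\"_xsrf\" value=\"")[1].split("\"/>")[0];
            System.out.println("xsrfValue:" + xsrfValue);
            response.close();

            List<NameValuePair> valuePairs = new LinkedList<NameValuePair>();
            valuePairs.add(new BasicNameValuePair("_xsrf", xsrfValue));
            valuePairs.add(new BasicNameValuePair("email", email));
            valuePairs.add(new BasicNameValuePair("password", password));
            valuePairs.add(new BasicNameValuePair("rememberme", "y"));

            UrlEncodedFormEntity entity = new UrlEncodedFormEntity(valuePairs, Consts.UTF_8);
            HttpPost post = new HttpPost("http://www.zhihu.com/login");
            post.setEntity(entity);
            httpClient.execute(post).close(); // log in; close the response to release the connection

            // Fetch the page of questions I follow to check whether the login succeeded.
            HttpGet g = new HttpGet("http://www.zhihu.com/question/following");
            CloseableHttpResponse r = httpClient.execute(g);
            System.out.println(EntityUtils.toString(r.getEntity()));
            r.close();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                httpClient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}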

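Pay attention to the RequestConfig at the top. At first I did not configure anything related to cookies, and I kept getting cookie errors. The HttpClient manual has a section on choosing a cookie policy, and the following sets one globally for the client: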

RequestConfig requestConfig = RequestConfig.custom().setCookieSpec(CookieSpecs.STANDARD_STRICT).build(); // standard (strict) cookie policy
CloseableHttpClient httpClient = HttpClients.custom().setDefaultRequestConfig(requestConfig).build(); // apply it as the client's default
