Java Spring Boot Crawler (2): Simulating a User Login

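Much of the data a crawler needs is only available after logging in. This post records how to simulate a login to Jiangxi Mobile (江西移动) and query the balance of your own phone account.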

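First, capture the traffic to find the home page URL: http://service.jx.10086.cn/service/resources/indexNew.html
Then capture the login page URL: https://jx.ac.10086.cn/Login
Finally, capture the login parameters (the phone number and service password are redacted here; fill in your own when testing):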
[Image 1: captured login request parameters]
With the preparation done, here is the code:

				// Needs org.jsoup.Connection, org.jsoup.Jsoup, org.jsoup.nodes.Document and java.util.Map.
				// JsoupUtil, update(...) and getImageCode() are helpers that this post does not show.
				// Step 1: load the home page to collect the initial cookies and the hidden form fields.
				Connection.Response response = JsoupUtil.get(index_url, null, 10000)
						.header("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8")
						.execute();
				Map<String, String> cookies = response.cookies();
				update(cookieMap, cookies);
				Document document = Jsoup.parse(response.body());
				// Hidden fields that the login form expects to be posted back.
				String type = document.getElementById("type").val();
				String loginStatus = document.getElementById("loginStatus").val();
				String loginFlag = document.getElementById("loginFlag").val();
				String menuid = document.getElementById("menuid").val();
				String spid = document.select("[name=spid]").val();
				String backurl = document.getElementById("backurl").val();
				String errorurl = document.select("[name=errorurl]").val();
				String sessionToken = document.getElementById("sessionToken").val();
				String login_backurl = document.getElementById("_login_backurl").val();
				// Credentials are redacted; fill in your own phone number and service password.
				String mobileNum = "xxxxxxxxx";
				String servicePassword = "xxx";
				String smsValidCode = "";
				// The CAPTCHA text, obtained as described below.
				String validCode = getImageCode();
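
The `update(cookieMap, cookies)` helper used above is not shown in this post. A minimal sketch, assuming it simply merges the cookies from the latest response into the shared map so that later requests carry the whole session (the class name `CookieJarSketch` is made up for illustration):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class CookieJarSketch {

    /**
     * Hypothetical stand-in for the post's update(...) helper:
     * copies every cookie from the latest response into the shared jar.
     * A cookie that already exists is overwritten with the newer value.
     */
    static void update(Map<String, String> cookieMap, Map<String, String> latest) {
        if (latest == null || latest.isEmpty()) {
            return; // nothing to merge
        }
        cookieMap.putAll(latest);
    }

    public static void main(String[] args) {
        Map<String, String> jar = new LinkedHashMap<>();

        Map<String, String> fromHomePage = new HashMap<>();
        fromHomePage.put("JSESSIONID", "abc");
        update(jar, fromHomePage);

        Map<String, String> fromLogin = new HashMap<>();
        fromLogin.put("JSESSIONID", "def");
        fromLogin.put("route", "1");
        update(jar, fromLogin);

        System.out.println(jar);
    }
}
```

Keeping one map for the whole flow means each request can pass it straight to the next `JsoupUtil.get` or `JsoupUtil.post` call.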

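The CAPTCHA is generated dynamically. Among open-source tools for automatic CAPTCHA recognition there is tess4j; you can look up how to use it yourself. When I tried it, it did not seem able to recognize this site's CAPTCHA, although digit-only CAPTCHAs work reasonably well. Here I use the Ruokuai (若快) CAPTCHA-solving service instead; its utility class can be downloaded from http://wiki.ruokuai.com/. The CAPTCHA image is fetched from this URL:
String url = "https://jx.ac.10086.cn/common/image.jsp?l=" + Math.random();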
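
The post's own `getImageCode()` is not shown. As a rough sketch of just the download step, the image can be fetched with the JDK's `HttpURLConnection`. This assumes the server ties the CAPTCHA to the session, so the cookies collected so far are sent along with the request. `CaptchaFetchSketch` and its method names are hypothetical, and recognizing the text (tess4j or Ruokuai) is left out:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Map;
import java.util.StringJoiner;

final class CaptchaFetchSketch {

    /** Builds the CAPTCHA URL; the random query value defeats caching. */
    static String captchaUrl() {
        return "https://jx.ac.10086.cn/common/image.jsp?l=" + Math.random();
    }

    /** Joins cookies into a single "name=value; name=value" Cookie header. */
    static String cookieHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> cookie : cookies.entrySet()) {
            joiner.add(cookie.getKey() + "=" + cookie.getValue());
        }
        return joiner.toString();
    }

    /** Downloads the CAPTCHA image bytes using the current session cookies. */
    static byte[] fetchCaptcha(Map<String, String> cookies) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(captchaUrl()).toURL().openConnection();
        conn.setConnectTimeout(10000);
        conn.setReadTimeout(10000);
        if (!cookies.isEmpty()) {
            conn.setRequestProperty("Cookie", cookieHeader(cookies));
        }
        try (InputStream in = conn.getInputStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } finally {
            conn.disconnect();
        }
    }
}
```

The returned bytes can then be handed to whichever recognition tool you use. Any cookies the image response sets are ignored in this sketch.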
Once you have the CAPTCHA value, perform the login:

// Step 2: POST the login form, sending the cookies collected so far.
String login_url = "https://jx.ac.10086.cn/Login";
Connection.Response responses = JsoupUtil.post(login_url, cookieMap, 10000)
							.header("Host","jx.ac.10086.cn")
							.header("Upgrade-Insecure-Requests","1")
							.header("Accept","text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8")
							.header("Content-Type","application/x-www-form-urlencoded")
							.header("User-Agent","Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36")
							.header("Referer","http://service.jx.10086.cn/service/resources/indexNew.html")
							.data("type",type)
							.data("loginStatus",loginStatus)
							.data("loginFlag",loginFlag)
							.data("menuid",menuid)
							.data("spid",spid)
							.data("backurl",backurl)
							.data("errorurl",errorurl)
							.data("sessionToken",sessionToken)
							.data("login_backurl",login_backurl)
							.data("mobileNum",mobileNum)
							.data("servicePassword",servicePassword)
							.data("smsValidCode",smsValidCode)
							.data("validCode",validCode)
							.execute();
					String responstr = responses.body();
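
To see what goes over the wire, here is a small JDK-only sketch of how key/value pairs like the `.data(...)` calls above end up in an `application/x-www-form-urlencoded` body. It assumes `JsoupUtil.post` returns a plain Jsoup `Connection`. `FormBodySketch` is a made-up name and the field values are placeholders:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class FormBodySketch {

    /** URL-encodes one key or value as UTF-8. */
    private static String enc(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Joins the fields into key=value pairs separated by '&'. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(enc(field.getKey()) + "=" + enc(field.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("mobileNum", "xxxxxxxxx");              // placeholder
        fields.put("backurl", "http://example.com/a?b=c"); // placeholder
        System.out.println(encode(fields));
    }
}
```

Comparing this output with the body captured in the browser is a quick way to spot a missing or misnamed field.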

After a successful login, follow the redirect to the home page address and read the balance information from that page:

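					// The code treats a location.replace('...') redirect in the body as a successful login.
					if (!StringUtil.isBlank(responstr) && responstr.contains("location.replace"))
					{
						// Merge the cookies set by the login response (responses), not the earlier home page response.
						update(cookieMap, responses.cookies());
						System.out.println("Login succeeded");

						// Extract the redirect URL found between (' and ').
						String url = substring("('", "')", responstr);
						System.out.println("url:"+url);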
						response = JsoupUtil.get(url,cookieMap,10000)
								.header("Accept","text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8")
								.header("Host","login.10086.cn")
								.execute();
						String body = response.body();


						if(body.contains("location.href"))
						{
							update(cookieMap,response.cookies());

							url = "http://service.jx.10086.cn/service/common/indexNew.jsp";
							response = JsoupUtil.get(url,cookieMap,10000)
									.header("Accept","text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8")
									.header("Host","service.jx.10086.cn")
									.execute();
							update(cookieMap,response.cookies());
							body = response.body();
							System.out.println("body: "+body);
							// Parse the page once and reuse the Document for both lookups.
							Document homePage = Jsoup.parse(body);
							String mobile = homePage.select("div[class=lc_text]").select("p").get(0).text();
							String amount = homePage.getElementsByClass("tt_blink").text();
							System.out.println("Mobile number: "+mobile+", balance: "+amount);
							return;

						}
					}
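
The `substring("('", "')", responstr)` helper is also not shown. A minimal sketch, assuming it returns the text between the first occurrence of the start marker and the next occurrence of the end marker (the class name `BetweenSketch` and the empty-string fallback are my own choices):

```java
final class BetweenSketch {

    /**
     * Hypothetical stand-in for the post's substring(start, end, source) helper.
     * Returns the text between the first start marker and the end marker
     * that follows it, or an empty string when either marker is missing.
     */
    static String substring(String start, String end, String source) {
        if (source == null) {
            return "";
        }
        int from = source.indexOf(start);
        if (from < 0) {
            return "";
        }
        from += start.length(); // skip past the start marker itself
        int to = source.indexOf(end, from);
        return to < 0 ? "" : source.substring(from, to);
    }

    public static void main(String[] args) {
        String body = "<script>location.replace('https://example.com/next?a=1');</script>";
        System.out.println(substring("('", "')", body));
    }
}
```

Returning an empty string instead of throwing lets the caller check the result with `StringUtil.isBlank` before requesting the URL.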

Test result:
[Image 2: screenshot of the test result]

Screenshot from the official site:
[Image 3: screenshot of the official site]
