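For POST requests, we need to be clear about the content type of the request body. In most cases, passing the data parameter directly is enough, but if the site is strict about the format of the submitted body, the request has to be adjusted to match.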
post form-data
Here I set up a simple login page myself.
The source, login.jsp, is as follows:
<%@ page contentType="text/html;charset=UTF-8" language="java" import="java.util.*" pageEncoding="utf-8" %>
<%
String path = request.getContextPath();
String basePath = request.getScheme() + "://" + request.getServerName() + ":" + request.getServerPort() + path + "/";
System.out.println(basePath);
%>
<html>
<head>
<title>Login</title>
</head>
<body>
<form action="<%=basePath%>loginServlet" method="post">
<table align="center">
<tr>
<td height="200"></td>
</tr>
<tr><td>Username:</td><td><input type="text" name="username"></td></tr>
<tr><td>Password:</td><td><input type="password" name="password"></td></tr>
<tr>
<td colspan="2" align="center">
<input type="submit" value="Submit">
</td>
</tr>
</table>
</form>
</body>
</html>
The page as rendered:
loginServlet.java, which handles the login request, is as follows:
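// Imports assume the javax.servlet API (Servlet 4.0 and earlier);
// newer containers use the jakarta.servlet packages instead.
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/loginServlet")
public class loginServlet extends HttpServlet {
    // doGet forwards to the login page
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        req.getRequestDispatcher("/login.jsp").forward(req, resp);
        System.out.println("Visited login.jsp");
    }

    // doPost handles the POST request
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        String username = req.getParameter("username");
        String password = req.getParameter("password");
        // Log the submitted value rather than the literal string "username"
        System.out.println(username);
        req.getSession().setAttribute("username", username);
        String path = req.getContextPath();
        resp.sendRedirect(path + "/index/main.jsp");
    }
}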
main.jsp, shown after a successful login:
<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<html>
<head>
<title>Home</title>
</head>
<body>
<h1>This is the home page</h1>
Welcome, ${username}
</body>
</html>
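We fill in the username and password, then click submit.
At the same time, open DevTools and look at the request.
Next, let's simulate it with Python.
The Python code is as follows: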
import requests

start_url = 'http://localhost:8080/loginServlet'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}
data = {
    'username': 'aaa',
    'password': 'aaa'
}
response = requests.post(start_url, headers=headers, data=data)
print(response.text)
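To see what this call puts on the wire without having the servlet running, the request can be prepared but not sent. This is only a small sketch: it reuses the URL and fields from the login example, and the variable name `prepared` is just for illustration.

```python
import requests

# Same URL and form fields as the login example.
start_url = 'http://localhost:8080/loginServlet'
data = {'username': 'aaa', 'password': 'aaa'}

# Build the request without sending it, so no server is needed.
prepared = requests.Request('POST', start_url, data=data).prepare()

# With a dict passed to data=, requests form-encodes the body and
# sets the matching Content-Type header on its own.
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
print(prepared.body)                     # username=aaa&password=aaa
```

This is the same shape of body that the browser form sends, which is why `req.getParameter("username")` in the servlet can read it.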
First, let's look at an example.
url:aHR0cHM6Ly93d3cua3VhaXNob3UuY29tL3Byb2ZpbGUvM3h4Ymt3ZDhta250ZWFj
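First, capture the traffic and analyze the API.
Note the Content-Type I circled here. It defines the media type and encoding of the body, and determines in what form and with what encoding the receiver will read it. This means that POSTing directly with data= may no longer work. You can give it a try.
Here we POST with json=data instead.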
response = session.post(self.start_url,headers=self.headers,json=self.data)
import random
import requests

# USER_AGENT_LIST is not shown in the original; this one-item list is a placeholder.
USER_AGENT_LIST = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
]
# session is assumed to be a shared requests.Session.
session = requests.Session()

class ksSpider(object):
    def __init__(self):
        self.start_url = "https://www.kuaishou.com/graphql"
        self.headers = {
            'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_74dfd4a4a602c0350b9d5a6715e61e87',
            'Referer': 'https://www.kuaishou.com/profile/3xxbkwd8mknteac',
            'User-Agent': random.choice(USER_AGENT_LIST)
        }
        self.data = {"operationName":"visionProfilePhotoList","variables":{"userId":"3xxbkwd8mknteac","pcursor":"1.637041162378E12","page":"profile"},"query":"fragment photoContent on PhotoEntity {\n id\n duration\n caption\n likeCount\n viewCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n __typename\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n __typename\n}\n\nquery visionProfilePhotoList($pcursor: String, $userId: String, $page: String, $webPageArea: String) {\n visionProfilePhotoList(pcursor: $pcursor, userId: $userId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n hostName\n pcursor\n __typename\n }\n}\n"}

    def parse_start_url(self):
        response = session.post(self.start_url, headers=self.headers, json=self.data)
        print(response.text)
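What json= changes can be checked offline in the same way, by preparing a request instead of sending it. The payload below is a shortened, hypothetical stand-in for the real GraphQL body, used only to keep the sketch readable.

```python
import json
import requests

# Hypothetical, shortened payload; the real GraphQL body is much longer.
payload = {
    "operationName": "visionProfilePhotoList",
    "variables": {"userId": "3xxbkwd8mknteac", "page": "profile"},
}

prepared = requests.Request(
    'POST', 'https://www.kuaishou.com/graphql', json=payload
).prepare()

# json= serializes the dict and sets Content-Type to application/json.
content_type = prepared.headers['Content-Type']

# Depending on the requests version, the body may be bytes or str.
body = prepared.body
body_text = body.decode('utf-8') if isinstance(body, bytes) else body

print(content_type)
print(json.loads(body_text) == payload)
```

So with json= there is no need to set the Content-Type header by hand; requests adds it as long as the headers do not already contain one.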
Here I switch to another url:
aHR0cDovL3d3dy53aGdnenkuY29tL1BvbGljaWVzQW5kUmVndWxhdGlvbnMvaW5kZXguaHRtbD91dG09c2l0ZXNfZ3JvdXBfZnJvbnQuMmVmNTAwMWYuMC4wLmZkNGU1ODMwMDhiYjExZWQ5ZjIzYjUzNmUyN2YwNmFk
First we POST by passing a dict through data=.
data = {"categoryCode": "GovernmentProcurement", "pageSize": 15, "pageNo": 1}
response = session.post(self.start_url,headers=self.headers,data=data).text
Of course, it fails.
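A likely reason can be seen without touching the site. Whatever headers are supplied, a dict passed to data= is form-encoded, so the server never receives a JSON body. The sketch below prepares the request in two ways; which headers the failed attempt actually carried is not shown above, so both cases are assumptions.

```python
import requests

url = 'http://www.whggzy.com/front/search/category'
data = {"categoryCode": "GovernmentProcurement", "pageSize": 15, "pageNo": 1}

# Case 1: no Content-Type given, so requests picks the form type.
plain = requests.Request('POST', url, data=data).prepare()

# Case 2: the header claims JSON, but data=dict is still form-encoded.
mismatched = requests.Request(
    'POST', url, data=data,
    headers={'Content-Type': 'application/json'},
).prepare()

print(plain.headers['Content-Type'])       # application/x-www-form-urlencoded
print(plain.body)                          # categoryCode=GovernmentProcurement&pageSize=15&pageNo=1
print(mismatched.headers['Content-Type'])  # application/json
print(mismatched.body)                     # same form-encoded string as above
```

In the second case the header and the body disagree, which is exactly the kind of request a JSON endpoint tends to reject.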
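Next, copy the API's resource path, /front/search/category, and add an XHR breakpoint on it.
Trigger the request.
Look at the Scope panel.
Print s["data"] in the console; it is clearly a string.
So let's fill in the headers and try POSTing this string.
The code is as follows: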
data2 = "{\"categoryCode\":\"GovernmentProcurement\",\"pageSize\":\"15\",\"pageNo\":\"2\"}"
response = session.post(self.start_url,headers=self.headers,data=data2).text
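import random
import requests

# USER_AGENT_LIST is not shown in the original; this one-item list is a placeholder.
USER_AGENT_LIST = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
]
# session is assumed to be a shared requests.Session.
session = requests.Session()

class wSpider(object):
    def __init__(self):
        self.start_url = "http://www.whggzy.com/front/search/category"
        self.headers = {
            "Content-Type": "application/json",
            "X-Requested-With": "XMLHttpRequest",
            'Cookie': 'acw_tc=2f624a2b16583799791827338e7686367845620cecc4e2a1b7ed6f85bbdd0a',
            'Referer': 'http://www.whggzy.com/PoliciesAndRegulations/index.html',
            'user-agent': random.choice(USER_AGENT_LIST)
        }

    def parse_start_url(self):
        # The dict form, passed via data=, is the attempt that failed.
        data = {"categoryCode": "GovernmentProcurement", "pageSize": 15, "pageNo": 1}
        # The string captured at the XHR breakpoint is sent as-is.
        data2 = "{\"categoryCode\":\"GovernmentProcurement\",\"pageSize\":\"15\",\"pageNo\":\"2\"}"
        response = session.post(self.start_url, headers=self.headers, data=data2).text
        print(response)

if __name__ == '__main__':
    w = wSpider()
    w.parse_start_url()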
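Hand-escaping the string gets tedious once the page number has to change. One option is to build it with json.dumps. The helper name `build_body` and its parameters are hypothetical. The values are kept as strings because that is how they appear in the captured body; whether the server would also accept integers is not tested here.

```python
import json

def build_body(page_no, page_size=15):
    """Serialize the search parameters in the same form as the captured string."""
    payload = {
        "categoryCode": "GovernmentProcurement",
        "pageSize": str(page_size),
        "pageNo": str(page_no),
    }
    # separators=(",", ":") drops the spaces json.dumps adds by default.
    return json.dumps(payload, separators=(",", ":"))

data2 = build_body(2)
print(data2)
```

The result can then be passed as data=build_body(n) together with the same headers used in wSpider.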
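When POSTing, choose the approach case by case. Roughly speaking, use data= for form submissions and json= when a dict has to be sent as a JSON body; a JSON string that is already serialized can also go through data=, as in the last example. An XHR breakpoint can also be used to confirm exactly what data is being sent.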