Scrapy request class hierarchy
Request
|-- FormRequest
The examples below are tested against these endpoints:
GET: https://httpbin.org/get
POST: https://httpbin.org/post
import json

from scrapy import Spider, Request, cmdline


class SpiderRequest(Spider):
    name = "spider_request"

    def start_requests(self):
        # name=tom travels in the query string; age is placed in the body.
        url = "https://httpbin.org/get?name=tom"
        yield Request(url, body=json.dumps({"age": "23"}))

    def parse(self, response):
        print(response.text)


if __name__ == '__main__':
    cmdline.execute("scrapy crawl spider_request".split())
The response shows the name parameter from the URL query string under args, but the age parameter placed in the body does not appear:
"args": {
"name": "tom"
},
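Because only the query string shows up under args for a GET request, one way to pass several parameters is to encode them into the URL. The following is a minimal sketch using the standard library's `urlencode`; the helper name `build_get_url` and the choice to move `age` into the query string are illustrative assumptions, not part of the original spider:

```python
from urllib.parse import urlencode


def build_get_url(base_url, params):
    """Append URL-encoded query parameters to base_url (hypothetical helper)."""
    return f"{base_url}?{urlencode(params)}"


# Both parameters now travel in the query string instead of the body.
url = build_get_url("https://httpbin.org/get", {"name": "tom", "age": "23"})
print(url)
```

The resulting URL can be passed to `Request(url)` in `start_requests` as in the spider above.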
from scrapy import Spider, cmdline, FormRequest


class SpiderFormData(Spider):
    name = "spider_form_data"

    def start_requests(self):
        # formdata is sent as a URL-encoded form body.
        url = "https://httpbin.org/post"
        yield FormRequest(url, formdata={"name": "Tom"})

    def parse(self, response):
        print(response.text)


if __name__ == '__main__':
    cmdline.execute("scrapy crawl spider_form_data".split())
The server receives the parameter under form:
"form": {
"name": "Tom"
},
The request headers also carry a Content-Type entry:
"headers": {
"Content-Type": "application/x-www-form-urlencoded",
},
Request
To send a POST with a plain Request, pass the argument method="POST".
import json

from scrapy import Spider, Request, cmdline


class SpiderPost(Spider):
    name = "spider_post"

    def start_requests(self):
        # A plain Request defaults to GET, so the method is set explicitly.
        url = "https://httpbin.org/post"
        yield Request(url, method="POST", body=json.dumps({"name": "Tom"}))

    def parse(self, response):
        print(response.text)


if __name__ == '__main__':
    cmdline.execute("scrapy crawl spider_post".split())
1. Sending the POST request as is: the server receives the payload under data and json:
"data": "{\"name\": \"Tom\"}",
"form": {},
"json": {
"name": "Tom"
},
2. If the following headers argument is added to the request:
"headers": {
"Content-Type": "application/x-www-form-urlencoded",
},
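the server parses the body as form data, which is how FormRequest submits. Because the body is still a JSON string, the whole string ends up as a single form key with an empty value: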
"data": "",
"form": {
"{\"name\": \"Tom\"}": ""
},
"json": null,
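The odd-looking form entry can be reproduced offline. A parser that trusts the `application/x-www-form-urlencoded` header splits the body on `&` and `=`; a JSON string contains neither, so the whole string becomes one key with an empty value. The following is a sketch that uses the standard library's `parse_qs` as a stand-in for the server-side parser (that httpbin parses the body the same way is an assumption):

```python
import json
from urllib.parse import parse_qs, urlencode

json_body = json.dumps({"name": "Tom"})  # body used in the POST spider
form_body = urlencode({"name": "Tom"})   # body a form submission would carry

# keep_blank_values=True keeps keys that have no "=value" part.
parsed_json_body = parse_qs(json_body, keep_blank_values=True)
parsed_form_body = parse_qs(form_body, keep_blank_values=True)

print(parsed_json_body)  # the JSON text becomes a single key
print(parsed_form_body)  # a proper name/value pair
```

So if the form Content-Type is set on a plain Request, the body should be URL-encoded rather than JSON for the server to see a `name` field.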
3. If the following headers argument is added to the request:
"headers": {
"Content-Type": "application/json",
},
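The server receives data and json, the same as in case 1. Some servers, however, return an error when this header is missing, so it is safer to set it for JSON bodies: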
"data": "{\"name\": \"Tom\"}",
"form": {},
"json": {
"name": "Tom"
},
HTTP method | Request class | headers argument | Parameters | Where the server sees them |
---|---|---|---|---|
GET | Request | - | ?name=tom | args |
POST | FormRequest | set by default | formdata={"name": "Tom"} | form |
POST | Request | - | body=json.dumps({"name": "Tom"}) | data, json |
POST | Request | "Content-Type": "application/x-www-form-urlencoded" | body=json.dumps({"name": "Tom"}) | form |
POST | Request | "Content-Type": "application/json" | body=json.dumps({"name": "Tom"}) | data, json |
References
Scrapy Requests and Responses