Many sites drop a cookie on the client when you visit them, so a script that automates access has to carry that cookie too; otherwise much of the functionality simply won't work.
#!/usr/bin/env python
import http.cookiejar
import urllib.request
import urllib.parse
import http.client
import time
import mimetypes
import random
import string

# Boundary separating the multipart parts; body lines prefix it with '--'.
BOUNDARY = '---------------------------96951961826872'

def random_string(length):
    # Unused here (the boundary is hardcoded), but handy for generating one.
    return ''.join(random.choice(string.ascii_letters) for _ in range(length))

def encode_multipart_data(data, files):
    """Build a multipart/form-data body and the matching headers."""
    def get_content_type(filename):
        return mimetypes.guess_type(filename)[0] or 'application/octet-stream'

    def encode_field(field_name):
        return ('--' + BOUNDARY,
                'Content-Disposition: form-data; name="%s"' % field_name,
                '',
                str(data[field_name]))

    def encode_file(field_name):
        filename = files[field_name]
        with open(filename, 'rb') as f:
            content = f.read()
        return ('--' + BOUNDARY,
                'Content-Disposition: form-data; name="%s"; filename="%s"'
                % (field_name, filename),
                'Content-Type: %s' % get_content_type(filename),
                '',
                content)

    lines = []
    for name in data:
        lines.extend(encode_field(name))
    for name in files:
        lines.extend(encode_file(name))
    lines.append('--' + BOUNDARY + '--')
    lines.append('')

    # Join with CRLF; file contents are already bytes, everything else is str.
    body = b'\r\n'.join(x if isinstance(x, bytes) else x.encode('ascii')
                        for x in lines)
    headers = {'Content-Type': 'multipart/form-data; boundary=' + BOUNDARY,
               'Content-Length': str(len(body))}
    return body, headers
def main():
    # Install a cookie-aware opener so the login cookie is sent on later requests.
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    urllib.request.install_opener(opener)

    # Log in; the server's Set-Cookie ends up in the CookieJar.
    postdata = urllib.parse.urlencode(
        {'user': 123, 'password': 123, 'cookie': 'on'}).encode('ascii')
    r = urllib.request.urlopen("#url#", postdata)
    time.sleep(2)

    data = {}
    files = {'notePhoto': '01.jpg'}
    #connection = http.client.HTTPConnection (req.get_host ())
    #print(req.get_selector())
    #connection.request ('POST', req.get_selector(),  # this version cannot carry the cookie
    #                    *encode_multipart_data (data, files))
    #response = connection.getresponse ()
    body, headers = encode_multipart_data(data, files)
    req = urllib.request.Request("#url#", body, headers)
    response = urllib.request.urlopen(req)

if __name__ == '__main__':
    main()
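A quick way to sanity-check a hand-rolled body like the one above is to feed it back through the standard library's email parser, which understands multipart boundaries. This is a minimal sketch with made-up field values (the fake JPEG bytes stand in for a real file):

```python
from email.parser import BytesParser
from email.policy import default

boundary = '---------------------------96951961826872'
# Part lines use '--' + boundary; the closing line adds a trailing '--'.
body = b'\r\n'.join([
    b'--' + boundary.encode('ascii'),
    b'Content-Disposition: form-data; name="user"',
    b'',
    b'123',
    b'--' + boundary.encode('ascii'),
    b'Content-Disposition: form-data; name="notePhoto"; filename="01.jpg"',
    b'Content-Type: image/jpeg',
    b'',
    b'\xff\xd8\xff\xe0',  # fake JPEG header bytes, illustration only
    b'--' + boundary.encode('ascii') + b'--',
    b'',
])
raw = (b'Content-Type: multipart/form-data; boundary='
       + boundary.encode('ascii') + b'\r\n\r\n' + body)
msg = BytesParser(policy=default).parsebytes(raw)
parts = list(msg.iter_parts())
print(len(parts))               # 2
print(parts[1].get_filename())  # 01.jpg
```

If the boundary lines or blank separators are wrong, the parser will merge or drop parts, which shows up immediately in the part count.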
I started with HTTPConnection, but there was no way to attach the cookie when making the request, so I switched to urllib.request.Request. Here the user manual misled me:
class urllib.request.Request(url[, data][, headers][, origin_req_host][, unverifiable])
This class is an abstraction of a URL request.
url should be a string containing a valid URL.
data may be a string specifying additional data to send to the server, or None if no such data is needed. Currently HTTP requests are the only ones that use data; the HTTP request will be a POST instead of a GET when the data parameter is provided. data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.parse.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format.
The passage marked in red made me dismiss the Request object at first, but in fact data can also be a multipart/form-data body!
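In other words, despite the manual's wording, data is not limited to urlencoded strings: any bytes payload works as long as the Content-Type header describes it. A minimal sketch (the URL and boundary are placeholders; no request is actually sent):

```python
import urllib.request

# A hand-built multipart body as raw bytes.
body = (b'--BOUNDARY\r\n'
        b'Content-Disposition: form-data; name="user"\r\n'
        b'\r\n'
        b'123\r\n'
        b'--BOUNDARY--\r\n')
req = urllib.request.Request(
    'http://example.com/upload',  # placeholder URL
    data=body,
    headers={'Content-Type': 'multipart/form-data; boundary=BOUNDARY'},
)
print(req.get_method())  # POST: supplying data turns the GET into a POST
```

Passing the request through an opener built with HTTPCookieProcessor, as in the script above, is what attaches the stored cookie automatically.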