Uploading a file with cookies in Python 3

http://hi.baidu.com/%CD%D1%CF%C2%D0%AC%D7%D3
2009-12-26 14:23

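    Many websites now leave a cookie on the client when you visit them. A script that automates such a site has to send that cookie back as well; otherwise many features simply will not work.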

 

#!/usr/bin/env python3
import http.cookiejar, urllib.request, http.client
import urllib.parse
import time
import re
import logging, random, string
import mimetypes

def random_string (length):
    # Return exactly `length` random ASCII letters (the original produced length + 1)
    return ''.join (random.choice (string.ascii_letters) for ii in range (length))

def encode_multipart_data (data, files):
    # The Content-Type header announces the boundary; inside the body every
    # delimiter line is '--' followed by that boundary.
    boundary = '---------------------------96951961826872'

    def get_content_type (filename):
        return mimetypes.guess_type (filename)[0] or 'application/octet-stream'

    def encode_field (field_name):
        # Part headers end with an empty line; the value is followed by CRLF
        return ('--' + boundary + '\r\n',
                'Content-Disposition: form-data; name="%s"\r\n' % field_name,
                '\r\n',
                str (data [field_name]), '\r\n')

    def encode_file (field_name):
        filename = files [field_name]
        with open (filename, 'rb') as f:
            content = f.read ()
        return ('--' + boundary + '\r\n',
                'Content-Disposition: form-data; name="%s"; filename="%s"\r\n' % (field_name, filename),
                'Content-Type: %s\r\n' % get_content_type (filename),
                '\r\n',
                content, '\r\n')

    lines = []
    for name in data:
        lines.extend (encode_field (name))
    for name in files:
        lines.extend (encode_file (name))
    # Closing delimiter: '--' + boundary + '--'
    lines.append ('--' + boundary + '--\r\n')
    body = b''
    for x in lines:
        # Text pieces are encoded; file contents are already bytes
        if isinstance (x, str):
            body += x.encode ('utf-8')
        else:
            body += x
    headers = {'Content-Type': 'multipart/form-data; boundary=' + boundary,
               'Content-Length': str (len (body))}

    return body, headers

def main():
    # '#url#' is the placeholder from the original post; substitute the real
    # login and upload addresses.
    url = "#url#"
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    # After install_opener, every urlopen() call goes through the cookie jar
    urllib.request.install_opener(opener)
    # In Python 3 the POST body must be bytes, so encode the urlencoded string
    postdata = urllib.parse.urlencode({'user':123,'password':123, 'cookie': 'on'}).encode('ascii')
    r = urllib.request.urlopen(url, postdata)
    time.sleep(2)

    #connection = http.client.HTTPConnection (req.get_host ())
    #print(req.get_selector())
    data = {}
    files = {'notePhoto': '01.jpg'}
    #connection.request ('POST', req.get_selector(),   # this approach cannot carry the cookie
    #     *encode_multipart_data (data, files))
    #response = connection.getresponse ()
    req = urllib.request.Request (url, *encode_multipart_data (data, files))
    response = urllib.request.urlopen(req)

if __name__ == '__main__':
    main()
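
The difference between the opener and the commented-out HTTPConnection variant can be checked without a real website. The following is only a sketch under assumptions: the local test server, its /login and /upload paths, and the cookie value session=abc123 are all made up for illustration, and the empty ProxyHandler is there only so that the loopback request bypasses any configured proxy. It also shows that the jar can fill in the Cookie header by hand through add_cookie_header if http.client really has to be used.

```python
import http.client
import http.cookiejar
import http.server
import threading
import urllib.request


class Handler(http.server.BaseHTTPRequestHandler):
    """Hypothetical test server: /login sets a cookie, every reply echoes the Cookie header."""

    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        self.rfile.read(length)                  # consume the request body
        cookie = self.headers.get('Cookie', '')  # what the client sent back, if anything
        reply = cookie.encode('ascii')
        self.send_response(200)
        if self.path == '/login':
            self.send_header('Set-Cookie', 'session=abc123; Path=/')
        self.send_header('Content-Length', str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):                # keep the demo quiet
        pass


server = http.server.HTTPServer(('127.0.0.1', 0), Handler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address
base = 'http://%s:%d' % (host, port)

cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({}),             # no proxy for the local test server
    urllib.request.HTTPCookieProcessor(cj))

# 1. Log in through the opener: the Set-Cookie header lands in the jar.
opener.open(base + '/login', b'user=123').read()

# 2. A later request through the same opener sends the cookie back.
via_opener = opener.open(base + '/upload', b'payload').read()

# 3. A bare HTTPConnection knows nothing about the jar.
conn = http.client.HTTPConnection(host, port)
conn.request('POST', '/upload', b'payload')
via_plain = conn.getresponse().read()
conn.close()

# 4. The jar can still compute the Cookie header, which is then passed along by hand.
req = urllib.request.Request(base + '/upload', b'payload')
cj.add_cookie_header(req)
conn = http.client.HTTPConnection(host, port)
conn.request('POST', '/upload', b'payload', {'Cookie': req.get_header('Cookie')})
via_manual = conn.getresponse().read()
conn.close()

server.shutdown()
server.server_close()
print(via_opener, via_plain, via_manual)
```

The reply to the bare HTTPConnection request is empty because no Cookie header was sent, while the other two replies echo the session cookie.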

 

      At first I used HTTPConnection, but its request could not carry the cookie, so I switched to urllib.request.Request. This is where the user manual misled me:

class urllib.request.Request(url[, data][, headers][, origin_req_host][, unverifiable])

This class is an abstraction of a URL request.

url should be a string containing a valid URL.

data may be a string specifying additional data to send to the server, or None if no such data is needed. Currently HTTP requests are the only ones that use data; the HTTP request will be a POST instead of a GET when the data parameter is provided. data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.parse.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format.

 

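      The sentence marked in red (that data should be in the standard application/x-www-form-urlencoded format) made me dismiss the Request object at first. In fact, data can just as well be a multipart/form-data body!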
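
One way to convince yourself of this is to build a small multipart body, hand it to Request, and parse it back. This is a minimal sketch rather than the script above: build_multipart is a hypothetical helper, the field names, file content, and the example.invalid URL are invented, and a random uuid-based boundary is swapped in for the hard-coded one. Nothing is sent over the network; the email package is used only to confirm that the body is well-formed MIME.

```python
import email
import urllib.request
import uuid


def build_multipart(fields, files):
    """Hypothetical helper: fields maps name -> str, files maps name -> (filename, bytes)."""
    boundary = uuid.uuid4().hex              # random boundary instead of a hard-coded one
    parts = []
    for name, value in fields.items():
        parts.append(('--%s\r\n'
                      'Content-Disposition: form-data; name="%s"\r\n'
                      '\r\n' % (boundary, name)).encode('utf-8')
                     + value.encode('utf-8') + b'\r\n')
    for name, (filename, content) in files.items():
        parts.append(('--%s\r\n'
                      'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
                      'Content-Type: application/octet-stream\r\n'
                      '\r\n' % (boundary, name, filename)).encode('utf-8')
                     + content + b'\r\n')
    parts.append(('--%s--\r\n' % boundary).encode('ascii'))   # closing delimiter
    body = b''.join(parts)
    headers = {'Content-Type': 'multipart/form-data; boundary=' + boundary}
    return body, headers


body, headers = build_multipart({'title': 'holiday'},
                                {'notePhoto': ('01.jpg', b'fake image bytes')})

# Request accepts any bytes as data; the Content-Type header tells the server how to read them.
req = urllib.request.Request('http://example.invalid/upload', body, headers)
method = req.get_method()                    # a Request with data is sent as POST

# Parse the body back with the email package to check the part structure.
msg = email.message_from_bytes(
    b'Content-Type: ' + headers['Content-Type'].encode('ascii') + b'\r\n\r\n' + body)
parsed = {part.get_param('name', header='content-disposition'): part.get_payload(decode=True)
          for part in msg.get_payload()}
print(method, parsed)
```

If the boundary in the header and the delimiters in the body did not match, the parser would not find the two parts, which makes this a cheap sanity check before pointing the script at a real server.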
