Setting up a proxy in Scrapy

http://stackoverflow.com/questions/4710483/scrapy-and-proxies


Add a file middlewares.py in the same directory as settings.py:

import base64

class ProxyMiddleware(object):
    # Overwrite process_request
    def process_request(self, request, spider):
        # Set the location of the proxy
        request.meta['proxy'] = "http://YOUR_PROXY_IP:PORT"

        # Use the following lines if your proxy requires authentication
        proxy_user_pass = "USERNAME:PASSWORD"
        # Set up basic authentication for the proxy; on Python 3,
        # b64encode takes bytes and returns bytes, so encode/decode explicitly
        encoded_user_pass = base64.b64encode(proxy_user_pass.encode('utf-8')).decode('ascii')
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass

Many answers online use base64.encodestring to encode proxy_user_pass. This breaks when the username is long, because encodestring wraps its output with a newline every 76 characters, and a newline inside the Proxy-Authorization header corrupts it. b64encode always produces a single unbroken line, so it is the recommended choice.
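The difference can be demonstrated directly. The sketch below uses base64.encodebytes, the Python 3 name for the old encodestring, and a made-up long credential string purely for illustration:

```python
import base64

# A hypothetical long "username:password" pair -- over 57 bytes, so the
# Base64 output exceeds the 76-character MIME line limit
creds = ("someverylongusername" * 4 + ":secret").encode("utf-8")

# b64encode produces a single unbroken line -- safe for an HTTP header
single_line = base64.b64encode(creds)
assert b"\n" not in single_line

# encodebytes (encodestring in Python 2) wraps the output with a newline
# every 76 characters, which corrupts the Proxy-Authorization header
wrapped = base64.encodebytes(creds)
assert b"\n" in wrapped
```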

Then enable it in settings.py by adding projectname.middlewares.ProxyMiddleware: 1 to DOWNLOADER_MIDDLEWARES.
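For instance, assuming the project is named myproject (substitute your own project name), the settings.py entry would look like this; the low priority value 1 makes the middleware run early in the downloader chain:

```python
# settings.py -- register the proxy middleware
# (project name "myproject" is a placeholder; use your own)
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.ProxyMiddleware': 1,
}
```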
