Python requests proxy issue

      • Background
      • Reproducing the problem
        • Code
        • Error message
      • Analysis

Background

While learning how to use proxies with the Python requests module, I ran into a request error, so I am recording it here.

Reproducing the problem

Code:

import requests

proxy = {
    'http': 'xxx.xxx.xx.xxx:808',
    'https': 'xxx.xxx.xx.xxx:8080'
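}  # proxy IP and port; note that neither value has a scheme prefix

# Send a request to the page to be crawled
response = requests.get('http://www.xxxxxx.com/', proxies=proxy)
print(response.content.decode())  # call decode() to print the body as text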

Error message

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 412, in send
    conn = self.get_connection(request.url, proxies)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 309, in get_connection
    proxy_manager = self.proxy_manager_for(proxy)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 193, in proxy_manager_for
    manager = self.proxy_manager[proxy] = proxy_from_url(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/poolmanager.py", line 536, in proxy_from_url
    return ProxyManager(proxy_url=url, **kw)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/poolmanager.py", line 480, in __init__
    raise ProxySchemeUnknown(proxy.scheme)
urllib3.exceptions.ProxySchemeUnknown: Proxy URL had no scheme, should start with http:// or https://

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/kahnlau/Documents/coding/python/jinruicoding/zero_foundation_to_learn_python/chapter_14_web_spider/code1406_proxy.py", line 14, in <module>
    response = requests.get('http://www.mingrisoft.com/', proxies=proxy)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 414, in send
    raise InvalidURL(e, request=request)
requests.exceptions.InvalidURL: Proxy URL had no scheme, should start with http:// or https://

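Analysis

The last line is the key: "requests.exceptions.InvalidURL: Proxy URL had no scheme, should start with http:// or https://"
In other words, the proxy URL must start with http:// or https://.
After prepending "http://" and "https://" to the corresponding proxy addresses and running the script again, the request succeeded.
From what I found, the accepted proxy format changed after Python 3.7. The traceback also shows that the exception is raised inside urllib3 (ProxyManager in poolmanager.py), so the installed requests and urllib3 versions likely matter as well. Either way, there are two solutions:

  1. Downgrade Python to 3.7 or lower
  2. Write the proxy URLs in the new format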
proxy = {
    'http': 'http://xxx.xxx.xx.xxx:808',
    'https': 'https://xxx.xxx.xx.xxx:8080'
}  # proxy IP and port, now with a scheme prefix
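
If proxy addresses come from a list or config file in bare "ip:port" form, the scheme can be added in code before the dict is handed to requests. The following is only a minimal sketch: the helper names ensure_scheme and normalize_proxies are hypothetical (they are not part of requests), the address 192.0.2.1 is a documentation-only placeholder, and defaulting to http:// is an assumption that should be replaced by whatever scheme your proxy actually expects.

```python
def ensure_scheme(proxy_url, default_scheme="http"):
    """Return proxy_url unchanged if it already has a scheme, else prepend one.

    Hypothetical helper, not part of requests.
    """
    if "://" in proxy_url:
        return proxy_url
    return f"{default_scheme}://{proxy_url}"


def normalize_proxies(proxies, default_scheme="http"):
    """Apply ensure_scheme to every value of a requests-style proxies dict."""
    return {key: ensure_scheme(url, default_scheme) for key, url in proxies.items()}


# 192.0.2.1 is a documentation-only address (RFC 5737), used as a placeholder.
raw = {"http": "192.0.2.1:808", "https": "http://192.0.2.1:8080"}
proxy = normalize_proxies(raw)
print(proxy)
```

The resulting proxy dict can then be passed as proxies=proxy to requests.get, exactly as in the original script. Entries that already carry a scheme are left untouched, so the helper is safe to run on mixed input.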
