While learning how to use proxies with the Python requests module, I ran into a request error, so I am recording it here.
import requests
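proxy = {
    'http': 'xxx.xxx.xx.xxx:808',
    'https': 'xxx.xxx.xx.xxx:8080'
}  # proxy IP addresses and their ports
# send a request to the page to be scraped
response = requests.get('http://www.xxxxxx.com/', proxies=proxy)
# decode is a method and must be called, otherwise only the method object is printed
print(response.content.decode())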
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 412, in send
conn = self.get_connection(request.url, proxies)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 309, in get_connection
proxy_manager = self.proxy_manager_for(proxy)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 193, in proxy_manager_for
manager = self.proxy_manager[proxy] = proxy_from_url(
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/poolmanager.py", line 536, in proxy_from_url
return ProxyManager(proxy_url=url, **kw)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/poolmanager.py", line 480, in __init__
raise ProxySchemeUnknown(proxy.scheme)
urllib3.exceptions.ProxySchemeUnknown: Proxy URL had no scheme, should start with http:// or https://
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/kahnlau/Documents/coding/python/jinruicoding/zero_foundation_to_learn_python/chapter_14_web_spider/code1406_proxy.py", line 14, in <module>
response = requests.get('http://www.mingrisoft.com/', proxies=proxy)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 414, in send
raise InvalidURL(e, request=request)
requests.exceptions.InvalidURL: Proxy URL had no scheme, should start with http:// or https://
The last line is the key: "requests.exceptions.InvalidURL: Proxy URL had no scheme, should start with http:// or https://"
In other words, the proxy URL has to start with http:// or https://.
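To make that requirement concrete, here is a minimal sketch of a check that lists which entries of a proxy dict lack an explicit scheme. The helper name `entries_missing_scheme` and the 192.0.2.10 addresses are my own placeholders for illustration; they are not part of requests.

```python
# Hypothetical helper (not part of requests): list the proxy entries
# whose URL does not start with an explicit http:// or https:// scheme.
ACCEPTED_SCHEMES = ("http://", "https://")


def entries_missing_scheme(proxies):
    """Return the keys of `proxies` whose value lacks an accepted scheme."""
    return [
        key
        for key, url in proxies.items()
        if not url.lower().startswith(ACCEPTED_SCHEMES)
    ]


# Placeholder addresses from the documentation range, not real proxies.
bad_proxy = {
    "http": "192.0.2.10:808",
    "https": "192.0.2.10:8080",
}
print(entries_missing_scheme(bad_proxy))  # ['http', 'https']
```

Running a check like this before calling requests.get would point at the offending entries directly, without having to read the whole traceback.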
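I added "http://" and "https://" in front of the corresponding IP addresses, ran the script again, and the request succeeded.
From what I read, the accepted proxy format changed in Python versions after 3.7. Either way, the traceback shows that this environment (Python 3.9) rejects a proxy URL without a scheme, so the fix is to write the scheme explicitly: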
proxy = {
    'http': 'http://xxx.xxx.xx.xxx:808',
    'https': 'https://xxx.xxx.xx.xxx:8080'
}  # proxy IP addresses and their ports, each with an explicit scheme
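If the proxy addresses come from a list or a config file, the same fix can be applied automatically. The following is only a sketch under my own assumptions: `add_default_scheme` is a hypothetical helper, the addresses are placeholders, and defaulting to "http" is a guess that suits proxies speaking plain HTTP. The fix above used https:// for the https entry, so pass a different default if your proxy needs it.

```python
def add_default_scheme(proxies, default_scheme="http"):
    """Return a copy of `proxies` in which every URL has an explicit scheme.

    Hypothetical helper, not part of requests. Entries that already
    contain "://" are left untouched.
    """
    fixed = {}
    for key, url in proxies.items():
        if "://" in url:
            fixed[key] = url
        else:
            fixed[key] = f"{default_scheme}://{url}"
    return fixed


# Placeholder addresses from the documentation range, not real proxies.
proxy = add_default_scheme({
    "http": "192.0.2.10:808",
    "https": "https://192.0.2.10:8080",
})
print(proxy)
# {'http': 'http://192.0.2.10:808', 'https': 'https://192.0.2.10:8080'}
```

The returned dict can then be passed as proxies=proxy to requests.get, exactly like the hand-written one.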