python-requests库学习

Requests: HTTP for Humans


requests

利用requests发送请求

以github API为例
https://developer.github.com/

请求方法

python-requests库学习_第1张图片
请求方法.jpg

example

response = requests.get(build_uri('user/emails'),auth = ('[username]','[password]'))

带参数的请求

  • url参数

  • 表单参数提交

response = requests.get(build_uri('users'), params={'since': 11})
  • json参数提交
response = requests.patch(build_uri('user'), auth=(
      '[username]', '[password]'), json={'name': '[new name]'})

请求异常处理

  • 请求超时处理
    try:
        response = requests.get(build_uri('user/emails'), timeout=10)
        response.raise_for_status()
    except exceptions.Timeout as e:
        print e.message
    except exceptions.HTTPError as e:
        print e.message
    else:
        print_response(response)

自定义Request

python-requests库学习_第2张图片
requests过程.jpg
    from requests import Request, Session
    s = Session()
    headers = {'User-Agent': 'fake1.3.4'}
    req = Request('GET', build_uri('user/emails'),
                  auth=('[username]', '[password]'), headers=headers)
    prepped = req.prepare()
    print prepped.body
    print prepped.headers

    resp = s.send(prepped,timeout = 5)
    print resp.request.headers
    print resp.text

响应请求 response

响应基本api

  • status_code
  • reason
code reason
2XX 成功
3XX 重定向
4XX 客户端错误
5XX 服务器错误
  • headers
  • url
  • history
  • elapsed
  • request
    与响应主体有关
  • encoding
  • raw
  • content
  • text
  • json

可在python交互模式下获得response,并进行查看相关的属性结果

requests库下载图片

  • 利用爬虫自动下载图片
  • 远程下载服务器上的文本文件
python-requests库学习_第3张图片
download_img.jpg

事件钩子(Event Hooks)

非线性处理事件

python-requests库学习_第4张图片
eventHook.jpg
def get_key_info(response, *args, **kwargs):
    print response.headers['Content-Type']

def main():
    requests.get('https://api.github.com', hooks=dict(response=get_key_info))

进阶话题

HTTP认证

  • oauth认证
from requests.auth import AuthBase


class GithubAuth(AuthBase):

    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        r.headers['Authorization'] = ' '.join(['token', self.token])
        return r


def oauth_advanced():
    auth = GithubAuth('db8c8e8279b82af279b25c3a68168efe1f341f90')
    response = requests.get(construct_url('user/emails'), auth=auth)
    print response.status_code, response.reason
    print response.text

代理(Proxy)

翻墙原理

  1. 启动代理服务Heroku(类似阿里云)
    2, 在主机1080端口启动Socks服务
  2. 将请求转发大1080端口
  3. 获取相应资源

pip install requests[socksv5]

import requests
proxies = {'http':'socks5://127.0.0.1:1080','https':'socks5://127.0.0.1:1080'}
response = requests.get(url,proxies = proxies,timeout = 10)

Session和Cookie

Session 服务器端存储信息
Cookie 浏览器端存储信息

  • Cookie
r = requests.get(url)
r.cookies['cookiename']
#解析
cookies = dict(c = 'uid')
requests.get(url,cookies = cookies)

存储在浏览器端不够安全,而且带宽占用较大

  • session
    在浏览器中的cookie中仅存储session-id,相关信息存放在服务器中,解决了带宽和安全的问题

Think with the first principle
多动动脑子去想

慕课网学习视频地址:http://www.imooc.com/video/13160
我的代码:https://github.com/lamchan2844/mypython/tree/master/requests

你可能感兴趣的:(python-requests库学习)