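httplib is a relatively low-level HTTP request module. Higher-level wrappers are built on top of it, such as the built-in urllib module and third-party modules such as goto. The higher the level of wrapping, however, the less flexible a module becomes. For example, when a request fails, urllib does not return the content of the result page, only the header information. That makes it unsuitable when you need to inspect what a failed request returned, and in those cases you have to use httplib itself.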
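1. class httplib.HTTPConnection
timeout: the timeout for a single request. If it is omitted, the global default socket timeout is used.
Examples:
conn1 = HTTPConnection('www.baidu.com:80')
conn2 = HTTPConnection('www.baidu.com',80)
conn3 = HTTPConnection('www.baidu.com',80,True,10)
Incorrect example (the port argument is missing, so True is taken as the port and 10 as strict):
conn3 = HTTPConnection('www.baidu.com:80',True,10)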
HTTPSConnection takes key_file and cert_file right after the port:
conn3 = HTTPSConnection('accounts.google.com',443,key_file,cert_file,True,10)
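The constructor forms above can be checked without touching the network, because the socket is only opened when the first request is sent. The following is a minimal sketch, not the article's own code: the try/except import is an assumption added so that it also runs on Python 3, where the module is called http.client and the strict argument no longer exists, which is why the timeout is passed by keyword here.

```python
try:
    from httplib import HTTPConnection      # Python 2, as used in this article
except ImportError:
    from http.client import HTTPConnection  # Python 3 name of the same module

# Nothing is sent yet: these calls only record where to connect.
conn1 = HTTPConnection('www.baidu.com:80')    # port parsed out of the host string
conn2 = HTTPConnection('www.baidu.com', 80)   # port passed as its own argument
# Passing timeout by keyword avoids the positional mix-up shown above.
conn3 = HTTPConnection('www.baidu.com', 80, timeout=10)

for conn in (conn1, conn2, conn3):
    print("%s %s" % (conn.host, conn.port))
print(conn3.timeout)
```

Using keywords for everything after the port is probably the safer habit, since the positional order differs between Python versions.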
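conn.request('GET', '/', '', {'user-agent':'test'})
Returns: nothing. The call only sends the request; the response is fetched afterwards with getresponse().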
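res = conn.getresponse()
body = res.read()
pbody = res.read(10)
conn.close()
Note: res.read() returns the whole body and res.read(10) at most 10 bytes. Read what you need before calling conn.close().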
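To see these calls work together, and to see that httplib still hands back the body when the request fails, here is a rough sketch that runs against a throwaway local server instead of a real site. The handler class AlwaysNotFound, the ERROR_PAGE text and the helper fetch_error_page are hypothetical names made up for this illustration, and the try/except import is an assumption so that the code also runs on Python 3.

```python
import threading

try:  # Python 2 names, as used in this article
    from httplib import HTTPConnection
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # Python 3 names for the same modules
    from http.client import HTTPConnection
    from http.server import BaseHTTPRequestHandler, HTTPServer

ERROR_PAGE = b"<html><body>custom 404 page: nothing here</body></html>"


class AlwaysNotFound(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a 404 plus a body."""

    def do_GET(self):
        self.send_response(404)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(ERROR_PAGE)))
        self.end_headers()
        self.wfile.write(ERROR_PAGE)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def fetch_error_page():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), AlwaysNotFound)
    worker = threading.Thread(target=server.serve_forever)
    worker.daemon = True
    worker.start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", "/missing", headers={"user-agent": "test"})
        res = conn.getresponse()
        pbody = res.read(10)  # first 10 bytes of the body
        rest = res.read()     # everything that is left
        conn.close()          # close only after the body has been read
        return res.status, res.reason, pbody, rest
    finally:
        server.shutdown()
        server.server_close()


status, reason, pbody, rest = fetch_error_page()
print(status)
print(reason)
print(pbody + rest)
```

The status is 404, yet the full error page is still available through read(), which is the situation the introduction describes.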
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import httplib
import urllib

def sendhttp():
    # Form fields are URL-encoded into the POST body.
    data = urllib.urlencode({'@number': 12524, '@type': 'issue', '@action': 'show'})
    headers = {"Content-type": "application/x-www-form-urlencoded",
               "Accept": "text/plain"}
    conn = httplib.HTTPConnection('bugs.python.org')
    conn.request('POST', '/', data, headers)
    httpres = conn.getresponse()
    print httpres.status
    print httpres.reason
    print httpres.read()
    # Close the connection once the body has been read.
    conn.close()

if __name__ == '__main__':
    sendhttp()
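It may help to look at what the POST body in this example actually contains once urlencode has processed the fields. The sketch below is only an illustration: it passes a list of pairs instead of a dict so that the field order is fixed, and the try/except import is an assumption so that it also runs on Python 3, where the function lives in urllib.parse.

```python
try:
    from urllib import urlencode        # Python 2, as used in this article
except ImportError:
    from urllib.parse import urlencode  # Python 3 location of the same function

# A list of pairs keeps the field order fixed; a dict works as well.
fields = [('@number', 12524), ('@type', 'issue'), ('@action', 'show')]
data = urlencode(fields)
print(data)  # -> %40number=12524&%40type=issue&%40action=show
```

The "@" in each field name is percent-encoded as %40, which is why the Content-type header in the example has to be application/x-www-form-urlencoded.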
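Here is another simple example, which fetches the response headers:
import httplib

url = "www.baidu.com"
conn = httplib.HTTPConnection(url)
conn.request("GET", "/")
r = conn.getresponse()
# getheaders() returns a list of (name, value) pairs.
h = r.getheaders()
for hh in h:
    print hh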
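Of course, you can also use the urllib2 module to get the header information:
from urllib2 import urlopen, HTTPError, URLError

webURL = 'http://www.sina.com'
try:
    res = urlopen(webURL)
    print res.info()
    print res.geturl()
except HTTPError as e:
    # The server answered, but with an error status code.
    print e.code
except URLError as e:
    # The server could not be reached at all.
    print "Error!!", e.reason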
Of course, the simplest way is to use the curl command:
curl --head http://www.baidu.com
HTTP/1.1 200 OK
Date: Wed, 10 Apr 2013 10:12:35 GMT
Server: BWS/1.0
Content-Length: 10320
Content-Type: text/html;charset=utf-8
Cache-Control: private
Expires: Wed, 10 Apr 2013 10:12:35 GMT
Set-Cookie: H_PS_PSSID=1467_1945_1788_2141_2209; path=/; domain=.baidu.com
Set-Cookie: BAIDUID=3FB8134FCB07E7C1E557B841CAA1596E:FG=1; expires=Wed, 10-Apr-43 10:12:35 GMT; path=/; domain=.baidu.com
P3P: CP=" OTI DSP COR IVA OUR IND COM "
Connection: Keep-Alive
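What curl --head does can be approximated with httplib by sending a HEAD request, so that only the status line and headers come back. This is a minimal sketch under a few assumptions: DemoHandler and head are hypothetical names invented here, the local server stands in for a real website so that the code runs offline, and the try/except import is there so that it also works on Python 3.

```python
import threading

try:  # Python 2 names, as used in this article
    from httplib import HTTPConnection
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # Python 3 names for the same modules
    from http.client import HTTPConnection
    from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler so the sketch needs no real website."""

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html;charset=utf-8")
        self.send_header("Cache-Control", "private")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def head(host, port, path="/"):
    """Send a HEAD request and return (status, reason, header list)."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        res = conn.getresponse()
        res.read()  # HEAD responses carry no body, so this reads nothing
        return res.status, res.reason, res.getheaders()
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), DemoHandler)
worker = threading.Thread(target=server.serve_forever)
worker.daemon = True
worker.start()
try:
    status, reason, headers = head("127.0.0.1", server.server_address[1])
finally:
    server.shutdown()
    server.server_close()

print("%s %s" % (status, reason))
for name, value in headers:
    print("%s: %s" % (name, value))
```

To query a real site, head('www.baidu.com', 80) should behave the same way, although the headers returned will of course differ from the curl output above.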