http://www.92pifa.com/Product/?Show_3547.html
http://code.google.com/p/httplib2/wiki/Examples
First, go to the 92pifa home page and use Live HTTP headers to analyze the request that is submitted when logging in.
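The form fields seen in the captured login request can be encoded into a request body instead of being typed out by hand. The following is a minimal sketch, assuming the field names and values are the ones that appear in the request body used in this post; `login_fields` is a name introduced here only for illustration.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, which the rest of this post uses

# Field names and values copied from the login request body in this post.
# A list of pairs (rather than a dict) keeps the fields in the captured order.
login_fields = [
    ("username", "mlzboy"),
    ("password", "mlzboy"),
    ("verifycode", "3326"),
    ("loginforever", "1"),
    ("x", "34"),
    ("y", "14"),
]

body = urlencode(login_fields)
print(body)
```

The resulting string can be passed as the `body` argument of `http.request`, together with the `application/x-www-form-urlencoded` content type.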
Finally, we submit the request with the following code and read the value of the set-cookie header from the response.
import httplib2

http = httplib2.Http()
url = 'http://www.92pifa.com/CheckLogin.asp'
# Send the same headers the browser sent, including the session cookies captured with Live HTTP headers
headers = {'Content-type': 'application/x-www-form-urlencoded',
           'Cookie': 'AJSTAT_ok_times=6; ASPSESSIONIDCQCCSDQT=IILIIFGBPDNPEOBPPPMFEOMD; ASPSESSIONIDCSDAQBTT=BENGJBDCJFMBCCJBMFCIPCPN; YWZG=mypower=&agentid=&username=; AJSTAT_ok_pages=6'}
body = "username=mlzboy&password=mlzboy&verifycode=3326&loginforever=1&x=34&y=14"
response, content = http.request(url, 'POST', headers=headers, body=body)
print response
# Keep the cookie that the server set at login so it can be sent with later requests
headers = {'Cookie': response['set-cookie']}
print headers
We can then send these headers with later requests to fetch pages as a logged-in user.
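A set-cookie value carries attributes such as `expires` and `path` that are instructions to the client and are not normally sent back in a `Cookie` header. The following is a minimal sketch of trimming the value down to its name=value pair, assuming the response sets a single cookie; `cookie_header_from_set_cookie` is a hypothetical helper, not part of httplib2, and it would need extending if the server sets several cookies at once.

```python
def cookie_header_from_set_cookie(set_cookie):
    """Return only the leading name=value pair of a single Set-Cookie value."""
    # Everything after the first ';' is an attribute (expires, path, ...).
    return set_cookie.split(";", 1)[0].strip()

# Example value in the shape used by this post.
raw = "YWZG=mypower=3&agentid=1&username=mlzboy; expires=Mon, 30-Jul-2012 16:00:00 GMT; path=/"
headers = {"Cookie": cookie_header_from_set_cookie(raw)}
print(headers)
```

In the login code, the same helper could be applied to `response['set-cookie']` before building the headers for the next request.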
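# -*- coding: utf-8 -*-
import re

def gethtml(html, start, end):
    # Return the text between the start and end markers with layout tags removed
    s = html.find(start)
    e = html.find(end)
    return all(html[s + len(start):e])

def all(html):
    # Strip opening and closing font/div/td/tr tags (note: this name shadows the built-in all())
    result = html
    for elem in ["font", "div", "td", "tr"]:
        result = re.sub(r"<%s[\s\S]*?>" % elem, "", result)
        result = re.sub(r"</%s[\s\S]*?>" % elem, "", result)
    return result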
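import httplib2

http = httplib2.Http()
# Cookie value taken from the set-cookie header of the login response
headers = {'Cookie': 'YWZG=mypower=3&agentid=1&username=mlzboy; expires=Mon, 30-Jul-2012 16:00:00 GMT; path=/'}
url = 'http://www.92pifa.com/Product/?Show_3547.html'
response, content = http.request(url, 'GET', headers=headers)
# Convert the page from gb18030 to UTF-8 before searching for the markers
content = content.decode("gb18030").encode("utf8")
# "[现货数量]" is the "stock quantity" label and "[数量订购]" is the "order quantity" label on the page
print gethtml(content, "[现货数量]", "[数量订购]").strip()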
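The extraction step can be tried offline on a small HTML fragment, without logging in. The following is a minimal sketch of the same marker-and-strip idea, with two variations that are not in the code above: the end marker is searched for only after the start marker, and `None` is returned when a marker is missing. The names `strip_tags` and `extract_between` and the sample markup are made up for illustration and are not taken from the real page.

```python
import re

def strip_tags(html, tags=("font", "div", "td", "tr")):
    """Remove opening and closing tags for the given element names, keeping their text."""
    for tag in tags:
        html = re.sub(r"</?%s\b[^>]*>" % tag, "", html)
    return html

def extract_between(html, start, end):
    """Return the tag-stripped text between `start` and the next `end`, or None."""
    s = html.find(start)
    if s == -1:
        return None
    s += len(start)
    e = html.find(end, s)  # look for the end marker only after the start marker
    if e == -1:
        return None
    return strip_tags(html[s:e]).strip()

# Invented fragment that imitates a table row layout with two labels.
sample = ('<tr><td>[Stock]</td><td><font color="red">128</font></td></tr>'
          '<tr><td>[Order]</td></tr>')
print(extract_between(sample, "[Stock]", "[Order]"))
```

Returning `None` for a missing marker makes it easier to notice when the login has expired and the page no longer contains the expected labels.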