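I recently ran into a tricky problem that took me a long time to figure out.
The page: http://www.hkexnews.hk/sdw/search/mutualmarket_c.aspx
I wanted to fetch the data on it, but the page requires picking a date first, which means I have to send a POST request to the page.
Inspecting that request turned up two server-generated parameters whose values look random: __VIEWSTATE and __EVENTVALIDATION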

The solution is as follows:
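import re
import urllib.request

def get_hiddenvalue(url):
    # Fetch the page once with a plain GET so the server issues fresh hidden fields.
    request = urllib.request.Request(url)
    response = urllib.request.urlopen(request)
    html = response.read().decode('utf-8')  # Python 3: decode bytes to str
    # Patterns assume the standard ASP.NET markup:
    # <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="..." />
    VIEWSTATE = re.findall(r'id="__VIEWSTATE" value="(.*?)"', html, re.I)
    EVENTVALIDATION = re.findall(r'id="__EVENTVALIDATION" value="(.*?)"', html, re.I)
    return VIEWSTATE[0], EVENTVALIDATION[0]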

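Write a function that first requests the page with a GET, pull the two values out of the response, and then send the POST with them. Solved!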
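As an alternative to regular expressions, the hidden fields can be collected with the standard library's HTML parser, which also picks up any other hidden inputs on the form. The following is only a sketch: the names `HiddenFieldParser` and `extract_hidden_fields` are hypothetical, and the sample markup is made up for illustration rather than copied from the real page.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs from every <input type="hidden"> tag."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Made-up sample markup in the usual ASP.NET shape (not the real page).
sample = '''
<form>
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz789" />
<input type="text" name="q" value="ignored" />
</form>
'''
fields = extract_hidden_fields(sample)
print(fields["__VIEWSTATE"], fields["__EVENTVALIDATION"])
```

Because the parser reads attributes by name, it does not depend on the order of attributes or the spacing inside the tag, which a hand-written pattern would have to anticipate.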

The full source code:
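import re
import urllib.request

import requests

NIAN = '2017'  # year
YUE = '12'     # month
RI = '30'      # day
url = 'http://www.hkexnews.hk/sdw/search/mutualmarket_c.aspx'

def get_hiddenvalue(url):
    # Fetch the page once with a plain GET so the server issues fresh hidden fields.
    request = urllib.request.Request(url)
    response = urllib.request.urlopen(request)
    html = response.read().decode('utf-8')  # Python 3: decode bytes to str
    # Patterns assume the standard ASP.NET markup:
    # <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="..." />
    VIEWSTATE = re.findall(r'id="__VIEWSTATE" value="(.*?)"', html, re.I)
    EVENTVALIDATION = re.findall(r'id="__EVENTVALIDATION" value="(.*?)"', html, re.I)
    return VIEWSTATE[0], EVENTVALIDATION[0]

VIEWSTATE, EVENTVALIDATION = get_hiddenvalue(url)

# Form fields sent with the POST; the two hidden values must come from the GET above.
data = {
    '__EVENTVALIDATION': EVENTVALIDATION,
    '__VIEWSTATE': VIEWSTATE,
    '__VIEWSTATEGENERATOR': 'EC4ACD6F',
    'btnSearch.x': '23',
    'btnSearch.y': '12',
    'ddShareholdingDay': RI,
    'ddShareholdingMonth': YUE,
    'ddShareholdingYear': NIAN,
    'today': '20180509'
}
html_post = requests.post(url, data=data)
print(html_post.text)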
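To query several dates, the form data can be assembled by a small helper instead of a hand-written dict. This is a sketch under stated assumptions: `build_payload` is a hypothetical name, `hidden_fields` is assumed to be a dict of the page's hidden inputs taken from a fresh GET, and zero-padding the day and month to two digits is a guess about what the dropdowns expect.

```python
from datetime import date


def build_payload(hidden_fields, day):
    """Merge the page's hidden fields with the date-selection fields."""
    payload = dict(hidden_fields)  # copy so the caller's dict is not modified
    payload.update({
        "btnSearch.x": "23",  # click coordinates, same values as in the script
        "btnSearch.y": "12",
        # Assumption: the form expects two-digit day and month values.
        "ddShareholdingDay": f"{day.day:02d}",
        "ddShareholdingMonth": f"{day.month:02d}",
        "ddShareholdingYear": str(day.year),
    })
    return payload


# Placeholder hidden values for illustration; real ones come from the page.
payload = build_payload(
    {"__VIEWSTATE": "abc123", "__EVENTVALIDATION": "xyz789"},
    date(2017, 12, 30),
)
print(payload["ddShareholdingYear"], payload["ddShareholdingMonth"], payload["ddShareholdingDay"])
```

The result can be passed as `data=` to `requests.post`. Fetching the hidden fields again before each POST avoids reusing stale values.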