python3.3 urllib.error.HTTPError: HTTP Error 403: Forbidden

This error occurs because the site blocks crawlers. Adding a faked User-Agent header to the request makes it look like an ordinary browser visit.

import urllib.request

myurl = ""  # target URL
# Fake a browser User-Agent so the server does not reject the request as a crawler.
myheaders = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
req = urllib.request.Request(url=myurl, headers=myheaders)
data = urllib.request.urlopen(req).read()

Alternatively:

req = urllib.request.Request(myurl)
# add_header() sets a single header field after the Request object is created.
req.add_header('User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36')
data = urllib.request.urlopen(req).read()

I'm still not entirely clear on how the headers parameter works; I'll look into it later.
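For reference, headers is simply a dict of HTTP request header fields that gets sent along with the request. Below is a minimal sketch (the URL and the extra Referer header are placeholders for illustration, not from the original snippet) that also catches HTTPError so a remaining 403 can be inspected:

import urllib.request
import urllib.error

url = "http://example.com/"  # placeholder URL for illustration
headers = {
    # Each key/value pair is sent as one HTTP request header.
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'Referer': 'http://example.com/',
}

req = urllib.request.Request(url, headers=headers)
try:
    data = urllib.request.urlopen(req).read()
    print(len(data))
except urllib.error.HTTPError as e:
    # The server still refused the request (e.g. 403): inspect the status code.
    print('HTTP Error:', e.code, e.reason)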
