A Tiny Starter Framework for Web Crawlers


A simple basic framework:

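import requests

url = "http://www.baidu.com"

def getHTMLText(url):
    """Fetch url and return the page text, or an error message on failure."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # raise HTTPError for 4xx/5xx responses
        r.encoding = r.apparent_encoding  # use the encoding detected from the body
        return r.text
    except requests.RequestException:
        # Catch only request-related errors instead of using a bare except
        return "An exception occurred"

print(getHTMLText(url))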
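
One limitation of returning a message string on failure is that the caller cannot tell an error apart from real page content. The following is a minimal sketch of a variant that returns None instead. The function name fetch_html and the optional headers parameter are hypothetical names introduced here for illustration and are not part of the original framework.

```python
from typing import Optional

import requests


def fetch_html(url: str, headers: Optional[dict] = None) -> Optional[str]:
    """Return the page text, or None if the request fails.

    fetch_html and headers are illustrative names (assumptions),
    not part of the original framework.
    """
    try:
        r = requests.get(url, headers=headers, timeout=30)
        r.raise_for_status()  # raise HTTPError for 4xx/5xx responses
        r.encoding = r.apparent_encoding  # use the encoding detected from the body
        return r.text
    except requests.RequestException:
        # Covers connection errors, timeouts, HTTP errors and invalid URLs
        return None


# This string has no scheme, so requests rejects it before any network access
print(fetch_html("not-a-url"))
```

A caller can then write `html = fetch_html(url)` and check `if html is None:` to handle the failure explicitly.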

Example:

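import requests

url = "https://item.jd.com/2967929.html"
try:
    r = requests.get(url, timeout=30)  # timeout keeps the request from hanging forever
    r.raise_for_status()  # raise HTTPError for 4xx/5xx responses
    r.encoding = r.apparent_encoding
    print(r.text[:1000])  # show only the first 1000 characters
except requests.RequestException:
    print("Crawl failed")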

Output:
[Figure 1: screenshot of the run result]
