Fetching a PDF from a URL with Python and saving it locally

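Approach:

  1. Download the file with requests
  2. Wrap the downloaded content in an in-memory byte stream (io.BytesIO)
  3. Write the byte stream to a local file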
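
To make step 2 concrete, here is a minimal sketch of what wrapping bytes in `io.BytesIO` provides. The `sample` bytes are a made-up stand-in for downloaded content, not real data from any server.

```python
import io

# Hypothetical stand-in for the bytes a download would return.
sample = b"%PDF-1.4 example payload"

stream = io.BytesIO(sample)      # wrap the bytes in a file-like object
header = stream.read(5)          # reading advances the position, like a real file
stream.seek(0)                   # rewind before handing the stream to another reader
everything = stream.getvalue()   # full contents, regardless of the current position

print(header)
print(len(everything))
```

Because the stream behaves like an open binary file, it can be passed to code that expects a file object without touching the disk first.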

Code example

The approach is the same whether the file is an image, a PDF, or plain text:

def get_file_from_url(url_file):
    import requests
    import io
    send_headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36",
        "Connection": "keep-alive",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
        "Accept-Language": "zh-CN,zh;q=0.8"}
    # Fetch the file over HTTP; the timeout keeps a stalled server from hanging the call
    req = requests.get(url_file, headers=send_headers, timeout=30)
    req.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
    bytes_io = io.BytesIO(req.content)  # wrap the raw bytes in an in-memory stream
    with open('temp.pdf', 'wb') as file:
        file.write(bytes_io.getvalue())  # save to a local file
    # import time
    # time.sleep(2)  # pausing between requests is advisable
    return bytes_io


if __name__ == '__main__':
    url = "http://pdf.dfcfw.com/pdf/H301_AP201901241288245777_1.pdf"
    get_file_from_url(url)
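
The function above always writes to temp.pdf and holds the whole response in memory through `req.content`. For large files, or when saving several files, a streaming variant may be more practical. The sketch below is one possible shape, not the original author's method: `filename_from_url`, `download_to_file`, the `download.bin` fallback name, and the default chunk size and timeout are all assumptions introduced for illustration.

```python
import os
import posixpath
from urllib.parse import urlsplit, unquote


def filename_from_url(url, default="download.bin"):
    """Return the last path segment of the URL, or `default` if there is none."""
    name = posixpath.basename(unquote(urlsplit(url).path))
    return name or default


def download_to_file(url, dest_dir=".", chunk_size=8192, timeout=30):
    """Stream the response body to disk in chunks and return the saved path."""
    import requests  # imported lazily, mirroring the function above

    path = os.path.join(dest_dir, filename_from_url(url))
    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()  # fail on 4xx/5xx responses
        with open(path, "wb") as fh:
            # iter_content yields the body piece by piece instead of all at once
            for chunk in resp.iter_content(chunk_size=chunk_size):
                fh.write(chunk)
    return path
```

Calling `download_to_file(url)` with the URL from the example would save the file under the name taken from the URL path rather than temp.pdf.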

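Since the same function saves whatever the server sends back, it can be worth checking that the returned stream really holds a PDF before trusting the saved file. The sketch below relies on the fact that PDF files begin with the marker `%PDF-`; the helper name `looks_like_pdf` is hypothetical, and the check is only a quick sanity test, not full validation.

```python
import io

PDF_MAGIC = b"%PDF-"  # marker found at the start of a PDF file


def looks_like_pdf(stream):
    """Return True if the stream begins with the PDF marker.

    The stream position is restored afterwards, so the caller can keep reading.
    """
    pos = stream.tell()
    stream.seek(0)
    head = stream.read(len(PDF_MAGIC))
    stream.seek(pos)
    return head == PDF_MAGIC
```

It could be applied to the `BytesIO` object returned by `get_file_from_url`, for example to log a warning when a server answers with an HTML page instead of the expected document.
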