天气数据爬取

目录

  • 历史气象数据获取
    • 浏览器访问模拟

历史气象数据获取

主要的python包
requests
BeautifulSoup
re
pandas
lxml

浏览器访问模拟

根据浏览器Request-Header参数,让request模拟浏览器行为

import requests
from bs4 import BeautifulSoup
import re
import pandas as pd
 
url = 'https://www.wentian123.com/history/?location=%E5%98%89%E5%B3%AA%E5%85%B3&startdate=2024-01-01&enddate=2024-08-15'
header = {
   
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
'Accept-Encoding': 'gzip, deflate, br, zstd',
'Accept-Language': 'zh-CN,zh;q=0.9',
'Cache-Control': 'max-age=0',
'Connection': 'keep-alive',
'Cookie': 'Hm_lvt_452d5df9c96fd4e38bdb12c20493de8a=1724145184; HMACCOUNT=7E8A91446E19E40E; Hm_lvt_a1574f7ae5f0b9e15ea9a7c1cd8e90c2=1724145900; Hm_lpvt_a1574f7ae5f0b9e15ea9a7c1cd8e90c2=1724918557; Hm_lpvt_452d5df9c96fd4e38bdb12c20493de8a=1724923348',

你可能感兴趣的:(python地理数据处理,python,beautifulsoup,request)