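Using the Baidu Huiyan (百度慧眼) population migration platform, write a script that batch-generates the URLs for population migration data of the Yangtze River Delta urban agglomeration, then call the Python requests package and use the GET method to retrieve the returned JSON data. This captures the migration indices between each pair of the 26 Yangtze River Delta cities, with each city scraped in both the move-in and move-out directions (population flow). A similar method is used to scrape the floating-car migration-willingness data from the Amap (高德) big data platform (traffic flow).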
In the browser's Network panel there are two requests whose names begin with cityrank; their request URLs are:
http://huiyan.baidu.com/migration/cityrank.jsonp?dt=city&id=340800&type=move_in&callback=jsonp_1614915621455_2296255
http://huiyan.baidu.com/migration/cityrank.jsonp?dt=city&id=340800&type=move_in&date=20210304&callback=jsonp_1614915621926_879290
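The batch URL generation described above can be sketched roughly as follows. This is a minimal sketch under several assumptions: the `move_out` value is guessed as the counterpart of the captured `move_in`, the `callback` parameter is left out (the server may or may not accept that), the city list is an illustrative placeholder rather than the real 26 codes, and the helper names `build_cityrank_url`, `strip_jsonp`, and `fetch` are hypothetical.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://huiyan.baidu.com/migration/cityrank.jsonp"

# Placeholder list: only 340800 appears in the captured URLs. 310000 is added
# on the assumption that Baidu's id uses the same administrative codes.
CITY_IDS = ["340800", "310000"]

# "move_in" is taken from the captured URLs; "move_out" is an assumed counterpart.
DIRECTIONS = ["move_in", "move_out"]


def build_cityrank_url(city_id, direction, date=None):
    """Build one cityrank URL; date is a YYYYMMDD string or None."""
    params = {"dt": "city", "id": city_id, "type": direction}
    if date is not None:
        params["date"] = date
    return BASE_URL + "?" + urlencode(params)


def strip_jsonp(text):
    """Parse a JSONP body such as callback({...}); plain JSON also works."""
    text = text.strip()
    if text.startswith("{") or text.startswith("["):
        return json.loads(text)
    start = text.index("(") + 1
    end = text.rindex(")")
    return json.loads(text[start:end])


def fetch(url, timeout=10):
    """Fetch one URL with requests.get and return the parsed payload."""
    import requests  # imported here so the URL helpers work without the package

    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()
    return strip_jsonp(resp.text)


# One URL per (city, direction) pair for a single day.
urls = [build_cityrank_url(c, d, "20210304") for c in CITY_IDS for d in DIRECTIONS]
```

The only difference between the two captured URLs is the `date` parameter, so the sketch treats it as optional. Nothing is requested until `fetch` is called on one of the generated URLs.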
Yangtze River Delta: https://trp.autonavi.com/cityTravel/line.do?adcode=3&dt=2021-03-07&willReal=WILL&size=20
Chengdu-Chongqing: https://trp.autonavi.com/cityTravel/line.do?adcode=1&dt=2021-03-07&willReal=WILL&size=20
Beijing-Tianjin-Hebei: https://trp.autonavi.com/cityTravel/line.do?adcode=2&dt=2021-03-07&willReal=WILL&size=20
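Each returned record contains: start city, line-drawing coordinates, actual migration index realIdx, start-city area code, end-city area code, end city, and willingness migration index willIdx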
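The three line.do URLs differ only in `adcode`, so they can be generated from a small mapping. The following is a sketch, not a confirmed client: the region-to-adcode mapping (3, 1, 2) and the parameter names are read off the captured URLs, while the helper name `build_line_url` and the English region labels are hypothetical.

```python
from urllib.parse import urlencode

LINE_URL = "https://trp.autonavi.com/cityTravel/line.do"

# adcode values as they appear in the captured URLs.
REGION_ADCODES = {
    "Yangtze River Delta": 3,
    "Chengdu-Chongqing": 1,
    "Beijing-Tianjin-Hebei": 2,
}


def build_line_url(adcode, dt, will_real="WILL", size=20):
    """Build a line.do URL; dt is a YYYY-MM-DD string."""
    params = {"adcode": adcode, "dt": dt, "willReal": will_real, "size": size}
    return LINE_URL + "?" + urlencode(params)


# One URL per region for a single day.
line_urls = {
    name: build_line_url(code, "2021-03-07")
    for name, code in REGION_ADCODES.items()
}
```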
https://trp.autonavi.com/cityTravel/inAndOutCity.do?adcode=3&dt=2021-03-08&willReal=REAL&size=40&inOut=OUT
The inAndOutCity.do request returns the move-in and move-out data for cities (two categories, in and out).
response:
{"name":"上海","adcode":310000,"willIdx":89.738611,"point":"121.427649000,31.093825000","realIdx":41.51}
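A record of this shape can be flattened for later analysis by splitting the `point` string into longitude and latitude. This is a small sketch using the sample record above as input; the function name `parse_city_record` and the output keys `lon` and `lat` are hypothetical, and it assumes `point` is always "longitude,latitude" as in the sample.

```python
import json

# Sample record copied from the captured response.
sample = (
    '{"name":"上海","adcode":310000,"willIdx":89.738611,'
    '"point":"121.427649000,31.093825000","realIdx":41.51}'
)


def parse_city_record(record):
    """Flatten one inAndOutCity.do record into a plain row."""
    lon_text, lat_text = record["point"].split(",")
    return {
        "name": record["name"],
        "adcode": record["adcode"],
        "lon": float(lon_text),
        "lat": float(lat_text),
        "willIdx": record["willIdx"],
        "realIdx": record["realIdx"],
    }


row = parse_city_record(json.loads(sample))
```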
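The trend.do request returns the overall migration-willingness indicators for the whole study area (the data behind the line chart at the bottom of the main page); it is probably not needed here.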
response:
{"dt":"2021-02-07","realIdxRatio":-0.12795780360882006,"willIdx":45.6706410385,"willIdxRatio":-0.06935027272202482,"realIdx":23.2149464615}
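line.do returns the migration indicators for popular routes and contains two values, willIdx (will index) and realIdx (real index); the former is clearly the larger of the two (for example, about 31.24 versus 17.51 in the record below).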
The response is an array, each element of which looks like:
{"startCity":"苏州","line":["120.645957000,31.401834000","121.427649000,31.093825000"],"realIdx":17.51222222222222,"startCityAdcode":320500,"endCityAdcode":310000,"endCity":"上海","willIdx":31.24416666666667}
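Since each element describes one directed route, the array can be turned into an edge list for the traffic-flow network. The sketch below assumes the response body is a bare JSON array of such elements, as the notes describe; the function name `to_edges` and the output keys are hypothetical.

```python
import json

# One-element array built from the sample record in the notes.
sample_response = (
    '[{"startCity":"苏州","line":["120.645957000,31.401834000",'
    '"121.427649000,31.093825000"],"realIdx":17.51222222222222,'
    '"startCityAdcode":320500,"endCityAdcode":310000,"endCity":"上海",'
    '"willIdx":31.24416666666667}]'
)


def to_edges(records):
    """Convert line.do records into directed edges keyed by adcode."""
    return [
        {
            "start": r["startCity"],
            "end": r["endCity"],
            "start_adcode": r["startCityAdcode"],
            "end_adcode": r["endCityAdcode"],
            "willIdx": r["willIdx"],
            "realIdx": r["realIdx"],
        }
        for r in records
    ]


edges = to_edges(json.loads(sample_response))
```

Keying edges by adcode rather than by city name is a design choice that avoids ambiguity when the same city appears under slightly different display names.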