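I. Installing Anaconda
1. Download Anaconda
https://www.anaconda.com/distribution/
2. Install Anaconda
3. Launch the Anaconda command line
Launch Anaconda Prompt:
Launch python and ipython:
4. Launch Spyder
Spyder can be used both to write Python programs and to work interactively.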
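For example, suppose we want to look up the weather for "泸州" (Luzhou).
http://toy1.weather.com.cn/search?cityname=泸州
In the returned data, look at the ref value of the first item; its first 9 characters are Luzhou's city code: 101271001.
http://t.weather.sojson.com/api/weather/city/101271001
The JSON returned from this URL contains the weather information we want. In Python, the requests library makes it easy to get this JSON: requests.get(url).json().
Use the requests library to fetch the full content of the search page:
The response is not a bare JSON array string: it is wrapped in an extra pair of parentheses, which must be stripped before it can be parsed as JSON.
Import the json module and parse the JSON array string into a Python list:
This JSON array contains 5 elements; we only care about the first one:
The city code we need is the first 9 characters of the value stored under the key "ref":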
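The strip-and-parse steps above can be tried offline with a short sketch. The sample string is a hypothetical stand-in for the search response (everything after the 9-digit code is assumed for illustration), and extract_city_code is an illustrative helper name, not part of the original program.

```python
import json

# Hypothetical sample of the search response: a JSON array wrapped in parentheses.
# The text after the 9-digit code is an assumption made for this example.
sample_text = '([{"ref": "101271001~sichuan~泸州"}])'


def extract_city_code(text):
    """Strip the outer parentheses, parse the array, and return the first 9 characters of the first ref."""
    json_arr = json.loads(text[1:-1])  # drop the leading "(" and the trailing ")"
    return json_arr[0]["ref"][:9]


print(extract_city_code(sample_text))
```

Slicing with [1:-1] is equivalent to the longer [1:len(text)-1] form and reads a little more clearly.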
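As you can see, the body returned by this URL is a plain JSON string, so we can call the response object's json() method:
The object returned by json() is a nested Python dictionary rather than a string, which makes it easy to inspect the data structure and extract the information we want.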
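A minimal offline sketch of walking that nested structure follows. The dictionary is a trimmed copy of the sample response reproduced later in this article, with only a few fields kept; the variable names are illustrative.

```python
# Trimmed copy of the sample response shown later in this article.
info = {
    "status": 200,
    "time": "2019-10-26 08:56:44",
    "cityInfo": {"city": "泸州市", "parent": "四川"},
    "data": {
        "forecast": [
            {"high": "高温 13℃", "low": "低温 12℃", "week": "星期六", "type": "小雨"},
        ]
    },
}

# Each level of nesting is reached with one more key or index.
today = info["data"]["forecast"][0]  # in the sample response the first entry is the current day
city_name = info["cityInfo"]["parent"] + info["cityInfo"]["city"]

print(city_name, today["week"], today["type"], today["high"], today["low"])
```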
# Get the city code
def getCityCode(city):
    url = 'http://toy1.weather.com.cn/search?cityname=' + city
    r = requests.get(url)
    if len(r.text) > 4:
        # Strip the surrounding parentheses, then parse the JSON array
        json_arr = json.loads(r.text[1:len(r.text)-1])
        code = json_arr[0]['ref'][0:9]
        return code
    else:
        return "000000000"
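As shown, r.text is not yet a JSON string, but removing the opening and closing parentheses turns it into a JSON array string. The slice r.text[1:len(r.text)-1] therefore extracts the JSON array string, and json.loads converts it into a Python list, json_arr. The first 9 characters of the ref value of the first item in json_arr are the city code we want.
If the city entered does not exist, r.text has length 4, so len(r.text) > 4 serves as the check: only when it holds is a city code extracted and returned; otherwise the function returns a nonexistent city code, "000000000".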
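The length test depends on the exact shape of the empty response. A slightly more defensive variant, sketched below, parses the payload first and then checks whether the list is empty. It assumes that an unknown city yields an empty array wrapped in parentheses, "([])", which is consistent with the length of 4 mentioned above but is not confirmed here; parse_city_code and NOT_FOUND are illustrative names.

```python
import json

NOT_FOUND = "000000000"  # the same placeholder code that getCityCode returns


def parse_city_code(text):
    """Return the 9-character city code from a parenthesized JSON array, or NOT_FOUND."""
    try:
        items = json.loads(text.strip()[1:-1])  # drop "(" and ")" before parsing
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return NOT_FOUND
    if not isinstance(items, list) or not items:
        return NOT_FOUND  # empty result: the city was not found
    first = items[0]
    if not isinstance(first, dict) or "ref" not in first:
        return NOT_FOUND
    return str(first["ref"])[:9]


print(parse_city_code("([])"))
print(parse_city_code('([{"ref": "101271001~sichuan"}])'))
```

Checking the parsed list rather than the raw length keeps working if the server ever adds whitespace or returns a malformed body.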
(2) Get the city's weather information
# Get the city's weather information
def getWeatherInfo(city):
    code = getCityCode(city)
    url = 'http://t.weather.sojson.com/api/weather/city/' + code
    r = requests.get(url)
    info = r.json()
    weather = {}
    if info['status'] == 200:
        # Keys are the Chinese labels printed to the user: city, time, temperature, weather
        weather['城市:'] = info['cityInfo']['parent'] + info['cityInfo']['city']
        weather['时间:'] = info['time'] + ' ' + info['data']['forecast'][0]['week']
        weather['温度:'] = info['data']['forecast'][0]['high'] + ' ' + info['data']['forecast'][0]['low']
        weather['天气:'] = info['data']['forecast'][0]['type']
    else:
        # Error message: "[city] does not exist!"
        weather['错误:'] = '[' + city + ']不存在!'
    return weather
{'message': 'success感谢又拍云(upyun.com)提供CDN赞助',
 'status': 200,
 'date': '20191026',
 'time': '2019-10-26 08:56:44',
 'cityInfo': {'city': '泸州市',
  'citykey': '101271001',
  'parent': '四川',
  'updateTime': '08:38'},
 'data': {'shidu': '100%',
  'pm25': 6.0,
  'pm10': 8.0,
  'quality': '优',
  'wendu': '11',
  'ganmao': '各类人群可自由活动',
  'forecast': [{'date': '26',
    'high': '高温 13℃',
    'low': '低温 12℃',
    'ymd': '2019-10-26',
    'week': '星期六',
    'sunrise': '07:05',
    'sunset': '18:19',
    'aqi': 17,
    'fx': '无持续风向',
    'fl': '<3级',
    'type': '小雨',
    'notice': '雨虽小,注意保暖别感冒'},
   {'date': '27',
    'high': '高温 14℃',
    'low': '低温 13℃',
    'ymd': '2019-10-27',
    'week': '星期日',
    'sunrise': '07:06',
    'sunset': '18:18',
    'aqi': 43,
    'fx': '无持续风向',
    'fl': '<3级',
    'type': '小雨',
    'notice': '雨虽小,注意保暖别感冒'},
   {'date': '28',
    'high': '高温 21℃',
    'low': '低温 12℃',
    'ymd': '2019-10-28',
    'week': '星期一',
    'sunrise': '07:06',
    'sunset': '18:17',
    'aqi': 55,
    'fx': '无持续风向',
    'fl': '<3级',
    'type': '多云',
    'notice': '阴晴之间,谨防紫外线侵扰'},
   {'date': '29',
    'high': '高温 24℃',
    'low': '低温 15℃',
    'ymd': '2019-10-29',
    'week': '星期二',
    'sunrise': '07:07',
    'sunset': '18:16',
    'aqi': 74,
    'fx': '无持续风向',
    'fl': '<3级',
    'type': '多云',
    'notice': '阴晴之间,谨防紫外线侵扰'},
   {'date': '30',
    'high': '高温 23℃',
    'low': '低温 15℃',
    'ymd': '2019-10-30',
    'week': '星期三',
    'sunrise': '07:08',
    'sunset': '18:15',
    'aqi': 79,
    'fx': '无持续风向',
    'fl': '<3级',
    'type': '阵雨',
    'notice': '阵雨来袭,出门记得带伞'},
   {'date': '31',
    'high': '高温 23℃',
    'low': '低温 15℃',
    'ymd': '2019-10-31',
    'week': '星期四',
    'sunrise': '07:08',
    'sunset': '18:14',
    'aqi': 71,
    'fx': '无持续风向',
    'fl': '<3级',
    'type': '阵雨',
    'notice': '阵雨来袭,出门记得带伞'},
   {'date': '01',
    'high': '高温 19℃',
    'low': '低温 15℃',
    'ymd': '2019-11-01',
    'week': '星期五',
    'sunrise': '07:09',
    'sunset': '18:14',
    'fx': '无持续风向',
    'fl': '<3级',
    'type': '小雨',
    'notice': '雨虽小,注意保暖别感冒'},
   {'date': '02',
    'high': '高温 21℃',
    'low': '低温 14℃',
    'ymd': '2019-11-02',
    'week': '星期六',
    'sunrise': '07:10',
    'sunset': '18:13',
    'fx': '东风',
    'fl': '<3级',
    'type': '阴',
    'notice': '不要被阴云遮挡住好心情'},
   {'date': '03',
    'high': '高温 22℃',
    'low': '低温 15℃',
    'ymd': '2019-11-03',
    'week': '星期日',
    'sunrise': '07:10',
    'sunset': '18:12',
    'fx': '东北风',
    'fl': '<3级',
    'type': '阴',
    'notice': '不要被阴云遮挡住好心情'},
   {'date': '04',
    'high': '高温 21℃',
    'low': '低温 16℃',
    'ymd': '2019-11-04',
    'week': '星期一',
    'sunrise': '07:11',
    'sunset': '18:11',
    'fx': '北风',
    'fl': '<3级',
    'type': '小雨',
    'notice': '雨虽小,注意保暖别感冒'},
   {'date': '05',
    'high': '高温 20℃',
    'low': '低温 15℃',
    'ymd': '2019-11-05',
    'week': '星期二',
    'sunrise': '07:12',
    'sunset': '18:11',
    'fx': '东北风',
    'fl': '<3级',
    'type': '多云',
    'notice': '阴晴之间,谨防紫外线侵扰'},
   {'date': '06',
    'high': '高温 22℃',
    'low': '低温 16℃',
    'ymd': '2019-11-06',
    'week': '星期三',
    'sunrise': '07:13',
    'sunset': '18:10',
    'fx': '北风',
    'fl': '<3级',
    'type': '多云',
    'notice': '阴晴之间,谨防紫外线侵扰'},
   {'date': '07',
    'high': '高温 19℃',
    'low': '低温 15℃',
    'ymd': '2019-11-07',
    'week': '星期四',
    'sunrise': '07:13',
    'sunset': '18:09',
    'fx': '西北风',
    'fl': '<3级',
    'type': '小雨',
    'notice': '雨虽小,注意保暖别感冒'},
   {'date': '08',
    'high': '高温 16℃',
    'low': '低温 14℃',
    'ymd': '2019-11-08',
    'week': '星期五',
    'sunrise': '07:14',
    'sunset': '18:09',
    'fx': '西北风',
    'fl': '<3级',
    'type': '小雨',
    'notice': '雨虽小,注意保暖别感冒'},
   {'date': '09',
    'high': '高温 15℃',
    'low': '低温 13℃',
    'ymd': '2019-11-09',
    'week': '星期六',
    'sunrise': '07:15',
    'sunset': '18:08',
    'fx': '西北风',
    'fl': '<3级',
    'type': '阴',
    'notice': '不要被阴云遮挡住好心情'}],
  'yesterday': {'date': '25',
   'high': '高温 13℃',
   'low': '低温 12℃',
   'ymd': '2019-10-25',
   'week': '星期五',
   'sunrise': '07:04',
   'sunset': '18:20',
   'aqi': 12,
   'fx': '无持续风向',
   'fl': '<3级',
   'type': '小雨',
   'notice': '雨虽小,注意保暖别感冒'}}}
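If the city code entered exists, the weather information we want can be read from the returned JSON.
If the city code does not exist, for example "000000000", the status field of the returned JSON is 404, and getWeatherInfo returns an error message instead.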
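Both branches can be exercised without network access by separating the dictionary handling from the HTTP request. The sketch below does that with a hypothetical helper, summarize, which takes an already-parsed response. The 404 sample contains only the status field, because the rest of a real error body is not shown in this article.

```python
def summarize(info, city):
    """Build the same summary dictionary as getWeatherInfo from an already-parsed response."""
    if info.get("status") != 200:
        # Error branch: "[city] does not exist!"
        return {"错误:": "[" + city + "]不存在!"}
    today = info["data"]["forecast"][0]
    return {
        "城市:": info["cityInfo"]["parent"] + info["cityInfo"]["city"],
        "时间:": info["time"] + " " + today["week"],
        "温度:": today["high"] + " " + today["low"],
        "天气:": today["type"],
    }


# Trimmed success sample taken from the response shown above.
ok = {
    "status": 200,
    "time": "2019-10-26 08:56:44",
    "cityInfo": {"city": "泸州市", "parent": "四川"},
    "data": {"forecast": [{"high": "高温 13℃", "low": "低温 12℃",
                           "week": "星期六", "type": "小雨"}]},
}
# Assumed minimal shape of an error response: only the status is relied on.
missing = {"status": 404}

print(summarize(ok, "泸州"))
print(summarize(missing, "不存在的城市"))
```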
import requests, json, re
from matplotlib import pyplot as plt

# Get the city code
def getCityCode(city):
    url = 'http://toy1.weather.com.cn/search?cityname=' + city
    r = requests.get(url)
    if len(r.text) > 4:
        # Strip the surrounding parentheses, then parse the JSON array
        json_arr = json.loads(r.text[1:len(r.text)-1])
        code = json_arr[0]['ref'][0:9]
        return code
    else:
        return "000000000"

# Get the city's weather information
def getWeatherInfo(city):
    code = getCityCode(city)
    url = 'http://t.weather.sojson.com/api/weather/city/' + code
    r = requests.get(url)
    info = r.json()
    weather = {}
    if info['status'] == 200:
        # Keys are the Chinese labels printed to the user: city, time, temperature, weather
        weather['城市:'] = info['cityInfo']['parent'] + info['cityInfo']['city']
        weather['时间:'] = info['time'] + ' ' + info['data']['forecast'][0]['week']
        weather['温度:'] = info['data']['forecast'][0]['high'] + ' ' + info['data']['forecast'][0]['low']
        weather['天气:'] = info['data']['forecast'][0]['type']
    else:
        # Error message: "[city] does not exist!"
        weather['错误:'] = '[' + city + ']不存在!'
    return weather

# Print each label followed by its value
def printWeatherInfo(weather):
    for key in weather:
        print(key + weather[key])

# Get the forecast high and low temperature for each day, keyed by date
def getTemperatures(city):
    code = getCityCode(city)
    url = 'http://t.weather.sojson.com/api/weather/city/' + code
    r = requests.get(url)
    info = r.json()
    temperatures = {}
    if info['status'] == 200:
        forecast = info['data']['forecast']
        for dayinfo in forecast:
            # Pull the number out of strings such as '高温 13℃';
            # the optional minus sign assumes sub-zero values are written like '-3℃'
            high = int(re.findall(r'-?\d+', dayinfo['high'])[0])
            low = int(re.findall(r'-?\d+', dayinfo['low'])[0])
            temperatures[dayinfo['ymd']] = [high, low]
    else:
        temperatures['错误:'] = '[' + city + ']不存在!'
    return temperatures

# Print the high and low temperature for each date (nothing is printed on error)
def printTemperatures(temperatures):
    if '错误:' not in temperatures.keys():
        for key in temperatures:
            print(key + ' 高温:' + str(temperatures[key][0]) + ' 低温:' + str(temperatures[key][1]))

# Draw a line chart of the forecast highs (red) and lows (blue).
# The city is passed in explicitly instead of being read from a global variable.
def drawTemperatureLineChart(city):
    temperatures = getTemperatures(city)
    if '错误:' not in temperatures.keys():
        dates = []
        highs = []
        lows = []
        for key in temperatures:
            dates.append(key)
            highs.append(temperatures[key][0])
            lows.append(temperatures[key][1])
        fig = plt.figure(dpi=81, figsize=(5,4))
        plt.xlabel('Date (YYYY-MM-DD)', fontsize=10)
        plt.ylabel("Temperature (℃)", fontsize=10)
        fig.autofmt_xdate()
        plt.plot(dates, highs, c='red', alpha=0.5)
        plt.plot(dates, lows, c='blue', alpha=0.5)
        plt.show()

city = input('输入城市名:')  # prompt: "Enter a city name:"
printWeatherInfo(getWeatherInfo(city))
printTemperatures(getTemperatures(city))
drawTemperatureLineChart(city)
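The temperature extraction in getTemperatures can be checked in isolation. The sketch below uses three forecast entries copied from the sample response and a hypothetical helper, parse_temperature. It also accepts a leading minus sign, on the assumption that sub-zero values would be written like '低温 -3℃'; no such value appears in the sample data.

```python
import re


def parse_temperature(label):
    """Extract the integer from a label such as '高温 13℃', keeping a leading minus sign."""
    match = re.search(r"-?\d+", label)
    if match is None:
        raise ValueError("no temperature found in " + repr(label))
    return int(match.group())


# Three entries copied from the sample response, reduced to the fields used here.
forecast = [
    {"ymd": "2019-10-26", "high": "高温 13℃", "low": "低温 12℃"},
    {"ymd": "2019-10-27", "high": "高温 14℃", "low": "低温 13℃"},
    {"ymd": "2019-10-28", "high": "高温 21℃", "low": "低温 12℃"},
]

# These three parallel lists are what the line chart plots.
dates = [day["ymd"] for day in forecast]
highs = [parse_temperature(day["high"]) for day in forecast]
lows = [parse_temperature(day["low"]) for day in forecast]

print(dates, highs, lows)
```

Raising ValueError on an unparseable label makes a format change in the API visible immediately, instead of failing later with an IndexError.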