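I recently helped a colleague process a batch of meteorological data: computing the mean rainfall over 50 years for several cities in one province. Excel struggled with it for ages and could even freeze, while Python needed only a few dozen lines of code. You have to admire how powerful the pandas library is.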
"""
@Author : ISR
@contact: [email protected]
@Software: PyCharm
@FileName: get_meteorology.py
@Time : 2020/5/18 9:46
@desc:
"""
import pandas as pd
from pandas.core.frame import DataFrame
import numpy as np
def main(in_path, out_path, star, end):
    # Read the meteorological data (`citys` is taken from module scope)
    df = pd.read_csv(in_path, header=0, index_col=0, engine='python')
    data = np.zeros((len(citys), 2))
    for i in range(len(citys)):
        # Progress bar
        pst = int(100 * (i + 1) / len(citys))
        print('\r{0}{1}{2}%'.format('Computing', '▉' * int(pst / 10), pst), end='')
        dataframe = df[(df['站名'] == citys[i]) & (df['year'] >= star) & (df['year'] <= end)]
        # Autumn: June to September
        autumn = df[(df['站名'] == citys[i]) & (df['year'] >= star) & (df['year'] <= end) & (df['mouth'] < 10) & (df['mouth'] >= 6)]
        # Spring: October to May of the following year, i.e. every row that is not autumn.
        # Appending the autumn rows and dropping all duplicated rows (keep=False)
        # leaves the complement; this assumes the input has no identical rows of its own.
        # pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
        dataframe = pd.concat([dataframe, autumn])
        spring = dataframe.drop_duplicates(keep=False)
        # Rainfall is the last column
        autumn = autumn.values[:, -1]
        spring = spring.values[:, -1]
        # Mean yearly total, counting only positive values
        autumn_mean = np.sum(autumn[np.where(autumn > 0)]) / (end - star + 1)
        spring_mean = np.sum(spring[np.where(spring > 0)]) / (end - star + 1)
        data[i, 0] = autumn_mean
        data[i, 1] = spring_mean
    meteorology = DataFrame(data, columns=["autumn", "spring"], index=citys)
    meteorology.to_csv(out_path)
if __name__ == '__main__':
    citys = ["石家庄", "天津"]  # station names
    star = 1990  # start year
    end = 2019  # end year
    in_path = r"C:\Users\marshmallow\Desktop\04_气象数据\in.csv"  # input
    out_path = r"C:\Users\marshmallow\Desktop\04_气象数据\out.csv"  # output
    main(in_path, out_path, star, end)
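The per-station loop and the `drop_duplicates` trick in `main` could also be expressed with a season label and a single `groupby`. The following is only a sketch under the same assumptions as the script (columns `站名`, `year` and `mouth`, rainfall in the last column, non-positive values ignored); the function name `seasonal_means` and the helper columns `season` and `rain_pos` are made up for illustration.

```python
import pandas as pd


def seasonal_means(df, stations, start, end):
    """Mean yearly rainfall per station: autumn (Jun-Sep) and spring (all other months)."""
    rain_col = df.columns[-1]  # assumption: rainfall is the last column
    sel = df[df['站名'].isin(stations) & df['year'].between(start, end)].copy()
    # Label every row with its season instead of building the complement by de-duplication
    sel['season'] = sel['mouth'].between(6, 9).map({True: 'autumn', False: 'spring'})
    # Negative values contribute nothing, matching the `> 0` filter in the script
    sel['rain_pos'] = sel[rain_col].clip(lower=0)
    totals = sel.groupby(['站名', 'season'])['rain_pos'].sum().unstack(fill_value=0.0)
    n_years = end - start + 1
    # Keep the requested station order and always return both season columns
    return (totals / n_years).reindex(
        index=stations, columns=['autumn', 'spring'], fill_value=0.0
    )
```

Calling `seasonal_means(df, citys, star, end).to_csv(out_path)` would write a table with the same two columns as the script. Because rows are labelled rather than de-duplicated, identical records in the input are not silently dropped.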
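1. Input data structure (rainfall is the last column).
2. The year is divided into two seasons, autumn and spring.
Autumn crops: June to September of the current year
Spring crops: October of the current year to May of the following year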
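To make the two notes concrete, here is a small sketch with a made-up input table and a month-to-season mapping. The CSV contents and the `id` and `rain` column names are hypothetical; only `站名`, `year`, `mouth` and "rainfall is last" come from the script. Assigning October to December to the following crop year is also an assumption based on the season definition above. The script itself divides totals by the number of calendar years and does not track crop years.

```python
import io

import pandas as pd

# Hypothetical input layout: the first column is the index, rainfall is the last column
sample_csv = io.StringIO(
    "id,站名,year,mouth,rain\n"
    "1,天津,1990,5,3.2\n"
    "2,天津,1990,6,12.0\n"
    "3,天津,1990,9,8.5\n"
    "4,天津,1990,10,1.1\n"
    "5,天津,1991,1,0.4\n"
)
df = pd.read_csv(sample_csv, header=0, index_col=0)


def label_season(year, month):
    """Return (season, crop_year) for one record."""
    if 6 <= month <= 9:
        return 'autumn', year
    if month >= 10:
        # Assumption: Oct-Dec belong to the spring season that ends the following May
        return 'spring', year + 1
    return 'spring', year


labels = [label_season(y, m) for y, m in zip(df['year'], df['mouth'])]
df['season'] = [season for season, _ in labels]
df['crop_year'] = [crop_year for _, crop_year in labels]
print(df)
```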