[Air Quality Data Analysis, Topic 7] Monthly Variation of Pollutant Concentrations

Preface

A month-by-month analysis of five years of daily air quality data shows how pollutant concentrations vary over the months of the year.

Analysis workflow

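After preprocessing the data as described in Topic 2, compute each pollutant's mean concentration for each month over the full period, then visualize the result. Monthly variation can be analyzed in several ways; here it is done with ridgeline plots.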
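The aggregation step described above can be sketched as follows. This is a minimal illustration, not the author's code: it assumes a daily table with a `date` column and one column per pollutant, and the function name `monthly_mean` and all sample values are made up for the example.

```python
import pandas as pd

# Hypothetical daily records; the values are invented for illustration only.
daily = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-01-01", "2019-01-02", "2020-01-01", "2019-07-01", "2020-07-01",
    ]),
    "PM2.5": [80.0, 100.0, 60.0, 20.0, 30.0],
    "O3": [40.0, 50.0, 30.0, 150.0, 170.0],
})


def monthly_mean(df):
    """Average each pollutant over all days of the same calendar month, pooling all years."""
    month = df["date"].dt.month.rename("month")
    return df.drop(columns="date").groupby(month).mean()


monthly = monthly_mean(daily)
print(monthly)
```

Pooling all years into twelve calendar months is what makes the seasonal pattern visible; grouping by year and month instead would give a time series rather than a seasonal profile.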

Core code

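This part processes the data in Python and then plots it in R. In the month labels, a_January stands for January, b_February for February, and so on. The letter in front of each English month name keeps the labels in calendar order when they are sorted alphabetically, so the ridgeline plot lists the months in sequence, which is easier to read.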
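Why the letter prefix is needed can be seen in a small sketch. The helper name `prefixed_label` is hypothetical; only the label format (`a_January`, `b_February`, ...) comes from the text above.

```python
import string

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def prefixed_label(month):
    """Return a label such as 'a_January' for a month number from 1 to 12."""
    return f"{string.ascii_lowercase[month - 1]}_{MONTH_NAMES[month - 1]}"


labels = [prefixed_label(m) for m in range(1, 13)]

# Plain month names sort alphabetically, which scrambles the calendar order.
print(sorted(MONTH_NAMES)[:3])
# Prefixed labels sort alphabetically into calendar order.
print(sorted(labels)[:3])
```

A discrete plot axis is typically ordered by sorting its labels, so the prefix is a simple way to force calendar order without touching the plotting code.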
(1) Data processing (Python code)

from pathlib import Path

import pandas as pd


# Method of the analysis class; self.cf_info holds the parsed configuration.
def month_trend_analysis(self, df_station, year_list):
    """
    Monthly variation analysis of daily concentrations.
    :param df_station: station data (daily records with 'month' and 'day' columns)
    :param year_list: list of years covered by the data
    :return: None; the result is written to an Excel file
    """
    # Prefix each month name with a letter so the labels sort in calendar order.
    month_labels = {
        1: 'a_January', 2: 'b_February', 3: 'c_March', 4: 'd_April',
        5: 'e_May', 6: 'f_June', 7: 'g_July', 8: 'h_August',
        9: 'i_September', 10: 'j_October', 11: 'k_November', 12: 'l_December',
    }
    # Work on a copy and use map() instead of chained assignment, which pandas
    # warns about and which may silently fail to modify the frame.
    df_station = df_station.copy()
    df_station['month'] = df_station['month'].map(month_labels)
    # Average each pollutant for every (month, day) pair across all years.
    result2 = pd.pivot_table(df_station, index=['month', 'day'], aggfunc='mean',
                             values=['PM10', 'PM2.5', 'SO2', 'NO2', 'O3', 'CO'])
    # Copy the index levels into ordinary columns so the R script can read them.
    result2['month1'] = result2.index.get_level_values('month')
    result2['day1'] = result2.index.get_level_values('day')
    pic_loc0 = Path(self.cf_info['output']['picture']).joinpath(df_station['city'].values[0])
    pic_loc = pic_loc0.joinpath('污染物月变化特征')
    # Create the output folder, including missing parents, if it does not exist.
    pic_loc.mkdir(parents=True, exist_ok=True)
    # Recent pandas versions no longer write .xls or accept an encoding argument,
    # so the file is saved as .xlsx; the R step reads it the same way.
    result2.to_excel(pic_loc / (
            df_station['station'].values[0] + str(year_list[0]) + '-' + str(year_list[-1]) + '年各污染物月浓度变化.xlsx'))

(2) Visualization (R code; paths omitted)

library(xlsx)      # read.xlsx
library(ggplot2)
library(ggridges)  # geom_density_ridges_gradient, theme_ridges
library(viridis)   # scale_fill_viridis

# micepath is the folder holding the Excel files from step (1); its value is omitted here.
micefiles <- list.files(micepath, full.names = TRUE)

# Defined in the original script but not used by the plots below.
cols=c("green","yellow","orange","red","purple","maroon")

# Column to plot, file-name suffix, and legend unit for each pollutant.
pollutants <- c("PM2.5", "PM10", "SO2", "NO2", "O3", "CO")
suffixes <- c("pm2_5", "pm10", "so2", "no2", "o3", "co")
units <- c(rep("微克/立方米", 5), "毫克/立方米")  # ug/m3 for all but CO, which is mg/m3

for (i in seq_along(micefiles)) {
  # Station name plus year range: the part of the file name before '各'.
  name <- strsplit(basename(micefiles[i]), '各')[[1]][1]
  data <- read.xlsx(micefiles[i], sheetIndex = 1)

  for (j in seq_along(pollutants)) {
    p <- pollutants[j]
    # One ridge per month (y axis), filled by concentration (x axis).
    ggplot(data, aes(x = .data[[p]], y = month1, fill = after_stat(x))) +
      geom_density_ridges_gradient(scale = 3, rel_min_height = 0.01, gradient_lwd = .6) +
      scale_x_continuous(expand = c(0.01, 0)) +
      scale_y_discrete(expand = c(0.01, 0)) +
      scale_fill_viridis(name = paste0(p, "浓度(", units[j], ")"), option = "D") +
      labs(title = paste0(p, "浓度月变化")) +
      theme_ridges(font_size = 13, grid = TRUE) +
      theme(axis.title.y = element_blank())
    # ggsave() writes the most recently created plot to the working directory.
    ggsave(paste0(name, suffixes[j], ".jpeg"))
  }
}

Results and analysis

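Only the O3 and PM2.5 results are shown here. In a ridgeline plot, the horizontal axis is concentration, and the height of each ridge reflects how often that month's days fall in the corresponding concentration range (frequency). The plots make it easy to see both how pollutant concentrations change from month to month and how concentrations are distributed within each month.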
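The quantity behind the height of a ridge can be approximated by counting days per concentration bin. The sketch below is only an illustration under stated assumptions: the function `days_per_bin`, the bin width of 25, and the sample values are all hypothetical, and a real ridgeline plot draws a smoothed density estimate rather than raw counts.

```python
from collections import Counter


def days_per_bin(values, bin_width):
    """Count values per bin [k*w, (k+1)*w); keys are the lower bin edges."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical daily PM2.5 values for two months (not real measurements).
january_pm25 = [35, 48, 52, 75, 81, 90, 110]
july_pm25 = [12, 15, 18, 22, 27, 31]

january_counts = days_per_bin(january_pm25, 25)
july_counts = days_per_bin(july_pm25, 25)
print(january_counts)  # {25: 2, 50: 1, 75: 3, 100: 1}
print(july_counts)     # {0: 4, 25: 2}
```

In this made-up example the January counts peak in the 75 to 100 bin while the July counts peak in the 0 to 25 bin, which is the kind of shift a ridge moving left or right between months represents.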

[Figure 1: Monthly variation of pollutant concentrations]
[Figure 2: Monthly variation of pollutant concentrations]

Coming up

The next installment analyzes the daily variation of pollutant concentrations.
