由于大部分计算都是在服务器进行。因此考虑增进NCL和Shell脚本水平,之前的博文Python读取HDF5介绍了HDF5文件及其读写,但是由于服务器本身系统版本较老,且属于内网原因,决定直接使用自带的NCL脚本对其进行撰写记录。
NCL6版本以后便支持HDF5文件的读写,官方文档也给出了读写HDF5文件的实例:
;----------------------------------------------------------------------
; hdf5_1.ncl
;
; Concepts illustrated:
; - Reading group data off an HDF5 file using two methods
; - Creating a color map using RGB triplets
; - Drawing raster contours for faster results
;----------------------------------------------------------------------
; These files are loaded by default in NCL V6.2.0 and newer
; load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
; load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
; load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
;----------------------------------------------------------------------
begin
;---It is necessary to use advanced file structure to read group data
setfileoption("nc", "FileStructure", "Advanced")
fname = "MSG3-SEVI-MSG15-0100-NA-20130521001244.164000000Z-1074164.h5"
f = addfile(fname,"r")
;------------------------------------------------------------
; There are two ways to read variables off a files
; containing nested groups.
;
; Method 1:
; - Open the group with the "=>" syntax
; - Read the data using the usual "->" syntax
;
; Method 2:
; - Open the variable directly using the full path
; to the file
;
; Note: there's an inconsistency with how the group names
; are constructed for the two methods. We have opened a
; trouble ticket for this.
;------------------------------------------------------------
USE_FULL_PATH = True
if(.not.USE_FULL_PATH) then
;---First method; open group, then read data from it.
group_name ="/U_MARF/MSG/Level1_5/DATA/Channel_07"
iname = "IMAGE_DATA"
pname = "Palette"
gp = f=>$group_name$
fdata = tofloat(gp->$iname$)
palette = tofloat(gp->$pname$)
else
;---Second method; use full path to variable.
group_name = "/U-MARF/MSG/Level1.5/DATA/Channel 07"
iname = group_name + "/IMAGE_DATA"
pname = group_name + "/Palette"
fdata = tofloat(f->$iname$)
palette = tofloat(f->$pname$)
end if
wks = gsn_open_wks("png","hdf5") ; send graphics to PNG file
res = True ; plot mods desired
res@gsnMaximize = True ; maximize plot in frame
res@cnFillOn = True ; color fill
res@cnFillMode = "RasterFill" ; Raster mode is much faster
; and uses less memory.
res@cnFillPalette = palette/255.
res@cnLinesOn = False ; turn off contour lines
res@cnLevelSelectionMode = "ManualLevels" ; set manual contour levels
res@cnMinLevelValF = min(fdata) ; set min contour level
res@cnMaxLevelValF = max(fdata) ; set max contour level
res@cnLevelSpacingF = 10.0 ; set contour spacing
res@lbOrientation = "vertical" ; vertical labelbar
res@lbBoxLinesOn = False
res@gsnTickMarksOn = False
if(USE_FULL_PATH)
res@tiMainString = "Variable read using full group path"
else
res@tiMainString = "Variable read by opening group first"
end if
plot = gsn_csm_contour(wks,fdata, res) ; create plot
end
主要有两种读法,但不论哪一种,都需要对HDF5文件本身的结构有所了解,在上一篇博文中。我主要使用了matlab的h5info函数来直接查看,而NCL也提供了对应的函数:ncl_filedump
,
其用法也非常简单:
ncl_filedump K1VHR_15NOV2013_1200_L02_OLR.h5 | less
或者直接:
setfileoption("h5", "FileStructure", "Advanced")
f = addfile("ncl_wrt_uvt.h5", "r")
print(f)
也可查看
值得注意的是:
matlab h5info返回的信息与NCL返回的信息并不一致,这导致很多时候我们并不能像matlab或者python、NCL官方示例一样,从不同的GROUP中读取数据,NCL对于HDF5数据的处理很多时候和nc一样,直接给出变量名。
举个实例,同样是AMSR2文件,matlab给出的结构为:
Group '/HDFEOS'
Group '/HDFEOS/ADDITIONAL'
Group '/HDFEOS/ADDITIONAL/FILE_ATTRIBUTES'
Group '/HDFEOS/GRIDS'
Group '/HDFEOS/GRIDS/NpPolarGrid12km'
Dataset 'XDim'
Size: 608
MaxSize: 608
Datatype: H5T_IEEE_F32LE (single)
ChunkSize: []
Filters: none
FillValue: 0.000000
Attributes:
'CLASS': 'DIMENSION_SCALE'
'NAME': 'This is a netCDF dimension but not a netCDF variable.'
'axis': 'X'
'long_name': 'x-coordinate in Cartesian system'
'REFERENCE_LIST': H5T_COMPOUND
Dataset 'YDim'
Size: 896
MaxSize: 896
Datatype: H5T_IEEE_F32LE (single)
ChunkSize: []
Filters: none
FillValue: 0.000000
Attributes:
'CLASS': 'DIMENSION_SCALE'
'NAME': 'This is a netCDF dimension but not a netCDF variable.'
'axis': 'Y'
'long_name': 'y-coordinate in Cartesian system'
'REFERENCE_LIST': H5T_COMPOUND
Dataset 'lat'
Size: 608x896
MaxSize: 608x896
Datatype: H5T_IEEE_F32LE (single)
ChunkSize: []
Filters: none
FillValue: 0.000000
Attributes:
'DIMENSION_LIST': H5T_VLEN
'long_name': 'Latitude'
'units': 'degrees_north'
'standard_name': 'latitude'
Dataset 'lon'
Size: 608x896
MaxSize: 608x896
Datatype: H5T_IEEE_F32LE (single)
ChunkSize: []
Filters: none
FillValue: 0.000000
Attributes:
'DIMENSION_LIST': H5T_VLEN
'long_name': 'Longitude'
'standard_name': 'longitude'
'units': 'degrees_east'
Group '/HDFEOS/GRIDS/NpPolarGrid12km/Data Fields'
Dataset 'SI_12km_NH_18H_ASC'
Size: 608x896
MaxSize: 608x896
Datatype: H5T_STD_I32LE (int32)
ChunkSize: []
Filters: none
FillValue: 0
Attributes:
'DIMENSION_LIST': H5T_VLEN
'coordinates': 'lon lat'
'long_name': '18.7 GHz horizontal daily average ascending Tbs'
'units': 'degree_kelvin'
'standard_name': 'brightness_temperature'
'packing_convention': 'netCDF'
'packing_convention_description': 'unpacked = scale_factor x packed + add_offset'
'scale_factor': 0.100000
'add_offset': 0.000000
'_FillValue': 0
算是标准的HDF5结构了,而NCL print出来则是:
dimensions:
YDim_NpPolarGrid12km = 896
XDim_NpPolarGrid12km = 608
YDim_SpPolarGrid12km = 664
XDim_SpPolarGrid12km = 632
variables:
double GridLat_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
projection : Polar Stereographic
corners : ( -39.29786148540076, -39.29786148540076, -41.51518496442763, -41.51518496442763 )
long_name : latitude
units : degrees_north
double GridLon_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
projection : Polar Stereographic
corners : ( -42.23673723653989, 42.23673723653989, 135, -135 )
long_name : longitude
units : degrees_east
integer SI_12km_SH_SNOWDEPTH_5DAY_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
comment : 110 -- missing/not calculated, 120 -- Land, 130 -- Open water, 140 -- multiyear sea ice, 150 -- variability in snow depth, 160 -- snow melt
units : cm
long_name : 5-day snow depth
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : 5-day snow depth
projection : Polar Stereographic
integer SI_12km_SH_ICEDIFF_DAY_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
comment : data value meaning: 0 -- Open Water, 1 to 100 or -1 to -100 -- Percent difference between algorithms, 120 -- Land, 200 to 300 -- Missing NT2 value (200+BBA), -200 to -300 -- Missing BBA value (-200 - NT2)
units : percent
long_name : Sea ice concentration daily average using the difference between BBA and NT2 algorithms
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : Sea ice concentration daily average using the difference between BBA and NT2 algorithms
projection : Polar Stereographic
integer SI_12km_SH_ICEDIFF_DSC_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
comment : data value meaning: 0 -- Open Water, 1 to 100 or -1 to -100 -- Percent difference between algorithms, 120 -- Land, 200 to 300 -- Missing NT2 value (200+BBA), -200 to -300 -- Missing BBA value (-200 - NT2)
units : percent
long_name : Sea ice concentration daily descending average using the difference between BBA and NT2 algorithms
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : Sea ice concentration daily descending average using the difference between BBA and NT2 algorithms
projection : Polar Stereographic
integer SI_12km_SH_ICEDIFF_ASC_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
comment : data value meaning: 0 -- Open Water, 1 to 100 or -1 to -100 -- Percent difference between algorithms, 120 -- Land, 200 to 300 -- Missing NT2 value (200+BBA), -200 to -300 -- Missing BBA value (-200 - NT2)
units : percent
long_name : Sea ice concentration daily ascending average using the difference between BBA and NT2 algorithms
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : Sea ice concentration daily ascending average using the difference between BBA and NT2 algorithms
projection : Polar Stereographic
integer SI_12km_SH_ICECON_DAY_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
comment : data value meaning: 0 -- Open Water, 110 -- missing/not calculated, 120 -- Land
units : percent
long_name : Sea ice concentration daily average
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : Sea ice concentration daily average
projection : Polar Stereographic
integer SI_12km_SH_ICECON_DSC_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
comment : data value meaning: 0 -- Open Water, 110 -- missing/not calculated, 120 -- Land
units : percent
long_name : Sea ice concentration daily descending average
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : Sea ice concentration daily descending average
projection : Polar Stereographic
integer SI_12km_SH_ICECON_ASC_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
comment : data value meaning: 0 -- Open Water, 110 -- missing/not calculated, 120 -- Land
units : percent
long_name : Sea ice concentration daily ascending average
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : Sea ice concentration daily ascending average
projection : Polar Stereographic
integer SI_12km_SH_89H_DAY_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
_FillValue : 0
add_offset : 0
scale_factor : 0.1
packing_convention_description : unpacked = scale_factor x packed + add_offset
packing_convention : netCDF
standard_name : brightness_temperature
units : degree_kelvin
long_name : 89.0 GHz horizontal daily average Tbs
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : 89.0 GHz horizontal daily average Tbs
projection : Polar Stereographic
integer SI_12km_SH_89H_DSC_SpPolarGrid12km ( YDim_SpPolarGrid12km, XDim_SpPolarGrid12km )
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
_FillValue : 0
add_offset : 0
scale_factor : 0.1
packing_convention_description : unpacked = scale_factor x packed + add_offset
packing_convention : netCDF
standard_name : brightness_temperature
units : degree_kelvin
long_name : 89.0 GHz horizontal daily average descending Tbs
coordinates : GridLat_SpPolarGrid12km, GridLon_SpPolarGrid12km
long_name : 89.0 GHz horizontal daily average descending Tbs
projection : Polar Stereographic
没有了Grouo信息,而是直接将变量提取出来。
因此,在使用NCL读取HDF5时,应当注意其NCL对其的处理方式。
接下来介绍实例,用ncl读取AMSR2 12.5Km的海冰密集度数据:
; - Reading am AMSR2 h5 file containing "sea ice"
; - Creating the appropriate grid and time coordinates
; - Chang format to WPS_int file
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/wrf/WRFUserARW.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/esmf/ESMF_regridding.ncl"
begin
setfileoption("h5", "FileStructure", "Advanced")
ymdd1="20191201"
ymdd2="2019-12-01"
; ymdd1=getenv("basedate1")
; ymdd2=getenv("basedate2")
data_filename="/public/home/zhangzilu/AMSR/2019/AMSR_U2_L3_SeaIce12km_B04_"+ymdd1+".he5"
print(data_filename)
f = addfile(data_filename, "r")
lat=f->GridLat_NpPolarGrid12km
lon=f->GridLon_NpPolarGrid12km
ice_con=tofloat(f->SI_12km_NH_ICECON_DAY_NpPolarGrid12km)
printVarSummary(ice_con)
; print(lat)
end
便可,批量读取只要与shell脚本结合便可高效批量完成。
读取HDF5数据唯一的难点便在于对于数据结构的理解。