Computing a latitude-weighted global mean from a netCDF CF file with xarray, then converting to pandas



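I need to compute a global time series (time) from a 3D (time, lat, lon) netCDF CF file and then convert it to a pandas DataFrame. The latitudes need to be weighted by cos(lat). I have been doing the averaging with NumPy, but converting the weighted array to a pandas DataFrame does not work.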

import numpy as np
import xarray as xr

ds = xr.open_dataset('sample_data.nc')
data = ds.tas
start_time = '1980-01-01'
end_time = '2018-12-31'
time_slice = slice(start_time, end_time)
nrows = len(data.lat.values)
ncols = len(data.lon.values)
t = len(data.time.values)
weights = np.zeros([len(data.lat.values)])
# cos(lat) as a column vector, repeated across all longitudes
latsr = np.deg2rad(data.lat.values).reshape((nrows, 1))
weight_matrix = np.repeat(np.cos(latsr), ncols, axis=1)
wghtpr = np.zeros_like(data)
for i in range(0, t):
    wghtpr[i, :, :] = data[i, :, :] * weight_matrix
new_data = wghtpr
# average over lat, then over lon
wtdata = np.average(new_data, axis=1)
da = np.average(wtdata, axis=1)

This ends up as a NumPy array with no "name".

If I print ds, I get:

<xarray.Dataset>
Dimensions:    (bnds: 2, lat: 361, lon: 576, time: 477)
Coordinates:
  * time       (time) datetime64[ns] 1980-01-16T12:00:00 ... 2019-09-16
  * lat        (lat) float64 -90.0 -89.5 -89.0 -88.5 ... 88.5 89.0 89.5 90.0
  * lon        (lon) float64 0.0 0.625 1.25 1.875 ... 357.5 358.1 358.8 359.4
    height     float64 ...
Dimensions without coordinates: bnds
Data variables:
    time_bnds  (time, bnds) datetime64[ns] ...
    lat_bnds   (lat, bnds) float64 ...
    lon_bnds   (lon, bnds) float64 ...
    tas        (time, lat, lon) float32 244.15399 244.15399 ... 267.52875
Attributes:
    institution:     Global Modeling and Assimilation Office, NASA Goddard Sp...
    institute_id:    NASA-GMAO
    experiment_id:   MERRA-2
    source:          MERRA-2 Monthly tavgM_2d_slv_Nx
    model_id:        GEOS-5
    references:      http://gmao.gsfc.nasa.gov/research/merra/, http://gmao.g...
    tracking_id:     e77fd4de-19c2-45ad-afe2-ce3f6c1eb148
    mip_specs:       CMIP5
    source_id:       MERRA-2
    product:         reanalysis
    frequency:       mon
    creation_date:   2015-10-11T23:12:34Z
    history:         2015-10-11T23:12:34Z CMOR rewrote data to comply with CF...
    Conventions:     CF-1.4
    project_id:      CREATE-IP
    table_id:        Table Amon_ana (10 March 2011) c3ffdce87438d8df0839620ee...
    title:           Reanalysis output prepared for CREATE-IP.
    modeling_realm:  atmos
    cmor_version:    2.9.1
    doi:             http://dx.doi.org/10.5067/AP1B0BA5PD2K
    contact:         MERRA-2, Steven Pawson (steven.pawson-1@nasa.gov)
#

To take advantage of xarray's broadcasting and alignment, you can do the weighting like this:

import numpy as np
import xarray as xr

ds = xr.open_dataset('sample_data.nc')
data = ds.tas
# NumPy ufuncs work directly on xarray objects and keep the 'lat' coordinate
latsr = np.deg2rad(data.lat)
weights = np.cos(latsr)
weighted = data * weights  # broadcasting here, matched by dimension name
# divide by the total weight so this is a true weighted mean (assumes no missing values)
weighted_mean = weighted.sum(['lat', 'lon']) / (weights.sum() * data.sizes['lon'])
# to pandas; the result of the arithmetic is unnamed, so pass a column name
df = weighted_mean.to_dataframe(name='tas')
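
A weighted mean has to divide by the sum of the weights, and newer xarray releases provide `DataArray.weighted` to handle that step. Below is a minimal sketch on synthetic data, assuming an xarray version that ships `weighted` (0.15.1 or later, as far as I know). The small grid and the constant 280 K field are made up for illustration and stand in for `ds.tas`.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for ds.tas: 3 monthly steps on a coarse 5 x 4 lat/lon grid.
time = pd.date_range("1980-01-01", periods=3, freq="MS")
lat = np.array([-90.0, -45.0, 0.0, 45.0, 90.0])
lon = np.array([0.0, 90.0, 180.0, 270.0])
values = np.full((3, 5, 4), 280.0)  # constant field, so the global mean must be 280
data = xr.DataArray(
    values,
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="tas",
)

# cos(lat) weights as a DataArray along 'lat'
weights = np.cos(np.deg2rad(data.lat))
weights.name = "weights"

# weighted(...).mean normalizes by the sum of the weights
global_mean = data.weighted(weights).mean(("lat", "lon"))

# one row per time step, one column called 'tas'
df = global_mean.to_dataframe(name="tas")
```

With the real file, `data = ds.tas` would replace the synthetic array and the rest should carry over unchanged.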

Hope this helps.
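
If you would rather stay with NumPy as in the question, the 1-D result can be given a time index and a name by wrapping it in a pandas Series. The sketch below uses `np.average` with its `weights` argument, which divides by the sum of the weights. The tiny 3-latitude grid and the field values are invented for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up data: 3 time steps, 3 latitudes, 4 longitudes.
time = pd.date_range("1980-01-01", periods=3, freq="MS")
lat = np.array([-60.0, 0.0, 60.0])
nlon = 4
field = np.zeros((3, 3, nlon))
field[:, 1, :] = 1.0  # 1 on the equator row, 0 at 60S and 60N

w = np.cos(np.deg2rad(lat))      # approximately [0.5, 1.0, 0.5]
zonal_mean = field.mean(axis=2)  # average over lon first: shape (time, lat)

# weighted average over lat: sum(w * x) / sum(w)
global_mean = np.average(zonal_mean, axis=1, weights=w)

# attach the time axis and a name, then convert to a DataFrame
series = pd.Series(global_mean, index=time, name="tas")
df = series.to_frame()
```

Here the weighted mean comes out at about 0.5, while an unweighted mean of the same field would be 1/3, which shows how much the cos(lat) weights matter.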

Another option is to use CDO. To get the global mean, you only need to run:

cdo fldmean 'sample_data.nc' out.nc

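If you are on Linux, you can also use my Python package nctoolkit, which uses CDO as a backend (https://nctoolkit.readthedocs.io/en/latest/installing.html). Computing the global mean and then converting it to a pandas DataFrame takes the following: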

import nctoolkit as nc
data = nc.open_data("sample_data.nc")
data.spatial_mean()
pd_ts = data.to_dataframe()

Plotting the time series takes:

data.plot()
