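I need to compute a global time series (time) from a 3D netCDF CF data file (time, lat, lon) and then convert it to a pandas DataFrame. The latitudes need to be weighted by cos(lat). I have been doing the averaging with NumPy, but converting the weighted array to a pandas DataFrame does not work.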
import numpy as np
import xarray as xr

ds=xr.open_dataset('sample_data.nc')
data=ds.tas
start_time='1980-01-01'
end_time='2018-12-31'
time_slice = slice(start_time, end_time)
nrows=len(data.lat.values)
ncols=len(data.lon.values)
t=len(data.time.values)
weights=np.zeros([len(data.lat.values)])
latsr = np.deg2rad(data.lat.values).reshape((nrows,1))
weight_matrix=np.repeat(np.cos(latsr),ncols,axis=1)
wghtpr=np.zeros_like(data)
for i in range(0,t):
    wghtpr[i,:,:]=data[i,:,:]*weight_matrix
new_data=wghtpr
wtdata=np.average(new_data,axis=1)
da=np.average(wtdata,axis=1)
This ends up as a plain NumPy array with no "names" (no labelled dimensions or coordinates).
If I print ds, I get:
<xarray.Dataset>
Dimensions: (bnds: 2, lat: 361, lon: 576, time: 477)
Coordinates:
* time (time) datetime64[ns] 1980-01-16T12:00:00 ... 2019-09-16
* lat (lat) float64 -90.0 -89.5 -89.0 -88.5 ... 88.5 89.0 89.5 90.0
* lon (lon) float64 0.0 0.625 1.25 1.875 ... 357.5 358.1 358.8 359.4
height float64 ...
Dimensions without coordinates: bnds
Data variables:
time_bnds (time, bnds) datetime64[ns] ...
lat_bnds (lat, bnds) float64 ...
lon_bnds (lon, bnds) float64 ...
tas (time, lat, lon) float32 244.15399 244.15399 ... 267.52875
Attributes:
institution: Global Modeling and Assimilation Office, NASA Goddard Sp...
institute_id: NASA-GMAO
experiment_id: MERRA-2
source: MERRA-2 Monthly tavgM_2d_slv_Nx
model_id: GEOS-5
references: http://gmao.gsfc.nasa.gov/research/merra/, http://gmao.g...
tracking_id: e77fd4de-19c2-45ad-afe2-ce3f6c1eb148
mip_specs: CMIP5
source_id: MERRA-2
product: reanalysis
frequency: mon
creation_date: 2015-10-11T23:12:34Z
history: 2015-10-11T23:12:34Z CMOR rewrote data to comply with CF...
Conventions: CF-1.4
project_id: CREATE-IP
table_id: Table Amon_ana (10 March 2011) c3ffdce87438d8df0839620ee...
title: Reanalysis output prepared for CREATE-IP.
modeling_realm: atmos
cmor_version: 2.9.1
doi: http://dx.doi.org/10.5067/AP1B0BA5PD2K
contact: MERRA-2, Steven Pawson (steven.pawson-1@nasa.gov)
#
To take advantage of xarray's broadcasting and alignment, you can do the weighting like this:
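import numpy as np
import xarray as xr

ds=xr.open_dataset('sample_data.nc')
data=ds.tas
#start_time='1980-01-01'
#end_time='2018-12-31'
#time_slice = slice(start_time, end_time)
#nrows=len(data.lat.values)
#ncols=len(data.lon.values)
#t=len(data.time.values)
latsr = np.deg2rad(data.lat) # NumPy ufuncs work directly on a DataArray
weights = np.cos(latsr)
weighted = data * weights # broadcasting here: weights align along lat
# divide by the mean weight so the result is a true weighted mean (assumes no NaNs)
weighted_mean = weighted.mean(['lat','lon']) / weights.mean('lat')
# to pandas; the product has no name, so pass one explicitly
df = weighted_mean.to_dataframe(name='tas')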
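As a cross-check, here is a minimal sketch of the same calculation using xarray's built-in `DataArray.weighted`, which normalises by the sum of the weights for you. It runs on a small synthetic array rather than `sample_data.nc`; the grid, the random values and the variable name `tas` are stand-ins chosen for illustration, not taken from the real file.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for ds.tas: 3 monthly steps on a coarse lat/lon grid.
time = pd.date_range("1980-01-01", periods=3, freq="MS")
lat = np.linspace(-90.0, 90.0, 7)
lon = np.arange(0.0, 360.0, 60.0)
rng = np.random.default_rng(0)
values = 250.0 + 30.0 * rng.random((time.size, lat.size, lon.size))
tas = xr.DataArray(
    values,
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="tas",
)

# cos(lat) weights, labelled by the lat coordinate so xarray can align them.
weights = np.cos(np.deg2rad(tas.lat))
weights.name = "weights"

# weighted(...).mean divides by the sum of the weights over lat and lon.
global_mean = tas.weighted(weights).mean(dim=("lat", "lon"))

# One row per time step, one column holding the global mean.
df = global_mean.to_dataframe(name="tas")
print(df)
```

On the real dataset, replacing the synthetic `tas` with `ds.tas` should be enough. Scalar coordinates such as `height` may show up as an extra column in the DataFrame.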
Hope this helps.
Another option is to use CDO. To get the global mean, you only need to run:
cdo fldmean 'sample_data.nc' out.nc
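If you are on Linux, you can also use my Python package nctoolkit, which uses CDO as a backend (https://nctoolkit.readthedocs.io/en/latest/installing.html). Computing the global mean and then converting it to a pandas DataFrame takes the following: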
import nctoolkit as nc
data = nc.open_data("sample_data.nc")
data.spatial_mean()
pd_ts = data.to_dataframe()
Plotting the time series takes:
data.plot()