如何将scipy.interpolate.interpan函数与xarray(3d)一起使用,以填补nan空白?当前错误



我有点沮丧,因为我找不到解决我的问题的方法,这在r中用包gapfill似乎很容易做到,但在python中更难。

我的问题是:我有一个三维数组,维度是纬度、经度和时间。我想要的是在每个光栅/阵列中插入nan值(由云和其他失真引起(。nan值形成块(由于云(,有时相对较大。我的想法是不仅对每个时间步长的相邻像素进行插值,还对前后的时间步长进行插值(假设前几天和后几天的像素具有实际相似的值,因为陆地覆盖率变化不太快(。我的目标是用相同的像素位置做一个随时间的线性插值。(前后有多少时间步长也是我不确定如何在interpn函数中定义的问题?(

我找到了不同的选择来做这件事,但还没有奏效。我发现最有前途的方法是从带有interpole.interpn函数的软件包scipy中找到的。此函数使用的是numpy数组,而不是xarray。我的尝试:

#change from xarray to numpy 
my array_np = my array.to_numpy()
# lable dimensions (what is done when building a numpy with meshgrid)
x = array_np [0]
y = array_np [1]
z = array_np [2]
#get index of nan values
nanIndex= np.isnan(array_np ).nonzero()
nanIndex
#name dimensions of nan values 
xc= nanIndex[0] 
yc= nanIndex[1]
zc= nanIndex[2] 
# For using the scipy interpolate. interpn function: 
# points = the regular grid  - in my case x,y,z
# values = the data on the regular grid - in my case my array (my_array_np)
# point_nan = the point that is evaluate in the 3D grid - in my case xc, y,c, zy
points = (x, y, z) # dimensions
points_nan = (xc, yc, zc) #nandimensions
print(interpolate.interpn(points, my_array_np, points_nan))

我现在得到的错误是:

"The points in dimension 0 must be strictly ascending"

我哪里错了?感谢您的提前帮助!如果你有其他解决方案,也解决了我的问题,我也很乐意帮助!

这是我的数组的样子:

array([[[      nan,       nan,       nan, ..., 279.64   , 282.16998,
279.66998],
[277.62   , 277.52   , 277.88   , ..., 281.75998, 281.72   ,
281.66   ],
[277.38   , 277.75   , 277.88998, ..., 281.75998, 281.75998,
280.91998],
...,
[      nan,       nan,       nan, ..., 280.72998, 280.33   ,
280.94   ],
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan]],
[[      nan,       nan,       nan, ..., 272.22   , 271.54   ,
271.02   ],
[280.02   , 280.44998, 281.18   , ..., 271.47998, 271.88   ,
272.03   ],
[280.32   , 281.     , 281.27   , ..., 270.83   , 271.58   ,
272.03   ],
...,
[      nan,       nan,       nan, ..., 290.34   , 290.25   ,
288.365  ],
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan]],
[[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[276.44998, 276.19998, 276.19   , ...,       nan,       nan,
nan],
[276.50998, 276.79   , 276.58   , ...,       nan,       nan,
nan],
...,
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan]],
...,
[[      nan,       nan,       nan, ..., 276.38998, 276.44   ,
275.72998],
[      nan,       nan,       nan, ..., 276.55   , 276.81   ,
276.72998],
[      nan,       nan,       nan, ..., 279.74   , 277.11   ,
276.97   ],
...,
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan]],
[[      nan,       nan,       nan, ..., 277.38   , 278.08   ,
277.79   ],
[279.66998, 280.00998, 283.13   , ..., 277.34   , 277.41998,
277.62   ],
[      nan, 277.41   , 277.41   , ..., 277.825  , 277.31   ,
277.52   ],
...,
[      nan,       nan,       nan, ..., 276.52   ,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan]],
[[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
...,
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan],
[      nan,       nan,       nan, ...,       nan,       nan,
nan]]], dtype=float32)

interpn不能用于填充规则网格中的间隙-interpn是将完整的规则网格(没有间隙(插值到不同坐标的快速方法。

为了用N维插值填充缺失值,对非结构化N维数据使用scipy插值方法。

由于您正在将插值到常规网格,我将演示scipy.interpolate.griddata:的使用

import xarray as xr, pandas as pd, numpy as np, scipy.interpolate
# create dummy data
x = y = z = np.linspace(0, 1, 5)
da = xr.DataArray(
np.sin(x).reshape(-1, 1, 1) * np.cos(y).reshape(1, -1, 1) + z.reshape(1, 1, -1),
dims=['x', 'y', 'z'],
coords=[x, y, z],
)

# randomly fill with NaNs
da = da.where(np.random.random(size=da.shape) > 0.1)

这看起来像下面的

In [11]: da
Out[11]:
<xarray.DataArray (x: 5, y: 5, z: 5)>
array([[[0.        , 0.25      , 0.5       , 0.75      , 1.        ],
[0.        , 0.25      , 0.5       , 0.75      , 1.        ],
[0.        , 0.25      , 0.5       , 0.75      , 1.        ],
[0.        , 0.25      , 0.5       , 0.75      , 1.        ],
[0.        , 0.25      , 0.5       , 0.75      , 1.        ]],
[[0.24740396, 0.49740396, 0.74740396, 0.99740396,        nan],
[       nan, 0.48971277,        nan, 0.98971277, 1.23971277],
[0.2171174 , 0.4671174 , 0.7171174 , 0.9671174 , 1.2171174 ],
[0.18102272, 0.43102272, 0.68102272, 0.93102272, 1.18102272],
[       nan, 0.38367293, 0.63367293, 0.88367293, 1.13367293]],
[[0.47942554,        nan, 0.97942554, 1.22942554, 1.47942554],
[0.46452136, 0.71452136, 0.96452136, 1.21452136, 1.46452136],
[0.42073549, 0.67073549, 0.92073549, 1.17073549,        nan],
[0.35079033, 0.60079033, 0.85079033, 1.10079033, 1.35079033],
[       nan, 0.50903472, 0.75903472,        nan, 1.25903472]],
[[0.68163876, 0.93163876, 1.18163876, 1.43163876, 1.68163876],
[0.66044826, 0.91044826,        nan, 1.41044826, 1.66044826],
[0.59819429, 0.84819429, 1.09819429, 1.34819429,        nan],
[       nan, 0.74874749, 0.99874749, 1.24874749,        nan],
[0.36829099, 0.61829099, 0.86829099, 1.11829099, 1.36829099]],
[[0.84147098, 1.09147098,        nan, 1.59147098, 1.84147098],
[0.81531169, 1.06531169, 1.31531169, 1.56531169, 1.81531169],
[0.73846026, 0.98846026, 1.23846026, 1.48846026,        nan],
[0.61569495, 0.86569495, 1.11569495, 1.36569495, 1.61569495],
[0.45464871, 0.70464871, 0.95464871,        nan, 1.45464871]]])
Coordinates:
* x        (x) float64 0.0 0.25 0.5 0.75 1.0
* y        (y) float64 0.0 0.25 0.5 0.75 1.0
* z        (z) float64 0.0 0.25 0.5 0.75 1.0

要使用非结构化scipy插值器,必须将具有缺失值的网格数据转换为没有缺失值的1D点的矢量:

# ravel all points and find the valid ones
points = da.data.ravel()
valid = ~np.isnan(points)
points_valid = points[valid]
# construct arrays of (x, y, z) points, masked to only include the valid points
xx, yy, zz = np.meshgrid(x, y, z)
xx, yy, zz = xx.ravel(), yy.ravel(), zz.ravel()
xxv = xx[valid]
yyv = yy[valid]
zzv = zz[valid]
# feed these into the interpolator, and also provide the target grid
interpolated = scipy.interpolate.griddata(np.stack([xxv, yyv, zzv]).T, points_valid, (xx, yy, zz), method="linear")
# reshape to match the original array and replace the DataArray values with
# the interpolated data
da.values = interpolated.reshape(da.shape)

这导致阵列被填充

In [32]: da
Out[32]:
<xarray.DataArray (x: 5, y: 5, z: 5)>
array([[[0.        , 0.25      , 0.5       , 0.75      , 1.        ],
[0.        , 0.25      , 0.5       , 0.75      , 1.        ],
[0.        , 0.25      , 0.5       , 0.75      , 1.        ],
[0.        , 0.25      , 0.5       , 0.75      , 1.        ],
[0.        , 0.25      , 0.5       , 0.75      , 1.        ]],
[[0.24740396, 0.49740396, 0.74740396, 0.99740396, 1.23971277],
[0.23226068, 0.48971277, 0.73226068, 0.98971277, 1.23971277],
[0.2171174 , 0.4671174 , 0.7171174 , 0.9671174 , 1.2171174 ],
[0.18102272, 0.43102272, 0.68102272, 0.93102272, 1.18102272],
[0.12276366, 0.38367293, 0.63367293, 0.88367293, 1.13367293]],
[[0.47942554, 0.71452136, 0.97942554, 1.22942554, 1.47942554],
[0.46452136, 0.71452136, 0.96452136, 1.21452136, 1.46452136],
[0.42073549, 0.67073549, 0.92073549, 1.17073549, 1.40765584],
[0.35079033, 0.60079033, 0.85079033, 1.10079033, 1.35079033],
[0.24552733, 0.50903472, 0.75903472, 1.00903472, 1.25903472]],
[[0.68163876, 0.93163876, 1.18163876, 1.43163876, 1.68163876],
[0.66044826, 0.91044826, 1.16044826, 1.41044826, 1.66044826],
[0.59819429, 0.84819429, 1.09819429, 1.34819429, 1.57184545],
[0.48324264, 0.74874749, 0.99874749, 1.24874749, 1.48324264],
[0.36829099, 0.61829099, 0.86829099, 1.11829099, 1.36829099]],
[[0.84147098, 1.09147098, 1.34147098, 1.59147098, 1.84147098],
[0.81531169, 1.06531169, 1.31531169, 1.56531169, 1.81531169],
[0.73846026, 0.98846026, 1.23846026, 1.48846026, 1.71550332],
[0.61569495, 0.86569495, 1.11569495, 1.36569495, 1.61569495],
[0.45464871, 0.70464871, 0.95464871, 1.20464871, 1.45464871]]])
Coordinates:
* x        (x) float64 0.0 0.25 0.5 0.75 1.0
* y        (y) float64 0.0 0.25 0.5 0.75 1.0
* z        (z) float64 0.0 0.25 0.5 0.75 1.0

请注意,这填充了整个阵列,因为可用点的凸包覆盖了整个阵列。如果不是这种情况,则可能需要使用最近邻居或将样条曲线拟合到填充数据的第二步。

最新更新