Linearly extrapolate a pandas DataFrame using the built-in interpolate method

Consider the following DataFrame:

df = pd.DataFrame([np.nan, np.nan, 1, 5, np.nan, 6, 6.1, np.nan, np.nan])

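I would like to use the pandas.DataFrame.interpolate method to linearly extrapolate the DataFrame entries in the leading and trailing rows, similar to what I get if I do the following: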

from scipy import interpolate
df_num = df.dropna()
xi = df_num.index.values
yi = df_num.values[:,0]
f = interpolate.interp1d(xi, yi, kind='linear', fill_value='extrapolate')
x = [0, 1 , 7, 8]
print(f(x))
[-7.  -3. 6.2 6.3]

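It seems that the 'linear' option in pandas interpolate calls numpy's interpolation routine, which does not do linear extrapolation. Is there a way to get this behavior from the built-in interpolate method?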

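You can use scipy's interpolation methods directly in pandas. According to the pandas.DataFrame.interpolate documentation, the method argument accepts the interpolation kinds of scipy.interpolate.interp1d.

A solution for your example could look like: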

df.interpolate(method="slinear", fill_value="extrapolate", limit_direction="both")
# Out: 
#      0
# 0 -7.0
# 1 -3.0
# 2  1.0
# 3  5.0
# 4  5.5
# 5  6.0
# 6  6.1
# 7  6.2
# 8  6.3

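You can then easily select any values you are interested in, e.g. df_interpolated.loc[x] (where df_interpolated is the output of the previous code block), indexing with the x variable defined in your question.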
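Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes scipy is installed (pandas needs it for method="slinear"); the names df_interpolated and extrapolated are only illustrative. The selected values should agree with the interp1d result from the question (about -7, -3, 6.2 and 6.3).

```python
import numpy as np
import pandas as pd

# Example frame from the question: NaNs at both ends and one in the middle.
df = pd.DataFrame([np.nan, np.nan, 1, 5, np.nan, 6, 6.1, np.nan, np.nan])

# "slinear" and fill_value are forwarded to scipy.interpolate.interp1d,
# so scipy must be available for this call to work.
df_interpolated = df.interpolate(
    method="slinear", fill_value="extrapolate", limit_direction="both"
)

# Pick out the rows that were extrapolated (x as defined in the question);
# 0 is the label of the only column.
x = [0, 1, 7, 8]
extrapolated = df_interpolated.loc[x, 0]
print(extrapolated.to_numpy())
```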

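Explanation:

method="slinear" - one of the methods listed in the pandas docs that is passed on to scipy's interp1d (see the scipy.interpolate.interp1d documentation)

fill_value="extrapolate" - any option allowed by scipy can be passed here (in this case "extrapolate" is exactly what you want)

limit_direction="both" - extrapolate in both directions (otherwise the default is "forward", in which case you would see np.nan for the first two values)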
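To see the effect of limit_direction, here is a small comparison sketch, again assuming scipy is installed; forward_only and both_ways are illustrative names. With the default direction the two leading NaNs are expected to remain, while the trailing values are still extrapolated.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([np.nan, np.nan, 1, 5, np.nan, 6, 6.1, np.nan, np.nan])

# Default limit_direction: only NaNs after the first valid value are filled.
forward_only = df.interpolate(method="slinear", fill_value="extrapolate")

# limit_direction="both": leading NaNs are filled as well.
both_ways = df.interpolate(
    method="slinear", fill_value="extrapolate", limit_direction="both"
)

# Count how many of the first two rows are still missing in each result.
leading_nans_forward = int(forward_only[0].iloc[:2].isna().sum())
leading_nans_both = int(both_ways[0].iloc[:2].isna().sum())
print(leading_nans_forward, leading_nans_both)
```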
