Difference between pandas interpolation methods

>I need to check the difference between the 'index' and 'linear' interpolation methods

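I created a random pandas Series with missing values and then checked the interpolation results with the linear method and the index method.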

But both return the same result. Should they return the same result? If so, in what cases would I see different results?

import numpy as np
import pandas as pd
s = pd.Series([21, 13, np.nan, 152, np.nan, 46, 98])
s.interpolate(method='index')
s.interpolate(method='linear')

I get the following results:

s.interpolate(method = 'index')
0     21.0
1     13.0
2     82.5
3    152.0
4     99.0
5     46.0
6     98.0
dtype: float64

s.interpolate(method = 'linear')
0     21.0
1     13.0
2     82.5
3    152.0
4     99.0
5     46.0
6     98.0
dtype: float64

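When your index is a range, or its labels are equally spaced, index and linear produce the same result. Try the following example, where the index has uneven gaps: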

s = pd.Series([21,13,np.nan,152,np.nan,46,98],index=[0,1,3,4,7,9,10])
s.interpolate(method = 'index')
Out[536]: 
0      21.000000
1      13.000000
3     105.666667
4     152.000000
7      88.400000
9      46.000000
10     98.000000
dtype: float64
s.interpolate(method = 'linear')
Out[537]: 
0      21.0
1      13.0
3      82.5
4     152.0
7      99.0
9      46.0
10     98.0
dtype: float64
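To see where the two sets of numbers come from, the filled values can be recomputed by hand. This is a minimal sketch: `lerp` is a hypothetical helper written for this illustration, not a pandas function. It takes the two known neighbours of a gap and measures the distance either in index labels or in positions.

```python
def lerp(x, x0, y0, x1, y1):
    """Straight-line interpolation at x between the points (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

# First gap: known neighbours are 13 (label 1, position 1) and 152 (label 4, position 3).
first_by_label = lerp(3, 1, 13, 4, 152)     # gap sits at label 3    -> about 105.67
first_by_position = lerp(2, 1, 13, 3, 152)  # gap sits at position 2 -> 82.5

# Second gap: known neighbours are 152 (label 4, position 3) and 46 (label 9, position 5).
second_by_label = lerp(7, 4, 152, 9, 46)    # gap sits at label 7    -> about 88.4
second_by_position = lerp(4, 3, 152, 5, 46) # gap sits at position 4 -> 99.0

print(first_by_label, first_by_position)
print(second_by_label, second_by_position)
```

The label-based values match the 'index' output above and the position-based values match the 'linear' output.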

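Both the linear and index methods perform linear interpolation on the Series; the difference is which values are treated as the independent variable: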

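method = 'index' uses the numeric index values (if your Series does not specify an index, this defaults to 0, 1, 2, ..., n - 1)
method = 'linear' treats the elements of the Series as equally spaced (ignoring any values specified in the index); this is of course equivalent to using the sequence 0, 1, 2, ..., n - 1 as the independent variable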
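The distinction can be sketched with `np.interp`, which does plain piecewise-linear interpolation over explicit x-coordinates. This is only an illustration of the idea, not a claim about how pandas implements it; the variable names (`mask`, `x_labels`, `x_pos`) are chosen here for the example, and the Series is the unevenly indexed one from the answer above.

```python
import numpy as np
import pandas as pd

s = pd.Series([21, 13, np.nan, 152, np.nan, 46, 98],
              index=[0, 1, 3, 4, 7, 9, 10])

mask = s.notna().to_numpy()      # True where a value is known
y_known = s.to_numpy()[mask]     # the known values only

# Like method='index': the x-coordinates are the index labels.
x_labels = s.index.to_numpy(dtype=float)
by_index = np.interp(x_labels, x_labels[mask], y_known)

# Like method='linear': the x-coordinates are the positions 0, 1, ..., n - 1.
x_pos = np.arange(len(s), dtype=float)
by_position = np.interp(x_pos, x_pos[mask], y_known)

print(by_index)
print(by_position)
```

For this Series, `by_index` agrees with `s.interpolate(method='index')` and `by_position` agrees with `s.interpolate(method='linear')`.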

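Therefore, for any Series whose index is the default (or any other arithmetic progression, e.g. 0, 2, 4, 6, etc.), the two options produce the same result.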
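A quick check of this claim, as a sketch: the same values are placed on an evenly spaced index (0, 2, 4, ...) and on the uneven index used earlier. The helper name `methods_agree` is made up for this example.

```python
import numpy as np
import pandas as pd

values = [21, 13, np.nan, 152, np.nan, 46, 98]

def methods_agree(series):
    """Return True if 'index' and 'linear' interpolation fill the same values."""
    by_index = series.interpolate(method="index").to_numpy()
    by_linear = series.interpolate(method="linear").to_numpy()
    return bool(np.allclose(by_index, by_linear))

evenly_spaced = pd.Series(values, index=[0, 2, 4, 6, 8, 10, 12])
unevenly_spaced = pd.Series(values, index=[0, 1, 3, 4, 7, 9, 10])

print(methods_agree(evenly_spaced))    # True: labels form an arithmetic progression
print(methods_agree(unevenly_spaced))  # False: gaps between labels differ
```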
