Fill NaN values in a pandas DataFrame depending on the values of the cells to their left

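I am trying to fill NaNs with zeros in a very large pandas DataFrame, but only where there is a non-NaN value in the same row in a cell further to the left. For example, starting from this input DataFrame,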

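import numpy as np
import pandas as pd

# Named df rather than input, so it does not shadow the built-in input()
df = pd.DataFrame([[1, np.nan, 1.5, np.nan], [np.nan, 2, np.nan, np.nan]], index=['A', 'B'], columns=['col1', 'col2', 'col3', 'col4'])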

which looks like:

   col1  col2  col3  col4
A   1.0   NaN   1.5   NaN
B   NaN   2.0   NaN   NaN

the expected output is:

   col1  col2  col3  col4
A   1.0     0   1.5     0
B   NaN   2.0     0     0

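Note how [B, col1] stays NaN because there is no non-NaN value to its left, while all four of [A, col2], [A, col4], [B, col3] and [B, col4] are filled with zeros (because each has a non-NaN value somewhere to its left).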

Does anyone know how to do this? Thanks a lot!

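Forward-fill the missing values along each row, test which cells are then non-missing, combine that with the originally missing cells, and assign 0 through the resulting mask: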

df[df.ffill(axis=1).notna() & df.isna()] = 0
print(df)
   col1  col2  col3  col4
A   1.0   0.0   1.5   0.0
B   NaN   2.0   0.0   0.0
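
To see why the mask works, it can help to look at its two parts separately. Below is a minimal sketch using the frame from the question; the names seen_value, mask and result are only illustrative, and df.mask is used here instead of in-place assignment so that the original frame is left untouched.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, np.nan, 1.5, np.nan], [np.nan, 2, np.nan, np.nan]],
    index=["A", "B"],
    columns=["col1", "col2", "col3", "col4"],
)

# True where the cell, or some cell to its left in the same row, holds a value
seen_value = df.ffill(axis=1).notna()

# True only for NaN cells that have a non-NaN value somewhere to their left
mask = seen_value & df.isna()

# mask() returns a new frame with 0 wherever the mask is True; df is unchanged
result = df.mask(mask, 0)
print(result)
```

Whether to assign in place or build a new frame is a matter of taste; for a very large frame the in-place assignment avoids keeping a second copy around.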

Alternatively, you can take the cumulative sum along each row and test for values not equal to 0:

df[df.fillna(0).cumsum(axis=1).ne(0) & df.isna()] = 0
print(df)
   col1  col2  col3  col4
A   1.0   0.0   1.5   0.0
B   NaN   2.0   0.0   0.0
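
One caveat with the cumulative-sum test: it relies on the running sum being non-zero once a value has been seen, so a row that starts with 0, or whose values cancel out (such as 1 followed by -1), would be missed. The sketch below shows a variant that counts non-missing cells instead of summing their values. The frame tricky and the names by_sum and by_count are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row A starts with 0, row B has values that cancel out
tricky = pd.DataFrame(
    [[0, np.nan, 3], [1, -1, np.nan]],
    index=["A", "B"],
    columns=["col1", "col2", "col3"],
)

# Value-based test: the running sum is 0 at [A, col2] and [B, col3],
# so neither of those NaN cells is selected
by_sum = tricky.fillna(0).cumsum(axis=1).ne(0) & tricky.isna()

# Count-based test: at least one non-NaN cell has been seen so far in the row
by_count = tricky.notna().cumsum(axis=1).gt(0) & tricky.isna()

print(tricky.mask(by_count, 0))
```

If the data can never contain zeros or cancelling values, both tests select the same cells, and the forward-fill mask shown first does not depend on the values at all.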
