How to increment a cell's value from the previous row's value in the same column using the pandas shift function

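I have a DataFrame with a boolean column 'Y'. I want to create a new column X that depends on Y and on X itself: if Y is False, X should be 1; if Y is True, X should be the previous row's X value + 1. I need the following output: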

Y     X
False 1
True  2
True  3
False 1
False 1
True  2
True  3

I tried the shift function, df.loc[df['Y']==True,'X'] = df.X.shift(1)+1, but it does not give the desired output. This is what I get:

    Y   X
0   False   1.0
1   True    2.0
2   True    2.0
3   False   1.0
4   False   1.0
5   True    2.0
6   True    2.0

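For the second consecutive True value, X should be incremented from the previous row's already updated X, giving 3 rather than 2.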


I would rather avoid loops/iteration, because I have 5 million rows of data and iterating would take hours of processing time.

import pandas as pd

# build the sample frame
columns = ['Y']
index = 0, 1, 2, 3, 4, 5, 6
df = pd.DataFrame(index=index, columns=columns)
df['Y'] = True
df.loc[0, 'Y'] = False
df.loc[3, 'Y'] = False
df.loc[4, 'Y'] = False
df.loc[:, 'X'] = 1
# the attempt that yields 2 for every True row
df.loc[df['Y']==True, 'X'] = df.X.shift(1) + 1

I'm afraid I could not get shift to handle this case, at least not after many attempts.
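A small sketch of why the single shift assignment stops at 2, assuming the same seven-row frame as in the question (the name rhs is introduced here only for illustration). The right-hand side is evaluated once against the original X column, which is all ones, so every shifted value is 1 + 1 and later rows never see the updated values:

```python
import pandas as pd

# Same sample data as in the question (assumed: Y is a plain bool column).
df = pd.DataFrame({"Y": [False, True, True, False, False, True, True]})
df["X"] = 1

# The right-hand side is computed in one pass from the ORIGINAL X (all ones),
# before any assignment happens. rhs is a hypothetical helper name.
rhs = df["X"].shift(1) + 1
print(rhs.tolist())

# Assigning it to the True rows therefore writes 2 everywhere, never 3.
df.loc[df["Y"], "X"] = rhs
print(df["X"].tolist())
```

Because the assignment is not applied row by row, a running counter needs a cumulative operation rather than a single shift.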

Here is another way to handle it.

## your code for initializing df
import pandas as pd
import numpy as np
columns = ['Y']
index = 0, 1, 2, 3, 4, 5, 6
df = pd.DataFrame(index=index, columns=columns)
df['Y'] = True
df.loc[0, 'Y'] = False
df.loc[3, 'Y'] = False
df.loc[4, 'Y'] = False
df.loc[:, 'X'] = 1
print(df)
### initialization of df ends here
### my code starts here

# create a helper column holding the cumsum of the X column
df['cum'] = df.X.cumsum()
# calculate the offset: -cum at each False row, carried forward over the True rows, plus 1
df['offset'] = df.apply(lambda s: 0 - s.cum if not s.Y else np.nan, axis=1).ffill() + 1
# modify the X column using the cumsum and the calculated offset
df['X'] = df['cum'] + df['offset']
df.X = df.X.astype(int)
# remove the helper columns, leaving only the Y and X columns
df = df[['Y', 'X']]
print(df)

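The output looks like this (the initial frame first, then the result), which I think is what you want. Since the calculation is done with pandas, it should not be as slow as a for-loop in pure Python code: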

       Y  X
0  False  1
1   True  1
2   True  1
3  False  1
4  False  1
5   True  1
6   True  1
       Y  X
0  False  1
1   True  2
2   True  3
3  False  1
4  False  1
5   True  2
6   True  3

You can add print(df) before removing the two helper columns (cum, offset) to see in more detail what the data frame looks like.

After computing the cum and offset columns:

       Y  X  cum  offset
0  False  1    1     0.0
1   True  1    2     0.0
2   True  1    3     0.0
3  False  1    4    -3.0
4  False  1    5    -4.0
5   True  1    6    -4.0
6   True  1    7    -4.0

After updating the X column:

       Y    X  cum  offset
0  False  1.0    1     0.0
1   True  2.0    2     0.0
2   True  3.0    3     0.0
3  False  1.0    4    -3.0
4  False  1.0    5    -4.0
5   True  2.0    6    -4.0
6   True  3.0    7    -4.0
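As an alternative sketch, not part of the answer's code, the same counter can be built without helper columns and without the row-wise apply (which still calls a Python function once per row). The idea is that each False row starts a new block and X is the position within that block. The name block is introduced here for illustration, and the sketch assumes Y has bool dtype. Rows that are True before the first False would simply be counted from 1, a case the question does not specify:

```python
import pandas as pd

# Same sample data as in the question (assumed: Y is a plain bool column).
df = pd.DataFrame({"Y": [False, True, True, False, False, True, True]})

# Every False row starts a new block; the running count of False rows
# gives each block its label.
block = (~df["Y"]).cumsum()

# cumcount numbers the rows inside each block from 0, so add 1.
df["X"] = df.groupby(block).cumcount() + 1
print(df)
```

Both cumsum and groupby(...).cumcount() are vectorized, so this form may scale better to millions of rows, though that is worth timing on the real data.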
