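How can I fill/modify the values of a Series between two different marker values according to a rule, without changing the markers?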


import pandas as pd
import numpy as np
nan = np.nan  # np.NaN was removed in NumPy 2.0; np.nan works in every version
data = [['a1',0,'Bottom_Class'],
['a1',0,nan],
['a1',1,nan],
['a1',1,nan],
['a1',1,nan],
['a1',1,'Top_Class'],
['a1',0,'Bottom_Class'],
['a1',0,'Top_Class'],
['a2',1,nan],
['a2',1,nan],
['a2',1,'Bottom_Class'],
['a2',0,nan],
['a2',0,'Bottom_Class'],
['a1',0,'Top_Class'],
['a2',1,nan],
['a1',1,'Top_Class'],
['a2',1,nan],
['a2',1,nan],
['a2',1,nan],
['a2',0,'Bottom_Class']]
df = pd.DataFrame(data,columns=['Id','State','Class'])
print(df)
    Id  State         Class
0   a1      0  Bottom_Class
1   a1      0           NaN
2   a1      1           NaN
3   a1      1           NaN
4   a1      1           NaN
5   a1      1     Top_Class
6   a1      0  Bottom_Class
7   a1      0     Top_Class
8   a2      1           NaN
9   a2      1           NaN
10  a2      1  Bottom_Class
11  a2      0           NaN
12  a2      0  Bottom_Class
13  a1      0     Top_Class
14  a2      1           NaN
15  a1      1     Top_Class
16  a2      1           NaN
17  a2      1           NaN
18  a2      1           NaN
19  a2      0  Bottom_Class

This is a DataFrame of some stock market prices, but I have modified it to make it easier to understand.

Let's focus only on df.Class:

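My idea is to treat Bottom_Class as a starting point and Top_Class as an end point, and vice versa.

The values after a Top_Class (excluding the marker itself) should be set to 0 until a Bottom_Class is reached.

The values after a Bottom_Class (excluding the marker itself) should be set to 1 until a Top_Class is reached.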

I would like to modify the Series like this:

Class
Bottom_Class
1
1
1
1
Top_Class
Bottom_Class
Top_Class
0
0
Bottom_Class
1
Bottom_Class
Top_Class
0
Top_Class
0
0
0
Bottom_Class

You can use np.where to compute the fill values and fillna so that only the NaN entries are filled:

df.Class.fillna(pd.Series(np.where(df.Class.ffill() == 'Bottom_Class',1,0)))
# Output:
0     Bottom_Class
1                1
2                1
3                1
4                1
5        Top_Class
6     Bottom_Class
7        Top_Class
8                0
9                0
10    Bottom_Class
11               1
12    Bottom_Class
13       Top_Class
14               0
15       Top_Class
16               0
17               0
18               0
19    Bottom_Class
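
To make it easier to see why this works, the one-liner above can be broken into three steps. This is only a sketch: the names `cls`, `last_marker`, `fill_values` and `result` are made up for illustration, the Series is built with `dtype=object` as an assumption so that strings and integers can coexist, and passing `index=cls.index` is an addition of mine that keeps the alignment correct when the index is not the default 0..n-1.

```python
import numpy as np
import pandas as pd

# Same Class column as in the question (Id and State are not needed here).
B, T, nan = 'Bottom_Class', 'Top_Class', np.nan
cls = pd.Series([B, nan, nan, nan, nan, T, B, T, nan, nan,
                 B, nan, B, T, nan, T, nan, nan, nan, B],
                dtype=object, name='Class')

# Step 1: carry the most recent marker forward over the NaN gaps.
last_marker = cls.ffill()

# Step 2: 1 where the most recent marker is Bottom_Class, otherwise 0.
# index=cls.index is an assumption beyond the original answer; it keeps
# fillna aligned even if the index is not 0..n-1.
fill_values = pd.Series(np.where(last_marker == B, 1, 0), index=cls.index)

# Step 3: fillna only touches NaN positions, so the markers are preserved.
result = cls.fillna(fill_values)
print(result)
```

Note that any NaN appearing before the first marker would be filled with 0 here, because ffill leaves it as NaN and the comparison with 'Bottom_Class' is then False.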
