I have a DataFrame as shown below:
Session ID cumulative_prob
s1 1 0.4
s1 3 0.9
s1 4 -0.1
s1 5 0.3
s1 8 1.2
s1 9 0.2
s2 22 0.4
s2 29 0.7
s2 31 1.4
s2 32 0.4
s2 34 0.9
s3 36 0.9
s3 37 -0.1
s3 38 0.2
s3 40 1.0
From this, I want to create a new column that indicates the trend within each session (increase or decrease).
Expected output:
Session ID cumulative_prob Decrease
s1 1 0.4 no
s1 3 0.9 no
s1 4 -0.1 yes
s1 5 0.3 no
s1 8 1.2 no
s1 9 0.2 yes
s2 22 0.4 no
s2 29 0.7 no
s2 31 1.4 no
s2 32 0.4 yes
s2 34 0.9 no
s3 36 0.9 no
s3 37 -0.1 yes
s3 38 0.2 no
s3 40 1.0 no
Note: for the first row of each session, keep the default "no".
IIUC, you can use GroupBy.diff with np.where:
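import numpy as np

# diff() is computed within each session; the first row of each group is NaN,
# and NaN < 0 is False, so that row falls through to 'no'
df['Decrease'] = np.where(df.groupby('Session')['cumulative_prob']
                            .diff()
                            .lt(0),
                          'yes',
                          'no')
print(df)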
Session ID cumulative_prob Decrease
0 s1 1 0.4 no
1 s1 3 0.9 no
2 s1 4 -0.1 yes
3 s1 5 0.3 no
4 s1 8 1.2 no
5 s1 9 0.2 yes
6 s2 22 0.4 no
7 s2 29 0.7 no
8 s2 31 1.4 no
9 s2 32 0.4 yes
10 s2 34 0.9 no
11 s3 36 0.9 no
12 s3 37 -0.1 yes
13 s3 38 0.2 no
14 s3 40 1.0 no
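For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the sample frame from the question and keeps the intermediate diff in a variable. The name `step` is only for illustration and is not part of the answer above. It shows why the first row of each session ends up as "no": diff gives NaN there, and NaN compared with 0 is False.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Session": ["s1"] * 6 + ["s2"] * 5 + ["s3"] * 4,
    "ID": [1, 3, 4, 5, 8, 9, 22, 29, 31, 32, 34, 36, 37, 38, 40],
    "cumulative_prob": [0.4, 0.9, -0.1, 0.3, 1.2, 0.2,
                        0.4, 0.7, 1.4, 0.4, 0.9,
                        0.9, -0.1, 0.2, 1.0],
})

# Row-to-row change within each session (illustrative variable name);
# the first row of every group has no predecessor, so it is NaN
step = df.groupby("Session")["cumulative_prob"].diff()

# NaN < 0 is False, so the first row of each session falls through to "no"
df["Decrease"] = np.where(step.lt(0), "yes", "no")
print(df)
```

This relies on the rows already being in the intended order within each session, since diff compares each row with the one directly before it in the same group.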
We can also use Series.map:
df['Decrease'] = (df.groupby('Session')['cumulative_prob']
                    .diff()
                    .lt(0)
                    .map({True: 'yes', False: 'no'}))
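As a quick sanity check, the two variants can be compared side by side. This is only a sketch on a small made-up frame (the data and the names `is_drop`, `via_where` and `via_map` are assumptions for illustration, not from the question). Because .lt(0) always returns plain booleans, the mapping dict only needs the True and False keys.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with two sessions
df = pd.DataFrame({
    "Session": ["a", "a", "a", "b", "b"],
    "cumulative_prob": [0.2, 0.5, 0.1, 0.9, 0.3],
})

# Boolean mask: True where the value dropped relative to the previous row
# of the same session; the first row of each session is False
is_drop = df.groupby("Session")["cumulative_prob"].diff().lt(0)

# Variant 1: np.where returns a NumPy array, wrapped here to keep the index
via_where = pd.Series(np.where(is_drop, "yes", "no"), index=df.index)

# Variant 2: Series.map returns a Series aligned to the original index
via_map = is_drop.map({True: "yes", False: "no"})

print((via_where == via_map).all())
```

Both produce the same labels. The map version stays a pandas Series throughout, while np.where goes through a NumPy array.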