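I found this post very useful, and I am trying to do the same thing within groups.
This is the original post, which increments a counter each time df['step'] is 6: link
In my case, I want to increment the counter each time a 1 appears.
So I changed this expression: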
df['counter'] = ((df['step']==6) & (df.shift(1)['step']!=6 )).cumsum()
to this:
df['counter_2'] = ((df['counter1'] == 1) & (df.shift(1)['counter1'] != 1)).cumsum()
Now I am trying to do the same with a groupby on 'prd_id'.
Updated answer
df['counter'] = df['step'].eq(1).groupby(df['prd_id']).cumsum()
Output:
prd_id step counter
0 A 1 1
1 A 2 1
2 A 3 1
3 A 4 1
4 A 1 2
5 A 2 2
6 B 1 1
7 B 1 2
8 B 2 2
9 B 1 3
10 B 2 3
11 B 3 3
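For reference, here is a minimal, self-contained sketch that rebuilds the frame shown above and applies the same one-liner. The second column, counter_runs, is an assumption on my part rather than something the answer requires: it is a per-group version of the shift-based expression from the question, and it counts only the first row of each run of consecutive 1s. The names prev_step and counter_runs are made up for illustration.

```python
import pandas as pd

# Sample data matching the output table above.
df = pd.DataFrame({
    "prd_id": list("AAAAAA") + list("BBBBBB"),
    "step": [1, 2, 3, 4, 1, 2, 1, 1, 2, 1, 2, 3],
})

# Count every row where step == 1, restarting the count for each prd_id.
df["counter"] = df["step"].eq(1).groupby(df["prd_id"]).cumsum()

# Variant (assumption): count only the start of each run of 1s.
# Shifting within each group keeps the last row of one prd_id
# from leaking into the first row of the next.
prev_step = df.groupby("prd_id")["step"].shift()
run_start = df["step"].eq(1) & prev_step.ne(1)
df["counter_runs"] = run_start.groupby(df["prd_id"]).cumsum()

print(df)
```

The two columns differ only where 1s are adjacent, as in rows 6 and 7 of group B.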
Original answer
You can use duplicated, boolean NOT (~), and cumsum:
df['counter'] = (~df['step'].duplicated()).cumsum()
Output:
step counter
0 2 1
1 2 1
2 2 1
3 3 2
4 4 3
5 4 3
6 5 4
7 6 5
8 6 5
9 6 5
10 6 5
11 7 6
12 5 6 # not incrementing, 5 was seen above
13 6 6 # not incrementing, 6 was seen above
14 6 6
15 6 6
16 7 6 # not incrementing, 7 was seen above
17 5 6 # not incrementing, 5 was seen above
18 6 6 # not incrementing, 6 was seen above
19 7 6 # not incrementing, 7 was seen above
20 5 6 # not incrementing, 5 was seen above
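A runnable sketch of the same idea, using the step values from the table above. The intermediate name first_seen is only there for readability and is not part of the original answer.

```python
import pandas as pd

# Step values taken from the output table above.
df = pd.DataFrame({
    "step": [2, 2, 2, 3, 4, 4, 5, 6, 6, 6, 6, 7, 5, 6, 6, 6, 7, 5, 6, 7, 5],
})

# duplicated() is False the first time a value appears and True afterwards,
# so its negation marks the first occurrence of each distinct value.
first_seen = ~df["step"].duplicated()

# The running sum of those marks grows by 1 for each new value.
df["counter"] = first_seen.cumsum()

print(df)
```

Because duplicated looks at the whole column, the counter ends up equal to the number of distinct values seen so far, which is why it stops growing after row 11.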
If you also have groups, use:
df['counter'] = (~df[['step', 'group']].duplicated()).groupby(df['group']).cumsum()
Example:
group step counter
0 A 1 1
1 A 2 2
2 A 2 2
3 A 3 3
4 A 2 3
5 A 4 4
6 B 1 1 # first time in B
7 B 1 1
8 B 2 2
9 B 1 2 # duplicated in B
10 B 2 2
11 B 3 3
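The grouped version can be reproduced with the sketch below, built from the values in the table above. The transform-based column named check is an assumption added purely as a cross-check; it should give the same result by applying the ungrouped recipe to each group separately.

```python
import pandas as pd

# Sample data matching the example table above.
df = pd.DataFrame({
    "group": list("AAAAAA") + list("BBBBBB"),
    "step": [1, 2, 2, 3, 2, 4, 1, 1, 2, 1, 2, 3],
})

# Mark the first time each (group, step) pair appears.
first_in_group = ~df.duplicated(subset=["group", "step"])

# Running count of new step values, restarted for each group.
df["counter"] = first_in_group.groupby(df["group"]).cumsum()

# Cross-check (assumption): apply the ungrouped recipe per group.
df["check"] = df.groupby("group")["step"].transform(
    lambda s: (~s.duplicated()).cumsum()
)

print(df)
```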