I want to extract from this DataFrame only the sequences where a 15 is followed by a 25, as in the example below:
Step
25
15 <--
25 <--
15 <--
25 <--
25
25
25
15
15
15 <--
25 <--
15
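The problem is that 25 or 15 can repeat irregularly several times in a row (for example 15-15-25), so I don't know how to handle it. In short, I want the 15 that comes immediately before a 25, together with that 25.
The result must be: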
Step
15 <--
25 <--
15 <--
25 <--
15 <--
25 <--
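Create virtual groups for groupby by starting a new group at every 15, then keep the first two values (15, 25) of each group whose length is greater than 1.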
>>> # Each 15 starts a new group; keep the first two rows of groups longer than 1.
>>> (df.groupby(df['Step'].eq(15).cumsum())
...    .apply(lambda x: x[:2] if len(x) > 1 else None)
...    .droplevel(0).rename_axis(None))
    Step
1     15
2     25
3     15
4     25
10    15
11    25
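For something that can be pasted and run as-is, here is a minimal sketch of the same grouping idea. It assumes the question's data lives in a single Step column. It also swaps the apply call for cumcount and transform('size'), which is a substitution of mine rather than the code above. The names group_id, group_size and position are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Step": [25, 15, 25, 15, 25, 25, 25, 25, 15, 15, 15, 25, 15]})

# A new group starts at every 15, so each group is one 15 plus the 25s after it.
group_id = df["Step"].eq(15).cumsum()

# Size of the group each row belongs to, broadcast back onto the rows.
group_size = group_id.groupby(group_id).transform("size")

# Zero-based position of each row inside its group.
position = df.groupby(group_id).cumcount()

# Keep the first two rows of every group that has more than one row.
result = df[(group_size > 1) & (position < 2)]
print(result)
```

On the sample data this selects the rows with index 1, 2, 3, 4, 10 and 11, the same rows as the apply version.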
Details:
>>> pd.concat([df['Step'], df['Step'].eq(15).cumsum()], axis=1)
    Step  Step
0     25     0  # drop, only one item in the group
1     15     1  # keep, first item of a group where length > 1
2     25     1  # keep, second item of a group where length > 1
3     15     2  # keep, first item of a group where length > 1
4     25     2  # keep, second item of a group where length > 1
5     25     2  # drop, third item of a group where length > 1
6     25     2  # drop, fourth item of a group where length > 1
7     25     2  # drop, fifth item of a group where length > 1
8     15     3  # drop, only one item in the group
9     15     4  # drop, only one item in the group
10    15     5  # keep, first item of a group where length > 1
11    25     5  # keep, second item of a group where length > 1
12    15     6  # drop, only one item in the group
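One caveat that the table makes visible: group 0 is dropped only because the data starts with a single 25. If it started with two or more 25s, group 0 would have length > 1 and its first two rows would be kept. Below is a sketch of an alternative that does not rely on group lengths and instead compares each row with its neighbours via shift. This is a different technique from the groupby above, and keep_pairs is a hypothetical helper name.

```python
import pandas as pd

def keep_pairs(df: pd.DataFrame) -> pd.DataFrame:
    """Keep each 15 directly followed by a 25, plus that 25 (assumes a 'Step' column)."""
    step = df["Step"]
    # A 15 whose next row is a 25.
    is_pair_start = step.eq(15) & step.shift(-1).eq(25)
    # A 25 whose previous row is a 15.
    is_pair_end = step.eq(25) & step.shift(1).eq(15)
    return df[is_pair_start | is_pair_end]

# Sample data copied from the question.
df = pd.DataFrame({"Step": [25, 15, 25, 15, 25, 25, 25, 25, 15, 15, 15, 25, 15]})
print(keep_pairs(df))
```

On the question's data this keeps the same rows (index 1, 2, 3, 4, 10 and 11), and leading 25s are ignored however many there are.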