How to retrieve only the rows that follow a specific number sequence? Python Pandas



From this DataFrame I only want to extract the sequences where a 15 is followed by a 25, as marked in the example below:

Step
25
15 <--
25 <--
15 <--
25 <--
25
25
25
15
15
15 <--
25 <--
15

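The problem is that 25 or 15 can repeat consecutively an irregular number of times (for example: 15-15-25), so I don't know how to handle it. Overall, I want the 15 that comes immediately before a 25, together with that 25.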

The result must be:

Step
15 <--
25 <--
15 <--
25 <--
15 <--
25 <--

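Use groupby to create virtual groups, where each 15 starts a new group, then keep the first two values (15, 25) of every group whose length is greater than 1.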

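>>> (df.groupby(df['Step'].eq(15).cumsum())  # a new group starts at every 15
...    .apply(lambda x: x[:2] if len(x) > 1 else None)  # groups returning None are dropped
...    .droplevel(0).rename_axis(None))  # remove the group key from the index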
    Step
1     15
2     25
3     15
4     25
10    15
11    25
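
As a cross-check, the same rows can also be selected without groupby by comparing each row with its neighbour. This is a minimal sketch of an alternative technique (a shift-based boolean mask), not the method used in the answer above; the helper name `keep_15_then_25` and its parameters are hypothetical, and the DataFrame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Step": [25, 15, 25, 15, 25, 25, 25, 25, 15, 15, 15, 25, 15]})


def keep_15_then_25(frame, first=15, second=25, col="Step"):
    """Keep each `first` row that is directly followed by `second`, plus that `second` row."""
    s = frame[col]
    # True on a `first` row whose next row equals `second`.
    starts = s.eq(first) & s.shift(-1).eq(second)
    # Also keep the row right after each such start.
    mask = starts | starts.shift(1, fill_value=False)
    return frame[mask]


result = keep_15_then_25(df)
print(result)
```

On the sample data this keeps the rows with index 1, 2, 3, 4, 10 and 11, the same as the groupby version. One difference to be aware of: if the data started with two or more 25s, the groupby version would treat them as a group of length greater than 1 and keep the first two, whereas the mask only keeps a 25 that directly follows a 15.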

Details:

>>> pd.concat([df['Step'], df['Step'].eq(15).cumsum()], axis=1)
    Step  Step
0     25     0  # drop, only one item in the group
1     15     1  # keep, first item of a group where length > 1
2     25     1  # keep, second item of a group where length > 1
3     15     2  # keep, first item of a group where length > 1
4     25     2  # keep, second item of a group where length > 1
5     25     2  # drop, third item of a group where length > 1
6     25     2  # drop, fourth item of a group where length > 1
7     25     2  # drop, fifth item of a group where length > 1
8     15     3  # drop, only one item in the group
9     15     4  # drop, only one item in the group
10    15     5  # keep, first item of a group where length > 1
11    25     5  # keep, second item of a group where length > 1
12    15     6  # drop, only one item in the group
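
The keep/drop decisions annotated above can be made explicit as columns. The following is a sketch that rebuilds the sample data and computes, for every row, its group id, its position inside the group and the group size; the column names `group`, `pos`, `size` and `keep` are chosen here for illustration and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Step": [25, 15, 25, 15, 25, 25, 25, 25, 15, 15, 15, 25, 15]})

# Every 15 increments the counter, so each 15 opens a new group.
group = df["Step"].eq(15).cumsum().rename("group")

detail = pd.DataFrame({
    "Step": df["Step"],
    "group": group,
    # 0-based position of the row inside its group.
    "pos": df.groupby(group).cumcount(),
    # Number of rows in the group, repeated on every row of that group.
    "size": df.groupby(group)["Step"].transform("size"),
})

# Same rule as the lambda: first two rows of groups with more than one row.
detail["keep"] = (detail["size"] > 1) & (detail["pos"] < 2)
kept = df[detail["keep"]]
print(detail)
print(kept)
```

Filtering on `keep` selects the same rows as the `apply` call, and the intermediate columns make it easier to see why a given row is dropped.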
