How to delete certain rows within each group in pandas



I have a DataFrame and want to delete certain rows for each category. Here is the data:

import pandas as pd

data = {'GROUP': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
        'DATE': ['202101', '202102', '202103', '201907', '201908', '201909',
                 '201910', '202003', '202004', '202005', '202006', '202007']}
df = pd.DataFrame(data, columns=['GROUP', 'DATE'])

   GROUP    DATE
0      A  202101
1      A  202102
2      A  202103
3      B  201907
4      B  201908
5      B  201909
6      B  201910
7      C  202003
8      C  202004
9      C  202005
10     C  202006
11     C  202007

I want to delete all rows after the second date in each group. In other words, I want to end up with this:

  GROUP    DATE
0     A  202101
1     A  202102
3     B  201907
4     B  201908
7     C  202003
8     C  202004

Use GroupBy.head:

df.groupby('GROUP').head(2)

  GROUP    DATE
0     A  202101
1     A  202102
3     B  201907
4     B  201908
7     C  202003
8     C  202004
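
GroupBy.head(2) keeps the first two rows of each group in whatever order they appear, so it only returns the two earliest dates when the rows are already sorted by DATE within each group. If that is not guaranteed, sorting first is one way to handle it. The sketch below is an illustration under that assumption; the shuffled frame is a hypothetical stand-in for unsorted input and is not part of the original question.

```python
import pandas as pd

data = {'GROUP': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
        'DATE': ['202101', '202102', '202103', '201907', '201908', '201909',
                 '201910', '202003', '202004', '202005', '202006', '202007']}
df = pd.DataFrame(data, columns=['GROUP', 'DATE'])

# Hypothetical unsorted input: shuffle the rows to simulate arbitrary order.
shuffled = df.sample(frac=1, random_state=0)

# Sort by group and date, then keep the first two rows of each group.
# DATE holds fixed-width 'YYYYMM' strings, so string order matches date order.
result = shuffled.sort_values(['GROUP', 'DATE']).groupby('GROUP').head(2)
print(result)
```

The original row labels are kept, so the result can still be matched back to df.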

Group the DataFrame by GROUP and apply a function that slices off only the first two values.

>>> df.groupby(['GROUP'])['DATE'].apply(lambda x: x[:2]).droplevel(-1).reset_index()
  GROUP    DATE
0     A  202101
1     A  202102
2     B  201907
3     B  201908
4     C  202003
5     C  202004
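
Note that the apply-based version renumbers the index from 0 to 5, whereas GroupBy.head keeps the original row labels. If the original labels matter, or if a reusable boolean mask is more convenient, GroupBy.cumcount is another option. This is a minimal sketch, assuming the rows are already in the desired order within each group; the names position and kept are only illustrative.

```python
import pandas as pd

data = {'GROUP': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
        'DATE': ['202101', '202102', '202103', '201907', '201908', '201909',
                 '201910', '202003', '202004', '202005', '202006', '202007']}
df = pd.DataFrame(data, columns=['GROUP', 'DATE'])

# cumcount numbers the rows within each group starting at 0.
position = df.groupby('GROUP').cumcount()

# Keep rows whose within-group position is 0 or 1 (the first two per group).
kept = df[position < 2]
print(kept)
```

Inverting the mask (position >= 2) selects exactly the rows that would be dropped, which can be handy for checking what is being removed.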
