How to keep only the first or last n occurrences of each value in a pandas DataFrame



I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame()
df['Name'] = ['Ankita', 'Ankita', 'Ankita', 'Ankita', 'Ankita', 'Yashvardhan', 'Yashvardhan', 'Yashvardhan', 'Yashvardhan', 'Yashvardhan']
df['Date'] = ['2014-10-07', '2015-03-30', '2015-12-07', '2015-12-09', '2017-01-30', '2017-01-30', '2018-02-19', '2018-02-23', '2018-11-19', '2020-01-23']
df['Value'] = [2200, 75, 100, 22, 98, 0.36, 57, 29, 1026, 1296]
# Convert the date strings to datetime64 values
df['Date'] = pd.to_datetime(df['Date'])

          Name       Date    Value
0       Ankita 2014-10-07  2200.00
1       Ankita 2015-03-30    75.00
2       Ankita 2015-12-07   100.00
3       Ankita 2015-12-09    22.00
4       Ankita 2017-01-30    98.00
5  Yashvardhan 2017-01-30     0.36
6  Yashvardhan 2018-02-19    57.00
7  Yashvardhan 2018-02-23    29.00
8  Yashvardhan 2018-11-19  1026.00
9  Yashvardhan 2020-01-23  1296.00

How can I keep only the first 3 rows for each unique name? That is, how can I end up with a DataFrame like this:

          Name       Date    Value
0       Ankita 2014-10-07  2200.00
1       Ankita 2015-03-30    75.00
2       Ankita 2015-12-07   100.00
5  Yashvardhan 2017-01-30     0.36
6  Yashvardhan 2018-02-19    57.00
7  Yashvardhan 2018-02-23    29.00

How can I keep only the two most recent rows for each unique name? That is, how can I end up with a DataFrame like this:

          Name       Date    Value
3       Ankita 2015-12-09    22.00
4       Ankita 2017-01-30    98.00
8  Yashvardhan 2018-11-19  1026.00
9  Yashvardhan 2020-01-23  1296.00

Thanks in advance!

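You can use .groupby() together with GroupBy.head() or GroupBy.tail(), as shown below. Both keep the original row order and index, so tail(2) returns the two most recent rows only when the rows within each name are already sorted by Date, as they are here: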

df.groupby('Name').head(3)
          Name       Date    Value
0       Ankita 2014-10-07  2200.00
1       Ankita 2015-03-30    75.00
2       Ankita 2015-12-07   100.00
5  Yashvardhan 2017-01-30     0.36
6  Yashvardhan 2018-02-19    57.00
7  Yashvardhan 2018-02-23    29.00

df.groupby('Name').tail(2)
          Name       Date   Value
3       Ankita 2015-12-09    22.0
4       Ankita 2017-01-30    98.0
8  Yashvardhan 2018-11-19  1026.0
9  Yashvardhan 2020-01-23  1296.0
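
If the rows might not already be in date order, a minimal sketch along the following lines may help. It rebuilds the DataFrame from the question, shuffles it to mimic unsorted input (the shuffling and the names shuffled, latest_two and first_three are assumptions added for illustration, not part of the question), sorts by Date before calling tail(2), and also shows GroupBy.cumcount() as an alternative way to select rows by their position within each name.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'Name': ['Ankita'] * 5 + ['Yashvardhan'] * 5,
    'Date': pd.to_datetime([
        '2014-10-07', '2015-03-30', '2015-12-07', '2015-12-09', '2017-01-30',
        '2017-01-30', '2018-02-19', '2018-02-23', '2018-11-19', '2020-01-23',
    ]),
    'Value': [2200, 75, 100, 22, 98, 0.36, 57, 29, 1026, 1296],
})

# Hypothetical scenario: the rows arrive in arbitrary order.
shuffled = df.sample(frac=1, random_state=0)

# Sort by Date first so that tail(2) really means "the two most recent rows",
# then restore the original index order for readability.
latest_two = (
    shuffled.sort_values('Date')
            .groupby('Name')
            .tail(2)
            .sort_index()
)

# cumcount() numbers the rows within each group 0, 1, 2, ... in row order,
# so a boolean mask on it keeps the first 3 rows per name.
first_three = df[df.groupby('Name').cumcount() < 3]

print(latest_two)
print(first_three)
```

The cumcount() mask selects the same rows as head(3) here, but it can be adapted to other positions, for example == 2 to keep only the third row of each name.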
