Finding nlargest(2) with the corresponding values after a groupby



I have a DataFrame that looks like this:

Datetime             Volume       Price
2020-08-05 09:15:00  1033         504
2020-08-05 09:15:00  1960         516
2020-08-05 09:15:00  1724         520
2020-08-05 09:15:00  1870         540
2020-08-05 09:20:00  1024         576
2020-08-05 09:20:00  1960         548
2020-08-05 09:20:00  1426         526
2020-08-05 09:20:00  1968         518
2020-08-05 09:30:00  1458         511
2020-08-05 09:30:00  1333         534
2020-08-05 09:30:00  1322         555
2020-08-05 09:30:00  1425         567
2020-08-05 09:30:00  1245         598

After grouping by the Datetime column, I want to find the two largest Volume values in each group, along with their corresponding Price.

The resulting DataFrame should look like this:

Datetime             Volume       Price
2020-08-05 09:15:00  1960         516
2020-08-05 09:15:00  1870         540
2020-08-05 09:20:00  1960         548
2020-08-05 09:20:00  1968         518
2020-08-05 09:30:00  1458         511
2020-08-05 09:30:00  1425         567

Use sort_values before the groupby:

print(df.sort_values("Volume", ascending=False)
        .groupby("Datetime").head(2).sort_index())

               Datetime  Volume  Price
1   2020-08-05 09:15:00    1960    516
3   2020-08-05 09:15:00    1870    540
5   2020-08-05 09:20:00    1960    548
7   2020-08-05 09:20:00    1968    518
8   2020-08-05 09:30:00    1458    511
11  2020-08-05 09:30:00    1425    567

Using groupby.rank + boolean indexing:

df[df.groupby("Datetime")['Volume'].rank(ascending=False).le(2)]

               Datetime  Volume  Price
1   2020-08-05 09:15:00    1960    516
3   2020-08-05 09:15:00    1870    540
5   2020-08-05 09:20:00    1960    548
7   2020-08-05 09:20:00    1968    518
8   2020-08-05 09:30:00    1458    511
11  2020-08-05 09:30:00    1425    567
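
One caveat with the rank approach, as far as I can tell: rank defaults to method="average", so tied volumes share a fractional rank. If two rows tie for second place, both get rank 2.5 and .le(2) drops both of them. The data in the question has no ties within a group, so the output above is unaffected. A minimal sketch on made-up data (the toy frame and its values are hypothetical, not from the question), using method="first" to break ties by order of appearance:

```python
import pandas as pd

# Hypothetical data: a single group with a tie for second place.
toy = pd.DataFrame({
    "Datetime": ["09:15"] * 4,
    "Volume": [10, 8, 8, 5],
    "Price": [1, 2, 3, 4],
})

grouped = toy.groupby("Datetime")["Volume"]

# Default method="average": both 8s get rank 2.5, so only the 10 passes le(2).
avg_mask = grouped.rank(ascending=False).le(2)

# method="first": tied values are ranked in order of appearance,
# so exactly two rows per group pass le(2).
first_mask = grouped.rank(method="first", ascending=False).le(2)

top2_avg = toy[avg_mask]
top2_first = toy[first_mask]
```

Whether a tie should yield one, two, or more rows depends on what you need; method="first" is only one way to guarantee exactly two rows per group.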

Since you mentioned nlargest:

out = df.groupby('Datetime',as_index=False).apply(lambda x : x.nlargest(2, columns=['Volume']))
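
An alternative that avoids apply is to call nlargest on the grouped Volume column and then use the surviving row labels to select the full rows from df. This is only a sketch and not necessarily what the answer above intended. It rebuilds the question's data so that it runs on its own, and the names largest and top2 are just illustrative:

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "Datetime": pd.to_datetime(
        ["2020-08-05 09:15:00"] * 4
        + ["2020-08-05 09:20:00"] * 4
        + ["2020-08-05 09:30:00"] * 5
    ),
    "Volume": [1033, 1960, 1724, 1870, 1024, 1960, 1426, 1968,
               1458, 1333, 1322, 1425, 1245],
    "Price": [504, 516, 520, 540, 576, 548, 526, 518,
              511, 534, 555, 567, 598],
})

# For this data, the result is a Series indexed by (Datetime, original row label).
largest = df.groupby("Datetime")["Volume"].nlargest(2)

# The last index level holds the original row labels; use them to pull full rows.
top2 = df.loc[largest.index.get_level_values(-1)].sort_index()
print(top2)
```

The trailing sort_index() restores the original row order; drop it to keep each group sorted by descending Volume.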
