
How to extract subset of a dataframe that has the largest maximum row value within a subset of a subset?

Tags: python pandas dataframe pandas-groupby
Updated: 2023-09-21

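I have a dataframe (df) that contains data points along rivers. Each basin (Basin_ID) contains multiple rivers (River_ID), and each river has points along its length (Length). It looks like this (simplified):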

Basin_ID	River_ID	Length
1		1	1
1	2	5
1	2	7
1	2	12
1	3	5
2		2	1	10
2	1	12
2	1	14
2

There may be a smarter way, but I reproduced your output in two steps.

First, I create a new column that holds the maxRiverLength for each Basin_ID and assign the result to df2, with the index set to Basin_ID and River_ID:

df2 = df.assign(maxRiverLength=df.groupby('Basin_ID')['Length'].transform('max')).set_index(['Basin_ID', 'River_ID'])

Then I take the original df, set its index to Basin_ID and River_ID, and filter it by the index of the df2 rows where Length == maxRiverLength:

df.set_index(['Basin_ID', 'River_ID']).loc[df2[df2['Length'] == df2['maxRiverLength']].index].reset_index()
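To try the two steps above end to end, here is a minimal sketch. The sample values are hypothetical, made up only for illustration (they are not the asker's table), and the intermediate name keys and the drop_duplicates call are my own additions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration (not the asker's table).
df = pd.DataFrame({
    "Basin_ID": [1, 1, 1, 1, 2, 2, 2, 2],
    "River_ID": [1, 2, 2, 3, 1, 1, 2, 2],
    "Length":   [1, 5, 12, 5, 10, 14, 3, 8],
})

# Step 1: broadcast each basin's maximum Length to every row of that basin.
df2 = df.assign(
    maxRiverLength=df.groupby("Basin_ID")["Length"].transform("max")
).set_index(["Basin_ID", "River_ID"])

# (Basin_ID, River_ID) keys of the rows that hold a basin maximum.
# drop_duplicates (an addition) avoids repeating rows if a key shows up twice.
keys = df2[df2["Length"] == df2["maxRiverLength"]].index.drop_duplicates()

# Step 2: keep every row of the rivers identified above.
result = df.set_index(["Basin_ID", "River_ID"]).loc[keys].reset_index()
print(result)
```

If two rivers in the same basin share the maximum Length, this approach keeps the rows of both rivers.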
