Select only the first of the highest values for each entry in pandas



Here is my data:

Column       IV         Source
RRD          5.795765   Personal_Demographics
RRD          5.795765   Cust360_Agreement
RRD          5.792729   External_Data
WO           4.361066   Cust360_Asset
Rating       3.600918   Personal_Demographics

My expected result:

Column       IV         Source
RRD          5.795765   Personal_Demographics
WO           4.361066   Cust360_Asset
Rating       3.600918   Personal_Demographics

What I tried:

inds = df.groupby(['Column'])['IV'].transform(max) == df['IV']

But filtering with this mask gives:

Column       IV         Source
RRD          5.795765   Personal_Demographics
RRD          5.795765   Cust360_Agreement
WO           4.361066   Cust360_Asset
Rating       3.600918   Personal_Demographics

The first entry (RRD) has two rows tied for the highest value, but I need only one row per entry, like this:

Column       IV         Source
RRD          5.795765   Personal_Demographics
WO           4.361066   Cust360_Asset
Rating       3.600918   Personal_Demographics

Regards

Try sort_values followed by drop_duplicates, which keeps only the first row for each Column value after sorting:

out = df.sort_values('IV',ascending=False).drop_duplicates('Column')
Out[121]: 
   Column        IV                 Source
0     RRD  5.795765  Personal_Demographics
3      WO  4.361066          Cust360_Asset
4  Rating  3.600918  Personal_Demographics
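One caveat about ties: which of the two tied RRD rows survives depends on how the sort orders equal values. The sketch below rebuilds the sample frame from the question (the construction itself is my assumption about how the data is stored) and passes kind="mergesort", a stable sort, so tied rows should keep their original relative order and the first one is kept.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Column": ["RRD", "RRD", "RRD", "WO", "Rating"],
    "IV": [5.795765, 5.795765, 5.792729, 4.361066, 3.600918],
    "Source": ["Personal_Demographics", "Cust360_Agreement", "External_Data",
               "Cust360_Asset", "Personal_Demographics"],
})

# A stable sort keeps rows with equal IV in their original relative order,
# so drop_duplicates (keep="first" by default) retains the earliest tied row.
out = (
    df.sort_values("IV", ascending=False, kind="mergesort")
      .drop_duplicates("Column")
)
print(out)
```

On a frame this small the default sort happens to behave the same way; the explicit kind only matters as a safeguard on larger data.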

If you would rather use groupby:

df.sort_values('IV',ascending=False).groupby(['Column']).head(1)
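Another groupby-based option, not part of the approach above, is idxmax, which returns the index label of the first occurrence of the maximum within each group. This is a minimal sketch, assuming IV has no missing values and the index labels are unique; the frame is rebuilt from the question's sample data.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Column": ["RRD", "RRD", "RRD", "WO", "Rating"],
    "IV": [5.795765, 5.795765, 5.792729, 4.361066, 3.600918],
    "Source": ["Personal_Demographics", "Cust360_Agreement", "External_Data",
               "Cust360_Asset", "Personal_Demographics"],
})

# idxmax gives, per group, the label of the first row holding the maximum IV.
# sort=False keeps groups in order of first appearance instead of sorting keys.
first_max_labels = df.groupby("Column", sort=False)["IV"].idxmax()

# Select those rows from the original frame.
first_max = df.loc[first_max_labels]
print(first_max)
```

This avoids sorting the whole frame, and the tie is resolved by row position rather than by sort behaviour.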
