Sort a pandas DataFrame by the results of aggregated data (i.e. according to the index() of a shorter list)



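I have a data table that I want to sort before displaying it. The sort is based on an aggregate. In the example below I count some cars, and I want to sort the DataFrame in ascending order of the total number of cars per manufacturer, e.g. "mercedes" followed by "bmw" followed by "audi".

I want to keep all the original columns; the aggregation is only used to determine the sort order.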

import pandas as pd
df = pd.DataFrame({
    "manufacturer": ["bmw", "bmw", "bmw", "audi", "audi", "audi", "audi", "mercedes", "mercedes"],
    "color": ["red", "blue", "silver", "red", "blue", "black", "blue", "green", "white"],
    "count": [5, 4, 7, 1, 1, 9, 7, 5, 3],
    "misc": ["a", "bc", "def", "gh", "ijk", "l", "mnop", "qrst", "uvwxyz"],
})
# order manufacturers by increasing total car counts
order = list(df.groupby("manufacturer").agg({"count": "sum"}).sort_values(by="count", ascending=True).index)
# apply ordering to original dataframe based on manufacturer's position in 'order' list
df.sort_values(by="manufacturer", key=lambda x: x.apply(order.index))

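Is there a more Pythonic/pandas-like way to do this? The approach above makes me feel like I am working too hard for something that should be simple.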

Let's try transform with argsort:

out = df.iloc[df.groupby("manufacturer")['count'].transform('sum').argsort()]
Out[207]: 
  manufacturer   color  count    misc
7     mercedes   green      5    qrst
8     mercedes   white      3  uvwxyz
0          bmw     red      5       a
1          bmw    blue      4      bc
2          bmw  silver      7     def
3         audi     red      1      gh
4         audi    blue      1     ijk
5         audi   black      9       l
6         audi    blue      7    mnop
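
To see why the one-liner above works, it may help to split it into steps. The sketch below is only an illustration: the intermediate names totals and positions are not part of the answer, and kind="stable" is an added assumption so that rows within one manufacturer keep their original relative order (the default sort algorithm does not promise that).

```python
import pandas as pd

df = pd.DataFrame({
    "manufacturer": ["bmw", "bmw", "bmw", "audi", "audi", "audi", "audi", "mercedes", "mercedes"],
    "color": ["red", "blue", "silver", "red", "blue", "black", "blue", "green", "white"],
    "count": [5, 4, 7, 1, 1, 9, 7, 5, 3],
    "misc": ["a", "bc", "def", "gh", "ijk", "l", "mnop", "qrst", "uvwxyz"],
})

# Step 1: transform('sum') broadcasts each manufacturer's total back onto
# every row of that manufacturer, so the result has one value per row.
totals = df.groupby("manufacturer")["count"].transform("sum")

# Step 2: argsort returns the integer positions that would sort those totals.
# kind="stable" (an assumption added here) keeps tied rows in original order.
positions = totals.argsort(kind="stable").to_numpy()

# Step 3: iloc reorders the original rows by position; all columns are kept.
out = df.iloc[positions]
print(out)
```

Note that if two manufacturers happened to have the same total, their rows would be treated as ties on that total alone, so a secondary sort key would be needed to keep each manufacturer's rows together.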

For descending order:

df.iloc[(-df.groupby("manufacturer")['count'].transform('sum')).argsort()]
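
As a possible alternative to negating the totals, the same descending order can be sketched with a temporary helper column and sort_values. This is a different technique from the argsort approach above, and the column name "_total" is a hypothetical name chosen for illustration; it is dropped again so only the original columns remain.

```python
import pandas as pd

df = pd.DataFrame({
    "manufacturer": ["bmw", "bmw", "bmw", "audi", "audi", "audi", "audi", "mercedes", "mercedes"],
    "color": ["red", "blue", "silver", "red", "blue", "black", "blue", "green", "white"],
    "count": [5, 4, 7, 1, 1, 9, 7, 5, 3],
    "misc": ["a", "bc", "def", "gh", "ijk", "l", "mnop", "qrst", "uvwxyz"],
})

out_desc = (
    # "_total" is a hypothetical helper column holding the per-manufacturer sum
    df.assign(_total=df.groupby("manufacturer")["count"].transform("sum"))
    # a stable sort keeps each manufacturer's rows in their original order
    .sort_values("_total", ascending=False, kind="stable")
    # remove the helper so the result has exactly the original columns
    .drop(columns="_total")
)
print(out_desc)
```

This version is longer, but it also works when the sort key cannot be negated, and extra columns can be passed to sort_values as tie-breakers.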
