Pandas: group data by year and rank by multiple (two) columns

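After several hours of research, I still can't group my data by year and assign a rank based on two columns, so that rows with the same value in the first column are not tied in the ranking.

I can assign a rank based on the two columns, but I can't group the data first. Here is what I did.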

>>> import pandas as pd
>>> data = pd.read_csv('C:/Users/Ene_E/Desktop/Data/data.csv')
>>> cols = ['score1', 'score2']
>>> tups = data[cols].sort_values(cols, ascending=False).apply(tuple, 1)
>>> f, i = pd.factorize(tups)
>>> factorized = pd.Series(f + 1, tups.index)
>>> wellranked = data.assign(Rank=factorized)
>>> wellranked.to_csv('wellrank.csv')

Here is a sample of my data:

name        year       score1          score2
brand1       2015       2500            5
brand2       2015       2500            3
brand3       2015       1500            7
brand1       2016       3200            2
brand2       2016       3000            4
brand3       2016       2100            6

My code produces this:

name        year       score1          score2     Rank
brand1       2015       2500              1       3            
brand2       2015       2500              2       4
brand3       2015       1500              3       6
brand1       2016       3200              1       1      
brand2       2016       3000              2       2
brand3       2016       2100              3       5

But I want this:

name        year       score1          score2     Rank
brand1       2015       2500              1        1           
brand2       2015       2500              2        2
brand3       2015       1500              3        3
brand1       2016       3200              1        2     
brand2       2016       3300              2        1
brand3       2016       2100              3        3

I think you need GroupBy.transform per year:

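cols = ['score1', 'score2']
# Sort by both score columns (highest first), then pack each row into a tuple
tups = data[cols].sort_values(cols, ascending=False).apply(tuple, axis=1)
# Factorize the tuples separately within each year so ranks restart at 1 per year
factorized = tups.groupby(data['year']).transform(lambda x: pd.factorize(x)[0] + 1)
# assign aligns on the index, so each rank lands on its original row
wellranked = data.assign(Rank=factorized)
print(wellranked)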
     name  year  score1  score2  Rank
0  brand1  2015    2500       5     1
1  brand2  2015    2500       3     2
2  brand3  2015    1500       7     3
3  brand1  2016    3200       2     1
4  brand2  2016    3000       4     2
5  brand3  2016    2100       6     3
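
For anyone who wants to try this without the CSV file, here is a minimal self-contained sketch of the same approach. It assumes the file contains exactly the six sample rows shown in the question, so the DataFrame is built inline instead of being read with read_csv; everything else follows the snippet above.

```python
import pandas as pd

# Assumption: these six rows stand in for data.csv (copied from the question's sample).
data = pd.DataFrame({
    "name": ["brand1", "brand2", "brand3", "brand1", "brand2", "brand3"],
    "year": [2015, 2015, 2015, 2016, 2016, 2016],
    "score1": [2500, 2500, 1500, 3200, 3000, 2100],
    "score2": [5, 3, 7, 2, 4, 6],
})

cols = ["score1", "score2"]

# Sort by score1, then score2 (both descending) and pack each row into a tuple.
# The original index is kept, so results can be aligned back to `data` later.
tups = data[cols].sort_values(cols, ascending=False).apply(tuple, axis=1)

# Group the sorted tuples by year (the key Series is aligned on the index) and
# factorize within each group. Codes are numbered in order of first appearance,
# which here is the sorted order, so the best pair in each year gets rank 1.
rank = tups.groupby(data["year"]).transform(lambda s: pd.factorize(s)[0] + 1)

# assign aligns `rank` to `data` by index, restoring the original row order.
wellranked = data.assign(Rank=rank)
print(wellranked)
```

Because factorize gives identical values the same code, two rows in the same year would only share a rank if both score1 and score2 were equal. A tie in score1 alone is broken by score2, as with brand1 and brand2 in 2015.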
