I'll try to state my question as clearly as I can.
I have the following DataFrame:
import pandas as pd
data = {'player' : ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
'game' : ['Soccer', 'Basketball', 'Ping pong', 'Soccer', 'Tennis', 'Tennis', 'Baseball', 'Volleyball', 'Dodgeball']}
df = pd.DataFrame(data, columns=['player','game'])
  player        game
0      A      Soccer
1      A  Basketball
2      A   Ping pong
3      B      Soccer
4      B      Tennis
5      B      Tennis
6      C    Baseball
7      C  Volleyball
8      C   Dodgeball
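Now I would like to keep only the values unique to each player, once each. Ideally they would end up in a list, but that is not a big deal.
For example, players A and B both play Soccer, so I don't want Soccer to show up for both of them. Tennis appears twice, but both times for player B, so it should appear in the output.
The output I want is: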
  player        game
0      A  Basketball
1      A   Ping pong
2      B      Soccer
3      B      Tennis
4      C    Baseball
5      C  Volleyball
6      C   Dodgeball
Or like this:
  player                               game
0      A            [Basketball, Ping pong]
1      B                   [Soccer, Tennis]
2      C  [Baseball, Volleyball, Dodgeball]
Thanks for your help!
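It seems you need DataFrame.drop_duplicates with keep='last', which keeps only the last row for each value in the game column. Then, if you want lists, group by player and aggregate with list: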
df = (df.drop_duplicates('game', keep='last')
        .groupby('player')['game']
        .agg(list)
        .reset_index())
print(df)
  player                               game
0      A            [Basketball, Ping pong]
1      B                   [Soccer, Tennis]
2      C  [Baseball, Volleyball, Dodgeball]
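For reference, here is a self-contained sketch that rebuilds the sample data and produces both output shapes from the question. The variable names flat, lists, and exclusive are only illustrative. The last part is an assumption about a different reading of the question: if a game shared by several players should be dropped entirely, rather than kept for the last player, counting distinct players per game is one possible way to do it.

```python
import pandas as pd

data = {
    'player': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'game': ['Soccer', 'Basketball', 'Ping pong', 'Soccer', 'Tennis',
             'Tennis', 'Baseball', 'Volleyball', 'Dodgeball'],
}
df = pd.DataFrame(data, columns=['player', 'game'])

# Flat shape: keep only the last row for each game, then renumber the index.
flat = df.drop_duplicates('game', keep='last').reset_index(drop=True)

# List shape: collect the remaining games of each player into a list.
lists = flat.groupby('player')['game'].agg(list).reset_index()

# Assumed alternative reading: drop any game played by more than one player.
players_per_game = df.groupby('game')['player'].transform('nunique')
exclusive = (df[players_per_game == 1]
             .drop_duplicates(['player', 'game'])
             .reset_index(drop=True))

print(flat)
print(lists)
print(exclusive)
```

With keep='last', Soccer stays with player B because B's row comes after A's. In the alternative reading, Soccer disappears for both players, while Tennis is kept once for B.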