I have a DataFrame that looks like this:
  username     course  marks
0     alex    physics     20
1      bob  chemistry     25
2     alex       math     30
3     alex  chemistry     40
I want to get a list of nested dictionaries that looks like this:
[{
"alex":{
"course":"physics",
"marks":20
},
{
"course":"math",
"marks":30
},
{
"course":"chemistry",
"marks":40
}
},
{ "bob":{
"course":"chemistry",
"marks":25
}
}]
I tried using groupby(), but couldn't get the expected result.
Use a custom lambda function that calls DataFrame.to_dict inside GroupBy.apply, then finish with Series.to_dict:
d = (df.groupby('username')[['course','marks']]
       .apply(lambda x: x.to_dict('records'))
       .to_dict())
Or:
d = {name: g[['course','marks']].to_dict('records') for name, g in df.groupby('username')}
print(d)
{'alex': [{'course': 'physics', 'marks': 20},
{'course': 'math', 'marks': 30},
{'course': 'chemistry', 'marks': 40}],
'bob': [{'course': 'chemistry', 'marks': 25}]}
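Note that d above is a single dict keyed by username, whereas the question asks for a list with one single-key dict per user. A minimal sketch of that variant, assuming the sample data from the question (the name result is only illustrative):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'username': ['alex', 'bob', 'alex', 'alex'],
    'course': ['physics', 'chemistry', 'math', 'chemistry'],
    'marks': [20, 25, 30, 40],
})

# One {username: [records]} dict per group, collected into a list
result = [
    {name: g[['course', 'marks']].to_dict('records')}
    for name, g in df.groupby('username')
]
print(result)
```

Each element of result holds exactly one username, so the list can be iterated user by user.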
You can use the following code to get the desired output:
import pandas as pd

# create the DataFrame
df = pd.DataFrame({
    'username': ['alex', 'bob', 'alex', 'alex'],
    'course': ['physics', 'chemistry', 'math', 'chemistry'],
    'marks': [20, 25, 30, 40]
})

# group the DataFrame by username
grouped = df.groupby('username')

# create a list of nested dictionaries
result_list = []
for name, group in grouped:
    # collect one record per row for the current username;
    # a list keeps every row, even if a user repeats a course
    records = []
    for _, row in group.iterrows():
        records.append({
            'course': row['course'],
            'marks': row['marks']
        })
    # add the user's records to the result list
    result_list.append({name: records})

print(result_list)
Output:
[
{
'alex': [
{'course': 'physics', 'marks': 20},
{'course': 'math', 'marks': 30},
{'course': 'chemistry', 'marks': 40}
]
},
{
'bob': [
{'course': 'chemistry', 'marks': 25}
]
}
]
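One detail worth knowing: groupby sorts the group keys by default, so the users come out in alphabetical order rather than in the order they appear in the DataFrame. If first-appearance order matters, passing sort=False should preserve it. A small sketch, using hypothetical sample data in which bob appears before alex (the name ordered is illustrative):

```python
import pandas as pd

# Hypothetical sample in which 'bob' appears before 'alex'
df = pd.DataFrame({
    'username': ['bob', 'alex', 'bob'],
    'course': ['chemistry', 'physics', 'math'],
    'marks': [25, 20, 35],
})

# sort=False keeps groups in order of first appearance instead of sorting keys
ordered = [
    {name: g[['course', 'marks']].to_dict('records')}
    for name, g in df.groupby('username', sort=False)
]
print(ordered)
```

With the default sort=True, the same data would list alex first.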