I have the following DataFrame:
    Group SubGroup
0  GroupA       A1
1  GroupA       A2
2  GroupA       A3
3  GroupB       B1
4  GroupB       B2
5  GroupC       C1
How can I convert this DataFrame into a dictionary? My expected output is:
{'GroupA': ['A1', 'A2', 'A3'], 'GroupB': ['B1', 'B2'], 'GroupC': ['C1']}
You can use groupby to collect each group's values into a list, then call to_dict on the result:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'group':['A','A','A','B','B','C'], 'subgroup':['A1','A2','A3','B1','B2','C1']})
In [3]: df
Out[3]:
  group subgroup
0     A       A1
1     A       A2
2     A       A3
3     B       B1
4     B       B2
5     C       C1
In [5]: df2 = df.groupby('group')['subgroup'].apply(list)
In [6]: df2
Out[6]:
group
A    [A1, A2, A3]
B        [B1, B2]
C            [C1]
Name: subgroup, dtype: object
In [7]: df2.to_dict()
Out[7]: {'A': ['A1', 'A2', 'A3'], 'B': ['B1', 'B2'], 'C': ['C1']}
With the column names from the question, this fits on one line:
df.groupby('Group')['SubGroup'].apply(list).to_dict()
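For anyone who wants to paste and run this, here is a minimal self-contained sketch of the one-liner. It assumes the question's data is rebuilt by hand as below; the variable name result is only for illustration.

```python
import pandas as pd

# Rebuild the DataFrame from the question (assumed to have a default integer index).
df = pd.DataFrame({
    "Group": ["GroupA", "GroupA", "GroupA", "GroupB", "GroupB", "GroupC"],
    "SubGroup": ["A1", "A2", "A3", "B1", "B2", "C1"],
})

# Collect each group's SubGroup values into a list, then convert the
# resulting Series (indexed by Group) into a plain dict.
result = df.groupby("Group")["SubGroup"].apply(list).to_dict()
print(result)
```

Within each group the values keep the order in which they appear in the DataFrame.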
You can also try a dictionary comprehension over the groups attribute of the groupby object:
>>> {k: df.loc[v, 'SubGroup'].tolist() for k, v in df.groupby('Group').groups.items()}
{'GroupA': ['A1', 'A2', 'A3'], 'GroupB': ['B1', 'B2'], 'GroupC': ['C1']}
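To see why the comprehension works, it may help to look at what groups holds: a mapping from each group key to the index labels of the rows in that group, which df.loc then uses to pull out the SubGroup values. A rough sketch, assuming the question's DataFrame with its default integer index (the name index_labels is hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "Group": ["GroupA", "GroupA", "GroupA", "GroupB", "GroupB", "GroupC"],
    "SubGroup": ["A1", "A2", "A3", "B1", "B2", "C1"],
})

# .groups maps each group key to an Index of row labels belonging to that group.
groups = df.groupby("Group").groups

# Convert each Index to a plain list so the mapping is easy to inspect.
index_labels = {key: labels.tolist() for key, labels in groups.items()}
print(index_labels)

# Looking the labels up with .loc recovers the SubGroup values per group.
result = {key: df.loc[labels, "SubGroup"].tolist() for key, labels in groups.items()}
print(result)
```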
Or try agg on the grouped column and pass the result to dict:
>>> dict(df.groupby('Group')['SubGroup'].agg(list))
{'GroupA': ['A1', 'A2', 'A3'], 'GroupB': ['B1', 'B2'], 'GroupC': ['C1']}
Or chain to_dict onto the aggregated result:
>>> df.groupby('Group')['SubGroup'].agg(list).to_dict()
{'GroupA': ['A1', 'A2', 'A3'], 'GroupB': ['B1', 'B2'], 'GroupC': ['C1']}
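The variants in this thread should all produce the same dictionary for the question's data. A quick sanity check, as a sketch (the via_* names are made up for this example):

```python
import pandas as pd

df = pd.DataFrame({
    "Group": ["GroupA", "GroupA", "GroupA", "GroupB", "GroupB", "GroupC"],
    "SubGroup": ["A1", "A2", "A3", "B1", "B2", "C1"],
})

grouped = df.groupby("Group")["SubGroup"]

# apply(list) followed by to_dict
via_apply = grouped.apply(list).to_dict()
# agg(list) followed by to_dict
via_agg = grouped.agg(list).to_dict()
# agg(list) passed to the dict constructor
via_dict = dict(grouped.agg(list))
# dictionary comprehension over the groupby object's groups mapping
via_groups = {
    key: df.loc[labels, "SubGroup"].tolist()
    for key, labels in df.groupby("Group").groups.items()
}

# All four approaches agree on this data.
assert via_apply == via_agg == via_dict == via_groups
print(via_apply)
```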