Combining multiple dictionaries of pandas DataFrames by dictionary key and common columns



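I have created four dictionaries by splitting four DataFrames by group. I now need to join the DataFrames from each dictionary into a new dictionary, using the dictionary key and the common columns as the join conditions.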

For example:

import pandas as pd
from functools import reduce

df_1 = pd.DataFrame({'Group': ['A','B','C'] , 'ID': [1,2,3],'count': [10, 20, 30], 'colors': ['red', 'white', 'blue']})
df_2 = pd.DataFrame({'Group': ['A','B','C'] , 'ID': [1,2,3],'time': [1.3, 2.5, 3]})
df_3 = pd.DataFrame({'Group': ['A','B','C'] , 'ID': [1,2,3],'order_num': [2, 4, 7]})
df_4 = pd.DataFrame({'Group': ['A','B','C'] , 'ID': [1,2,3],'result': ['g','b','b']})
dict1= dict(tuple(df_1.groupby('Group')))
dict2= dict(tuple(df_2.groupby('Group')))
dict3= dict(tuple(df_3.groupby('Group')))
dict4= dict(tuple(df_4.groupby('Group')))

Expected result, using a manual solution:

datA=[dict1['A'],dict2['A'],dict3['A'],dict4['A']]
datB=[dict1['B'],dict2['B'],dict3['B'],dict4['B']]
datC=[dict1['C'],dict2['C'],dict3['C'],dict4['C']]
final_dict = {'A': reduce(lambda left, right: pd.merge(left, right, on=['Group', 'ID']), datA),
              'B': reduce(lambda left, right: pd.merge(left, right, on=['Group', 'ID']), datB),
              'C': reduce(lambda left, right: pd.merge(left, right, on=['Group', 'ID']), datC)}

Any help finding a scalable, non-manual solution would be greatly appreciated.

Is this dynamic enough?

# Put all your dicts into a dict of dicts
dict_dict = {str(i): dict_i for i, dict_i in enumerate([dict1, dict2, dict3, dict4])}
# Swap the order of the indices so that the group keys are the keys
# and the lists of grouped dfs are the values
dat_dicts = {group_key: [df_dict[group_key] for df_dict in dict_dict.values()]
             for group_key in list(dict_dict.values())[0].keys()}
# Apply the reduce on each group key to merge the dfs
merged_dat_df_dict = {group_key: reduce(lambda left, right:
                                        pd.merge(left, right, on=['Group', 'ID']),
                                        dat_df_list)
                      for group_key, dat_df_list in dat_dicts.items()}
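
If you want something reusable, the same idea can be wrapped in a small helper. This is only a sketch: the function name `merge_frame_dicts` and its `on` and `how` parameters are my own invention, not anything from pandas or from the question. It also assumes you only want the group keys that appear in every dictionary, which is an assumption on my part; in the example data all four dictionaries have the same keys, so it makes no difference there.

```python
from functools import reduce

import pandas as pd


def merge_frame_dicts(frame_dicts, on, how="inner"):
    """Hypothetical helper: merge the DataFrames that share a key across dicts."""
    first, rest = frame_dicts[0], frame_dicts[1:]
    # Assumption: keep only keys present in every dict, in the first dict's order
    common_keys = [key for key in first if all(key in d for d in rest)]
    return {
        key: reduce(
            lambda left, right: pd.merge(left, right, on=on, how=how),
            [d[key] for d in frame_dicts],
        )
        for key in common_keys
    }


# Sample data from the question
df_1 = pd.DataFrame({'Group': ['A', 'B', 'C'], 'ID': [1, 2, 3],
                     'count': [10, 20, 30], 'colors': ['red', 'white', 'blue']})
df_2 = pd.DataFrame({'Group': ['A', 'B', 'C'], 'ID': [1, 2, 3], 'time': [1.3, 2.5, 3]})
df_3 = pd.DataFrame({'Group': ['A', 'B', 'C'], 'ID': [1, 2, 3], 'order_num': [2, 4, 7]})
df_4 = pd.DataFrame({'Group': ['A', 'B', 'C'], 'ID': [1, 2, 3], 'result': ['g', 'b', 'b']})

# One dict of per-group DataFrames for each source DataFrame
dicts = [dict(tuple(df.groupby('Group'))) for df in (df_1, df_2, df_3, df_4)]

final_dict = merge_frame_dicts(dicts, on=['Group', 'ID'])
print(final_dict['A'])
```

Passing the dictionaries as a list means adding a fifth or sixth one needs no code changes, and `how` lets you switch to an outer merge if some IDs are missing from some of the DataFrames.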
