I'm trying to merge the two dataframes below, but I'm not getting the result I expect.
import pandas as pd
previous_dict = [{"category1":"Home", "category2":"Power","usage":"15","amount":"65"},
{"category1":"Home", "category2":"Power","usage":"2","amount":"15"},
{"category1":"Home", "category2":"Vehicle","usage":"6","amount":"5"}
]
current_dict = [{"category1":"Home", "category2":"Power","usage":"16","amount":"79"},
{"category1":"Home", "category2":"Power","usage":"0.5","amount":"2"},
{"category1":"Home", "category2":"Vehicle","usage":"3","amount":"4"}
]
df_previous = pd.DataFrame.from_dict(previous_dict)
print(df_previous)
df_current = pd.DataFrame.from_dict(current_dict)
print(df_current)
df_merge = pd.merge(df_previous, df_current, on=['category1','category2'], how='outer',indicator=True, suffixes=('', '_y'))
print(df_merge)
Previous year's dataframe:
  category1 category2 usage amount
0      Home     Power    15     65
1      Home     Power     2     15
2      Home   Vehicle     6      5
Current year's dataframe:
  category1 category2 usage amount
0      Home     Power    16     79
1      Home     Power   0.5      2
2      Home   Vehicle     3      4
Current result:
  category1 category2 usage amount usage_y amount_y _merge
0      Home     Power    15     65      16       79   both
1      Home     Power    15     65     0.5        2   both
2      Home     Power     2     15      16       79   both
3      Home     Power     2     15     0.5        2   both
4      Home   Vehicle     6      5       3        4   both
But the result I expect is:
  category1 category2 usage amount usage_y amount_y _merge
0      Home     Power    15     65      16       79   both
3      Home     Power     2     15     0.5        2   both
4      Home   Vehicle     6      5       3        4   both
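When category1 and category2 have the same values more than once in both tables, I only want to match the rows up in order (first with first, second with second). How can I get the result I expect?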
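I think this happens because the columns you are joining on contain duplicates, so every matching row on the left is paired with every matching row on the right. One way to fix this is to also join on the index, like this: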
df_merge = pd.merge(df_previous.reset_index(), df_current.reset_index(), on=['category1','category2', 'index'], how='outer',indicator=True, suffixes=('', '_y'))
  index category1 category2 usage amount usage_y amount_y _merge
0     0      Home     Power    15     65      16       79   both
1     1      Home     Power     2     15     0.5        2   both
2     2      Home   Vehicle     6      5       3        4   both
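Note that joining on the index assumes the duplicate rows sit at the same positions in both frames. If that is not guaranteed, one variation is to number the repeats inside each (category1, category2) group with groupby(...).cumcount() and join on that counter instead. A minimal sketch, assuming the n-th occurrence of a key pair in one frame should be matched with the n-th occurrence in the other; the helper column name "occurrence" is made up for this example:

```python
import pandas as pd

df_previous = pd.DataFrame([
    {"category1": "Home", "category2": "Power", "usage": "15", "amount": "65"},
    {"category1": "Home", "category2": "Power", "usage": "2", "amount": "15"},
    {"category1": "Home", "category2": "Vehicle", "usage": "6", "amount": "5"},
])
df_current = pd.DataFrame([
    {"category1": "Home", "category2": "Power", "usage": "16", "amount": "79"},
    {"category1": "Home", "category2": "Power", "usage": "0.5", "amount": "2"},
    {"category1": "Home", "category2": "Vehicle", "usage": "3", "amount": "4"},
])

keys = ["category1", "category2"]

# Number each row within its key group: 0 for the first occurrence,
# 1 for the second, and so on. "occurrence" is a hypothetical helper column.
left = df_previous.assign(occurrence=df_previous.groupby(keys).cumcount())
right = df_current.assign(occurrence=df_current.groupby(keys).cumcount())

# Joining on the keys plus the counter pairs duplicates one-to-one
# instead of producing every left/right combination.
df_merge = pd.merge(
    left, right,
    on=keys + ["occurrence"],
    how="outer", indicator=True, suffixes=("", "_y"),
).drop(columns="occurrence")
print(df_merge)
```

With the sample data this gives three rows, one per original row, and it keeps working if one frame has extra unrelated rows in between, which would shift the plain index.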
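It looks like what you want is to add the columns of one dataframe to the other, rather than a "merge" in the usual sense. With that in mind, consider the following.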
df_new = df_previous.copy()
df_new = df_new.rename(columns = {"usage":"usage_prev","amount":"amount_prev"})
df_new[["usage_current","amount_current"]] = df_current[["usage","amount"]]
print(df_new)
Resulting output:
  category1 category2 usage_prev amount_prev usage_current amount_current
0      Home     Power         15          65            16             79
1      Home     Power          2          15           0.5              2
2      Home   Vehicle          6           5             3              4
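One caveat with the column assignment above: pandas aligns the assigned values on index labels, not on row position. It works here because both frames have the default 0, 1, 2 index. A small sketch of what happens otherwise, assuming a hypothetical df_current whose index is 10, 11, 12 (for example after filtering a larger table), and one way to assign by position instead:

```python
import pandas as pd

df_previous = pd.DataFrame([
    {"category1": "Home", "category2": "Power", "usage": "15", "amount": "65"},
    {"category1": "Home", "category2": "Power", "usage": "2", "amount": "15"},
    {"category1": "Home", "category2": "Vehicle", "usage": "6", "amount": "5"},
])
# Hypothetical: same current-year data, but with a non-default index.
df_current = pd.DataFrame(
    [
        {"category1": "Home", "category2": "Power", "usage": "16", "amount": "79"},
        {"category1": "Home", "category2": "Power", "usage": "0.5", "amount": "2"},
        {"category1": "Home", "category2": "Vehicle", "usage": "3", "amount": "4"},
    ],
    index=[10, 11, 12],
)

df_new = df_previous.rename(columns={"usage": "usage_prev", "amount": "amount_prev"})

# Assigning a Series aligns on index labels; none of 10, 11, 12 exist
# in df_new, so this column ends up entirely NaN.
misaligned = df_new.copy()
misaligned["usage_current"] = df_current["usage"]

# Converting to a NumPy array drops the index, so values are placed by position.
df_new["usage_current"] = df_current["usage"].to_numpy()
df_new["amount_current"] = df_current["amount"].to_numpy()
print(df_new)
```

Assigning by position is only safe when both frames have the same number of rows in the same order.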