import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})
merged = pd.merge(new, old, how='outer', on=['table', 'total'])
Output:
table desc_x total desc_y
0 a 22 NaN
1 b 22 NaN
2 c 22 NaN
3 d 22 NaN
4 a NaN 11 foo
Expected output:
table desc total
0 a foo 22
1 b foo 22
2 c 22
3 d 22
4 a foo 11
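I tried an outer join, but it drops the descriptions for a and b.
- Given what you are trying to achieve, an outer join on both table and total does not make sense: no total value appears in both frames, so no rows can match. I have changed it to an outer join on table only.
- You can then fix up the merged table, applying the column preference implied by your expected output, and clean up the leftover columns.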
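To see why joining on both columns cannot pair up any rows, here is a minimal diagnostic sketch. It reuses the question's data and adds `indicator=True`, a standard `pd.merge` option that records which frame each row came from. The names `check` and `n_matched` are illustrative and not part of the original post.

```python
import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})

# indicator=True adds a `_merge` column: 'left_only', 'right_only' or 'both'
check = pd.merge(new, old, how='outer', on=['table', 'total'], indicator=True)

# No (table, total) pair occurs in both frames, so no row is tagged 'both'
n_matched = int((check['_merge'] == 'both').sum())
print(n_matched, len(check))
```

Because nothing matches, every row from `old` is appended as a separate row instead of filling in `desc` for a and b. The merge on `table` alone looks like this: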
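import numpy as np
import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})
# outer join on 'table' only; overlapping columns get the suffixes _x (new) and _y (old)
merged = pd.merge(new, old, how='outer', on=['table'])
# select preferred columns: use the value from `new` unless it is empty or missing, else fall back to `old`
merged["desc"] = merged["desc_x"].replace('', np.nan).fillna(merged["desc_y"]).fillna("")
merged["total"] = merged["total_x"].fillna(merged["total_y"])
# clean up columns: drop the suffixed helper columns
merged = merged.drop(columns=[c for c in merged.columns if c[-2:] in ["_x", "_y"]])
merged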
|   | table | desc | total |
|---|---|---|---|
| 0 | a | foo | 22 |
| 1 | b | foo | 22 |
| 2 | c |  | 22 |
| 3 | d |  | 22 |
| 4 | e | foo | 11 |
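As an alternative to merging and then cleaning up suffixed columns, the same "prefer `new`, fall back to `old`" rule can be sketched with `DataFrame.combine_first`. This is a swapped-in technique, not the method used in the answer above. It assumes `table` is unique within each frame and that empty strings in `new` should count as missing. The names `left` and `combined` are illustrative.

```python
import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})

# Index both frames by the join key so values align on `table`
left = new.set_index('table')
# Assumption: an empty string in `new` means "no description", so mark it as missing
left['desc'] = left['desc'].mask(left['desc'] == '')

# combine_first keeps values from `left` and fills its gaps from `old`
combined = left.combine_first(old.set_index('table'))

# Restore empty strings where neither frame had a description
combined['desc'] = combined['desc'].fillna('')
combined = combined.reset_index()[['table', 'desc', 'total']]
print(combined)
```

This avoids the `_x`/`_y` bookkeeping, but it relies on index alignment, so duplicate `table` values in either frame would need handling first.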