Outer join tables - keep description


import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})
all = pd.merge(new, old, how='outer', on=['table', 'total'])

Output:

table desc_x  total desc_y
0     a            22    NaN
1     b            22    NaN
2     c            22    NaN
3     d            22    NaN
4     a    NaN     11    foo

Expected output:

table desc  total
0     a   foo     22
1     b   foo     22
2     c           22
3     d           22
4     a   foo     11

I tried an outer join, but it drops the descriptions for a and b.

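  • Given what you are trying to achieve, an outer join whose keys include total makes no sense: the totals differ between the two frames (22 vs 11), so no rows ever match. I have changed it to an outer join on table only.
  • The merged table can then be fixed up to follow the preferences implied by your desired output (take the value from new, fall back to old), and the helper columns cleaned up.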
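The first point can be checked directly. The sketch below is an illustration rather than part of the original answer: it reruns the merge from the question with `indicator=True`, which labels each row with the side it came from. The names `check` and `matched` are introduced here for illustration only.

```python
import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})

# indicator=True adds a "_merge" column: left_only, right_only or both
check = pd.merge(new, old, how='outer', on=['table', 'total'], indicator=True)

# No (table, total) pair occurs in both frames, so no row is labelled "both"
matched = int((check['_merge'] == 'both').sum())
print(check)
print(matched)
```

Because nothing matches, every row of `new` and every row of `old` survives as its own row, which is why the descriptions never land next to the rows for a and b.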
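import numpy as np
import pandas as pd

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})
# outer join on the table name only; shared columns get the suffixes _x (new) and _y (old)
all = pd.merge(new, old, how='outer', on=['table'])
# select preferred columns: take the value from new, fall back to old
all["desc"] = all["desc_x"].replace('', np.nan).fillna(all["desc_y"]).fillna("")
# total_x is NaN for tables found only in old, which makes the column float; cast back to int
all["total"] = all["total_x"].fillna(all["total_y"]).astype(int)
# clean up the suffixed helper columns
all = all.drop(columns=[c for c in all.columns if c[-2:] in ["_x", "_y"]])
all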

Output:

table desc  total
0     a   foo     22
1     b   foo     22
2     c           22
3     d           22
4     e   foo     11
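For comparison, the same "take new, fall back to old" rule can also be written with `DataFrame.combine_first` instead of a merge followed by column clean-up. This is a minimal sketch of a different technique, not the method used in the answer above; the function name `merge_keep_desc` is hypothetical, and it assumes each table name appears at most once per frame and that an empty string in `desc` means "no description".

```python
import pandas as pd

def merge_keep_desc(new, old, key='table'):
    """Prefer values from `new`, fall back to `old` (hypothetical helper)."""
    # Treat empty descriptions as missing so they can be filled from `old`
    left = new.assign(desc=new['desc'].mask(new['desc'] == '')).set_index(key)
    right = old.set_index(key)
    # combine_first keeps left values and fills missing ones from right,
    # over the union of both indexes
    out = left.combine_first(right)
    out = out.fillna({'desc': ''}).astype({'total': int})
    return out.reset_index()[[key, 'desc', 'total']]

new = pd.DataFrame({'table': ['a', 'b', 'c', 'd'], 'desc': ['', '', '', ''], 'total': [22, 22, 22, 22]})
old = pd.DataFrame({'table': ['a', 'b', 'e'], 'desc': ['foo', 'foo', 'foo'], 'total': [11, 11, 11]})

result = merge_keep_desc(new, old)
print(result)
```

The trade-off is that `combine_first` applies the same fallback rule to every shared column, so it suits this case only because `desc` and `total` both follow "new first, then old".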
