左连接和反连接在相同的数据帧Pandas上



我有两个类似的数据帧:

df1 = pd.DataFrame(data = {'col1' : ['finance',  'accounting'], 'col2' : ['f1', 'a1']}) 
df2 = pd.DataFrame(data = {'col1' : ['finance', 'finance', 'finance', 'accounting', 'accounting','IT','IT'], 'col2' : ['f1','f2','f3','a1,'a2','I1','I2']})

df1

col1        col2
0   finance      f1
1   accounting   a1

df2

col1        col2
0   finance      f1
1   finance      f2
2   finance      f3
3   accounting   a1 
4   accounting   a2
5   IT           I1
6   IT           I2

我想在col1上执行LEFT JOIN,在col2上执行ANTI JOIN。输出应该是这样的:

col1      col2
finance     f2
finance     f3
accounting  a2 

有人能帮助我如何在熊猫身上正确地做这件事吗。我在熊猫身上试过joinmerge,但都不起作用。提前谢谢。

您可以merge并过滤:

(df1.merge(df2, on='col1', suffixes=('_', None))
.loc[lambda d: d['col2'] != d.pop('col2_')]
)

输出:

col1 col2
1     finance   f2
2     finance   f3
4  accounting   a2

只是为了好玩,这里有另一种方法(除了@mozway的真正优雅的解决方案(:

df2 = ( df2
.reset_index()                  # save index as column 'index'
.set_index('col1')              # make 'col1' the index
.loc[df1.col1,:]                # filter for 'col1' values in df1
.set_index('col2', append=True) # add 'col2' to the index
.drop(index=df1.
set_index(list(df1.columns))
.index)                     # create a multi-index from df1 and drop all matches from df2
.reset_index()                  # make 'col1' and col2' columns again
.set_index('index')             # make 'index' the index again
.rename_axis(index=None) )      # make the index anonymous

输出:

col1 col2
1     finance   f2
2     finance   f3
4  accounting   a2

最新更新