I have two DataFrames, DF1 and DF2, that hold the same kind of data and share some, but not all, of their index values.
DF1
index, a, b, c
[ abc 1, 3, 6 ]
[ acb 2, 4, 5 ]
[ cab 6, 5, 2 ]
[ bac 3, 6, 2 ]
[ bca 6, 8, 3 ]
DF2
index, a, b, d
[ abc 4, 7, 3 ]
[ kde 2, 5, 8 ]
[ lat 7, 2, 6 ]
[ bac 0, 4, 4 ]
[ bca 3, 6, 8 ]
So I want to achieve the following:
1. Add column d to DF1, matching rows on the index.
2. Add the index values and rows from DF2 that do not exist in DF1.
RESULT
index, a, b, c, d
[ abc 1, 3, 6, 3 ]
[ acb 2, 4, 5, - ]
[ cab 6, 5, 2, - ]
[ bac 3, 6, 2, 4 ]
[ bca 6, 8, 3, 8 ]
[ kde 2, 5, -, 8 ]
[ lat 7, 2, -, 6 ]
Let's use combine_first.
Create the data:
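import pandas as pd
# Build the two example frames, using the labels as the index
DF1 = pd.DataFrame({'a':[1,2,6,3,6],'b':[3,4,5,6,8],'c':[6,5,2,2,3]},index=['abc','acb','cab','bac','bca'])
DF2 = pd.DataFrame({'a':[4,2,7,0,3],'b':[7,5,2,4,6],'d':[3,8,6,4,8]},index=['abc','kde','lat','bac','bca'])
# DF1's values take priority; DF2 fills the gaps and supplies the extra rows and column
df_combo = DF1.combine_first(DF2)
print(df_combo)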
       a    b    c    d
abc  1.0  3.0  6.0  3.0
acb  2.0  4.0  5.0  NaN
bac  3.0  6.0  2.0  4.0
bca  6.0  8.0  3.0  8.0
cab  6.0  5.0  2.0  NaN
kde  2.0  5.0  NaN  8.0
lat  7.0  2.0  NaN  6.0
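
The printed rows come back sorted by index label, while the RESULT in the question lists DF1's rows in their original order followed by the rows that exist only in DF2. If that order matters, one possible approach is to reindex the combined frame afterwards. This is only a sketch: the names row_order and df_ordered are made up for illustration, and it assumes the index labels are unique.

```python
import pandas as pd

DF1 = pd.DataFrame(
    {'a': [1, 2, 6, 3, 6], 'b': [3, 4, 5, 6, 8], 'c': [6, 5, 2, 2, 3]},
    index=['abc', 'acb', 'cab', 'bac', 'bca'],
)
DF2 = pd.DataFrame(
    {'a': [4, 2, 7, 0, 3], 'b': [7, 5, 2, 4, 6], 'd': [3, 8, 6, 4, 8]},
    index=['abc', 'kde', 'lat', 'bac', 'bca'],
)

# DF1's values win where both frames have an entry; DF2 fills everything else.
df_combo = DF1.combine_first(DF2)

# Hypothetical helper: DF1's labels in their original order,
# followed by the labels that appear only in DF2.
row_order = list(DF1.index) + [label for label in DF2.index if label not in DF1.index]

# Reorder the rows; assumes unique index labels.
df_ordered = df_combo.reindex(row_order)
print(df_ordered)
```

The NaN entries play the role of the "-" placeholders in the desired RESULT. For labels present in both frames, columns a and b keep DF1's values, which matches the expected output in the question.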