pandas: how to merge two dataframes with multiple ID columns



Sorry, I'm new to pandas and I'm struggling. Basically, I have two datasets. df1:

sites  nb samples
A      3
B      2
C      1

A one-liner using pandas.DataFrame.merge and pandas.DataFrame.rename:

df_new = df2.merge(df1, left_on='sites1', right_on='sites', how='left').merge(df1, left_on='sites2', right_on='sites', how='left')[['sites1', 'sites2', 'nb links', 'nb samples_x', 'nb samples_y']].rename(columns={'nb samples_x': 'nb samples sites1', 'nb samples_y': 'nb samples sites2'})
[Out]:
sites1 sites2  nb links  nb samples sites1  nb samples sites2
0      A      B         3                  3                  2
1      A      C         1                  3                  1
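
For anyone who wants to run the one-liner, here is a minimal reproducible sketch. df1 is copied from the question. df2 is not shown above, so the version below is an assumption reconstructed from the output (columns sites1, sites2 and nb links); adjust it to your real data. The chain is the same as the one-liner, only wrapped in parentheses for readability.

```python
import pandas as pd

# df1 as given in the question.
df1 = pd.DataFrame({"sites": ["A", "B", "C"], "nb samples": [3, 2, 1]})

# Assumption: df2 is not shown in the question; this is inferred from the output.
df2 = pd.DataFrame(
    {"sites1": ["A", "A"], "sites2": ["B", "C"], "nb links": [3, 1]}
)

df_new = (
    # First merge: look up the sample count for sites1.
    df2.merge(df1, left_on="sites1", right_on="sites", how="left")
    # Second merge: look up the sample count for sites2.
    # The clashing column names get the default _x / _y suffixes.
    .merge(df1, left_on="sites2", right_on="sites", how="left")
    [["sites1", "sites2", "nb links", "nb samples_x", "nb samples_y"]]
    .rename(
        columns={
            "nb samples_x": "nb samples sites1",
            "nb samples_y": "nb samples sites2",
        }
    )
)

print(df_new)
```

The _x and _y suffixes appear only because both merges bring in columns with the same names; merge also accepts a suffixes argument if you would rather pick the names yourself.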

Notes

  • Let's break it down to make it easier to understand:

    1. Start by merging df1 into df2 twice, once on sites1 and once on sites2

      df_new = df2.merge(df1, left_on='sites1', right_on='sites', how='left').merge(df1, left_on='sites2', right_on='sites', how='left')
      [Out]:
      sites1 sites2  nb links sites_x  nb samples_x sites_y  nb samples_y
      0      A      B         3       A             3       B             2
      1      A      C         1       A             3       C             1
      
    2. Select only the columns you want to keep

      df_new = df_new[['sites1', 'sites2', 'nb links', 'nb samples_x', 'nb samples_y']]
      [Out]:
      sites1 sites2  nb links  nb samples_x  nb samples_y
      0      A      B         3             3             2
      1      A      C         1             3             1
      
    3. Rename the columns

      df_new.columns = ['sites1', 'sites2', 'nb links', 'nb samples sites1', 'nb samples sites2']
      [Out]:
      sites1 sites2  nb links  nb samples sites1  nb samples sites2
      0      A      B         3                  3                  2
      1      A      C         1                  3                  1
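
As a possible alternative to the double merge (this is a different technique, not the approach above), you could turn df1 into a lookup Series and use Series.map, which avoids the extra sites_x / sites_y columns and the rename step. This is only a sketch: df2 is again an assumed reconstruction, the name samples is made up for illustration, and it assumes each site appears only once in df1.

```python
import pandas as pd

# df1 as given in the question.
df1 = pd.DataFrame({"sites": ["A", "B", "C"], "nb samples": [3, 2, 1]})

# Assumption: df2 is not shown in the question; this is inferred from the output.
df2 = pd.DataFrame(
    {"sites1": ["A", "A"], "sites2": ["B", "C"], "nb links": [3, 1]}
)

# Lookup Series (hypothetical name): index is the site, value is its sample count.
samples = df1.set_index("sites")["nb samples"]

# Work on a copy so df2 itself is left unchanged.
df_new = df2.copy()
for col in ("sites1", "sites2"):
    # Sites that are missing from df1 end up as NaN, much like a left merge.
    df_new[f"nb samples {col}"] = df_new[col].map(samples)

print(df_new)
```

With the sample data this gives the same five columns as the merge version.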
      
