Sorry, I'm new to pandas and I'm struggling. Basically, I have two datasets. df1:
sites | nb samples |
---|---|
A | 3 |
B | 2 |
C | 1 |
A one-liner using pandas.DataFrame.merge and .rename():
df_new = df2.merge(df1, left_on='sites1', right_on='sites', how='left').merge(df1, left_on='sites2', right_on='sites', how='left')[['sites1', 'sites2', 'nb links', 'nb samples_x', 'nb samples_y']].rename(columns={'nb samples_x': 'nb samples sites1', 'nb samples_y': 'nb samples sites2'})
[Out]:
  sites1 sites2  nb links  nb samples sites1  nb samples sites2
0      A      B         3                  3                  2
1      A      C         1                  3                  1
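For anyone who wants to run this end to end, here is a minimal self-contained sketch. df1 is taken from the table in the question; df2 is an assumption, reconstructed from the output shown here, since the original df2 is not reproduced in this post.

```python
import pandas as pd

# df1 as given in the question: one row per site with its sample count.
df1 = pd.DataFrame({"sites": ["A", "B", "C"], "nb samples": [3, 2, 1]})

# Assumption: df2 is rebuilt from the output shown in this answer.
df2 = pd.DataFrame({"sites1": ["A", "A"], "sites2": ["B", "C"], "nb links": [3, 1]})

df_new = (
    # First merge attaches the sample count of sites1.
    df2.merge(df1, left_on="sites1", right_on="sites", how="left")
    # Second merge attaches the sample count of sites2; overlapping
    # column names get the default suffixes _x (left) and _y (right).
    .merge(df1, left_on="sites2", right_on="sites", how="left")
    [["sites1", "sites2", "nb links", "nb samples_x", "nb samples_y"]]
    .rename(columns={"nb samples_x": "nb samples sites1",
                     "nb samples_y": "nb samples sites2"})
)

print(df_new)
```

With how='left', every row of df2 is kept even if a site is missing from df1; in that case the corresponding sample count would be NaN.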
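Notes
Let's break it down to make it easier to follow:
Start by merging the two dataframes. Because df1 is merged in twice, its columns come back with the default suffixes _x (first merge, for sites1) and _y (second merge, for sites2):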
df_new = df2.merge(df1, left_on='sites1', right_on='sites', how='left').merge(df1, left_on='sites2', right_on='sites', how='left')
[Out]:
  sites1 sites2  nb links sites_x  nb samples_x sites_y  nb samples_y
0      A      B         3       A             3       B             2
1      A      C         1       A             3       C             1
Select only the columns you want to keep:
df_new = df_new[['sites1', 'sites2', 'nb links', 'nb samples_x', 'nb samples_y']]
[Out]:
  sites1 sites2  nb links  nb samples_x  nb samples_y
0      A      B         3             3             2
1      A      C         1             3             1
Rename the columns:
df_new.columns = ['sites1', 'sites2', 'nb links', 'nb samples sites1', 'nb samples sites2']
[Out]:
  sites1 sites2  nb links  nb samples sites1  nb samples sites2
0      A      B         3                  3                  2
1      A      C         1                  3                  1
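As a side note, the same lookup can also be done without merging at all. This is a different technique from the one above, sketched with Series.map and assuming each site appears exactly once in df1 (map needs a unique index to look values up). As before, df2 is an assumption reconstructed from the output, and the names samples and df_alt are only illustrative.

```python
import pandas as pd

# df1 as given in the question.
df1 = pd.DataFrame({"sites": ["A", "B", "C"], "nb samples": [3, 2, 1]})

# Assumption: df2 is rebuilt from the output shown in this answer.
df2 = pd.DataFrame({"sites1": ["A", "A"], "sites2": ["B", "C"], "nb links": [3, 1]})

# Lookup Series: index is the site name, values are the sample counts.
samples = df1.set_index("sites")["nb samples"]

# Map each site column through the lookup and store the result under
# the final column name directly, so no select or rename step is needed.
df_alt = df2.assign(**{
    "nb samples sites1": df2["sites1"].map(samples),
    "nb samples sites2": df2["sites2"].map(samples),
})

print(df_alt)
```

Sites that are missing from df1 come out as NaN, which matches what how='left' gives in the merge version.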