Merging one column onto multiple columns



I have the following two DataFrames. DF1:

    location    vaccine1            vaccine2          vaccine3           vaccine4
0   Afghanistan Oxford/AstraZeneca  Pfizer/BioNTech   Sinopharm/Beijing  None
1   Albania     Oxford/AstraZeneca  Pfizer/BioNTech   Sinovac            Sputnik V
2   Algeria     Sputnik V           None              None               None
3   Andorra     Oxford/AstraZeneca  Pfizer/BioNTech   None               None

DF2 (named vacc_eff in the code):

    Vaccine             Efficacy
0   Oxford/AstraZeneca  0.70
1   Pfizer/BioNTech     0.95
2   Sinopharm/Beijing   0.79
3   Sinovac             0.50
4   Sputnik V           0.92

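I know I can merge as shown below, but the same merge has to be repeated four times, once per vaccine column, which is inefficient: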

# One merge per vaccine column, then join the per-column results on location
v1 = pd.merge(df1, vacc_eff, how='left', left_on='vaccine1', right_on='Vaccine')[['location', 'Efficacy']]
v2 = pd.merge(df1, vacc_eff, how='left', left_on='vaccine2', right_on='Vaccine')[['location', 'Efficacy']]
vmerged = pd.merge(v1, v2, on=['location'])

How can I merge the "Efficacy" column of DF2 onto every vaccine column in DF1 without writing the same merge call over and over?

Here is one solution you can try: stack the vaccine columns, map each vaccine name to its efficacy, then unstack:

# Build a vaccine name -> efficacy lookup from DF2
map_ = vacc_eff.set_index('Vaccine')['Efficacy'].to_dict()
print(
    df1[['location', 'vaccine1', 'vaccine2']].set_index('location')
    .stack().map(map_).unstack()
)

             vaccine1  vaccine2
location                       
Afghanistan      0.70      0.95
Albania          0.70      0.95
Algeria          0.92       NaN
Andorra          0.70      0.95
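
The same lookup idea can be extended to all four vaccine columns. Below is a minimal, self-contained sketch that rebuilds the sample data from the question and uses a per-column map instead of stack/unstack. This is a variation on the approach above, not the only way to do it. The `_efficacy` suffix and the names `eff`, `efficacy` and `result` are just illustrative choices, not something from the original post.

```python
import pandas as pd

# Sample data copied from the question; None marks a missing vaccine.
df1 = pd.DataFrame({
    "location": ["Afghanistan", "Albania", "Algeria", "Andorra"],
    "vaccine1": ["Oxford/AstraZeneca", "Oxford/AstraZeneca", "Sputnik V", "Oxford/AstraZeneca"],
    "vaccine2": ["Pfizer/BioNTech", "Pfizer/BioNTech", None, "Pfizer/BioNTech"],
    "vaccine3": ["Sinopharm/Beijing", "Sinovac", None, None],
    "vaccine4": [None, "Sputnik V", None, None],
})
vacc_eff = pd.DataFrame({
    "Vaccine": ["Oxford/AstraZeneca", "Pfizer/BioNTech", "Sinopharm/Beijing", "Sinovac", "Sputnik V"],
    "Efficacy": [0.70, 0.95, 0.79, 0.50, 0.92],
})

# Vaccine name -> efficacy lookup, as a Series indexed by vaccine name.
eff = vacc_eff.set_index("Vaccine")["Efficacy"]

# Map every vaccine column in one pass; names not found in the lookup
# (including missing values) come back as NaN.
efficacy = df1.set_index("location").apply(lambda col: col.map(eff))

# Illustrative naming choice so the new columns do not clash with the originals.
efficacy = efficacy.add_suffix("_efficacy")

# Attach the efficacy columns back onto DF1, matching on location.
result = df1.join(efficacy, on="location")
print(result)
```

If you only need the efficacy values and not the vaccine names, you can stop at `efficacy` and skip the final join.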
