通过一列(国家/地区代码)连接两个熊猫数据报



我想在 df 数据帧中将国家/地区代码表示为 df2 数据帧的字段Nationality_Codes中的alpha_3_code。对于 df2 中的每一行,我想将Reviewer_Nationality与 df 中的en_short_name匹配,如果匹配,则在 df2 中为Nationality_Codes分配国家/地区代码。

df2.head()

Nationality_Codes   Reviewer_Nationality    Reviewer_Score
NaN                       Russia                  2.9
NaN                       United Kingdom          7.5
NaN                       Australia               7.1
NaN                       United Kingdom          3.8
NaN                       Russia                  6.7

df.head()

alpha_3_code       en_short_name           nationality
RUS                 Russia                  Russian
ALA                 Åland Islands           Åland Island
ALB                 Albania                 Albanian 
AUS                 Australia               Australian
UK                  United Kingdom          British, UK

最终结果应该是:

df2.head()

Nationality_Codes   Reviewer_Nationality    Reviewer_Score
RUS                       Russia                  2.9
UK                        United Kingdom          7.5
AUS                       Australia               7.1
UK                        United Kingdom          3.8
RUS                       Russia                  6.7

我尝试了这段代码,但没有奏效。

for index, row in df.iterrows():
for index2, row2 in df2.iterrows():
if row2['Reviewer_Nationality']==row['en_short_name']:
df2['Nationality_Codes'][row2]=df['alpha_3_code'][row2]

谁能帮我?

非常感谢!

一种方法是为英文名称和代码创建一个系列映射,并使用.map

#my_map = pd.Series(df.alpha_3_code.values,index=df.en_short_name)
my_map = df.set_index('en_short_name')['alpha_3_code']
df2['Nationality_Codes'] = df2['Reviewer_Nationality'].map(my_map)

输出:

>>> df2
Nationality_Codes Reviewer_Nationality  Reviewer_Score
0               RUS               Russia             2.9
1                UK       United Kingdom             7.5
2               AUS            Australia             7.1
3                UK       United Kingdom             3.8
4               RUS               Russia             6.7

试试这个:

merged = df[['alpha_3_code', 'en_short_name']].merge(df2[['Reviewer_Nationality',
'Reviewer_Score']],
left_on='en_short_name', right_on='Reviewer_Nationality', how='left')]
.rename(columns={'alpha_3_code': 'Nationality_Codes'})
.drop('en_short_name', axis=1)

最新更新