How to compare two DataFrames



df1

index           Count
Duliajan Area      2
HAPJAN             2
KATHALGURI         2

df2

Location           Category
0        NAGAJAN        0
1        JORAJAN        0
2     KATHALGURI        0
3         HEBEDA        0
4          MAKUM        0
5       BAREKURI        0
6        BAGHJAN        0
7  Duliajan Area        0
8      LANGKASHI        0
9         HAPJAN        0

I need this output:

0        NAGAJAN        0
1        JORAJAN        0
2     KATHALGURI        2
3         HEBEDA        0
4          MAKUM        0
5       BAREKURI        0
6        BAGHJAN        0
7  Duliajan Area        2
8      LANGKASHI        0
9         HAPJAN        2

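You can build a dict from the two columns of df1 and then use map on df2. Locations that are missing from the dict map to NaN, so fillna restores the existing Category value before the column is cast back to int: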

d = dict(zip(df1['index'], df1['Count']))
df2['Category'] = df2['Location'].map(d).fillna(df2['Category']).astype(int)
print(df2)

Output:

Location  Category
0        NAGAJAN         0
1        JORAJAN         0
2     KATHALGURI         2
3         HEBEDA         0
4          MAKUM         0
5       BAREKURI         0
6        BAGHJAN         0
7  Duliajan Area         2
8      LANGKASHI         0
9         HAPJAN         2
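
If `index` in df1 is the frame's actual index rather than an ordinary column, `df1['index']` raises a KeyError. In that case the Count column can be passed to map directly, because Series.map looks values up by the index of the Series it is given. The following is a minimal sketch that rebuilds both frames from the data in the question, assuming the location names are df1's index:

```python
import pandas as pd

# Assumption: 'index' is df1's real index, not a column.
df1 = pd.DataFrame(
    {"Count": [2, 2, 2]},
    index=pd.Index(["Duliajan Area", "HAPJAN", "KATHALGURI"], name="index"),
)
df2 = pd.DataFrame({
    "Location": ["NAGAJAN", "JORAJAN", "KATHALGURI", "HEBEDA", "MAKUM",
                 "BAREKURI", "BAGHJAN", "Duliajan Area", "LANGKASHI", "HAPJAN"],
    "Category": [0] * 10,
})

# Series.map with a Series argument looks each Location up in df1's index.
mapped = df2["Location"].map(df1["Count"])

# Unmatched locations come back as NaN; keep their existing Category.
df2["Category"] = mapped.fillna(df2["Category"]).astype(int)
print(df2)
```

If `index` is an ordinary column instead, `df1.set_index('index')['Count']` yields the same lookup Series.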

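You can also use the pandas merge function. Use a left join so that every row of df2 is kept (the default inner join would return only the three matching rows), then fill Category from Count where a match exists:

import pandas as pd

result = df2.merge(df1.rename(columns={"index": "Location"}), on="Location", how="left")
result["Category"] = result.pop("Count").fillna(result["Category"]).astype(int)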
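
When the goal is only to see which locations the two frames have in common, merge can report the match status of each row through its `indicator` parameter. This is a small sketch on the question's data, assuming `index` is an ordinary column of df1; the names `checked` and `matched` are illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({
    "index": ["Duliajan Area", "HAPJAN", "KATHALGURI"],
    "Count": [2, 2, 2],
})
df2 = pd.DataFrame({
    "Location": ["NAGAJAN", "JORAJAN", "KATHALGURI", "HEBEDA", "MAKUM",
                 "BAREKURI", "BAGHJAN", "Duliajan Area", "LANGKASHI", "HAPJAN"],
    "Category": [0] * 10,
})

# indicator=True adds a '_merge' column: 'both' for matched rows,
# 'left_only' for rows that exist only in df2.
checked = df2.merge(
    df1.rename(columns={"index": "Location"}),
    on="Location",
    how="left",
    indicator=True,
)

matched = checked.loc[checked["_merge"] == "both", "Location"].tolist()
print(matched)
```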

Concatenate the DataFrames, then drop the duplicates, keeping the last occurrence so that the values from df1 win:

mapping = {'index': 'Location', 'Count': 'Category'}
out = (pd.concat([df2, df1.rename(columns=mapping)])
         .drop_duplicates('Location', keep='last')
         .reset_index(drop=True))
print(out)
# Output
Location  Category
0        NAGAJAN         0
1        JORAJAN         0
2         HEBEDA         0
3          MAKUM         0
4       BAREKURI         0
5        BAGHJAN         0
6      LANGKASHI         0
7  Duliajan Area         2
8         HAPJAN         2
9     KATHALGURI         2
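
The concatenation approach moves the updated rows to the end of the frame, so the result is not in the order requested in the question. One way to restore df2's original row order is to reindex on Location after dropping the duplicates. This is a sketch only, assuming the Location values in df2 are unique and `index` is an ordinary column of df1:

```python
import pandas as pd

df1 = pd.DataFrame({
    "index": ["Duliajan Area", "HAPJAN", "KATHALGURI"],
    "Count": [2, 2, 2],
})
df2 = pd.DataFrame({
    "Location": ["NAGAJAN", "JORAJAN", "KATHALGURI", "HEBEDA", "MAKUM",
                 "BAREKURI", "BAGHJAN", "Duliajan Area", "LANGKASHI", "HAPJAN"],
    "Category": [0] * 10,
})

mapping = {"index": "Location", "Count": "Category"}
out = (
    pd.concat([df2, df1.rename(columns=mapping)])
    .drop_duplicates("Location", keep="last")  # df1 rows win over df2 rows
    .set_index("Location")
    .reindex(df2["Location"])                  # back to df2's row order
    .rename_axis("Location")
    .reset_index()
)
print(out)
```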
