I have the following two DataFrames:
import pandas as pd

d = {'serial#': ['AAA111', 'BBB222', 'CCC333', 'DDD444', 'EEE555'],
     'Status': ['Compatible', 'Compatible', 'Not compatible', 'Not compatible', 'Not compatible']}
df1 = pd.DataFrame(data=d)
d2 = {'serial#': ['DDD444', 'EEE555'], 'Model': ['printer', 'printer']}
df2 = pd.DataFrame(data=d2)
print(df1)
print(df2)
  serial#          Status
0  AAA111      Compatible
1  BBB222      Compatible
2  CCC333  Not compatible
3  DDD444  Not compatible
4  EEE555  Not compatible
  serial#    Model
0  DDD444  printer
1  EEE555  printer
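I need to modify the 'Status' column of df1. To do that, I need to build a filter from the 'serial#' column of each DataFrame: wherever a 'serial#' value in df1 matches one in df2's 'serial#' column, the 'Status' value in that row of df1 should be changed.
Suppose I need to replace the data in the filtered rows with the string 'Reporting'. The desired df1 would then look like this: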
  serial#          Status
0  AAA111      Compatible
1  BBB222      Compatible
2  CCC333  Not compatible
3  DDD444       Reporting
4  EEE555       Reporting
How can I do this? Any help is welcome.
Use isin to check which serial# values in df1 also appear in df2, then use the result as a boolean mask to set Status to 'Reporting':
df1.loc[df1['serial#'].isin(df2['serial#']), 'Status'] = 'Reporting'
print(df1)
# Output
  serial#          Status
0  AAA111      Compatible
1  BBB222      Compatible
2  CCC333  Not compatible
3  DDD444       Reporting
4  EEE555       Reporting
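To make the mask itself visible, here is a minimal sketch that prints the boolean Series produced by isin and then applies it without modifying df1 in place. The names mask and result are illustrative and not part of the answer above; using Series.mask with assign is just one possible non-mutating variant.

```python
import pandas as pd

df1 = pd.DataFrame({
    'serial#': ['AAA111', 'BBB222', 'CCC333', 'DDD444', 'EEE555'],
    'Status': ['Compatible', 'Compatible', 'Not compatible',
               'Not compatible', 'Not compatible'],
})
df2 = pd.DataFrame({'serial#': ['DDD444', 'EEE555'],
                    'Model': ['printer', 'printer']})

# Boolean Series aligned with df1's index:
# True where the serial# also appears in df2.
mask = df1['serial#'].isin(df2['serial#'])
print(mask.tolist())  # [False, False, False, True, True]

# Non-mutating variant (illustrative): Series.mask replaces values
# where the condition is True, and assign returns a new DataFrame.
result = df1.assign(Status=df1['Status'].mask(mask, 'Reporting'))
print(result)
```

If the original df1 no longer needs to be kept, the one-line .loc assignment above is simpler.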
You can perform a left join between the two, then apply a conditional to the Status column:
import pandas as pd
import numpy as np
d = {'serial#': ['AAA111', 'BBB222', 'CCC333', 'DDD444', 'EEE555'], 'Status': ['Compatible', 'Compatible', 'Not compatible', 'Not compatible', 'Not compatible']}
df1 = pd.DataFrame(data=d)
d2 = {'serial#': ['DDD444', 'EEE555'], 'Model': ['printer', 'printer']}
df2 = pd.DataFrame(data=d2)
df3 = df1.merge(df2, on='serial#', how='left')
df3['Status'] = np.where(df3['Model'].notnull(), 'Reporting', df3['Status'])
df3 = df3.drop('Model', axis=1)
df3
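One caveat with the join approach: it relies on Model being non-null for every matching row, and a serial# that appears more than once in df2 would duplicate rows of df1. A possible variation, sketched below on the assumption that only key membership matters, deduplicates the keys and uses merge's indicator=True option instead of inspecting Model. The names keys, merged and df3_alt are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'serial#': ['AAA111', 'BBB222', 'CCC333', 'DDD444', 'EEE555'],
    'Status': ['Compatible', 'Compatible', 'Not compatible',
               'Not compatible', 'Not compatible'],
})
df2 = pd.DataFrame({'serial#': ['DDD444', 'EEE555'],
                    'Model': ['printer', 'printer']})

# Keep only the unique join keys so the left join cannot multiply df1's rows.
keys = df2[['serial#']].drop_duplicates()

# indicator=True adds a '_merge' column: 'both' for rows whose key was
# found in keys, 'left_only' for rows present only in df1.
merged = df1.merge(keys, on='serial#', how='left', indicator=True)
merged['Status'] = np.where(merged['_merge'] == 'both',
                            'Reporting', merged['Status'])
df3_alt = merged.drop(columns='_merge')
print(df3_alt)
```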