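I would like a fast, efficient way to set the values in certain columns, for every row whose 'id' is in ['b', 'c'], based on the values in another data frame. Here is a simple example of how I tried to do this with DataFrame.update.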
import pandas as pd

data = {'id': ['a', 'b', 'b', 'c'],
        'col_0': ['e', 'f', 'g', 'h'],
        'col_1': ['m', 'n', 'o', 'p'],
        'col_2': ['q', 'r', 's', 't']}
df = pd.DataFrame.from_dict(data)
df
#the data frame dictating the changes to be made
cols=['col_1','col_2']
chg_dict={'b': ['b_0','b_1'],'c': ['c_0','c_1']}
chg_df=pd.DataFrame.from_dict(chg_dict,orient='index',columns=cols)
chg_df
#make the change
for chg in chg_df.index:
    #mask to get index where id is in chg_dict
    mask = [r for r in df.index if df.loc[r, 'id'] == chg]
    #this is apparently where I go wrong, nothing changes
    df.loc[mask, cols].update(chg_df)
df
I have tried this both with and without indexing by cols.
https://pandas.pydata.org/pandas-docs/stable/reference/reference/pandas.dataframe.update.html
If I understand correctly, you can try:
m=df.set_index('id')
m.update(chg_df)
df=m.reset_index()
print(df)
  id col_0 col_1 col_2
0  a     e     m     q
1  b     f   b_0   b_1
2  b     g   b_0   b_1
3  c     h   c_0   c_1
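
As for why the loop in the question changes nothing: as far as I can tell, df.loc[mask, cols] with a list of row labels returns a new DataFrame, so .update() modifies that temporary object and df itself is left untouched. If you would rather keep the per-id loop, a minimal sketch (assuming the same df, cols and chg_df as in the question) is to assign through df.loc directly. This is just one alternative, not necessarily the fastest.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'id': ['a', 'b', 'b', 'c'],
    'col_0': ['e', 'f', 'g', 'h'],
    'col_1': ['m', 'n', 'o', 'p'],
    'col_2': ['q', 'r', 's', 't'],
})
cols = ['col_1', 'col_2']
chg_df = pd.DataFrame.from_dict(
    {'b': ['b_0', 'b_1'], 'c': ['c_0', 'c_1']},
    orient='index', columns=cols,
)

for chg in chg_df.index:
    # Boolean mask of the rows whose id matches the current key.
    mask = df['id'] == chg
    # Assign through df.loc so df itself is modified. to_numpy() drops the
    # labels, so the two replacement values are written to every matched row.
    df.loc[mask, cols] = chg_df.loc[chg, cols].to_numpy()

print(df)
```

The boolean mask also replaces the list comprehension over df.index, which does one .loc lookup per row and is likely to be slow on a large frame.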