I have the DataFrame below and need to modify the profession column, except for rows where the value is doctor.
id firstname lastname email profession
0 100 Ekaterina Skell Ekaterina.Skell@yopmail.com developer
1 101 Judy Vernier Judy.Vernier@yopmail.com police officer
2 102 Tarra Diann Tarra.Diann@yopmail.com police officer
3 103 Odessa Maxi Odessa.Maxi@yopmail.com firefighter
4 104 Mallory Peonir Mallory.Peonir@yopmail.com firefighter
5 105 Nataline Hoenack Nataline.Hoenack@yopmail.com doctor
6 106 Dude Adrienne Dode.Adrienne@yopmail.com developer
7 107 Caressa Meli Caressa.Meli@yopmail.com doctor
8 108 Zaria Carey Zaria.Carey@yopmail.com firefighter
9 109 Harmonia Seumas Harmonia.Seumas@yopmail.com worker
Here is what I tried:
if src[src['profession'].isin(['doctor'])]:
    src['profession'] = src['profession'].astype(str)+'-Done'
But I get the error below:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I get the following output (nothing should be appended where the value is doctor)?
id firstname lastname email profession
0 100 Ekaterina Skell Ekaterina.Skell@yopmail.com developer-Done
1 101 Judy Vernier Judy.Vernier@yopmail.com police officer-Done
2 102 Tarra Diann Tarra.Diann@yopmail.com police officer-Done
3 103 Odessa Maxi Odessa.Maxi@yopmail.com firefighter-Done
4 104 Mallory Peonir Mallory.Peonir@yopmail.com firefighter-Done
5 105 Nataline Hoenack Nataline.Hoenack@yopmail.com doctor
6 106 Dude Adrienne Dode.Adrienne@yopmail.com developer-Done
7 107 Caressa Meli Caressa.Meli@yopmail.com doctor
8 108 Zaria Carey Zaria.Carey@yopmail.com firefighter-Done
9 109 Harmonia Seumas Harmonia.Seumas@yopmail.com worker-Done
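The if statement fails because a DataFrame has no single truth value. Build a boolean mask instead and use DataFrame.loc with the mask inverted by ~: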
# to compare against a single scalar value
m = src['profession'].eq('doctor')
# to compare against a list of values
m = src['profession'].isin(['doctor'])
# append the suffix only to rows where the mask is False
src.loc[~m, 'profession'] = src.loc[~m, 'profession'].astype(str)+'-Done'
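As a quick check, the inverted-mask approach can be tried on a small frame. This is only a minimal sketch: the three-row sample is a hypothetical stand-in for the DataFrame in the question, and the name result is introduced purely for illustration.

```python
import pandas as pd

# Hypothetical three-row sample standing in for the question's DataFrame
src = pd.DataFrame({
    "id": [100, 105, 109],
    "profession": ["developer", "doctor", "worker"],
})

# True on the rows that must stay unchanged
m = src["profession"].eq("doctor")

# ~m selects every other row; only those rows get the suffix
src.loc[~m, "profession"] = src.loc[~m, "profession"].astype(str) + "-Done"

result = src["profession"].tolist()
print(result)  # ['developer-Done', 'doctor', 'worker-Done']
```

Because the assignment goes through .loc on both sides with the same mask, the doctor rows are never touched.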
Or use numpy.where:
import numpy as np
src['profession'] = np.where(m, src['profession'], src['profession'].astype(str)+'-Done')
If you need to guarantee that every value ends up as a string:
s = src['profession'].astype(str)
src['profession'] = np.where(m, s, s+'-Done')
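The string-only variant matters mainly when the column holds non-string values. The sketch below assumes a hypothetical column that mixes strings with an integer (the value 7 is made up for illustration) to show that converting once with astype(str) lets both branches of np.where work on strings.

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing strings and an integer
src = pd.DataFrame({"profession": ["developer", "doctor", 7]})

# True on the rows that must stay unchanged
m = src["profession"].eq("doctor")

# Convert once so both np.where branches are built from strings
s = src["profession"].astype(str)
src["profession"] = np.where(m, s, s + "-Done")

result = src["profession"].tolist()
print(result)  # ['developer-Done', 'doctor', '7-Done']
```

np.where picks from s where the mask is True and from s + '-Done' elsewhere, so the whole column is rebuilt in one pass rather than updated in place.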