Can I use the pandas loc method to select multiple columns and replace the resulting row values with NaN?



I have a DataFrame like this:

import numpy as np
import pandas as pd

students = {'ID': [2, 3, 5, 7, 11, 13],
            'Name': ['John', 'Jane', 'Sam', 'James', 'Stacy', 'Mary'],
            'Gender': ['M', 'F', 'F', 'M', 'F', 'F'],
            'school_name': ['College2', 'College2', 'College10', 'College2', 'College2', 'College2'],
            'grade': ['9th', '10th', '9th', '9th', '8th', '5th'],
            'math_score': [90, 89, 88, 89, 89, 90],
            'art_score': [90, 89, 89, 78, 90, 94]}

students_df = pd.DataFrame(students)

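Can I use the loc method on students_df to select all the math_score and art_score values for 9th graders at College2 and replace them with NaN? Is there a clean way to do this without splitting the process into two parts, one for subsetting and another for replacing?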

I tried selecting like this:

students_df.loc[(students_df['school_name'] == 'College2') & (students_df['grade'] == "9th"),['grade','school_name','math_score','art_score']]

And I replaced the values like this:

students_df['math_score'] = np.where((students_df['school_name'] == 'College2') & (students_df['grade'] == '9th'), np.nan, students_df['math_score'])

Can I achieve the same thing in a cleaner, more efficient way using loc and np.nan?

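Select the rows with the boolean mask and the columns whose values should become missing in the same loc call, then assign NaN: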

students_df.loc[(students_df['school_name'] == 'College2') & (students_df['grade'] == "9th"),['math_score','art_score']] = np.nan
print(students_df)
   ID   Name Gender school_name grade  math_score  art_score
0   2   John      M    College2   9th         NaN        NaN
1   3   Jane      F    College2  10th        89.0       89.0
2   5    Sam      F   College10   9th        88.0       89.0
3   7  James      M    College2   9th         NaN        NaN
4  11  Stacy      F    College2   8th        89.0       90.0
5  13   Mary      F    College2   5th        90.0       94.0
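
If the condition is long, it may read more cleanly with the mask and the column list stored in variables first. The following is only a sketch of the same loc assignment: the names is_target and score_cols are made up for illustration and do not appear in the question, and the DataFrame is trimmed to the relevant columns. It also casts the score columns to float before assigning, on the assumption that some newer pandas versions warn about (or reject) writing NaN into integer columns; on older versions the cast is harmless.

```python
import numpy as np
import pandas as pd

students_df = pd.DataFrame({
    'ID': [2, 3, 5, 7, 11, 13],
    'school_name': ['College2', 'College2', 'College10',
                    'College2', 'College2', 'College2'],
    'grade': ['9th', '10th', '9th', '9th', '8th', '5th'],
    'math_score': [90, 89, 88, 89, 89, 90],
    'art_score': [90, 89, 89, 78, 90, 94],
})

# Hypothetical helper names, introduced only for this sketch.
is_target = (students_df['school_name'] == 'College2') & (students_df['grade'] == '9th')
score_cols = ['math_score', 'art_score']

# NaN is a float value, so make the target columns float up front
# instead of relying on an implicit int -> float upcast.
students_df[score_cols] = students_df[score_cols].astype(float)

# Row mask and column list in one loc call: subset and replace in one step.
students_df.loc[is_target, score_cols] = np.nan

print(students_df)
```

Either way the score columns end up as float64, because an int64 column cannot hold NaN. If the scores need to stay integers, a nullable dtype such as 'Int64' together with pd.NA would be the option to look at.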
