I have this pd.DataFrame:
import numpy as np
import pandas as pd

test4 = pd.DataFrame({'condition': ['good', np.nan, np.nan, 'excellent', 'good', np.nan],
                      'odometer': [np.nan, 35000, 20000, 100000, 500000, 50]})
Output:
   condition  odometer
0       good       NaN
1        NaN   35000.0
2        NaN   20000.0
3  excellent  100000.0
4       good  500000.0
5        NaN      50.0
I can't work out how to fill the NaN values in the "condition" column using the values in the "odometer" column. The conditions are:
odometer <= 30000, then condition = 'good'
odometer > 30000 & odometer <= 150000, then condition = 'excellent'
odometer > 150000 & odometer <= 10000000, then condition = 'good'
The end result should look like this:
   condition  odometer
0       good       NaN
1  excellent   35000.0
2       good   20000.0
3  excellent  100000.0
4       good  500000.0
5       good      50.0
I have tried different things, but none of them work. For example:
def f(change):
    if change['condition'] == np.nan:
        condition = change['odometer']
        value = change['condition']
        if condition <=30000 value = 'good'
        elif condition > 30000 & condition <=150000 value = 'excellent'
        elif condition > 150000 & condition <=10000000 value = 'good'
        return value
    return change['condition']
test4['condition'] = test4.apply(f)
What am I doing wrong? Is there a way to make this work? Thank you very much.
I found the answer:
m1 = (test4['odometer'] > 30000) & (test4['odometer'] <= 150000)
m2 = (test4['odometer'] <= 30000) | (test4['odometer'] > 150000)
test4.loc[m1,'condition'] = test4.loc[m1,'condition'].fillna('excellent')
test4.loc[m2,'condition'] = test4.loc[m2,'condition'].fillna("good")
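
As for what went wrong in the apply attempt from the question, a few things stand out. `change['condition'] == np.nan` is never true, because NaN does not compare equal to anything (pd.isna is the usual check). `&` binds tighter than `>` and `<=`, so the range tests don't mean what they look like. The `if`/`elif` lines are missing their colons. And DataFrame.apply without axis=1 passes columns to the function, not rows. Below is a minimal sketch of a row-wise version with those points addressed. The names `fill_condition` and `filled` are made up here for illustration, and leaving a row unfilled when the odometer is missing or out of range is my assumption, not something stated in the question.

```python
import numpy as np
import pandas as pd

test4 = pd.DataFrame({
    'condition': ['good', np.nan, np.nan, 'excellent', 'good', np.nan],
    'odometer': [np.nan, 35000, 20000, 100000, 500000, 50],
})

def fill_condition(row):
    # Keep an existing label; pd.isna is needed because NaN == NaN is False.
    if not pd.isna(row['condition']):
        return row['condition']
    km = row['odometer']
    # Assumption: with no odometer reading there is nothing to infer from.
    if pd.isna(km):
        return row['condition']
    # Plain ordered comparisons on scalars, so no & is needed.
    if km <= 30000:
        return 'good'
    if km <= 150000:
        return 'excellent'
    if km <= 10000000:
        return 'good'
    # Assumption: values outside the stated ranges stay unfilled.
    return row['condition']

# axis=1 hands each row to the function instead of each column.
filled = test4.apply(fill_condition, axis=1)
print(filled)
```

The mask-based version above is still the better choice on large frames, since it avoids calling a Python function once per row.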