Fill NaN values based on another column



I have this pd.DataFrame:

import numpy as np
import pandas as pd

test4 = pd.DataFrame({'condition': ['good', np.nan, np.nan, 'excellent', 'good', np.nan],
                      'odometer': [np.nan, 35000, 20000, 100000, 500000, 50]})

Output:

   condition  odometer
0       good       NaN
1        NaN   35000.0
2        NaN   20000.0
3  excellent  100000.0
4       good  500000.0
5        NaN      50.0

I can't work out how to fill the NaN values in the "condition" column using the values in the "odometer" column. The rules are:

odometer <=30000, then condition = 'good'
odometer > 30000 & odometer <=150000, then value = 'excellent'
odometer > 150000 & odometer <=10000000, then value = 'good'

In the end it should look like this:

   condition  odometer
0       good       NaN
1  excellent   35000.0
2       good   20000.0
3  excellent  100000.0
4       good  500000.0
5       good      50.0

I have tried different things, but none of them work. For example:

def f(change):
    if change['condition'] == np.nan:
        condition = change['odometer']
        value = change['condition']
        if   condition <=30000 value = 'good'
        elif condition > 30000 & condition <=150000 value = 'excellent'
        elif condition > 150000 & condition <=10000000 value = 'good'
        return value
    return change['condition']
test4['condition'] = test4.apply(f)

What am I doing wrong? Is there a way to make this work? Thanks a lot.
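For reference, the attempt above appears to fail for a few separate reasons. `change['condition'] == np.nan` is always False, because NaN never compares equal to anything, so `pd.isna` is the usual test. `&` binds more tightly than the comparison operators, so `condition > 30000 & condition <= 150000` is not evaluated as two comparisons joined together. The `if`/`elif` lines are missing their colons. Finally, `DataFrame.apply` passes columns to the function unless `axis=1` is given. A minimal corrected sketch of the row-wise approach follows; the names `fill_condition` and `filled` are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

test4 = pd.DataFrame({
    'condition': ['good', np.nan, np.nan, 'excellent', 'good', np.nan],
    'odometer': [np.nan, 35000, 20000, 100000, 500000, 50],
})

def fill_condition(row):
    # NaN never equals itself, so test with pd.isna instead of == np.nan
    if pd.isna(row['condition']):
        odo = row['odometer']
        # Branches are checked in order, so each one only needs its upper bound
        if odo <= 30000:
            return 'good'
        elif odo <= 150000:
            return 'excellent'
        elif odo <= 10000000:
            return 'good'
    # Keep the existing value (still NaN when no rule matched)
    return row['condition']

# axis=1 passes each row, rather than each column, to the function
filled = test4.apply(fill_condition, axis=1)
```

Row-wise `apply` is easy to read but tends to be slow on large frames, so a vectorized approach is usually preferable when performance matters.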

I found the answer:

m1 = (test4['odometer'] > 30000) & (test4['odometer'] <= 150000)
m2 = (test4['odometer'] <= 30000) | (test4['odometer'] > 150000)
test4.loc[m1,'condition'] = test4.loc[m1,'condition'].fillna('excellent')
test4.loc[m2,'condition'] = test4.loc[m2,'condition'].fillna("good")
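As a possible alternative to the two masks above, the three rules can be written once with `np.select` and applied with a single `fillna`. This is only a sketch under the same assumptions as the question; the names `derived` and `result` are illustrative. Unlike `m2`, it also keeps the upper bound of 10000000 from the original rules.

```python
import numpy as np
import pandas as pd

test4 = pd.DataFrame({
    'condition': ['good', np.nan, np.nan, 'excellent', 'good', np.nan],
    'odometer': [np.nan, 35000, 20000, 100000, 500000, 50],
})

odo = test4['odometer']

# np.select checks the conditions in order and takes the first match,
# so each entry only needs its upper bound
conditions = [
    odo <= 30000,
    odo <= 150000,
    odo <= 10000000,
]
choices = ['good', 'excellent', 'good']

# Rows that match no rule (for example a NaN odometer) get None
derived = pd.Series(np.select(conditions, choices, default=None),
                    index=test4.index)

# fillna only touches rows where 'condition' is missing;
# existing labels are left as they are
result = test4['condition'].fillna(derived)
```

Assigning `result` back with `test4['condition'] = result` would update the frame in place of the two `.loc` assignments.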
