Looping with iterrows and replacing null values with model predictions

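I filled in some missing values (NaN) with predictions from a KNN Regressor model. Now I want to put the predictions into the original dataframe as a new column, keeping the original values for the rows that are not NaN. This will be an entirely new column in my dataframe, which I will use to build features.

I am using an iterrows loop to build the new column, but I am getting an error. I tried two different ways of isolating the NaN values, and I have a problem with each.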

sticker_price_preds = []
features = ['region_x', 'barrons', 'type_x', 'tier_x', 'iclevel_x', 
'exp_instr_pc_2013']
for index, row in data.iterrows():
    val = row['sticker_price_2013']
    if data[data['sticker_price_2013'].isnull()]:
        f = row['region_x', 'barrons', 'type_x', 'tier_x', 'iclevel_x', 
'exp_instr_pc_2013']
        val = knn.predict(f)
    sticker_price_preds.append(val)
data['sticker_price_preds'] = sticker_price_preds

and

sticker_price_preds = []
features = ['region_x', 'barrons', 'type_x', 'tier_x', 'iclevel_x', 
'exp_instr_pc_2013']
for index, row in data.iterrows():
    val = row['sticker_price_2013']
    if not val:
        f = row['region_x', 'barrons', 'type_x', 'tier_x', 'iclevel_x', 
'exp_instr_pc_2013']
        val = knn.predict(f)
    sticker_price_preds.append(val)
data['sticker_price_preds'] = sticker_price_preds

The first approach returns the following error message:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

With the second approach, the NaN rows still remain empty.
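
For what it's worth, the second symptom can be reproduced without the model. The sketch below uses only the standard library and a made-up value standing in for a missing price. It shows that NaN is truthy, so `if not val` never fires for a missing value, whereas an explicit `math.isnan` check does detect it. (The first error has a different cause: `data[data['sticker_price_2013'].isnull()]` is a whole DataFrame, which has no single truth value.)

```python
import math

val = float("nan")  # stand-in for a missing sticker_price_2013 value

# NaN is a non-zero float, so it is truthy and `not val` is False.
not_val = not val

# An explicit NaN test is what the loop condition needs instead.
is_missing = math.isnan(val)

print(not_val, is_missing)
```

Inside the loop, `pd.isna(val)` would play the same role as `math.isnan(val)` here.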

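It's a bit hard to try this without data, but if you want a vectorized solution, this might work. Make a column with the knn.predict values, then filter the dataframe on np.nan: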

# features is only the list of column names, so pass the feature columns themselves
data['predict'] = knn.predict(data[features])

data.loc[data['sticker_price_2013'].isna(),'sticker_price_2013'] = data.loc[data['sticker_price_2013'].isna(), 'predict']
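
Since the question asks for a new column that keeps the original values, here is a minimal sketch of that variant. It runs on a tiny made-up frame with a single feature and assumes `knn` is a scikit-learn `KNeighborsRegressor` fitted on the rows where the price is known. The toy data and `n_neighbors=2` are illustrative assumptions, not taken from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical toy data: one feature column, two missing prices.
data = pd.DataFrame({
    "exp_instr_pc_2013": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "sticker_price_2013": [10.0, 20.0, np.nan, 40.0, np.nan, 60.0],
})
features = ["exp_instr_pc_2013"]

# Boolean mask of the rows whose price is missing.
missing = data["sticker_price_2013"].isna()

# Fit only on rows where the target is known (assumed setup for this sketch).
knn = KNeighborsRegressor(n_neighbors=2)
knn.fit(data.loc[~missing, features], data.loc[~missing, "sticker_price_2013"])

# Start from a copy of the original column, then overwrite only the NaN rows.
data["sticker_price_preds"] = data["sticker_price_2013"]
data.loc[missing, "sticker_price_preds"] = knn.predict(data.loc[missing, features])
```

Predicting only for the masked rows leaves `sticker_price_2013` untouched and avoids calling the model once per row, which is the main cost of the iterrows version.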
