Avoiding repeated column headers when creating a pandas DataFrame



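I'm a beginner trying to build a DataFrame that stores some model performance metrics (R², RMSE, training time, prediction time, etc.), as in the example below. However, the result is a DataFrame with repeated column headers. Could you help me avoid this? The goal is to have a single header for the whole df. The problem must come from the for loop, but I don't know how to fix it. Thanks.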

models = [LinearRegression(), Ridge(), Lasso(), ElasticNet()]
for model in models:
    start = time.time()
    model.fit(X_train, y_train)
    stop = time.time()
    start1 = time.time()
    predictions = model.predict(X_train)
    stop1 = time.time()
    results = {
        'Model': type(model).__name__,
        'R²_score': r2_score(y_train, predictions),
        'RMSE': mean_squared_error(y_train, predictions),
        'AB_Av_ERR': mean_absolute_error(y_train, predictions),
        'Training_time': stop - start,
        'Pred_time': stop1 - start1,
    }
    df_res = pd.DataFrame(results, index=[0])
    print(df_res)

This is the output:

>               Model  R²_score     RMSE  AB_Av_ERR  Training_time  Pred_time
> 0  LinearRegression      0.01  1736.28      21.28           0.86       0.07
>    Model  R²_score     RMSE  AB_Av_ERR  Training_time  Pred_time
> 0  Ridge      0.01  1736.28      21.28           0.32       0.08
>    Model  R²_score     RMSE  AB_Av_ERR  Training_time  Pred_time
> 0  Lasso      0.01  1740.02      21.26           0.99       0.08
>         Model  R²_score     RMSE  AB_Av_ERR  Training_time  Pred_time
> 0  ElasticNet      0.01  1740.14      21.28           0.89       0.08

You can first create an empty DataFrame outside the loop:

df = pd.DataFrame(columns=['Model', 'R²_score', 'RMSE', 'AB_Av_ERR', 'Training_time', 'Pred_time'])

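Then append the values inside the loop, as follows (note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this line needs an older pandas version):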

df = df.append(results, ignore_index=True)
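
Since DataFrame.append is not available in pandas 2.0 and later, one alternative is to collect one dict per iteration in a plain list and build the DataFrame once after the loop. The sketch below is only an illustration under stated assumptions: the names tasks and rows are hypothetical, and simple timed functions stand in for the scikit-learn models so that it runs with only pandas installed.

```python
import time

import pandas as pd

# Hypothetical stand-ins for the models in the question: each entry is a
# name paired with a function to time, so no scikit-learn is required.
tasks = {
    "sum_of_squares": lambda: sum(i * i for i in range(10_000)),
    "sorted_range": lambda: sorted(range(10_000), reverse=True),
}

rows = []  # one dict per "model", collected inside the loop
for name, task in tasks.items():
    start = time.perf_counter()
    task()
    stop = time.perf_counter()
    rows.append({"Model": name, "Training_time": stop - start})

# Build the DataFrame once, after the loop, so the header appears only once
df_res = pd.DataFrame(rows)
print(df_res)
```

Building the frame once also avoids copying the whole DataFrame on every iteration, which is what row-by-row appending does.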

Try this:

models = [LinearRegression(), Ridge(), Lasso(), ElasticNet()]
# Create the empty DataFrame once, before the loop
df_res = pd.DataFrame(columns=['Model', 'R²_score', 'RMSE', 'AB_Av_ERR', 'Training_time', 'Pred_time'])
for model in models:
    start = time.time()
    model.fit(X_train, y_train)
    stop = time.time()
    start1 = time.time()
    predictions = model.predict(X_train)
    stop1 = time.time()
    results = {
        'Model': type(model).__name__,
        'R²_score': r2_score(y_train, predictions),
        'RMSE': mean_squared_error(y_train, predictions),
        'AB_Av_ERR': mean_absolute_error(y_train, predictions),
        'Training_time': stop - start,
        'Pred_time': stop1 - start1,
    }
    # DataFrame.append was removed in pandas 2.0; on newer versions use
    # pd.concat([df_res, pd.DataFrame([results])], ignore_index=True) instead
    df_res = df_res.append(results, ignore_index=True)

# Print once, after the loop, so the header is shown a single time
print(df_res)
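
A side note on the metrics: the column labelled 'RMSE' is filled with mean_squared_error, which by default returns the mean squared error rather than its square root. The following is a minimal sketch of the relationship between the two, written in plain Python so it runs without scikit-learn; the values of y_true and y_pred are made up for illustration and are not taken from the question's data.

```python
import math

# Illustrative values only (hypothetical, not from the question)
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

errors = [t - p for t, p in zip(y_true, y_pred)]

# Mean squared error: the quantity mean_squared_error returns by default
mse = sum(e * e for e in errors) / len(errors)
# Root mean squared error: the square root of the MSE
rmse = math.sqrt(mse)
# Mean absolute error, for comparison
mae = sum(abs(e) for e in errors) / len(errors)

print(mse, rmse, mae)
```

If a true RMSE is wanted in the results table, taking the square root of the mean_squared_error value before storing it is one way to get it.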
