Python scikit-learn: how to export the classification report and confusion matrix results to Excel



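How can I export the results to an Excel file? I tried the script below, but it does not give me the proper output. If a class is present in the test data set but its label never appears in the predicted column, that class is missing from the output.

Is there another way to achieve this? I want to present the model results in Excel format.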

import pandas as pd
expected = y_test
y_actu = pd.Series(expected, name='Actual')
y_pred = pd.Series(predicted, name='Predicted')
df_confusion = pd.crosstab(y_actu, y_pred,y_test.unique())
df_confusion

df_confusion.to_csv('SVM_Confusion_Matrix.csv')
from pandas import ExcelWriter
writer = ExcelWriter('SVM_Confusion_Matrix.xlsx')
df_confusion.to_excel(writer,'Sheet1')
writer.save()
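
One possible way to keep classes that are never predicted is to reindex the cross-tabulation against the full label list, so that absent classes appear as zero counts. The sketch below is only an illustration: the helper name `full_confusion_table` and the toy data are made up, and it assumes the actual and predicted values are passed as plain sequences of equal length.

```python
import pandas as pd

def full_confusion_table(actual, predicted, labels):
    """Cross-tabulate actual vs. predicted, keeping every label in `labels`."""
    # Build both Series from plain lists so they share a default 0..n-1 index;
    # crosstab aligns on the index, so mismatched indexes would lose rows.
    y_actu = pd.Series(list(actual), name='Actual')
    y_pred = pd.Series(list(predicted), name='Predicted')
    table = pd.crosstab(y_actu, y_pred)
    # Reindex rows and columns so that absent classes show up as zero counts.
    return table.reindex(index=labels, columns=labels, fill_value=0)

# Toy data (made up): class 'c' occurs in the test set but is never predicted.
actual = ['a', 'b', 'c', 'a', 'c']
predicted = ['a', 'b', 'a', 'a', 'b']
df_confusion = full_confusion_table(actual, predicted, labels=['a', 'b', 'c'])
print(df_confusion)
```

The resulting table can be written with `df_confusion.to_excel('SVM_Confusion_Matrix.xlsx', sheet_name='Sheet1')`, which needs an Excel engine such as openpyxl to be installed.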

You can use the following code for the classification report:

  1. report = classification_report(y_actu, y_pred, output_dict=True)

  2. df = pd.DataFrame(report).transpose()

  3. df.to_excel('classification_report.xlsx')
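
For reference, here is a self-contained sketch of the same steps with the imports spelled out. The toy labels are made up for illustration, and passing `labels` and `zero_division=0` is my own addition (not part of the answer) so that every class gets a row even if it is never predicted.

```python
import pandas as pd
from sklearn.metrics import classification_report

# Toy data (made up for illustration).
y_true = ['cat', 'dog', 'cat', 'bird', 'dog', 'bird']
y_hat = ['cat', 'dog', 'dog', 'bird', 'dog', 'cat']
labels = ['bird', 'cat', 'dog']

# output_dict=True returns a nested dict instead of a formatted string.
report = classification_report(
    y_true, y_hat, labels=labels, output_dict=True, zero_division=0
)

# One row per class plus the summary rows (accuracy, macro avg, weighted avg).
report_df = pd.DataFrame(report).transpose()
print(report_df)
```

Calling `report_df.to_excel('classification_report.xlsx')` afterwards writes the table out; pandas needs an Excel engine such as openpyxl installed for that step.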

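This function builds a confusion matrix DataFrame that also includes the recall and precision scores. The DataFrame can then easily be exported to Excel. It works for any number of classes.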

import pandas as pd
from sklearn.metrics import confusion_matrix

def confusion_max_df(actual, prediction, labels):
    """
    Input: A list of actual values, prediction values and labels
    returns: A data frame of confusion matrix embedded with precision and recall
    """
    
    # confusion matrix from the sklearn.metrics module
    cnf_matrix = confusion_matrix(actual, prediction, labels=labels)
    
    # calculating recall and precision for each category label
    tp_and_fn = cnf_matrix.sum(1)
    tp_and_fp = cnf_matrix.sum(0)
    tp = cnf_matrix.diagonal()
    # format as whole percentages; avoids float artifacts such as '56.99999999999999%'
    precision = [f'{num:.0%}' for num in tp / tp_and_fp]
    recall = [f'{num:.0%}' for num in tp / tp_and_fn]
    
    # creating dataframe for exporting to excel
    cnf_matrix_df = pd.DataFrame(cnf_matrix, columns=labels)
    cnf_matrix_df = cnf_matrix_df.add_prefix('Predicted - ')
    actual_list = ['Actual - ' + str(x)  for x in labels]
    cnf_matrix_df['Confusion matrix'] = actual_list
    cnf_matrix_df = cnf_matrix_df.set_index('Confusion matrix')
    cnf_matrix_df['Recall'] = recall
    
    # adding a row in the dataframe for precision scores (the Recall cell stays blank)
    cnf_matrix_df.loc['Precision'] = precision + ['']
    
    return cnf_matrix_df


# Example usage with two classes
confusion_max_df(
    ['Cat A', 'Cat A', 'Cat B', 'Cat B', 'Cat A', 'Cat B'],  # actual
    ['Cat A', 'Cat A', 'Cat B', 'Cat B', 'Cat A', 'Cat A'],  # predicted
    ['Cat A', 'Cat B'],                                      # labels
)
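
To put the confusion matrix and the classification report into one workbook, something along these lines may help. Note that the question's script calls `writer.save()`, which pandas 2.0 and later no longer provide; using `pd.ExcelWriter` as a context manager saves the file on exit instead. The helper name `export_sheets`, the sheet names, and the placeholder tables are all hypothetical.

```python
import pandas as pd

def export_sheets(frames, path):
    """Write each DataFrame in `frames` (sheet name -> DataFrame) to one workbook."""
    # The context manager saves and closes the file when the block exits.
    with pd.ExcelWriter(path) as writer:
        for sheet_name, frame in frames.items():
            frame.to_excel(writer, sheet_name=sheet_name)

# Placeholder tables standing in for the real confusion matrix and report.
frames = {
    'Confusion matrix': pd.DataFrame(
        [[3, 0], [1, 2]],
        index=['Actual - Cat A', 'Actual - Cat B'],
        columns=['Predicted - Cat A', 'Predicted - Cat B'],
    ),
    'Classification report': pd.DataFrame(
        {'precision': [0.75, 1.0], 'recall': [1.0, 0.67]},
        index=['Cat A', 'Cat B'],
    ),
}

# Example call, left commented out because it writes a file and needs an
# Excel engine such as openpyxl to be installed:
# export_sheets(frames, 'SVM_results.xlsx')
```

Keeping each table on its own sheet avoids having to line up differently shaped tables on a single sheet.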
