Inverse transform of predicted results



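I have a training data CSV with three columns (two for data and a third for the target), and I have successfully predicted the target column for a test CSV. The problem is that I need to inverse transform the results back into strings for further analysis. Below are the code and the error.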

from sklearn import datasets
from sklearn import svm
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.cross_validation import train_test_split
from sklearn.preprocessing import LabelEncoder
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from collections import defaultdict
df_train = pd.read_csv('/Users/justinchristensen/Documents/Python_Education/SKLearn/Path_Training_Data.csv')
df_test = pd.read_csv('/Users/justinchristensen/Documents/Python_Education/SKLearn/Path_Test_Data.csv')
#Separate columns in training data set
x_train = df_train.iloc[:,:-1]
y_train = df_train.iloc[:,-1:]
#Separate columns in test data set
x_test = df_test.iloc[:,:-1]
#Initiate classifier
clf = svm.SVC(gamma=0.001, C=100)
le = LabelEncoder()
#Transform strings into integers
x_train_encoded = x_train.apply(LabelEncoder().fit_transform)
y_train_encoded = y_train.apply(LabelEncoder().fit_transform)
x_test_encoded = x_test.apply(LabelEncoder().fit_transform)
#Fit the model into the classifier
clf.fit(x_train_encoded,y_train_encoded)
#Predict test values
y_pred = clf.predict(x_test_encoded)

Error

NotFittedError
Traceback (most recent call last)
<ipython-input-38-09840b0071d5> in <module>()
1 
----> 2 y_pred_inverse = le.inverse_transform(y_pred)
~/anaconda3/lib/python3.6/site-packages/sklearn/preprocessing/label.py in inverse_transform(self, y)
146         y : numpy array of shape [n_samples]
147         """
--> 148         check_is_fitted(self, 'classes_')
149 
150         diff = np.setdiff1d(y, np.arange(len(self.classes_)))
~/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in check_is_fitted(estimator, attributes, msg, all_or_any)
766 
767     if not all_or_any([hasattr(estimator, attr) for attr in attributes]):
--> 768         raise NotFittedError(msg % {'name': type(estimator).__name__})
769 
770 
NotFittedError: This LabelEncoder instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

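To recover the original labels, you need to use the same LabelEncoder object that was used to transform the targets. In your code, every call to LabelEncoder() instantiates a new object, so the encoder stored in le is never fitted. Use the same object throughout.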
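To make the cause concrete, here is a minimal sketch with made-up labels (the `labels` array is hypothetical, not the asker's data). It suggests why `le` stays unfitted: `LabelEncoder().fit_transform` fits a separate, throwaway encoder, so `le` never sees any data.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import LabelEncoder

# Hypothetical target labels, for illustration only
labels = np.array(["left", "right", "left", "up"])

le = LabelEncoder()                              # created, but never fitted
encoded = LabelEncoder().fit_transform(labels)   # fits a different, throwaway object

try:
    le.inverse_transform(encoded)
    le_is_fitted = True
except NotFittedError:
    # `le` has no classes_ attribute, so it cannot map integers back to strings
    le_is_fitted = False
```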

Change the following lines:

y_train_encoded = y_train.apply(le.fit_transform)
# Only if the test set has a target column: reuse the fitted encoder, do not refit it
y_test_encoded = y_test.apply(le.transform)

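Then use that same object to invert the transformation, i.e. call le.inverse_transform(y_pred). For reference, see the first example in the LabelEncoder documentation.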
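The whole round trip might look like the sketch below. The arrays are small hypothetical stand-ins for the asker's CSV files, and the features are assumed to be numeric already so that only the target encoding is shown; the classifier settings are copied from the question.

```python
import numpy as np
from sklearn import svm
from sklearn.preprocessing import LabelEncoder

# Hypothetical numeric features and string targets (not the asker's CSVs)
x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]])
y_train = np.array(["slow", "slow", "fast", "fast", "slow", "fast"])
x_test = np.array([[0, 0], [1, 1]])

le = LabelEncoder()
y_train_encoded = le.fit_transform(y_train)   # fit the one encoder we keep

clf = svm.SVC(gamma=0.001, C=100)
clf.fit(x_train, y_train_encoded)

y_pred = clf.predict(x_test)                  # integer class ids
y_pred_labels = le.inverse_transform(y_pred)  # the same object maps ids back to strings
```

The key point is that `le` is fitted once on the training targets and then reused for `inverse_transform`, rather than being created again at each step.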
