I am trying to run my K-fold cross-validation, and this is what happened:
from sklearn import model_selection

kFold = model_selection.KFold(n_splits=5, shuffle=True)
# use the split function of KFold to split the housing data set
i = 0  # fold counter, initialised before its first use
for trainIndex, testIndex in kFold.split(df):
    print("Fold: ", i)
    print(trainIndex.shape)
    print(trainIndex)
    i += 1
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

lRegPara = [0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 1]
final_results = []
i = 0
for trainIndex, testIndex in kFold.split(df):
    # split the train test further
    trainX, validX, trainY, validY = train_test_split(np.array(X.iloc[trainIndex]),
                                                      np.array(Y.iloc[trainIndex]),
                                                      test_size=0.20, random_state=99)
    # optimise the linear regression
    lResults = []
    for regPara in lRegPara:
        polyLassoReg = Lasso(alpha=regPara, normalize=True)
        polyFitTrainX = polyreg.fit_transform(trainX)
        polyLassoReg.fit(polyFitTrainX, trainY)
        polyFitValidX = polyreg.fit_transform(validX)
        predictKY = polyLassoReg.predict(polyFitValidX)
        mse = mean_squared_error(predictKY, validY)
        lResults.append(mse)
    final_results.append(lResults)
    plt.plot(lRegPara, lResults)
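Why? I get this error: 'numpy.ndarray' object has no attribute 'iloc'. I have searched everywhere, but found no similar question. I also tried "loc" on the numpy array, and the result is still the same.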
Convert the numpy array to a pandas DataFrame:
df = pd.DataFrame({'column0': numpy_array[:, 0],
                   'column1': numpy_array[:, 1],
                   'column2': numpy_array[:, 2],
                   'column3': numpy_array[:, 3],
                   'column4': numpy_array[:, 4]})
Then you can use iloc and the other DataFrame features.
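To make the difference concrete, here is a minimal sketch, assuming a small made-up array in place of the housing data (the values, the 10x5 shape, the `random_state`, and the `column0`..`column4` names are illustrative, not taken from the question). It suggests that `iloc` exists only on the pandas object, and that selecting rows of the ndarray directly with the fold indices should give the same rows.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical stand-in data: 10 rows, 5 feature columns (not from the original post).
numpy_array = np.arange(50, dtype=float).reshape(10, 5)

# A plain ndarray has no .iloc; that accessor belongs to pandas objects.
has_iloc_before = hasattr(numpy_array, "iloc")

# Wrap the array in a DataFrame; the column names are illustrative.
columns = [f"column{j}" for j in range(numpy_array.shape[1])]
df = pd.DataFrame(numpy_array, columns=columns)

kFold = KFold(n_splits=5, shuffle=True, random_state=0)
fold_shapes = []
for trainIndex, testIndex in kFold.split(df):
    # Positional row selection works on the DataFrame ...
    train_rows = df.iloc[trainIndex]
    # ... and integer-array indexing picks the same rows from the ndarray.
    same_rows = numpy_array[trainIndex]
    assert np.array_equal(train_rows.to_numpy(), same_rows)
    fold_shapes.append(train_rows.shape)
```

If X and Y in the question are already numpy arrays, writing X[trainIndex] and Y[trainIndex] instead of X.iloc[trainIndex] and Y.iloc[trainIndex] would probably avoid the conversion altogether.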