Visualizing number of trees vs. OOB error: 'numpy.ndarray' object is not callable



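I want to plot the OOB error against the number of trees for my RandomForestRegressor and GradientBoostingRegressor. I wrote the lines below, but for some reason I get "'numpy.ndarray' object is not callable". Does anyone know why this doesn't work? Have a nice day, and thank you!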

train_results = []
test_results = []
list_nb_trees = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
for nb_trees in list_nb_trees:
    rf = RandomForestRegressor(n_estimators=nb_trees,
                               max_depth=None,
                               max_features=50,
                               min_samples_leaf=5,
                               min_samples_split=2,
                               random_state=42,
                               oob_score=True,
                               n_jobs=-1)
    rf.fit(X_train_v1, y_train_v1)

train_results.append(mean_squared_error(y_train_v1, rf.oob_prediction_(X_train_v1)))
test_results.append(mean_squared_error(y_test_v1, rf.oob_prediction_(X_test_v1)))

plt.figure(figsize=(15, 5))
line2, = plt.plot(list_nb_trees, test_results, color="g", label="Test OOB Score")
line1, = plt.plot(list_nb_trees, train_results, color="b", label="Training  OOB Score")
plt.title('Trainings- und Test Out-of-Bag Score')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=2)})
plt.ylabel('MSE')
plt.xlabel('n_estimators')
plt.show()

/opt/conda/lib/python3.7/site-packages/sklearn/ensemble/forest.py:737: UserWarning: Some inputs do not have OOB scores. This probably means too few trees were used to compute any reliable oob estimates.
warn("Some inputs do not have OOB scores. "
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-282-80f6bbb31b23> in <module>
14     rf.fit(X_train_v1, y_train_v1)
15 
---> 16 train_results.update(mean_squared_error(y_train_v1, rf.oob_prediction_(X_train_v1)))
17 test_results.update(mean_squared_error(y_test_v1, rf.oob_prediction_(X_test_v1)))
18 
TypeError: 'numpy.ndarray' object is not callable

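Take a look at the documentation for RandomForestRegressor: oob_prediction_ is an attribute, not a method. It is an array holding the OOB predictions for the training set, so calling it with parentheses raises the TypeError you see.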

So your code should look more like this:

train_oob_mse = mean_squared_error(y_train_v1, rf.oob_prediction_)

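In a sense, all test samples are "out of bag", but it is not common to call them that. It is simply the test error, and you have to call predict to compute it: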

test_mse = mean_squared_error(y_test_v1, rf.predict(X_test_v1))

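That said, your code only keeps the last trained rf, so each of your *_results lists will contain a single value, but I assume that is just a copy/paste error. Also, the warning "Some inputs do not have OOB scores." tells you that the OOB error is not reliable for those runs: with very few trees, some training samples are never out of bag and therefore have no OOB prediction at all.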
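Putting the points above together, here is a minimal sketch of the corrected loop. It is not the asker's exact setup: the data comes from make_regression as a stand-in for X_train_v1, y_train_v1, X_test_v1 and y_test_v1, the list of tree counts is shortened, and most hyperparameters are left at their defaults. Both error computations sit inside the loop, so each list gets one value per forest.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asker's data (assumption, not the real dataset).
X, y = make_regression(n_samples=300, n_features=20, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Shortened list; very small forests trigger the "no OOB scores" warning.
list_nb_trees = [30, 50, 100]
oob_results = []
test_results = []

for nb_trees in list_nb_trees:
    rf = RandomForestRegressor(n_estimators=nb_trees,
                               min_samples_leaf=5,
                               oob_score=True,
                               random_state=42)
    rf.fit(X_train, y_train)

    # oob_prediction_ is an array attribute: no parentheses, no arguments.
    oob_results.append(mean_squared_error(y_train, rf.oob_prediction_))

    # The test error needs an explicit call to predict.
    test_results.append(mean_squared_error(y_test, rf.predict(X_test)))

for nb_trees, oob_mse, test_mse in zip(list_nb_trees, oob_results, test_results):
    print(nb_trees, round(oob_mse, 2), round(test_mse, 2))
```

The two lists can then be passed to plt.plot against list_nb_trees exactly as in the question.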
