How do I use a custom metric in a Keras model when using GridSearchCV?



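I want to use R2 (the coefficient of determination) as a metric in my Keras model. To do that, I defined a function (coeff_determination). This function works fine as a metric without GridSearchCV, but with GridSearchCV it raises an error like "The model is not configured to compute accuracy. You should pass metrics=["accuracy"] to the model.compile() method". The code is below.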

# Imports inferred from the names used below (Keras 2.x-era API);
# the original post did not show them.
import numpy as np
from keras import backend as K
from keras.callbacks import History
from keras.layers import (BatchNormalization, Convolution1D, Dense, Flatten,
                          MaxPooling1D)
from keras.models import Sequential
from keras.optimizers import Adam
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import GridSearchCV


def create_model():
    # CNN Architecture - Model 7
    model = Sequential()
    model.add(Convolution1D(filters=10, kernel_size=12, activation="relu", kernel_initializer="glorot_uniform", input_shape=(X_train.shape[1], 1)))
    model.add(MaxPooling1D(pool_size=4, strides=2))
    model.add(BatchNormalization())
    model.add(Convolution1D(filters=16, kernel_size=12, activation='relu', kernel_initializer="glorot_uniform"))
    model.add(MaxPooling1D(pool_size=3, strides=2))
    model.add(BatchNormalization())
    model.add(Convolution1D(filters=22, kernel_size=12, activation='relu', kernel_initializer="glorot_uniform"))
    model.add(MaxPooling1D(pool_size=3, strides=2))
    model.add(BatchNormalization())
    model.add(Convolution1D(filters=28, kernel_size=12, activation='relu', kernel_initializer="glorot_uniform"))
    model.add(MaxPooling1D(pool_size=4, strides=2))
    model.add(BatchNormalization())
    model.add(Convolution1D(filters=34, kernel_size=12, activation='relu', kernel_initializer="glorot_uniform"))
    model.add(MaxPooling1D(pool_size=3, strides=2))
    model.add(BatchNormalization())
    model.add(Convolution1D(filters=40, kernel_size=12, activation='relu', kernel_initializer="glorot_uniform"))
    model.add(MaxPooling1D(pool_size=3, strides=2))
    model.add(BatchNormalization())
    model.add(Flatten())
    # model.add(Dropout(0.35))
    model.add(Dense(130, activation='relu'))
    # model.add(Dropout(0.35))
    model.add(Dense(130, activation='relu'))
    model.add(Dense(1, activation='linear'))
    history = History()
    model.compile(loss='mean_squared_error', optimizer=Adam(lr=0.0001), metrics=[coeff_determination])
    # model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=400, batch_size=30, callbacks=[history])
    return model


def coeff_determination(y_true, y_pred):
    # R^2 = 1 - SS_res / SS_tot, with epsilon guarding against division by zero
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return (1 - SS_res / (SS_tot + K.epsilon()))


# to reproduce the same results next time
seed = 7
np.random.seed(seed)
# Creating the Keras model with the scikit-learn wrapper
model = KerasClassifier(build_fn=create_model, verbose=0)
# define the grid search parameters
batch_size = [20, 30, 40, 80]
epochs = [100, 200, 300, 400]
# Using make_scorer to convert the r2_score metric to a scorer
my_scorer = make_scorer(r2_score, greater_is_better=True)
# passing the dictionary of parameters to GridSearchCV
param_grid = dict(batch_size=batch_size, epochs=epochs)
grid = GridSearchCV(estimator=model, scoring=my_scorer, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X_train, y_train)
# summarizing the results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
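
For reference, the quantity that coeff_determination computes is R2 = 1 - SS_res / SS_tot. The following is only a minimal pure-Python sketch of that formula; the helper name r_squared and the sample values are illustrative and not part of the original post. It omits the K.epsilon() term, and unlike a Keras metric it is evaluated once over all samples rather than averaged over batches, so the two numbers need not match exactly.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviation of the targets from their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
score = r_squared(y_true, y_pred)
print(round(score, 4))
```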

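I think you need to pass your custom scoring function as the scoring argument to GridSearchCV; otherwise it falls back on the estimator's default score method, which here is accuracy.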
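
As a rough sketch of that idea, the snippet below wraps a hand-written R2 function with make_scorer and hands it to GridSearchCV through scoring. Several things here are assumptions for illustration: Ridge stands in for the wrapped Keras model so the example runs without Keras, r2_metric is a hypothetical helper name, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV


def r2_metric(y_true, y_pred):
    """NumPy R^2, mirroring the Keras metric in the question (hypothetical helper)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic regression data, for illustration only.
rng = np.random.RandomState(7)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=60)

# make_scorer turns metric(y_true, y_pred) into scorer(estimator, X, y).
my_scorer = make_scorer(r2_metric, greater_is_better=True)

grid = GridSearchCV(
    estimator=Ridge(),  # stand-in for the scikit-learn-wrapped Keras model
    param_grid={"alpha": [0.1, 1.0, 10.0]},
    scoring=my_scorer,
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

The scorer is applied to whatever the estimator's predict method returns. With KerasClassifier that is a class label, so for a regression target like the one in the question a regressor wrapper such as KerasRegressor (from the same wrapper module) may be the more natural fit; the original post does not confirm which change resolved the error.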

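From the docs:

scoring : str, callable, list/tuple or dict, default=None. A single str (see "The scoring parameter: defining model evaluation rules") or a callable (see "Defining your scoring strategy from metric functions") to evaluate the predictions on the test set.

For evaluating multiple metrics, either give a list of (unique) strings or a dict with names as keys and callables as values.

NOTE that when using custom scorers, each scorer should return a single value. Metric functions returning a list/array of values can be wrapped into multiple scorers that return one value each.

See "Specifying multiple metrics for evaluation" for an example.

If None, the estimator's score method is used.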
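
The multiple-metric form described in the quoted passage might look like the sketch below: a dict maps names to scorers, each returning a single value, and refit names the scorer used to pick the best parameters. As assumptions for illustration, Ridge again stands in for the Keras wrapper, the data is synthetic, and the dict keys "r2" and "neg_mse" are arbitrary names chosen here.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV

# Synthetic regression data, for illustration only.
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.1, size=60)

# One scorer per metric; each scorer returns a single number.
scoring = {
    "r2": make_scorer(r2_score),
    # greater_is_better=False makes the scorer return the negated MSE,
    # so that larger values are still better.
    "neg_mse": make_scorer(mean_squared_error, greater_is_better=False),
}

grid = GridSearchCV(
    Ridge(),  # stand-in for the scikit-learn-wrapped Keras model
    {"alpha": [0.1, 1.0, 10.0]},
    scoring=scoring,
    refit="r2",  # with several metrics, refit must name one of them (or be False)
    cv=3,
)
grid.fit(X, y)
results = grid.cv_results_
print(results["mean_test_r2"], results["mean_test_neg_mse"])
```

Each metric gets its own columns in cv_results_, named after the dict key (for example mean_test_r2 and std_test_r2).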
