Designing a new loss function that applies the training weights to non-training data



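I would like to access the training points during the training iterations and add a soft constraint to my loss function that uses data points not included in the training set. I am using this post as a reference.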

import numpy as np
import keras.backend as K
from keras.layers import Dense, Input
from keras.models import Model

# Some random training data and labels
# (shapes chosen to match the network below: 20 inputs, 3 outputs)
features = np.random.rand(100, 20)
labels = np.random.rand(100, 3)

# Simple neural net with three outputs
input_layer = Input((20,))
hidden_layer = Dense(16)(input_layer)
output_layer = Dense(3)(hidden_layer)

# Model
model = Model(inputs=input_layer, outputs=output_layer)

# Each training point has another data point paired with it. In the real
# example I will have multiple supporters per point, which is why I use a dict.
holder = np.random.rand(100, 20)
supporters = {}
for i, j in enumerate(holder):  # i is the index of the i-th training point
    supporters[i] = j

# Write a custom loss function
def custom_loss(y_true, y_pred):
    # Normal MSE loss
    mse = K.mean(K.square(y_true - y_pred), axis=-1)
    new_constraint = ....

    return mse + new_constraint

model.compile(loss=custom_loss, optimizer='sgd')
model.fit(features, labels, epochs=1, batch_size=1)

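For simplicity, let us assume that I want to minimize the minimum absolute difference between the prediction for a training point and the predictions for its paired data stored in supporters, with the supporter predictions computed using the current (fixed) network weights. Also assume that I pass one training point per batch. However, I could not figure out how to perform this operation. I tried something like the line shown below, but it is obviously not correct.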

new_constraint = K.sum(y_pred - model.fit(supporters))

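fit is the process that trains and evaluates the model, so it is not what you want inside a loss. I think it would be better for your problem to load a new instance of your model with your current weights and evaluate it on the batch in order to compute the loss of the main model.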

import numpy as np
import tensorflow as tf
import keras.backend as K

main_model = model  # This is your main training model (`model` from the question)

def custom_loss_1(y_true, y_pred):  # Avoid recursive calls
    mse = K.mean(K.square(y_true - y_pred), axis=-1)
    return mse

def custom_loss(y_true, y_pred):
    support_model = tf.keras.models.clone_model(main_model)  # You copy the main model but the weights are uninitialized
    support_model.build((None, 20))  # You build with inputs same as your support data
    support_model.compile(loss=custom_loss_1, optimizer='sgd')
    support_model.set_weights(main_model.get_weights())  # You load the weights of the main model
    mse = custom_loss_1(y_true, y_pred)
    # You just want to evaluate the model, not to train. If you have more
    # metrics than just loss, then use support_model.evaluate(...)[0]
    # predict() expects an array, so the dict values are stacked first.
    # predict() cannot be called inside a graph-compiled training step, so this
    # most likely requires main_model.compile(..., run_eagerly=True).
    support_inputs = np.stack(list(supporters.values()))
    new_constraint = K.sum(y_pred - support_model.predict(support_inputs))  # predict to get the output, evaluate to get the metrics
    return mse + new_constraint
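To make the intended quantity concrete, here is a minimal NumPy sketch of the arithmetic only, not a Keras loss. It assumes a hypothetical linear `predict` function as a stand-in for the network, a hypothetical `penalty_weight` factor, and one training point per batch with several supporters. Unlike the signed `K.sum` above, it takes absolute differences and then the minimum over supporters, which is closer to what the question describes.

```python
import numpy as np

def predict(weights, bias, x):
    """Hypothetical stand-in for the network's forward pass (a linear model)."""
    return x @ weights + bias

def soft_constraint(y_pred, supporter_preds):
    """Smallest mean absolute difference between y_pred and any supporter prediction."""
    diffs = np.abs(supporter_preds - y_pred)   # shape (n_supporters, n_outputs)
    return diffs.mean(axis=-1).min()           # mean over outputs, min over supporters

def combined_loss(y_true, y_pred, supporter_preds, penalty_weight=1.0):
    """MSE on the training point plus the weighted soft constraint."""
    mse = np.mean(np.square(y_true - y_pred))
    return mse + penalty_weight * soft_constraint(y_pred, supporter_preds)

# Illustrative data: one training point with 20 features and 4 supporters.
rng = np.random.default_rng(0)
weights = rng.normal(size=(20, 3))   # the "current" weights, shared by both passes
bias = np.zeros(3)
x = rng.random(20)
y_true = rng.random(3)
supporters_for_x = rng.random((4, 20))

y_pred = predict(weights, bias, x)                           # shape (3,)
supporter_preds = predict(weights, bias, supporters_for_x)   # shape (4, 3)
loss = combined_loss(y_true, y_pred, supporter_preds)
```

Both forward passes use the same weights, which is the point of copying the main model's weights into the support model. In a real Keras loss the supporter predictions would have to be computed with tensor operations rather than NumPy so that the term can be part of the training graph.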
