Very different results from the same Keras model built in the Sequential vs. functional style



I'm trying to implement a Keras regression model that learns to set some parameters: the input is a set of parameters, and the output is a different set of parameters consistent with the input (e.g., similar inputs give similar outputs in the training set, and there is a partially linear relationship between some inputs and some outputs). Inputs and outputs are standardized, since the parameters have different units.

During training the mse is about 0.48, and the predictions are quite reasonable.

This is the model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(78, activation='relu', input_shape=(3,)))
model.add(Dense(54, activation='relu'))
model.add(Dense(54, activation='relu'))
model.add(Dense(5))

The summary:

X:  (2011, 3) y:  (2011, 5)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 78)                312       
_________________________________________________________________
dense_1 (Dense)              (None, 54)                4266      
_________________________________________________________________
dense_2 (Dense)              (None, 54)                2970      
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 275       
=================================================================
Total params: 7,823
Trainable params: 7,823
Non-trainable params: 0

Then I built the exact same model in the functional style:

from tensorflow import keras

inputs = keras.layers.Input(shape=(3,))  # (X.shape[1],)
out = keras.layers.Dense(78, activation='relu')(inputs)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(5, activation='relu')(out)
model = keras.Model(inputs, out, name="func_model")

X:  (2011, 3) y:  (2011, 5)
Model: "func_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 3)]               0         
_________________________________________________________________
dense (Dense)                (None, 78)                312       
_________________________________________________________________
dense_1 (Dense)              (None, 54)                4266      
_________________________________________________________________
dense_2 (Dense)              (None, 54)                2970      
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 275       
=================================================================
Total params: 7,823
Trainable params: 7,823
Non-trainable params: 0

The summaries are exactly the same, except that the functional model adds an explicit input layer. But the docs say:

When a popular kwarg input_shape is passed, then keras will create an input layer 
to insert before the current layer. This can be treated equivalent to explicitly
defining an InputLayer.

https://keras.io/api/layers/core_layers/dense/

That is what I did in the first model, so the two models should be identical. But they are not: the mse during training is clearly higher, around 0.7, and in contrast to the other model the predictions are "flat": the outputs barely respond to the input parameters.

Any ideas?

The difference is your output-layer activation. In the functional model, you use relu:

out = keras.layers.Dense(5, activation='relu')(out)

In the Sequential one, you use linear (the default activation):

model.add(Dense(5))

The correct output activation depends on the data you are modeling, but the difference is enough to produce these confusing results.
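To see why relu on the output flattens the predictions here: the targets are standardized, so they are roughly zero-mean and about half of them are negative, but relu(x) = max(0, x) can never produce a negative value. A minimal sketch with hypothetical standardized target values:

```python
# relu clips everything below zero, so any negative standardized
# target becomes unreachable for the output layer.
def relu(x):
    return max(0.0, x)

targets = [-1.2, -0.4, 0.0, 0.3, 1.1]   # hypothetical standardized outputs
clipped = [relu(t) for t in targets]
print(clipped)  # [0.0, 0.0, 0.0, 0.3, 1.1] -- the negative half is flattened
```

Roughly half of the target range collapses to zero, which matches the "flat" predictions and the higher mse you observed.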

Edit: given your question, your functional model should change the last line to

out = keras.layers.Dense(5, activation='linear')(out)

or simply

out = keras.layers.Dense(5)(out)
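With that change the two builds really are equivalent. A quick sketch (assuming TensorFlow's bundled Keras) that checks the output-layer activation and the parameter count of both versions:

```python
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Sequential build, linear output (the default activation)
seq = Sequential([
    Dense(78, activation='relu', input_shape=(3,)),
    Dense(54, activation='relu'),
    Dense(54, activation='relu'),
    Dense(5),
])

# Functional build with the corrected (linear) output layer
inputs = keras.layers.Input(shape=(3,))
out = keras.layers.Dense(78, activation='relu')(inputs)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(5)(out)
func = keras.Model(inputs, out, name="func_model")

# Both output layers now report the same 'linear' activation,
# and the parameter counts match the 7,823 from the summaries.
print(seq.layers[-1].get_config()['activation'])   # 'linear'
print(func.layers[-1].get_config()['activation'])  # 'linear'
print(seq.count_params(), func.count_params())     # 7823 7823
```

Any remaining difference between the two runs then comes down to random weight initialization, not the architecture.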
