I'm trying to implement a Keras regression model that learns to set some parameters. The input is a handful of parameters, and the output is an unrelated set of values that is nevertheless consistent with the input (e.g., similar inputs give similar outputs in the training set, and there is a partly linear relationship between some inputs and some outputs). Both inputs and outputs are standardized, since the parameters have different units.
During training the MSE is about 0.48 and the predictions are quite reasonable.
Here is the model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(78, activation='relu', input_shape=(3,)))
model.add(Dense(54, activation='relu'))
model.add(Dense(54, activation='relu'))
model.add(Dense(5))
Summary:
X: (2011, 3) y: (2011, 5)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 78) 312
_________________________________________________________________
dense_1 (Dense) (None, 54) 4266
_________________________________________________________________
dense_2 (Dense) (None, 54) 2970
_________________________________________________________________
dense_3 (Dense) (None, 5) 275
=================================================================
Total params: 7,823
Trainable params: 7,823
Non-trainable params: 0
Then I built the exact same model in the functional style:
inputs = keras.layers.Input(shape=(3,)) #(X.shape[1],)
out = keras.layers.Dense(78, activation='relu')(inputs)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(5, activation='relu')(out)
X: (2011, 3) y: (2011, 5)
Model: "func_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 3)] 0
_________________________________________________________________
dense (Dense) (None, 78) 312
_________________________________________________________________
dense_1 (Dense) (None, 54) 4266
_________________________________________________________________
dense_2 (Dense) (None, 54) 2970
_________________________________________________________________
dense_3 (Dense) (None, 5) 275
=================================================================
Total params: 7,823
Trainable params: 7,823
Non-trainable params: 0
The summaries are exactly the same, except that the functional model adds the input layer explicitly. But the docs say:
When a popular kwarg input_shape is passed, then keras will create an input layer
to insert before the current layer. This can be treated equivalent to explicitly
defining an InputLayer.
https://keras.io/api/layers/core_layers/dense/
That is exactly what I did in the first model, so the two models should be identical. But they are not: the MSE during training is noticeably higher, around 0.7, and unlike the other model the predictions are "flat": the set of outputs barely responds to the input parameters.
Any ideas?
The difference is the activation of your output layer. In the functional model you use relu:
out = keras.layers.Dense(5, activation='relu')(out)
In the sequential model you use linear (the default activation):
model.add(Dense(5))
The right output activation depends on the data you are modeling, but here this difference is what produces the confusing results.
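The effect is easy to see in isolation: standardized targets have zero mean, so roughly half the true values are negative, and a relu output layer clamps every negative prediction to zero. A minimal numpy sketch (illustrative values, not your data):

```python
import numpy as np

def relu(x):
    # relu zeroes out everything below 0
    return np.maximum(x, 0.0)

# Standardized regression targets: zero mean, about half negative.
targets = np.array([-1.2, -0.4, 0.3, 1.3])

# Even if the pre-activation perfectly matched the targets,
# a relu output can never reproduce the negative half:
preds = relu(targets)
print(preds)  # the negative targets collapse to 0.0
```

This is one plausible reason the predictions look "flat": the model cannot reach the negative part of the standardized output range at all.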
Edit: having looked at your question again, your functional model should change the last line to
out = keras.layers.Dense(5, activation='linear')(out)
or simply
out = keras.layers.Dense(5)(out)
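Putting it together, the functional model with a linear output matches the sequential one. A sketch (assuming `from tensorflow import keras`; layer sizes taken from your summary):

```python
from tensorflow import keras

inputs = keras.layers.Input(shape=(3,))
out = keras.layers.Dense(78, activation='relu')(inputs)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(54, activation='relu')(out)
out = keras.layers.Dense(5)(out)  # linear output, as in the sequential model

model = keras.Model(inputs=inputs, outputs=out, name='func_model')
model.compile(optimizer='adam', loss='mse')
```

With this change the parameter count (7,823) and, more importantly, the output activation agree with the sequential model, so training should behave the same.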