I am new to deep learning and I am trying to learn more about implementing things in TensorFlow and Keras. My work is based on this link: https://www.tensorflow.org/guide/keras/writing_a_training_loop_from_scratch
1). First, I built a fake dataset, split into slices of 10 samples each, as follows:
import tensorflow as tf
import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.losses import mean_squared_error
from tensorflow.keras import Model, optimizers
X = tf.random.shuffle(np.random.normal(size=500).reshape(-1,10,1))
Train_X = X[:30]
Val_X = X[30:40]
Test_X = X[40:50]
Y = 2*X + 2
Train_Y = Y[:30]
Val_Y = Y[30:40]
Test_Y = Y[40:50]
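For reference, a quick check of the resulting shapes and dtypes (assuming the code above is run exactly as written):
print(X.shape, X.dtype)    # (50, 10, 1), float64: 50 slices of 10 samples, 1 feature
print(Train_X.shape)       # (30, 10, 1)
print(Val_X.shape)         # (10, 10, 1)
print(Test_X.shape)        # (10, 10, 1)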
2). Then I set up the model as follows:
inputs = tf.keras.Input(shape=(1,))
x = Dense(2, activation="relu", name="dense_1")(inputs)
x = Dense(2, activation="relu", name="dense_2")(x)
outputs = Dense(1, name="predictions")(x)
model = Model(inputs=inputs, outputs=outputs)
optimizer = optimizers.SGD(learning_rate=1e-3)
loss_fn = mean_squared_error
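Since the Input has shape=(1,), each (10, 1) slice from the data above is treated by the model as a batch of 10 samples with a single feature, so a forward pass returns a (10, 1) prediction tensor. A minimal check (dummy_batch is just an illustrative placeholder):
dummy_batch = tf.zeros((10, 1))     # same shape as one slice of Train_X
print(model(dummy_batch).shape)     # (10, 1), one prediction per sample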
3). I set up my custom training loop as follows:
epochs = 10
for epoch in range(epochs):
    print("Start of epoch %d" % (epoch,))
    for step, (x_batch_train, y_batch_train) in enumerate(zip(Train_X, Train_Y)):
        with tf.GradientTape() as tape:
            predictions = model(x_batch_train, training=True)
            loss_value = loss_fn(y_batch_train, predictions)
            grads = tape.gradient(loss_value, model.trainable_weights)
            optimizer.apply_gradients(zip(grads, model.trainable_weights))
            #print(grads)
    for step, (x_batch_val, y_batch_val) in enumerate(zip(Val_X, Val_Y)):
        val_predictions = model(x_batch_val, training=False)
        print('Valuation predictions for the batch :')
        print(val_predictions)
        print('Actual Valuation for the batch :')
        print(y_batch_val)
I wanted to double-check the model's progress by looking at the validation predictions I was getting, like this:
Start of epoch 7
Valuation predictions for the batch :
tf.Tensor(
[[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]], shape=(10, 1), dtype=float32)
Actual Valuation for the batch :
tf.Tensor(
[[ 0.46222022]
[ 4.14915307]
[-0.4027977 ]
[ 5.86534374]
[ 2.4339228 ]
[ 4.09712344]
[ 2.68675164]
[ 1.1638311 ]
[ 1.898602 ]
[ 2.630972 ]], shape=(10, 1), dtype=float64)
Valuation predictions for the batch :
tf.Tensor(
[[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]
[1.9706799]], shape=(10, 1), dtype=float32)
Actual Valuation for the batch :
tf.Tensor(
[[ 2.23510644]
[ 0.26278813]
[ 1.89419175]
[ 2.56711307]
[-0.56344412]
[-0.22523397]
[ 1.81370046]
[ 0.70580016]
[ 3.81906033]
[ 4.39636782]], shape=(10, 1), dtype=float64)
So no matter what the values in the validation dataset are, my model always predicts the same thing. Any help understanding why this is happening would be greatly appreciated.
I noticed the following:

- TensorFlow works with float32 by default rather than float64, but you can feed float64 into your network. That does not cause a problem here, but I would make sure to hard-code your dtype (a short sketch of this follows the corrected loop below).
- Your indentation is wrong. This is the correct version:
epochs = 10
for epoch in range(epochs):
    for step, (x_batch_train, y_batch_train) in enumerate(zip(Train_X, Train_Y)):
        with tf.GradientTape() as tape:
            predictions = model(x_batch_train, training=True)
            loss_value = loss_fn(y_batch_train, predictions)
        grads = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))
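On the dtype point above, a minimal sketch of hard-coding float32 for the data (tf.cast is just one way to do it; the Dense layers already default to float32):
# cast the fake data to float32 so the targets match the model's float32 predictions
X = tf.cast(tf.random.shuffle(np.random.normal(size=500).reshape(-1, 10, 1)), tf.float32)
Y = 2 * X + 2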
Good luck with your future work!