This is a simple TensorFlow XOR implementation.
Any idea why it does not converge when the TF random seed is 0, but does converge with other seeds (e.g., 1234)? How can I make it converge without changing the network architecture (i.e., keeping the hidden layer as Dense(2)) and while keeping the random seed = 0? TIA!
import tensorflow as tf
import numpy as np
from tensorflow.keras import Model
from tensorflow.keras.layers import (
    Dense,
    Input,
)

tf.random.set_seed(0)  # to reproduce non-convergence
# tf.random.set_seed(1234)  # to converge
# XOR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], "float32")
Y = np.array([[0], [1], [1], [0]], "float32")
x = Input(shape=(2,))
y = Dense(2, activation="sigmoid")(x)
y = Dense(1, activation="sigmoid")(y)
model = Model(inputs=x, outputs=y)
model.compile(loss="mean_squared_error")
class logger(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        if epoch % 1000 == 0:
            print("epoch=", epoch, "loss=%.3f" % logs["loss"])
model.fit(X, Y, epochs=20000, verbose=0, callbacks=[logger()])
Output with random seed = 0:
epoch= 0 loss=0.255
epoch= 1000 loss=0.235
epoch= 2000 loss=0.190
epoch= 3000 loss=0.154
epoch= 4000 loss=0.137
epoch= 5000 loss=0.130
epoch= 6000 loss=0.127
epoch= 7000 loss=0.126
epoch= 8000 loss=0.125
epoch= 9000 loss=0.125
epoch= 10000 loss=0.125
epoch= 11000 loss=0.125
epoch= 12000 loss=0.125
epoch= 13000 loss=0.125
epoch= 14000 loss=0.125
epoch= 15000 loss=0.125
epoch= 16000 loss=0.125
epoch= 17000 loss=0.125
epoch= 18000 loss=0.125
epoch= 19000 loss=0.125
Output with random seed = 1234:
epoch= 0 loss=0.275
epoch= 1000 loss=0.234
epoch= 2000 loss=0.186
epoch= 3000 loss=0.118
epoch= 4000 loss=0.059
epoch= 5000 loss=0.024
epoch= 6000 loss=0.008
epoch= 7000 loss=0.003
epoch= 8000 loss=0.001
epoch= 9000 loss=0.000
epoch= 10000 loss=0.000
epoch= 11000 loss=0.000
epoch= 12000 loss=0.000
epoch= 13000 loss=0.000
epoch= 14000 loss=0.000
epoch= 15000 loss=0.000
epoch= 16000 loss=0.000
epoch= 17000 loss=0.000
epoch= 18000 loss=0.000
epoch= 19000 loss=0.000
By default (since you did not specify one), the optimizer is "rmsprop", and it seems to perform poorly on this task. Why? I don't know. But if you use "sgd" together with a "tanh" activation in the hidden layer, it will work:
model.compile(loss="mean_squared_error", optimizer='sgd')
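The hidden-layer change described above would look like this (the output layer stays as in the question):
y = Dense(2, activation="tanh")(x)
With both changes in place, training now makes progress: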
epoch= 0 loss=0.425
epoch= 1000 loss=0.213
epoch= 2000 loss=0.182
epoch= 3000 loss=0.160
epoch= 4000 loss=0.130
epoch= 5000 loss=0.063
epoch= 6000 loss=0.023
epoch= 7000 loss=0.010
epoch= 8000 loss=0.006
epoch= 9000 loss=0.004
You can also try setting the weights manually ;)
# A hard-threshold "step" activation: 1 when the input is >= 0.5, else 0
step_activation = lambda x: tf.cast(tf.greater_equal(x, 0.5), tf.float32)

x = Input(shape=(2,))
y = Dense(2, activation=step_activation)(x)
y = Dense(1, activation=step_activation)(y)
model = Model(inputs=x, outputs=y, trainable=False)
model.compile(loss="mean_squared_error")

# Hidden unit 0 computes AND (fires when x1 + x2 - 1.5 >= 0.5),
# hidden unit 1 computes OR (fires when x1 + x2 - 0.5 >= 0.5),
# and the output unit fires only when OR=1 and AND=0, i.e. XOR.
weights = [np.array([[1, 1], [1, 1]]),  # hidden kernel (2x2)
           np.array([-1.5, -0.5]),      # hidden biases
           np.array([[-1], [1]]),       # output kernel (2x1)
           np.array([-0.5])]            # output bias
model.set_weights(weights)
model.evaluate(X, Y)
1/1 [==============================] - 0s 2ms/step - loss: 0.0000e+00
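For a quick sanity check, you could also print the predictions; with these hand-set weights they should come out as exactly the XOR table:
print(model.predict(X).ravel())  # expected: [0. 1. 1. 0.]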
Copy/pastable:
import tensorflow as tf
tf.random.set_seed(0)
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0], [1], [1], [0]]
x = tf.keras.layers.Input(shape=(2,))
y = tf.keras.layers.Dense(2, activation="tanh")(x)
y = tf.keras.layers.Dense(1, activation="tanh")(y)
model = tf.keras.Model(inputs=x, outputs=y)
model.compile(loss="mean_squared_error", optimizer='sgd')
history = model.fit(X, Y, epochs=5000)
Epoch 5000/5000
1/1 [==============================] - 0s 998us/step - loss: 0.0630
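If you want to see what the trained network outputs at this point, you could print the predictions; they should be approaching [0, 1, 1, 0] as the loss keeps falling:
print(model.predict(X).round(3))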