Different results between TensorFlow 1.12 and TensorFlow 2.4



I am trying to upgrade my model to TensorFlow 2.4, but after the upgrade the network reaches lower accuracy. I noticed that the loss on a single batch is different, even though:

  • I load the model with model = keras.models.load_model('path/to/model.h5') from the same path in both versions (the file was created with tf 1.12)
  • I checked that the weights match
  • I checked that the batch being used is the same
  • I reproduced the problem both on a proprietary dataset and on keras.datasets.mnist.

I expected that if I managed to get the same loss in both versions, I would also get the same accuracy after training.

Requirements for tf 1.12

# python version == 3.6
tensorflow_gpu==1.12
keras==2.2.4
h5py==2.10.0
opencv-python==4.2.0.34

Requirements for tf 2.4.1

# python version == 3.8
tensorflow==2.4.1
h5py==2.10.0
opencv-python==4.5.3.56

Model definition (identical in both versions):


def mobile_net(no_classes):
    base = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    for layer in base.layers:
        layer.trainable = False
    x = GlobalAveragePooling2D()(base.output)
    x = Dense(32, activation='relu')(x)
    x = Dense(128, activation='relu')(x)
    y = GlobalMaxPooling2D()(base.output)
    y = Dense(32, activation='relu')(y)
    y = Dense(128, activation='relu')(y)
    conc = Add()([x, y])
    conc = Dense(32, activation='relu')(conc)
    prediction = Dense(no_classes, activation='softmax')(conc)
    model = Model(inputs=base.input, outputs=prediction)
    optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
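
For reference, the definition above assumes the usual Keras imports; a minimal sketch of what they would look like under tf 2.4 (under tf 1.12 the same names come from the keras.* packages instead of tensorflow.keras.*):

# Assumed imports for the tf 2.4 variant of mobile_net().
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, GlobalMaxPooling2D, Add, BatchNormalization  # BatchNormalization is only needed for the modified version shown in the answer below
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam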

Training code (almost identical in both versions):

keras.backend.set_image_dim_ordering('tf')  # only in tf 1.12
# load data
x_train, y_train = ...
x_train, y_train = x_train[:4], y_train[:4]  # select just one batch for testing purposes

model = keras.models.load_model('path/to/model.h5')  # in tf 1.12
model = tensorflow.keras.models.load_model('path/to/model.h5')  # in tf 2.4
print(f'check that the values are the same: {x_train.sum() + y_train.argmax(axis=1).sum()}')
weights = model.get_weights()
print(f'check that weights are the same: {[weight.sum() for weight in weights]}')
model.fit(x_train, y_train, batch_size=4, verbose=2)
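
To take the optimizer out of the picture entirely, the forward-pass loss on that single batch can also be compared on its own; a minimal sketch (not part of the original script), assuming the same loaded model and the same four samples as above:

# Evaluate the batch without performing a weight update; with the model
# compiled with metrics=['accuracy'], test_on_batch returns [loss, accuracy]
# in both keras 2.2.4 and tensorflow.keras 2.4.
loss_before_update = model.test_on_batch(x_train, y_train)
print(f'single-batch loss before any update: {loss_before_update}')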

tf 1.12 output:

check that the values are the same: 18266047
check that weights are the same: [-4.311309, 37.386337, 26.299068, ..., -10.376889, 0.0, -13.127711, 0.0, 4.9316425, 0.0]
Epoch 1/1
 - 18s - loss: 2.6805 - acc: 0.2500

tf 2.4 output:

check that the values are the same: 18266047
check that weights are the same: [-4.311309, 37.386337, 26.299068, ...]
1/1 - 6s - loss: 2.8985 - accuracy: 0.2500

Where does the difference in the loss come from?

The difference comes from the fact that MobileNet contains BatchNormalization layers, and their behavior changed in TensorFlow 2.x. You can read more about it here. To recreate the TensorFlow 1.x behavior, I added the following snippet to the model creation code:

def mobile_net(no_classes):
    base = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    ### changed in Tensorflow 2.4.1
    for layer in base.layers:
        if layer.__class__ == BatchNormalization:
            layer.trainable = True
        else:
            layer.trainable = False
    ### end of change
    x = GlobalAveragePooling2D()(base.output)
    x = Dense(32, activation='relu')(x)
    x = Dense(128, activation='relu')(x)
    y = GlobalMaxPooling2D()(base.output)
    y = Dense(32, activation='relu')(y)
    y = Dense(128, activation='relu')(y)
    conc = Add()([x, y])
    conc = Dense(32, activation='relu')(conc)
    prediction = Dense(no_classes, activation='softmax')(conc)
    model = Model(inputs=base.input, outputs=prediction)
    optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

The model now returns the same loss on tf 1.12 and 2.4.1.
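
The changed semantics can also be observed on a standalone layer; a minimal sketch (not from the original answer) that runs under tf 2.4:

import numpy as np
import tensorflow as tf

x = np.random.randn(4, 8).astype('float32')

bn = tf.keras.layers.BatchNormalization()
_ = bn(x, training=True)              # build the layer

bn.trainable = False
frozen = bn(x, training=True)         # tf 2.x: a frozen BN layer ignores training=True and uses its moving statistics

bn.trainable = True
unfrozen = bn(x, training=True)       # normalizes with the current batch statistics

print(np.allclose(frozen, unfrozen))  # expected: False, the two modes produce different outputs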
