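I have a set of black-and-white images of shape (1000, 11, 1). I am trying to adapt the Keras MNIST autoencoder example to work with my data, so I wrote the following code: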
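# Imports are not shown in the original post; tf.keras is assumed here
from tensorflow.keras import layers
from tensorflow.keras.models import Model

input_img = layers.Input(shape=(1000, 11, 1))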
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, (3, 3), activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
Printing the summary, I can see that the output shape differs from the input shape:
Model: "model_16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_18 (InputLayer)        [(None, 1000, 11, 1)]     0
_________________________________________________________________
conv2d_119 (Conv2D)          (None, 1000, 11, 16)      160
_________________________________________________________________
max_pooling2d_51 (MaxPooling (None, 500, 6, 16)        0
_________________________________________________________________
conv2d_120 (Conv2D)          (None, 500, 6, 8)         1160
_________________________________________________________________
max_pooling2d_52 (MaxPooling (None, 250, 3, 8)         0
_________________________________________________________________
conv2d_121 (Conv2D)          (None, 250, 3, 8)         584
_________________________________________________________________
max_pooling2d_53 (MaxPooling (None, 125, 2, 8)         0
_________________________________________________________________
conv2d_122 (Conv2D)          (None, 125, 2, 8)         584
_________________________________________________________________
up_sampling2d_51 (UpSampling (None, 250, 4, 8)         0
_________________________________________________________________
conv2d_123 (Conv2D)          (None, 250, 4, 8)         584
_________________________________________________________________
up_sampling2d_52 (UpSampling (None, 500, 8, 8)         0
_________________________________________________________________
conv2d_124 (Conv2D)          (None, 498, 6, 16)        1168
_________________________________________________________________
up_sampling2d_53 (UpSampling (None, 996, 12, 16)       0
_________________________________________________________________
conv2d_125 (Conv2D)          (None, 996, 12, 1)        145
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
_________________________________________________________________
Indeed, training fails with an error:
ValueError: logits and labels must have the same shape ((None, 996, 12, 1) vs (None, 1000, 11, 1))
What am I doing wrong? How can I fix my code to handle my image dimensions?
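The mismatch can be reproduced without building the model by tracing one spatial dimension at a time through the layers. The following is a minimal sketch in plain Python arithmetic, not Keras; the helper names (`pool_same`, `conv_valid`, `upsample`, `trace`) are made up for illustration, and it assumes the stride-1 convolutions and 2x2 pooling/upsampling used in the code above.

```python
import math

def pool_same(n, size=2):
    # MaxPooling2D with padding='same' and stride equal to the pool size
    # rounds up: ceil(n / size)
    return math.ceil(n / size)

def conv_valid(n, kernel=3):
    # Conv2D with the default padding='valid' and stride 1 shrinks the size
    return n - kernel + 1

def upsample(n, factor=2):
    # UpSampling2D multiplies the size by the factor
    return n * factor

def trace(n):
    # Encoder: the 'same' convolutions keep the size, so only pooling matters
    for _ in range(3):
        n = pool_same(n)
    # Decoder: two upsamplings, then the one Conv2D without padding='same',
    # then the last upsampling
    n = upsample(n)
    n = upsample(n)
    n = conv_valid(n)
    n = upsample(n)
    return n

height_out = trace(1000)
width_out = trace(11)
print(height_out, width_out)  # 996 12
```

Two effects combine here. The Conv2D call that lacks `padding='same'` removes two pixels from each dimension (500 to 498, 8 to 6), and the odd width of 11 is rounded up by pooling (11, 6, 3, 2), so doubling it back can never land on 11 exactly.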
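You can modify the decoder as follows so that its output shape matches the encoder's input shape. The final UpSampling2D((4, 4)) produces a width of 16, and the Cropping2D layer, which crops along the spatial dimensions (height and width), trims it back to 11.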
input_img = layers.Input(shape=(1000, 11, 1))
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((4, 4))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
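# Crop the width from 16 back to 11: 3 columns off the left, 2 off the right
decoded = layers.Cropping2D(cropping=((0, 0), (3, 2)))(decoded)
model = Model(input_img, decoded)
Output of model.summary():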
Model: "model_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_9 (InputLayer)         [(None, 1000, 11, 1)]     0
conv2d_49 (Conv2D)           (None, 1000, 11, 16)      160
max_pooling2d_24 (MaxPoolin  (None, 500, 6, 16)        0
g2D)
conv2d_50 (Conv2D)           (None, 500, 6, 8)         1160
max_pooling2d_25 (MaxPoolin  (None, 250, 3, 8)         0
g2D)
conv2d_51 (Conv2D)           (None, 250, 3, 8)         584
max_pooling2d_26 (MaxPoolin  (None, 125, 2, 8)         0
g2D)
conv2d_52 (Conv2D)           (None, 125, 2, 8)         584
up_sampling2d_24 (UpSamplin  (None, 250, 4, 8)         0
g2D)
conv2d_53 (Conv2D)           (None, 250, 4, 8)         584
up_sampling2d_25 (UpSamplin  (None, 1000, 16, 8)       0
g2D)
conv2d_54 (Conv2D)           (None, 1000, 16, 1)       73
cropping2d_6 (Cropping2D)    (None, 1000, 11, 1)       0
=================================================================
Total params: 3,145
Trainable params: 3,145
Non-trainable params: 0
_________________________________________________________________
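The cropping values ((0, 0), (3, 2)) need not be found by trial and error. As a rough sketch, assuming the modified decoder above (three 2x2 poolings with padding='same', then upsampling by 2 and by 4), the amounts can be computed from the input size. The helpers `decoder_size` and `crop_amounts` are hypothetical names used only for this illustration, and putting the larger share of an odd surplus on the first side is an arbitrary choice that happens to match the values above.

```python
import math

def decoder_size(n):
    # Three MaxPooling2D((2, 2), padding='same') layers each round up n / 2
    for _ in range(3):
        n = math.ceil(n / 2)
    # UpSampling2D((2, 2)) followed by UpSampling2D((4, 4))
    return n * 2 * 4

def crop_amounts(actual, target):
    # Split the surplus between the two sides; the first side gets the
    # extra pixel when the surplus is odd
    extra = actual - target
    if extra < 0:
        raise ValueError("output is smaller than the target; cropping cannot fix this")
    return (extra - extra // 2, extra // 2)

height_crop = crop_amounts(decoder_size(1000), 1000)
width_crop = crop_amounts(decoder_size(11), 11)
print((height_crop, width_crop))  # ((0, 0), (3, 2))
```

The resulting pair has the same layout as the `cropping` argument of Cropping2D, ((top, bottom), (left, right)). The height needs no cropping because 1000 is divisible by 8, while the width of 11 comes back as 16 and loses five columns.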