Trying to implement a 3D-autoencoder-like model that maps an image to a video, but the output dimensions do not match



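I am trying to map a single image to an entire video using an autoencoder-like architecture: the input end takes an image and the output is a video. The dimensions the model expects for its output do not match the actual dimensions of the video. Each video has shape (4500, 144, 256, 3). Here is my code: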

```
from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, Flatten, Dense, Reshape,
                          Conv3D, UpSampling3D)

model = Sequential()
# Encoder: 2D convolutions over the input image; each pooling layer halves height and width.
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Flatten())
# Decoder: reshape to a (3, 6, 6) volume with 6 channels; each upsampling layer doubles every axis.
model.add(Dense(3*6*6*6, activation='relu'))
model.add(Reshape((3, 6, 6, 6)))
model.add(Conv3D(8, (3, 3, 3), activation='relu', padding='same'))
model.add(UpSampling3D((2, 2, 2)))
model.add(Conv3D(8, (3, 3, 3), activation='relu', padding='same'))
model.add(UpSampling3D((2, 2, 2)))
model.add(Conv3D(32, (3, 3, 3), activation='relu', padding='same'))
model.add(UpSampling3D((2, 2, 2)))
model.add(Conv3D(32, (3, 3, 3), activation='relu', padding='same'))
model.add(UpSampling3D((2, 2, 2)))
model.add(Conv3D(32, (3, 3, 3), activation='relu', padding='same'))
model.add(UpSampling3D((2, 2, 2)))

model.add(Conv3D(3, (3, 3, 3), activation='sigmoid', padding='same'))
model.compile(optimizer="adadelta", loss='binary_crossentropy')
```
And here is the error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-afa431bc2efe> in <module>()
249 
250 
--> 251 video_proc()
252 
3 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
139                             ': expected ' + names[i] + ' to have shape ' +
140                             str(shape) + ' but got array with shape ' +
--> 141                             str(data_shape))
142     return data
143 
ValueError: Error when checking target: expected conv3d_37 to have shape (96, 192, 192, 3) but got array with shape (4500, 144, 256, 3)
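
The expected shape in the error follows from the decoder itself: the `Reshape` layer produces a (3, 6, 6) volume, and each of the five `UpSampling3D((2, 2, 2))` layers doubles every axis, giving 3 × 32 = 96 and 6 × 32 = 192. A minimal sketch of that arithmetic in plain Python is below; the helper names `upsampled_shape` and `seed_shape_for` are hypothetical and are not part of Keras.

```python
def upsampled_shape(seed, num_layers, factor=2):
    """Shape after `num_layers` upsampling layers that scale each axis by `factor`."""
    scale = factor ** num_layers
    return tuple(dim * scale for dim in seed)


def seed_shape_for(target, num_layers, factor=2):
    """Reshape dims needed to reach `target` exactly, or None if an axis is not divisible."""
    scale = factor ** num_layers
    if any(dim % scale for dim in target):
        return None
    return tuple(dim // scale for dim in target)


# Decoder from the question: a (3, 6, 6) volume followed by five 2x upsamplings.
print(upsampled_shape((3, 6, 6), num_layers=5))        # (96, 192, 192)

# A 4500-frame target cannot be reached with three 2x upsamplings (4500 / 8 = 562.5).
print(seed_shape_for((4500, 144, 256), num_layers=3))  # None

# 4336 frames can: 542 * 8 = 4336, 18 * 8 = 144, 32 * 8 = 256.
print(seed_shape_for((4336, 144, 256), num_layers=3))  # (542, 18, 32)
```

Because the frame count 4500 is not a multiple of 8, either the decoder or the video has to give along the time axis.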

I changed the size of the dense layer (and the decoder after it) to:


model.add(Dense(542*18*32*1,activation='relu'))
model.add(Reshape((542,18,32,1)))
model.add(Conv3D(8, (3, 3, 3), activation='relu', padding='same'))
model.add(UpSampling3D((2, 2, 2)))
model.add(Conv3D(8, (3,3,3), activation='relu', padding='same'))
model.add(UpSampling3D((2, 2, 2)))
model.add(Conv3D(32, (3, 3, 3), activation='relu', padding='same'))
model.add(UpSampling3D((2, 2, 2)))

model.add(Conv3D(3, (3, 3, 3), activation='sigmoid', padding='same'))
model.compile(optimizer="adadelta", loss='binary_crossentropy')

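This gave me an output of shape (4336, 144, 256, 3): the three `UpSampling3D((2, 2, 2))` layers scale every axis by 8, so 542 × 8 = 4336, 18 × 8 = 144 and 32 × 8 = 256. I then adjusted the padding of the video to make it fit.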
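
The post does not say exactly how the video was adjusted. One possible way, sketched here with NumPy, is to crop or zero-pad the video along the frame axis until it has the frame count the model outputs. The function name `fit_frames` is hypothetical, and the tiny array stands in for a real (4500, 144, 256, 3) video.

```python
import numpy as np


def fit_frames(video, target_frames):
    """Crop or zero-pad a (frames, height, width, channels) array along axis 0."""
    frames = video.shape[0]
    if frames >= target_frames:
        return video[:target_frames]            # drop trailing frames
    pad = [(0, target_frames - frames)] + [(0, 0)] * (video.ndim - 1)
    return np.pad(video, pad, mode="constant")  # append all-zero frames


# Small stand-in for a real video so the example stays light on memory.
video = np.ones((45, 4, 8, 3), dtype=np.float32)
cropped = fit_frames(video, 40)
padded = fit_frames(video, 48)
print(cropped.shape, padded.shape)
```

For the shapes in this post, `fit_frames(video, 4336)` would drop the last 164 of the 4500 frames; whether cropping or padding is preferable depends on the data.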
