How to implement a CNN+LSTM video prediction model

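I am currently learning how to build and use a CNN (AlexNet) + LSTM model for prediction on video, but I am stuck at the prediction step.

When I try to predict, I get this error:

ValueError: Input 0 is incompatible with layer model_1: expected shape=(None, 10, 384, 384, 3), found shape=(1, 270, 480)

I realize that my width and height are different, but how do I add the timestep dimension (10) at prediction time so that the input matches my model?

Here is my code: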

import cv2
import numpy as np
from tensorflow import keras

# Note: scale, start, font, classes, label_color, draw_box and draw_caption
# are not defined in this snippet.

model_path = 'CCTV_10Frame_SGD_Model_1e4_b16_l21e2_Terbaru.h5'
model = keras.models.load_model(model_path, compile = True)
vid = cv2.VideoCapture('Data16_116.mp4')
prev_frame_time = 0
total_frame = 0
while vid.isOpened():
    ret, frame = vid.read()
    if ret == True:
        total_frame += 1
        draw = frame.copy()
        draw = cv2.cvtColor(draw, cv2.COLOR_BGR2GRAY)
        scale_percent = 25 # percent of original size
        width = int(frame.shape[1] * scale_percent / 100)
        height = int(frame.shape[0] * scale_percent / 100)
        dim = (width, height)
        frame_set = cv2.resize(draw, dim, interpolation = cv2.INTER_AREA)
        # The ValueError is raised here: frame_set is a single grayscale
        # frame of shape (270, 480), so the batch has shape (1, 270, 480).
        boxes, scores, labels = model.predict_on_batch(
            np.expand_dims(frame_set, axis=0))
        boxes /= scale
        i_iterate = 0
        for box, score, label in zip(boxes[0], scores[0], labels[0]):
            if score < 0.5 or i_iterate > 0:
                break
            fps = 1/(start-prev_frame_time)
            prev_frame_time = start

            cv2.putText(draw, "%.2f" % fps, (7, 70), font,
                        1, (100, 255, 0), 3, cv2.LINE_AA)

            color = label_color(label)
            b = box.astype(int)
            draw_box(draw, b, color=color)
            caption = "{} {:.3f}".format(classes[label], score)
            draw_caption(draw, b, caption)
            print("=================================")
            print("[INFO] Score : ", score)
            print("[INFO] Label : ", classes[label])
            i_iterate += 1
            print("=================================")
        cv2.imshow('Result', draw)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            break
    else:
        break
vid.release()
cv2.destroyAllWindows()

I hope someone with experience can help me.

Thank you very much!

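In my opinion, what you can do is this: for the first 10 frames, simply append each one to a buffer. From the 11th frame onward, pop the oldest frame from the front and append the new frame to the end.

That way you always hold exactly 10 frames, and the model predicts the 11th frame from them.
I believe the model was trained to predict one frame from the previous 10 frames.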
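
The buffering idea above could look roughly like the sketch below. It is only an illustration: `make_window_batcher` and `push` are names I made up, and the dummy frames stand in for frames read from the video. It also assumes each frame has already been resized to 384x384 with 3 channels (for example with `cv2.resize(frame, (384, 384))` and without the grayscale conversion), since that is the shape the error message asks for.

```python
from collections import deque

import numpy as np

TIMESTEPS = 10               # taken from the expected shape in the error
FRAME_SHAPE = (384, 384, 3)  # taken from the expected shape in the error


def make_window_batcher(timesteps=TIMESTEPS):
    """Return a function that buffers frames and builds a model-ready batch."""
    # With maxlen set, appending to a full deque drops the oldest frame.
    window = deque(maxlen=timesteps)

    def push(frame):
        window.append(frame)
        if len(window) < timesteps:
            return None  # still filling the first window
        # Stack to (timesteps, H, W, C), then add the batch axis in front.
        return np.expand_dims(np.stack(list(window), axis=0), axis=0)

    return push


push = make_window_batcher()
batches = []
for i in range(12):  # dummy frames; in practice these come from vid.read()
    frame = np.full(FRAME_SHAPE, i, dtype=np.float32)
    batch = push(frame)
    if batch is not None:
        batches.append(batch)  # in practice: model.predict_on_batch(batch)

print(batches[0].shape)
```

In the video loop, you would call `push` once per frame and only run the model when it returns a batch, so the first prediction happens at the 10th frame.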

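Also, if you want to start predicting from the very first frame, check what the model's input looked like during training when only a single frame was available. It should be something like 9 frames filled with some arbitrary constant value, followed by a 10th frame holding the actual values.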
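
A rough sketch of that padding, in case it helps. `pad_to_window` is a hypothetical helper, and the fill value of 0.0 is purely an assumption: it should match whatever was used during training, and if the model was never trained on padded sequences the predictions for these early frames may not be meaningful.

```python
import numpy as np


def pad_to_window(frames, timesteps=10, fill_value=0.0):
    """Left-pad a short list of frames with constant frames up to `timesteps`."""
    frames = list(frames)[-timesteps:]  # keep at most the newest `timesteps`
    missing = timesteps - len(frames)
    # Constant frame with the same shape and dtype as the real frames.
    filler = np.full_like(frames[0], fill_value)
    padded = [filler] * missing + frames
    # Stack to (timesteps, H, W, C), then add the batch axis in front.
    return np.expand_dims(np.stack(padded, axis=0), axis=0)


first_frame = np.ones((384, 384, 3), dtype=np.float32)  # dummy first frame
batch = pad_to_window([first_frame])
print(batch.shape)
```

Here the real frame ends up in the last timestep and the 9 earlier timesteps hold the constant filler.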
