Improving homography estimation between two video frames in an AR application with Python



I am working on an AR application in Python with the OpenCV library, based on frame-by-frame comparison. The goal is to project an image onto the cover of a book, which has to be detected in an existing video.

The idea is to compute a homography between each pair of consecutive frames and to keep updating the homography between the first frame and the current one, so that the AR layer can be projected. The problem lies in the estimation of this homography: errors seem to accumulate with every update, probably because of the matrix multiplication that is repeated at every frame comparison. In the output video, the misplacement of the AR layer keeps growing.
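To make the accumulation concrete, here is a minimal self-contained sketch of what I believe is happening (the noise level is invented for illustration, this is not my real pipeline): each frame2frame homography is almost exact, yet the chained product drifts away from the identity, just like the AR layer drifts in my output video.

import numpy as np

rng = np.random.default_rng(0)
M_chain = np.identity(3)
for _ in range(100):
    # near-identity frame-to-frame homography with a tiny estimation error
    M1 = np.identity(3) + rng.normal(0.0, 1e-3, (3, 3))
    # same composition as M = np.dot(M1, M_prev) in my code below
    M_chain = np.dot(M1, M_chain)
# after 100 simulated frames the accumulated transform is noticeably off
print(M_chain / M_chain[2, 2])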

How can I solve this problem while keeping the frame2frame approach?

Here is the relevant part of the code:


[...]
#################################
img_array = []
success, img_trainCOLOUR = vid.read()
# initialise the query features with those of the reference (book cover) image
kp_query = kp_ref
des_query = des_ref
# get shapes of the reference image and of the video frames
h, w = img_ref.shape[:2]
h_t, w_t = img_trainCOLOUR.shape[:2]
# M_mask accumulates the homography from the reference image to the current frame
M_mask = np.identity(3, dtype='float64')
# M_prev accumulates the homography from the AR layer to the current frame
M_prev = M_ar
# the matcher only has to be built once, outside the loop
flann = cv2.FlannBasedMatcher(index_params, search_params)

# performing iterations until the last frame
while success:
    # obtain a grayscale image of the current BGR frame
    img_train = cv2.cvtColor(img_trainCOLOUR, cv2.COLOR_BGR2GRAY)
    # object detection pipeline, F2F method: correspondences between
    # the previous video frame and the current frame
    kp_train = sift.detect(img_train)
    kp_train, des_train = sift.compute(img_train, kp_train)

    # find matches
    matches = flann.knnMatch(des_query, des_train, k=2)

    # validate matches with Lowe's ratio test
    good = []
    for m, n in matches:
        if m.distance < 0.7 * n.distance:
            good.append(m)

    # check whether we found the object
    MIN_MATCH_COUNT = 10
    if len(good) > MIN_MATCH_COUNT:
        # differentiate between source points and destination points
        src_pts = np.float32([kp_query[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst_pts = np.float32([kp_train[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

        # find the homography between the previous and the current video frame
        M1, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

        # update the reference-to-current homography with the current frame2frame
        # result; M1 maps the previous frame to the current one, so it must be
        # applied on the left (np.dot(M_mask, M1) composes the transforms in the
        # wrong order and is itself a source of misplacement)
        M_mask = np.dot(M1, M_mask)
        # update the AR-layer-to-current-frame homography the same way
        M = np.dot(M1, M_prev)

        # warp img_ar (aligned with the first frame) onto the current frame
        warped = cv2.warpPerspective(img_arCOLOUR, M, (w_t, h_t), flags=cv2.INTER_LINEAR)
        warp_mask = cv2.warpPerspective(img_armask, M, (w_t, h_t), flags=cv2.INTER_LINEAR)

        # restore the pixels of the train image where the mask is black
        warp_mask = np.equal(warp_mask, 0)
        warped[warp_mask] = img_trainCOLOUR[warp_mask]

        # insert the frame into the frame array to reconstruct the video sequence
        img_array.append(warped)

        # save the current homography and frame for the next iteration
        M_prev = M
        img_query = img_train
        # warp the mask of the book cover into the current frame
        img_maskTrans = cv2.warpPerspective(img_mask, M_mask, (w_t, h_t), flags=cv2.INTER_NEAREST)
        # new SIFT detection on the current frame, restricted by the warped mask,
        # so that only the book cover is searched in the next frame
        kp_query = sift.detect(img_query, img_maskTrans)
        kp_query, des_query = sift.compute(img_query, kp_query)

    # read the next frame (kept outside the if so the loop also advances
    # when the cover is not found)
    success, img_trainCOLOUR = vid.read()
[...]

Input data, complete code, and output are available here: https://drive.google.com/drive/folders/1EAI7wYVFy7SbNZs8Cet7fWEfK2usw-y1?usp=sharing

Thanks for your support.

Your solution drifts because you always match against the previous image instead of a fixed reference image. Keep one of the two images fixed. Besides, SIFT, or any other descriptor-based matching, is overkill for short-baseline tracking: you can detect interest points (Shi-Tomasi goodFeaturesToTrack or Harris corners) and track them with Lucas-Kanade, as in the sketch below.
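A minimal sketch of that pipeline (the file name and the parameter values are placeholders, not tuned settings): detect Shi-Tomasi corners once on the first frame, track them with pyramidal Lucas-Kanade, and always estimate the homography from the fixed reference positions to the current ones, so per-frame errors are never multiplied together.

import cv2
import numpy as np

vid = cv2.VideoCapture('video.mp4')  # placeholder input path
ok, frame = vid.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Shi-Tomasi corners, detected once on the reference frame; a book-cover
# mask can be passed as the third positional argument to restrict the search
ref_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=400,
                                  qualityLevel=0.01, minDistance=7)
curr_pts = ref_pts.copy()

lk_params = dict(winSize=(21, 21), maxLevel=3,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

while True:
    ok, frame = vid.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # track the points from the previous frame into the current one
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                   curr_pts, None, **lk_params)
    alive = status.ravel() == 1
    ref_pts, curr_pts = ref_pts[alive], next_pts[alive]
    if len(ref_pts) < 4:
        break  # too few points survived; re-detect in a real pipeline

    # homography straight from the FIXED reference positions to the current
    # ones: per-frame errors are not chained by matrix multiplication
    M, inliers = cv2.findHomography(ref_pts, curr_pts, cv2.RANSAC, 5.0)

    prev_gray = gray

M then maps the first frame directly onto the current one, so you can project the AR layer with cv2.warpPerspective(img_arCOLOUR, np.dot(M, M_ar), (w_t, h_t)) without ever chaining frame2frame homographies.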
