How to convert a cv2.rectangle bounding box to YOLOv4 annotation format (relative x, y, w, h)

I trained a YOLOv4 network, and it gives me bounding boxes as:

img_array = cv2.cvtColor(cv2.imread('image.png'), cv2.COLOR_BGR2RGB)
classes, scores, bboxes = model.detect(img_array, CONFIDENCE_THRESHOLD, NMS_THRESHOLD)
box = bboxes[0]
(x, y) = (box[0], box[1])
(w, h) = (box[2], box[3])

When I draw the box with cv2.rectangle and save the image:

cv2.rectangle(img_array, (x, y), (x + w, y + h), (127,0,75), 1)
cv2.imwrite('image.png',img_array)

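It draws a very good bounding box. I want to use this box and the shape of the image array to create a text file in YOLOv4 format, with x, y, w, h as float values between 0 and 1 relative to the image size.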

Let's suppose my values are:

img_array.shape -> (443, 1265, 3)
box -> array([489, 126, 161, 216], dtype=int32)

So this gives me

(x, y) = (box[0], box[1]) -> (489, 126)
(w, h) = (box[2], box[3]) -> (161, 216)

Also, the bounding box I created with LabelImg is stored in the text file as

0 0.453125 0.538462 0.132212 0.509615 # the leading 0 is the class

How do I get the YOLOv4 format from these coordinates? It is a bit confusing. I have tried a lot of code, and this answer does not seem to work.

I also tried the following code, but I don't know whether it is correct. Even if it is, I don't know how to get x_, y_

def yolov4_format(img_shape,box):
    x_img, y_img, c = img_shape
    (x, y) = (box[0], box[1])
    (w, h) = (box[2], box[3])

    x_, y_ = None # logic for these?
    w_ = w/x_img
    h_ = h/y_img
    return x_,y_, w_, h_

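I guess I was close: the x and y to solve for are NOT absolute values but the center of the rectangular box, as AlexeyAB describes in this answer. So I traced through LabelImg's code, found a function, and adapted it to my use case.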

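def bnd_box_to_yolo_line(box, img_size):
    # box is (x_min, y_min, w, h) in pixels
    # img_size is img_array.shape, i.e. (height, width, channels)
    (x_min, y_min) = (box[0], box[1])
    (w, h) = (box[2], box[3])
    x_max = x_min + w
    y_max = y_min + h

    # Normalize x values by the width (img_size[1]) and y values by the height (img_size[0])
    x_center = float((x_min + x_max)) / 2 / img_size[1]
    y_center = float((y_min + y_max)) / 2 / img_size[0]
    w = float((x_max - x_min)) / img_size[1]
    h = float((y_max - y_min)) / img_size[0]
    return x_center, y_center, w, h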

All you need is the bounding box and the image shape.
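As a worked check with the values from the question, here is a minimal self-contained sketch of the same center-based conversion that also formats one label line. The names `box_to_yolo` and `yolo_line` are hypothetical helpers introduced only for this illustration, and the six-decimal formatting is an assumption chosen to match the LabelImg line shown in the question.

```python
def box_to_yolo(box, img_shape):
    """Convert a pixel (x_min, y_min, w, h) box to relative (x_center, y_center, w, h).

    img_shape is (height, width, channels), as returned by img_array.shape.
    """
    x_min, y_min, w, h = (float(v) for v in box)
    img_h, img_w = img_shape[0], img_shape[1]
    x_center = (x_min + w / 2) / img_w  # x values are divided by the width
    y_center = (y_min + h / 2) / img_h  # y values are divided by the height
    return x_center, y_center, w / img_w, h / img_h


def yolo_line(class_id, box, img_shape):
    """Format one label line: '<class> <x_center> <y_center> <w> <h>'."""
    values = box_to_yolo(box, img_shape)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


line = yolo_line(0, (489, 126, 161, 216), (443, 1265, 3))
print(line)  # 0 0.450198 0.528217 0.127273 0.487585
```

The result is close to the LabelImg line from the question (0.453125 0.538462 0.132212 0.509615). A small difference is to be expected, since a hand-drawn box and a detected box rarely coincide exactly.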

There is a more straightforward way to do this with pybboxes. Install it with

pip install pybboxes

In your case,

import pybboxes as pbx
coco_bbox = (489, 126, 161, 216)  # (x_min, y_min, w, h) in pixels
W, H = 1265, 443  # WxH of the image; img_array.shape is (H, W, channels)
pbx.convert_bbox(coco_bbox, from_type="coco", to_type="yolo", image_width=W, image_height=H)
# approximately (0.4502, 0.5282, 0.1273, 0.4876)

Note that converting to YOLO format requires the image width and height for scaling.
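Because the scaling depends on which dimension is which, a quick sanity check can help. The sketch below uses plain Python, without pybboxes; `yolo_to_box` and `looks_normalized` are hypothetical helper names. It maps a relative box back to pixels and flags values outside the 0 to 1 range, which usually means width and height were swapped (img_array.shape is ordered height, width, channels).

```python
def yolo_to_box(yolo_box, img_w, img_h):
    """Convert relative (x_center, y_center, w, h) back to pixel (x_min, y_min, w, h)."""
    x_center, y_center, w, h = yolo_box
    box_w = w * img_w
    box_h = h * img_h
    x_min = x_center * img_w - box_w / 2
    y_min = y_center * img_h - box_h / 2
    return tuple(round(v) for v in (x_min, y_min, box_w, box_h))


def looks_normalized(yolo_box):
    """Return True when every value lies between 0 and 1."""
    return all(0.0 <= v <= 1.0 for v in yolo_box)


img_h, img_w = 443, 1265  # from img_array.shape -> (443, 1265, 3)
x, y, w, h = 489, 126, 161, 216

good = ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)
# Same box, but with width and height swapped by mistake:
swapped = ((x + w / 2) / img_h, (y + h / 2) / img_w, w / img_h, h / img_w)

print(looks_normalized(good))            # True
print(looks_normalized(swapped))         # False: x_center is about 1.29
print(yolo_to_box(good, img_w, img_h))   # (489, 126, 161, 216)
```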
