Incorrect framing of bounding boxes with VNRecognizedObjectObservation



I am using Core ML & Vision.

Horizontal detection seems to work correctly, but the vertical boxes are too tall: they extend past the top edge of the video, do not reach all the way down to its bottom, and do not follow the camera's movement correctly. You can see it here: https://i.stack.imgur.com/zlhfp.jpg

This is how the video data output is initialized:

let videoDataOutput = AVCaptureVideoDataOutput()
videoDataOutput.alwaysDiscardsLateVideoFrames = true
videoDataOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)]
videoDataOutput.setSampleBufferDelegate(self, queue: dataOutputQueue!)
self.videoDataOutput = videoDataOutput
session.addOutput(videoDataOutput)
let c = videoDataOutput.connection(with: .video)
c?.videoOrientation = .portrait

I have also tried other video orientations, without much success.

Performing the Vision request:

let handler = VNImageRequestHandler(cvPixelBuffer: image, options: [:])
try? handler.perform(vnRequests)

And finally, once a request has been processed. viewRect is set to the size of the video view, 812x375 (I know the video layer itself is a bit shorter, but that is not the issue here):

let observationRect = VNImageRectForNormalizedRect(observation.boundingBox, Int(viewRect.width), Int(viewRect.height))
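One thing worth noting is that `VNImageRectForNormalizedRect` only scales the normalized rect; it does not flip the y axis, while Vision's `boundingBox` uses a lower-left origin and UIKit views use an upper-left origin. The sketch below reproduces that scaling arithmetic and the flip that is still needed afterwards (the bounding-box values and the 375x812 size are hypothetical example numbers):

```swift
import Foundation

// Hypothetical Vision boundingBox: normalized, LOWER-LEFT origin.
let boundingBox = CGRect(x: 0.1, y: 0.2, width: 0.3, height: 0.4)
let viewSize = CGSize(width: 375, height: 812)

// What VNImageRectForNormalizedRect effectively does: pure scaling, no y-flip.
let scaled = CGRect(x: boundingBox.minX * viewSize.width,
                    y: boundingBox.minY * viewSize.height,
                    width: boundingBox.width * viewSize.width,
                    height: boundingBox.height * viewSize.height)

// Flip into UIKit's top-left coordinates: the top edge of the box is
// (1 - y - height) in normalized space, not just (1 - y).
let flipped = CGRect(x: scaled.minX,
                     y: (1 - boundingBox.minY - boundingBox.height) * viewSize.height,
                     width: scaled.width,
                     height: scaled.height)
```

Even with the flip, scaling straight to the view's size is only correct when the image and the view have the same aspect ratio; otherwise the crop/letterboxing has to be accounted for as well.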

I have also tried (with even more issues):

var observationRect = observation.boundingBox
observationRect.origin.y = 1.0 - observationRect.origin.y
observationRect = videoPreviewLayer.layerRectConverted(fromMetadataOutputRect: observationRect)
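A possible cause of the remaining issues with this variant: `layerRectConverted(fromMetadataOutputRect:)` expects a normalized rect with a top-left origin, and flipping only `origin.y` shifts the box by its own height. The flip should subtract the rect's height as well, as this small sketch shows (values are hypothetical):

```swift
import Foundation

// Hypothetical Vision boundingBox: normalized, lower-left origin.
var rect = CGRect(x: 0.1, y: 0.2, width: 0.3, height: 0.4)

// Flipping only the origin (y = 1 - y) moves the box up by its own
// height. The top edge in top-left coordinates is 1 - y - height.
rect.origin.y = 1 - rect.origin.y - rect.height   // 0.4 rather than 0.8

// `rect` is now in the top-left-origin normalized space that
// layerRectConverted(fromMetadataOutputRect:) expects.
```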

I have tried to cut out as much code as possible that I considered irrelevant.

I actually ran into a similar problem using Apple's sample code, where the bounding box would not surround objects vertically as expected: https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture — which might mean that there is some issue with the API?

I used something like this:

let width = view.bounds.width
let height = width * 16 / 9
let offsetY = (view.bounds.height - height) / 2
let scale = CGAffineTransform.identity.scaledBy(x: width, y: height)
let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -height - offsetY)
let rect = prediction.boundingBox.applying(scale).applying(transform)

This is for portrait orientation and a 16:9 aspect ratio. It assumes .imageCropAndScaleOption = .scaleFill.
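For concreteness, the scale-then-flip transforms above can be written as plain arithmetic. The sketch below uses the 375x812 view from the question and a hypothetical bounding box; the video is assumed to fill the view's width and be vertically centered (letterboxed by `offsetY`):

```swift
import Foundation

// Portrait, 16:9, .scaleFill — view bounds as in the question.
let viewWidth: CGFloat = 375
let viewHeight: CGFloat = 812
let width = viewWidth                    // video fills the view's width
let height = width * 16 / 9              // ≈ 666.67 pt of video height
let offsetY = (viewHeight - height) / 2  // letterbox offset, ≈ 72.67 pt

// Hypothetical normalized boundingBox (lower-left origin).
let boundingBox = CGRect(x: 0.1, y: 0.2, width: 0.3, height: 0.4)

// Equivalent to applying the scale and the flip transforms:
// scale into video points, then flip the y axis and shift by offsetY.
let rect = CGRect(x: boundingBox.minX * width,
                  y: offsetY + (1 - boundingBox.minY - boundingBox.height) * height,
                  width: boundingBox.width * width,
                  height: boundingBox.height * height)
```

Writing it out this way makes the two coordinate fixes explicit: the y-flip (including the box height) and the letterbox offset, which is exactly what a plain `VNImageRectForNormalizedRect` call misses.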

Credit: the conversion code was taken from this repo: https://github.com/willjay90/applefacedetection

Latest update