我是Google MLKit的新手,想要检测Android和iOS的账单/收据。我使用对象检测和这个模型。
检测由 Google ML Kit 完成,然后由 react-native-vision-camera 进行处理。
android在java
我没有问题,账单很好地检测:
(图片)
在 iOS 上,对于相同的代码(但用 Objective-C 而不是 Java 实现),我从未检测到发票:
#import <Foundation/Foundation.h>
#import <VisionCamera/FrameProcessorPlugin.h>
#import <VisionCamera/Frame.h>
#import <MLKit.h>
/// VisionCamera frame-processor plugin that runs an ML Kit custom object
/// detector (backed by a bundled TFLite model) over camera frames and
/// reports objects whose labels match a fixed set of receipt/bill-like
/// label indices.
@interface VisionScanObjectsFrameProcessorPlugin : NSObject
/// Shared, lazily created detector instance used by the frame processor.
+ (MLKObjectDetector*) objectDetector;
@end
@implementation VisionScanObjectsFrameProcessorPlugin

/// Returns the shared custom object detector, creating it on first use.
///
/// Thread-safe: frame processors may run off the main thread, and the
/// previous `if (objectDetector == nil)` check could race and build two
/// detectors. `dispatch_once` guarantees single initialization.
///
/// If the TFLite model is missing from the bundle this logs and leaves the
/// detector nil (messaging nil later simply yields no results), instead of
/// crashing inside `MLKLocalModel`.
+ (MLKObjectDetector*) objectDetector {
static MLKObjectDetector* objectDetector = nil;
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
    NSString *path = [[NSBundle mainBundle] pathForResource:@"lite-model_object_detection_mobile_object_labeler_v1_1"
                                                     ofType:@"tflite"];
    if (path == nil) {
        // Without this guard the detector silently never detects anything —
        // make the misconfiguration loud instead.
        NSLog(@"VisionScanObjects: TFLite model not found in main bundle");
        return;
    }
    MLKLocalModel *localModel = [[MLKLocalModel alloc] initWithPath:path];
    MLKCustomObjectDetectorOptions *options =
        [[MLKCustomObjectDetectorOptions alloc] initWithLocalModel:localModel];
    // NOTE(review): single-image mode on a live camera stream is unusual;
    // ML Kit recommends MLKObjectDetectorModeStream for video — worth trying
    // if detection stays empty. Kept as-is to preserve behavior.
    options.detectorMode = MLKObjectDetectorModeSingleImage;
    options.shouldEnableClassification = YES;
    options.classificationConfidenceThreshold = @(0.5);
    options.maxPerObjectLabelCount = 3;
    objectDetector = [MLKObjectDetector objectDetectorWithOptions:options];
});
return objectDetector;
}

/// Frame-processor entry point. Runs detection on the camera frame and
/// returns an NSArray of NSDictionary results (bounding box, rotation and
/// the matching labels) for objects carrying one of the wanted label
/// indices. Returns an empty array on detection failure.
static inline id scanObjects(Frame* frame, NSArray* arguments) {
MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:frame.buffer];
image.orientation = frame.orientation; // TODO(review): confirm mirroring for the front camera.

NSError* error = nil;
NSArray<MLKObject*>* objects =
    [[VisionScanObjectsFrameProcessorPlugin objectDetector] resultsInImage:image error:&error];
if (objects == nil) {
    // nil result means the detector failed (or was never created) — surface
    // the error instead of silently reporting zero objects.
    if (error != nil) {
        NSLog(@"VisionScanObjects: detection failed: %@", error.localizedDescription);
    }
    return @[];
}
// objects.count is NSUInteger — %lu with an explicit cast (was %ld, UB-ish).
NSLog(@"Object detected : %lu", (unsigned long)objects.count);

NSMutableArray* results = [NSMutableArray arrayWithCapacity:objects.count];
for (MLKObject* object in objects) {
    NSMutableArray* labels = [NSMutableArray arrayWithCapacity:object.labels.count];
    for (MLKObjectLabel* label in object.labels) {
        // Label indices of interest in the mobile_object_labeler model
        // (presumably receipt/bill-adjacent categories — verify against the
        // model's label map).
        switch (label.index) {
            case 122: case 188: case 288: case 325: case 357:
            case 370: case 480: case 510: case 551:
                [labels addObject:@{
                    // index is an NSInteger — box it as an integer
                    // (was numberWithFloat, which loses exactness).
                    @"index": @(label.index),
                    // Guard against nil in the dictionary literal (crash).
                    @"label": label.text ?: @"",
                    @"confidence": @(label.confidence)
                }];
                break;
            default:
                break;
        }
    }
    if (labels.count != 0) {
        [results addObject:@{
            @"width": @(object.frame.size.width),
            @"height": @(object.frame.size.height),
            @"top": @(object.frame.origin.y),
            @"left": @(object.frame.origin.x),
            // orientation is an enum — box as an integer (was numberWithFloat).
            @"frameRotation": @((NSInteger)frame.orientation),
            @"labels": labels
        }];
    }
}
return results;
}

VISION_EXPORT_FRAME_PROCESSOR(scanObjects)
@end
我真的认为这段代码应该可以工作,因为现在没有任何崩溃(在让它正常运行之前我遇到过崩溃^^),但我从来没有检测到文档。:/
NSLog(@"Object detected : %ld", objects.count);
几乎总是返回 0。
极少数情况下返回 1,
比如检测到我的电脑键盘,但这种情况非常非常罕见。
我在过去的4天里尝试了很多东西(不同的模型,异步检测,检测前调整大小等),但它仍然是相同的:/
感谢 Jaroslaw K. 的建议,我必须减小帧的尺寸才能使其正常工作。
我的模型的文档(https://tfhub.dev/tensorflow/efficientnet/lite0/classification/2)中建议这样做:
对于这个模块,输入图像的大小是灵活的,但是最好匹配模型的训练输入,这个模型的训练输入是高度x宽度= 224 x 224像素。输入图像的颜色值应该在[0,1]范围内,遵循常见的图像输入约定。
/// Downscales a camera frame to fit within 224×224 (the model's training
/// input size), preserving aspect ratio.
///
/// - Parameter frame: The VisionCamera frame whose sample buffer is rendered.
/// - Returns: The resized image, or nil if the frame's pixel buffer cannot
///   be read or rendered (the previous force-unwraps crashed on such frames).
public static func resizeFrameToUiimage(frame: Frame) -> UIImage! {
    let targetSize = CGSize(width: 224.0, height: 224.0)

    // A dropped or not-yet-ready frame has no image buffer — fail gracefully
    // instead of force-unwrapping.
    guard let imageBuffer = CMSampleBufferGetImageBuffer(frame.buffer) else {
        return nil
    }
    let ciImage = CIImage(cvPixelBuffer: imageBuffer)
    let context = CIContext(options: nil)
    guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else {
        return nil
    }
    let uiImage = UIImage(cgImage: cgImage)

    // Aspect-fit: scale by the smaller ratio so both dimensions fit in the
    // 224×224 target (identical math to the original if/else on the ratios).
    let widthRatio = targetSize.width / uiImage.size.width
    let heightRatio = targetSize.height / uiImage.size.height
    let scale = min(widthRatio, heightRatio)
    let newSize = CGSize(width: uiImage.size.width * scale,
                         height: uiImage.size.height * scale)

    UIGraphicsBeginImageContextWithOptions(newSize, false, 1.0)
    defer { UIGraphicsEndImageContext() } // always balance the context push
    uiImage.draw(in: CGRect(origin: .zero, size: newSize))
    return UIGraphicsGetImageFromCurrentImageContext()
}