C++ Tesseract OCR: Getting ObjectCache(): WARNING! LEAK! object still has count 1



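I am on Ubuntu 20.04 and am trying out C++ code that OCRs an image into a searchable PDF.

My code is slightly modified from the C++ API example given on the official website: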

/home/test/Desktop/Example2/testexample2.cpp:

#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>
#include <tesseract/renderer.h>
int main()
{
    //const char* input_image = "/usr/src/tesseract-oc/testing/phototest.tif";
    //const char* output_base = "my_first_tesseract_pdf";
    //const char* datapath = "/Projects/OCR/tesseract/tessdata";

    const char* input_image = "001.jpg";
    const char* output_base = "001";
    const char* datapath = ".";
    int timeout_ms = 5000;
    const char* retry_config = nullptr;
    bool textonly = false;
    int jpg_quality = 92;
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
    if (api->Init(datapath, "eng")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        exit(1);
    }
    /*
    tesseract::TessPDFRenderer *renderer = new tesseract::TessPDFRenderer(
        output_base, api->GetDatapath(), textonly, jpg_quality);
    */
    tesseract::TessPDFRenderer *renderer = new tesseract::TessPDFRenderer(
        output_base, api->GetDatapath(), textonly);
    bool succeed = api->ProcessPages(input_image, retry_config, timeout_ms, renderer);
    if (!succeed) {
        fprintf(stderr, "Error during processing.\n");
        return EXIT_FAILURE;
    }
    api->End();
    return EXIT_SUCCESS;
}

I also followed https://stackoverflow.com/a/59382664, as shown below:

cd /home/test/Desktop/Example2
wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
wget https://github.com/tesseract-ocr/tesseract/blob/master/tessdata/pdf.ttf
export TESSDATA_PREFIX=$(pwd)
gedit config
(In the config file, entered the contents:
tessedit_create_pdf     1       Write .pdf output file
tessedit_create_txt     1       Write .txt output file
)
g++ testexample2.cpp -o testexample2 -ltesseract
./testexample2

But on execution, it shows the following errors:

Warning: Invalid resolution 0 dpi. Using 70 instead.
Error during processing.
ObjectCache(0x7f1b096669c0)::~ObjectCache(): WARNING! LEAK! object 0x55af5c5241a0 still has count 1 (id /home/test/Desktop/Example2/eng.traineddatapunc-dawg)
ObjectCache(0x7f1b096669c0)::~ObjectCache(): WARNING! LEAK! object 0x55af5c506770 still has count 1 (id /home/test/Desktop/Example2/eng.traineddataword-dawg)
ObjectCache(0x7f1b096669c0)::~ObjectCache(): WARNING! LEAK! object 0x55af5c9a4a70 still has count 1 (id /home/test/Desktop/Example2/eng.traineddatanumber-dawg)
ObjectCache(0x7f1b096669c0)::~ObjectCache(): WARNING! LEAK! object 0x55af5c9a4980 still has count 1 (id /home/test/Desktop/Example2/eng.traineddatabigram-dawg)
ObjectCache(0x7f1b096669c0)::~ObjectCache(): WARNING! LEAK! object 0x55af5d7d5170 still has count 1 (id /home/test/Desktop/Example2/eng.traineddatafreq-dawg)

My directory structure is:

Example2
|------->001.jpg
|------->config
|------->eng.traineddata
|------->pdf.ttf
|------->testexample2
|------->testexample2.cpp

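  1. I have searched multiple sources about this but could not find any fix.

  2. Also, I would like to know whether there is a way to compile this code together with libtesseract into a standalone, portable binary, so that it runs on other Ubuntu systems without having to reinstall the tesseract libraries and their dependencies.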

You must free the dynamically allocated memory used by the "api" object.

Use:

... your code ...
delete renderer;  // deleting a null pointer is a no-op, so no check is needed
delete api;

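The tesseract API examples show how to use tesseract's functionality; they do not cover every detail of your chosen programming language (C++ in your case).

Just from reading the code, without even running it: memory is dynamically allocated twice but never deallocated. Try fixing that first.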
