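The software I am developing uses the Intel MKL implementation of the LAPACK functions to solve eigenvalue problems. When I ran Valgrind to check my code for memory leaks, it reported an error only when the function "STEVR" is used, or more precisely the C function LAPACKE_dstevr. To find out whether my interface or the called function is the problem, I wrote a standalone test application. The code is as follows: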
#include <mkl/mkl.h>
#include <random>
int main() {
    // Tolerance
    double absTol = 1e-12;
    // Problem size
    lapack_int n = 64;
    // Generate random tridiagonal symmetric matrix
    std::mt19937 randomGen;
    std::normal_distribution<double> normal(1., 1.);
    double *mainDiagonal = new double[n];
    double *subDiagonal = new double[n-1];
    for (int i=0; i<n-1; i++) {
        mainDiagonal[i] = normal(randomGen);
        subDiagonal[i] = normal(randomGen);
    }
    mainDiagonal[n-1] = normal(randomGen);
    // Allocate memory for results
    double *eigenValues = new double[n];
    double *eigenVectors = new double[n*n];
    // Resulting integer array and integer for leading dimension
    // allocated/initialized according to MKL/LAPACK documentation
    lapack_int *isuppz = new lapack_int[2*n]();
    lapack_int ldz = n;
    // Eigenvectors shall be computed
    char job = 'V';
    // All pairs of eigenvalues and -vectors shall be computed
    char range = 'A';
    // These values are irrelevant if range=='A'. The bounds (vl, vu) are
    // doubles in the LAPACKE interface; everything is zero-initialized so
    // that no indeterminate value is passed by value.
    double lowerBound = 0., upperBound = 0.;
    lapack_int lowerIndex = 0, upperIndex = 0;
    // Number of eigenvalues found (output parameter)
    lapack_int m;
    // Solve problem using MKL/LAPACK function
    LAPACKE_dstevr(LAPACK_ROW_MAJOR, job, range, n, mainDiagonal, subDiagonal,
                   lowerBound, upperBound, lowerIndex, upperIndex, absTol, &m,
                   eigenValues, eigenVectors, ldz, isuppz);
    // Free memory
    delete[] mainDiagonal;
    delete[] subDiagonal;
    delete[] eigenValues;
    delete[] eigenVectors;
    delete[] isuppz;
    return 0;
}
Compiling with
g++ -fopenmp -ggdb3 -Wall -Wextra test_dstevr.cpp -lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lpthread -lm -ldl -lmkl_rt -o test_dstevr
and running Valgrind with the command
valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes --verbose --log-file=valgrind.out ./test_dstevr
gives me 0 errors. However, if I change the matrix size to n = 65 or any number greater than 64, Valgrind reports
ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
and
==48382== 960 bytes in 3 blocks are possibly lost in loss record 10 of 12
==48382== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==48382== by 0x40149DA: allocate_dtv (dl-tls.c:286)
==48382== by 0x40149DA: _dl_allocate_tls (dl-tls.c:532)
==48382== by 0xB549322: allocate_stack (allocatestack.c:622)
==48382== by 0xB549322: pthread_create@@GLIBC_2.2.5 (pthread_create.c:660)
==48382== by 0xB320DEA: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==48382== by 0xA08AA10: mkl_trans_mkl_domatcopy2_par (in /usr/lib/x86_64-linux-gnu/libmkl_gnu_thread.so)
==48382== by 0xCF5D5F4: mkl_trans_avx2_mkl_domatcopy (in /usr/lib/x86_64-linux-gnu/libmkl_avx2.so)
==48382== by 0x4D76FBC: LAPACKE_dge_trans (in /usr/lib/x86_64-linux-gnu/libmkl_intel_lp64.so)
==48382== by 0x4DB57DB: LAPACKE_dstevr_work (in /usr/lib/x86_64-linux-gnu/libmkl_intel_lp64.so)
==48382== by 0x4DB5430: LAPACKE_dstevr (in /usr/lib/x86_64-linux-gnu/libmkl_intel_lp64.so)
==48382== by 0x10951C: main (test_dstevr.cpp:45)
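Of course, being a power of two, the number 64 does not look random to me, but I have no idea where the problem lies. Does anyone here have an idea? I am using Ubuntu 20.04 LTS and GCC 9.4.0.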
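Reading the trace from the bottom up, the block is allocated by pthread_create, which is called from libgomp.so, which is called from mkl_trans_mkl_domatcopy2_par, which LAPACKE_dge_trans reaches from LAPACKE_dstevr_work. That suggests the record is thread-local storage of an OpenMP worker thread started for the matrix transposition that LAPACK_ROW_MAJOR needs, rather than a buffer of the eigensolver itself. One way to test that reading (an assumption, not verified here) would be to call the routine with LAPACK_COL_MAJOR, which should skip the transposition, and read the eigenvectors as columns. The helpers below are hypothetical and not part of MKL; they only sketch how a column-major result is indexed and what copy a row-major wrapper has to perform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an MKL function): element (row, col) of a matrix
// stored column by column with leading dimension ldz.
inline double colMajorAt(const std::vector<double>& z, std::size_t ldz,
                         std::size_t row, std::size_t col) {
    assert(row + col * ldz < z.size());
    return z[row + col * ldz];
}

// Hypothetical helper: copy eigenvector number `col` (a column of z) into a
// contiguous vector of length n.
std::vector<double> eigenvectorFromColMajor(const std::vector<double>& z,
                                            std::size_t n, std::size_t ldz,
                                            std::size_t col) {
    std::vector<double> v(n);
    for (std::size_t row = 0; row < n; ++row) {
        v[row] = colMajorAt(z, ldz, row, col);
    }
    return v;
}

// Explicit out-of-place transposition of an n-by-n matrix from column-major
// to row-major storage. This is the kind of copy a row-major wrapper must do.
std::vector<double> colMajorToRowMajor(const std::vector<double>& z,
                                       std::size_t n) {
    assert(z.size() >= n * n);
    std::vector<double> out(n * n);
    for (std::size_t row = 0; row < n; ++row) {
        for (std::size_t col = 0; col < n; ++col) {
            out[row * n + col] = z[row + col * n];
        }
    }
    return out;
}
```

With a column-major result, eigenvector i is simply the n contiguous values starting at z[i*ldz], so the explicit transposition is only needed if the rest of the program insists on row-major storage.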
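Here is a solution for others who run into a similar problem. Please try the latest oneMKL release, 2022.2.0, which is now available for download. The command for the Intel oneAPI compiler is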
icpx -ggdb3 -I"${MKLROOT}/include" test_dstevr.cpp -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm -ldl -o test_dstevr
After compiling, run Valgrind with the following command; we see that it no longer reports any problem
valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes --verbose --log-file=valgrind.out ./test_dstevr
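Note that, compared with the g++ command in the question, the icpx command also switches the threading layer from -lmkl_gnu_thread (libgomp) to -lmkl_intel_thread with -liomp5, so the newer MKL version is not the only thing that changed. To check whether the "possibly lost" record comes from libgomp's worker threads rather than from MKL, a small MKL-free function can stand in for the threaded work. This is a minimal sketch under that assumption; parallelSum is a hypothetical name, not something from the original code.

```cpp
#include <cassert>

// Hypothetical MKL-free stand-in for the threaded transposition: sums
// 0 + 1 + ... + (n-1) in an OpenMP parallel loop. When built with
// g++ -fopenmp, the runtime starts its worker threads via pthread_create.
double parallelSum(int n) {
    assert(n >= 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; ++i) {
        sum += static_cast<double>(i);
    }
    return sum;
}
```

Calling parallelSum(1000) from a small main, building with g++ -fopenmp -ggdb3, and running the same Valgrind command lets you compare the leak records with the ones above. If a similar pthread_create / _dl_allocate_tls record shows up without any MKL call, the report most likely concerns threads the OpenMP runtime keeps alive until exit, not memory leaked by LAPACKE_dstevr.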