Segmentation fault in 'gsl_spmatrix_add'

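Edit: I have replaced the code in this question with new code that reproduces the same error more reliably.

I have been hunting a segmentation fault in my code for a while and have reduced it to the following: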

#include <gsl/gsl_spmatrix.h>
#include <cstdio>
#include <iostream>
using namespace std;

void test_gsl() {
    size_t size = 5;
    size_t nzmax = 5 * 5;
    constexpr size_t threads = 5;
    // allocate
    gsl_spmatrix* thread_matrices[threads];
    for (size_t thread = 0; thread < threads; thread++) {
        thread_matrices[thread] = gsl_spmatrix_alloc_nzmax(size, size, nzmax, GSL_SPMATRIX_TRIPLET);
    }
    // set
    for (size_t i = 0; i < threads; i++) {
        gsl_spmatrix_set(thread_matrices[i], 0, 0, 1.0);
    }
    // crs
    for (size_t i = 0; i < threads; i++) {
        gsl_spmatrix* temp = thread_matrices[i];
        thread_matrices[i] = gsl_spmatrix_crs(thread_matrices[i]);
        gsl_spmatrix_free(temp);
    }
    // add to total
    gsl_spmatrix* total_matrix = gsl_spmatrix_alloc_nzmax(size, size, nzmax, GSL_SPMATRIX_CRS);
    gsl_spmatrix* total_copy = gsl_spmatrix_alloc_nzmax(size, size, nzmax, GSL_SPMATRIX_CRS);
    for (size_t i = 0; i < threads; i++) {
        gsl_spmatrix_memcpy(total_copy, total_matrix);  // this is required to avoid another segfault
        gsl_spmatrix_add(total_matrix, total_copy, thread_matrices[i]); // unknown segfault!
    }
    gsl_spmatrix_free(total_matrix);
    gsl_spmatrix_free(total_copy);
}

int main(int argc, char* argv[]) {
    test_gsl();
    printf("end\n");
    return 0;
}

When I run this program, I always get the following output:

Segmentation fault (core dumped)

The segmentation fault occurs at the line gsl_spmatrix_add(total_matrix, total_copy, thread_matrices[i]);.

I am compiling this code with CMake:

cmake_minimum_required(VERSION 3.22.1)
project(diskmodel)
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED YES)
add_subdirectory("src")

project(galaxy)
find_package(GSL REQUIRED)
add_executable(${PROJECT_NAME} main.cpp)
set_target_properties(${PROJECT_NAME} PROPERTIES OUTPUT_NAME "${PROJECT_NAME}" SUFFIX ".exe")
target_link_libraries(${PROJECT_NAME} GSL::gsl GSL::gslcblas )

What is causing this segfault?

Edit:

After compiling with g++ `gsl-config --libs` main.cpp -fsanitize=undefined -g, I get the same output as before. When compiling with -fsanitize=address instead, I get:

=================================================================
==31330==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 400 byte(s) in 5 object(s) allocated from:
#0 0x7efd44b64a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
#1 0x7efd449d393e in gsl_spmatrix_alloc_nzmax (/lib/x86_64-linux-gnu/libgsl.so.23+0x1f893e)
Indirect leak of 240 byte(s) in 5 object(s) allocated from:
#0 0x7efd44b64808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
#1 0x7efd449d3b6c in gsl_spmatrix_alloc_nzmax (/lib/x86_64-linux-gnu/libgsl.so.23+0x1f8b6c)
Indirect leak of 200 byte(s) in 5 object(s) allocated from:
#0 0x7efd44b64808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
#1 0x7efd449d3b88 in gsl_spmatrix_alloc_nzmax (/lib/x86_64-linux-gnu/libgsl.so.23+0x1f8b88)
Indirect leak of 40 byte(s) in 5 object(s) allocated from:
#0 0x7efd44b64808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
#1 0x7efd449d39ac in gsl_spmatrix_alloc_nzmax (/lib/x86_64-linux-gnu/libgsl.so.23+0x1f89ac)
Indirect leak of 40 byte(s) in 5 object(s) allocated from:
#0 0x7efd44b64808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
#1 0x7efd449d397d in gsl_spmatrix_alloc_nzmax (/lib/x86_64-linux-gnu/libgsl.so.23+0x1f897d)

When I compile with my CMake file and run gdb galaxy.exe, I get the following backtrace:

#0  0x00007ffff7f2c185 in gsl_spblas_scatter () from /lib/x86_64-linux-gnu/libgsl.so.23
#1  0x00007ffff7f2b364 in gsl_spmatrix_add () from /lib/x86_64-linux-gnu/libgsl.so.23
#2  0x00005555555553d2 in test_gsl () at .../src/main.cpp:35
#3  0x0000555555555420 in main (argc=1, argv=0x7fffffffdaf8) at .../src/main.cpp:44

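and no history is shown by one further command I tried (its name was lost from this post).

After running ulimit -c unlimited and then running the program, no core file is generated. I tried looking into this, but I cannot find a core file anywhere, and I do not know why.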

This looks like a bug in GSL. Please report it :-(

The line

gsl_spmatrix *total_matrix = gsl_spmatrix_alloc_nzmax(size, size, nzmax, GSL_SPMATRIX_CRS);

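is a valid allocator call for a GSL sparse matrix. However, its initialization is "clever" in that some of its memory buffers are malloc'ed but never initialized. This concerns the member p. See line 130 of init_source.c (from the GSL sources, submodule (directory) spmatrix):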

m->p = malloc((n1 + 1) * sizeof(int));

The next thing your code does is

gsl_spmatrix_memcpy(total_copy, total_matrix); // this is required to avoid another segfault

Well, that comment is a bit amusing, but let's look at the code (lines 93-96 of copy_source.c):

for (n = 0; n < src->size1 + 1; ++n)
{
dest->p[n] = src->p[n];
}

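Here, size1 appears to be the number of matrix rows, which was declared as 5. So the code replaces garbage with garbage (by copying). This tells us that GSL does not seem to work well when a matrix declared with 5 rows has fewer than 5 non-zero rows. I believe this is the root of your problem. You declared some matrices, e.g. total_matrix and total_copy, as having 5 rows, but they do not actually contain them. Up to this point, however, the code has not failed, because copying garbage over garbage raises no error.

The next step in the code: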

gsl_spmatrix_add(total_matrix, total_copy, thread_matrices[i]);

calls code that involves the member p:

for (j = 0; j < outer_size; ++j)
{
Cp[j] = nz;

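This opens a loop which, in your case, is executed 5 times. Here Cp is shorthand for C->p. So the only element of any p member that has been initialized so far is the j-th element of C in C = A + B. Next, inside this loop, we see: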

/* CSC: x += A(:,j); CSR: x += A(j,:) */
nz = FUNCTION (spmatrix, scatter) (a, j, w, x, (int) (j + 1), c, nz);

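Note that j is passed as the second argument and the not fully initialized a as the first. Through a macro, this calls the scatter function defined on line 538: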

static size_t
FUNCTION (spmatrix, scatter) (const TYPE (gsl_spmatrix) * A, const size_t j, int * w,
ATOMIC * x, const int mark, TYPE (gsl_spmatrix) * C, size_t nz)
{
int p;
int * Ai = A->i;
int * Ap = A->p;
ATOMIC * Ad = A->data;
int * Ci = C->i;
for (p = Ap[j]; p < Ap[j + 1]; ++p)
{

Now it can be seen that GSL reads the uninitialized values Ap[j] and Ap[j + 1]. This leads to a segfault a few instructions later.
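To make the role of the row pointers concrete, here is a minimal sketch of the same loop shape on a hypothetical Csr struct (my own stand-in for illustration, not GSL's gsl_spmatrix). When the row pointers of an empty matrix are explicitly zeroed, the loop bounds are equal and the body never runs; with uninitialized bounds, the indices into the data array would be arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CSR matrix; not GSL's gsl_spmatrix.
struct Csr {
    std::size_t rows;
    std::vector<int> p;        // row pointers, rows + 1 entries
    std::vector<int> i;        // column index of each stored entry
    std::vector<double> data;  // value of each stored entry
};

// An empty matrix whose row pointers are explicitly set to 0.
Csr make_empty(std::size_t rows) {
    return Csr{rows, std::vector<int>(rows + 1, 0), {}, {}};
}

// Same loop shape as the scatter routine: visit entries p[j] .. p[j + 1] - 1 of row j.
double row_sum(const Csr& a, std::size_t j) {
    assert(j < a.rows);
    double sum = 0.0;
    for (int k = a.p[j]; k < a.p[j + 1]; ++k) {
        sum += a.data[k];
    }
    return sum;
}
```

With make_empty(5), row_sum visits nothing for any row, which is the behaviour one would expect from an empty source matrix.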

Now, how can this be avoided?

Let's look at the "kosher" way of creating a CSR matrix (lines 152-156 of compress_source.c):

Cp = dest->p;
/* initialize row pointers to 0 */
for (n = 0; n < dest->size1 + 1; ++n)
Cp[n] = 0;

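Hooray! This is a proper initialization of the p member. Incidentally, the next few lines there explain that the p member in the CRS representation is used to store the number of elements in each row. This appears to be the code that is missing from gsl_spmatrix_alloc_nzmax.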
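As an illustration of why the zeroing matters, here is a sketch of the usual way row pointers are built from triplet data: zero them, count the entries per row, then take a running sum. This is the generic textbook construction, written from scratch under that assumption; build_row_pointers is a hypothetical helper and not GSL code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build CSR row pointers from the row index of each triplet entry.
// Hypothetical helper for illustration; not part of GSL.
std::vector<int> build_row_pointers(const std::vector<int>& row_of_entry, std::size_t rows) {
    std::vector<int> p(rows + 1, 0);  // initialize row pointers to 0
    for (int r : row_of_entry) {
        assert(r >= 0 && static_cast<std::size_t>(r) < rows);
        ++p[r + 1];                   // count entries per row, shifted by one slot
    }
    for (std::size_t n = 0; n < rows; ++n) {
        p[n + 1] += p[n];             // running sum turns counts into offsets
    }
    return p;
}
```

For a matrix with no entries at all, the result is rows + 1 zeros, which is exactly the state a freshly allocated empty CRS matrix would need in order to be read safely.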

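Conclusion: do not rely on the matrices returned by gsl_spmatrix_alloc_nzmax. They should be usable as "destination matrices", e.g. as C in C = A + B, but not as zero-filled sources.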
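The destination/source distinction can be seen in a small row-wise addition sketch. It uses a hypothetical Csr struct and add function of my own, not GSL's implementation: every field of the destination c is overwritten, while the row pointers of a and b are only read, so those must hold valid values beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration; not GSL's gsl_spmatrix.
struct Csr {
    std::size_t rows, cols;
    std::vector<int> p;        // row pointers, rows + 1 entries
    std::vector<int> i;        // column indices
    std::vector<double> data;  // values
};

// C = A + B. Every field of c is overwritten; a.p and b.p are only read.
void add(Csr& c, const Csr& a, const Csr& b) {
    assert(a.rows == b.rows && a.cols == b.cols);
    c.rows = a.rows;
    c.cols = a.cols;
    c.p.assign(a.rows + 1, 0);
    c.i.clear();
    c.data.clear();
    std::vector<int> where(a.cols, -1);  // slot in c last used for each column
    for (std::size_t j = 0; j < a.rows; ++j) {
        const int row_start = static_cast<int>(c.i.size());
        c.p[j] = row_start;
        for (const Csr* m : {&a, &b}) {
            for (int k = m->p[j]; k < m->p[j + 1]; ++k) {
                const int col = m->i[k];
                if (where[col] < row_start) {  // column not yet present in row j of c
                    where[col] = static_cast<int>(c.i.size());
                    c.i.push_back(col);
                    c.data.push_back(m->data[k]);
                } else {                       // column already present: accumulate
                    c.data[where[col]] += m->data[k];
                }
            }
        }
    }
    c.p[a.rows] = static_cast<int>(c.i.size());
}
```

In this sketch the previous contents of c never matter, whereas a source with arbitrary row pointers would send the inner loop to arbitrary indices.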

Hope this helps.

PS. You can remove this completely unnecessary call: gsl_spmatrix_memcpy(total_copy, total_matrix);
