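Final edit: This turned out to be a stack overflow that had nothing to do with the string functions or malloc. GDB reported the problem a few lines above where it actually was, which confused me, but once I took the time to run the program under Valgrind I figured it out.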
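I wrote a bidirectional breadth-first search program to find shortest paths in a very large directed graph (~6 million nodes). With a 100-node test input file, everything works. With the full input, it uses far more memory and then the program segfaults.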
GDB says it segfaults at the start of the search function, where I clear the result buffer with n = sprintf(result, "");
Here is the relevant function:
char *bidirbfs(int x, int y, char *result){
int n;
n = sprintf(result, "");
...
Here is the call to it and the allocation of the result buffer:
int main (){
int n=0;
char *result;
result = (char *)malloc(sizeof(char)*2000);
if(result == NULL){
printf("MALLOC FAILED!"); exit(1);}
//Methods for initializing graph
readStructureFromFile();
calcArticlesIn();
//Search the graph
result = bidirbfs(1,2, result);
printf("%s\n", result);
...
}
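Again, everything works as long as the input is small. With the full-size input, the program reads everything in fine but then segfaults. When I switched to a very similar strncpy call to clear the array, I got the same behavior, so this seems to be a general problem with string functions. I'm not sure what is going on.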
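It seems that sprintf doesn't like the pointer it is given, which makes me wonder whether malloc is doing something strange. With the full input, malloc is called 13 million times*, so I wonder whether that makes it misbehave and overwrite the string buffer with garbage. At the same time, I'm very hesitant to blame the library.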
Any idea what might be going on?
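*Sadly, I think this really is necessary. Every element in the graph has one array for its inbound edges and one for its outbound edges. The size of each array is unknown until the input has been read, so each one has to be allocated dynamically with malloc at the right size.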
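As a rough illustration of the per-node allocation described in the footnote, here is a minimal sketch. The question does not show the real node type, so `struct node`, `node_init`, and `node_free` are hypothetical names, and storing edges as `int` indices is an assumption. Two allocations per node would be broadly in line with the roughly 13 million malloc calls for about 6 million nodes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout: the question does not show the real struct. */
struct node {
    int *in_edges;    /* indices of nodes with an edge into this node */
    int *out_edges;   /* indices of nodes this node points to */
    size_t in_count;
    size_t out_count;
};

/* Allocate both edge arrays once the counts are known from the input.
 * Returns 0 on success, -1 if either allocation fails. */
int node_init(struct node *n, size_t in_count, size_t out_count)
{
    assert(n != NULL);
    n->in_count = in_count;
    n->out_count = out_count;
    /* Request at least one element so a NULL result always means failure. */
    n->in_edges = malloc((in_count ? in_count : 1) * sizeof *n->in_edges);
    n->out_edges = malloc((out_count ? out_count : 1) * sizeof *n->out_edges);
    if (n->in_edges == NULL || n->out_edges == NULL) {
        free(n->in_edges);
        free(n->out_edges);
        n->in_edges = n->out_edges = NULL;
        return -1;
    }
    return 0;
}

/* Release both edge arrays and reset the pointers. */
void node_free(struct node *n)
{
    assert(n != NULL);
    free(n->in_edges);
    free(n->out_edges);
    n->in_edges = n->out_edges = NULL;
}
```

Checking both results before use means an out-of-memory condition shows up as an error code rather than as a crash somewhere later in the search.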
Edit: Valgrind returns the following. I'm still working out what it means, but at first glance it may actually be some kind of stack overflow.
==27263== Warning: client switching stacks? SP change: 0xbea50634 --> 0xbb815340
==27263== to suppress, use: --max-stackframe=52671220 or greater
==27263== Invalid write of size 4
==27263== at 0x8048D78: bidirbfs (load_data.c:184)
==27263== by 0x80491CD: main (load_data.c:304)
==27263== Address 0xbb815348 is on thread 1's stack
==27263==
==27263==
==27263== Process terminating with default action of signal 11 (SIGSEGV)
==27263== Access not within mapped region at address 0xBB815348
==27263== at 0x8048D78: bidirbfs (load_data.c:184)
==27263== If you believe this happened as a result of a stack
==27263== overflow in your program's main thread (unlikely but
==27263== possible), you can try to increase the size of the
==27263== main thread stack using the --main-stacksize= flag.
==27263== The main thread stack size used in this run was 8388608.
==27263==
==27263== Process terminating with default action of signal 11 (SIGSEGV)
==27263== Access not within mapped region at address 0xBB81533C
==27263== at 0x401F4DD: _vgnU_freeres (vg_preloaded.c:58)
==27263== If you believe this happened as a result of a stack
==27263== overflow in your program's main thread (unlikely but
==27263== possible), you can try to increase the size of the
==27263== main thread stack using the --main-stacksize= flag.
==27263== The main thread stack size used in this run was 8388608.
==27263==
==27263== HEAP SUMMARY:
==27263== in use at exit: 1,021,539,288 bytes in 13,167,791 blocks
==27263== total heap usage: 13,167,792 allocs, 1 frees, 1,047,874,864 bytes allocated
==27263==
==27263== LEAK SUMMARY:
==27263== definitely lost: 0 bytes in 0 blocks
==27263== indirectly lost: 0 bytes in 0 blocks
==27263== possibly lost: 0 bytes in 0 blocks
==27263== still reachable: 1,021,539,288 bytes in 13,167,791 blocks
==27263== suppressed: 0 bytes in 0 blocks
==27263== Rerun with --leak-check=full to see details of leaked memory
==27263==
==27263== For counts of detected and suppressed errors, rerun with: -v
==27263== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 12 from 7)
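Edit 2: Final solution: it was a stack overflow. Right after the sprintf statement, I declared an array whose size is proportional to the number of nodes. Because I wasn't using malloc, it was created directly on the stack and overflowed it. Switching to malloc fixed the problem, and everything now runs as expected. Thanks for all the suggestions!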
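For anyone hitting the same thing, here is a minimal sketch of the fix. The array's real name and element type are not shown above, so `dist`, `alloc_node_array`, and `search_with_heap_scratch` are hypothetical. The Valgrind output reports a stack pointer jump of about 52 MB against an 8 MB main-thread stack, which is what a local array sized by the node count would produce; allocating it on the heap avoids that.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-node scratch array in bidirbfs.
 * A local array such as `int dist[node_count];` lives on the stack, which
 * was 8388608 bytes in the Valgrind run above, so millions of ints do not
 * fit. calloc places the zero-filled array on the heap instead. */
int *alloc_node_array(size_t node_count)
{
    assert(node_count > 0);
    return calloc(node_count, sizeof(int));
}

/* Sketch of a search routine's setup and teardown; returns 0 on success. */
int search_with_heap_scratch(size_t node_count)
{
    int *dist = alloc_node_array(node_count);
    if (dist == NULL) {
        return -1;              /* allocation failed: report it, don't crash */
    }
    dist[node_count - 1] = 1;   /* touching the last element is safe here */
    /* ... the actual breadth-first search would go here ... */
    free(dist);
    return 0;
}
```

Calling `search_with_heap_scratch(6000000)` requests about 24 MB from the heap (assuming a 4-byte `int`), which is fine there but would not fit in a default-sized stack.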
Run your program under Valgrind and see what it says. I bet you'll find the output enlightening.
This is a guess at this point, but try the following simple change:
Wherever you write to the result variable, try using
n = snprintf(result, 2000, "...", ...);
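where ... stands for whatever you actually want to write to the result string.
If you write past the end of the memory allocated for result, the behavior is unpredictable.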
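A minimal sketch of that idea, wrapped so that truncation is detected as well as prevented. `format_checked` is a hypothetical helper, not something from the question's code; it relies only on the standard `vsnprintf`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format into buf without writing past size bytes.
 * Returns 0 on success, -1 if the output was truncated or formatting failed. */
int format_checked(char *buf, size_t size, const char *fmt, ...)
{
    assert(buf != NULL && size > 0);
    va_list args;
    va_start(args, fmt);
    int n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    /* vsnprintf returns the length the full output would have had, so
     * n >= size means the text did not fit and was cut short. */
    if (n < 0 || (size_t)n >= size) {
        return -1;
    }
    return 0;
}
```

With the 2000-byte buffer from the question, a call would look like `format_checked(result, 2000, "%d", x)`. If the goal is only to empty the buffer, `result[0] = '\0';` does the same job without a formatting call.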