Segmentation fault when writing to a large (4 GB) dynamically allocated array



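On a 64-bit Ubuntu 12.04 system, a simple C program allocates a dynamic array of 4 GB or more. When the program tries to write a value to every element of the array, it segfaults. Here is the original code: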

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#define ROW 1024 * 1024 
#define COL 1024
#define u32 unsigned int
#define u64 unsigned long
int main(int argc, char const *argv[])
{
    u32 count = 0;
    u64* ary = (u64 *)malloc(sizeof(u32) * ROW * COL);
    assert(ary != NULL);
    printf("ary:%p\n", ary);
    for(u32 r = 0; r < ROW; r++) {
        for (u32 c = 0; c < COL; c++) {
            ary[r*ROW + c] = count;
            count++;
        }
    }
    free(ary);
    printf("free array\n");
    return 0;
}

The source is compiled with:

gcc -o t_32_64_ary_64 t_32_64_ary.c -g -std=c99 -m64

Running it produces a segmentation fault:

ary:0x7fa13584b010
Segmentation fault (core dumped)

The machine has 16 GB of RAM, so I believe it can allocate 4 GB without any problem.

If I comment out the code that writes the values:

    /*
    for(u32 r = 0; r < ROW; r++) {
        for (u32 c = 0; c < COL; c++) {
            ary[r*ROW + c] = count;
            count++;
        }
    }
    */

the program exits normally:

ary:0x7f136cd85010
free array

This shows that the 4 GB allocation succeeded.

Running the program under valgrind:

valgrind ./t_32_64_ary_64
...
==4830== Warning: set address range perms: large range [0x395a5040, 0x1395a5040) (undefined)
ary:0x395a5040
==4830== Invalid write of size 8
==4830==    at 0x4006AE: main (t_32_64_ary.c:21)
==4830==  Address 0x1395a5040 is 0 bytes after a block of size 4,294,967,296 alloc'd
==4830==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4830==    by 0x400648: main (t_32_64_ary.c:14)
==4830== 
==4830== 
==4830== Process terminating with default action of signal 11 (SIGSEGV)
==4830==  Access not within mapped region at address 0x1395A6000
==4830==    at 0x4006AE: main (t_32_64_ary.c:21)
==4830==  If you believe this happened as a result of a stack
==4830==  overflow in your program's main thread (unlikely but
==4830==  possible), you can try to increase the size of the
==4830==  main thread stack using the --main-stacksize= flag.
==4830==  The main thread stack size used in this run was 8388608.
==4830== 
==4830== HEAP SUMMARY:
==4830==     in use at exit: 4,294,967,296 bytes in 1 blocks
==4830==   total heap usage: 1 allocs, 0 frees, 4,294,967,296 bytes allocated
==4830== 
==4830== LEAK SUMMARY:
==4830==    definitely lost: 0 bytes in 0 blocks
==4830==    indirectly lost: 0 bytes in 0 blocks
==4830==      possibly lost: 0 bytes in 0 blocks
==4830==    still reachable: 4,294,967,296 bytes in 1 blocks
==4830==         suppressed: 0 bytes in 0 blocks
==4830== Rerun with --leak-check=full to see details of leaked memory
==4830== 
==4830== For counts of detected and suppressed errors, rerun with: -v
==4830== ERROR SUMMARY: 505 errors from 1 contexts (suppressed: 2 from 2)
Segmentation fault (core dumped)

According to this output, it seems that the line:

        ary[r*ROW + c] = count;

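triggers the segmentation fault, but I don't understand why. I believe the index r*ROW+c is within the bounds of the array.

Please help, and thanks!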

Your indexing is wrong, and you are probably not allocating correctly either.

You probably want to allocate with malloc(sizeof(u64) * ROW * COL) (note that I used u64 rather than u32). As it stands, you are allocating only half of the memory you probably intended.

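You should index with ary[r*COL + c] (or ary[c*ROW + r], depending on your needs). You should also use size_t rather than u32 for the index values, to avoid the overflow problem that C_Elegans mentioned.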
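A minimal sketch of the corrected allocation and indexing, under the assumption that the array is meant to be row-major with `rows` rows of `cols` elements each. The helper name `alloc_and_fill` and its parameters are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate rows * cols 64-bit elements and fill them
   with 0, 1, 2, ... in row-major order. Returns NULL on failure. */
uint64_t *alloc_and_fill(size_t rows, size_t cols)
{
    /* Refuse dimensions whose byte count would not fit in size_t. */
    if (cols != 0 && rows > SIZE_MAX / sizeof(uint64_t) / cols)
        return NULL;

    /* sizeof *ary keeps the element size tied to the pointer's type. */
    uint64_t *ary = malloc(sizeof *ary * rows * cols);
    if (ary == NULL)
        return NULL;

    uint64_t count = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            /* The row stride is the number of columns, not the number of rows. */
            ary[r * cols + c] = count++;
        }
    }
    assert(count == (uint64_t)rows * cols);
    return ary;
}
```

With the question's dimensions this would be called as `alloc_and_fill((size_t)1024 * 1024, 1024)`, which requests 8 GiB; the caller is expected to check the result for NULL and `free` it afterwards.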

Also, although it causes no problem in this small program, you should get into the habit of parenthesizing macro definitions (for example, #define ROW (1024 * 1024)).
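To illustrate why the parentheses matter, here is a small sketch with two made-up macros, `ROW_BARE` and `ROW_PAREN`, used as a divisor. The macro and function names are hypothetical. In the question's own expression `r*ROW + c` the bare form happens to expand harmlessly, which is why the program is not affected.

```c
#include <assert.h>

#define ROW_BARE   1024 * 1024      /* expands textually, no grouping */
#define ROW_PAREN (1024 * 1024)     /* always treated as one value */

/* x / ROW_BARE expands to x / 1024 * 1024, i.e. (x / 1024) * 1024. */
unsigned long divide_by_bare(unsigned long x)
{
    return x / ROW_BARE;
}

/* x / ROW_PAREN expands to x / (1024 * 1024), as intended. */
unsigned long divide_by_paren(unsigned long x)
{
    return x / ROW_PAREN;
}
```

For x = 4096, the bare macro yields (4096 / 1024) * 1024 = 4096, while the parenthesized one yields 4096 / 1048576 = 0.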

This line is your main problem:

u64* ary = (u64 *)malloc(sizeof(u32) * ROW * COL);

You are creating a pointer to u64, but computing the size to allocate from the size of a u32.

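The memory you allocate is not enough to hold ROW * COL values of type u64. It is probably only enough for half that number (unless you compile with a compiler that uses the same size for unsigned int and unsigned long).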
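The numbers can be worked through with a short sketch. It assumes a 4-byte unsigned int and an 8-byte unsigned long, as is usual on 64-bit Linux; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* How many elements of elem_size bytes fit into alloc_bytes bytes. */
uint64_t capacity_in_elems(uint64_t alloc_bytes, uint64_t elem_size)
{
    assert(elem_size != 0);
    return alloc_bytes / elem_size;
}

/* Smallest row r for which the index r * stride is already >= capacity. */
uint64_t first_out_of_range_row(uint64_t capacity, uint64_t stride)
{
    assert(stride != 0);
    return (capacity + stride - 1) / stride;
}
```

Under those assumptions the 4,294,967,296-byte block holds 536,870,912 u64 values. With the question's stride of ROW (1,048,576), the index r*ROW already equals that capacity at r = 512, which is consistent with valgrind reporting a write "0 bytes after" the block. Even with the stride corrected to COL (1024), the half-sized block would run out at row 524,288, halfway through the loop.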


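Edit:

As mentioned in the other answers, there are more problems in your code.

Using calloc() may be a more elegant way to compute the required memory. I think it is also less error-prone...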

#define SIZE 0x40000000  // = 1 Gi elements (4 GiB of uint32_t)
uint32_t *array = (uint32_t *)calloc(SIZE, sizeof(uint32_t));

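The calculation of the number of bytes to pass to malloc overflows, because ROW and COL are of type int (they are literals). To fix this, you need to declare them as: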

#define ROW 1024LL * 1024LL 
#define COL 1024LL
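As a sketch of how the operand type decides whether such a product wraps, the two hypothetical functions below compute the same byte count in 32-bit and in 64-bit unsigned arithmetic. The names `bytes_32` and `bytes_64` are made up, and the example assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* Product computed entirely in 32-bit unsigned arithmetic:
   the result wraps modulo 2^32. */
uint32_t bytes_32(uint32_t elem_size, uint32_t rows, uint32_t cols)
{
    return elem_size * rows * cols;
}

/* Same product with 64-bit operands: 4 * 2^20 * 2^10 fits easily. */
uint64_t bytes_64(uint64_t elem_size, uint64_t rows, uint64_t cols)
{
    return elem_size * rows * cols;
}
```

For elem_size = 4, rows = 1048576 and cols = 1024 the 32-bit version wraps to 0, while the 64-bit version gives 4294967296. In the question's malloc expression the leading sizeof already has type size_t, so on a 64-bit target that product is evaluated in 64 bits (the valgrind log does report a 4,294,967,296-byte block); the concern applies to expressions whose operands are all 32-bit, such as r*ROW + c with a u32 r.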

Alternatively, consider using calloc like this:

u64* ary = calloc(sizeof(u64) * COL, ROW); //casting malloc is not recommended in C

calloc checks for overflow when multiplying its arguments and always multiplies them as size_t, so your multiplication will no longer overflow.
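When malloc is used instead, the same kind of check can be written by hand. The following is a minimal sketch, not a description of how any particular calloc is implemented; `checked_mul` and `malloc_array` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: store a * b in *out and return 1 if the product
   fits in size_t; return 0 (leaving *out untouched) otherwise. */
int checked_mul(size_t a, size_t b, size_t *out)
{
    assert(out != NULL);
    if (a != 0 && b > SIZE_MAX / a)
        return 0;
    *out = a * b;
    return 1;
}

/* Hypothetical wrapper: malloc for nmemb elements of elem_size bytes,
   returning NULL instead of under-allocating when the size would wrap. */
void *malloc_array(size_t nmemb, size_t elem_size)
{
    size_t bytes;
    if (!checked_mul(nmemb, elem_size, &bytes))
        return NULL;
    return malloc(bytes);
}
```

A call such as `malloc_array(ROW * COL, sizeof(u64))` would then either return a block of the full size or NULL, never a silently truncated one, provided the ROW * COL argument itself is computed in a wide enough type.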
