MPI error sending derived data type (Fortran)

My program segfaults when I try to send an MPI derived datatype containing "large" arrays (2 arrays of 100000 floats each). It runs normally with smaller arrays.

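Below is a small reproducible example. This program segfaults with the following MPI implementations: Intel MPI and Bullx MPI. It works fine with Open MPI and Platform MPI. Here is a log with a sample backtrace: http://pastebin.com/fmbpcuj2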

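Changing mpi_send to mpi_ssend does not help. However, an mpi_send of a single larger array of 2*100000 floats works fine. To me, this points to a problem with the derived datatype.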

program struct 
include 'mpif.h' 
type Data
  integer :: id
  real, allocatable :: ratio(:)
  real, allocatable :: winds(:)
end type 
type (Data) :: test
integer :: datatype, oldtypes(3), blockcounts(3) 
integer :: offsets(3)
integer :: numtasks, rank, i,  ierr 
integer :: n, status(mpi_status_size)
call mpi_init(ierr) 
call mpi_comm_rank(mpi_comm_world, rank, ierr) 
call mpi_comm_size(mpi_comm_world, numtasks, ierr) 
if (numtasks /= 2) then
  write (*,*) "Needs 2 procs"
  call exit(1)
endif
n = 100000
allocate(test%ratio(n))
allocate(test%winds(n))
if (rank == 0) then
  test%ratio = 6
  test%winds = 7
  test%id = 2
else
  test%id = 0
  test%ratio = 0
  test%winds = 0
endif
call mpi_get_address(test%id, offsets(1), ierr)
call mpi_get_address(test%ratio, offsets(2), ierr)
call mpi_get_address(test%winds, offsets(3), ierr)
do i = 2, size(offsets)
  offsets(i) = offsets(i) - offsets(1)
end do
offsets(1) = 0
oldtypes = (/mpi_integer, mpi_real, mpi_real/)
blockcounts = (/1, n, n/)
call mpi_type_struct(3, blockcounts, offsets, oldtypes, datatype, ierr) 
call mpi_type_commit(datatype, ierr) 
if (rank == 0) then 
  !call mpi_ssend(test, 1, datatype, 1, 0,  mpi_comm_world, ierr) 
  call mpi_send(test, 1, datatype, 1, 0,  mpi_comm_world, ierr) 
else
  call mpi_recv(test, 1, datatype, 0, 0,  mpi_comm_world, status, ierr) 
end if
print *, 'rank= ',rank
print *, 'data= ',test%ratio(1:5),test%winds(1:5)
deallocate (test%ratio)
deallocate (test%winds)
call mpi_finalize(ierr) 

end

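Note: the comparison between the different MPI implementations is not objective, because the tests were not all run on the same machine (some of the machines are supercomputers). Still, I don't think that should make the difference.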

Edit: the code works with static arrays. Also, this is Fortran 90.

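May I suggest you use a debugger? I just tried your example in Allinea DDT and saw the problem within two minutes. You need a debugger here: your code "looks right", so it is time to watch how it behaves in practice.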

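I clicked to turn on memory debugging (a way of forcing some hidden errors to show), and your example then crashed every time. The crash was in the sender.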

So I started stepping through with DDT, with its memory debugging turned on.

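First, you call mpi_get_address to fill an array of offsets. Look at those offsets! The address of the integer is positive, while the offsets of the allocatable arrays are negative: a bad sign. The addresses have overflowed.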

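Dynamically allocated data will sit at very different addresses from the statically allocated integer. If you use 32-bit arithmetic to manipulate 64-bit pointers (the MPI_Get_address documentation warns about this), all bets are off. With static arrays it did not crash, because their addresses were close enough to the integer's address that the offsets did not overflow.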

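You pass this incorrect offsets array on to MPI and then call mpi_send, which reads data from places it should not (look at the offsets buffer again to convince yourself) and therefore segfaults.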
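If no debugger is at hand, the offsets can also be inspected with a small standalone program. The following is only a sketch under stated assumptions: it uses c_loc from iso_c_binding as a stand-in for mpi_get_address so that it runs without MPI, the program name show_offsets is made up for illustration, and the actual numbers depend on the platform, compiler, and allocator.

```fortran
program show_offsets
  use iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  type Data
    integer :: id
    real, allocatable :: ratio(:)
    real, allocatable :: winds(:)
  end type
  ! target is needed so that c_loc may be applied to the members
  type (Data), target :: test
  ! address-sized integers (assumption: stand-in for MPI_ADDRESS_KIND)
  integer(c_intptr_t) :: base, disp(2)
  integer :: n, i

  n = 100000
  allocate(test%ratio(n))
  allocate(test%winds(n))

  ! Reinterpret each C pointer as an address-sized integer
  base = transfer(c_loc(test%id), base)
  disp(1) = transfer(c_loc(test%ratio), base) - base
  disp(2) = transfer(c_loc(test%winds), base) - base

  print *, 'largest default integer:', huge(0)
  do i = 1, 2
    print *, 'displacement', i, ':', disp(i), &
             ' fits in default integer:', abs(disp(i)) <= huge(0)
  end do

  deallocate(test%ratio)
  deallocate(test%winds)
end program show_offsets
```

Whenever a line reports that a displacement does not fit, storing that displacement in a default integer (as the original offsets array does) cannot preserve its value.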

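The real fix is:

With mpi_get_address, declare the offsets as integer(kind=mpi_address_kind), so that 64-bit code gets 64-bit integers.
Replace mpi_type_struct with mpi_type_create_struct. The former is deprecated (it was removed from the standard in MPI-3.0) and does not take the offsets as mpi_address_kind integers, only as 4-byte integers, so it is flawed for this purpose.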

With these changes, your code runs.
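For reference, here is a sketch of the question's program with the two changes applied. It is one possible version rather than a definitive one, and it makes a few choices of its own that are not part of the answer: implicit none is added, call exit(1) is swapped for mpi_abort, the datatype is freed before finalizing, and test%id is passed as the buffer so that the base address matches the one the displacements were computed from. The name struct_fixed is only illustrative.

```fortran
program struct_fixed
  implicit none
  include 'mpif.h'
  type Data
    integer :: id
    real, allocatable :: ratio(:)
    real, allocatable :: winds(:)
  end type
  type (Data) :: test
  integer :: datatype, oldtypes(3), blockcounts(3)
  ! Address-sized integers, so 64-bit addresses are not truncated
  integer(kind=mpi_address_kind) :: offsets(3)
  integer :: numtasks, rank, i, ierr
  integer :: n, status(mpi_status_size)

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, rank, ierr)
  call mpi_comm_size(mpi_comm_world, numtasks, ierr)
  if (numtasks /= 2) then
    write (*,*) "Needs 2 procs"
    call mpi_abort(mpi_comm_world, 1, ierr)
  end if

  n = 100000
  allocate(test%ratio(n))
  allocate(test%winds(n))
  if (rank == 0) then
    test%id = 2
    test%ratio = 6
    test%winds = 7
  else
    test%id = 0
    test%ratio = 0
    test%winds = 0
  end if

  ! Displacement of each member relative to test%id
  call mpi_get_address(test%id, offsets(1), ierr)
  call mpi_get_address(test%ratio, offsets(2), ierr)
  call mpi_get_address(test%winds, offsets(3), ierr)
  do i = 2, size(offsets)
    offsets(i) = offsets(i) - offsets(1)
  end do
  offsets(1) = 0

  oldtypes = (/mpi_integer, mpi_real, mpi_real/)
  blockcounts = (/1, n, n/)
  ! mpi_type_create_struct takes address-kind displacements
  call mpi_type_create_struct(3, blockcounts, offsets, oldtypes, datatype, ierr)
  call mpi_type_commit(datatype, ierr)

  ! test%id is the buffer because the displacements are relative to it
  if (rank == 0) then
    call mpi_send(test%id, 1, datatype, 1, 0, mpi_comm_world, ierr)
  else
    call mpi_recv(test%id, 1, datatype, 0, 0, mpi_comm_world, status, ierr)
  end if

  print *, 'rank= ', rank
  print *, 'data= ', test%ratio(1:5), test%winds(1:5)

  call mpi_type_free(datatype, ierr)
  deallocate(test%ratio)
  deallocate(test%winds)
  call mpi_finalize(ierr)
end program struct_fixed
```

Note that the displacements are specific to each process, since the allocatable arrays may live at different addresses on each rank; that is fine here because every rank builds its own datatype. With include 'mpif.h' there are no explicit interfaces, so some recent compilers (gfortran 10 and later, for example) may need -fallow-argument-mismatch to accept the differently typed buffer arguments.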

Good luck!
