MPI shared-memory communication with sliced access

I have two questions about using MPI shared-memory communication.

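1) If I have one MPI rank that is the only rank writing to a window, is it necessary to use MPI_Win_lock and MPI_Win_unlock? I know that my application will never have other ranks trying to write to that window. They only read the contents of the window, and I make sure that they read after an MPI_Barrier, so the contents of the window have been updated.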

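2) In my application I have one MPI rank that allocates a shared window which needs to be read by 1:N other MPI ranks.

MPI rank 1 should only read rma(1:10)

MPI rank 2 should only read rma(11:20)

MPI rank N should only read rma(10*(N-1)+1:10*N)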

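Currently, all ranks 1 to N query the whole shared window, i.e. of size 10*N, with MPI_WIN_SHARED_QUERY.

I am asking whether it is possible to apply MPI_WIN_SHARED_QUERY so that MPI rank 1 can only access the window from 1:10, rank 2 from 11:20, and so on.

That way every rank would have local access indexed 1:10, but each would refer to a different chunk of the shared window. Is this possible?

Thanks very much!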

Update

Based on the answer below, this seems to do what I want. But it does not work when using MPI_WIN_SHARED_QUERY.

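However, I do not understand how the local pointer automatically points to a different part of the array. How does it know how to do that? The only thing you do is the c_f_pointer call with a size of nlocal=5. How does it know that, for example, rma on rank 3 has to access the 5 locations 16 to 20? It is really not clear to me, and I am concerned about whether it is portable, i.e. can I rely on it?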

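First, I would recommend using MPI_Win_fence rather than MPI_Barrier for synchronisation. Like a barrier it synchronises the ranks in time, but it also ensures that all operations on the window are visible (e.g. writes should be flushed to memory).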

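If you use MPI_Win_allocate_shared() then you automatically achieve what you want: each rank gets a pointer to its own local section. However, the memory is contiguous (the default when the alloc_shared_noncontig info key is not set), so you can reach all of it by over- or under-indexing the array elements. (You could instead use normal Fortran pointers to point to subsections of an array allocated purely by rank 0, but I think MPI_Win_allocate_shared() is more elegant.)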
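The alternative mentioned in parentheses, plain Fortran pointers into one large array, can be sketched without any MPI calls. This is only an illustration under stated assumptions: the ordinary array `whole` stands in for the shared window, the loop over `r` stands in for the ranks, and the names `whole`, `chunk` and `nranks` are hypothetical.

```fortran
program chunk_views
  implicit none
  integer, parameter :: nlocal = 5   ! elements per rank (assumed)
  integer, parameter :: nranks = 4   ! stand-in for the number of ranks
  integer, target  :: whole(nlocal*nranks)  ! stand-in for the shared window
  integer, pointer :: chunk(:)
  integer :: i, r, lo

  ! Fill the "window" with 1..n, as rank 0 does in the real program
  whole = [(i, i = 1, nlocal*nranks)]

  do r = 0, nranks - 1
    ! First global index owned by rank r
    lo = r*nlocal + 1
    ! Pointing at an array section gives a view whose bounds are 1:nlocal
    chunk => whole(lo:lo + nlocal - 1)
    write(*,*) "rank ", r, " bounds ", lbound(chunk, 1), ubound(chunk, 1), &
               " values ", chunk
  end do
end program chunk_views
```

Every view is indexed from 1 to `nlocal` but refers to a different slice of the same storage. With MPI_Win_allocate_shared() no such arithmetic is needed, because the base pointer returned on each rank already points at that rank's own segment.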

Here is some code to illustrate this: a shared array is created, initialised by rank 0 but read by all ranks.

This seems to work fine on my laptop:

me@laptop:~$ mpirun -n 4 ./rmatest
Running on            4  processes with n =           20
Rank            2  in COMM_WORLD is rank            2  in nodecomm on node laptop
Rank            3  in COMM_WORLD is rank            3  in nodecomm on node laptop
Rank            0  in COMM_WORLD is rank            0  in nodecomm on node laptop
Rank            1  in COMM_WORLD is rank            1  in nodecomm on node laptop
rank, noderank, arr:            0           0           1           2           3           4           5
rank, noderank, arr:            3           3          16          17          18          19          20
rank, noderank, arr:            2           2          11          12          13          14          15
rank, noderank, arr:            1           1           6           7           8           9          10

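In general, though, this only works as shown if all ranks are on the same shared-memory node, because the window is shared only within a node. The full program follows.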

program rmatest
  use iso_c_binding, only: c_ptr, c_f_pointer
  use mpi
  implicit none

  ! Number of array elements owned by each rank
  integer, parameter :: nlocal = 5
  integer :: i, n
  integer, pointer, dimension(:) :: rma
  integer :: comm, nodecomm, nodewin
  integer :: ierr, size, rank, nodesize, noderank, nodestringlen
  integer(MPI_ADDRESS_KIND) :: winsize
  integer :: intsize, disp_unit
  character(len=MPI_MAX_PROCESSOR_NAME) :: nodename
  type(c_ptr) :: baseptr

  comm = MPI_COMM_WORLD
  call MPI_Init(ierr)
  call MPI_Comm_size(comm, size, ierr)
  call MPI_Comm_rank(comm, rank, ierr)

  ! Create node-local communicator
  call MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, rank, &
                           MPI_INFO_NULL, nodecomm, ierr)

  ! Check it all went as expected
  call MPI_Get_processor_name(nodename, nodestringlen, ierr)
  call MPI_Comm_size(nodecomm, nodesize, ierr)
  call MPI_Comm_rank(nodecomm, noderank, ierr)

  n = nlocal*nodesize

  if (rank == 0) then
    write(*,*) "Running on ", size, " processes with n = ", n
  end if

  write(*,*) "Rank ", rank, " in COMM_WORLD is rank ", noderank, &
             " in nodecomm on node ", nodename(1:nodestringlen)

  ! Each rank contributes nlocal integers to the shared window
  call MPI_Type_size(MPI_INTEGER, intsize, ierr)
  winsize = nlocal*intsize

  ! displacements counted in units of integers
  disp_unit = intsize

  call MPI_Win_allocate_shared(winsize, disp_unit, &
                               MPI_INFO_NULL, nodecomm, baseptr, nodewin, ierr)

  ! coerce baseptr to a Fortran array: global on rank 0, local on others
  if (noderank == 0) then
    call c_f_pointer(baseptr, rma, [n])
  else
    call c_f_pointer(baseptr, rma, [nlocal])
  end if

  ! Set the local arrays
  rma(1:nlocal) = 0

  ! Set values on noderank 0 (the only rank whose pointer spans all n elements)
  call MPI_Win_fence(0, nodewin, ierr)
  if (noderank == 0) then
    do i = 1, n
      rma(i) = i
    end do
  end if
  call MPI_Win_fence(0, nodewin, ierr)

  ! Print the values
  write(*,*) "rank, noderank, arr: ", rank, noderank, (rma(i), i=1,nlocal)

  ! Release the window before shutting MPI down
  call MPI_Win_free(nodewin, ierr)
  call MPI_Finalize(ierr)
end program rmatest
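For the layout described in the question, where a single rank allocates the whole array and the others only query it, a variant based on MPI_Win_shared_query might look like the sketch below. It is not the program above and has not been run here. It assumes all ranks are on one node, that node rank 0 owns all the memory while the other ranks contribute zero bytes, and the names `whole`, `mine`, `rootptr` and `qsize` are made up for illustration.

```fortran
program shared_query_sketch
  use iso_c_binding, only: c_ptr, c_f_pointer
  use mpi
  implicit none
  integer, parameter :: nlocal = 5
  integer :: ierr, rank, nodecomm, noderank, nodesize, nodewin
  integer :: intsize, disp_unit, i, lo
  integer(MPI_ADDRESS_KIND) :: winsize, qsize
  type(c_ptr) :: baseptr, rootptr
  integer, pointer :: whole(:), mine(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, rank, &
                           MPI_INFO_NULL, nodecomm, ierr)
  call MPI_Comm_rank(nodecomm, noderank, ierr)
  call MPI_Comm_size(nodecomm, nodesize, ierr)
  call MPI_Type_size(MPI_INTEGER, intsize, ierr)

  ! Assumption: only node rank 0 contributes memory, the others pass size 0
  winsize = 0
  if (noderank == 0) then
    winsize = int(nlocal, MPI_ADDRESS_KIND)*nodesize*intsize
  end if
  call MPI_Win_allocate_shared(winsize, intsize, MPI_INFO_NULL, &
                               nodecomm, baseptr, nodewin, ierr)

  ! Every rank asks where node rank 0's segment starts
  call MPI_Win_shared_query(nodewin, 0, qsize, disp_unit, rootptr, ierr)
  call c_f_pointer(rootptr, whole, [nlocal*nodesize])

  ! Slice out this rank's chunk; mine is indexed 1:nlocal
  lo = noderank*nlocal + 1
  mine => whole(lo:lo + nlocal - 1)

  call MPI_Win_fence(0, nodewin, ierr)
  if (noderank == 0) whole = [(i, i = 1, nlocal*nodesize)]
  call MPI_Win_fence(0, nodewin, ierr)

  write(*,*) "noderank, values: ", noderank, mine

  call MPI_Win_free(nodewin, ierr)
  call MPI_Finalize(ierr)
end program shared_query_sketch
```

Here the slicing is explicit: MPI_Win_shared_query returns the start of rank 0's segment on every rank, and the pointer assignment picks out the rank's own block. Which form reads better is a matter of taste; the version above lets MPI_Win_allocate_shared do the partitioning instead.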
