MPI与派生数据类型在Fortran 90.广播的麻烦



我在使用模块,派生数据类型和MPI的Fortran 90代码中卡住了一段时间。

我遇到的问题是,在广播派生数据类型之后,只有主节点的变量具有正确的值,所有其他节点上的变量都不包含它们应该包含的值。我从较大的代码中抽象出一个最小的示例。它包含主程序:


include 'hello_types.f90'
include 'mpi_circle.f90'
program hello_world
use type_hello_world
use create_mpi_types
implicit none
include 'mpif.h'
integer         :: ierr, num_procs, my_id, mesg_mpi_circle
type(circle_)       :: circle
call MPI_Init(ierr)
!find out MY process ID, and how many processes were started.
call MPI_COMM_RANK (MPI_COMM_WORLD, my_id, ierr)
call MPI_COMM_SIZE (MPI_COMM_WORLD, num_procs, ierr)
allocate(circle%diameter(3),circle%straal(3))
if (my_id==0) then
print*,'enter straal and diameter'
read*,circle%diameter(1),circle%straal(1)
circle%diameter(2)=circle%diameter(1)
circle%straal(2)=circle%straal(1)
endif
call build_derived_circle(circle,mesg_mpi_circle)
call MPI_BCAST(circle,1,mesg_mpi_circle,0,MPI_COMM_WORLD,ierr)

print *, "Hello world! I'm process ", my_id, " out of", num_procs, " processes."
print*,my_id,mesg_mpi_circle%diameter(my_id+1),mesg_mpi_circle%straal(my_id+1)
call MPI_Finalize(ierr)

end program hello_world

输出包含两个打印语句,其中第一个只是打印proc_id(工作正常),第二个打印出各自节点上的变量(这就是我有问题的地方,只有在主节点上的值是好的)。这些变量是通过主节点上的读入来的。

还有一个定义类型的模块:


module type_hello_world
type circle_
   real,allocatable  :: straal(:),diameter(:)
end type circle_
end module type_hello_world

正如我所说的,我是从一个更大的代码中抽象出来的,所以这个模块可能看起来没用,但在原始代码中是有意义的。

作为第三个模块,它包含一个子例程,用于计算派生数据类型广播的位移.....我遵循了来自http://ladon.iqfr.csic.es/docs/MPI_ug_in_FORTRAN.pdf

的MPI Fortran用户指南
module create_mpi_types
contains
subroutine build_derived_circle(circle,mesg_mpi_circle)
use type_hello_world
implicit none

include 'mpif.h'

type(circle_),intent(in)     :: circle
! local
integer,parameter       :: number=2
integer                 :: ierr, i
integer             :: block_lengths(number)
integer                 :: displacements(number)
integer                 :: address(number+1)
integer                 :: typelist(number)

!output
integer,intent(out) :: mesg_mpi_circle
!----------------------------------------
!  first specify the types
typelist(1)=MPI_REAL
typelist(2)=MPI_REAL
! specify the number of elements of each type
block_lengths(1)=size(circle%straal)
block_lengths(2)=size(circle%diameter)
! calculate displacements relative to refr. 
call MPI_Address(circle,address(1),ierr)
call MPI_Address(circle%straal,address(2),ierr)
call MPI_Address(circle%diameter,address(3),ierr)
do i = 1, number
    displacements(i)=address(i+1)-address(i)
enddo
! build the derived data type
call MPI_TYPE_STRUCT(number,block_lengths,displacements,&
                    typelist,mesg_mpi_circle,ierr)
! commit it to the system, so it knows we ll use it 
! for communication
call MPI_TYPE_COMMIT(mesg_mpi_circle,ierr)
return
end subroutine build_derived_circle
!------------- END SUBROUTINE----------------------------
end module create_mpi_types

用于设置:代码旨在运行在使用Intel fortran编译的CentOs6下的ETH Brutus集群上。但是,我们在一些机器上测试了同样的问题,所以我认为这不是版本问题。

这不是版本问题。简单地说,MPI不喜欢具有可分配数组的类型。这和这个问题一样。它与指针和虚拟内存地址有关,引用的答案告诉您如何做到这一点,如果您真的想要的话。但是,如果可能的话,我会用下面列出的方法2来做。

有两种可能的方法。1)在type_hello_world

中创建固定长度的数组
type circle_
   real :: straal(100)
   real :: diameter(100)
end type circle_

2)创建一个只有一个元素的circle类型,然后创建一个圆数组。

type circle_
   real :: straal
   real :: diameter
end type circle_

我更喜欢第二种方法。还要注意位移类型应该是MPI_ADDRESS_KIND。所以你的代码将create_mpi_types改为

module create_mpi_types
contains
subroutine build_derived_circle(mesg_mpi_circle)
use type_hello_world
use mpi
implicit none

! local
type(circle_)                  :: circle
integer,parameter              :: number=2
integer                        :: ierr, i
integer                        :: block_lengths(number)
integer(kind=MPI_ADDRESS_KIND) :: displacements(number)
integer                        :: typelist(number)
real                           :: r
!output
integer,intent(out) :: mesg_mpi_circle
!----------------------------------------
do i = 0, number
    typelist(i) = MPI_REAL
    block_lengths(i) = 1
    displacements(i) = i * sizeof(r)
enddo
! build the derived data type
call MPI_Type_create_struct(number,block_lengths,displacements,&
                typelist,mesg_mpi_circle,ierr)
if (ierr /= 0 ) then
    print *, 'got an error in type create: ', ierr
    call MPI_Abort(MPI_COMM_WORLD, ierr, ierr)
endif
! commit it to the system, so it knows we ll use it
! for communication
call MPI_TYPE_COMMIT(mesg_mpi_circle,ierr)
if (ierr /= 0 ) then
    print *, 'got an error in type commit: ', ierr
    call MPI_Abort(MPI_COMM_WORLD, ierr, ierr)
endif
return
end subroutine build_derived_circle
!------------- END SUBROUTINE----------------------------
end module create_mpi_types

然后在hello_world中使用

include 'hello_types.f90'
include 'mpi_circle.f90'
program hello_world
use mpi
use type_hello_world
use create_mpi_types
implicit none
integer                    :: ierr, num_procs, my_id, mpi_circle_t
type(circle_), allocatable :: circles(:)
call MPI_Init(ierr)
!find out MY process ID, and how many processes were started.
call MPI_COMM_RANK (MPI_COMM_WORLD, my_id, ierr)
call MPI_COMM_SIZE (MPI_COMM_WORLD, num_procs, ierr)
allocate(circles(num_procs))
if (my_id==0) then
    !print*,'enter straal and diameter'
    !read*,circle%diameter(1),circle%straal(1)
    circles(:)%diameter = 10.0
    circles(:)%straal = 2.0
endif
call build_derived_circle(mpi_circle_t)
call MPI_BCAST(circles,num_procs,mpi_circle_t,0,MPI_COMM_WORLD,ierr)
print *, "Hello world! I'm process ", my_id, " out of", num_procs, " processes."
print*,my_id,circles(my_id+1)%diameter,circles(my_id+1)%straal
call MPI_TYPE_FREE(mpi_circle_t, ierr)
deallocate(circles)
call MPI_Finalize(ierr)

最新更新