MPI segmentation fault in MPI_Comm_rank()



I am a beginner with MPI, and this code seems to produce a segmentation fault.

int luDecomposeP(double *LU, int n)
{
    int i, j, k;
    int sendcount, recvcount, remaining, rank, numProcs, status;
    double *row, *rowFinal, *start, factor;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    row = (double *)malloc(n*sizeof(double));
    rowFinal = (double *)malloc(n*n*sizeof(double));

    for(i=0; i<n-1; i++)
    {
        if(rank == 0)
        {
            status = pivot(LU,i,n);
            for(j=0; j<n; j++)
                row[j] = LU[n*i+j];
        }

        MPI_Bcast(&status, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if(status == -1)
            return -1;

        MPI_Bcast(row, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        sendcount = (n-i-1)/numProcs;
        recvcount = (n-i-1)/numProcs;
        remaining = (n-i-1)%numProcs;

        if(rank == 0)
            start = LU + n*(i+1);
        else
            start = NULL;

        MPI_Scatter(start, sendcount*n, MPI_DOUBLE, rowFinal, recvcount*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        for(j=0; j<recvcount; j++)
        {
            factor = rowFinal[n*j+i]/row[i];
            for(k=i+1; k<n; k++)
                rowFinal[n*j+k] -= row[k]*factor;
            rowFinal[n*j+i] = factor;
        }

        MPI_Gather(rowFinal, recvcount*n, MPI_DOUBLE, start, sendcount*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        if(rank == 0)
        {
            int ctr = 0;
            while(ctr < remaining)
            {
                int index = sendcount*numProcs + ctr + i + 1;
                factor = LU[n*index+i]/row[i];
                for(k=i+1; k<n; k++)
                    LU[n*index+k] -= row[k]*factor;
                LU[n*index+i] = factor;
                ctr++;
            }
        }
    }

    free(row);
    free(rowFinal);
    return 0;
}

This code causes a segmentation fault. I have read many answers and tried to fix it, but with no success. I read about dereferencing NULL pointers, which I thought I had handled by using the pointer named start, but the segmentation fault still keeps appearing.

The error:

[seshnag:33234] *** Process received signal ***

[seshnag:33234] Signal: Segmentation fault (11)

[seshnag:33234] Signal code: Address not mapped (1)

[seshnag:33234] Failing at address: 0x44000098

[seshnag:33234] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x2b082eafe8f0]

[seshnag:33234] [ 1] /usr/lib/openmpi/lib/libmpi.so.0(MPI_Comm_rank+0x5e) [0x2b082d5ff6ee]

[seshnag:33234] [ 2] ./libluDecompose.so(luDecomposeP+0x2f) [0x2b082d17ea2f]

[seshnag:33234] [ 3] _tmp/back.mpi.exe(main+0x2e7) [0x40b61d]

[seshnag:33234] [ 4] /lib/libc.so.6(__libc_start_main+0xfd) [0x2b082ed2ac4d]

[seshnag:33234] [ 5] _tmp/back.mpi.exe() [0x40ac49]

From the stack trace you posted, the segmentation fault seems to occur inside the call to MPI_Comm_rank().

I see two possible problems:

  • MPI_Init() is missing. Normally MPI reports this explicitly, but perhaps your MPI implementation simply crashes instead? MPI_Init() must be called before any other MPI call (and MPI_Finalize() must be called before exiting).

  • A broken MPI installation. Does a simple MPI "hello world" program work?

Oh, yes... a third option:

  • The call happens on a corrupted stack (corrupted by instructions executed before luDecomposeP() was called): MPI_Comm_rank() is the first operation that writes to a stack variable.
