C: Segmentation fault with MPI



In my program I need to do some matrix multiplication with MPI. When I run it, I get the following error:

=====================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 139
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)

It executes:

 printf("Sent a\n");

The error occurs at:

 MPI_Send(&b, nColA*nColB, MPI_FLOAT, dest, mtype, MPI_COMM_WORLD);

It does not execute:

 printf("Sent b\n");

I have no idea why.

Can you help me?

void multiplicaMatriz (int taskid, int numtasks, float **a, float **b, float **c, long int nLinA, long int nColA, long int nLinB, long int nColB)
{
    long int    i, j, k, rc;           /* misc */
    int numworkers,        /* number of worker tasks */
    source,                /* task id of message source */
    dest,                  /* task id of message destination */
    mtype,                 /* message type */
    rows,                  /* rows of matrix A sent to each worker */
    averow, extra, offset; /* used to determine rows sent to each worker */
    MPI_Status status;
    numworkers = numtasks-1;

   /**************************** master task ************************************/
   if (taskid == MASTER)
   {
      printf("mpi_mm has started with %d tasks.\n",numtasks);
      /* Send matrix data to the worker tasks */
      averow = nLinA/numworkers;
      extra = nLinA%numworkers;
      offset = 0;
      mtype = FROM_MASTER;
      for (dest=1; dest<=numworkers; dest++)
      {
         rows = (dest <= extra) ? averow+1 : averow;    
         printf("Sending %d rows to task %d offset=%d\n",rows,dest,offset);
         MPI_Send(&offset, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
         printf("Sent offset %d\n", offset);
         MPI_Send(&rows, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
         printf("Sent rows %d\n", rows);
         MPI_Send(&a[offset][0], rows*nColA, MPI_FLOAT, dest, mtype,
                   MPI_COMM_WORLD);
         printf("Sent a\n");
         MPI_Send(&b, nColA*nColB, MPI_FLOAT, dest, mtype, MPI_COMM_WORLD);
         printf("Sent b\n");
         offset = offset + rows;
      }
      /* Receive results from worker tasks */
      mtype = FROM_WORKER;
      for (i=1; i<=numworkers; i++)
      {
         source = i;
         MPI_Recv(&offset, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
         MPI_Recv(&rows, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
         MPI_Recv(&c[offset][0], rows*nColB, MPI_FLOAT, source, mtype, 
                  MPI_COMM_WORLD, &status);
         printf("Received results from task %d\n",source);
      }
      /* Print results */
      printf("******************************************************\n");
      printf("Result Matrix:\n");
      for (i=0; i<nLinA; i++)
      {
         printf("\n");
         for (j=0; j<nColB; j++) 
            printf("%6.2f   ", c[i][j]);
      }
      printf("\n******************************************************\n");
      printf ("Done.\n");
   }

   /**************************** worker task ************************************/
   if (taskid > MASTER)
   {
      mtype = FROM_MASTER;
      MPI_Recv(&offset, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD, &status);
      MPI_Recv(&rows, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD, &status);
      MPI_Recv(&a, rows*nColA, MPI_FLOAT, MASTER, mtype, MPI_COMM_WORLD, &status);
      MPI_Recv(&b, nColA*nColB, MPI_FLOAT, MASTER, mtype, MPI_COMM_WORLD, &status);
      for (k=0; k<nColB; k++)
         for (i=0; i<rows; i++)
         {
            c[i][k] = 0.0;
            for (j=0; j<nColA; j++)
               c[i][k] = c[i][k] + a[i][j] * b[j][k];
         }
      mtype = FROM_WORKER;
      MPI_Send(&offset, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD);
      MPI_Send(&rows, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD);
      MPI_Send(&c, rows*nColB, MPI_FLOAT, MASTER, mtype, MPI_COMM_WORLD);
   }
 }

This is caused by accessing b incorrectly.

Read this declaration carefully:

int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);

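Look at the buf parameter: it is a void*, and the function interprets it as a pointer to elements of whatever type datatype names. When you call MPI_Send(&b, nColA*nColB, MPI_FLOAT, dest, mtype, MPI_COMM_WORLD); you are passing &b. That is the address of the variable b itself, and its type is float***. The function treats it as a float* and reads nColA*nColB floats starting at that address, which runs far past the pointer variable into memory that holds no matrix data. That causes the error.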

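In your other call to MPI_Send() you passed &a[offset][0], which does have the correct float* type. Try passing &b[offset][0], or whichever indices you need so that the multiplication comes out right. The worker's MPI_Recv(&a, ...), MPI_Recv(&b, ...) and MPI_Send(&c, ...) calls pass the address of a pointer variable in the same way.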

I am not going to work out those indices for you; that is your job. But this is what causes the segfault.
