MPI hangs when receiving certain input

Problem description

I'm new to MPI and I can't work out why my program hangs.

My program opens a file and reads a size N, then uses N to read an N×N 2D array. I then send some general information to every processor, split the array horizontally into blocks, and send one block to each processor.

int size, rank;
int N = -1, generations = -1, sliceSize;
MPI_Status Stat;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);



if (rank == 0) {
    std::fstream file;
    file.open(argv[1]);

    file >> N;

    std::vector<std::vector<int>> grid(N, std::vector<int>(N));
    sliceSize = N / size;

    // read in matrix board from file
    for (int i = 0; i < N; ++i) {
        std::string temp;
        file >> temp;
        for (int j = 0; j < N; ++j) {
            grid[i][j] = temp[j] - '0';
        }
    }
    file.close();

    // send general info to each processor including 0
    int info[2];
    info[0] = N;
    info[1] = sliceSize;
    for (int i = 0; i < size; i++) {
        MPI_Send(&info, 3, MPI_INT, i, 1, MPI_COMM_WORLD);
    }

    // Split grid and send to each processor including 0
    /*
    Example split:
    00000
    00000

    00000
    00000
    */
    for (int z = 0; z < size; ++z) {
        int slice[sliceSize][N];
        for (int i = 0; i < sliceSize; ++i) {
            for (int j = 0; j < N; ++j) {
                slice[i][j] = grid[i + (z * sliceSize)][j];
            }
        }
        MPI_Send(&(slice[0][0]), N * sliceSize, MPI_INT, z, 2, MPI_COMM_WORLD);
    }
}

// All ranks execute this
int localInfo[2];        // local info for initial information

std::cout << "Here1 " << rank << std::endl;
MPI_Recv(&localInfo, 4, MPI_INT, 0, 1, MPI_COMM_WORLD, &Stat);
std::cout << "Here2 " << rank << std::endl;
N = localInfo[0];
sliceSize = localInfo[1];
generations = localInfo[2];

int mySlice[sliceSize][N];
std::cout << "Here3 " << rank << std::endl;
MPI_Recv(&(mySlice[0][0]), sliceSize * N, MPI_INT, 0, 2, MPI_COMM_WORLD, &Stat);
std::cout << "Here4 " << rank << std::endl;

//Do stuff with mySlice

I added the print statements to try to narrow down the problem. Interestingly, it works fine when N=10 or N=20, but for anything larger the problem appears.

Example input:

30
001000000000000000000000000000
101000000000000000000000000000
011000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000
000000000000000000000000000000

Running the code on 2 processors with the above input: mpirun -oversubscribe -np 2 ./Parallel 40x40Input.txt

My output:

Here1 1
Here2 1
Here3 1

I believe the problem is with the second MPI_Recv call, but I don't know why.

Any help would be greatly appreciated.

Tags: c++, parallel-processing, mpi

Solution

