Half-parallel code runs slower in OpenMP

Problem description

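Hello. I want to implement the inner product in three ways: 1 - sequential, 2 - half-parallel, 3 - fully parallel.

Half-parallel means the multiplications run in parallel and the summation is sequential.

Here is my code: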

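#include <stdio.h>   /* printf */
#include <stdlib.h>  /* malloc, rand */
#include <omp.h>     /* omp_get_wtime */

int main(int argc, char *argv[]) {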
    int *x, *y, *z, *w, xy_p, xy_s, xy_ss, i, N=5000;
    double s, e;
    x = (int *) malloc(sizeof(int)*N);
    y = (int *) malloc(sizeof(int)*N);
    z = (int *) malloc(sizeof(int)*N);
    w = (int *) malloc(sizeof(int)*N);

    for(i=0; i < N; i++) {
        x[i] = rand();
        y[i] = rand();
        z[i] = 0;
    }

    s = omp_get_wtime();

    xy_ss = 0;

    for(i=0; i < N; i++)
    {
        xy_ss += x[i] * y[i];
    }

    e = omp_get_wtime() - s;
    printf ( "[**] Sequential execution time is:\n%15.10f and <A,B> is %d\n", e, xy_ss );


    s = omp_get_wtime();

    xy_s = 0;

    #pragma omp parallel for shared ( N, x, y, z ) private ( i )
    for(i = 0; i < N; i++)
    {
        z[i] = x[i] * y[i];
    }
    for(i=0; i < N; i++)
    {
        xy_s += z[i];
    }

    e = omp_get_wtime() - s;
    printf ( "[**] Half-Parallel execution time is:\n%15.10f and <A,B> is %d\n", e, xy_s );


    s = omp_get_wtime();

    xy_p = 0;

    # pragma omp parallel shared (N, x, y) private(i)
    # pragma omp for reduction ( + : xy_p )
    for(i = 0; i < N; i++)
    {
        xy_p += x[i] * y[i];
    }
    e = omp_get_wtime() - s;
    printf ( "[**] Full-Parallel execution time is:\n%15.10f and <A,B> is %d\n", e, xy_p );
}

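So I have some questions. First, is my code correct? Second, why is half-parallel faster than sequential? Third, is 5000 a suitable size for parallelism? Finally, why is sequential the fastest? Is it because of 5000?

Sample output: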

Sequential execution time is: 0.0000196100 and dot is -1081001655

Half-parallel execution time is: 0.0090819710 and dot is -1081001655

Full-parallel execution time is: 0.0080959420 and dot is -1081001655

For N=5000000:

Sequential execution time is: 0.0150297650 and dot is -1629514371

Half-parallel execution time is: 0.0292110600 and dot is -1629514371

Full-parallel execution time is: 0.0072323760 and dot is -1629514371

In any case, why is half-parallel the slowest?

Tags: c, performance, parallel-processing, openmp, pragma

Solution
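No answer text survives on this page, so what follows is only a hedged sketch of how the three variants could be compared more fairly, not a definitive explanation. Two effects probably account for the timings in the question. The first OpenMP parallel region usually pays a one-time cost to start the thread team, and that cost can dominate a loop of only 5000 iterations. The half-parallel version also writes every product to z and reads it back in a second, sequential pass, so it moves more memory than the other two. The negative results in the sample output suggest that the int accumulators overflowed. The sketch therefore uses long long and assumes the inputs are small enough for the sum to fit (for example rand() % 100). The helper names now_seconds, dot_fn and time_dot are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds; falls back to clock() when built without OpenMP. */
static double now_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* 1 - sequential: a single pass, the scratch array z is not used. */
long long dot_sequential(const int *x, const int *y, long long *z, int n) {
    (void)z;
    long long sum = 0;
    for (int i = 0; i < n; i++)
        sum += (long long)x[i] * y[i];
    return sum;
}

/* 2 - half-parallel: products in parallel into z, then a sequential sum. */
long long dot_half_parallel(const int *x, const int *y, long long *z, int n) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        z[i] = (long long)x[i] * y[i];

    long long sum = 0;
    for (int i = 0; i < n; i++)   /* second pass over memory */
        sum += z[i];
    return sum;
}

/* 3 - fully parallel: each thread keeps a private partial sum. */
long long dot_full_parallel(const int *x, const int *y, long long *z, int n) {
    (void)z;
    long long sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; i++)
        sum += (long long)x[i] * y[i];
    return sum;
}

typedef long long (*dot_fn)(const int *, const int *, long long *, int);

/* Returns the best time over `repeats` runs of f, after one untimed
 * warm-up call so that thread start-up is not charged to f. */
double time_dot(dot_fn f, const int *x, const int *y, long long *z,
                int n, int repeats, long long *result) {
    assert(repeats > 0);
    *result = f(x, y, z, n);              /* warm-up, not timed */
    double best = -1.0;
    for (int r = 0; r < repeats; r++) {
        double start = now_seconds();
        *result = f(x, y, z, n);
        double elapsed = now_seconds() - start;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}
```

A caller would allocate x, y and z once, fill x and y, and then call time_dot for each of the three functions with the same n and, say, five repeats. With a compiler that supports OpenMP, the code is built with the OpenMP flag (for GCC, -fopenmp). Without it the pragmas are ignored and all three variants run sequentially. Whether the parallel versions win at a given N depends on the machine, so the timings need to be measured rather than assumed.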

