I'm new to OpenMP, and I'm facing the following situation:
    int someArray[ARRAY_SIZE];
    //outer loop
    for(int i = 0; i < 100; ++i) {
        //inner loop
        for(int j = 0; j < ARRAY_SIZE; ++j) {
            //calculations in someArray (every cell can be calculated separately)
        }
        //some code that needs to be run by only one thread - for example sorting someArray
    }
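I want to parallelize the inner loop, but the approach I tried (code below) doesn't help: the single-threaded version actually runs faster than the multithreaded one. I suspect that creating the threads over and over again here takes a lot of time.
My bad solution: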
    int someArray[ARRAY_SIZE];
    //outer loop
    for(int i = 0; i < 100; ++i) {
        #pragma omp parallel num_threads(THREADS_NUMBER) shared(someArray)
        {
            //inner loop
            #pragma omp for
            for(int j = 0; j < ARRAY_SIZE; ++j) {
                //calculations in someArray (every cell can be calculated separately)
            }
        }
        //some code that needs to be run by only one thread - for example sorting someArray
    }
Do you know how to optimize this?
When you have a double for loop, you almost always want to parallelize the outer loop. In your case:
    #pragma omp parallel for
    for(int i = 0; i < 100; ++i) {
        for(int j = 0; j < ARRAY_SIZE; ++j) {
            //calculations in someArray (every cell can be calculated separately)
        }
        //some code that needs to be run by only one thread - for example sorting someArray
    }
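To make the snippet above concrete, here is a minimal runnable sketch. It assumes the outer iterations are independent of one another, so each iteration works on its own private someArray instead of a shared one. The fill formula, the values of ARRAY_SIZE and OUTER_ITERS, and the names run_outer_parallel, compare_ints and results are all hypothetical placeholders, not part of the original question.

```c
#include <assert.h>
#include <stdlib.h>

#define ARRAY_SIZE 1000   /* assumed size; the question does not specify it */
#define OUTER_ITERS 100   /* the outer loop count from the question */

/* Comparison callback for qsort: ascending order of ints. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical driver: runs the outer loop in parallel and stores one value
 * per outer iteration (the smallest element after sorting). */
void run_outer_parallel(int results[OUTER_ITERS]) {
    assert(results != NULL);

    #pragma omp parallel for
    for (int i = 0; i < OUTER_ITERS; ++i) {
        /* Declared inside the loop body, so every iteration (and therefore
         * every thread) works on its own private copy. */
        int someArray[ARRAY_SIZE];

        for (int j = 0; j < ARRAY_SIZE; ++j) {
            /* Placeholder calculation; every cell is independent. */
            someArray[j] = (j * 31 + i * 17) % 1009;
        }

        /* Executed only by the thread that owns iteration i. */
        qsort(someArray, ARRAY_SIZE, sizeof someArray[0], compare_ints);

        /* Each iteration writes a distinct slot, so there is no data race. */
        results[i] = someArray[0];
    }
}
```

With GCC this would be compiled with `-fopenmp`; without that flag the pragma is ignored and the loop simply runs serially, producing the same results.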
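If you have 4 CPUs available, the 100 iterations are split into 25 per CPU. This is much more efficient than your code, which splits the ARRAY_SIZE inner iterations across the CPUs again for each of the 100 outer iterations (so you pay the parallelization overhead 100 times). Note that this only works if the outer iterations do not depend on each other.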
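If the outer iterations do depend on each other (for example, because every pass sorts the same shared someArray), parallelizing the outer loop is not an option. A different technique, not the one proposed in this answer, is to open the parallel region once around the outer loop and use `#pragma omp for` plus `#pragma omp single` inside it. The sketch below illustrates that idea under assumptions: the fill formula, ARRAY_SIZE, OUTER_ITERS, and the names run_single_region and compare_ints are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define ARRAY_SIZE 1000   /* assumed size; the question does not specify it */
#define OUTER_ITERS 100   /* the outer loop count from the question */

/* Comparison callback for qsort: ascending order of ints. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical driver: one parallel region for the whole outer loop. */
void run_single_region(int someArray[ARRAY_SIZE]) {
    assert(someArray != NULL);

    #pragma omp parallel shared(someArray)
    {
        /* Every thread runs the outer loop; i is private to each thread. */
        for (int i = 0; i < OUTER_ITERS; ++i) {
            /* The j iterations are divided among the threads of the team.
             * The implicit barrier at the end of the construct guarantees
             * that all cells are written before anyone continues. */
            #pragma omp for
            for (int j = 0; j < ARRAY_SIZE; ++j) {
                /* Placeholder calculation; every cell is independent. */
                someArray[j] = (j * 31 + i * 17) % 1009;
            }

            /* Exactly one thread sorts; the others wait at the implicit
             * barrier at the end of the single construct. */
            #pragma omp single
            qsort(someArray, ARRAY_SIZE, sizeof someArray[0], compare_ints);
        }
    }
}
```

Whether this is actually faster than reopening the region on every pass depends on the OpenMP runtime, since many implementations keep their worker threads alive between parallel regions. It is worth measuring both variants before settling on one.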