OpenMP workaround for barrier in loop
barrier, loop, openmp, for, workaround
Problem
What I basically want to do is this:
int main()
{
    const int n = 100;
    #pragma omp parallel for
    for (int i=0; i<n; i++)
    {
        int thread_ID = omp_get_thread_num();
        printf("%d) work 1 %d\n", thread_ID, i);
        #pragma omp barrier
        printf(" %d) work 2 %d\n", thread_ID, i);
        #pragma omp barrier
    }
    return 0;
}
Except that we cannot put a barrier inside a parallel for in OpenMP: a barrier is not allowed inside a worksharing loop. However, it is possible to split the iterations among the threads explicitly to make it work:
#include <stdio.h>
#include <omp.h>
int main()
{
    const int n = 100;
    int nthreads;
    #pragma omp parallel
    {
        // get the number of threads
        #pragma omp single
        {
            nthreads = omp_get_num_threads();
        }
        int thread_ID = omp_get_thread_num();
        // calculate which threads have to do one more iteration
        int one_more = thread_ID<(n%nthreads);
        int step = n/nthreads;
        int start, end;
        if(one_more){
            start = step*thread_ID + thread_ID;
            end = start + step + 1;
        }
        else{
            start = step*thread_ID + n%nthreads;
            end = start + step;
        }
        // the real work is here
        for (int i=start; i<start+step+1; i++)
        {
            if(i<end)
                printf("%d) work 1 %d\n", thread_ID, i);
            #pragma omp barrier
            if(i<end)
                printf(" %d) work 2 %d\n", thread_ID, i);
            #pragma omp barrier
        }
    }
    return 0;
}
However, I find this a bit dirty and I would have preferred something cleaner. Does somebody have a better idea?
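The chunk arithmetic used in the workaround above can be checked on its own, without any threads. The following is a minimal sketch, not part of the original post: chunk_bounds and chunks_cover_range are hypothetical helper names, and the formulas are copied from the code above under the assumption that each thread handles the half-open range [start, end).

```c
/* Hypothetical helper (assumption, not from the original post):
   computes the half-open range [*start, *end) handled by thread `tid`
   using the same arithmetic as the workaround above. */
void chunk_bounds(int n, int nthreads, int tid, int *start, int *end)
{
    int step = n / nthreads;   /* base chunk size */
    int rem  = n % nthreads;   /* the first `rem` threads get one extra iteration */
    if (tid < rem) {
        *start = step * tid + tid;
        *end   = *start + step + 1;
    } else {
        *start = step * tid + rem;
        *end   = *start + step;
    }
}

/* Hypothetical check: returns 1 if the chunks of all threads are
   contiguous and together cover exactly [0, n), 0 otherwise. */
int chunks_cover_range(int n, int nthreads)
{
    int expected_start = 0;
    for (int tid = 0; tid < nthreads; tid++) {
        int start, end;
        chunk_bounds(n, nthreads, tid, &start, &end);
        if (start != expected_start || end < start)
            return 0;
        expected_start = end;
    }
    return expected_start == n;
}
```

Calling chunks_cover_range(100, 8) is one way to confirm that no iteration is skipped or repeated when n is not a multiple of the thread count.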
Solution
You want to synchronize the two parts of your iteration and parallelize each of them independently. A good option is to split the loop in two; each omp for ends with an implicit barrier, so every "work 1" finishes before any "work 2" starts:
#include <stdio.h>
#include <omp.h>

int main()
{
    const int n = 100;
    #pragma omp parallel
    {
        // declared once per thread so that both loops can use it
        int thread_ID = omp_get_thread_num();
        #pragma omp for
        for (int i=0; i<n; i++)
        {
            printf("%d) work 1 %d\n", thread_ID, i);
        } // barrier (implicit, unless using 'nowait')
        #pragma omp for
        for (int i=0; i<n; i++)
        {
            printf(" %d) work 2 %d\n", thread_ID, i);
        } // barrier
    } // omp parallel
    return 0;
}
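To see why the implicit barrier between the two loops matters, here is a minimal sketch in which the second phase reads values that another thread may have written in the first phase. The function name count_mismatches and the squared-index payload are assumptions made for illustration; only the two-loop structure comes from the answer. The code uses no OpenMP runtime calls, so it also compiles and runs serially when OpenMP is disabled.

```c
#include <stdlib.h>

/* Hypothetical example (not from the original answer).
   Phase 1 fills tmp[i]; phase 2 reads tmp[n-1-i], which was typically
   written by a different thread. Returns the number of stale reads
   (0 when the barrier between the loops does its job), or -1 if the
   allocation fails. */
int count_mismatches(int n)
{
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    int mismatches = 0;
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++)
            tmp[i] = i * i;                 /* "work 1" */
        /* implicit barrier: all of tmp is written before anyone goes on */

        #pragma omp for reduction(+:mismatches)
        for (int i = 0; i < n; i++) {
            int j = n - 1 - i;              /* element from the "far end" */
            if (tmp[j] != j * j)            /* "work 2" */
                mismatches++;
        }
    }
    free(tmp);
    return mismatches;
}
```

Adding nowait to the first omp for would remove that barrier, and the second loop could then read elements of tmp that have not been written yet.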
Context
StackExchange Code Review Q#159882, answer score: 3