
OpenMP workaround for barrier in loop

Submitted by: @import:stackexchange-codereview··

Problem

What I basically want to do is this:

int main()
{
    const int n = 100;

    #pragma omp parallel for
    for (int i=0; i<n; i++)
    {
        int thread_ID = omp_get_thread_num();
        printf("%d) work 1 %d\n", thread_ID, i);
    #pragma omp barrier
        printf("    %d) work 2 %d\n", thread_ID, i);
    #pragma omp barrier
    }
    return 0;
}


except that OpenMP does not allow a barrier inside a parallel for: a barrier region may not be closely nested inside a worksharing construct. However, it is possible to split the iterations between the threads explicitly to make it work:

#include <stdio.h>
#include <omp.h>

int main()
{
    const int n = 100;
    int nthreads;
    #pragma omp parallel
    {
        // get the number of threads
        #pragma omp single
        {
            nthreads = omp_get_num_threads();
        }

        int thread_ID = omp_get_thread_num();

        // calculate which threads have to do one more iteration
        int one_more = thread_ID<(n%nthreads);
        int step = n/nthreads;
        int start, end;

        if(one_more){
            start = step*thread_ID + thread_ID;
            end = start + step + 1;
        }
        else{
            start = step*thread_ID + n%nthreads;
            end = start + step;
        }

        // the real work is here
        for (int i=start; i<start+step+1; i++)
        {
            if(i<end)
                printf("%d) work 1 %d\n", thread_ID, i);
        #pragma omp barrier
            if(i<end)
                printf("    %d) work 2 %d\n", thread_ID, i);
        #pragma omp barrier
        }

    }
    return 0;
}


However, I find this a bit dirty and I would have preferred something cleaner. Does somebody have a better idea?
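
The chunk arithmetic in the manual version above can be pulled out into a small helper, which makes it easier to check in isolation. This is only a sketch: the names `chunk_t`, `chunk_bounds`, and `max_chunk_len` are hypothetical and do not appear in the original code. It assumes the same split as the question, where the first `n % nthreads` threads take one extra iteration.

```c
/* Half-open iteration range [start, end) owned by one thread.
 * (Hypothetical helper type, not part of the original post.) */
typedef struct {
    int start;
    int end;
} chunk_t;

/* Split n iterations over nthreads threads as evenly as possible:
 * the first (n % nthreads) threads each take one extra iteration. */
chunk_t chunk_bounds(int n, int nthreads, int thread_id)
{
    int step = n / nthreads;
    int rem = n % nthreads;
    chunk_t c;

    if (thread_id < rem) {
        c.start = (step + 1) * thread_id;
        c.end = c.start + step + 1;
    } else {
        c.start = step * thread_id + rem;
        c.end = c.start + step;
    }
    return c;
}

/* Number of loop passes every thread has to make so that all threads
 * reach each barrier the same number of times: ceil(n / nthreads). */
int max_chunk_len(int n, int nthreads)
{
    return (n + nthreads - 1) / nthreads;
}
```

With `n = 100` and 8 threads, threads 0 to 3 get 13 iterations each and threads 4 to 7 get 12. Using `max_chunk_len` as the common trip count, instead of `step + 1`, avoids one empty pass through both barriers when `n` divides evenly by the number of threads.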

Solution

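You want to synchronize the two parts of each iteration and parallelize each of them independently. Provided work 1 never depends on results of work 2, a good option is to split the loop in two and rely on the implicit barrier at the end of each omp for: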

#include <stdio.h>
#include <omp.h>

int main()
{
  const int n = 100;

  #pragma omp parallel
  {
    // declared once per thread so that both loops can use it
    int thread_ID = omp_get_thread_num();

    #pragma omp for
    for (int i=0; i<n; i++)
    {
        printf("%d) work 1 %d\n", thread_ID, i);
    } // barrier (implicit, unless using 'nowait')

    #pragma omp for
    for (int i=0; i<n; i++)
    {
        printf("    %d) work 2 %d\n", thread_ID, i);
    } // barrier (implicit)
  } // omp parallel
  return 0;
}
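
To see why the implicit barrier between the two loops matters, here is a minimal sketch in which the second phase reads a value that another thread may have produced in the first phase. The function name `two_phase` and the arrays `a` and `b` are assumptions made for illustration and are not part of the original answer.

```c
/* Phase 1 fills a[]; phase 2 reads a[i] and its right neighbour, which
 * may have been written by a different thread. (Hypothetical example.) */
void two_phase(int n, int *a, int *b)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            a[i] = i * i;                 /* "work 1" */
        } /* implicit barrier: all of a[] is written past this point */

        #pragma omp for
        for (int i = 0; i < n; i++) {
            b[i] = a[i] + a[(i + 1) % n]; /* "work 2" */
        } /* implicit barrier */
    }
}
```

Adding `nowait` to the first loop would remove that barrier, and a thread could then read `a[(i + 1) % n]` before the thread responsible for it has written it. Compiled without OpenMP support the pragmas are ignored and the function runs serially with the same result.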


Context

StackExchange Code Review Q#159882, answer score: 3
