HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Lock-free cache oblivious n-body algorithm

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
freeobliviousalgorithmcachebodylock

Problem

I'm currently looking at, from a rather high level, the parallelization of the gravity calculation in an N-body simulation for approximating a solution to the N-body problem.

The simple form of the algorithm looks something like this:

for (int i = 0; i Pos;
    double r2 = Dot(dr, dr);
    double ir3 = 1 / (r2 * sqrt(r2));
    this->Acc += (other.Mass * ir3) * dr;
}


This simple version has an obvious parallelization over the outer loop:

#pragma omp parallel for
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        if (i != j)
            bodies[i].ApplyGravityFrom(bodies[j]);


However, you're doing twice as many units of work as necessary. It's the case that gravity acting on body i from body j is the same as the gravity acting on body j from body i, but with the opposite sign.

You can calculate gravity pairwise instead:

for (int i = 0; i Pos;
    double r2 = Dot(dr, dr);
    double ir3 = 1 / (r2 * sqrt(r2));
    this->Acc += (other.Mass * ir3) * dr;
    other.Acc -= (this->Mass * ir3) * dr;
}


This is the same exact calculation but you're making use of the fact that the force is symmetric, but with a sign flip.

But now parallelizing this loop becomes much harder. But if you treat this problem geometrically, then you can realize that you can actually divide up the work in a special way to never require locks.

First off, you can calculate pairwise forces for the first n / 2 bodies. This is inherently serial. Then, you can calculate the pairwise forces between the bodies [0, n / 2] and [n / 2, n]. Lastly you calculate the pairwise forces for the last n / 2 bodies.

The algorithm then looks like:

```
void BlockGravity(Body *bodies, int i1, int i2, int j1, int j2)
{
for (int i = i1; i < i2; i++)
for (int j = j1; j < j2; j++)
bodies[i].Pairwise(bodies[j]);
}

void TriangleGravity(Body *bodies, int i1, int i2)
{
for (int i = i1; i < i2; i++)
for (int j = i1 + 1; j < i2; j++)
bo

Solution

You seem to be making the problem more complex:

I would take a step back to your original algorithm:

#pragma omp parallel for
for (int i = 0; i < n; i++)
    for (int j = i + 1; j < n; j++)
        bodies[i].PairwiseGravity(bodies[j]);


Your problem (as you stated) that the parallelism is uneven (the lower outer loops is doing more work than the higher values of the outer loop).

But why not combine the two loops.

#pragma omp parallel for
for (int i = 0; i < ((n*(n-1))/2; i++)
    bodies[src(i,n)].PairwiseGravity(bodies[dst(i,n)]);


Each loop is completely independent.

All you have to do is calculate how the functions src() and dst() work.

Code Snippets

#pragma omp parallel for
for (int i = 0; i < n; i++)
    for (int j = i + 1; j < n; j++)
        bodies[i].PairwiseGravity(bodies[j]);
#pragma omp parallel for
for (int i = 0; i < ((n*(n-1))/2; i++)
    bodies[src(i,n)].PairwiseGravity(bodies[dst(i,n)]);

Context

StackExchange Code Review Q#7578, answer score: 2

Revisions (0)

No revisions yet.