HiveBrain v1.2.0
pattern · csharp · Minor

Finding the maximum pairwise difference in a collection of colors

Submitted by: @import:stackexchange-codereview

Problem

Note that this problem is equivalent to finding the longest line segment defined by any two points in a collection of 3D coordinates (the diameter of the point set). That may be an easier way to visualize the problem, and it is almost certainly the version that has received more research.

Essentially, given a List&lt;Color&gt; and this method...

// Euclidean distance between two colors in RGB space, truncated to int.
public static int ColorDiff(Color a, Color b)
{
    int rDiff = a.R - b.R;
    int gDiff = a.G - b.G;
    int bDiff = a.B - b.B;
    return (int)Math.Sqrt(rDiff * rDiff + gDiff * gDiff + bDiff * bDiff);
}


... I'd like to get the highest ColorDiff value for any pair of points in the list.
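For intuition, ColorDiff is just the straight-line distance between the two colors treated as points in RGB space. A minimal self-contained check (the demo class name is mine; the method body is from the post):

```csharp
using System;
using System.Drawing; // Color

class ColorDiffDemo
{
    public static int ColorDiff(Color a, Color b)
    {
        int rDiff = a.R - b.R;
        int gDiff = a.G - b.G;
        int bDiff = a.B - b.B;
        return (int)Math.Sqrt(rDiff * rDiff + gDiff * gDiff + bDiff * bDiff);
    }

    static void Main()
    {
        // Black vs. white is the largest possible difference:
        // sqrt(3 * 255^2) = 441.67..., truncated to 441.
        Console.WriteLine(ColorDiff(Color.Black, Color.White)); // 441
    }
}
```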

Informed Sequential Approach

int max = 0;
for (int i = 0; i < colors.Count - 1; i++)
{
    for (int j = i + 1; j < colors.Count; j++)
    {
        int temp = ColorDiff(colors[i], colors[j]);
        if (temp > max)
            max = temp;
    }
}


Naive LINQ Parallel Approach

int max = colors.AsParallel().Max(a => colors.Max(b => ColorDiff(a, b)));


Informed Parallel Approach

int max = 0;
Object locker = new Object();
System.Threading.Tasks.Parallel.For(0, colors.Count - 1, () => 0,
    (i, unused, localMax) =>
    {
        int temp = 0;
        for (int j = i + 1; j < colors.Count; j++)
        {
            temp = ColorDiff(colors[i], colors[j]);
            if (temp > localMax)
                localMax = temp;
        }
        return localMax;
    },
    localMax =>
    {
        lock (locker)
        {
            if (localMax > max)
                max = localMax;
        }
    });


My testing indicates the naive LINQ approach takes about 60-70% of the time the sequential approach takes. However, I'm pretty sure it's naively doing the comparisons twice (ColorDiff(colors[i], colors[j]) and ColorDiff(colors[j], colors[i])), and that the speedup is entirely due to the parallelization across 4 cores. The performance of the informed parallel approach supports that; it takes approximately 30-40% of the time the LINQ approach takes, making it about 5 times faster than the informed sequential approach.
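If the duplicate comparisons are the concern, one way to keep the LINQ style while visiting each unordered pair exactly once is to parallelize over indices rather than elements. A sketch (the index-range formulation is my own, not from the original post):

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

class PairwiseMax
{
    public static int ColorDiff(Color a, Color b)
    {
        int rDiff = a.R - b.R;
        int gDiff = a.G - b.G;
        int bDiff = a.B - b.B;
        return (int)Math.Sqrt(rDiff * rDiff + gDiff * gDiff + bDiff * bDiff);
    }

    public static int MaxDiff(List<Color> colors)
    {
        // Each pair (i, j) with j > i is compared exactly once,
        // halving the work of the naive element-based query.
        return Enumerable.Range(0, colors.Count - 1)
            .AsParallel()
            .Max(i => Enumerable.Range(i + 1, colors.Count - i - 1)
                .Max(j => ColorDiff(colors[i], colors[j])));
    }

    static void Main()
    {
        var colors = new List<Color> { Color.Black, Color.White, Color.Red, Color.Gray };
        Console.WriteLine(MaxDiff(colors)); // 441 (black vs. white)
    }
}
```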

The LINQ approach takes approximately 1 minute on my trial datasets (about 60 thousand elements in the list), and a little over 15 minutes on a larger dataset (230 thousand elements); that matches the quadratic growth in pair count, since (230/60)² ≈ 15. I don't even want to know how long it will take on a full-size dataset (about 1.1 million elements in the list); extrapolating quadratically, (1100/60)² ≈ 336 times the trial run suggests several hours.

Solution

Ramos (2001) gives a \$O(n \log n)\$ algorithm, but the paper is not freely available, and apparently the algorithm is very complex. So take a look instead at Malandain and Boissonnat (2004) who give a simple algorithm that is \$O(n^2)\$ in the worst case, but apparently performs well for distributions found in practice.

  • E. A. Ramos (2001). "An Optimal Deterministic Algorithm for Computing the Diameter of a Three-Dimensional Point Set". Discrete and Computational Geometry, 26:2, pp. 233–244.

  • Grégoire Malandain and Jean-Daniel Boissonnat (2004). "Computing the Diameter of a Point Set". INRIA research report.
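To get a feel for why Malandain and Boissonnat's method is fast in practice, its core step is the "double normal" iteration: repeatedly jump to the furthest point from the current one until the distance stops improving. The sketch below is my own simplification, not the full algorithm (which adds pruning to certify the result); it returns a pair in which each endpoint is the furthest point from the other, giving a lower bound on the true diameter that is often the diameter itself:

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

class DoubleNormalSketch
{
    public static int ColorDiff(Color a, Color b)
    {
        int rDiff = a.R - b.R;
        int gDiff = a.G - b.G;
        int bDiff = a.B - b.B;
        return (int)Math.Sqrt(rDiff * rDiff + gDiff * gDiff + bDiff * bDiff);
    }

    // Linear scan for the index of the point furthest from colors[p].
    static int FurthestFrom(List<Color> colors, int p)
    {
        int best = -1, bestDist = -1;
        for (int k = 0; k < colors.Count; k++)
        {
            int d = ColorDiff(colors[p], colors[k]);
            if (d > bestDist) { bestDist = d; best = k; }
        }
        return best;
    }

    // Double-normal iteration: each jump strictly increases the pair
    // distance, so on a finite set it terminates quickly. Each step
    // costs O(n), versus O(n^2) for the full pairwise scan.
    public static (int p, int q, int dist) DoubleNormal(List<Color> colors, int start)
    {
        int p = start;
        int q = FurthestFrom(colors, p);
        int dist = ColorDiff(colors[p], colors[q]);
        while (true)
        {
            int r = FurthestFrom(colors, q);
            int d = ColorDiff(colors[q], colors[r]);
            if (d <= dist)
                return (p, q, dist); // (p, q) is a double normal
            p = q; q = r; dist = d;
        }
    }

    static void Main()
    {
        var colors = new List<Color> { Color.Gray, Color.Black, Color.White, Color.Red };
        Console.WriteLine(DoubleNormal(colors, 0).dist); // 441 (black vs. white)
    }
}
```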

Context

StackExchange Code Review Q#92396, answer score: 2
