snippetcppMinor
Multi-threaded sort
Viewed 0 times
sortthreadedmulti
Problem
I wanted to write a multi-threaded sort, but unfortunately I don't know much about threads, especially in C++11. I managed to make something work, but I would be highly surprised if it was the best way to do it.
Some of the tests I did sorting integers:
Also, I tried using
template
void ___sort(T *data, int len, int nbThreads){
if(nbThreads());
}
else{
std::future b = std::async(std::launch::async,___sort, data, len/2, nbThreads-2);
std::future a = std::async(std::launch::async,___sort, &data[len/2], len/2, nbThreads-2);
a.wait();
b.wait();
std::inplace_merge (data,&data[len/2],&data[len],std::less());
}
}Some of the tests I did sorting integers:
size is the number of ints sorted, and the time is in seconds.size : 2097152
time with 1 thread : 4.961
time with 2 thread : 3.191
time with 4 thread : 2.377
size : 4194304
time with 1 thread : 10.002
time with 2 thread : 6.214
time with 4 thread : 4.689
size : 8388608
time with 1 thread : 19.975
time with 2 thread : 12.332
time with 4 thread : 9.29
size : 16777216
time with 1 thread : 38.712
time with 2 thread : 25.257
time with 4 thread : 18.706Also, I tried using
std::qsort instead of std::sort, and the results were at least twice as long as that. Any reasons why?Solution
Something like this is probably more efficient:
You could set grain size to something like,
template
void parallel_sort(T* data, int len, int grainsize)
{
// Use grainsize instead of thread count so that we don't e.g.
// spawn 4 threads just to sort 8 elements.
if(len ());
}
else
{
auto future = std::async(parallel_sort, data, len/2, grainsize);
// No need to spawn another thread just to block the calling
// thread which would do nothing.
parallel_sort(data + len/2, len - len/2, grainsize);
future.wait();
std::inplace_merge(data, data + len/2, data + len, std::less());
}
}You could set grain size to something like,
std::max(256, len/nb_threads).Code Snippets
template<class T>
void parallel_sort(T* data, int len, int grainsize)
{
// Use grainsize instead of thread count so that we don't e.g.
// spawn 4 threads just to sort 8 elements.
if(len < grainsize)
{
std::sort(data, data + len, std::less<T>());
}
else
{
auto future = std::async(parallel_sort<T>, data, len/2, grainsize);
// No need to spawn another thread just to block the calling
// thread which would do nothing.
parallel_sort(data + len/2, len - len/2, grainsize);
future.wait();
std::inplace_merge(data, data + len/2, data + len, std::less<T>());
}
}Context
StackExchange Code Review Q#22744, answer score: 8
Revisions (0)
No revisions yet.