HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Fill a container from several threads

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
containerthreadsseveralfromfill

Problem

I want to generate some random data and store it in a container. I split the container into disjoint intervals using iterators and pass them to a thread function (I hope its body isn't important here).

There are several design-related questions here:

  • Is it a good practice to solve such task in this manner?



  • If it is, do I optimally pass and receive parameters in thread_func?



#include 
#include 
#include 
#include 

template 
void thread_func(Iterator b, Iterator e, Distributions& ds)
{
    // *b++ = something, while b != e
}

int main()
{
    auto nthreads = std::thread::hardware_concurrency() ?: 1;
    std::cout > events(nevents);

    int nparts = nevents / nthreads;

    std::vector> distributions;
    for (int i = 0; i  threads;
    for (int i = 0; i < nthreads - 1; i++) {
        auto left  = begin(events) + nparts * i;
        auto right = begin(events) + nparts * (i+1);

        auto f = [l=std::move(left), r=std::move(right), &d=distributions] { thread_func(l, r, d); };

        threads.emplace_back(f);
    }
    auto f = [l=begin(events)+nparts*(nthreads-1), r=end(events), &d=distributions] { thread_func(l, r, d); };
    threads.emplace_back(f);

    for (auto&& t : threads)
        t.join();
}

Solution

Questions

Is it a good practice to solve such task in this manner?

Yes. Sort of.

Distributing work across threads will get the work done quicker usually (especially when there is no interactions between threads).

BUT: The cost of spawning threads is relatively high. Doing it for such a simple problem may not be worth the cost (you should time it and try). So I am going to assume this is a trivial example just for review.

Normally you would spin up your threads into a generic pool. Then send the job to the pool for processing. The threads will continue to live after they have done the job waiting for more jobs to be added to the pool

Secondly this is a very low level interface.

C++ has added the concept of future and promise as a higher level interface to threading.

If it is, do I optimally pass and receive parameters in thread_func?

Sure that looks fine.
Code Review

So here ds is a shared resource used by all threads.

void thread_func(Iterator b, Iterator e, Distributions& ds)


If there is any mutation in this object then any gains can immediately be lost because of contention. So either the usage of this object has to be very well known or it needs to be const. To prevent contention you should make that a read only object that is shared across the threads.

So even if there is only 1 for concurrency you are spinning up threads?

auto nthreads = std::thread::hardware_concurrency() ?: 1;


If you want to use this I would check that you have more than one core available.

auto nthreads = std::thread::hardware_concurrency();
    if (nthreads < 2) {
        thread_func(std::begin(events), std::end(events), distributions);
    }
    else // Do it in parallel.
    {
    }


Don't rely on statement size scope:

for (int i = 0; i < 10; i++)
        distributions.emplace_back(5);


It is just as easy to write:

for (int i = 0; i < 10; i++) {
        distributions.emplace_back(5);
    }


This will protect you from accidental errors.

You don't need to re-calculate these every time.

auto left  = begin(events) + nparts * i;
        auto right = begin(events) + nparts * (i+1);


Also if you update your code to use another container (or any generic container) the operator+ may not work. You should be using std::advance()

auto  segmentBegin = begin(events);
auto  segmentEnd   = segmentBegin;
std::advance(segmentEnd, nparts);

for (int i = 0; i < nthreads - 1; i++) {
   threads.emplace_back([l=segmentBegin, r=segmentEnd, &d=distributions] { thread_func(l, r, d); });

    
    segmentBegin = segmentEnd;
    std::advance(segmentEnd, nparts);
}
threads.emplace_back([l=segmentBegin, r=std::end(events), &d=distributions] { thread_func(l, r, d); });


Iterators are very cheap objects.

l=std::move(left), r=std::move(right)


don't see the need to move them.

Code Snippets

void thread_func(Iterator b, Iterator e, Distributions& ds)
auto nthreads = std::thread::hardware_concurrency() ?: 1;
auto nthreads = std::thread::hardware_concurrency();
    if (nthreads < 2) {
        thread_func(std::begin(events), std::end(events), distributions);
    }
    else // Do it in parallel.
    {
    }
for (int i = 0; i < 10; i++)
        distributions.emplace_back(5);
for (int i = 0; i < 10; i++) {
        distributions.emplace_back(5);
    }

Context

StackExchange Code Review Q#104975, answer score: 6

Revisions (0)

No revisions yet.