patterncppMinor
Searching through a huge set with threads
Viewed 0 times
searchingwiththreadshugethroughset
Problem
I wrote a class that finds values in a huge search volume. I have a pretty strong feeling that my code is not good and would like to get your opinion on it. My experience with C++11 threading is very limited and I think I use the wrong features for my task.
There should be one search running for many
The futures are not really guaranteed to be valid, because the user has to call the
My "algorithm" to fulfil the promises looks very ugly and complicated, but I can't think of a better way.
#include
#include
#include
#include
class Searcher
{
public:
std::future Find(int x) {
m_searches.emplace_back();
m_searches.rbegin()->first = x;
return m_searches.rbegin()->second.get_future();
}
void Search()
{
// very large search data
static const std::vector haystack = { 4, 5, 1, 3, 9, 8, 7, 6, 11, 42, 13 };
std::vector foundIndices;
// find values
for (auto h : haystack) {
for (size_t i = 0; i >> m_searches;
};
int main()
{
Searcher s;
auto s1 = s.Find(1);
auto s2 = s.Find(2);
auto s3 = s.Find(3);
s.Search();
std::cout << s1.get() << std::endl;
std::cout << s2.get() << std::endl;
std::cout << s3.get() << std::endl;
return 0;
}There should be one search running for many
Finds, because the set that is searched is very large. It could however run concurrently to split the search up to multiple threads. For example with two threads: The first thread searches the first half, the second thread searches the second half.The futures are not really guaranteed to be valid, because the user has to call the
Search function. Would it be possible to start the search, once the first .get was called?My "algorithm" to fulfil the promises looks very ugly and complicated, but I can't think of a better way.
Solution
There are definitely some problems if you want to make this threaded. Firstly, access to
Generally in situations like these, it's far better to let
Ok, this is just our normal linear search, but it allows us to start from anywhere in our search range since we're only passing in a few iterators.
Next, we define a
Create our
And then populate it:
Note the
Now we'll set up a
Finally, print out anything that was found:
The whole thing can be found here.
Hopefully this gives you an idea of a different way of approaching your problem.
m_searches is not protected in any way. Since this is a shared bit of memory, it would need to be protected by something (say a mutex) before you modify it.Generally in situations like these, it's far better to let
std::async do a lot of the hard work for you. Also, try and eliminate shared mutable state as much as possible, because it makes life a lot harder. Firstly, let's write a simple search function that'll return an index if we find a match:const std::size_t not_found = -1;
template
std::size_t find_index(Iterator begin, Iterator end,
const T& value, std::size_t start)
{
std::size_t s = start;
Iterator it = begin;
while(it != end) {
if(*it == value) {
return s;
}
++s;
++it;
}
return not_found;
}Ok, this is just our normal linear search, but it allows us to start from anywhere in our search range since we're only passing in a few iterators.
Next, we define a
vector to search over:std::vector v { 4, 5, 1, 3, 9, 8, 7, 6, 11, 42, 13 };
typedef typename std::vector::iterator iter_t;Create our
vector of future to hold results:std::vector> futures;And then populate it:
futures.emplace_back(
std::async(std::launch::async, find_index,
v.begin(), v.begin() + 6, 8, 0));
futures.emplace_back(
std::async(std::launch::async, find_index,
v.begin() + 6, v.end(), 8, 6));Note the
std::launch::async flag specifies we want to create a new thread to run our function in. I've hardcoded the search start and end places here, but you can obviously add some logic to split things up evenly if you'd like. Also, note that we need to explicitly define the template types here, since they won't be deduced.Now we'll set up a
vector to store our results in, and get the results back:std::vector results;
for(auto& future : futures) {
results.emplace_back(future.get());
}Finally, print out anything that was found:
std::for_each(results.begin(), results.end(),
[](std::size_t index)
{ if(index != not_found) std::cout << index << "\n"; });
}The whole thing can be found here.
Hopefully this gives you an idea of a different way of approaching your problem.
Code Snippets
const std::size_t not_found = -1;
template <typename Iterator, typename T>
std::size_t find_index(Iterator begin, Iterator end,
const T& value, std::size_t start)
{
std::size_t s = start;
Iterator it = begin;
while(it != end) {
if(*it == value) {
return s;
}
++s;
++it;
}
return not_found;
}std::vector<int> v { 4, 5, 1, 3, 9, 8, 7, 6, 11, 42, 13 };
typedef typename std::vector<int>::iterator iter_t;std::vector<std::future<std::size_t>> futures;futures.emplace_back(
std::async(std::launch::async, find_index<iter_t, int>,
v.begin(), v.begin() + 6, 8, 0));
futures.emplace_back(
std::async(std::launch::async, find_index<iter_t, int>,
v.begin() + 6, v.end(), 8, 6));std::vector<std::size_t> results;
for(auto& future : futures) {
results.emplace_back(future.get());
}Context
StackExchange Code Review Q#29964, answer score: 3
Revisions (0)
No revisions yet.