What's the most efficient way to erase duplicates and sort a vector?
Problem
I need to take a C++ vector with potentially a lot of elements, erase duplicates, and sort it.
I currently have the below code, but it doesn't work.

vec.erase(
    std::unique(vec.begin(), vec.end()),
    vec.end());
std::sort(vec.begin(), vec.end());

How can I correctly do this?

Additionally, is it faster to erase the duplicates first (similar to the code above) or perform the sort first? If I do perform the sort first, is it guaranteed to remain sorted after std::unique is executed?

Or is there another (perhaps more efficient) way to do all this?
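For reference, the snippet above most likely fails because std::unique only collapses runs of adjacent equal elements, so calling it on unsorted data can leave duplicates behind. It also preserves the relative order of the elements it keeps, so a sorted range stays sorted afterwards. A minimal sketch contrasting the two orders of operations; the helper names `unique_then_sort` and `sort_then_unique` are made up for illustration and assume a `std::vector<int>`:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: the question's order (unique first, then sort).
// std::unique only removes *consecutive* duplicates, so non-adjacent
// repeats survive and are merely sorted next to each other afterwards.
std::vector<int> unique_then_sort(std::vector<int> v) {
    v.erase(std::unique(v.begin(), v.end()), v.end());
    std::sort(v.begin(), v.end());
    return v;
}

// Hypothetical helper: sort first so equal values become adjacent,
// then let std::unique collapse each run to a single element.
std::vector<int> sort_then_unique(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}
```

With the input {3, 1, 3, 2, 1}, the first helper still returns duplicates, while the second returns each value once in ascending order.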
Solution
I agree with R. Pate and Todd Gardner; a std::set might be a good idea here. Even if you're stuck using vectors, if you have enough duplicates, you might be better off creating a set to do the dirty work.

Let's compare three approaches:
Just using vector, sort + unique

sort( vec.begin(), vec.end() );
vec.erase( unique( vec.begin(), vec.end() ), vec.end() );

Convert to set (manually)

set<int> s;
unsigned size = vec.size();
for( unsigned i = 0; i < size; ++i ) s.insert( vec[i] );
vec.assign( s.begin(), s.end() );

Convert to set (using a constructor)

set<int> s( vec.begin(), vec.end() );
vec.assign( s.begin(), s.end() );

Here's how these perform as the number of duplicates changes: [chart not reproduced]

Summary: when the number of duplicates is large enough, it's actually faster to convert to a set and then dump the data back into a vector.

And for some reason, doing the set conversion manually seems to be faster than using the set constructor, at least on the toy random data that I used.
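To try the comparison on your own data, the three approaches can be wrapped as functions along these lines. This is only a sketch, assuming a `std::vector<int>`. The names `dedup_sort_unique`, `dedup_set_manual`, `dedup_set_ctor`, `make_data`, and `time_ms` are hypothetical and not part of the original answer, and the random data generator is just one guess at what "toy random data" could look like.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <random>
#include <set>
#include <vector>

// Approach 1: sort in place, then erase adjacent duplicates.
void dedup_sort_unique(std::vector<int>& vec) {
    std::sort(vec.begin(), vec.end());
    vec.erase(std::unique(vec.begin(), vec.end()), vec.end());
}

// Approach 2: insert into a set one element at a time, then copy back.
void dedup_set_manual(std::vector<int>& vec) {
    std::set<int> s;
    for (std::size_t i = 0; i < vec.size(); ++i) s.insert(vec[i]);
    vec.assign(s.begin(), s.end());
}

// Approach 3: build the set from the iterator range, then copy back.
void dedup_set_ctor(std::vector<int>& vec) {
    std::set<int> s(vec.begin(), vec.end());
    vec.assign(s.begin(), s.end());
}

// Hypothetical test-data generator (an assumption, not the answer's data):
// n values drawn uniformly from [0, distinct), so a smaller `distinct`
// means more duplicates.
std::vector<int> make_data(std::size_t n, int distinct, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, distinct - 1);
    std::vector<int> v(n);
    for (int& x : v) x = dist(gen);
    return v;
}

// Hypothetical timing helper: runs `f` on its own copy of `data` and
// returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F f, std::vector<int> data) {
    auto t0 = std::chrono::steady_clock::now();
    f(data);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

For example, `time_ms(dedup_sort_unique, make_data(1000000, 100))` times the first approach on heavily duplicated input; raising the second argument of `make_data` reduces the share of duplicates. Timings will depend on the compiler, optimization flags, and element type, so treat any crossover point as specific to your setup.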
Code Snippets
sort( vec.begin(), vec.end() );
vec.erase( unique( vec.begin(), vec.end() ), vec.end() );

set<int> s;
unsigned size = vec.size();
for( unsigned i = 0; i < size; ++i ) s.insert( vec[i] );
vec.assign( s.begin(), s.end() );

set<int> s( vec.begin(), vec.end() );
vec.assign( s.begin(), s.end() );

Context
Stack Overflow Q#1041620, score: 760